A solar pro-forma rests on one number it rarely interrogates: the baseline electricity spend the customer would have paid without the project. Get it wrong by 3% and the error compounds across a twenty-year IRR — and in most deals, the baseline assumption is a larger source of pro-forma error than the production model. A defensible baseline isn't a single blended $/kWh rate. It's a line-itemized reconstruction of historical spend — energy by TOU period, demand by month, fixed charges, riders, taxes — where every input traces back to an actual bill.
This guide covers the four ways a baseline goes soft, how to build one you can defend line by line, and why the intake step is the part that actually costs you.
#Why the baseline matters more than the production model
A pro-forma's savings — the cash flow the whole model discounts — is the gap between what the customer pays today and what they'll pay after the project. The production model estimates the second number. The baseline is the first. An error in the baseline doesn't average out; it shifts the entire savings stack and rides the escalator for the life of the model.
Two reasons the baseline is the riskier input:
- It's an anchor, not an estimate. The production model carries explicit uncertainty bands everyone reviews. The baseline is usually a single rate treated as fact — so its error is invisible and unhedged.
- It compounds. A 3% baseline error grows in dollar terms every year the escalator is applied. Over twenty years, a soft baseline can move the IRR more than a P50-to-P90 swing on production.
If the investment committee (IC) stress-tests production sensitivities but accepts the baseline as given, the model is being scrutinized in the wrong place.
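To make the compounding point concrete, here is a minimal Python sketch. The $500,000 annual spend and the 3% escalator are hypothetical inputs chosen for illustration, not figures from any deal; only the 3% baseline error comes from the discussion above.

```python
def yearly_baseline_error(true_spend, error_pct, escalator, years):
    """Dollar gap between the modeled and the true baseline in each model year."""
    gaps = []
    for year in range(years):
        true_cost = true_spend * (1 + escalator) ** year
        modeled_cost = true_cost * (1 + error_pct)
        gaps.append(modeled_cost - true_cost)
    return gaps

# Hypothetical inputs: $500,000/yr true spend, 3% overstatement, 3% escalator.
gaps = yearly_baseline_error(500_000, 0.03, 0.03, 20)
print(f"year 1 gap:  ${gaps[0]:,.0f}")
print(f"year 20 gap: ${gaps[-1]:,.0f}")
print(f"cumulative:  ${sum(gaps):,.0f}")
```

The percentage error stays at 3%, but the dollar gap that flows into the savings line widens every year the escalator is applied, and nothing in the model flags it.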
#The 4 ways a solar pro-forma baseline goes wrong
Each of these produces a baseline that looks clean and defends poorly. All four trace back to the same root: a number derived from bills that weren't fully read.
#1. The blended-rate trap
The most common baseline is a single blended $/kWh — total dollars divided by total kWh across some historical window. It's easy to compute and impossible to defend, because it collapses energy, demand, fixed charges, and riders into one number that responds to none of them correctly. Solar offsets energy and some demand; it doesn't touch fixed charges. A blended rate applied uniformly over-credits the project on the fixed portion and misattributes the demand savings. The number pencils. The component logic underneath it is wrong.
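The over-crediting can be sketched in a few lines of Python. Everything here is an illustrative assumption: the `AnnualBill` structure, the dollar amounts, a flat energy rate (no TOU periods), and a 10% demand offset. It is a simplified comparison, not a savings model.

```python
from dataclasses import dataclass

@dataclass
class AnnualBill:
    kwh: float
    energy_cost: float  # dollars that scale with kWh
    demand_cost: float  # dollars that scale with peak kW
    fixed_cost: float   # customer/meter charges, untouched by solar

    @property
    def total(self):
        return self.energy_cost + self.demand_cost + self.fixed_cost

    @property
    def blended_rate(self):
        return self.total / self.kwh

def blended_savings(bill, offset_kwh):
    # Credits every offset kWh at the blended rate, fixed charges included.
    return offset_kwh * bill.blended_rate

def component_savings(bill, offset_kwh, demand_offset_share):
    # Credits energy at the energy-only rate plus an assumed share of demand.
    energy_rate = bill.energy_cost / bill.kwh
    return offset_kwh * energy_rate + demand_offset_share * bill.demand_cost

# Hypothetical account: 1 GWh/yr, $100,000 total spend.
bill = AnnualBill(kwh=1_000_000, energy_cost=60_000, demand_cost=30_000, fixed_cost=10_000)
offset = 400_000  # kWh offset by the project (assumed)

print(f"blended:   ${blended_savings(bill, offset):,.0f}")
print(f"component: ${component_savings(bill, offset, 0.10):,.0f}")
```

With these inputs the blended approach credits the project with a slice of the fixed and demand charges it never offsets, so its savings figure comes out higher than the component-level one.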
#2. Misclassified charges
When bills are hand-keyed, ambiguous line items get folded into catch-all buckets. A demand-response credit lands in "other charges." A power-factor adjustment gets read as a tax. A rider that should escalate separately gets absorbed into the energy rate. Each misclassification shifts the baseline by a fraction of a cent — and a baseline built on misclassified components can't be stress-tested correctly later, because the sensitivities are applied to the wrong lines.
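A small Python sketch shows why a misclassification survives a totals check but breaks a stress test. The component names, dollar amounts, and the 10% energy shock are all hypothetical; the scenario is a rider that was absorbed into the energy line during hand-keying.

```python
def stressed_total(components, shocks):
    """Apply a per-component shock (e.g. 0.10 for +10%) and sum the bill."""
    return sum(amount * (1 + shocks.get(name, 0.0))
               for name, amount in components.items())

# Same bill, keyed two ways. Amounts are illustrative.
correct = {"energy": 6000.0, "rider": 800.0, "demand": 3000.0}
miskeyed = {"energy": 6800.0, "demand": 3000.0}  # rider folded into energy

energy_shock = {"energy": 0.10}  # stress only the energy rate

print(stressed_total(correct, {}), stressed_total(miskeyed, {}))
print(stressed_total(correct, energy_shock), stressed_total(miskeyed, energy_shock))
```

Both versions reconcile to the same total, so the error is invisible at intake. It only surfaces when a sensitivity is applied and the rider moves with a rate it does not belong to.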
#3. Estimated reads and true-ups
Utility bills with estimated reads bill against an estimate, then true up the next cycle, often with a negative adjustment that makes a month look anomalously low. A baseline averaged over a window that includes estimated months, without pairing each one with its true-up, carries two distorted months for every estimated read: the estimate and its correction. Twelve months of bills that include three estimated reads do not give you twelve clean data points.
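One simple way to normalize, sketched in Python, is to pool each estimated bill with the bill that trues it up and re-split the pooled usage and dollars by billing days. The record fields (`days`, `kwh`, `cost`, `estimated`) are hypothetical, and the sketch assumes a single estimated read followed directly by an actual read; consecutive estimates would need more careful handling.

```python
def normalize_true_ups(bills):
    """Pool each estimated bill with the next actual bill and re-split
    the pooled kWh and dollars in proportion to billing days."""
    out = [dict(b) for b in bills]  # copy so the raw records stay untouched
    i = 0
    while i < len(out) - 1:
        if out[i]["estimated"] and not out[i + 1]["estimated"]:
            first, second = out[i], out[i + 1]
            days = first["days"] + second["days"]
            for key in ("kwh", "cost"):
                pooled = first[key] + second[key]
                first[key] = pooled * first["days"] / days
                second[key] = pooled - first[key]
            first["normalized"] = second["normalized"] = True
            i += 2
        else:
            i += 1
    return out

# Illustrative data: February was estimated high, March carries the true-up.
bills = [
    {"month": "2023-01", "days": 30, "kwh": 50_000, "cost": 5_000.0, "estimated": False},
    {"month": "2023-02", "days": 30, "kwh": 62_000, "cost": 6_200.0, "estimated": True},
    {"month": "2023-03", "days": 30, "kwh": 38_000, "cost": 3_800.0, "estimated": False},
]
for b in normalize_true_ups(bills):
    print(b["month"], round(b["kwh"]), round(b["cost"], 2))
```

Pooling preserves the two-month total, which is the only quantity the meter actually confirms. A day-weighted split is a crude allocation; where interval data exists, it would be a better basis.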
#4. Too short a window
A baseline built on three months of bills misses the seasonal shape that drives demand charges and TOU economics. Summer on-peak demand on a C&I account can be multiples of the winter figure. A baseline that doesn't span a full seasonal cycle either over- or under-states the demand component depending on which quarter the bills came from — and demand is often where the solar-plus-storage savings case actually lives.
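The window effect is easy to see with a toy example in Python. The twelve monthly demand charges below are invented to show a summer-peaking shape; they are not from a real account.

```python
# Hypothetical monthly demand charges in dollars, January through December.
monthly_demand_cost = [4000, 4000, 4200, 5000, 7000, 11000,
                       12000, 12000, 9000, 6000, 4500, 4000]

def annualize(window):
    """Scale the average month in a partial window up to a full year."""
    return sum(window) / len(window) * 12

full_year = sum(monthly_demand_cost)
winter_only = annualize(monthly_demand_cost[0:3])  # Jan-Mar bills
summer_only = annualize(monthly_demand_cost[5:8])  # Jun-Aug bills

print(f"full year:        ${full_year:,.0f}")
print(f"from winter bills: ${winter_only:,.0f}")
print(f"from summer bills: ${summer_only:,.0f}")
```

The same account yields an annual demand figure well below or well above the twelve-month total, depending only on which quarter the three bills happened to cover.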
#How to build a defensible baseline
A defensible baseline is a reconstruction, not a rate. Five steps:
- Pull at least twelve months of bills — a full seasonal cycle — for every account in the deal.
- Line-itemize each bill. Separate energy by TOU period, demand by window, fixed charges, riders, adjustments, and taxes. The blended rate becomes one output of this, not the input.
- Normalize for billing artifacts. Identify estimated reads, pair them with their true-ups, and flag mid-cycle rate changes so each charge is attributed to the correct rate period.
- Map each line to its tariff component. Knowing a $14,000 charge is coincident demand on a Tier 2 industrial tariff — not just "a demand charge" — is what lets you apply the right escalator and the right offset.
- Build assumptions on components, not the blend. Apply escalators, sensitivities, and stress tests at the component level. Now each line in the model traces to a line on a bill, and every assumption is defensible on its own terms.
The output is a baseline where the IC's question — "where did this number come from?" — has an answer for every line.
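The steps above can be sketched as a small data model in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the `LineItem` fields, component names, source strings, dollar amounts, and escalator values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LineItem:
    account: str
    bill_month: str  # e.g. "2023-07"
    component: str   # e.g. "energy_on_peak", "demand", "fixed", "rider", "tax"
    amount: float    # dollars; credits are negative
    source: str      # pointer back to the bill line this value came from

def reconcile(items, printed_total, tolerance=0.01):
    """True if the itemized lines sum to the total printed on the bill."""
    return abs(sum(i.amount for i in items) - printed_total) <= tolerance

def baseline_by_component(items):
    totals = {}
    for item in items:
        totals[item.component] = totals.get(item.component, 0.0) + item.amount
    return totals

def project(totals, escalators, years):
    """Escalate each component at its own rate; one dict per model year."""
    return [
        {c: amt * (1 + escalators.get(c, 0.0)) ** y for c, amt in totals.items()}
        for y in range(years)
    ]

# One hypothetical bill, itemized.
july = [
    LineItem("A-1", "2023-07", "energy_on_peak", 6200.00, "2023-07.pdf p2 l4"),
    LineItem("A-1", "2023-07", "energy_off_peak", 1800.00, "2023-07.pdf p2 l5"),
    LineItem("A-1", "2023-07", "demand", 9500.00, "2023-07.pdf p2 l8"),
    LineItem("A-1", "2023-07", "fixed", 150.00, "2023-07.pdf p2 l1"),
    LineItem("A-1", "2023-07", "dr_credit", -400.00, "2023-07.pdf p3 l2"),
    LineItem("A-1", "2023-07", "tax", 1035.00, "2023-07.pdf p3 l6"),
]
assert reconcile(july, printed_total=18285.00)

totals = baseline_by_component(july)
# Assumed escalators per component; anything unlisted stays flat.
years = project(totals, {"energy_on_peak": 0.03, "energy_off_peak": 0.03, "demand": 0.04}, 20)
print(totals)
print(round(sum(years[-1].values()), 2))
```

Because every `LineItem` carries a `source`, any figure in the projection can be walked back to a specific line on a specific bill, and the blended rate falls out as a derived number instead of an input.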
#The hidden cost: the intake tax
Here's the part the method glosses over. Every step above assumes the bills are already structured data. They aren't. They're PDFs — often scanned, often spanning multiple utilities and tariffs — and getting them into a usable, line-itemized form is the step that actually consumes the deal timeline. (Utility-bill data extraction is the name for that step; why generic OCR fails on utility bills explains why a quick script doesn't do it.)
So the work gets shortcut. A junior associate hand-keys eighteen months of bills into a spreadsheet under deadline, folds two demand-response credits into "other charges," and the baseline lands at $0.087/kWh when the real figure is $0.091/kWh, a gap nobody surfaces. The deal still pencils, but the IC reviews a number that's quietly wrong, built by the person on the team least equipped to catch a misclassification.
The analyst's job is judgment about deals, not data entry. Hand-keying bills is the tax paid for not having a tool — and it's where the four failure modes above get introduced, because hand-keying under deadline is exactly when charges get misclassified and windows get truncated.
#What Tariform does
Extract removes the intake tax that makes baselines soft. Utility-bill PDFs go in — digital or scanned, across any US or Canadian utility and tariff — and line-itemized, tariff-aware, source-traceable structured data comes out: energy by TOU period, demand by window, fixed charges, riders, taxes, each mapped to its tariff component. Estimated reads and mid-cycle rate changes are handled as known patterns, not anomalies. Every bill reconciles against its printed total or is flagged for review. Every value points back to its source PDF line.
That's the difference between a blended rate you hope holds up and a baseline you can defend line by line at the IC. Book a demo — twenty minutes, a real bill, you see the output. Prefer to try it yourself? Start a free trial — upload a real bill and see the extraction in minutes.

