
ADD for Finance and FP&A: Lock the Assumptions, Prove the Forecast

An AI-built model reconciles at a glance and hides a broken assumption or a double-count. ADD locks the assumption set as the frozen contract, makes reconciliation and integrity checks the red tests, and verifies by backtest against actuals — not by whether the numbers look reasonable.

Tin Dang

The annual plan is due in two weeks. Your AI assistant builds a three-statement model overnight: revenue by product line, headcount by department, a cash flow waterfall, a board narrative. The tabs reconcile. The growth rates look defensible. The EBITDA bridge reads clean. You forward it to the CFO.

Three days later, in a board prep call, someone asks why Q3 gross margin is 200 basis points below last year’s actuals. You go looking. The model is using a blended COGS rate from a product mix that no longer exists — the company discontinued two SKUs in February — and the AI picked it from a prior-year export that nobody thought to label as stale. Every downstream number absorbed that error invisibly. The board deck is wrong. The variance analysis is wrong. And it all looked right, because a broken assumption that reconciles at a glance is the most dangerous kind.

That is fast waste in finance. It is not a spreadsheet error — it is a confident, authoritative artifact built on an unstated assumption that nobody locked down before the work began.

This post translates AI-Driven Development into finance and FP&A. The framework post covers the universal eight-step loop; here the focus is the domain translation — what changes when the producer is building models and forecasts rather than code.

The four failures, in finance

The same four AI-era failures appear in every knowledge-work domain, but in finance they have specific textures.

Fast waste. AI produces a plausible financial model fast. The danger is not a model that obviously fails — it is one that ties, reconciles, and reads authoritative while carrying a broken assumption or a silent double-count buried three levels deep. Because the output looks finished, the error survives the first review and surfaces in an earnings call or a board session.

Context rot. The driver definitions, the chart-of-accounts structure, the revenue recognition policy, the intercompany elimination rules — these live in someone’s head, in a prior model, in a finance wiki that was last updated during the ERP migration. Every new AI session re-guesses them. Every new analyst inherits a model with no stated assumptions.

Trust by inspection breaks down. A column of numbers that sums correctly is not a correct column of numbers. “It ties to last year” is not evidence that the model is right; it is evidence that the error was consistent. Financial output is uniquely susceptible to plausibility bias: well-formatted numbers with reasonable growth rates feel true even when they are not.

Verification ceiling. When AI can produce a 40-tab model in an afternoon, the bottleneck is not building the model — it is knowing whether to trust it. Reviewing a model you did not build, against actuals you must independently pull, under a deadline, is harder than the build itself. Throughput that outpaces your ability to verify is not productivity; it is unreviewed financial risk.

The loop, translated

0 · Ground — load the chart of accounts and actuals

Before any model is built, the producer reads the domain context into a single place: the chart of accounts, the prior three years of actuals from the GL, the revenue recognition policy, the driver definitions currently in use (headcount by department, ACV by segment, utilization rate by practice), and any active accounting policy memos. This is not documentation for the analyst — it is the read-first artifact that the AI must ingest before producing anything.

Without it, the AI guesses driver definitions from context. The guess will often be wrong in a way that looks right.
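
One way to make the read-first artifact resist the stale-export problem is to record an as-of date for every source and flag anything older than a cutoff before the AI ingests it. The sketch below is a minimal illustration only: the GroundSource type, the stale_sources helper, the source names, and the cutoff date are all hypothetical, not part of any existing tool.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GroundSource:
    """One entry in the ground manifest (hypothetical structure)."""
    name: str
    as_of: date  # date the export or document was last confirmed current


def stale_sources(sources, cutoff):
    """Return the names of sources whose as-of date precedes the cutoff."""
    return [s.name for s in sources if s.as_of < cutoff]


# Hypothetical manifest: one current GL export, one unlabeled prior-year export.
sources = [
    GroundSource("gl_actuals_export", date(2026, 11, 30)),
    GroundSource("cogs_rate_export", date(2025, 12, 15)),
]

flagged = stale_sources(sources, cutoff=date(2026, 8, 1))
print(flagged)
```

A flagged source is not necessarily wrong; the point is that a human decides whether it is still valid before it becomes a driver.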

1 · Specify — state every assumption before the model is built

The specification in FP&A is a structured assumption set. Every material assumption is stated and named, and the spec lists the refusal codes the model must return when an assumption cannot be honored. This is the step where the work of forecasting actually happens: not in the model, but in the negotiation over what the model is allowed to assume.

# FORECAST SPEC — FY2027 Annual Plan (Revenue & Gross Margin)
# Version: 1.0 | Owner: VP Finance | Gate: CFO sign-off required before build
GROUND:
  chart_of_accounts: GL export 2024-2026, closed periods only
  actuals_source: NetSuite GL export, as of 2026-11-30 close
  prior_model: FY2026 plan v3.2 (filed 2025-12-01)
  policy_memo: Revenue Recognition Policy v4 (IFRS 15, updated 2024-09)
ASSUMPTIONS (stated explicitly; AI must refuse to build if any are absent):
  revenue_growth_rate: 12% YoY on ARR base, applied monthly with linear ramp
  gross_margin_rate: 68% blended; breakdown by product line required (SaaS 74%, Services 51%)
  headcount_plan: HC by department from approved org chart v2026-Q3; no unplanned hires
  attrition_assumption: 8% voluntary, applied monthly starting M2
  cogs_driver: hosting costs from vendor contract dated 2026-08; no spot-rate extrapolation
  intercompany: all IC transactions eliminated per consolidation policy; no netting against revenue
  fx_treatment: USD only; no cross-currency conversion in model scope
REFUSALS (model must return these codes, not silently proceed):
  ASSUMPTION-UNSTATED: a material driver is missing from the assumption set above
  DOUBLE-COUNT: a revenue or cost line appears in more than one roll-up without an elimination entry
  RECONCILIATION-BREAK: any subtotal does not tie to its constituent lines within $1 rounding tolerance
  CIRCULAR-REF: any cell references a cell that references it (directly or through a chain)
AFTER-STATE (what success looks like):
  - A driver-based three-statement model (P&L, balance sheet, cash flow) that ties to actuals at the opening period
  - Every assumption visible in a dedicated Assumptions tab; no hardcoded numbers in formula cells
  - Variance analysis vs prior plan and vs actuals embedded on a Waterfall tab
  - Board narrative (executive summary, 3 risks, 3 opportunities) referencing only numbers present in the model

The human reads for the ASSUMPTION-UNSTATED risk first: a missing driver is the assumption most likely to be wrong and the most expensive when it is. The CFO or controller confirms the assumption set before the model is built. That conversation is the real work of FP&A; the model is the artifact of it.
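
The ASSUMPTION-UNSTATED refusal can be sketched as a pre-build gate that refuses when any required driver is absent or blank. This is a minimal illustration, assuming the assumption set has been loaded into a plain dictionary; the REQUIRED_DRIVERS tuple mirrors the driver names in the spec above, while the check_assumptions function and the draft values are hypothetical.

```python
# Driver names taken from the ASSUMPTIONS section of the forecast spec.
REQUIRED_DRIVERS = (
    "revenue_growth_rate",
    "gross_margin_rate",
    "headcount_plan",
    "attrition_assumption",
    "cogs_driver",
    "intercompany",
    "fx_treatment",
)


def check_assumptions(assumptions: dict) -> list[str]:
    """Return one refusal code per required driver that is missing or blank."""
    refusals = []
    for driver in REQUIRED_DRIVERS:
        value = assumptions.get(driver)
        if value is None or str(value).strip() == "":
            refusals.append(f"ASSUMPTION-UNSTATED: {driver}")
    return refusals


# Hypothetical draft: cogs_driver was never stated, attrition was left blank.
draft = {
    "revenue_growth_rate": "12% YoY on ARR base",
    "gross_margin_rate": "68% blended",
    "headcount_plan": "org chart v2026-Q3",
    "attrition_assumption": "  ",
    "intercompany": "full elimination",
    "fx_treatment": "USD only",
}

refusals = check_assumptions(draft)
print(refusals)
```

An empty list means the build may start; anything else goes back to the assumption owner rather than to the modeler.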

2 · Scenarios — the model case, the edge, the failure

Three scenarios make “correct” concrete before any cells exist.

Base case. Revenue grows at 12% YoY, headcount follows the approved org chart, COGS follows the vendor contract. The three statements reconcile, the Assumptions tab states every driver, the model opens at the November actuals close.

Edge case (sensitivity run). Revenue growth is 6%, half the base-case rate. Gross margin compresses 300 bps because the hosting contract has a minimum commitment that becomes a larger share of lower revenue. The model produces the sensitivity output without changing the base-case assumptions; the two scenarios coexist as separate columns, not overwritten inputs.

Failure case (the broken model). The model does not open at actuals. Opening retained earnings on the balance sheet does not match the GL close. This is a RECONCILIATION-BREAK: the model must return that code and stop, not produce output that ties internally but departs from reality.
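
The mechanism in the edge case, a minimum commitment weighing more heavily on lower revenue, can be shown with a small calculation. The figures below are hypothetical and chosen only to make the arithmetic easy to follow; they land the base case at 68% but do not reproduce the 300 bps compression, which depends on the real contract terms. The gross_margin function and all rates are assumptions for illustration.

```python
def gross_margin(revenue, variable_hosting_rate, hosting_minimum, other_cogs_rate):
    """Gross margin when hosting cost is the larger of usage and a contractual minimum."""
    hosting = max(revenue * variable_hosting_rate, hosting_minimum)
    cogs = hosting + revenue * other_cogs_rate
    return (revenue - cogs) / revenue


PRIOR_YEAR_REVENUE = 100.0  # hypothetical, in $M

# Scenarios live side by side; the base case is never overwritten.
SCENARIOS = {"base": 0.12, "downside": 0.06}

margins = {
    name: gross_margin(
        revenue=PRIOR_YEAR_REVENUE * (1 + growth),
        variable_hosting_rate=0.10,  # hypothetical usage-based hosting cost
        hosting_minimum=11.0,        # hypothetical contractual floor, in $M
        other_cogs_rate=0.22,        # hypothetical remaining COGS
    )
    for name, growth in SCENARIOS.items()
}

compression_bps = (margins["base"] - margins["downside"]) * 10_000
print(margins, round(compression_bps, 1))
```

In the base case usage exceeds the floor, so hosting scales with revenue. In the downside case the floor binds, hosting stops shrinking, and margin falls even though no rate in the assumption set changed.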

3 · Contract — the frozen assumption set

The one human gate in FP&A is CFO or controller sign-off on the assumption set, locked before the model is built out. This is the frozen contract.

# ASSUMPTION CONTRACT — FY2027 Annual Plan
# Status: FROZEN @ v1.0 | Approved: CFO sign-off 2026-12-03
# Checksum: [hash of this file]
revenue_growth_rate: 12.0% ARR YoY, linear monthly ramp
gross_margin_blended: 68.0% (SaaS 74.0% / Services 51.0%)
headcount: per org chart v2026-Q3, no unplanned hires
attrition: 8.0% voluntary, starting M2
cogs_driver: hosting contract 2026-08, fixed schedule
intercompany: full elimination, per consolidation policy v4
fx_scope: USD only
Change protocol: any change to this set requires a new version number,
CFO re-approval, and a model rebuild from the new base. No ad-hoc cell edits.

This is the step that breaks the most common bad habit in finance: building the model and negotiating the assumptions at the same time. When the model is already built, changing an assumption feels expensive — so assumptions quietly harden into facts without ever being explicitly approved. The frozen contract inverts that: assumptions are cheap to change before the model exists, and expensive to change after. Lock them first.

Constrain the what — the assumption set and the reconciliation structure — and leave the how — tab layout, formula approach, scenario mechanics — entirely to the producer.
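
The Checksum line in the contract header can be sketched with a standard hash over a canonical serialization, so that any ad-hoc edit to the assumption set changes the hash. This is one possible approach, not a prescribed one: it assumes the contract is held as a dictionary, and the contract_checksum function name is hypothetical.

```python
import hashlib
import json


def contract_checksum(contract: dict) -> str:
    """SHA-256 of the contract serialized with sorted keys, so key order is irrelevant."""
    canonical = json.dumps(contract, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Values copied from the frozen contract v1.0.
frozen = {
    "revenue_growth_rate": "12.0% ARR YoY, linear monthly ramp",
    "gross_margin_blended": "68.0% (SaaS 74.0% / Services 51.0%)",
    "headcount": "per org chart v2026-Q3, no unplanned hires",
    "attrition": "8.0% voluntary, starting M2",
    "cogs_driver": "hosting contract 2026-08, fixed schedule",
    "intercompany": "full elimination, per consolidation policy v4",
    "fx_scope": "USD only",
}
frozen_hash = contract_checksum(frozen)

# A quiet edit to one driver produces a different checksum.
edited = dict(frozen, attrition="11.0% voluntary, starting M2")
print(contract_checksum(edited) == frozen_hash)
```

Storing the hash alongside the CFO approval means a reviewer can confirm that the model was built from the approved version without rereading every line.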

4 · Acceptance checks — the red tests

These are the checks that must pass for the model to be trusted. Write them before the model is built, then run them as a formal checklist when the build is delivered.

# MODEL INTEGRITY CHECKLIST — FY2027 Annual Plan
# Run against every model version before distribution. All items must pass.
RECONCILIATION CHECKS
  [ ] P&L opening period ties to GL actuals within $1 rounding tolerance
  [ ] Balance sheet balances (Assets = Liabilities + Equity) every period
  [ ] Cash flow statement reconciles to balance sheet change in cash every period
  [ ] Retained earnings roll ties to net income minus dividends
  [ ] Gross margin by product line rolls up to blended gross margin in the summary
ASSUMPTION-STATED CHECKS
  [ ] Every driver in the Assumptions tab matches the frozen contract v1.0
  [ ] No hardcoded numbers in formula cells (Assumptions tab is the only source of inputs)
  [ ] Sensitivity tab references base-case Assumptions tab; no independent hardcodes
INTEGRITY CHECKS
  [ ] No circular references (Excel: Formulas → Error Checking → Circular References returns empty)
  [ ] No #REF!, #DIV/0!, or #N/A errors in any cell in the distribution range
  [ ] Intercompany revenue and cost lines net to zero in the consolidation tab
  [ ] No revenue line appears in more than one roll-up without a corresponding elimination entry
VARIANCE CHECKS
  [ ] Revenue variance vs prior plan explained on Waterfall tab (no unexplained residual > $50K)
  [ ] Gross margin variance vs actuals explained; no line item variance > 5% unexplained
  [ ] Board narrative references only figures present in the model (spot-check three cited numbers)
SENSITIVITY BOUNDS
  [ ] Sensitivity scenarios do not alter base-case Assumptions tab inputs
  [ ] At 6% revenue growth, model does not produce negative cash; if it does, flag as a risk item

These checks exist before the model is built. The AI knows what it must pass. A model delivered without running this checklist is not a deliverable — it is a draft.
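
A few of the checklist items lend themselves to automation. The sketch below covers three of them: the balance sheet identity, the cash reconciliation, and the double-count rule. It assumes the model's period totals and roll-up membership have already been extracted into dictionaries; the function names, field names, and sample figures are hypothetical.

```python
TOLERANCE = 1.00  # dollars; the $1 rounding tolerance named in the checklist


def ties(total, parts, tol=TOLERANCE):
    """True when a total matches the sum of its parts within tolerance."""
    return abs(total - sum(parts)) <= tol


def double_counted(rollups, eliminated):
    """Lines that appear in more than one roll-up with no elimination entry."""
    seen = {}
    for rollup, lines in rollups.items():
        for line in lines:
            seen.setdefault(line, set()).add(rollup)
    return sorted(
        line for line, where in seen.items()
        if len(where) > 1 and line not in eliminated
    )


def run_checks(period, rollups, eliminated):
    """Return refusal codes for every failed check; an empty list means all passed."""
    failures = []
    if not ties(period["assets"], [period["liabilities"], period["equity"]]):
        failures.append("RECONCILIATION-BREAK: balance sheet")
    change_in_cash = period["closing_cash"] - period["opening_cash"]
    if not ties(change_in_cash, [period["net_cash_flow"]]):
        failures.append("RECONCILIATION-BREAK: cash flow")
    for line in double_counted(rollups, eliminated):
        failures.append(f"DOUBLE-COUNT: {line}")
    return failures


# Hypothetical period: the balance sheet balances, cash is off by $1,500,
# and support_fees sits in two revenue roll-ups with no elimination.
period = {
    "assets": 1_250_000.0, "liabilities": 400_000.0, "equity": 850_000.0,
    "opening_cash": 200_000.0, "closing_cash": 260_000.0, "net_cash_flow": 58_500.0,
}
rollups = {
    "saas_revenue": ["subscriptions", "support_fees"],
    "services_revenue": ["implementation", "support_fees"],
}

failures = run_checks(period, rollups, eliminated=set())
print(failures)
```

Checks that require judgment, such as whether a variance is adequately explained, stay on the human side of the checklist.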

5 · Produce — build the model to pass the checks

The AI builds the model under one constraint: make every check on the integrity checklist pass. The how is unconstrained — tab structure, formula architecture, scenario mechanics, narrative framing. The only non-negotiable: do not change the frozen assumption set, and do not produce output that cannot pass the reconciliation checks.

The build instruction is narrow: make the red tests green without touching the contract. The AI can choose whether to use a flat model or a driver tree, whether to stack scenarios horizontally or on separate tabs, whether to write the board narrative in bullets or prose. Those are execution choices. The outcome is fixed.
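
"Without touching the contract" can be checked mechanically by comparing the delivered Assumptions tab against the frozen contract, driver by driver. The sketch below is an illustration that assumes both have been read into dictionaries; the contract_drift function and the sample values are hypothetical.

```python
def contract_drift(frozen: dict, assumptions_tab: dict) -> dict:
    """Map each driver that differs to a (frozen value, delivered value) pair.

    A driver missing on either side shows up with None on that side.
    """
    drift = {}
    for key in sorted(frozen.keys() | assumptions_tab.keys()):
        if frozen.get(key) != assumptions_tab.get(key):
            drift[key] = (frozen.get(key), assumptions_tab.get(key))
    return drift


# Hypothetical subset of the frozen contract and a delivered Assumptions tab.
frozen = {
    "revenue_growth_rate": 0.12,
    "gross_margin_blended": 0.68,
    "attrition": 0.08,
}
delivered = {
    "revenue_growth_rate": 0.12,
    "gross_margin_blended": 0.70,  # changed during the build
    "churn_rate": 0.05,            # driver that was never approved
}

drift = contract_drift(frozen, delivered)
print(drift)
```

Any non-empty result sends the build back: either the model reverts to the contract, or the contract goes through the change protocol.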

6 · Verify by evidence — backtest and independent reconciliation

This is where most financial workflows fail. “The model looks reasonable” is not verification. “The CFO reviewed it” is not verification if the review was reading the output rather than checking it against independent data. The standard for trust in ADD is evidence, not inspection.

In FP&A, evidence means two things:

Backtesting against actuals. Before the forecast is distributed, run it against the most recent closed period. If the model, applied to last quarter’s actual inputs, produces a gross margin within the stated tolerance of last quarter’s actual gross margin, that is evidence. If it does not, that is a RECONCILIATION-BREAK — and the broken assumption is now visible, before the board sees it.

Independent reconciliation. An analyst who did not build the model pulls the same numbers from the GL and recomputes three key line items independently. If the independent computation agrees with the model, the reconciliation holds. If it does not, the model stops — it does not get distributed with a note that “the variance is immaterial.”
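
Both forms of evidence reduce to a comparison with a stated tolerance. The sketch below is a minimal illustration: the 50 bps backtest tolerance is an assumed value (the text says only "stated tolerance"), and the function names and sample figures are hypothetical.

```python
def backtest_gross_margin(modeled_margin, actual_revenue, actual_cogs, tolerance_bps):
    """Compare the modeled margin with the margin implied by closed-period actuals."""
    actual_margin = (actual_revenue - actual_cogs) / actual_revenue
    gap_bps = (modeled_margin - actual_margin) * 10_000
    return {
        "actual_margin": actual_margin,
        "gap_bps": gap_bps,
        "passed": abs(gap_bps) <= tolerance_bps,
    }


def independent_breaks(model_lines, recomputed_lines, tolerance=1.0):
    """Line items where the model and the independent recomputation disagree."""
    return {
        name: model_lines[name] - value
        for name, value in recomputed_lines.items()
        if abs(model_lines[name] - value) > tolerance
    }


# Hypothetical closed quarter: the model assumes 68%, actuals imply 66%.
result = backtest_gross_margin(0.68, 10_000_000.0, 3_400_000.0, tolerance_bps=50)

# Hypothetical recomputation of three line items straight from the GL.
breaks = independent_breaks(
    {"revenue": 10_000_000.0, "cogs": 3_200_000.0, "opex": 4_100_000.0},
    {"revenue": 10_000_000.0, "cogs": 3_400_000.0, "opex": 4_100_000.0},
)
print(result["passed"], breaks)
```

Here the backtest fails by roughly 200 basis points and the recomputation isolates COGS as the line that disagrees, which is where the search for the broken assumption would start.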

The adversarial move is a refute-read of the assumption set: a second reviewer argues that the model is wrong, hunting specifically for a hidden double-count, an unstated assumption, or an elimination that was missed. This is not a collegial review — it is a directed attempt to break the model. A model that survives a genuine refute-read has earned a different kind of trust than one that was “reviewed and looked fine.”

7 · Observe and fold — actuals become the next assumption set

Every month, actuals close and the forecast-vs-actual variance is computed. In a conventional process, the variance report is produced, discussed, and filed. In ADD, the variance folds back into the assumption set for the next period.

If Q1 headcount attrition ran at 11% rather than the modeled 8%, the assumption contract for Q2 is updated before the next model is built — not during the build, not after the board presentation. The driver library grows: the attrition rate is now a tracked variable with a history of model vs actual, and the next assumption is informed by that history rather than by a prior-year benchmark that may no longer apply.
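
A driver library of this kind can start as a small record of modeled versus actual values per period. The sketch below is one hypothetical shape for it; the DriverHistory class and its methods are illustrative names, and the sample entry uses the attrition figures from the paragraph above.

```python
from dataclasses import dataclass, field


@dataclass
class DriverHistory:
    """Tracks one driver's modeled and actual values over time (hypothetical structure)."""
    name: str
    records: list = field(default_factory=list)  # (period, modeled, actual)

    def record(self, period, modeled, actual):
        self.records.append((period, modeled, actual))

    def mean_error(self):
        """Average of actual minus modeled; positive means the model ran low."""
        return sum(actual - modeled for _, modeled, actual in self.records) / len(self.records)

    def latest_actual(self):
        return self.records[-1][2]


attrition = DriverHistory("attrition")
attrition.record("FY2027-Q1", modeled=0.08, actual=0.11)

print(attrition.latest_actual(), round(attrition.mean_error(), 4))
```

The history informs the next assumption; it does not set it. Whether Q2 should use 11%, stay at 8%, or land in between is still a decision for the assumption contract and its approver.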

The model is living, not filed. The assumption set is versioned, not oral. Every month the loop closes a little tighter.

Constrain the what, free the how

A board model has a fixed what: it must tie to actuals at the opening period, state every assumption explicitly, pass the integrity checklist, and survive an independent reconciliation. Those constraints are non-negotiable. They define the outcome.

The how is entirely open: the tab architecture, the scenario structure, the presentation of the board narrative, the choice of chart types for the waterfall, the level of granularity in the headcount plan. A finance team that specifies the outcome tightly and leaves the execution open gets a model that is both trustworthy and genuinely well-built — because the AI can optimize the execution when it is not also guessing at the outcome.

The failure mode is the inverse: specifying the how (put revenue on tab 3, use an index-match for the product lookup) while leaving the what vague (make the numbers look right). That produces a model that follows the layout spec and fails the reconciliation test.

| | Ungoverned AI model | ADD-governed model |
| --- | --- | --- |
| Assumption source | Inferred from prior exports; often stale or mismatched | Stated explicitly in the frozen contract; CFO-approved before build |
| Reconciliation | Ties internally; may depart from GL actuals silently | Reconciliation to GL actuals is a mandatory acceptance check |
| Driver definitions | Implicit in formula logic; not inspectable | Named in the Assumptions tab; every formula traces to a named driver |
| Verification method | "It looks reasonable" / CFO read-through | Backtest vs actuals + independent reconciliation + refute-read |
| Change management | Ad-hoc cell edits; no version trail | Assumption contract versioned; every change requires re-approval |
| Next cycle | Start from the prior model and adjust | Actuals fold into the next assumption set; driver history is tracked |

What does not transfer

ADD disciplines the process of building financial models. It does not replace the judgment the process requires.

Materiality is a judgment call. The integrity checklist can flag a variance above $50K, but whether that variance is material to a board decision is a question of context and strategy — not a threshold. A CFO can sign off on a model that fails a sensitivity check if the business case justifies it. ADD records that decision; it does not make it.

Strategy is upstream of the model. The assumption that revenue grows at 12% comes from a sales pipeline, a market thesis, and bets on competitive position. ADD governs what happens after that decision is made. It cannot tell you whether 12% is the right number — only whether the model faithfully reflects whatever number the team chose.

Fiduciary accountability is irreducible. A controller signs the close. A CFO presents to the board. An auditor tests the controls. None of those responsibilities can be delegated to a process. ADD is a discipline for producing trustworthy inputs to those judgments — not a substitute for them.

Over-proceduralization is a real risk. A small finance team running a quick scenario analysis does not need a formal assumption contract and a five-section checklist. Even a lightweight version (state your assumptions, reconcile to actuals, do one independent check) captures most of the value without the ceremony.

Next in the series

The next post applies this same loop to Legal and Contracts, where the fast-waste failure is an AI-drafted clause that reads professionally and silently omits a governing-law provision or inverts a liability cap. ADD for Legal and Contracts translates the frozen contract into a clause-level obligation spec, and the acceptance checks into a redline and coverage audit.

