Part 1 named the thesis: constrain the what, free the how. A thesis needs a foundation. You cannot clamp the what by willpower alone — someone still has to decide what the words mean, what the interface looks like before anyone codes it, and what “done” means as an executable fact.
That is what the five competencies are. Not labels for team roles — five distinct disciplines, each with its own failure it prevents and its own artifact it produces. Three form the foundation the agent stands on. Two are the engine that turns that foundation into working software.
The engine needs ground. A tight red/green loop can sprint confidently in any direction — including the wrong one. The foundation is what keeps the direction fixed.
Five disciplines, not one dial
It is tempting to think of ADD as a single methodology with five labeled buckets. It is closer to the opposite: five pre-existing disciplines — domain modeling, spec-driven development, UX design, test-driven development, and build automation — each re-pointed at an AI builder.
The re-pointing is not cosmetic. When a human writes the code, these disciplines are naturally coupled: the engineer who understands the domain also implements it, carrying tacit knowledge through the work. When an AI agent writes the code, that coupling breaks. The agent has only what you wrote down. A concept that “everyone on the team knows” is invisible to the model. An interface that “looks right when you see it” is undefined until it is specified.
These five are separate disciplines because each catches a different class of failure that the others cannot.
DDD — Domain-Driven: the vocabulary everything else uses
Domain-Driven Development, as ADD uses the term, is the discipline of establishing a shared, precise language and the boundaries it lives in — the core concepts, the modules they belong to, and the invariants that must always hold.
The key word is shared. One name per concept, the same name used everywhere: in the spec, in the contract, in the tests, in the code. When the spec says tenant and the contract says account and the tests say client, you have three names for one idea — and the agent, reading each artifact in isolation, has no reliable way to know they are the same thing.
In ADD the domain language lives in PROJECT.md under a ## Domain section. In ai-proxy, the domain established at the start named provider, credential, budget, and key as distinct concepts with distinct meanings. Every later spec, contract, and test inherited those definitions. A milestone that reused domain terms by name could not accidentally invent a second meaning for a concept that already had one.
The failure DDD prevents: concept drift — where the spec, the contract, the tests, and the generated code each develop slightly different interpretations of the same domain idea, producing a system that is internally consistent but wrong.
SDD — Spec-Driven: the living definition of what is being built
Spec-Driven Development is the discipline of maintaining a living document of what is being built right now — what is settled, what is still open, and what it means for a feature to be done.
The word living is load-bearing. A spec signed once and filed is a description of the system as someone once imagined it. It drifts from the code silently, looking authoritative until it misleads. ADD’s spec is the opposite: it is a layer that changes as the loop learns. When Specify surfaces a riskier assumption than expected, the spec records it. When Build reveals a missing rule, the spec is updated before the build continues. When Observe catches a production surprise, a spec delta re-enters at Specify — not as a ticket for later, but as the next loop’s starting ground.
In ADD the spec has a fixed internal structure: what the feature Must do, what it must Reject (each rejection paired with a named error code), the After-state once it succeeds, and assumptions ranked lowest-confidence first.
# SPEC.md — tenant-budget-enforcementMust: - block any request that would cause a tenant's spend to exceed monthly_limit. - deduct the estimated token cost atomically before forwarding.Reject: - spend would exceed limit -> BUDGET_EXCEEDED (return 429, no forwarding) - monthly_limit absent or zero -> BUDGET_NOT_CONFIGURED (return 500)After: - running_spend reflects the deduction; the request is forwarded.Assumptions — lowest-confidence first: ⚠ the token cost estimate is available before the upstream call; if only after, the deduction must happen post-response and the atomicity guarantee changes.That flagged assumption is what the human reads first — and resolves in one sentence before any code exists. SDD is what makes Specify more than a prompt: a structured artifact the agent can reason about, not a paragraph it has to interpret.
The failure SDD prevents: fast waste — the agent sprinting confidently past an ambiguity you never noticed, because nobody wrote the assumption down where a person could see and challenge it.
UDD — UI/UX-Driven: the contract at the human boundary
UI/UX-Driven Development is the discipline of designing the human and system boundary before building toward it — user flows, every state a screen must handle, and a design source of truth that exists before implementation begins.
The principle is older than ADD: users use the interface, not the spec. UDD produces two things. First, the user flows — the happy path and the alternative paths, stated in terms the domain has named. Second, the UI states every screen must handle: loading, empty, error, and success. A screen that handles only the success state is an incomplete design, not just an incomplete implementation — and it is far cheaper to discover that gap in a wireframe than in a test failure.
When a milestone includes screens, UDD runs a design-definition loop before the build: review the domain into screens and regions, research and reuse existing components, wireframe the structure low-fidelity, then render and capture a real screen. That capture — an image a person approves — is the design-confirm evidence. The build then matches a layout a human has already seen, rather than discovering what it should be after the code lands.
The AI can generate a prototype from a design system. A person owns the empathy — what the user is trying to accomplish, and what “good” feels like from their side. That ownership cannot be delegated to the agent.
The failure UDD prevents: surface mismatch — a feature that satisfies all its internal tests but delivers the wrong experience at the boundary where the user actually encounters it.
TDD — Test-Driven: the executable definition of done
Test-Driven Development in ADD carries its familiar meaning, with one sharpening: tests assert observable behavior, not internals. This is not a style preference. It is what makes the code underneath disposable.
If a test asserts that a specific internal function was called, or that data is stored in a specific internal structure, then changing those internals breaks the test — and the build constraint becomes “preserve the implementation” rather than “preserve the behavior.” The agent cannot freely regenerate the code if the tests are pinned to the code’s interior shape.
Observable behavior means: given these inputs, the system produces these outputs and side effects, as seen from outside. The contract specifies what comes in and what comes out; the tests confirm that contract holds. Everything between those surfaces is the agent’s to choose, change, or discard.
The other discipline TDD enforces is temporal: tests are written before the implementation, and they must fail before the build begins. A test that passes before any code is written is a false gate that will wave bad code through later. In ai-proxy, the agent repeatedly found and fixed test bugs at this gate — assertions reading the wrong attribute, checking the wrong type — before those bugs could silently pass a broken implementation.
The failure TDD prevents: trust-by-inspection — believing a feature is correct because the code reads plausibly, rather than because its behavior has been proven against a defined standard.
ADD — AI/Build-Driven: the engine that runs on the foundation
AI/Build-Driven is the step where the agent writes code. It earns its place as a distinct competency because the constraint that governs it is precise and non-negotiable:
Make every test pass.Do not change any test.Do not change the contract.Stop and ask if any requirement is unclear — do not guess.Within those walls, the how is entirely the agent’s. It chooses the data structure, the algorithm, the library, the internal organization. It is free to discard and regenerate the code at will, because the tests pin only behavior. That freedom is where the model is genuinely good — and it is the freedom that the first four competencies make safe to grant.
The constraint also defines what failure looks like. An agent that weakens a test to make the suite green has inverted the method — it is now preserving the implementation by adjusting the definition of done. An agent that changes the frozen contract to simplify its build has broken the one human gate. Both moves are forbidden by the internal logic of the method: allow them and you no longer have ADD. You have a fast way to produce code that passes its own tests — exactly the trap the method exists to close.
The failure ADD prevents: unconstrained generation — the model inventing scope, silently changing the interface, or producing plausible-but-wrong code with no executable standard to check it against.
How they stack: the foundation and the engine
The five competencies are not peers at the same level. They form a hierarchy — and the direction of that hierarchy is the whole point.
DDD establishes the vocabulary. SDD uses that vocabulary to specify what must be built. UDD specifies the experience at the boundary where users encounter what was built. TDD produces an executable definition of done from the spec and contract. ADD uses those failing tests as its sole directive.
Each layer depends on the one below. A spec that uses ambiguous domain terms produces ambiguous error codes — and ambiguous error codes produce tests that cannot distinguish one failure from another. An agent that builds without failing tests produces code with no checkable standard.
This architecture also explains why context rot is a foundation problem, not an agent problem. When domain language, the current spec stance, and user-experience intent live only in someone’s head, each new session starts cold. The agent fills the gap with plausible guesses — re-deriving the domain, re-interpreting the spec, re-inventing the UI state handling. State that lives in a chat window decays; state that lives in PROJECT.md does not. In ADD, all three foundation concerns are written into one living document — short enough to read first in any session, durable enough to outlive every milestone.
The arrows in the diagram run both ways. Context flows up into the engine. Corrections flow back down: when a build exposes a domain term that was wrong, you stop, fix the foundation, and come forward again. A passing test built on a broken foundation is still the wrong software, fast.
Why the what can be clamped at all
The thesis of Part 1 — constrain the what, free the how — depends on these five competencies being real, maintained disciplines. Without DDD, the spec uses terms the agent re-interprets per session. Without SDD, the agent fills ambiguity with confident guesses. Without UDD, the what includes no human-boundary definition. Without TDD, “done” has no executable meaning. Without ADD operating under those constraints, every one of those freedoms re-opens.
The five competencies are what makes “constrain the what” an action you can actually take, rather than a slogan. Each closes a specific gap. Together they form the only stable surface the agent can build against.
The code the agent produces is disposable. Regenerate it tomorrow with a different model and the tests should still go green. The artifacts — the domain model, the living spec, the design confirmation, the frozen contract, the red test suite — are the durable assets. They are the what, clamped. The agent’s code is the how, freed.
Next in the series: Part 3 goes into Specify and Scenarios — the first two human-led steps, and the discipline that turns a fuzzy need into an unambiguous, buildable definition. Specify and Scenarios →