
Spec Complexity Displacement

Writing a spec doesn’t eliminate engineering precision — it relocates the work. A spec tight enough to drive reliable code generation accumulates schemas, pseudocode, and formal constraints until it becomes code-adjacent. Make it vague and reliability collapses; make it exhaustive and model adherence collapses.

The fallacy

People sell “just write a spec” as a shortcut: describe what you want, skip the cost of building it. The fallacy is that you cannot skip the cost of precision. You can only move it.

A spec precise enough to reliably generate correct code must encode type constraints, algorithm logic, schema definitions, and edge case coverage. The OpenAI Symphony specification analyzed by Gabriel Gonzalez contains database schemas, algorithm pseudocode, and configuration checklists: it reads as code, not prose (Gonzalez, 2026).

Two failure modes

| Failure | Description | Outcome |
| --- | --- | --- |
| Spec slop | Low-precision prose written at speed | Unreliable agent output; assumptions propagate |
| Over-specification | Excessive detail accumulates beyond model capacity | Adherence to individual instructions degrades as spec grows |

Scott Logic found Spec Kit produced 2,000+ lines of Markdown per feature — still introducing bugs — while iterative prompting produced working code ten times faster (Scott Logic, 2025). Addy Osmani names the opposing failure the “curse of instructions”: as detail accumulates, adherence to individual instructions degrades (Osmani, O’Reilly). The sweet spot is narrow.

Complexity is conserved

Spec-driven development relocates complexity rather than removing it — planning replaces chaos, but the total work does not shrink (Thoughtworks, 2025).

graph LR
    A["Vague prose spec"] -->|"Spec slop"| B["Unreliable output"]
    C["Calibrated spec"] -->|"Complexity relocated"| D["Reliable output"]
    E["Exhaustive prose spec"] -->|"Curse of instructions"| F["Unreliable output"]

What replaces verbose specs

Formal enforcement gives precision-sensitive work a verification step that prose cannot:

| Mechanism | Encodes | Verifiable |
| --- | --- | --- |
| Type signatures and interfaces | Shape and contract | Yes (compiler) |
| Tests as acceptance criteria | Behavioral requirements | Yes (test runner) |
| Database schemas | Data structure | Yes (migration) |
| Linters and format rules | Style and structure | Yes (CI) |
| Prose spec | Intent, rationale | No |

Reserve prose for what has no formal equivalent: business rationale, priority trade-offs, user intent. Delegate the precision work to artifacts that enforce rather than describe (Anthropic).
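As a minimal sketch of the first two rows of the table, the snippet below expresses a contract as a type signature and its behavioral requirements as tests. The function `normalize_email` and its rules (trim whitespace, lowercase) are hypothetical, invented only for illustration; in Python a type checker such as mypy would play the role the table assigns to the compiler.

```python
def normalize_email(raw: str) -> str:
    """Hypothetical contract: the signature fixes the shape (str in, str out)."""
    # Assumed business rule for illustration: trim whitespace, then lowercase.
    return raw.strip().lower()


# Acceptance criteria written as tests: each one is a requirement
# that a test runner can check, rather than a sentence an agent may skim.
def test_strips_surrounding_whitespace() -> None:
    assert normalize_email("  a@example.com ") == "a@example.com"


def test_lowercases_address() -> None:
    assert normalize_email("A@Example.COM") == "a@example.com"
```

The prose that remains is the part with no formal equivalent, for example why addresses are normalized at all.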

A spec is not the same as code

A spec describes a space of acceptable implementations; code is one point in that space. That makes a spec more abstract and more transferable than code. The claim is not that specs are useless. It is that the precision needed for reliable generation pulls a spec toward code-like structure, and once it gets there the “simpler than writing code” argument collapses.

Example

A team writes an initial spec for a user authentication feature:

"Users should be able to log in with email and password."

After several iterations to improve agent reliability, the spec becomes:

"POST /auth/login accepts { email: string, password: string }. Validate email format with RFC 5322 regex. Hash password using bcrypt with cost factor 12. Return 200 with { token: string, expires_at: ISO8601 } on success. Return 401 with { error: "invalid_credentials" } for unknown email or wrong password. Rate-limit to 5 attempts per IP per 15 minutes using a sliding window; return 429 on breach. Log all attempts to the auth audit table with timestamp, IP, and outcome."

The second version is precise enough to generate reliable code. But it is also a type signature, a schema, a rate-limiting algorithm, and a logging requirement written as prose. The complexity did not disappear. It moved from code into the spec.
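To make that concrete, here is a rough sketch of only the rate-limiting sentence and the request shape from the second spec, written as code. The names `LoginRequest`, `SlidingWindowLimiter`, and `login_status` are hypothetical, and the sketch assumes that rejected attempts are not themselves counted toward the limit, a detail the prose spec leaves open.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Deque, Dict, TypedDict

MAX_ATTEMPTS = 5            # "5 attempts per IP"
WINDOW_SECONDS = 15 * 60    # "per 15 minutes"


class LoginRequest(TypedDict):
    """Request body from the spec: { email: string, password: string }."""
    email: str
    password: str


@dataclass
class SlidingWindowLimiter:
    """Keeps recent attempt timestamps per IP (hypothetical helper)."""
    attempts: Dict[str, Deque[float]] = field(
        default_factory=lambda: defaultdict(deque)
    )

    def allow(self, ip: str, now: float) -> bool:
        window = self.attempts[ip]
        # Drop attempts that have slid out of the 15-minute window.
        while window and now - window[0] >= WINDOW_SECONDS:
            window.popleft()
        if len(window) >= MAX_ATTEMPTS:
            return False  # Assumption: rejected attempts are not recorded.
        window.append(now)
        return True


def login_status(limiter: SlidingWindowLimiter, ip: str, now: float,
                 credentials_ok: bool) -> int:
    """Map one attempt to the status codes named in the spec."""
    if not limiter.allow(ip, now):
        return 429
    return 200 if credentials_ok else 401
```

One sentence of prose became a data structure, an eviction loop, and a decision the spec never stated. Password hashing, token issuance, and audit logging are omitted here and would each add comparable detail.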
