Spec Complexity Displacement

Writing a spec doesn’t eliminate engineering precision — it relocates the work. A spec tight enough to drive reliable code generation accumulates schemas, pseudocode, and formal constraints until it becomes code-adjacent. Make it vague and reliability collapses; make it exhaustive and model adherence collapses.

The fallacy

People sell “just write a spec” as a shortcut: describe what you want, skip the cost of building it. The fallacy is that you cannot skip the cost of precision. You can only move it.

A spec precise enough to generate correct code reliably must encode type constraints, algorithm logic, schema definitions, and edge-case coverage. The OpenAI Symphony specification analyzed by Gabriel Gonzalez contains database schemas, algorithm pseudocode, and configuration checklists; it reads as code, not prose (Gonzalez, 2026).

Two failure modes

| Failure | Description | Outcome |
| --- | --- | --- |
| Spec slop | Low-precision prose written at speed | Unreliable agent output; assumptions propagate |
| Over-specification | Excessive detail accumulates beyond model capacity | Adherence to individual instructions degrades as spec grows |

Scott Logic found that Spec Kit produced 2,000+ lines of Markdown per feature and still introduced bugs, while iterative prompting produced working code ten times faster (Scott Logic, 2025). Addy Osmani names the opposite failure the “curse of instructions”: as detail accumulates, adherence to individual instructions degrades (Osmani, O’Reilly). The sweet spot is narrow.

Complexity is conserved

Spec-driven development relocates complexity rather than removing it — planning replaces chaos, but the total work does not shrink (Thoughtworks, 2025).

graph LR
    A["Vague prose spec"] -->|"Spec slop"| B["Unreliable output"]
    C["Calibrated spec"] -->|"Complexity relocated"| D["Reliable output"]
    E["Exhaustive prose spec"] -->|"Curse of instructions"| F["Unreliable output"]

What replaces verbose specs

Formal enforcement gives precision-sensitive work a verification step that prose cannot:

| Mechanism | Encodes | Verifiable |
| --- | --- | --- |
| Type signatures and interfaces | Shape and contract | Yes (compiler) |
| Tests as acceptance criteria | Behavioral requirements | Yes (test runner) |
| Database schemas | Data structure | Yes (migration) |
| Linters and format rules | Style and structure | Yes (CI) |
| Prose spec | Intent, rationale | No |

Reserve prose for what has no formal equivalent: business rationale, priority trade-offs, user intent. Delegate the precision work to artifacts that enforce rather than describe (Anthropic).
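As a rough illustration of the difference between describing and enforcing, the sketch below takes one made-up requirement ("a discount takes a whole percentage between 0 and 100 off an order total") and encodes it as a type signature plus an acceptance test. The names `Money` and `apply_discount`, and the requirement itself, are hypothetical and exist only for this example. Python has no compiler step, so the "compiler" role in the table would be played by a static type checker such as mypy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """An amount in integer cents, so rounding is explicit rather than implied."""

    cents: int

    def __post_init__(self) -> None:
        if self.cents < 0:
            raise ValueError("amount cannot be negative")


def apply_discount(total: Money, percent: int) -> Money:
    """The signature carries shape and contract; a type checker can verify callers."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic: the discounted total is rounded down to a whole cent.
    return Money(total.cents * (100 - percent) // 100)


def test_discount_acceptance() -> None:
    """Behavioral requirements stated as checks a test runner can execute."""
    assert apply_discount(Money(1000), 25) == Money(750)
    assert apply_discount(Money(999), 10) == Money(899)  # 899.1 rounds down
    assert apply_discount(Money(1000), 0) == Money(1000)
    try:
        apply_discount(Money(1000), 101)
    except ValueError:
        pass
    else:
        raise AssertionError("an out-of-range percent should be rejected")


test_discount_acceptance()
```

The prose sentence still has a job here: it says why the discount exists. The rounding rule and the valid range now live where a tool can check them.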

A spec is not the same as code

A spec admits many possible implementations; code is one of them. That makes a spec more abstract and transferable than code, but the precision needed for reliable generation pulls it toward code-like structure. The claim is not that specs are useless. It is that once a spec is precise enough to generate reliable code, the “simpler than writing code” argument collapses.

Example

A team writes an initial spec for a user authentication feature:

"Users should be able to log in with email and password."

After several iterations to improve agent reliability, the spec becomes:

"POST /auth/login accepts { email: string, password: string }. Validate email format with RFC 5322 regex. Hash password using bcrypt with cost factor 12. Return 200 with { token: string, expires_at: ISO8601 } on success. Return 401 with { error: "invalid_credentials" } for unknown email or wrong password. Rate-limit to 5 attempts per IP per 15 minutes using a sliding window; return 429 on breach. Log all attempts to the auth audit table with timestamp, IP, and outcome."

The second version is precise enough to generate reliable code. But it is also a type signature, a schema, a rate-limiting algorithm, and a logging requirement written as prose. The complexity did not disappear. It moved from code into the spec.
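To make the overlap concrete, here is a minimal sketch of just the rate-limiting sentence from that spec, written as code. The class name `SlidingWindowLimiter`, the helper `login_status`, the in-memory storage, and the example IP address are all assumptions for illustration; a real service would likely keep this state in a shared store.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most `max_attempts` per key within any `window_seconds` span."""

    def __init__(self, max_attempts: int = 5, window_seconds: float = 15 * 60) -> None:
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        # Per-IP timestamps of counted attempts, oldest first (in-memory assumption).
        self._attempts = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        attempts = self._attempts[ip]
        # Drop timestamps that have slid out of the window.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        if len(attempts) >= self.max_attempts:
            return False  # the caller answers 429; the rejected attempt is not counted
        attempts.append(now)
        return True


def login_status(limiter: SlidingWindowLimiter, ip: str, credentials_ok: bool, now: float) -> int:
    """Maps the spec's outcomes to status codes: 429, then 200 or 401."""
    if not limiter.allow(ip, now):
        return 429
    return 200 if credentials_ok else 401


limiter = SlidingWindowLimiter()
# Six failed attempts one second apart from the same (documentation-range) IP.
statuses = [login_status(limiter, "203.0.113.7", False, now=float(t)) for t in range(6)]
print(statuses)  # [401, 401, 401, 401, 401, 429]
```

Even this short sketch had to settle questions the prose left open, such as whether a rejected attempt counts toward the limit and whether an attempt exactly 15 minutes old is still inside the window. Those decisions are part of the complexity, wherever they end up being written down.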