Objective Drift: When Agents Lose the Thread¶

After context compression, agents can continue working productively on a subtly wrong objective — the original intent lost in summarisation.

Learn it hands-on: work through the Objective Drift guided lesson, which includes quizzes.

Why it happens¶

Summarization favors high-frequency content. A constraint such as "do not change public method signatures" appears once. The core task, "refactor for DI", recurs across many messages. So summarization discards the constraint as noise (LangChain on context management). Downstream steps compound the error. Each tool call is consistent with the compressed objective, so the agent builds toward the wrong target with no internal signal.

A second trigger is instruction fade-out. Models deprioritize the initial instructions as history grows, even when those instructions remain present (Bui, 2026 §3.2).

Detection and mitigation¶

Watch for these signals: the agent "completes" without satisfying the original requirement, the output format diverges from the spec, or the agent solves a subtly different problem.

Preserve intent in structured summaries. A named session_intent field survives compression better than prose. LangChain recommends structured summaries that keep task objectives. A session recap formalizes this as a fixed-schema, agent-authored artifact, written at each boundary: compaction, resume, or fork.

Anchor constraints in the system prompt. System-prompt content is less likely to be paraphrased away during summarization.

Use bounded tasks. The Ralph Wiggum Loop bounds each session to one task. Each restart re-reads the original specification from disk.

Add event-driven reminders. Re-inject objectives at decision points (Bui, 2026 §2.3.4).

Example¶

A long-running agent receives this task: "Refactor the UserService class to use dependency injection. Do not change any public method signatures." After dozens of tool calls, compaction compresses the context. The prose summary keeps "refactor UserService for DI" but drops the constraint about method signatures. The agent then renames get_user_by_id to find_user. That fits the refactor goal, but it violates the original constraint.

The fix is a structured session-intent file. You write it before the agent starts, and it survives compression verbatim:

// session_intent.json — written by the orchestrator, re-read after compaction
{
  "objective": "Refactor UserService to use dependency injection",
  "constraints": [
    "Do not change any public method signatures",
    "Do not modify files outside src/services/user_service.py and its tests"
  ],
  "completion_criteria": "All existing tests pass; no public method signatures changed",
  "created_at": "2025-11-14T09:00:00Z"
}

The system prompt instructs the agent to re-read session_intent.json at the start of every new message and before any file modification:

SYSTEM_PROMPT = """
You are a refactoring agent. Before each action:
1. Read session_intent.json
2. Confirm your planned action satisfies all constraints listed there
3. If any constraint would be violated, stop and report instead of proceeding
"""

Together, the structured intent file and the system-prompt anchor keep the exact constraints through summarization and hold the agent's attention on them all session.

When this backfires¶

Short sessions: session_intent.json adds overhead for sessions that never reach compaction.
Exploratory tasks: strict anchoring blocks legitimate course corrections mid-session.
Compaction policy mismatch: structured summaries only help if the compressor keeps named fields, and many paraphrase them anyway.

Key Takeaways¶

Objective drift occurs when summarisation loses task specifics or instructions fade from attention.
The agent appears productive but solves the wrong problem — drift is subtle, not obvious.
Structured summaries with a named session-intent field resist drift better than prose.
Event-driven reminders counter fade-out by re-injecting objectives at decision points.
Bounded sessions (Ralph Wiggum Loop) prevent drift from accumulating across iterations.

The Ralph Wiggum Loop
Attention Latch: When Agents Stay Anchored to Stale Instructions — the structural over-squashing mechanism behind instruction fade-out
Post-Compaction Re-read Protocol — restores instruction compliance after compaction
Event-Driven System Reminders — counters fade-out by injecting targeted reminders
Context Compression Strategies: Offloading and Summarisation — tiered compression that preserves task intent through summarisation
Context Poisoning — hallucinated facts compound through context
Distractor Interference — irrelevant instructions reduce compliance
The Kitchen Sink Session — mixing unrelated tasks fills context with noise
Assumption Propagation — early misunderstandings compound over time, similar to how drift compounds after compression
The Infinite Context Anti-Pattern — context overload dilutes attention, accelerating drift
Token Preservation Backfire — token-saving instructions create a competing objective that undermines task completion
Spec Complexity Displacement — constraints that grow too complex to track reliably, compounding drift risk