Event-Driven System Reminders¶
Inject targeted guidance at specific points during agent execution to combat instruction fade-out and reinforce safety constraints without bloating the static system prompt.
Learn it hands-on with When the Prompt Fades — a guided lesson with quizzes.
The problem: instruction fade-out¶
Static system prompts lose effect over long sessions. The model gradually deprioritizes the initial instructions as the conversation history grows, even when those instructions stay present in context (Bui, 2025 §3.2).
This differs from context compression. Instructions can survive compaction but still fail to shape behavior, because they sit in a low-attention region of the context (Bui, 2025 §3.2). Research on long-context LLMs confirms that models attend most reliably to content at the start and end of the context, and recall middle content less well (Liu et al., 2023).
Event detectors¶
Rather than repeat every instruction continuously, event detectors watch for specific conditions that warrant a targeted re-injection (Bui, 2025 §2.3.4):
- Repeated tool failures — the agent is stuck in a retry loop
- Approaching token budget — context pressure calls for a strategy change
- Safety violations — a dangerous command was attempted or approval was denied
- Extended execution time — long-running sessions where fade-out is likely
Each detector triggers a reminder injection at the next decision point rather than interrupting mid-execution.
Template resolution¶
Each triggered event is matched against reminder templates. The templates combine static guidance with dynamic variables — current file, last error, iteration count, tool failure count. They fall into three categories (Bui, 2025 §2.3.4):
- Safety guardrails — reinforce constraints on destructive operations
- Tool-usage guidance — redirect tool selection when the agent uses the wrong approach
- Error-recovery strategies — supply escalating recovery tactics
Escalating severity via guardrail counters¶
Guardrail counters track how often violations occur and raise reminder severity to match (Bui, 2025 §2.3.4):
| Count | Severity | Tone |
|---|---|---|
| 1 | Advisory | Suggest alternative approach |
| 2--3 | Warning | State constraint explicitly |
| 4+ | Mandatory | Require compliance, block progress |
This graduated response avoids overreacting to a single misstep, while still stopping persistent violations.
User-role injection¶
Inject reminders as user messages rather than append them to the system prompt. User messages appear in the conversation history and stay in the model's attention window more reliably than system prompt additions. This placement also lets explicit user commands override a reminder when appropriate (Bui, 2025 §2.3.4).
Graceful degradation¶
If an event detector fails, the system carries on without injecting a reminder. Reminders are additive safety, not critical-path requirements. The agent runs on its baseline system prompt, which still works. This stops a failure in the reminder infrastructure from cascading into an agent failure (Bui, 2025 §2.3.4).
Example¶
This minimal Python implementation shows event-driven reminder injection. A ReminderMiddleware class wraps the message list, watches tool call outcomes, and injects a user-role message when a detector fires.
class ReminderMiddleware:
TOOL_FAILURE_THRESHOLD = 3
TEMPLATES = {
"repeated_tool_failure": (
"You have called `{tool}` {count} times without success. "
"Stop retrying. Review what you know and try a different approach."
),
"approaching_token_budget": (
"You are approaching the context limit. Summarise your findings "
"so far in under 200 words before continuing."
),
}
def __init__(self):
self.tool_failure_counts: dict[str, int] = {}
def record_tool_result(self, tool_name: str, is_error: bool) -> str | None:
if not is_error:
self.tool_failure_counts.pop(tool_name, None)
return None
self.tool_failure_counts[tool_name] = (
self.tool_failure_counts.get(tool_name, 0) + 1
)
count = self.tool_failure_counts[tool_name]
if count >= self.TOOL_FAILURE_THRESHOLD:
return self.TEMPLATES["repeated_tool_failure"].format(
tool=tool_name, count=count
)
return None
def inject(self, messages: list, reminder: str) -> list:
# Inject as a user message so it lands in the attention window
return messages + [{"role": "user", "content": reminder}]
A reminder is generated only when the failure count crosses the threshold. The inject method appends it as a user message, not a system prompt addition, in line with the user-role injection pattern above. If record_tool_result is never called, for example because the detector crashes, messages is returned unchanged and graceful degradation holds.
When this backfires¶
Event-driven reminders help long-running, multi-step agents, but they cost more than they are worth in simpler settings:
- Short sessions: the reminder infrastructure is pure overhead when a session rarely runs past a few dozen exchanges. Instruction fade-out is negligible, a well-structured static system prompt covers the case, and the detector and template machinery adds complexity for no benefit.
- Detector false positives: a badly tuned failure threshold fires on normal retries and injects redundant guidance into a working flow. The piled-up user-role injections consume tokens and can themselves sit in the low-attention middle of the context, recreating the problem they are meant to solve.
- Template drift: if you do not keep reminder templates consistent with the system prompt, injected user messages can contradict the baseline instructions and produce confused behavior that is harder to debug than plain fade-out.
- Context token pressure: each injected reminder consumes tokens. Under a tight context budget, frequent injection speeds up the context pressure it aims to ease (Bui, 2025 §2.3.4).
Prefer event-driven reminders for long-running or safety-critical agents. For short, well-scoped tasks, a well-structured static system prompt carries lower risk.
Key Takeaways¶
- Static system prompts fade in effectiveness over extended sessions; event-driven reminders counter this.
- Detect specific conditions (repeated failures, budget pressure, safety violations) rather than injecting reminders on a schedule.
- Escalate reminder severity via guardrail counters: advisory at 1 violation, warning at 2–3, mandatory at 4+.
- Inject reminders as user messages for better attention persistence than system prompt additions.
- Design reminders as additive safety — detector failures must not break the agent.
Related¶
- Objective Drift: When Agents Lose the Thread
- Critical Instruction Repetition
- Hooks and Lifecycle Events: Intercepting Agent Behavior
- Steering Running Agents
- Domain-Specific System Prompts
- Post-Compaction Re-read Protocol
- Context Compression Strategies: Offloading and Summarisation — the compression mechanism that event-driven reminders complement
- The Instruction Compliance Ceiling — why adding more instructions past a threshold degrades compliance; reminders counter this without adding static rules