Circuit Breaker
The non-LLM safety valve. Five halt conditions, counted deterministically, with production defaults tuned for a solo consultant's budget. An agent that's stuck never runs your card down.
How it works
The circuit breaker lives at apps/api/src/engine/circuit-breaker.ts. It's a session-scoped class that observes the stream of assistant, tool_use, heartbeat, and usage events emitted by each specialist task, maintains per-task counters, and returns a HaltReason when a threshold trips. Plain JS maps and arrays; no async runtime entanglement; every public method wrapped in try/catch. A bug in the breaker must never crash the API server.
Task granularity comes from the heartbeat stream. When a specialist emits phase: "starting", the breaker pushes its ID onto an active-task stack and creates a counter bucket. When it emits phase: "done" or "error", the bucket is dropped. Stdout events route into the task at the top of the stack. Duration and idle checks run every second from a setInterval in session.ts.
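The stack-and-bucket bookkeeping described above can be sketched roughly as follows. This is an illustrative sketch only: the class name, event handler names, and counter fields are assumptions, not the actual internals of circuit-breaker.ts.

```typescript
// Hypothetical sketch of the breaker's per-task bookkeeping.
// Field and method names are illustrative, not the real ones.
type TaskCounters = {
  startedAt: number;   // ms timestamp of phase: "starting"
  lastEventAt: number; // updated on every observed event
  toolCalls: number;
  spentCents: number;
};

class TaskTracker {
  private stack: string[] = [];                      // active-task stack
  private buckets = new Map<string, TaskCounters>(); // per-task counters

  onHeartbeat(taskId: string, phase: "starting" | "done" | "error", now: number): void {
    if (phase === "starting") {
      this.stack.push(taskId);
      this.buckets.set(taskId, { startedAt: now, lastEventAt: now, toolCalls: 0, spentCents: 0 });
    } else {
      // "done" or "error": drop the bucket and remove the task from the stack
      this.buckets.delete(taskId);
      this.stack = this.stack.filter((id) => id !== taskId);
    }
  }

  // Stdout-derived events route to whichever task sits on top of the stack.
  onToolUse(now: number): void {
    const top = this.stack[this.stack.length - 1];
    const bucket = top !== undefined ? this.buckets.get(top) : undefined;
    if (!bucket) return;
    bucket.toolCalls += 1;
    bucket.lastEventAt = now;
  }

  counters(taskId: string): TaskCounters | undefined {
    return this.buckets.get(taskId);
  }
}
```

Plain Map and array state like this keeps every code path synchronous, which is what lets the breaker stay outside the async runtime entirely.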
When a threshold is crossed, the breaker returns a structured HaltReason. The caller kills the subprocess and surfaces an error_alert card with a machine-readable slug in the metadata (tool_call_limit, token_spend_limit, etc.). You see a human explanation; your logs carry the full context.
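The mapping from a HaltReason kind like ToolCallLimit to the snake_case slug in the card metadata could plausibly be a one-liner like this (a hypothetical helper; the actual code may hard-code the slugs instead):

```typescript
// Hypothetical helper: convert a HaltReason kind such as "ToolCallLimit"
// into the snake_case slug carried in error_alert card metadata.
function haltSlug(kind: string): string {
  return kind
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // insert "_" at CamelCase boundaries
    .toLowerCase();
}
```

So "ToolCallLimit" becomes "tool_call_limit" and "IdleTimeout" becomes "idle_timeout", keeping the UI string and the machine-readable slug derived from the same source.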
The 5 halt reasons
- ToolCallLimit — more than 50 tool calls per task. Defaults from agent-constitutions/circuit-breaker.md. Tunable via BB_CIRCUIT_BREAKER_TOOL_CAP.
- TokenSpendLimit — more than 5,000¢ ($50) per task. Cost estimated from Anthropic usage blobs via estimateUsageCents() using Sonnet 4 list rates.
- DurationLimit — more than 1,800 seconds (30 minutes) of wall-clock time from task start.
- IdleTimeout — more than 300 seconds (5 minutes) with no observed events. Checked on every 1-second sweep; a genuinely silent task still trips.
- OutputLoop — three consecutive assistant outputs with pairwise Jaccard similarity ≥ 0.95. The classic "stuck in a loop" signature.
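The two time-based conditions reduce to simple arithmetic on the per-task timestamps. A sketch of what one pass of the 1-second sweep might check for a single task — function and parameter names are assumptions, not the real internals:

```typescript
// Sketch of the per-second sweep for one task. startedAtMs and
// lastEventAtMs stand in for whatever timestamps the breaker keeps
// in each counter bucket; names are illustrative.
type SweepHalt =
  | { kind: "DurationLimit"; actualSecs: number; limitSecs: number }
  | { kind: "IdleTimeout"; idleSecs: number; limitSecs: number }
  | null;

function sweepCheck(
  nowMs: number,
  startedAtMs: number,
  lastEventAtMs: number,
  maxTaskDurationSecs = 1800, // production default: 30 min
  maxIdleSecs = 300,          // production default: 5 min
): SweepHalt {
  const durationSecs = Math.floor((nowMs - startedAtMs) / 1000);
  if (durationSecs > maxTaskDurationSecs) {
    return { kind: "DurationLimit", actualSecs: durationSecs, limitSecs: maxTaskDurationSecs };
  }
  const idleSecs = Math.floor((nowMs - lastEventAtMs) / 1000);
  if (idleSecs > maxIdleSecs) {
    return { kind: "IdleTimeout", idleSecs, limitSecs: maxIdleSecs };
  }
  return null;
}
```

Because the sweep only compares timestamps, a task that emits nothing at all still trips IdleTimeout — no event from the task is required for the check to fire.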
What you see in the UI
You see an error_alert card in your Inbox with a short human explanation and two actions: Let me try again (re-run with raised limits) and Pause — I'll investigate. The metadata carries the halt reason slug and the actual counters (e.g. "tool calls: 51 of 50") for your diagnostics view.
Diagnostics also shows a live breaker snapshot — current active tasks, per-task tool calls, per-task token spend, duration, idle time. Useful when you\'re debugging a specialist that keeps tripping on a specific job.
A concrete example
You ask for "deep competitor research on the top 200 real estate brokerages in LA." The Research Specialist starts scraping. After 50 tool calls, the breaker trips ToolCallLimit and halts the task. You get a card: "Research paused after 50 lookups — that\'s a lot. Want me to narrow the scope, or raise the limit?" Two actions. You pick Narrow scope, the CEO re-plans with "top 20" instead, and the new task completes without tripping.
Technical details
// apps/api/src/engine/circuit-breaker.ts
export const DEFAULT_BREAKER_CONFIG: BreakerConfig = {
  maxToolCallsPerTask: 50,
  maxTokenSpendCentsPerTask: 5000, // = $50.00
  maxTaskDurationSecs: 1800,       // = 30 min
  maxIdleSecs: 300,                // = 5 min
  similarityJaccardThreshold: 0.95,
};

export type HaltReason =
  | { kind: "ToolCallLimit"; actual: number; limit: number }
  | { kind: "TokenSpendLimit"; actualCents: number; limitCents: number }
  | { kind: "DurationLimit"; actualSecs: number; limitSecs: number }
  | { kind: "IdleTimeout"; idleSecs: number; limitSecs: number }
  | { kind: "OutputLoop"; windowSize: number; similarityJaccard: number; threshold: number };
Known limitation: when N specialists from the same delegate_parallel call run concurrently, all land on the active-task stack, but only the topmost sees stdout counters. Duration, idle, and heartbeat-liveness are still tracked per task; tool-call and token-spend counters pool into whichever sibling reached the stack first. Acceptable for v1; will be resolved by per-task event attribution in v2.
Related features
- Evaluator gate — the other layer of the safety story.
- The CEO agent — the orchestrator whose tasks the breaker supervises.
- Power Grid — where breaker trips show up as red cards.
FAQ
Why not ask the LLM to decide when to stop?
Because an LLM that's stuck in a loop is the last thing you want asking itself if it's stuck. The breaker is deliberately non-LLM: plain JavaScript counters, a Jaccard-similarity check, and sync code paths. It counts; it doesn't reason. That's the point — a broken safety net is worse than no safety net.
What do I see when the breaker trips?
An error_alert card in your Inbox explaining which halt reason tripped and which task hit it. You can retry with raised limits, hand the task back to the CEO for a new plan, or dismiss. No silent failures.
Can I tune the thresholds?
Yes. Five env vars: BB_CIRCUIT_BREAKER_TOOL_CAP, BB_CIRCUIT_BREAKER_COST_CAP_CENTS, BB_CIRCUIT_BREAKER_DURATION_SECS, BB_CIRCUIT_BREAKER_IDLE_SECS, and BB_CIRCUIT_BREAKER_JACCARD. Invalid values fall back to the production defaults — a bad env var must never disable the breaker.
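The parse-or-default behavior amounts to a few lines. A sketch with a hypothetical helper (envNum is not the actual parser name; in practice you would pass process.env as the first argument):

```typescript
// Hypothetical helper: read a numeric env var, falling back to the
// production default on anything that isn't a finite positive number.
// A bad env var must never disable the breaker.
function envNum(
  env: Record<string, string | undefined>, // pass process.env here
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}
```

Note that Number("abc") is NaN and Number("") is 0, so both garbage and empty strings fall through to the default rather than producing a zero cap that would halt every task instantly.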
How is loop detection defined exactly?
Three consecutive assistant outputs, tokenized by whitespace (capped at 512 tokens each), compared pairwise. If both consecutive pairs clear Jaccard similarity 0.95, the task trips OutputLoop. "Two empty sets compare as 1.0" is intentional — no output is itself a loop signal.
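That definition translates almost line-for-line into code. A sketch under the stated rules — whitespace tokens capped at 512, both consecutive pairs must clear the threshold, and two empty sets compare as 1.0 (function names are illustrative):

```typescript
// Jaccard similarity over whitespace tokens, capped at 512 tokens per output.
// Two empty sets compare as 1.0 by design: no output is itself a loop signal.
function jaccard(a: string, b: string): number {
  const setA = new Set(a.split(/\s+/).filter(Boolean).slice(0, 512));
  const setB = new Set(b.split(/\s+/).filter(Boolean).slice(0, 512));
  if (setA.size === 0 && setB.size === 0) return 1.0;
  let intersection = 0;
  for (const tok of setA) if (setB.has(tok)) intersection++;
  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

// Three consecutive assistant outputs trip OutputLoop when BOTH
// consecutive pairs clear the similarity threshold.
function isOutputLoop(outputs: string[], threshold = 0.95): boolean {
  if (outputs.length < 3) return false;
  const [x, y, z] = outputs.slice(-3);
  return jaccard(x, y) >= threshold && jaccard(y, z) >= threshold;
}
```

Set-based Jaccard ignores token order and repetition, which is what makes it cheap enough to run on every assistant event without reasoning about the text at all.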
Try Black Box
Safety-by-default. Stuck agents halt cleanly before they burn your budget or your attention.