Circuit Breaker
The non-LLM safety valve. Five halt conditions, counted deterministically, with production defaults tuned for a solo consultant's budget. An agent that's stuck never runs your card down.
How it works
The circuit breaker lives at apps/api/src/engine/circuit-breaker.ts. It's a session-scoped class that observes the stream of assistant, tool_use, heartbeat, and usage events emitted by each specialist task, maintains per-task counters, and returns a HaltReason when a threshold trips. Plain JS maps and arrays; no async runtime entanglement; every public method wrapped in try/catch. A bug in the breaker must never crash the API server.
Task granularity comes from the heartbeat stream. When a specialist emits phase: "starting", the breaker pushes its ID onto an active-task stack and creates a counter bucket. When it emits phase: "done" or "error", the bucket is dropped. Stdout events route into the task at the top of the stack. Duration and idle checks run every second from a setInterval in session.ts.
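The stack-and-bucket bookkeeping described above can be sketched roughly as follows. This is an illustrative sketch only: the class name, event handler names, and counter fields are assumptions, not the actual internals of circuit-breaker.ts.

```typescript
// Hypothetical sketch of the breaker's per-task bookkeeping.
// Field and method names are illustrative, not the real ones.
type TaskCounters = {
  startedAt: number;   // ms timestamp of phase: "starting"
  lastEventAt: number; // updated on every observed event
  toolCalls: number;
  spentCents: number;
};

class TaskTracker {
  private stack: string[] = [];                      // active-task stack
  private buckets = new Map<string, TaskCounters>(); // per-task counters

  onHeartbeat(taskId: string, phase: "starting" | "done" | "error", now: number): void {
    if (phase === "starting") {
      this.stack.push(taskId);
      this.buckets.set(taskId, { startedAt: now, lastEventAt: now, toolCalls: 0, spentCents: 0 });
    } else {
      // "done" or "error": drop the bucket and remove the task from the stack
      this.buckets.delete(taskId);
      this.stack = this.stack.filter((id) => id !== taskId);
    }
  }

  // Stdout-derived events route to whichever task sits on top of the stack.
  onToolUse(now: number): void {
    const top = this.stack[this.stack.length - 1];
    const bucket = top !== undefined ? this.buckets.get(top) : undefined;
    if (!bucket) return;
    bucket.toolCalls += 1;
    bucket.lastEventAt = now;
  }

  counters(taskId: string): TaskCounters | undefined {
    return this.buckets.get(taskId);
  }
}
```

Plain Map and array state like this keeps every code path synchronous, which is what lets the breaker stay outside the async runtime entirely.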
When a threshold is crossed, the breaker returns a structured HaltReason. The caller kills the subprocess and surfaces an error_alert card with a machine-readable slug in the metadata (tool_call_limit, token_spend_limit, etc.). You see a human explanation; your logs carry the full context.
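The mapping from a HaltReason kind like ToolCallLimit to the snake_case slug in the card metadata could plausibly be a one-liner like this (a hypothetical helper; the actual code may hard-code the slugs instead):

```typescript
// Hypothetical helper: convert a HaltReason kind such as "ToolCallLimit"
// into the snake_case slug carried in error_alert card metadata.
function haltSlug(kind: string): string {
  return kind
    .replace(/([a-z0-9])([A-Z])/g, "$1_$2") // insert "_" at CamelCase boundaries
    .toLowerCase();
}
```

So "ToolCallLimit" becomes "tool_call_limit" and "IdleTimeout" becomes "idle_timeout", keeping the UI string and the machine-readable slug derived from the same source.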
The 5 halt reasons
- ToolCallLimit — more than 50 tool calls per task. Defaults from agent-constitutions/circuit-breaker.md. Tunable via BB_CIRCUIT_BREAKER_TOOL_CAP.
- TokenSpendLimit — more than 5,000¢ ($50) per task. Cost estimated from Anthropic usage blobs via estimateUsageCents() using Sonnet 4 list rates.
- DurationLimit — more than 1,800 seconds (30 minutes) of wall-clock time from task start.
- IdleTimeout — more than 300 seconds (5 minutes) with no observed events. Checked on every 1-second sweep; a genuinely silent task still trips.
- OutputLoop — three consecutive assistant outputs with pairwise Jaccard similarity ≥ 0.95. The classic "stuck in a loop" signature.
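The two time-based conditions reduce to simple arithmetic on the per-task timestamps. A sketch of what one pass of the 1-second sweep might check for a single task — function and parameter names are assumptions, not the real internals:

```typescript
// Sketch of the per-second sweep for one task. startedAtMs and
// lastEventAtMs stand in for whatever timestamps the breaker keeps
// in each counter bucket; names are illustrative.
type SweepHalt =
  | { kind: "DurationLimit"; actualSecs: number; limitSecs: number }
  | { kind: "IdleTimeout"; idleSecs: number; limitSecs: number }
  | null;

function sweepCheck(
  nowMs: number,
  startedAtMs: number,
  lastEventAtMs: number,
  maxTaskDurationSecs = 1800, // production default: 30 min
  maxIdleSecs = 300,          // production default: 5 min
): SweepHalt {
  const durationSecs = Math.floor((nowMs - startedAtMs) / 1000);
  if (durationSecs > maxTaskDurationSecs) {
    return { kind: "DurationLimit", actualSecs: durationSecs, limitSecs: maxTaskDurationSecs };
  }
  const idleSecs = Math.floor((nowMs - lastEventAtMs) / 1000);
  if (idleSecs > maxIdleSecs) {
    return { kind: "IdleTimeout", idleSecs, limitSecs: maxIdleSecs };
  }
  return null;
}
```

Because the sweep only compares timestamps, a task that emits nothing at all still trips IdleTimeout — no event from the task is required for the check to fire.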
What you see in the UI
You see an error_alert card in your Inbox with a short human explanation and two actions: Let me try again (re-run with raised limits) and Pause — I'll investigate. The metadata carries the halt reason slug and the actual counters (e.g. "tool calls: 51 of 50") for your diagnostics view.
Diagnostics also shows a live breaker snapshot — current active tasks, per-task tool calls, per-task token spend, duration, idle time. Useful when you\'re debugging a specialist that keeps tripping on a specific job.
A concrete example
You ask for "deep competitor research on the top 200 real estate brokerages in LA." The Research Specialist starts scraping. After 50 tool calls, the breaker trips ToolCallLimit and halts the task. You get a card: "Research paused after 50 lookups — that\'s a lot. Want me to narrow the scope, or raise the limit?" Two actions. You pick Narrow scope, the CEO re-plans with "top 20" instead, and the new task completes without tripping.
Technical details
// apps/api/src/engine/circuit-breaker.ts
export const DEFAULT_BREAKER_CONFIG: BreakerConfig = {
  maxToolCallsPerTask: 50,
  maxTokenSpendCentsPerTask: 5000, // = $50.00
  maxTaskDurationSecs: 1800,       // = 30 min
  maxIdleSecs: 300,                // = 5 min
  similarityJaccardThreshold: 0.95,
};

export type HaltReason =
  | { kind: "ToolCallLimit"; actual: number; limit: number }
  | { kind: "TokenSpendLimit"; actualCents: number; limitCents: number }
  | { kind: "DurationLimit"; actualSecs: number; limitSecs: number }
  | { kind: "IdleTimeout"; idleSecs: number; limitSecs: number }
  | { kind: "OutputLoop"; windowSize: number; similarityJaccard: number; threshold: number };
Known limitation: when N specialists from the same delegate_parallel call run concurrently, all land on the active-task stack, but only the topmost sees stdout counters. Duration, idle, and heartbeat-liveness are still tracked per task; tool-call and token-spend counters pool into whichever sibling reached the stack first. Acceptable for v1; will be resolved by per-task event attribution in v2.
Related features
- Evaluator gate — the other layer of the safety story.
- The CEO agent — the orchestrator whose tasks the breaker supervises.
- Power Grid — where breaker trips show up as red cards.
FAQ
Why not ask the LLM to decide when to stop?
Because an LLM that's stuck in a loop is the last thing you want asking itself if it's stuck. The breaker is deliberately non-LLM: plain JavaScript counters, a Jaccard-similarity check, and sync code paths. It counts; it doesn't reason. That's the point — a broken safety net is worse than no safety net.
What do I see when the breaker trips?
An error_alert card in your Inbox explaining which halt reason tripped and which task hit it. You can retry with raised limits, hand the task back to the CEO for a new plan, or dismiss. No silent failures.
Can I tune the thresholds?
Yes. Five env vars: BB_CIRCUIT_BREAKER_TOOL_CAP, BB_CIRCUIT_BREAKER_COST_CAP_CENTS, BB_CIRCUIT_BREAKER_DURATION_SECS, BB_CIRCUIT_BREAKER_IDLE_SECS, and BB_CIRCUIT_BREAKER_JACCARD. Invalid values fall back to the production defaults — a bad env var must never disable the breaker.
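The parse-or-default behavior amounts to a few lines. A sketch with a hypothetical helper (envNum is not the actual parser name; in practice you would pass process.env as the first argument):

```typescript
// Hypothetical helper: read a numeric env var, falling back to the
// production default on anything that isn't a finite positive number.
// A bad env var must never disable the breaker.
function envNum(
  env: Record<string, string | undefined>, // pass process.env here
  name: string,
  fallback: number,
): number {
  const raw = env[name];
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}
```

Note that Number("abc") is NaN and Number("") is 0, so both garbage and empty strings fall through to the default rather than producing a zero cap that would halt every task instantly.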
How is loop detection defined exactly?
Three consecutive assistant outputs, tokenized by whitespace (capped at 512 tokens each), compared pairwise. If both consecutive pairs clear Jaccard similarity 0.95, the task trips OutputLoop. "Two empty sets compare as 1.0" is intentional — no output is itself a loop signal.
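That definition translates almost line-for-line into code. A sketch under the stated rules — whitespace tokens capped at 512, both consecutive pairs must clear the threshold, and two empty sets compare as 1.0 (function names are illustrative):

```typescript
// Jaccard similarity over whitespace tokens, capped at 512 tokens per output.
// Two empty sets compare as 1.0 by design: no output is itself a loop signal.
function jaccard(a: string, b: string): number {
  const setA = new Set(a.split(/\s+/).filter(Boolean).slice(0, 512));
  const setB = new Set(b.split(/\s+/).filter(Boolean).slice(0, 512));
  if (setA.size === 0 && setB.size === 0) return 1.0;
  let intersection = 0;
  for (const tok of setA) if (setB.has(tok)) intersection++;
  const union = setA.size + setB.size - intersection;
  return intersection / union;
}

// Three consecutive assistant outputs trip OutputLoop when BOTH
// consecutive pairs clear the similarity threshold.
function isOutputLoop(outputs: string[], threshold = 0.95): boolean {
  if (outputs.length < 3) return false;
  const [x, y, z] = outputs.slice(-3);
  return jaccard(x, y) >= threshold && jaccard(y, z) >= threshold;
}
```

Set-based Jaccard ignores token order and repetition, which is what makes it cheap enough to run on every assistant event without reasoning about the text at all.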
Try Black Box
Safety-by-default. Stuck agents halt cleanly before they burn your budget or your attention.