Web4Guru AI Operations

The Evaluator gate

The Evaluator is a separate specialist with one job: read the deliverable, find the holes, send it back if it fails. Every other specialist's output passes through it. The Evaluator never produces work — only critiques it.

In one breath

  • Independent agent, separate prompt, separate tools.
  • Two-rejection rule prevents thrash; the third attempt is a re-plan, not a re-try.
  • Per-specialist rubric — code is checked differently from copy.

How the gate works

When a specialist returns, the CEO routes the deliverable to the Evaluator with the original brief and a rubric. The Evaluator either accepts (deliverable continues to the owner) or rejects with a structured critique (CEO re-delegates with the critique attached). Two consecutive rejections trigger a re-plan rather than a third try.
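The routing above can be sketched as a small loop. This is a minimal illustration, not the real implementation: every name here (`run_gate`, `delegate`, `evaluate`, `re_plan`, `Verdict`, `MAX_REJECTIONS`) is assumed for the example.

```python
# Sketch of the Evaluator gate loop. All names are illustrative
# assumptions, not Web4Guru's actual API.
from dataclasses import dataclass

MAX_REJECTIONS = 2  # two-rejection rule: the third attempt is a re-plan


@dataclass
class Verdict:
    accepted: bool
    critique: str = ""


def run_gate(brief, rubric, delegate, evaluate, re_plan):
    """Route a specialist deliverable through the Evaluator."""
    critique = None
    for _ in range(MAX_REJECTIONS):
        deliverable = delegate(brief, critique)         # specialist produces work
        verdict = evaluate(deliverable, brief, rubric)  # independent critique
        if verdict.accepted:
            return deliverable                          # continues to the owner
        critique = verdict.critique                     # CEO re-delegates with critique
    return re_plan(brief, critique)                     # two rejections: re-plan, not retry
```

Note that the specialist sees the structured critique on re-delegation, but the loop never runs a third attempt against the same plan.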

Why a separate agent

Self-evaluation by the same agent that produced the work is a known failure mode — models are biased toward their own output. A separate Evaluator with a different system prompt and a different tool belt catches what the producer cannot see in its own writing.

What the rubric checks

Per-task: brief satisfaction, voice match, factual claims have citations, no fabricated data, correct format, no leaked private information. The rubric is per-specialist — what the Evaluator checks for a Coding deliverable is different from what it checks for a Content deliverable.
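One way to picture a per-specialist rubric is a shared base checklist plus specialist-specific checks. The check names beyond those listed in the text (`tests_pass`, `tone_matches_channel`, and the function `rubric_for`) are assumptions for illustration.

```python
# Illustrative rubric shape; check names beyond the documented ones
# are assumptions, not the real rubric.
COMMON_CHECKS = [
    "satisfies_brief",
    "matches_voice",
    "claims_cited",
    "no_fabricated_data",
    "correct_format",
    "no_private_info",
]

RUBRICS = {
    # a Coding deliverable is checked differently from a Content one
    "coding": COMMON_CHECKS + ["tests_pass"],
    "content": COMMON_CHECKS + ["tone_matches_channel"],
}


def rubric_for(specialist: str) -> list[str]:
    """Return the checklist the Evaluator applies for a given specialist."""
    return RUBRICS.get(specialist.lower(), COMMON_CHECKS)
```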

Related