Why is it called a "CEO agent"?

Because it does the job of a chief executive in a small business — sets priorities, delegates work, holds the team accountable to the goal, and reports outcomes to the owner.

Does the CEO agent do the actual work?

Rarely. It plans, delegates, reviews, and synthesizes. Specialist agents write the code, draft the copy, crawl the web. Keeping the CEO thin is a design principle.

How does the CEO decide which specialist to call?

It matches the sub-task against each specialist's declared capabilities, tool set, and past performance. The Claude Agent SDK exposes specialists as subagents, each declared with a description of when it should be used.

What happens when the CEO gets stuck?

Circuit breakers fire. If the CEO loops, exceeds budget, or hits a safety rule, the system halts the session, logs the cause, and returns a clear error to the owner.

Can I customize the CEO agent?

On Black Box, the CEO's system prompt is fixed but the operating style — autonomy level, approval thresholds, voice — is tunable per workspace. On Enterprise plans, you can load a custom constitution.

Blog · Definition · April 23, 2026 · 12 min read

What is a CEO agent and how does it work?

The top-level orchestrator that decomposes goals, delegates to specialists, enforces evaluation gates, and synthesizes results — the chief executive of the multi-agent stack.

TL;DR

A CEO agent is the supervisor agent in a multi-agent system. It turns the owner's goal into a plan, routes sub-tasks to specialist agents, checks outputs through an evaluator, and hands the owner a single synthesized result. It does the job of a chief executive in a small business — in software.

Every multi-agent system has a coordinator. The coordinator is sometimes called a supervisor, sometimes an orchestrator, sometimes a manager agent. When the system is trying to behave like a business, the coordinator is usually called a CEO agent — and for good reason. The job maps almost one-to-one onto what a real chief executive does in a small company. This is the long-form definition. The one-liner lives at glossary/ceo-agent.

The precise definition

A CEO agent is the top-level agent in a multi-agent AI system that receives the user's goal, decomposes it into sub-tasks, routes each sub-task to the most appropriate specialist agent, observes returned work, enforces evaluation gates, escalates decisions to the human owner when required, and synthesizes a single final outcome. It is distinguished from specialist agents by scope — it does not do the underlying work itself — and distinguished from a simple router by the fact that it reasons, re-plans, and remembers.

In plain English

When a founder runs a six-person startup, they spend very little time writing code, drafting copy, or cleaning data. They spend most of their time deciding: what to build, who should build it, whether it's ready to ship, what to communicate to the customer. The CEO agent does that job, at software speed.

You type "launch our spring cohort." Behind the scenes, the CEO agent reads its memory of your business, parses the goal into five or six concrete sub-tasks, picks a specialist for each, watches the specialists work, catches problems, loops until the output is good, and delivers a summary. You never wrote the plan. The CEO wrote the plan.

Crucially, a good CEO agent is thin. It does not write the landing page; the content specialist does. It does not deploy the code; the coding specialist does. The CEO's value is judgment — picking the right next move, holding the bar on quality, and getting out of the way when the work is proceeding well.

The history

The "supervisor + workers" pattern is old. In classical AI, hierarchical task networks (HTNs) decomposed high-level tasks into sub-tasks that a planner could then solve. In concurrent computing, the actor model (Hewitt, 1973) introduced isolated actors that communicate only by messages, and Erlang/OTP built the supervision tree on closely related ideas: a parent process that spawns children, restarts them on failure, and reports upstream. Erlang/OTP made this mainstream in production systems.

The language-model version arrived with AutoGen (Microsoft, 2023), which introduced a GroupChatManager that selected which agent spoke next. LangGraph formalized it as a supervisor node. OpenAI's Swarm (late 2024) and the later Agents SDK added handoffs. Anthropic's Claude Agent SDK provides subagent primitives that let a parent agent invoke named children with their own context.

"CEO agent" as a naming convention gained traction because the business-operations framing resonates with the non-technical audience using these products. The mechanics are identical to supervisor/orchestrator/manager agents in the academic literature. The branding signals who the product is for.

Why it is different from a router

A router is stateless. It takes an input and returns which agent should handle it. It does not remember, plan, or retry.

A CEO agent is stateful. Between turns it carries a plan, a scratchpad of what has happened, outstanding sub-tasks, and a view of which specialists are busy. When an evaluator rejects a draft, the CEO decides whether to revise with the same specialist, route to a different one, or escalate to the owner. It reasons about the work as an ongoing project, not a one-shot dispatch.
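The contrast can be made concrete with a minimal sketch. Everything here is illustrative rather than taken from any product: the `ROUTES` keyword table, the `CeoState` field names, and the revise/reroute/escalate thresholds are all assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass, field

# Hypothetical keyword table; a real router might use a classifier instead.
ROUTES = {"code": "coding", "copy": "content", "invoice": "business_ops"}


def route(message: str) -> str:
    """Stateless router: the same input always yields the same specialist."""
    for keyword, specialist in ROUTES.items():
        if keyword in message.lower():
            return specialist
    return "content"  # arbitrary default for this sketch


@dataclass
class CeoState:
    """What a CEO agent carries between turns (field names are illustrative)."""
    plan: list[str] = field(default_factory=list)
    scratchpad: list[str] = field(default_factory=list)
    outstanding: set[str] = field(default_factory=set)
    busy: set[str] = field(default_factory=set)
    rejections: dict[str, int] = field(default_factory=dict)


def on_rejection(state: CeoState, task: str, max_revisions: int = 2) -> str:
    """Decide the next move after the evaluator rejects a draft for `task`."""
    count = state.rejections.get(task, 0) + 1
    state.rejections[task] = count
    state.scratchpad.append(f"{task}: rejection #{count}")
    if count <= max_revisions:
        return "revise"      # same specialist tries again
    if count == max_revisions + 1:
        return "reroute"     # hand the task to a different specialist
    return "escalate"        # ask the owner


state = CeoState(plan=["pricing page"], outstanding={"pricing page"})
decisions = [on_rejection(state, "pricing page") for _ in range(4)]
print(decisions)
```

The router's answer never changes for a given message, while `on_rejection` returns a different decision each time because it consults the history stored in `state`.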

Why it is different from a workflow engine

A workflow engine runs a graph you drew. A CEO agent generates the graph at runtime for each goal, adapts it when steps fail, and reasons about when the goal is met. The workflow engine is Zapier. The CEO agent is the general contractor who decides which trades to call in what order to get your kitchen renovated.
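As a rough illustration of "generating the graph at runtime", the sketch below treats a plan as plain data (sub-task mapped to prerequisites) and derives an execution order from it with Python's standard `graphlib` module. The `execution_waves` helper and the renovation task names are hypothetical; in a real system the dictionary would come from the CEO agent's planning step rather than being written by hand.

```python
from graphlib import TopologicalSorter


def execution_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group sub-tasks into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the plan is circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical plan for the kitchen-renovation analogy: task -> prerequisites.
plan = {
    "demolition": set(),
    "plumbing": {"demolition"},
    "electrical": {"demolition"},
    "cabinets": {"plumbing", "electrical"},
}
print(execution_waves(plan))
# [['demolition'], ['electrical', 'plumbing'], ['cabinets']]
```

Because the graph is just a dictionary, adapting to a failed step amounts to editing the dictionary and recomputing the waves, which is the part a fixed workflow graph cannot do on its own.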

How a CEO agent works, step by step

Inside a typical turn, a CEO agent executes roughly this loop:

1. Parse the goal. Take the user's message. Combine it with workspace memory (brand, prior work, preferences). Surface ambiguities the owner needs to resolve.
2. Plan. Generate a list of sub-tasks with dependencies. Estimate the difficulty, the specialists involved, the tools required, the risk profile.
3. Delegate. For each sub-task, pick a specialist, write a scoped brief, hand off with success criteria.
4. Observe. Receive the returned work. Run it through the evaluator with a rubric. Pass or reject.
5. Re-plan. If rejected, decide: revise, reroute, or escalate. If passed, mark the sub-task done and move to the next.
6. Escalate. Any action flagged high-stakes (spend, send-to-list, deploy) goes into the approval inbox before execution.
7. Synthesize. When all sub-tasks are done or blocked, produce a single structured summary for the owner.
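The loop above can be sketched in a few dozen lines. This is a simplified illustration under stated assumptions, not how any particular product implements it: parsing and planning are assumed to have already produced the `plan` list, specialists and the evaluator are plain callables standing in for model calls, and the names `SubTask` and `run_turn` are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubTask:
    name: str
    specialist: str
    high_stakes: bool = False
    status: str = "pending"  # pending | done | awaiting_approval | blocked


def run_turn(
    plan: list[SubTask],
    specialists: dict[str, Callable[[str], str]],
    evaluate: Callable[[str, str], bool],
    max_attempts: int = 3,
) -> dict[str, list[str]]:
    """Delegate, observe, re-plan, escalate, then synthesize a summary."""
    for task in plan:
        if task.high_stakes:
            # Escalate: park the action until the owner approves it.
            task.status = "awaiting_approval"
            continue
        worker = specialists[task.specialist]
        for attempt in range(1, max_attempts + 1):
            brief = f"{task.name} (attempt {attempt})"  # Delegate with a scoped brief
            work = worker(brief)
            if evaluate(task.name, work):               # Observe via the evaluator
                task.status = "done"
                break
        else:
            task.status = "blocked"                     # Re-plan gave up on this task
    # Synthesize: group sub-task names by final status.
    summary: dict[str, list[str]] = {}
    for task in plan:
        summary.setdefault(task.status, []).append(task.name)
    return summary


def content(brief: str) -> str:
    return f"draft for {brief}"


def passes_on_second_try(name: str, work: str) -> bool:
    return "attempt 1" not in work


plan = [
    SubTask("pricing page", "content"),
    SubTask("waitlist blast", "content", high_stakes=True),
]
summary = run_turn(plan, {"content": content}, passes_on_second_try)
print(summary)
# {'done': ['pricing page'], 'awaiting_approval': ['waitlist blast']}
```

The sketch only ever retries with the same specialist; rerouting to a different one would slot into the retry loop.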

Real-world example

A SaaS founder types: "Our waitlist hit 500. Open up a paid beta with a proper pricing page and a Stripe checkout." The CEO agent:

Plans: waitlist communication, pricing page, Stripe product setup, checkout flow, activation email, analytics event wiring.
Delegates "design the pricing page structure" to Content, "build the page" to Coding, "set up Stripe products" to Business Ops, "wire activation email" to Content and Coding together.
Evaluator reviews the pricing page for clarity, legal exposure, conversion best practices. Flags the "cancel anytime" claim for a tooltip. Coding revises.
Business Ops creates Stripe products, gets back three product IDs, asks the CEO to confirm monthly vs annual defaults. The CEO escalates to the owner via Approval Inbox.
Owner approves monthly default, confirms the "cancel anytime" language. The CEO resumes.
CEO synthesizes: "Paid beta is live. Pricing at /pricing, Stripe IDs created, welcome email queued, three analytics events wired. Waitlist blast ready in drafts — needs your send approval."

From a one-sentence goal to a working paid beta in a single session, with the owner making only two judgment calls and one send approval still pending.
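The delegation choices in this example can be approximated as capability matching with a past-performance tie-break. The sketch below is an assumption about how such matching might look: the `Specialist` fields, the capability labels, and the pass-rate numbers are invented for illustration and are not taken from Black Box or the Claude Agent SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specialist:
    name: str
    capabilities: frozenset[str]
    pass_rate: float  # share of past drafts that cleared the evaluator, 0..1


def pick_specialist(required: set[str], roster: list[Specialist]) -> Specialist:
    """Pick the specialist covering the most required capabilities.

    Ties on coverage are broken by the higher historical pass rate.
    """
    if not required:
        raise ValueError("a sub-task must declare at least one capability")

    def score(s: Specialist) -> tuple[float, float]:
        coverage = len(required & s.capabilities) / len(required)
        return (coverage, s.pass_rate)

    best = max(roster, key=score)
    if not required & best.capabilities:
        raise LookupError(f"no specialist covers any of {sorted(required)}")
    return best


# Hypothetical roster loosely modeled on the example above.
roster = [
    Specialist("Content", frozenset({"copywriting", "page-structure", "email"}), 0.90),
    Specialist("Coding", frozenset({"frontend", "checkout", "email"}), 0.80),
    Specialist("Business Ops", frozenset({"stripe-products", "invoicing"}), 0.85),
]
print(pick_specialist({"stripe-products"}, roster).name)  # Business Ops
print(pick_specialist({"email"}, roster).name)            # Content
```

A sub-task that no specialist covers raises `LookupError`, which is the point where a CEO agent would escalate instead of guessing.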

How Black Box implements this

Black Box's CEO agent runs on the Claude Agent SDK. Its system prompt enforces four principles: decompose before acting, delegate over do, enforce the evaluator gate, and escalate the irreversible. It has read/write access to workspace memory, the full specialist roster (18 specialists), and a set of hooks that let us intercept tool use for safety checks. Specialists are invoked as subagents — each with its own context window — so the CEO stays lean on tokens while the team stays deep on capability.
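To illustrate the idea of intercepting tool use, here is a generic sketch of a guard that diverts high-stakes calls into an approval inbox instead of executing them. It is deliberately not written against the Claude Agent SDK's hook API: `guard_tool`, `HIGH_STAKES_TOOLS`, and the tool names are hypothetical, and the inbox is just a list.

```python
from typing import Any, Callable

# Hypothetical set of tool names treated as irreversible in this sketch.
HIGH_STAKES_TOOLS = {"deploy", "send_to_list", "spend"}


def guard_tool(
    name: str,
    tool: Callable[..., Any],
    approval_inbox: list[dict[str, Any]],
) -> Callable[..., dict[str, Any]]:
    """Wrap `tool` so high-stakes calls are queued for approval, not executed."""

    def guarded(**kwargs: Any) -> dict[str, Any]:
        if name in HIGH_STAKES_TOOLS:
            approval_inbox.append({"tool": name, "args": kwargs})
            return {"status": "pending_approval"}
        return {"status": "ok", "result": tool(**kwargs)}

    return guarded


inbox: list[dict[str, Any]] = []
deploy = guard_tool("deploy", lambda target: f"deployed {target}", inbox)
read_memory = guard_tool("read_memory", lambda key: f"value of {key}", inbox)

print(deploy(target="/pricing"))       # {'status': 'pending_approval'}
print(read_memory(key="brand_voice"))  # {'status': 'ok', 'result': 'value of brand_voice'}
print(inbox)                           # [{'tool': 'deploy', 'args': {'target': '/pricing'}}]
```

Putting the check in a wrapper rather than in the prompt means the safety rule holds even when the model's reasoning goes wrong.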

Circuit breakers cap the session on budget, turn count, and error rate. The CEO's own reasoning is shown in the Action Feed so owners can audit why a decision was made, not just what happened.
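A circuit breaker over budget, turn count, and error rate can be sketched as a small counter object. The thresholds, the `CircuitBreaker` and `CircuitOpen` names, and the minimum-turns rule for the error-rate check are assumptions made for this example, not Black Box's actual limits.

```python
from dataclasses import dataclass


class CircuitOpen(Exception):
    """Raised when a session limit is crossed; the message names the cause."""


@dataclass
class CircuitBreaker:
    max_cost_usd: float = 5.0
    max_turns: int = 40
    max_error_rate: float = 0.5
    min_turns_for_error_rate: int = 4  # avoid tripping on one early failure
    cost_usd: float = 0.0
    turns: int = 0
    errors: int = 0

    def record(self, cost_usd: float, failed: bool = False) -> None:
        """Record one turn and halt the session if any limit is exceeded."""
        self.turns += 1
        self.cost_usd += cost_usd
        self.errors += int(failed)
        if self.cost_usd > self.max_cost_usd:
            raise CircuitOpen("budget exceeded")
        if self.turns > self.max_turns:
            raise CircuitOpen("turn limit exceeded")
        enough_data = self.turns >= self.min_turns_for_error_rate
        if enough_data and self.errors / self.turns > self.max_error_rate:
            raise CircuitOpen("error rate too high")


breaker = CircuitBreaker(max_cost_usd=1.0)
try:
    for _ in range(10):
        breaker.record(cost_usd=0.4)
except CircuitOpen as halt:
    print(f"session halted after {breaker.turns} turns: {halt}")
# session halted after 3 turns: budget exceeded
```

The exception message doubles as the logged cause, so the owner sees why the session stopped rather than only that it did.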

Key takeaways

The CEO agent is the supervisor in a multi-agent system — it plans, delegates, reviews, synthesizes.
A good CEO is thin: it does not do the work, it coordinates the work.
The pattern descends from Erlang/OTP-style supervision trees, built on actor-model ideas, and from AutoGen-style group chat managers.
It differs from a router (stateful vs stateless) and from a workflow engine (plans vs executes a fixed graph).
On Black Box, the CEO agent uses the Claude Agent SDK, writes to workspace memory, and routes irreversible actions to an approval inbox.

Frequently asked questions

Why "CEO agent"?

Because the job maps onto what a chief executive does: set priorities, delegate, hold quality bars, escalate what can't be delegated.

Does the CEO do the actual work?

Rarely. It plans and delegates. Specialists do the work.

How does it pick specialists?

By matching sub-tasks to specialist capabilities declared in the subagent descriptions, plus past-performance signals.

What if it gets stuck?

Circuit breakers halt the loop. The system logs the cause and returns a clear error.

Can I customize it?

Operating style is tunable per workspace. Enterprise plans can load custom constitutions.