Can the Coding Specialist deploy to production on its own?

No. Every destructive action — deploy, force-push, DB migration, domain change — requires an approval_needed card first. The owner clicks Approve; only then does the specialist proceed.
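The approval gate described above can be pictured as a small check that runs before any action. This is a minimal sketch under assumptions: the names `ActionKind`, `GateResult`, and `runAction` are hypothetical and not taken from the product's source; only the list of destructive actions and the `approval_needed` status come from the text.

```typescript
// Hypothetical names throughout; only the destructive-action list and the
// "approval_needed" status are taken from the docs.
type ActionKind = "deploy" | "force_push" | "db_migration" | "domain_change" | "edit_file";

const DESTRUCTIVE: ReadonlySet<ActionKind> = new Set<ActionKind>([
  "deploy",
  "force_push",
  "db_migration",
  "domain_change",
]);

type GateResult =
  | { status: "ran"; action: ActionKind }
  | { status: "approval_needed"; action: ActionKind };

// Lets an action through only if it is non-destructive or already approved.
function runAction(action: ActionKind, approved: ReadonlySet<ActionKind>): GateResult {
  if (DESTRUCTIVE.has(action) && !approved.has(action)) {
    // Nothing is executed here; the caller would surface a card to the owner.
    return { status: "approval_needed", action };
  }
  return { status: "ran", action };
}
```

In this sketch an unapproved `deploy` comes back as `approval_needed`, while an ordinary file edit runs immediately.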

What stops it from touching the rest of my filesystem?

Its cwd is ~/.blackbox/engineering/<task_id>/. The system prompt forbids writes outside that directory, and the delegate wiring spawns the specialist with that cwd, so the Bash tool cannot easily wander outside it.

Does it have GitHub access?

Yes, when a GITHUB_TOKEN is configured. It uses git-mcp and the GitHub MCP for clones, PRs, and issue handling. Without a token, it makes no GitHub writes.

Does it know my brand voice or product details?

Only what the CEO passes in the brief. The specialist has no context about your business beyond the plain-English description it receives.


The Coding Specialist

Head of Engineering. Ships small, working web projects from a plain-English brief — landing pages, scripts, form backends, integrations. Isolated per-task workspace. No production change without your approval.

When the CEO calls on this specialist

The CEO delegates to Coding when the deliverable ends in running code: a landing page that has to go live, a script that syncs Stripe to Airtable, a bug fix on a page already in production, a small integration. From ceo/tools.ts: “USE WHEN: owner asks to build/fix/ship something that ends in running code. DO NOT USE WHEN: the deliverable is a design spec with no code (use design), marketing copy (use content), or a one-line edit the CEO can handle itself.”

What they take as input

The delegate_to_coding_specialist tool expects three fields:

task_id — a short kebab-case slug, e.g. pam-consulting-landing-page. Becomes the workspace directory name.
brief — the full plain-English description: target audience, required sections, colors or assets the owner named, definition of done.
context_files (optional) — absolute paths under ~/.blackbox/ for extra context (brand notes, prior deliverables).
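The three fields above map naturally onto a typed input with a few checks. The sketch below is illustrative only: the field names come from this page, but the interface name `CodingDelegateInput`, the `validateInput` helper, the kebab-case regex, and the rule that an empty brief is rejected are all assumptions, not the tool's actual schema.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Field names are from the docs; the interface and validator are hypothetical.
interface CodingDelegateInput {
  task_id: string;
  brief: string;
  context_files?: string[];
}

// Lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Returns a list of problems; an empty list means the input looks well-formed.
function validateInput(input: CodingDelegateInput, home: string = os.homedir()): string[] {
  const problems: string[] = [];
  if (!KEBAB_CASE.test(input.task_id)) {
    problems.push("task_id must be a kebab-case slug");
  }
  if (input.brief.trim().length === 0) {
    problems.push("brief must not be empty"); // assumed rule, not documented
  }
  const root = path.join(home, ".blackbox");
  for (const file of input.context_files ?? []) {
    // Expand a leading "~/" so the path style used on this page is accepted.
    const expanded = file.startsWith("~/") ? path.join(home, file.slice(2)) : file;
    const resolved = path.resolve(expanded);
    if (!path.isAbsolute(expanded) || !resolved.startsWith(root + path.sep)) {
      problems.push(`context file is not under ~/.blackbox/: ${file}`);
    }
  }
  return problems;
}
```

Resolving each path before the prefix check means an entry such as `~/.blackbox/../secrets` is rejected rather than accepted on its literal prefix.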

What they produce

A working project under ~/.blackbox/engineering/<task_id>/ — source files, build config, a local preview
A one-paragraph summary for the CEO naming what was built, where it lives, and any issues
An ordered list of artifacts (files produced)
Optionally, a preview URL (Railway/Vercel/Cloudflare) once the owner approves deploy
Cards only at milestones or approval gates — no progress narration

Tools they have access to

From apps/engine/src/specialists/coding/spec.ts:

Agent SDK built-ins: Read, Write, Edit, Glob, Grep, Bash, WebFetch, TodoWrite
CEO shared tools: emit_owner_card, append_lesson, write_inter_agent_note, request_peer_review, request_replan
Project-room tools: read/write handoff notes, update status, join a multi-agent project
Deploy + version control (via Bash / MCP): git, GitHub (conditional on GITHUB_TOKEN), Railway, Vercel, Wrangler, Supabase CLIs

Workspace

setupCodingWorkspace(taskId) creates ~/.blackbox/engineering/<taskId>/. The call is idempotent: re-running with the same slug reuses the existing directory. That is how retries and revisions work, since the specialist picks up where the last run left off. The specialist is forbidden from writing outside that path.
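The real implementation of setupCodingWorkspace is not shown on this page. As a rough sketch of the behavior it describes, an idempotent create plus a containment check might look like the following. The `root` parameter and the `isInsideWorkspace` helper are assumptions added for illustration, and the sketch assumes `taskId` has already been validated as a kebab-case slug.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Sketch only. The optional `root` parameter is an addition so the example
// can be pointed at a temporary directory.
function setupCodingWorkspace(
  taskId: string,
  root: string = path.join(os.homedir(), ".blackbox", "engineering"),
): string {
  const dir = path.join(root, taskId);
  // With recursive: true, mkdirSync does not fail if the directory exists,
  // so repeated calls with the same slug reuse it and leave its files alone.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// Hypothetical helper: true when `target` resolves to the workspace itself
// or to a path inside it.
function isInsideWorkspace(workspace: string, target: string): boolean {
  const base = path.resolve(workspace);
  const rel = path.relative(base, path.resolve(base, target));
  return rel === "" || (rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel));
}
```

A check like `isInsideWorkspace` only covers paths that pass through it. It does not sandbox a shell, which is consistent with the page's wording that the Bash tool "cannot easily wander" rather than cannot wander at all.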

Example brief

From an owner message like “build me a landing page for Pam Strategy”, the CEO might call:

delegate_to_coding_specialist(
  task_id: "pam-strategy-landing",
  brief: "Build a one-page static site for Pam Strategy, a solo marketing-strategy consultancy. Audience: SaaS founders. Tone: sharp, opinionated, a little dry. Sections: (1) Hero with headline + 1-sentence subhead + CTA button, (2) Three-problem grid naming the specific pains SaaS founders face on positioning, (3) Footer CTA to book a 30-min intro call at Calendly URL {to-be-supplied}. Stack: Astro + Tailwind. Emit an approval_needed card before any deploy. Do not invent pricing, testimonials, or the Calendly URL; emit a manual_task if missing.",
  context_files: ["~/.blackbox/memory/brand-voice.md"]
)

Example output

The specialist returns a structured result the CEO formats back into a message:

Coding task pam-strategy-landing: success
Workspace: ~/.blackbox/engineering/pam-strategy-landing

Summary: Built a 3-section Astro + Tailwind landing page for Pam Strategy. Hero, problem grid, and footer CTA in place. Preview runs on port 4321. Calendly URL placeholder inserted; emitted manual_task card asking the owner for the real link. No deploy yet — awaiting approval.

Artifacts: src/pages/index.astro, src/components/Hero.astro, src/components/ProblemGrid.astro, src/components/FooterCTA.astro, tailwind.config.js, astro.config.mjs
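To make the shape of that result concrete, here is one way the structured result could be typed and rendered into the message format shown above. The `CodingResult` interface, its field names, the `"failure"` status, and the `Preview:` line are assumptions; the page only states that the result carries a summary, a workspace, an ordered artifact list, and optionally a preview URL.

```typescript
// Hypothetical shape; the actual result type is not shown in the docs.
interface CodingResult {
  taskId: string;
  status: "success" | "failure";
  workspace: string;
  summary: string;
  artifacts: string[]; // ordered list of files produced
  previewUrl?: string; // only present once a deploy has been approved
}

// Renders the result in the same layout as the example output on this page.
function formatResult(r: CodingResult): string {
  const lines = [
    `Coding task ${r.taskId}: ${r.status}`,
    `Workspace: ${r.workspace}`,
    "",
    `Summary: ${r.summary}`,
    "",
    `Artifacts: ${r.artifacts.join(", ")}`,
  ];
  if (r.previewUrl) {
    lines.push("", `Preview: ${r.previewUrl}`); // assumed line, not in the docs
  }
  return lines.join("\n");
}
```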

Related specialists

Design hands off visual specs; Coding executes them. After Coding ships, the Evaluator runs its code-quality rubric before the CEO surfaces the result. The Browser specialist pairs with Coding for visual QA on shipped pages. The Content specialist supplies the copy Coding drops into place.

Frequently asked

What languages and frameworks does it know?

Whatever the underlying Anthropic model knows, which covers most mainstream web stacks. In practice: Astro, Next.js, vanilla HTML/CSS/JS, Python, Node, TypeScript, React. It is not meant for exotic systems work.

How many tries does it get?

The CEO retries a failed build at most twice after an Evaluator FAIL, then emits a decision_required card. The cap keeps credits from being spent on endless retry loops.

Can I see the code it wrote?

Yes. The workspace is on disk. The Power Grid view surfaces the file list; you can browse it like any project.

Will it commit directly to my main branch?

No. It branches, pushes to a feature branch, and opens a PR. Merge stays your call.
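The retry cap mentioned in this FAQ (at most two retries after an Evaluator FAIL, then a decision_required card) amounts to a bounded loop. A minimal sketch, assuming hypothetical `build` and `evaluate` callbacks that stand in for the specialist run and the Evaluator; neither signature is taken from the product:

```typescript
type Verdict = "PASS" | "FAIL";

type Outcome =
  | { kind: "accepted"; attempts: number }
  | { kind: "decision_required"; attempts: number };

// From the docs: at most two retries after an Evaluator FAIL.
const MAX_RETRIES = 2;

// `build` and `evaluate` are hypothetical stand-ins, kept synchronous for brevity.
function runWithRetryCap(build: (attempt: number) => void, evaluate: () => Verdict): Outcome {
  const maxAttempts = 1 + MAX_RETRIES; // the first run plus two retries
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    build(attempt);
    if (evaluate() === "PASS") {
      return { kind: "accepted", attempts: attempt };
    }
  }
  // Cap reached: stop spending and hand the decision back to the owner.
  return { kind: "decision_required", attempts: maxAttempts };
}
```

Under this reading, a task that never passes is built three times in total before the decision goes back to the owner.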