The Coding Specialist
Head of Engineering. Ships small, working web projects from a plain-English brief — landing pages, scripts, form backends, integrations. Isolated per-task workspace. No production change without your approval.
When the CEO calls on this specialist
The CEO delegates to Coding when the deliverable ends in
running code: a landing page that has to go live, a script
that syncs Stripe to Airtable, a bug fix on a page already in
production, a small integration. From
ceo/tools.ts: “USE WHEN: owner asks to
build/fix/ship something that ends in running code. DO NOT
USE WHEN: the deliverable is a design spec with no code
(use design), marketing copy (use content), or a one-line
edit the CEO can handle itself.”
What they take as input
The delegate_to_coding_specialist tool expects three fields:
- task_id: a short kebab-case slug, e.g. pam-consulting-landing-page. Becomes the workspace directory name.
- brief: the full plain-English description (target audience, required sections, colors or assets the owner named, definition of done).
- context_files (optional): absolute paths under ~/.blackbox/ for extra context (brand notes, prior deliverables).
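The three fields above can be checked before a task is handed off. The following TypeScript sketch is illustrative only: the CodingDelegation interface, the validateDelegation helper, the kebab-case pattern, and the tilde expansion are assumptions made for this example, not the engine's actual schema or validation logic.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical input shape mirroring the three documented fields.
interface CodingDelegation {
  task_id: string;
  brief: string;
  context_files?: string[];
}

// Assumed definition of "kebab-case": lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Returns a list of problems; an empty list means the input looks usable.
function validateDelegation(
  input: CodingDelegation,
  home: string = os.homedir(),
): string[] {
  const errors: string[] = [];
  if (!KEBAB_CASE.test(input.task_id)) {
    errors.push("task_id must be a kebab-case slug");
  }
  if (input.brief.trim().length === 0) {
    errors.push("brief must not be empty");
  }
  const root = path.join(home, ".blackbox");
  for (const file of input.context_files ?? []) {
    // Assumption: a leading "~/" stands for the owner's home directory.
    const expanded = file.startsWith("~/") ? path.join(home, file.slice(2)) : file;
    const rel = path.relative(root, path.resolve(expanded));
    const outside =
      rel === "" || rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    if (outside) {
      errors.push(`context file is not under ~/.blackbox/: ${file}`);
    }
  }
  return errors;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue with a delegation at once.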
What they produce
- A working project under ~/.blackbox/engineering/<task_id>/: source files, build config, a local preview
- A one-paragraph summary for the CEO naming what was built, where it lives, any issues
- An ordered list of artifacts (files produced)
- Optionally, a preview URL (Railway/Vercel/Cloudflare) once the owner approves deploy
- Cards only at milestones or approval gates — no progress narration
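One way to picture the deliverables listed above is as a single result object that the CEO turns into a message. The sketch below is a guess at that shape: the CodingResult type, its field names, the formatResult helper, and the "Preview:" line are all hypothetical, chosen to mirror the labels used in the example output later on this page.

```typescript
// Hypothetical result shape; the real engine's types may differ.
interface CodingResult {
  taskId: string;
  status: "success" | "failure";
  workspace: string;
  summary: string;
  artifacts: string[]; // ordered list of files produced
  previewUrl?: string; // only present once the owner approves a deploy
}

// Formats the result as labeled lines, one field per line.
function formatResult(result: CodingResult): string {
  const lines = [
    `Coding task ${result.taskId}: ${result.status}`,
    `Workspace: ${result.workspace}`,
    `Summary: ${result.summary}`,
    `Artifacts: ${result.artifacts.join(", ")}`,
  ];
  if (result.previewUrl !== undefined) {
    lines.push(`Preview: ${result.previewUrl}`); // assumed label, not from the source
  }
  return lines.join("\n");
}
```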
Tools they have access to
From apps/engine/src/specialists/coding/spec.ts:
- Agent SDK built-ins: Read, Write, Edit, Glob, Grep, Bash, WebFetch, TodoWrite
- CEO shared tools: emit_owner_card, append_lesson, write_inter_agent_note, request_peer_review, request_replan
- Project-room tools: read/write handoff notes, update status, join a multi-agent project
- Deploy + version control (via Bash / MCP): git, GitHub (conditional on GITHUB_TOKEN), Railway, Vercel, Wrangler, Supabase CLIs
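The GitHub entry in the list above is the only one described as conditional. A minimal sketch of that kind of gating might look like the following; the deployIntegrations function and the lowercase integration names are invented for illustration and are not taken from spec.ts.

```typescript
// Hypothetical helper: lists deploy/version-control integrations, adding
// GitHub only when a non-empty GITHUB_TOKEN is present in the environment.
function deployIntegrations(env: Record<string, string | undefined>): string[] {
  const integrations = ["git", "railway", "vercel", "wrangler", "supabase"];
  const token = env["GITHUB_TOKEN"];
  if (token !== undefined && token.length > 0) {
    integrations.splice(1, 0, "github"); // keep GitHub next to git
  }
  return integrations;
}
```

Passing the environment in as an argument, rather than reading a global, keeps the function easy to exercise with and without a token.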
Workspace
setupCodingWorkspace(taskId) creates
~/.blackbox/engineering/<taskId>/. Idempotent:
re-running with the same slug reuses the directory. The
specialist is forbidden from writing outside that path.
Re-running a task with the same task_id is how
retries and revisions work — the specialist picks up where
the last run left off.
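The two properties described above, idempotent setup and confinement to the workspace path, can be sketched with Node's standard library. This is not the actual setupCodingWorkspace implementation: the function names, the baseDir parameter, and the slug check below are assumptions for illustration.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for setupCodingWorkspace. With { recursive: true },
// mkdirSync does not fail if the directory already exists, so a second call
// with the same slug reuses the directory and leaves its contents alone.
function setupWorkspaceSketch(
  taskId: string,
  baseDir: string = path.join(os.homedir(), ".blackbox", "engineering"),
): string {
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(taskId)) {
    throw new Error(`invalid task id: ${taskId}`);
  }
  const workspace = path.join(baseDir, taskId);
  fs.mkdirSync(workspace, { recursive: true });
  return workspace;
}

// Hypothetical guard: true only if target resolves to the workspace itself
// or to a path beneath it. Relative targets are resolved against the workspace.
function isInsideWorkspace(workspace: string, target: string): boolean {
  const root = path.resolve(workspace);
  const rel = path.relative(root, path.resolve(root, target));
  if (rel === "") return true;
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}
```

A write guard like isInsideWorkspace would reject both parent-directory traversal (../other-task/file) and absolute paths that point elsewhere on disk.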
Example brief
From an owner message like “build me a landing page for Pam Strategy”, the CEO might call:
delegate_to_coding_specialist(
  task_id: "pam-strategy-landing",
  brief: "Build a one-page static site for Pam Strategy, a solo marketing-strategy consultancy. Audience: SaaS founders. Tone: sharp, opinionated, a little dry. Sections: (1) Hero with headline + 1-sentence subhead + CTA button, (2) Three-problem grid naming the specific pains SaaS founders face on positioning, (3) Footer CTA to book a 30-min intro call at Calendly URL {to-be-supplied}. Stack: Astro + Tailwind. Emit an approval_needed card before any deploy. Do not invent pricing, testimonials, or the Calendly URL; emit a manual_task if missing.",
  context_files: ["~/.blackbox/memory/brand-voice.md"]
)
Example output
The specialist returns a structured result the CEO formats back into a message:
Coding task pam-strategy-landing: success
Workspace: ~/.blackbox/engineering/pam-strategy-landing
Summary: Built a 3-section Astro + Tailwind landing page for Pam Strategy. Hero, problem grid, and footer CTA in place. Preview runs on port 4321. Calendly URL placeholder inserted; emitted manual_task card asking the owner for the real link. No deploy yet; awaiting approval.
Artifacts: src/pages/index.astro, src/components/Hero.astro, src/components/ProblemGrid.astro, src/components/FooterCTA.astro, tailwind.config.js, astro.config.mjs
Related specialists
Design hands off visual specs; Coding executes them. After Coding ships, the Evaluator runs its code-quality rubric before the CEO surfaces the result. The Browser specialist pairs with Coding for visual QA on shipped pages. The Content specialist supplies the copy Coding drops into place.
Frequently asked
- What languages and frameworks does it know?
- Whatever the underlying Anthropic model knows — which is most mainstream web stacks. In practice: Astro, Next.js, vanilla HTML/CSS/JS, Python, Node, TypeScript, React. It does not dabble in exotic systems work.
- How many tries does it get?
- After an Evaluator FAIL, the CEO retries the build at most twice, then emits a decision_required card. The cap keeps credits from being spent on endless retry loops.
- Can I see the code it wrote?
- Yes — the workspace is on disk. The Power Grid view surfaces the file list; you can browse it like any project.
- Will it commit directly to my main branch?
- No. It creates a feature branch, pushes to it, and opens a PR. Merge stays your call.
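The retry limit described in the FAQ above can be sketched as a bounded loop. This is an illustration under stated assumptions, not the CEO's actual control flow: runWithRetries, the Verdict and RetryOutcome types, and the synchronous build and evaluate callbacks are hypothetical, and "at most twice" is read here as one initial attempt plus up to two retries.

```typescript
type Verdict = "PASS" | "FAIL";

// Hypothetical outcome shape; card is set only when retries are exhausted.
interface RetryOutcome {
  verdict: Verdict;
  attempts: number;
  card?: "decision_required";
}

// Runs build, then evaluate, repeating after a FAIL until the retry budget
// is used up. Assumption: maxRetries counts retries after the first attempt.
function runWithRetries(
  build: (attempt: number) => string,
  evaluate: (output: string) => Verdict,
  maxRetries: number = 2,
): RetryOutcome {
  const maxAttempts = 1 + maxRetries;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = build(attempt);
    if (evaluate(output) === "PASS") {
      return { verdict: "PASS", attempts: attempt };
    }
  }
  // Budget exhausted: hand the decision back to the owner instead of looping.
  return { verdict: "FAIL", attempts: maxAttempts, card: "decision_required" };
}
```

Because the loop bound is fixed up front, the worst case under this reading is three builds per task before the owner is asked to decide.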
See also
- How delegation works
- Glossary: specialist agent
- All 18 specialists
- The Evaluator — who reviews Coding output