Web4Guru AI Operations

The Coding Specialist

Head of Engineering. Ships small, working web projects from a plain-English brief — landing pages, scripts, form backends, integrations. Isolated per-task workspace. No production change without your approval.

When the CEO calls on this specialist

The CEO delegates to Coding when the deliverable ends in running code: a landing page that has to go live, a script that syncs Stripe to Airtable, a bug fix on a page already in production, a small integration. From ceo/tools.ts: “USE WHEN: owner asks to build/fix/ship something that ends in running code. DO NOT USE WHEN: the deliverable is a design spec with no code (use design), marketing copy (use content), or a one-line edit the CEO can handle itself.”

What they take as input

The delegate_to_coding_specialist tool expects three fields:

  • task_id — a short kebab-case slug, e.g. pam-consulting-landing-page. Becomes the workspace directory name.
  • brief — the full plain-English description: target audience, required sections, colors or assets the owner named, definition of done.
  • context_files (optional) — absolute paths under ~/.blackbox/ for extra context (brand notes, prior deliverables).
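
As a rough illustration of the three fields above, the input could be typed and sanity-checked as follows. This is a minimal sketch: the interface name, the validateDelegation helper, and the exact checks are hypothetical and are not taken from ceo/tools.ts. It also assumes context paths are written with a literal ~/.blackbox/ prefix and does not guard against ".." segments.

```typescript
// Hypothetical shape of the delegate_to_coding_specialist input.
// Field names come from the list above; types and checks are assumptions.
interface CodingDelegation {
  task_id: string;
  brief: string;
  context_files?: string[];
}

// Lowercase alphanumeric words joined by single hyphens.
const KEBAB_CASE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Returns a list of problems; an empty list means the input looks usable.
function validateDelegation(input: CodingDelegation): string[] {
  const problems: string[] = [];
  if (!KEBAB_CASE.test(input.task_id)) {
    problems.push("task_id must be a kebab-case slug");
  }
  if (input.brief.trim().length === 0) {
    problems.push("brief must not be empty");
  }
  for (const file of input.context_files ?? []) {
    // Assumption: paths are written with a leading ~/.blackbox/ prefix.
    if (!file.startsWith("~/.blackbox/")) {
      problems.push(`context file outside ~/.blackbox/: ${file}`);
    }
  }
  return problems;
}
```

Checking the slug before delegation matters because task_id becomes a directory name, so a value with spaces or slashes would produce an awkward or unsafe workspace path.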

What they produce

  • A working project under ~/.blackbox/engineering/<task_id>/ — source files, build config, a local preview
  • A one-paragraph summary for the CEO naming what was built, where it lives, any issues
  • An ordered list of artifacts (files produced)
  • Optionally, a preview URL (Railway/Vercel/Cloudflare) once the owner approves deploy
  • Cards only at milestones or approval gates — no progress narration

Tools they have access to

From apps/engine/src/specialists/coding/spec.ts:

  • Agent SDK built-ins: Read, Write, Edit, Glob, Grep, Bash, WebFetch, TodoWrite
  • CEO shared tools: emit_owner_card, append_lesson, write_inter_agent_note, request_peer_review, request_replan
  • Project-room tools: read/write handoff notes, update status, join a multi-agent project
  • Deploy + version control (via Bash / MCP): git, GitHub (conditional on GITHUB_TOKEN), Railway, Vercel, Wrangler, Supabase CLIs

Workspace

setupCodingWorkspace(taskId) creates ~/.blackbox/engineering/<taskId>/, where taskId is the task_id slug from the delegation. The call is idempotent: re-running with the same slug reuses the existing directory. That reuse is how retries and revisions work, since the specialist picks up where the last run left off. The specialist is forbidden from writing outside that path.
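
The two properties described here, idempotent creation and confinement to the workspace path, can be sketched as below. This is not the engine's setupCodingWorkspace: the names setupWorkspaceSketch and isInsideWorkspace, and the optional root parameter, are hypothetical. The sketch assumes a Node.js runtime and a task id that has already been validated as a slug.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed default root; the text only states ~/.blackbox/engineering/<taskId>/.
const DEFAULT_ROOT = path.join(os.homedir(), ".blackbox", "engineering");

// Hypothetical re-implementation, not the engine's setupCodingWorkspace.
function setupWorkspaceSketch(taskId: string, root: string = DEFAULT_ROOT): string {
  const dir = path.join(root, taskId);
  // recursive: true makes this a no-op when the directory already exists,
  // so files from an earlier run are left in place.
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}

// True when target resolves to the workspace itself or to a path inside it.
function isInsideWorkspace(workspace: string, target: string): boolean {
  const base = path.resolve(workspace);
  const rel = path.relative(base, path.resolve(base, target));
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}
```

A write guard built on isInsideWorkspace would reject targets such as ../other-task/file.txt while accepting src/pages/index.astro.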

Example brief

From an owner message like “build me a landing page for Pam Strategy”, the CEO might call:

delegate_to_coding_specialist(
  task_id: "pam-strategy-landing",
  brief: "Build a one-page static site for Pam Strategy, a solo marketing-strategy consultancy. Audience: SaaS founders. Tone: sharp, opinionated, a little dry. Sections: (1) Hero with headline + 1-sentence subhead + CTA button, (2) Three-problem grid naming the specific pains SaaS founders face on positioning, (3) Footer CTA to book a 30-min intro call at Calendly URL {to-be-supplied}. Stack: Astro + Tailwind. Emit an approval_needed card before any deploy. Do not invent pricing, testimonials, or the Calendly URL — emit a manual_task if missing.",
  context_files: ["~/.blackbox/memory/brand-voice.md"]
)

Example output

The specialist returns a structured result the CEO formats back into a message:

Coding task pam-strategy-landing: success
Workspace: ~/.blackbox/engineering/pam-strategy-landing
Summary: Built a 3-section Astro + Tailwind landing page for Pam Strategy. Hero, problem grid, and footer CTA in place. Preview runs on port 4321. Calendly URL placeholder inserted; emitted manual_task card asking the owner for the real link. No deploy yet — awaiting approval.
Artifacts: src/pages/index.astro, src/components/Hero.astro, src/components/ProblemGrid.astro, src/components/FooterCTA.astro, tailwind.config.js, astro.config.mjs
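
One way to picture the structured result behind this message is the sketch below. The CodingResult type and formatResult function are hypothetical; the field names are inferred from the example output, and the "failure" status value and the Preview line are assumptions that the text does not confirm.

```typescript
// Hypothetical result shape; field names are inferred from the example output.
interface CodingResult {
  taskId: string;
  status: "success" | "failure"; // only "success" appears in the text
  workspace: string;
  summary: string;
  artifacts: string[]; // ordered list of files produced
  previewUrl?: string; // only present once a deploy has been approved
}

// Renders the result in the four-line layout shown above.
function formatResult(r: CodingResult): string {
  const lines = [
    `Coding task ${r.taskId}: ${r.status}`,
    `Workspace: ${r.workspace}`,
    `Summary: ${r.summary}`,
    `Artifacts: ${r.artifacts.join(", ")}`,
  ];
  // Assumption: a preview URL, when present, gets its own trailing line.
  if (r.previewUrl) {
    lines.push(`Preview: ${r.previewUrl}`);
  }
  return lines.join("\n");
}
```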

Related specialists

Design hands off visual specs; Coding executes them. After Coding ships, the Evaluator runs its code-quality rubric before the CEO surfaces the result. The Browser specialist pairs with Coding for visual QA on shipped pages. The Content specialist supplies the copy Coding drops into place.

Frequently asked

What languages and frameworks does it know?
Whatever the underlying Anthropic model knows — which is most mainstream web stacks. In practice: Astro, Next.js, vanilla HTML/CSS/JS, Python, Node, TypeScript, React. It does not dabble in exotic systems work.
How many tries does it get?
The CEO retries a failed build at most twice after an Evaluator FAIL, then emits a decision_required card. The cap keeps credits from being spent on endless retry loops.
Can I see the code it wrote?
Yes — the workspace is on disk. The Power Grid view surfaces the file list; you can browse it like any project.
Will it commit directly to my main branch?
No. It creates a feature branch, pushes to it, and opens a PR. Merge stays your call.
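
The retry cap from the "How many tries does it get?" answer can be sketched as a bounded loop. This is an illustration only: runWithRetryCap and its types are hypothetical, and it assumes "at most twice" means one initial build plus up to two retries.

```typescript
type Verdict = "PASS" | "FAIL";

type Outcome =
  | { kind: "done"; attempts: number }
  | { kind: "decision_required"; attempts: number };

// Assumption: one initial build plus up to two retries after an Evaluator FAIL.
const MAX_RETRIES = 2;

// buildAndEvaluate stands in for "run the specialist, then the Evaluator".
function runWithRetryCap(buildAndEvaluate: (attempt: number) => Verdict): Outcome {
  const maxAttempts = 1 + MAX_RETRIES;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (buildAndEvaluate(attempt) === "PASS") {
      return { kind: "done", attempts: attempt };
    }
  }
  // Cap reached: stop spending credits and hand the decision to the owner.
  return { kind: "decision_required", attempts: maxAttempts };
}
```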
