Web4Guru AI Operations

The Meta-Learning Loop

Black Box and the agents that build Black Box are one connected learning system. Guidance flows both ways. The product teaches its builders; the builders teach the product. This is the moat.

How it works

At runtime, the product's agents — the CEO, the 18 specialists, the Evaluator — read from a shared guidance directory at ~/.blackbox/guidance/. This contains the orchestration playbook, the lessons-learned log, the agent constitutions, the Skill Pack registry, and inter-agent notes. The CEO consults these files through the consult_guidance MCP tool before making any non-trivial decision.
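Conceptually, that lookup is a topic search over the guidance files. A minimal sketch, assuming a pure in-memory view (the real consult_guidance MCP tool reads the files from ~/.blackbox/guidance/ on disk; the function below and its signature are illustrative assumptions, not the shipped code):

```typescript
// Hypothetical sketch of what a consult_guidance lookup does: given the
// loaded guidance files, return the ones relevant to a topic. The real MCP
// tool reads ~/.blackbox/guidance/ from disk; this pure version is for
// illustration only.
function consultGuidance(
  guidance: Record<string, string>,
  topic: string,
): string[] {
  const needle = topic.toLowerCase();
  return Object.entries(guidance)
    .filter(([, text]) => text.toLowerCase().includes(needle))
    .map(([file]) => file);
}

// The CEO would run something like this before a non-trivial decision:
const relevant = consultGuidance(
  {
    "orchestration-playbook.md": "Plan before delegating to specialists.",
    "inter-agent-notes.md": "Misc coordination notes.",
  },
  "delegat",
); // → ["orchestration-playbook.md"]
```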

At buildtime, the Claude Code agents that write Black Box code read the same files. The repo's CLAUDE.md references the guidance directory; subagents spawned by Claude Code have read/write access; the per-repo memory files live alongside the product's shared guidance. When we improve a CEO prompt, we update the guidance. When we discover a better delegation pattern during a build session, we write it to the guidance. The product picks it up on next restart.

Four feedback edges make up the loop:

  1. Product → Evaluator → Lessons. The Evaluator reviews deliverables, notes recurring weaknesses, writes them to lessons-learned/<specialist>-patterns.md.
  2. Lessons → Build agents. In the next build session, our Claude Code agents read the product lessons, refactor the specialist's prompt or tool set, and ship an update.
  3. Build agents → Orchestration playbook. Patterns discovered during development (like "plan-before-delegate") go into the shared orchestration playbook.
  4. Playbook → Product agents. The product's CEO uses the same playbook. The pattern lands in every customer session immediately.
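Edge 1 above can be sketched as an append to the specialist's patterns file. This models the filesystem as a Map for clarity; the function name, dated entry format, and signature are assumptions, not the shipped implementation:

```typescript
// Hypothetical sketch of edge 1: the Evaluator appends a recurring weakness
// to lessons-learned/<specialist>-patterns.md. The Map stands in for the
// ~/.blackbox/guidance/ directory on disk.
function appendLesson(
  files: Map<string, string>,
  specialist: string,
  lesson: string,
  date: string,
): string {
  const path = `lessons-learned/${specialist}-patterns.md`;
  const entry = `- ${date}: ${lesson}\n`;
  files.set(path, (files.get(path) ?? "") + entry);
  return path; // the file the build agents read in edge 2
}
```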

This is not "we write good docs." It\'s one guidance system that both contexts read and write. The runtime and buildtime agents have the same shared memory about how this kind of work is done well.

What you see in the UI

As an owner, you rarely see the meta-loop directly — and that's fine. What you see is: the CEO gets a little better each release. Error modes you hit once stop happening. Cards that used to need three back-and-forths start arriving pre-drafted. Your captured business knowledge feeds back into your agents' context, so each week the team starts from a sharper brief than the last.

A concrete example

A build session in April 2026 discovers that the CEO works better if it emits a propose_plan card before delegating to a sequence of specialists. That pattern goes into guidance/orchestration-playbook.md. On the next engine release, every customer's CEO starts proposing plans. Simultaneously, the Claude Code agents working on Black Box also start proposing plans for code tasks. The same rule, two contexts. The improvement compounds: one update, two populations of agents gain it.
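A propose_plan card might carry a payload like the one below. This is a guess at the shape; the real card schema is not shown in this document, and every field name here is an assumption:

```typescript
// Hypothetical propose_plan card payload — field names are assumptions
// made for illustration, not the real schema.
const planCard = {
  type: "propose_plan",
  goal: "Ship the spring landing page",
  steps: [
    { specialist: "Design Specialist", task: "Draft the hero section" },
    { specialist: "Coding Specialist", task: "Implement and deploy" },
  ],
  requiresApproval: true, // the owner confirms before delegation begins
};
```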

The reverse direction: a customer's Evaluator logs "Coding Specialist forgets to add error handling in Railway deploy scripts" to lessons-learned. In the next build session, that lesson is surfaced in the Coding Specialist's system prompt as an explicit check. Customer sessions after that release stop hitting the problem.

Technical details

The shared systems, in pairs:

  • ~/.blackbox/guidance/ ↔ repo CLAUDE.md + memory files — same function, different contexts.
  • Product lessons-learned/ ↔ build session findings — feed each other.
  • Product playbooks ↔ build methodology — same patterns.
  • Product inter-agent notes ↔ build delegation patterns — same orchestration.
  • Product MCP Memory knowledge graph ↔ Claude Code persistent memory — both remember across sessions.

The consult_guidance, update_guidance, and append_lesson tools (in apps/engine/src/ceo/tools.ts) are how agents read and write the shared library at runtime. The build-time equivalents are direct filesystem reads from the repo's memory directory.
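A minimal sketch of that tool surface, assuming an in-memory store in place of the on-disk library (the real definitions in apps/engine/src/ceo/tools.ts are not reproduced here, so everything below except the three tool names is an assumption):

```typescript
// Hypothetical descriptor shape for the three runtime guidance tools.
interface GuidanceTool {
  name: string;
  description: string;
  run: (args: Record<string, string>) => string;
}

// In-memory stand-in for ~/.blackbox/guidance/.
const store = new Map<string, string>();

const guidanceTools: GuidanceTool[] = [
  {
    name: "consult_guidance",
    description: "Read a file from the shared guidance library.",
    run: ({ file }) => store.get(file) ?? "",
  },
  {
    name: "update_guidance",
    description: "Replace a guidance file with new content.",
    run: ({ file, content }) => {
      store.set(file, content);
      return "ok";
    },
  },
  {
    name: "append_lesson",
    description: "Append a lesson to a specialist's patterns file.",
    run: ({ specialist, lesson }) => {
      const path = `lessons-learned/${specialist}-patterns.md`;
      store.set(path, (store.get(path) ?? "") + `- ${lesson}\n`);
      return path;
    },
  },
];
```

The design point the sketch preserves: all three tools go through one shared store, which is what lets runtime and buildtime agents see each other's writes.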

Why this is the moat

Single-agent coding tools (Cursor, Windsurf, Cline, and every "AI pair programmer") are one-directional: humans build the tool, the tool serves users. There's no return channel from the user's usage to the tool's behavior.

Black Box is a loop. The tool serves users → the Evaluator notes what went well or badly → those notes become guidance → the next release of the tool is better → customers benefit → more usage → more notes. This isn't ML training (we're not fine-tuning the base model); it's structured organizational learning that compounds with every release.

A competitor can copy the 18 specialists, the CEO loop, even the UI. Nobody can copy the memory.

Related features

  • Evaluator gate — the source of many runtime lessons.
  • The CEO agent — the primary consumer of shared guidance.
  • Skill Packs — the mechanism by which new patterns ship without an engine release.

FAQ

Isn't this just "we write good docs"?

No. The key is that the guidance system is shared between runtime and buildtime. The product agents read ~/.blackbox/guidance/ to execute. The Claude Code agents that build Black Box read the same files via CLAUDE.md references. An improvement to the orchestration playbook for the product immediately becomes an improvement to how we build the product.

Does this mean the product learns from individual users?

Lessons-learned are written per-session by the Evaluator. They live in the owner's ~/.blackbox/ and can also be merged upstream into our shared guidance library (with consent). That upstream merge is how one owner's discovered pattern becomes every owner's benefit.

What's an example of a lesson that flowed both ways?

The "plan-before-delegate" pattern: we discovered while building Black Box that the CEO works better if it proposes a plan card before delegating. That went into the orchestration playbook. Next customer session, the CEO started proposing plans. Next build session, our build agents did the same thing. Same guidance; two contexts.

Why is this the moat?

Single-agent tools (Cursor, Windsurf, Cline) are one-directional: humans build, the tool serves. Black Box is a loop: the tool serves, users generate insights, insights improve the tool. The product literally gets better the more it's used — not from ML training, from structured organizational learning. No competitor can copy the memory.

Try Black Box

The product that gets better the more it's used — not from ML, from structured learning.