The head of Claude Code at Anthropic just revealed how he builds software—and it's not what you'd expect. Boris Cherny runs five AI agents in parallel, uses the slowest model available, and maintains a single text file that makes every mistake a permanent lesson. The kicker? You can copy his entire setup today, for free.
Why This Matters: The $200/Month Tool Has a Zero-Dollar Alternative
While Claude Code costs up to $200 per month and has sparked a developer revolt over rate limits, the workflow behind it is surprisingly accessible. Cherny's thread on X has gone viral because it proves you don't need enterprise budgets to work like an enterprise team; you need better orchestration.
The timing is perfect. Open-source alternatives like Goose (from Block) now offer nearly identical functionality to Claude Code, running entirely on your local machine. No subscription. No rate limits. No cloud dependency.
The Five-Agent Strategy That Turns Coding Into StarCraft
Cherny doesn't code linearly. He runs five Claude instances simultaneously in his terminal, numbered 1-5, using system notifications to know when each needs input. While one agent runs tests, another refactors legacy code, and a third drafts documentation.
He also runs 5-10 additional instances in his browser, using a "teleport" command to hand off sessions between web and local environments. One developer who implemented this setup described it as feeling "more like Starcraft than traditional coding"—you're commanding autonomous units, not typing syntax.
The practical takeaway: parallelism multiplies output. If you're still working on one task at a time while waiting for AI responses, you're leaving productivity on the table.
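To make the setup concrete, here is a minimal sketch of a numbered-terminal layout in Python, assuming tmux and the Claude Code CLI (invoked as claude) are installed; the session and window names are illustrative, not Cherny's actual configuration.

```python
# parallel_agents.py: spawn five numbered tmux windows, one agent per window.
# Assumes `tmux` and the `claude` CLI are on PATH; names are placeholders.
import subprocess

SESSION = "agents"

# Create a detached session whose first window is agent-1.
subprocess.run(["tmux", "new-session", "-d", "-s", SESSION, "-n", "agent-1"], check=True)

# Add windows agent-2 through agent-5.
for i in range(2, 6):
    subprocess.run(["tmux", "new-window", "-t", SESSION, "-n", f"agent-{i}"], check=True)

# Start the CLI in every window; attach later with: tmux attach -t agents
for i in range(1, 6):
    subprocess.run(
        ["tmux", "send-keys", "-t", f"{SESSION}:agent-{i}", "claude", "Enter"],
        check=True,
    )
```

From there, your terminal's bell or desktop notifications can flag which window is waiting for input.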
The Counterintuitive Model Choice: Slower Is Faster
Cherny exclusively uses Opus 4.5—Anthropic's heaviest, slowest model. "It's the best coding model I've ever used," he wrote. "Even though it's bigger & slower than Sonnet, since you have to steer it less and it's better at tool use, it is almost always faster than using a smaller model in the end."
The insight: the bottleneck isn't token generation speed. It's the human time spent correcting AI mistakes. Paying the "compute tax" for a smarter model upfront eliminates the "correction tax" later.
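The trade-off is easy to sanity-check with back-of-envelope numbers. Every figure in this sketch is made up for illustration, not a measurement of any real model:

```python
# Illustrative only: total wall-clock time for a fast model that needs more
# correction rounds vs. a slower model that needs fewer.
def total_time(response_s: int, rounds: int, human_review_s: int) -> int:
    return rounds * (response_s + human_review_s)

fast = total_time(response_s=20, rounds=5, human_review_s=120)  # 700 s
slow = total_time(response_s=60, rounds=2, human_review_s=120)  # 360 s
print(f"fast-but-dumb: {fast}s, slow-but-smart: {slow}s")
```

Once human review dominates each round, fewer rounds beat faster tokens.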
For learners, this means: don't default to the fastest model. Test whether a more capable model reduces your total time-to-working-code, even if each response takes longer.
The Single File That Makes AI Remember Your Mistakes
Standard large language models don't "remember" your coding style between sessions. Cherny's team solves this with a single file: CLAUDE.md, checked into their git repository.
"Anytime we see Claude do something incorrectly we add it to the CLAUDE.md, so Claude knows not to do it next time," he explained. Every mistake becomes a rule. The longer the team works together, the smarter the agent becomes.
This is immediately actionable: create a project-level instructions file. When the AI generates code that violates your conventions, document the correction. Over time, you're building a custom instruction set that teaches a general-purpose model your specific context.
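For illustration, such a file might start out like the sketch below; every rule in it is a hypothetical example, not one of Cherny's actual conventions.

```markdown
# CLAUDE.md

## Conventions
- Use TypeScript strict mode; never introduce `any`.
- All database access goes through `src/db/client.ts`.

## Lessons learned (one line added per mistake)
- Do not edit generated files under `build/`; change the templates instead.
- Run the full test suite before proposing a commit, not just changed files.
```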
Slash Commands and Subagents Automate the Tedium
Cherny uses custom slash commands—shortcuts checked into the repository—to handle complex operations with a keystroke. His /commit-push-pr command, invoked dozens of times daily, handles git commits, messages, and pull requests autonomously.
He also deploys specialized subagents: a code-simplifier to clean up architecture after the main work is done, and a verify-app agent to run end-to-end tests before shipping.
The pattern: identify your most repetitive workflows, then create commands or agents that execute them automatically. The goal isn't to eliminate human judgment—it's to eliminate human busywork.
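Claude Code reads project slash commands from markdown files checked into the repository; the file below is a hypothetical reconstruction of what a /commit-push-pr command could look like, not Cherny's actual version.

```markdown
<!-- .claude/commands/commit-push-pr.md (hypothetical reconstruction) -->
Review the staged and unstaged changes, group them into one coherent commit,
write a concise commit message, push the current branch, and open a pull
request that summarizes the change and how it was tested.
```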
Verification Loops: Why AI That Tests Itself Is 2-3x Better
The real unlock isn't generation—it's verification. "Claude tests every single change I land to claude.ai/code using the Claude Chrome extension," Cherny wrote. "It opens a browser, tests the UI, and iterates until the code works and the UX feels good."
Giving the AI a way to verify its own work—through browser automation, bash commands, or test suites—improves output quality by "2-3x," according to Cherny. The agent doesn't just write code; it proves the code works.
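A bare-bones version of that loop fits in a few lines. The sketch below uses pytest as the checker and stubs out the hand-back step, since the source doesn't specify Cherny's exact mechanism:

```python
# verify_loop.py: run the test suite, feed failures back, repeat until green.
import subprocess

def run_tests() -> tuple[bool, str]:
    # Return (passed?, combined output) from a quiet pytest run.
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr

def ask_agent_to_fix(failure_log: str) -> None:
    # Hypothetical hook: a real setup would send the failure log back to the
    # coding agent and apply its proposed patch before the next attempt.
    print("Handing failures back to the agent:\n", failure_log[:500])

for _ in range(5):  # cap attempts so a stuck agent cannot loop forever
    passed, log = run_tests()
    if passed:
        print("All tests pass; ready to ship.")
        break
    ask_agent_to_fix(log)
```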
For learners: prioritize tools that can execute and test code, not just generate it. The feedback loop is where the learning happens—for you and the AI.
What This Means for Learners: You Can Start Today
Cherny's workflow isn't locked behind enterprise paywalls. Here's how to replicate it:
1. Run multiple AI sessions in parallel. Open 3-5 browser tabs or terminal windows. Assign each a specific task. Use notifications or tab labels to track which needs input.
2. Create a project instructions file. Name it CLAUDE.md, INSTRUCTIONS.md, or similar. Document your coding conventions, common mistakes, and project-specific context. Reference it in every AI conversation.
3. Use the most capable model you can afford. Test whether a slower, smarter model reduces your total correction time. For free users, this might mean using GPT-4 instead of GPT-3.5, or Gemini Pro instead of Flash.
4. Build verification into your workflow. Don't just generate code—run it. Use tools like Replit, CodeSandbox, or local environments where the AI can execute and test its own output.
5. Automate your repetitive tasks. Identify the commands you run dozens of times daily. Create aliases, scripts, or custom AI prompts that execute them automatically (see the sketch after this list).
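As one example of step 5, here is a hypothetical one-command commit-and-push helper, a bare-bones stand-in for something like Cherny's /commit-push-pr; it assumes git is on your PATH, and you should review your diff before running it.

```python
#!/usr/bin/env python3
# ship.py: stage, commit, and push in one command. Hypothetical helper;
# review your changes before running it.
import subprocess
import sys

message = sys.argv[1] if len(sys.argv) > 1 else "wip: checkpoint"
for cmd in (
    ["git", "add", "-A"],
    ["git", "commit", "-m", message],
    ["git", "push"],
):
    subprocess.run(cmd, check=True)  # stop at the first failing step
```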
The programmers who adopt this workflow first won't just be more productive. They'll be playing an entirely different game—and everyone else will still be typing.