
Anthropic's Cherny: recursive agent loops are the next coding era

AIntelligenceHub

Anthropic's Claude Code lead Boris Cherny told Meta's @Scale that agents prompting agents is the next era of coding, and that long-running recursive loops will define how teams ship software for the rest of 2026.

The head of Anthropic's Claude Code team told a Meta @Scale audience this week that agentic coding is moving into its third era. Two years ago, programmers wrote source code by hand. Last year, agents wrote the code. Now, agents are starting to prompt other agents that then write the code, and Anthropic thinks that shift is as big as the move from hand-written source to agent-written code was in 2024.

The pattern that makes the new era work, Claude Code creator Boris Cherny said, is what practitioners are starting to call the "loop": a long-running, recursive structure where subagents watch a codebase, open pull requests, and keep working in the background until something tells them to stop.

Cherny spoke on June 19 at Meta's @Scale conference in a fireside chat with Meta engineer Jesse Chen. The video of the talk was posted to YouTube the same day, and TechCrunch's Maxwell Zeff wrote up the conversation on June 22. The framing matters. Cherny is not a commentator on agentic AI. He is the engineer who leads the team that built Claude Code, which has become one of the most widely used agentic coding tools in the industry. When he says loops are the next step, the room listens, and a lot of engineering teams are about to copy what his team is already running internally.

What Cherny actually runs in his own work

In the second half of the fireside chat, around the 32-minute mark, Cherny got specific about the loops he keeps open. He described two persistent subagents that were running in the background of his own development environment as he spoke. One constantly looks for ways to improve the architecture of the code he is working on. The other hunts for duplicated abstractions that should be unified into a shared helper. Both agents open pull requests the same way any human contributor would, and because the code keeps moving underneath them, neither one ever reaches a clean stopping point. They just keep going.

The pattern is a sharp break from how most teams think about agentic coding today. The conventional mental model is start, check, stop. A developer kicks off an agent, watches the first few steps, intervenes when it drifts, and ends the run when the task looks done. The loop pattern removes the human as the unit of progress. The agents are not waiting for the next prompt. They are waiting for a signal that the work is finished, and a subagent makes the call about when that signal arrives, not a hard-coded rule.

The idea of a recursive loop is older than most people in the room. A function that calls itself until a condition is met is one of the first things taught in any intro computer science course. The version that Anthropic, OpenAI, and the rest of the agentic-AI crowd are now running is structurally similar, with one important difference. A conventional recursive function has a hard-coded termination condition, its base case, and an agentic loop does not. A subagent decides when the loop is done, which means the loop can run for hours or days, across thousands of tokens, without any external signal stopping it.
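The contrast can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not anything shown in the talk: `agentic_loop`, `step`, and `is_done` are hypothetical names, the "subagent" is reduced to a plain callable, and `max_iterations` is a safety backstop added here that the pattern as described does not have.

```python
from typing import Callable


def factorial(n: int) -> int:
    """Classic recursion: the base case is a hard-coded termination condition."""
    if n <= 1:
        return 1
    return n * factorial(n - 1)


def agentic_loop(
    state: str,
    step: Callable[[str], str],
    is_done: Callable[[str], bool],
    max_iterations: int = 1000,
) -> tuple[str, int]:
    """Loop whose exit is decided by a judge callable, not a fixed rule.

    `step` stands in for an agent doing one unit of work and `is_done`
    stands in for the subagent that judges completion (both hypothetical).
    `max_iterations` is an assumption of this sketch, added as a backstop.
    """
    iterations = 0
    while not is_done(state) and iterations < max_iterations:
        state = step(state)
        iterations += 1
    return state, iterations
```

In `factorial` the stopping rule is visible in the source. In `agentic_loop` it lives inside whatever `is_done` happens to be, which is why the run length cannot be predicted from reading the loop.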

The most popular open implementation of the pattern is the "Ralph Loop," named for the long-suffering Ralph Wiggum character from The Simpsons. The Ralph Loop is a deliberately simple trick: it asks the model to summarize everything it has done so far and then asks if it has accomplished its goal. If the model says yes, the loop ends. If the model says no, the loop runs again with the new context attached. The pattern has been picked up by open-source projects across the agentic coding ecosystem, and Anthropic's own Claude Code documentation has 148 references to loops across its overview and how-to guides, which is a sign that the pattern is no longer experimental for the team that wrote it. The same push shows up in adjacent products: the Claude Design update from the same week folded in design-system import and a Claude Code handoff, which is the kind of integration surface a loop pattern can exploit once it is running.
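The summarize-then-ask cycle described above can be sketched in a few lines of Python. This is a rough illustration of the idea, not the code of any particular Ralph Loop project: `model` is a hypothetical callable that takes a prompt and returns text, the prompt wording is invented for the example, and `max_rounds` is a cap added here so the sketch always terminates.

```python
from typing import Callable

# Hypothetical model interface: prompt text in, response text out.
Model = Callable[[str], str]


def ralph_loop(goal: str, model: Model, max_rounds: int = 10) -> tuple[str, int]:
    """Work, summarize, ask "are we done?", and repeat until the model says yes.

    Returns the final summary and the number of rounds that ran.
    """
    summary = ""
    for round_number in range(1, max_rounds + 1):
        # 1. Do another unit of work with the carried-over context.
        work = model(
            f"Goal: {goal}\nSummary of prior work: {summary}\nDo the next step."
        )
        # 2. Ask the model to summarize everything done so far.
        summary = model(
            f"Summarize everything done so far.\n"
            f"Prior summary: {summary}\nLatest work: {work}"
        )
        # 3. Ask whether the goal has been accomplished.
        verdict = model(
            f"Goal: {goal}\nSummary: {summary}\n"
            "Has the goal been accomplished? Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            return summary, round_number
    # Cap reached without a "yes" (the cap is an assumption of this sketch).
    return summary, max_rounds
```

Each round makes three model calls, so the token cost grows with every "no" the model returns.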

The token economics and the cost ceiling

The cheapest part of an agentic loop is the engineering. The expensive part is the inference. Every time the loop runs, it spends tokens, and because the loop is designed not to stop, the spend has no ceiling. Cherny was clear-eyed about this. "If that sounds expensive, it should," he said, and then pointed out that Anthropic is ultimately in the business of selling tokens, so the calculus that works for the model provider is not the same as the calculus that works for everyone else who adopts the pattern.

Cherny framed loops as an extension of test-time compute, the idea that you keep spending inference cycles on a problem until you get an answer you trust. "As big as the step from source code to agents was, loops are just as important and as big a step," he said. The bet is that for any task with a clear definition of done, the right answer is to keep running the loop until the answer is consistent, and that for tasks without a clear end, the loop is the engine that improves the system over time. He also pointed to a public observation OpenAI researcher Noam Brown made earlier this month that contemporary models can solve nearly any problem if you throw enough compute at them, and that the right way to make sure a problem gets solved is to keep throwing compute at it.

For an enterprise engineering team, the question is how much budget they are willing to burn to keep a loop running. The pattern works best for what the literature calls "hill-climbing" problems, where the model can make incremental progress and a verifier can tell when the result is good enough. Code quality, test coverage, and architectural cleanup are all hill-climbing problems. A loop that keeps refactoring and submitting pull requests will, in expectation, leave the codebase a little better each cycle, and the human in the loop gets to review the diffs and accept or reject the changes. The pattern fails for problems that need a single decisive answer, like a security incident or a customer-facing incident response, where the team cannot afford to wait on a loop with no guaranteed end.
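A hill-climbing loop with a verifier can be sketched roughly as follows. This is a generic illustration under stated assumptions, not a description of how Claude Code works: `propose` stands in for an agent suggesting a change, `score` stands in for the verifier (a test suite or a coverage metric, for example), and `good_enough` and `max_cycles` are hypothetical parameters.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def hill_climb(
    candidate: T,
    propose: Callable[[T], T],
    score: Callable[[T], float],
    good_enough: float,
    max_cycles: int = 100,
) -> tuple[T, float, int]:
    """Keep a proposal only when the verifier scores it higher than the current best.

    Stops when the score reaches `good_enough` or after `max_cycles` attempts.
    Returns the best candidate, its score, and the number of cycles used.
    """
    best = candidate
    best_score = score(best)
    cycles = 0
    while best_score < good_enough and cycles < max_cycles:
        proposal = propose(best)
        proposal_score = score(proposal)
        if proposal_score > best_score:
            # The verifier confirms an improvement, so accept the change.
            best, best_score = proposal, proposal_score
        cycles += 1
    return best, best_score, cycles
```

The sketch only works when `score` exists and is trustworthy, which is the practical meaning of "a clear definition of done". An incident response has no such function to climb.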

This is the part of the talk that the industry is going to spend the most time arguing about. The model providers have an incentive to push the loop pattern because every minute of loop time is a minute of token revenue. The customer has an incentive to push back because every minute of loop time is a minute of their inference budget. The right answer for most teams is somewhere in the middle: tight loops for narrow, well-defined problems, with a hard cap on token spend and a human who reviews the final artifact before it ships.
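One way to picture the middle-ground setup is a loop wrapped in a token cap and a review gate. The Python below is a sketch under assumptions, not a recommended implementation: `step` is a hypothetical callable that returns the new artifact, the tokens it used, and whether it considers the work done, and `approve` stands in for the human reviewer.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical step signature: artifact in, (artifact, tokens_used, done) out.
Step = Callable[[str], tuple[str, int, bool]]


@dataclass
class LoopResult:
    artifact: str
    tokens_spent: int
    stopped_by_cap: bool
    approved: bool


def run_capped_loop(
    step: Step,
    token_cap: int,
    approve: Callable[[str], bool],
    artifact: str = "",
) -> LoopResult:
    """Run the loop until the step reports done or the token cap is reached.

    The cap is checked after each step, so total spend can overshoot the cap
    by at most one step's worth of tokens.
    """
    spent = 0
    stopped_by_cap = False
    while True:
        artifact, tokens_used, done = step(artifact)
        spent += tokens_used
        if done:
            break
        if spent >= token_cap:
            stopped_by_cap = True
            break
    # Nothing ships without the reviewer's decision on the final artifact.
    return LoopResult(artifact, spent, stopped_by_cap, approve(artifact))
```

Recording `stopped_by_cap` separately from `approved` lets a team tell a loop that finished apart from one that simply ran out of budget.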

Where agent loops land in the next six months

The next six months of agent tooling are going to be defined by how the major platforms let developers configure, monitor, and shut down loops. Claude Code, GitHub Copilot Workspace, and the new generation of agentic IDEs all need a way to expose loop state, loop duration, and loop cost to the developer in a way that is more useful than a log file. The first platform that ships a clean, trustworthy loop management interface will set the standard for the rest, and the second-tier tools will spend the next year catching up.
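As a rough idea of what exposing loop state, duration, and cost could mean at the data level, here is a small Python sketch. It does not describe any existing platform's interface: `LoopMonitor`, its fields, and the per-million-token price passed in are all hypothetical.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class LoopState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


@dataclass
class LoopMonitor:
    """Tracks the three things a developer would want to see about a loop."""

    usd_per_million_tokens: float  # hypothetical price, supplied by the caller
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    state: LoopState = LoopState.RUNNING
    tokens: int = 0
    started_at: float = field(default=0.0, init=False)

    def __post_init__(self) -> None:
        self.started_at = self.clock()

    def record(self, tokens: int) -> None:
        """Add the tokens consumed by one loop iteration."""
        self.tokens += tokens

    def stop(self) -> None:
        self.state = LoopState.STOPPED

    def snapshot(self) -> dict:
        """Return state, elapsed seconds, tokens, and estimated cost."""
        return {
            "state": self.state.value,
            "duration_seconds": self.clock() - self.started_at,
            "tokens": self.tokens,
            "cost_usd": self.tokens / 1_000_000 * self.usd_per_million_tokens,
        }
```

A dashboard or CLI could poll `snapshot()` on an interval, which is already more structured than scanning a log file.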

For a broader look at how Claude Code and the rest of the agentic IDE category compare, the "best AI coding agents in 2026" page walks through the current state of the market, including how loops are exposed (or not) across the major platforms. The short version: only a few of the major tools have a real loop primitive today. Most of them still treat the agent as a single-run process that ends when the model thinks it is done, and the user has no clean way to extend the run, inspect the history, or hand control back to the model with a fresh context.

The takeaway for engineering leaders is straightforward. If your team is going to be serious about agentic coding in the second half of 2026, you need a clear policy on loops. Which problems are loop-safe? Which ones are not? How much token spend is acceptable per loop, and who has the authority to change that number mid-quarter? How do you audit the artifacts the loop produced, and how do you roll back a loop that drifted in a way the verifier missed? Boris Cherny and his team have answers to all of those questions, and so does the open-source Ralph Loop community. The teams that have not started having those conversations are going to learn the answers the hard way, and the cost of that learning is going to come straight out of their inference budget.
