Entry
When to Reach for Multiple Agents (and When Not To)
Multi-agent is the current default answer, and it is usually the wrong one. When orchestration earns its overhead, when a single agent wins, and the patterns that hold up in production.
"Multi-agent" is having a moment, which means a lot of teams are building three agents to do the work of one and wondering why it got slower and harder to debug.
I want to make the unfashionable case first: most problems want a single, well-built agent. A second agent is not free. It adds a coordination layer, a place for messages to get lost, and a debugging surface that grows with every agent you add. Before you split the work, the work has to actually be splittable. Orchestration is a tool in the system around the model, and like the others it earns its place or it does not.
A single agent running a plan-act-observe loop handles a surprising amount. It picks a step, calls a tool, reads the result, picks the next step. State stays in one place. When it breaks, there is one trace to read. You can reason about it.
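To make the loop concrete, here is a minimal sketch. The names `run_agent`, `plan_next_step`, and the toy tools are all hypothetical, and the planner is a hard-coded stand-in for what would normally be a model call. The point is only the shape: one loop, one trace, all state in one place.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration: a tool maps a string argument to a string observation.
Tool = Callable[[str], str]
Step = Tuple[str, str]             # (tool_name, argument)
TraceEntry = Tuple[str, str, str]  # (tool_name, argument, observation)


def run_agent(
    goal: str,
    plan_next_step: Callable[[str, List[TraceEntry]], Optional[Step]],
    tools: Dict[str, Tool],
    max_steps: int = 10,
) -> List[TraceEntry]:
    """Plan-act-observe loop. The trace list is the only state."""
    trace: List[TraceEntry] = []
    for _ in range(max_steps):
        step = plan_next_step(goal, trace)            # plan
        if step is None:                              # planner says the goal is met
            break
        tool_name, arg = step
        observation = tools[tool_name](arg)           # act
        trace.append((tool_name, arg, observation))   # observe
    return trace


def toy_planner(goal: str, trace: List[TraceEntry]) -> Optional[Step]:
    # Assumption: a real planner would be a model call that reads the trace.
    if not trace:
        return ("search", goal)
    if len(trace) == 1:
        return ("summarise", trace[-1][2])
    return None


tools: Dict[str, Tool] = {
    "search": lambda query: f"results for {query}",
    "summarise": lambda text: text.upper(),
}

trace = run_agent("agent patterns", toy_planner, tools)
```

When this breaks, `trace` is the whole story: every step, its input, and what came back, in order.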
This is the right shape for anything sequential, anything where each step depends on the last, anything where the steps share a lot of context. Which is most real work. The bar for adding a second agent should be high, and "it sounds more sophisticated" does not clear it.
There are two situations where splitting genuinely helps, and they are specific.
The first is parallel, independent subtasks. If the work decomposes into pieces that do not depend on each other (research five companies, review twelve files, summarise eight documents) then running them as separate agents in parallel is faster, and each one keeps a clean, small context instead of one agent juggling every piece at once. The key word is independent. The moment the pieces need to share state mid-flight, the coordination cost eats the benefit.
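A rough sketch of that fan-out, using `asyncio.gather` from the standard library. `research_company` is a hypothetical stand-in for a sub-agent; a real one would call a model with only that company's context. Note what is missing: no task reads or writes anything another task touches.

```python
import asyncio
from typing import Dict, List


async def research_company(name: str) -> str:
    # Hypothetical sub-agent. The sleep stands in for model and tool latency.
    await asyncio.sleep(0.01)
    return f"summary of {name}"


async def fan_out(names: List[str]) -> Dict[str, str]:
    # Each task receives only its own input; nothing is shared mid-flight.
    # gather returns results in the same order as the awaitables it was given.
    results = await asyncio.gather(*(research_company(n) for n in names))
    return dict(zip(names, results))


companies = ["A", "B", "C", "D", "E"]
reports = asyncio.run(fan_out(companies))
```

If you find yourself wanting to pass a shared dictionary into `research_company`, that is the signal the pieces are not independent and this pattern no longer fits.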
The second is genuinely different jobs. An agent that writes and an agent that reviews are doing different work with different success criteria, and keeping them separate is the point. This is the cross-model review pattern I lean on hard: one model produces, a different model checks, and because they have different weights they have different blind spots. One agent grading its own homework is just self-consistency wearing a second hat.
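The produce-then-check loop can be sketched in a few lines. Everything here is an assumption for illustration: `write_and_review`, `toy_writer`, and `toy_reviewer` are made-up names, and in practice the writer and reviewer would wrap calls to two different models. The structure is what matters: the reviewer has its own success criterion and the writer never grades itself.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signatures for illustration.
Writer = Callable[[str, Optional[str]], str]    # (task, feedback) -> draft
Reviewer = Callable[[str, str], Optional[str]]  # (task, draft) -> feedback, or None to approve


def write_and_review(
    task: str, writer: Writer, reviewer: Reviewer, max_rounds: int = 3
) -> Tuple[str, bool]:
    """Alternate writer and reviewer until approval or the round limit."""
    feedback: Optional[str] = None
    draft = ""
    for _ in range(max_rounds):
        draft = writer(task, feedback)
        feedback = reviewer(task, draft)
        if feedback is None:       # reviewer approved
            return draft, True
    return draft, False            # round limit hit without approval


def toy_writer(task: str, feedback: Optional[str]) -> str:
    # Stand-in for model A: only adds tests once it is told to.
    if feedback is None:
        return f"{task}: draft"
    return f"{task}: draft with tests"


def toy_reviewer(task: str, draft: str) -> Optional[str]:
    # Stand-in for model B: its one criterion is that tests exist.
    return None if "with tests" in draft else "add tests"


final_draft, approved = write_and_review("add function", toy_writer, toy_reviewer)
```

The round limit is deliberate. Without it, two agents that disagree will happily disagree forever.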
When you do go multi-agent in production, the shape that survives is hierarchical. One orchestrator owns the goal and the state, and it delegates narrow pieces to specialist sub-agents that each do one thing and report back.
This works because it keeps the thing that is hard about multi-agent (shared state, coordination) in one place, the supervisor, instead of smearing it across a committee of peers all trying to talk to each other. Peer-to-peer agent swarms look elegant in a diagram and turn into a group chat where nobody is in charge.
Flow-Next is built this way. The orchestrator owns the task graph, and when it needs context it dispatches specialists: one scout reads the codebase, one finds existing patterns, one fetches the relevant docs, one runs gap analysis. Each returns a focused result the orchestrator folds into the plan. None of them holds the whole problem, and none of them needs to. That division is what lets each sub-agent keep a small, clean context and do its narrow job well.
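As a sketch of the supervisor shape in general (this is not Flow-Next's code, and every name below is hypothetical), the orchestrator is the only object that holds state. Each specialist is a plain function that takes a narrow brief and returns a focused result. The four specialist names mirror the roles described above, with toy bodies in place of real sub-agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Specialist = Callable[[str], str]  # narrow brief in, focused finding out


@dataclass
class Orchestrator:
    goal: str
    specialists: Dict[str, Specialist]
    # The only shared state in the system lives here, on the supervisor.
    findings: Dict[str, str] = field(default_factory=dict)

    def gather_context(self) -> None:
        for name, specialist in self.specialists.items():
            brief = f"{name} for: {self.goal}"        # specialist sees only its brief
            self.findings[name] = specialist(brief)   # result is folded back in

    def plan(self) -> List[str]:
        self.gather_context()
        return [f"{name}: {result}" for name, result in self.findings.items()]


# Toy specialists. Real ones would be sub-agents with their own small context.
specialists: Dict[str, Specialist] = {
    "repo_scout": lambda brief: "read 3 modules",
    "pattern_finder": lambda brief: "found an existing retry helper",
    "docs_fetcher": lambda brief: "fetched the API reference",
    "gap_analyst": lambda brief: "no tests cover the error path",
}

orchestrator = Orchestrator(goal="add retry to the client", specialists=specialists)
plan = orchestrator.plan()
```

No specialist can see `findings`, and no specialist can call another. If that constraint starts to feel limiting, it is usually a sign that two of the specialists are really one job.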
Every agent boundary is a place information has to be handed across, and handoffs lose things. The supervisor summarises the task for the specialist, the specialist summarises its findings back, and detail leaks at both ends. With one agent, the context is continuous. With five, you are playing telephone, and the more agents you add, the more the message-passing dominates the actual work.
So the real design question is not "how many agents" but "where are the natural seams." Split along seams that are genuinely independent and the system gets faster and cleaner. Split a tightly coupled problem just to have more agents and you have built a distributed system with all of distributed systems' problems and a language model in every node, which is a sentence that should make you nervous.
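One way to look for seams before committing to an architecture, offered as a rough heuristic and not a method: write down which subtasks need to share state, treat that as a graph, and see what falls apart into separate groups. The function name `find_seams` and the example tasks are made up for illustration. Groups with no edges between them are candidates for separate agents; anything inside one group wants to stay in one agent.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_seams(
    tasks: Iterable[str], shares_state: Iterable[Tuple[str, str]]
) -> List[List[str]]:
    """Group tasks into connected components of the shares-state graph."""
    neighbours: Dict[str, Set[str]] = defaultdict(set)
    for a, b in shares_state:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen: Set[str] = set()
    groups: List[List[str]] = []
    for task in tasks:
        if task in seen:
            continue
        # Depth-first walk over everything coupled to this task.
        stack = [task]
        seen.add(task)
        group: List[str] = []
        while stack:
            current = stack.pop()
            group.append(current)
            for nxt in neighbours[current]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


tasks = ["parse", "transform", "validate", "fetch_docs", "scan_repo"]
shares_state = [("parse", "transform"), ("transform", "validate")]
groups = find_seams(tasks, shares_state)
```

Here the three pipeline steps land in one group and the two lookups stand alone, which suggests one agent for the pipeline and, at most, two small helpers. If everything lands in a single group, you have your answer: one agent.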
Start with one good agent. Reach for a second only when the work splits cleanly or the jobs are genuinely different. Put one thing in charge of the state. That is most of the way to a multi-agent system that holds up instead of one that just has a lot of agents.
Orchestration is one layer of the harness. The rest (tools, verification, memory, guardrails, observability) are in the field guide to harness engineering.