In 2024, early autonomous coding agents such as Cognition AI's Devin made a well-documented failure pattern visible: a single LLM asked to write, test, deploy, and document a software project would lose coherence across the many thousands of tokens such a task requires. The context window, the agent's working memory, filled up. Earlier decisions got dropped. Code written in step 47 contradicted assumptions made in step 2. The agent couldn't hold the entire problem in mind at once, and the output degraded accordingly.
The engineering response was architectural: break the problem into specialized sub-agents, each working on a bounded sub-task, coordinated by an orchestrator that maintains the high-level plan. This is the origin story of multi-agent systems in production AI.
Single-agent architectures run into three ceilings that a bigger model or a longer context window cannot remove, though both help at the margins.
The first ceiling is context length. Every LLM has a finite context window. Long-horizon tasks — auditing a codebase, researching a legal case, planning a multi-week project — generate more information than fits. When the window fills, the agent either truncates (losing early context) or summarizes (losing detail). Either way, fidelity degrades.
The second ceiling is specialization. A single agent must be a generalist. But many real tasks require deep domain expertise in multiple areas simultaneously — legal reasoning and financial modeling, for instance. Fine-tuned or prompted specialist agents consistently outperform generalists on domain-specific subtasks, a finding documented across multiple 2023–2024 benchmarks including HELM and AgentBench.
The third ceiling is parallelism. A single agent is sequential. If a task has ten independent subtasks, a single agent completes them one after another. A multi-agent system can run them simultaneously, compressing wall-clock time dramatically.
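The wall-clock difference described above can be sketched with Python's standard `asyncio` module. This is a minimal illustration, not a real agent framework: `run_subtask` is a hypothetical stand-in that sleeps to simulate the latency of a sub-agent call.

```python
import asyncio
import time


async def run_subtask(task_id: int) -> str:
    # Hypothetical stand-in for a sub-agent call; the sleep simulates latency.
    await asyncio.sleep(0.02)
    return f"result-{task_id}"


async def run_sequential(n: int) -> list[str]:
    # One agent: each subtask starts only after the previous one finishes.
    return [await run_subtask(i) for i in range(n)]


async def run_parallel(n: int) -> list[str]:
    # Fan-out: all subtasks are in flight at once.
    # asyncio.gather returns results in the order the awaitables were passed.
    return list(await asyncio.gather(*(run_subtask(i) for i in range(n))))


def timed(coro_fn, n: int):
    # Run one of the coroutines above and report (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(n))
    return results, time.perf_counter() - start
```

Calling `timed(run_sequential, 10)` and `timed(run_parallel, 10)` returns the same results, but the parallel version takes roughly the time of one subtask rather than ten. The gain only applies when the subtasks are genuinely independent.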
Multi-agent systems are not about making AI more powerful in a single pass. They are about restructuring work so that bounded, specialized agents each operate within their competence — and an orchestrator stitches the results into a coherent whole.
The 2024 paper "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation" from Microsoft Research documented concrete scenarios where multi-agent conversation frameworks outperformed single-agent approaches. On math problem solving, coding tasks, and decision-making games, the multi-agent setup improved success rates — not because any individual agent was smarter, but because the architecture caught errors through inter-agent critique.
Google DeepMind's AlphaCode 2 (2023) used a pipeline of specialized components rather than a single model doing everything: policy models generated candidate solutions, a filtering step discarded candidates that failed the example tests, and a scoring model ranked the survivors. In DeepMind's evaluation on Codeforces contests, the system was estimated to perform better than about 85% of human participants, well ahead of the original AlphaCode.
The pattern is consistent: tasks with high complexity, long horizons, or multiple independent sub-problems benefit structurally from decomposition across agents. The question is not whether to use multiple agents but how to orchestrate them without introducing new failure modes.
Multi-agent systems introduce coordination overhead and new failure modes — miscommunication between agents, error propagation, and orchestration loops. The architectural gains only materialize if the orchestration is well-designed. Complexity for its own sake makes systems worse, not better.
You'll work with an AI tutor to analyze concrete task descriptions and identify which of the three single-agent ceilings — context, specialization, or parallelism — each one hits hardest. Then you'll reason about whether a multi-agent architecture is justified.
Salesforce's Einstein Copilot, launched in 2024, uses an orchestration layer that sits above multiple specialized agents: one for CRM data retrieval, one for email drafting, one for calendar actions, one for analytics queries. A user request like "prepare for my 3pm meeting with Acme Corp" triggers the orchestrator to fan out tasks to the relevant sub-agents simultaneously — the CRM agent pulls deal history, the analytics agent surfaces recent activity metrics, the email agent checks correspondence. The orchestrator waits for all results and assembles a coherent briefing. No single agent could do this without deep integration into every data system; the multi-agent architecture keeps each agent's scope bounded and maintainable.
The orchestrator is the strategic layer. It receives a high-level goal, decomposes it into sub-tasks, routes those sub-tasks to the appropriate sub-agents, monitors progress, and synthesizes results. Critically, the orchestrator does not need to know how each sub-agent accomplishes its task — it only needs to know what each sub-agent can do and what it returns.
This separation of concerns is what makes the architecture scalable. Adding a new capability to the system means adding a new sub-agent and registering it with the orchestrator — not retraining or re-prompting a monolithic agent. The 2024 Anthropic documentation on Claude's tool use describes exactly this pattern: Claude acting as orchestrator, calling tools (which may themselves be other Claude instances) to accomplish bounded sub-tasks.
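The registration idea above can be sketched as a small capability registry. This is an illustrative sketch only: the `Orchestrator` class, its method names, and the `dict` return type are assumptions made for the example, not the API of any vendor's product.

```python
from typing import Callable, Dict, List

# A sub-agent is modeled as any callable that takes a task description
# and returns a structured result (hypothetical interface).
SubAgent = Callable[[str], dict]


class Orchestrator:
    def __init__(self) -> None:
        self._agents: Dict[str, SubAgent] = {}

    def register(self, capability: str, agent: SubAgent) -> None:
        # Adding a capability means adding one registry entry;
        # existing sub-agents are not touched.
        self._agents[capability] = agent

    def capabilities(self) -> List[str]:
        return sorted(self._agents)

    def route(self, capability: str, task: str) -> dict:
        # The orchestrator knows what each sub-agent can do and what it
        # returns, not how the sub-agent does it.
        if capability not in self._agents:
            raise LookupError(f"no sub-agent registered for {capability!r}")
        return self._agents[capability](task)
```

A new sub-agent is added with a single `register` call, for example `orchestrator.register("summarize", summarize_agent)`, where `summarize_agent` is whatever callable wraps that capability.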
What the orchestrator does:
Task decomposition: breaking a goal into sub-tasks.
Routing: choosing which sub-agent handles which sub-task.
Dependency management: ensuring sub-tasks run in the right order when outputs from one feed into another.
Synthesis: assembling sub-agent outputs into a coherent final result.
Error handling: deciding what to do when a sub-agent fails or returns an unexpected result.
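Dependency management is the responsibility that maps most directly to code. A minimal sketch, assuming each sub-task is a callable that receives the outputs of its prerequisites, can use the standard library's `graphlib.TopologicalSorter` (Python 3.9+). The task names and the `run_plan` helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_plan(tasks, dependencies):
    """Run each sub-task after its prerequisites and pass their outputs in.

    tasks: name -> callable taking a dict of prerequisite outputs
    dependencies: name -> set of prerequisite names
    """
    results = {}
    # static_order() yields every node with its prerequisites first and
    # raises graphlib.CycleError if the plan has a circular dependency.
    for name in TopologicalSorter(dependencies).static_order():
        inputs = {dep: results[dep] for dep in dependencies.get(name, ())}
        results[name] = tasks[name](inputs)
    return results


# Hypothetical plan: the briefing needs both retrieval steps to finish first.
tasks = {
    "pull_deal_history": lambda inputs: ["renewal due in Q3"],
    "fetch_metrics": lambda inputs: {"logins_last_30d": 42},
    "draft_briefing": lambda inputs: f"{len(inputs)} sources merged",
}
dependencies = {"draft_briefing": {"pull_deal_history", "fetch_metrics"}}
briefing = run_plan(tasks, dependencies)["draft_briefing"]
```

This version runs sub-tasks one at a time in a valid order. Running independent sub-tasks concurrently would need the sorter's `get_ready()` and `done()` methods instead of `static_order()`.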
What the orchestrator does not do: execute domain-specific work itself. An orchestrator that starts writing code or querying databases directly is an orchestrator that has grown beyond its role — and that growth typically introduces the coherence problems the architecture was designed to avoid.
Sub-agents are designed around a single principle: do one thing well and return a structured output the orchestrator can use. The sub-agent should be stateless where possible — it receives a bounded task, executes it, and returns a result, without maintaining memory of previous calls. This makes sub-agents individually testable, replaceable, and debuggable.
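A stateless sub-agent with a structured return value might look like the sketch below. The `AgentResult` shape and the toy `word_count_agent` are assumptions chosen for illustration; a real sub-agent would call a model or a tool where this one counts words.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentResult:
    status: str                                  # "ok" or "error"
    output: dict = field(default_factory=dict)   # structured payload
    error: str = ""                              # populated only on failure


def word_count_agent(task: dict) -> AgentResult:
    # Stateless: everything the agent needs arrives in `task`,
    # and nothing is remembered between calls.
    text = task.get("text")
    if not isinstance(text, str):
        return AgentResult(status="error", error="missing 'text' field")
    return AgentResult(status="ok", output={"words": len(text.split())})
```

Because the function depends only on its input, it can be unit-tested in isolation and swapped for another implementation that returns the same `AgentResult` shape.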
OpenAI's Assistants API, documented in detail in its 2024 developer documentation, implements this model explicitly: each assistant has a defined set of tools, is invoked with a specific task, and returns a structured response. Multiple assistants are coordinated through a shared thread that the orchestrating assistant manages — the thread serving as the shared working memory the individual assistants don't maintain themselves.
The hardest orchestration design decision is not what the orchestrator does — it's where sub-agent boundaries go. Draw them too wide and sub-agents become mini-monoliths. Draw them too narrow and the orchestrator spends more time coordinating than the sub-agents spend working. This is an engineering judgment call that documentation from Anthropic, OpenAI, and Google all describe as context-dependent.
You'll work with an AI tutor to design the orchestration layer for a given complex task. You'll specify what the orchestrator decides, what each sub-agent handles, what structured outputs they return, and how the orchestrator synthesizes the results.
In 2023, researchers at Tsinghua University and collaborating institutions released "AgentBench: Evaluating LLMs as Agents" (later presented at ICLR 2024), which benchmarked agent performance across eight environments, including operating system, database, knowledge graph, digital card game, lateral thinking puzzle, and house-holding tasks. One recurring failure category in its analysis was output that did not follow the expected format. The same weakness shows up at the handoffs in a multi-agent pipeline: when agents pass unstructured natural language between stages, downstream agents have to interpret upstream outputs rather than simply parse them, which introduces ambiguity and error at every stage junction. The practical lesson for production systems is to treat structured inter-agent communication as a design requirement.
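A structured handoff can be as simple as a JSON object that the downstream agent parses and validates instead of interpreting. The sketch below assumes a hypothetical three-field schema (`task_id`, `status`, `findings`); real systems would define their own fields.

```python
import json

# Hypothetical handoff schema: field name -> required Python type.
REQUIRED_FIELDS = {"task_id": str, "status": str, "findings": list}


def parse_handoff(raw: str) -> dict:
    """Parse an upstream agent's output and reject anything off-schema."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"handoff is not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise ValueError("handoff must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(payload.get(name), expected):
            raise ValueError(f"field {name!r} is missing or not a {expected.__name__}")
    return payload
```

A free-text reply such as "I found two issues, see above" fails loudly at the stage junction instead of being silently misread by the next agent.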
Multi-agent systems use four primary communication patterns, each suited to different task structures. Choosing the wrong pattern for a task type is one of the most common sources of multi-agent system failures documented in the 2023–2024 research literature.
The communication pattern should match the dependency structure of the task, not the preferences of the designer. Sequential work requires sequential patterns. Independent work is wasted on sequential patterns. Peer-to-peer critique is expensive and should be reserved for tasks where quality validation is genuinely difficult.
Beyond the structural pattern, agents need a mechanism for actually exchanging information. The two primary mechanisms are shared memory — a common data store all agents can read from and write to — and message passing — agents explicitly sending structured messages to other agents or to a central message bus.
OpenAI's Assistants API uses a thread-based shared memory model: all agents in a workflow share access to a thread that serves as the canonical record of the task's state. Any agent can read the thread; outputs are appended to it. This avoids the fragmentation that occurs when each agent maintains its own isolated context.
LangGraph, LangChain's agent orchestration framework, uses a message-passing model with a shared state object. Each agent receives a copy of the current state, performs its task, and returns an updated state object. The orchestrator (a "supervisor" in LangChain terminology) routes state updates between agents. This model makes the task's history explicitly traceable, since every state transition is logged, which is essential for debugging complex multi-agent workflows.
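The copy-in, updated-state-out flow described above can be sketched in plain Python. This is not LangChain's API; `run_pipeline` and the two toy agents are hypothetical and only illustrate how passing state explicitly produces a traceable history.

```python
import copy


def run_pipeline(initial_state: dict, agents):
    """Pass a state object through (name, agent) pairs, logging each transition."""
    state = copy.deepcopy(initial_state)
    history = [("start", copy.deepcopy(state))]
    for name, agent in agents:
        # Each agent works on its own copy, so it cannot mutate
        # the state that other agents or the log will see.
        updated = agent(copy.deepcopy(state))
        history.append((name, copy.deepcopy(updated)))
        state = updated
    return state, history


def research_agent(state: dict) -> dict:
    state["notes"] = ["fact A", "fact B"]
    return state


def writer_agent(state: dict) -> dict:
    state["draft"] = "; ".join(state["notes"])
    return state


final_state, history = run_pipeline(
    {"goal": "brief"},
    [("research", research_agent), ("writer", writer_agent)],
)
```

After the run, `history` holds one snapshot per transition, so a bad final state can be traced back to the first agent whose snapshot looks wrong.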
Shared memory creates write-conflict risks when multiple agents update the same state simultaneously in parallel patterns. Message passing avoids this but requires explicit serialization logic. Production multi-agent systems almost always choose one pattern and enforce it consistently — mixing both within a single system is a documented source of hard-to-debug inconsistencies.
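One common way to make write conflicts visible in a shared store is a version check on every write (optimistic concurrency). The sketch below is an assumption-laden illustration, not how any particular product implements shared memory: `SharedState` and its methods are hypothetical.

```python
import threading


class SharedState:
    """A shared store that rejects writes based on a stale read."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data: dict = {}
        self._version = 0

    def read(self):
        # Return a snapshot plus the version it corresponds to.
        with self._lock:
            return dict(self._data), self._version

    def write(self, updates: dict, expected_version: int) -> bool:
        # The write succeeds only if nobody else has written since the
        # caller's read; otherwise the caller must re-read and retry.
        with self._lock:
            if expected_version != self._version:
                return False
            self._data.update(updates)
            self._version += 1
            return True
```

If two agents read version 0 and both try to write, the second write returns `False` rather than silently overwriting the first, which turns a hidden inconsistency into an explicit retry.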
You'll work with an AI tutor to analyze multi-agent system scenarios and select the appropriate communication pattern. You'll also work through the shared memory vs. message passing tradeoff for a given system.
In early 2024, security researchers at companies including Trail of Bits documented a class of attack called "prompt injection in multi-agent pipelines." The attack works as follows: malicious content in an external data source (a webpage, a document, an email) contains instructions that, when processed by a sub-agent with web browsing or document reading tools, cause that sub-agent to change its behavior — exfiltrating data, making unauthorized API calls, or sending false information upstream to the orchestrator. The orchestrator, trusting its sub-agents, acts on the corrupted output. The attack propagates upward. Anthropic's 2024 guidance on Claude in agentic contexts explicitly flags this as a live threat and recommends minimal footprint, explicit human confirmation before irreversible actions, and skepticism about claimed permissions from environmental context.
In a sequential multi-agent pipeline, an error in stage two doesn't just affect stage two's output — it becomes the input to stage three, which works on corrupted data, and passes its corrupted output to stage four. By stage five, the original error may be unrecognizable, buried under layers of subsequent processing. This is the cascade problem, and it's one reason the research literature consistently recommends validation steps between pipeline stages rather than pure end-to-end processing.
Google's 2024 documentation for Gemini in agentic workflows describes "verification agents" — lightweight sub-agents whose sole job is to check the output of a previous sub-agent before passing it downstream. This adds latency but dramatically reduces the risk of cascade failures propagating through a long pipeline. The cost-benefit analysis depends on the reversibility of the task: for tasks where downstream errors are catastrophic or hard to reverse, verification agents are worth the overhead.
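A verification step between stages can be sketched as a check that runs on each stage's output before the next stage sees it. The names below (`run_validated_pipeline`, `StageValidationError`, and the toy stages) are hypothetical, and the checks are plain functions standing in for a lightweight verification agent.

```python
class StageValidationError(Exception):
    """Raised when a stage's output fails its check."""


def run_validated_pipeline(data, stages):
    """Run (name, stage, check) triples, stopping at the first bad output."""
    for name, stage, check in stages:
        data = stage(data)
        if not check(data):
            # Stop here so the error is reported at its source
            # instead of being processed further downstream.
            raise StageValidationError(f"stage {name!r} produced invalid output")
    return data


# Toy pipeline: extract integers from text, then total them.
stages = [
    ("extract", lambda text: [int(t) for t in text.split()], lambda out: len(out) > 0),
    ("total", lambda nums: sum(nums), lambda out: out >= 0),
]
```

With these stages, `run_validated_pipeline("3 4 5", stages)` returns the total, while an empty input is rejected at the `extract` stage rather than flowing on as a misleading total of zero.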
Any multi-agent pipeline that takes irreversible real-world actions — sending emails, making purchases, modifying databases, executing code — should have verification checkpoints before those actions. Anthropic's guidance for Claude in agentic settings frames this as a first principle: prefer reversible actions, request minimal permissions, and confirm with humans before irreversible steps.
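A confirmation checkpoint can be sketched as a gate in front of the action executor. Everything here is illustrative: the action names, the `handlers` mapping, and the `confirm` callback (which in practice would ask a human) are assumptions for the example.

```python
# Hypothetical set of actions treated as irreversible.
IRREVERSIBLE = {"send_email", "make_purchase", "modify_database", "execute_code"}


def execute_action(action: str, payload: dict, handlers: dict, confirm) -> dict:
    """Run an action, requiring confirmation first if it cannot be undone.

    handlers: action name -> callable taking the payload
    confirm:  callable (action, payload) -> bool, e.g. a human approval prompt
    """
    if action in IRREVERSIBLE and not confirm(action, payload):
        # The handler is never called when confirmation is refused.
        return {"status": "blocked", "action": action}
    return {"status": "done", "action": action, "result": handlers[action](payload)}
```

Reversible actions such as drafting text pass straight through, while anything in `IRREVERSIBLE` runs only after the `confirm` callback returns `True`.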
Multi-agent systems need an explicit trust model. When an orchestrator delegates to a sub-agent, the sub-agent should not inherit all the orchestrator's permissions automatically. Principle of least privilege — each component gets only the permissions it needs to do its specific job — is as important in multi-agent AI systems as it is in traditional software security.
The 2024 OWASP Top 10 for Large Language Model Applications lists "excessive agency" as a top risk: an LLM agent with more permissions than its task requires creates unnecessary attack surface. In a multi-agent context, this means sub-agents should be scoped to exactly the tools and data they need. A summarization sub-agent has no business being able to send emails. A data retrieval sub-agent shouldn't be able to modify the database it's reading.
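Least privilege for sub-agents can be sketched as a per-agent tool allowlist. The `ScopedToolbox` class and the tool names below are hypothetical; the point is only that a sub-agent receives a view containing the tools it was granted and nothing else.

```python
class ScopedToolbox:
    """Expose only an explicitly granted subset of the system's tools."""

    def __init__(self, all_tools: dict, allowed) -> None:
        unknown = set(allowed) - set(all_tools)
        if unknown:
            raise ValueError(f"cannot grant unknown tools: {sorted(unknown)}")
        # Copy only the granted tools; the rest are unreachable from here.
        self._tools = {name: all_tools[name] for name in allowed}

    def call(self, name: str, *args, **kwargs):
        if name not in self._tools:
            raise PermissionError(f"tool {name!r} is not granted to this agent")
        return self._tools[name](*args, **kwargs)


# Hypothetical system-wide tools.
ALL_TOOLS = {
    "read_document": lambda doc_id: f"contents of {doc_id}",
    "send_email": lambda to, body: f"sent to {to}",
}

# The summarization sub-agent can read documents but cannot send email.
summarizer_tools = ScopedToolbox(ALL_TOOLS, allowed=["read_document"])
```

With this scoping, `summarizer_tools.call("read_document", "q3-report")` works, while `summarizer_tools.call("send_email", ...)` raises `PermissionError` even if injected content persuades the sub-agent to try.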
Multi-agent systems amplify both capability and risk. Each additional agent that can take real-world actions is another point of failure and another potential attack surface. The architectural response is not to limit capability but to enforce minimal footprint, explicit permissions, and human oversight at the points where mistakes would be irreversible. This is the design philosophy documented across Anthropic, Google, and OpenAI's 2024 agentic safety guidance.