🎯 Advanced · Lesson 1 of 4

Orchestrator Patterns

How OpenClaw assigns, routes, and supervises multiple AI agents working in parallel on complex tasks.

In March 2024, Cognition AI publicly demonstrated Devin, which it billed as the first AI software engineer capable of completing full software tasks end-to-end. Devin's architecture relied on an orchestrator layer that decomposed a single user request into subtasks, assigned each to a specialized sub-agent (browser, code editor, terminal), monitored execution states, and re-routed failed steps. When the browser sub-agent timed out on a package documentation lookup, the orchestrator detected the stall via a heartbeat check and rerouted the request to a cached documentation retrieval agent, all without user intervention. The resulting system resolved 13.86% of SWE-bench issues unassisted, a benchmark result that pushed multi-agent architectures onto the roadmaps of other AI labs.

What an Orchestrator Actually Does

An orchestrator is the top-level controller in a multi-agent system. It receives a high-level goal, breaks it into concrete subtasks, selects which sub-agent should handle each subtask, passes appropriate context, and aggregates results into a coherent final output. The orchestrator never executes tasks itself — it delegates, monitors, and synthesizes.

OpenClaw implements orchestration through a three-phase loop: decompose → dispatch → integrate. During decomposition, the orchestrator uses a planning model (typically a larger, slower model) to generate a dependency graph of subtasks. During dispatch, it sends each subtask whose dependencies are already satisfied (initially, the nodes with no dependencies) to the appropriate sub-agent with a scoped context window. During integration, it merges sub-agent outputs, resolves conflicts, and determines whether the goal has been met or whether replanning is needed.
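The loop can be sketched in a few lines of Python. This is a minimal illustration, not OpenClaw's actual API: the `PLAN` graph, `run_sub_agent`, and `orchestrate` are hypothetical names, and the planning model is replaced by a hand-written dependency graph. The standard library's `graphlib.TopologicalSorter` hands out subtasks only once everything they depend on has finished.

```python
from graphlib import TopologicalSorter

# Hypothetical plan: each subtask maps to the set of subtasks it depends on.
PLAN = {
    "fetch_pages": set(),
    "fetch_pricing": set(),
    "extract_features": {"fetch_pages"},
    "write_summary": {"extract_features", "fetch_pricing"},
}

def run_sub_agent(task: str, context: dict) -> str:
    """Stand-in for a real sub-agent call; reports which context keys it saw."""
    return f"{task} done (saw {sorted(context)})"

def orchestrate(plan: dict[str, set[str]]) -> dict[str, str]:
    results: dict[str, str] = {}
    sorter = TopologicalSorter(plan)   # decompose: the dependency graph
    sorter.prepare()
    while sorter.is_active():
        # dispatch: every subtask whose dependencies have completed
        for task in sorter.get_ready():
            scoped = {dep: results[dep] for dep in plan[task]}  # scoped context only
            results[task] = run_sub_agent(task, scoped)         # integrate the result
            sorter.done(task)
    return results
```

Calling `orchestrate(PLAN)` runs `write_summary` last and gives it only the outputs of `extract_features` and `fetch_pricing`. A real integration phase would also check whether the goal was met and replan if not.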

Key Distinction

A single LLM using chain-of-thought is not an orchestrator. True orchestration requires separate execution contexts for each sub-agent — meaning independent token budgets, tool access, and memory scopes — coordinated by a distinct controller layer.

The planning model and the sub-agent models do not need to be the same. OpenClaw typically uses a high-capability model (Claude Opus class) for orchestration and faster, cheaper models (Claude Haiku class) for high-volume subtasks like web scraping, format conversion, or data extraction. This hybrid approach reduces cost by 60–80% compared to running all steps through a single large model.

Routing Strategies in OpenClaw

OpenClaw supports three primary routing strategies, each suited to different task structures:

  • Sequential routing: Sub-agents execute in a fixed order where each agent's output becomes the next agent's input. Used for tasks with strict dependency chains, like document extraction → analysis → report generation.
  • Parallel routing: Sub-agents execute concurrently on independent subtasks. Used when subtasks do not depend on each other, such as querying three different databases at once.
  • Conditional routing: The orchestrator evaluates intermediate results and selects which sub-agent to invoke next based on those results. Used for tasks where the path forward depends on what was discovered — the pattern Devin used for its error-recovery behavior.

Real production systems combine all three. A typical OpenClaw pipeline might start with parallel data gathering, feed results through a sequential analysis chain, and use conditional routing at the final step to choose between a standard report format or an escalation workflow.
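A combined pipeline of this shape might look like the sketch below. All four agent functions are hypothetical stand-ins that return canned data, and the `threshold` parameter is an assumption made for illustration. The point is only to show where each routing strategy appears in the control flow.

```python
import asyncio

async def gather_source(name: str) -> dict:
    """Hypothetical data-gathering sub-agent."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return {"source": name, "mentions": 3 if name == "news" else 0}

async def analyze(findings: list[dict]) -> dict:
    """Hypothetical analysis sub-agent: totals mentions across sources."""
    return {"total_mentions": sum(f["mentions"] for f in findings)}

async def standard_report(analysis: dict) -> str:
    return f"report: {analysis['total_mentions']} mentions"

async def escalate(analysis: dict) -> str:
    return f"ESCALATION: {analysis['total_mentions']} mentions"

async def pipeline(sources: list[str], threshold: int = 5) -> str:
    # Parallel routing: independent gathers run concurrently.
    findings = await asyncio.gather(*(gather_source(s) for s in sources))
    # Sequential routing: analysis consumes the gathered output.
    analysis = await analyze(list(findings))
    # Conditional routing: the intermediate result picks the next agent.
    handler = escalate if analysis["total_mentions"] >= threshold else standard_report
    return await handler(analysis)
```

Running `asyncio.run(pipeline(["news", "blog"]))` takes the standard report path. Lowering `threshold` to 2 sends the same data down the escalation path.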

Production Reality

Anthropic's published research on multi-agent systems (2024) notes that conditional routing is the hardest to get right because it requires the orchestrator to evaluate intermediate quality — a judgment task that itself can fail silently. Most production OpenClaw deployments add a separate quality gate sub-agent that evaluates each intermediate output before routing continues.

Supervision and Heartbeats

Because sub-agents operate asynchronously, the orchestrator must actively monitor their state rather than waiting passively for a return value. OpenClaw implements a heartbeat protocol: each sub-agent sends a status token every N seconds. If the orchestrator does not receive a heartbeat within a configurable timeout window, it marks the sub-agent as stalled and triggers a recovery action.
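The stall-detection half of such a protocol can be sketched as follows. `HeartbeatMonitor` and its method names are hypothetical, not OpenClaw identifiers. The clock is injectable so the timeout logic can be exercised without waiting in real time.

```python
import time

class HeartbeatMonitor:
    """Tracks the last heartbeat per sub-agent and reports stalled ones."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock  # injectable so tests can control time
        self.last_seen: dict[str, float] = {}

    def beat(self, agent_id: str) -> None:
        """Record a status token from a sub-agent."""
        self.last_seen[agent_id] = self.clock()

    def stalled(self) -> list[str]:
        """Return agents whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(a for a, t in self.last_seen.items()
                      if now - t > self.timeout_s)
```

An orchestrator would call `stalled()` on a timer and start a recovery action for each agent it returns.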

Recovery actions cascade through a priority order: first, retry the same sub-agent with the same input; second, retry with a simplified or truncated input; third, route to a fallback sub-agent; fourth, escalate to human review. This cascade combines the retry and fallback patterns from distributed systems engineering (a circuit breaker is a related pattern that stops calling a failing component entirely) and is one of the primary reasons OpenClaw can operate autonomously for hours without supervision on complex pipelines.
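The four-step priority order can be expressed as a short function. This is a sketch under stated assumptions: `AgentFailure`, `recover`, and the `simplify` callback are hypothetical names, and each sub-agent is modeled as a plain callable that raises on failure.

```python
from typing import Callable

class AgentFailure(Exception):
    """Raised by a sub-agent call that stalls or returns invalid output."""

def recover(task: str,
            primary: Callable[[str], str],
            fallback: Callable[[str], str],
            simplify: Callable[[str], str]) -> tuple[str, str]:
    """Try each recovery step in priority order; return (step_name, result)."""
    steps = [
        ("retry_same_input", lambda: primary(task)),
        ("retry_simplified", lambda: primary(simplify(task))),
        ("fallback_agent", lambda: fallback(task)),
    ]
    for name, attempt in steps:
        try:
            return name, attempt()
        except AgentFailure:
            continue  # move on to the next, more drastic step
    # Nothing worked: hand the original task to a human reviewer.
    return "human_review", task
```

Returning the name of the step that succeeded gives the orchestrator something to log, which helps when auditing how often each fallback is needed.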

🎯 Advanced · Lesson 1 Quiz

Quiz: Orchestrator Patterns

1. What are the three phases of OpenClaw's orchestration loop?
✓ Correct! OpenClaw's orchestration loop follows Decompose (task breakdown) → Dispatch (sub-agent assignment) → Integrate (result synthesis).
Not quite. OpenClaw's three-phase loop is Decompose → Dispatch → Integrate. The decompose phase creates a dependency graph, dispatch sends tasks to sub-agents, and integrate merges results.
2. Why does OpenClaw typically use a different (larger) model for orchestration than for sub-agent tasks?
✓ Correct! The orchestrator needs strong reasoning for planning and quality evaluation, while high-volume sub-agent tasks like data extraction can use faster, cheaper models, reducing cost by 60–80%.
Not quite. The split is about cost efficiency: large models for complex planning judgment, smaller models for high-volume simpler tasks. This reduces overall cost by 60–80%.
3. Which routing strategy did Devin's architecture use when its browser sub-agent timed out?
✓ Correct! Conditional routing selects the next sub-agent based on intermediate results. In Devin's case, that meant detecting the timeout and rerouting to a cached documentation retrieval agent.
Not quite. Conditional routing is when the orchestrator evaluates an intermediate result (the timeout/failure) and selects a different path forward — exactly what Devin's system did.
🎯 Advanced · Lab 1

Lab: Design an Orchestration Plan

Work with the AI to architect a multi-agent orchestration plan for a real research task.

Your Mission

You'll design an orchestration plan for a complex research task: automatically monitoring competitor product launches, summarizing key features, and alerting the product team when something significant is detected.

  1. Ask the AI to help you break this task into a dependency graph of subtasks.
  2. Identify which subtasks can run in parallel vs. which must be sequential.
  3. Decide where conditional routing is needed and what triggers it.
Start by asking: "Help me design a decompose-dispatch-integrate orchestration plan for competitor monitoring. What subtasks are needed and how should they be routed?"
🎯 Advanced · Lesson 2 of 4

Sub-Agent Design

How to scope, constrain, and interface individual agents so they can work reliably inside a larger system.

In November 2023, AutoGPT — the open-source multi-agent framework that briefly became the fastest-growing GitHub repository in history with 150,000 stars in under a month — published a postmortem on why most user pipelines failed. The core finding: sub-agents were being given goals instead of tasks. A sub-agent told to "research competitors" would recursively spawn more sub-agents, exhaust token budgets, and loop indefinitely. The fix required enforcing three constraints on every sub-agent: a single, atomic task description; a maximum tool call count; and an explicit output schema that the orchestrator could validate. Pipelines that enforced all three constraints had an 87% success rate versus 23% for unconstrained sub-agents.

The Atomic Task Principle

The single most important rule in sub-agent design is atomicity: each sub-agent should have exactly one well-defined task that can succeed or fail unambiguously. "Research competitors" is not atomic. "Fetch the pricing page at competitor.com and return the price of the Pro plan as a number" is atomic.

Atomic tasks have three properties: they have a clear completion condition, they produce a predictable output type, and they do not require the sub-agent to make judgment calls about scope. When a task requires judgment about scope, that judgment belongs to the orchestrator, not the sub-agent.

Design Rule

If you cannot write a unit test for a sub-agent's output — because you cannot define what "correct" looks like — the task is not atomic enough. Decompose it further before assigning it to a sub-agent.
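The unit-test rule can be made concrete with the pricing example above. In the sketch below, a deterministic parser stands in for the sub-agent (a real one would be a model call), and `extract_pro_price` and the sample page text are hypothetical. What matters is that the test can state exactly what a correct output is.

```python
import re

def extract_pro_price(page_text: str) -> float:
    """Atomic task: return the Pro plan's price as a number, or raise ValueError."""
    match = re.search(r"Pro\b[^$]*\$(\d+(?:\.\d{1,2})?)", page_text)
    if match is None:
        raise ValueError("Pro plan price not found")
    return float(match.group(1))

def test_extract_pro_price() -> None:
    page = "Starter $9/mo | Pro $29.50/mo | Enterprise: contact us"
    # The completion condition is unambiguous: one number, or a clear failure.
    assert extract_pro_price(page) == 29.5
```

No comparable test can be written for "research competitors", because there is no single output that counts as correct.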

OpenClaw enforces atomicity structurally: sub-agents cannot spawn other sub-agents. Only the orchestrator can create new sub-agent instances. This single architectural constraint eliminates the recursive spawning problem that caused AutoGPT pipelines to fail.

Output Schemas and Contracts

Every sub-agent in OpenClaw operates under an explicit output contract — a JSON schema that defines exactly what the sub-agent must return. The orchestrator validates every sub-agent output against its schema before integrating it. If validation fails, the orchestrator treats the result as a soft failure and triggers its recovery cascade.

Output schemas serve three purposes beyond just data formatting. First, they force sub-agent designers to think concretely about what success looks like before writing the prompt. Second, they prevent ambiguous outputs from silently corrupting downstream sub-agents. Third, they make the system auditable — every data transformation in the pipeline is documented by its schema transition.

  • Required fields: Status (success/failure/partial), primary output payload, confidence score, and a brief reasoning summary (not the raw reasoning chain, which is excluded below).
  • Optional fields: Citations or sources, warnings, suggested follow-up queries.
  • Never include: Raw reasoning chains, unstructured free text, or nested agent calls in the payload.
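A contract check along these lines could be sketched in plain Python. The field names (`payload`, `reasoning`, `follow_ups`) and the `validate_output` function are assumptions chosen to mirror the list above, not OpenClaw's real schema. A production system would more likely use a JSON Schema validator.

```python
REQUIRED = {"status": str, "payload": dict, "confidence": float, "reasoning": str}
OPTIONAL = {"citations": list, "warnings": list, "follow_ups": list}
STATUSES = {"success", "failure", "partial"}

def validate_output(output: dict) -> list[str]:
    """Return contract violations; an empty list means the output is valid."""
    errors = []
    for name, kind in REQUIRED.items():
        if name not in output:
            errors.append(f"missing required field: {name}")
        elif not isinstance(output[name], kind):
            errors.append(f"{name} must be {kind.__name__}")
    for name, kind in OPTIONAL.items():
        if name in output and not isinstance(output[name], kind):
            errors.append(f"{name} must be {kind.__name__}")
    # Anything outside the contract (for example free text) is rejected.
    for name in sorted(set(output) - set(REQUIRED) - set(OPTIONAL)):
        errors.append(f"unexpected field: {name}")
    status = output.get("status")
    if isinstance(status, str) and status not in STATUSES:
        errors.append(f"unknown status: {status}")
    return errors
```

A non-empty error list is what the orchestrator would treat as a soft failure before starting its recovery cascade.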
Real Deployment Example

Salesforce's Einstein Copilot, which uses a multi-agent architecture announced in February 2024, requires every sub-agent to return a structured CRM action object rather than natural language. This allows the orchestrator to directly validate, reject, or execute the action without parsing free text — reducing integration errors by over 70% compared to their earlier text-based prototype.

Tool Access Scoping

Sub-agents in OpenClaw are granted only the tools they need for their specific task — never a full tool suite. A web-scraping sub-agent gets browser access but not file-write access. A report-formatting sub-agent gets file-write access but not browser access. This principle of least privilege is not just a security measure; it also reduces the sub-agent's action space, which measurably improves reliability by eliminating decisions the sub-agent shouldn't be making.
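Least-privilege tool access can be enforced with a thin wrapper around a tool registry. The sketch below is illustrative only: the `TOOLS` registry, the two toy tools, and `ScopedToolbox` are hypothetical, and the tools return strings instead of touching the network or disk.

```python
# Hypothetical registry of every tool the system knows about.
TOOLS = {
    "browser_fetch": lambda url: f"<html from {url}>",
    "file_write": lambda path, text: f"wrote {len(text)} chars to {path}",
}

class ScopedToolbox:
    """Exposes only the tools granted to one sub-agent."""

    def __init__(self, granted: set[str]):
        unknown = granted - set(TOOLS)
        if unknown:
            raise KeyError(f"unknown tools: {sorted(unknown)}")
        self._granted = frozenset(granted)

    def call(self, name: str, *args):
        # Tools outside the grant are refused, never silently ignored.
        if name not in self._granted:
            raise PermissionError(f"tool not granted: {name}")
        return TOOLS[name](*args)
```

A scraping agent built with `ScopedToolbox({"browser_fetch"})` can fetch pages, but any attempt to call `file_write` raises `PermissionError`.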

Anthropic's research team demonstrated this in their 2024 paper on agentic systems: sub-agents given access to N tools where only 1 tool was relevant to their task had a 31% higher rate of tool misuse compared to sub-agents given access only to that 1 relevant tool. Scoping tools is one of the highest-leverage reliability interventions available to multi-agent system designers.

🎯 Advanced · Lesson 2 Quiz

Quiz: Sub-Agent Design

3 questions — free, untracked, retake anytime.

1. What was the primary cause of AutoGPT pipeline failures identified in the November 2023 postmortem?
✓ Correct! Sub-agents given broad goals like "research competitors" would recursively spawn more sub-agents, exhaust token budgets, and loop. The fix was atomic task descriptions, tool call limits, and output schemas.
Not quite. The core problem was giving sub-agents goals rather than atomic tasks — leading to recursive sub-agent spawning, budget exhaustion, and infinite loops.
2. According to OpenClaw's architecture, which entity is allowed to create new sub-agent instances?
✓ Correct! OpenClaw enforces that only the orchestrator can spawn sub-agents. This single constraint eliminates recursive spawning failures.
Not quite. In OpenClaw, only the orchestrator can create sub-agent instances. Sub-agents cannot spawn other sub-agents — this eliminates recursive spawning.
3. Why does scoping tool access to only relevant tools improve sub-agent reliability?
✓ Correct! Anthropic's 2024 research found that sub-agents given irrelevant tools had a 31% higher misuse rate. Restricting tools eliminates unnecessary decisions and improves task focus.
Not quite. The key insight from Anthropic's 2024 research: when sub-agents have access to irrelevant tools, they misuse them 31% more often. Scoping tools reduces the decision space, improving reliability.
🎯 Advanced · Lab 2

Lab: Write Sub-Agent Specs

Practice writing atomic task descriptions and output schemas for real sub-agents.

Your Mission

You'll design two sub-agent specifications for the competitor monitoring system from Lab 1. Each spec needs an atomic task description, a tool access list, and an output schema.

  1. Ask the AI to help you write a sub-agent spec for the web-scraping task (fetching competitor pricing).
  2. Then ask it to write a spec for the analysis task (detecting significant changes).
  3. For each spec, verify that the task description passes the "unit test rule" — you can define what correct looks like.
Start by asking: "Help me write a sub-agent spec for scraping a competitor's pricing page. Include an atomic task description, required tools, and a JSON output schema."
🎯 Advanced · Lesson 3 of 4

Context & Memory Management

How OpenClaw passes, scopes, and persists information across sub-agents without overflowing context windows.

In September 2023, Microsoft Research published "LongAgent," a paper demonstrating that naive multi-agent systems stuffed their entire conversation history into every sub-agent's context — causing context window overflows on tasks exceeding 128k tokens. Their solution, context scoping, became a foundational technique: the orchestrator maintains a master state document and passes each sub-agent only the slice of state relevant to its specific task. In benchmark testing, scoped context delivery reduced context overflow failures by 94% and improved task accuracy by 18% because sub-agents were no longer distracted by irrelevant prior steps. This technique is directly implemented in OpenClaw's context routing layer.

The Three Memory Tiers

OpenClaw manages three distinct tiers of memory, each with different scope, persistence, and access patterns:

  • Working memory: The context window of an active sub-agent. Ephemeral — destroyed when the sub-agent completes. Contains only what that agent needs to do its job right now.
  • Session memory: A structured state document maintained by the orchestrator for the duration of a pipeline run. Accumulates results from completed sub-agents. Sub-agents can read from session memory only through explicit orchestrator-mediated queries — they cannot read the full state document.
  • Persistent memory: A vector database or structured store that survives across pipeline runs. Used for learned preferences, historical data, and cross-session knowledge. Queried by the orchestrator at pipeline initialization to inject relevant prior context into the session state.
Critical Insight

Sub-agents never have direct access to persistent memory or the full session state. All memory access is mediated by the orchestrator. This prevents a common failure mode where sub-agents hallucinate connections between unrelated prior events because irrelevant history was in their context.
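The mediation rule can be illustrated with a small class. This is a sketch with hypothetical names (`Orchestrator`, `start_run`, `working_memory_for`), and plain dictionaries stand in for the vector database and the session state document.

```python
class Orchestrator:
    """Owns session and persistent memory; sub-agents only receive copies of slices."""

    def __init__(self, persistent: dict[str, str]):
        self._persistent = persistent       # survives across runs (stand-in for a real store)
        self._session: dict[str, str] = {}  # lives for one pipeline run

    def start_run(self, relevant_keys: list[str]) -> None:
        # Inject selected prior knowledge into session state at initialization.
        self._session = {k: self._persistent[k]
                         for k in relevant_keys if k in self._persistent}

    def record(self, key: str, value: str) -> None:
        # Accumulate a completed sub-agent's result in session memory.
        self._session[key] = value

    def working_memory_for(self, needed_keys: list[str]) -> dict[str, str]:
        # A sub-agent gets a fresh copy of only the slice it needs.
        return {k: self._session[k] for k in needed_keys if k in self._session}
```

Because `working_memory_for` returns a new dictionary, a sub-agent that modifies its working memory cannot alter session state. Anything not requested on its behalf never reaches its context.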

Context Injection and Summarization

Before dispatching a sub-agent, OpenClaw's orchestrator performs context injection: it queries the session state for information relevant to that specific subtask and formats it into a concise context block. This is not simply appending prior outputs — it involves semantic retrieval to find the most relevant prior results and active summarization to compress them to fit within the sub-agent's token budget.

Summarization in OpenClaw is itself performed by a dedicated compression sub-agent: a lightweight model optimized for fact-preserving summarization of structured data. This model takes multi-page outputs from earlier pipeline stages and produces 200–400 token summaries that aim to preserve all factual claims while discarding reasoning chains and redundant phrasing.
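The selection step of context injection can be sketched as follows. Two simplifications are swapped in and should be read as assumptions: keyword overlap replaces semantic (embedding-based) retrieval, and a word count replaces a real tokenizer. The function name `build_context_block` is hypothetical.

```python
def build_context_block(subtask: str, session: dict[str, str], budget_tokens: int) -> str:
    """Pick the most relevant prior results and pack them within a token budget."""
    query = set(subtask.lower().split())

    def overlap(text: str) -> int:
        # Stand-in relevance score: number of words shared with the subtask.
        return len(query & set(text.lower().split()))

    ranked = sorted(session.items(), key=lambda kv: overlap(kv[1]), reverse=True)
    lines, used = [], 0
    for key, text in ranked:
        cost = len(text.split())  # crude estimate: one token per word
        if overlap(text) == 0 or used + cost > budget_tokens:
            continue  # irrelevant, or would exceed the budget
        lines.append(f"[{key}] {text}")
        used += cost
    return "\n".join(lines)
```

In a fuller version, results that are relevant but too long would be sent to the compression sub-agent instead of being skipped.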

Token Budget Management

OpenClaw allocates token budgets per sub-agent at pipeline initialization based on task complexity estimates. A scraping agent might get 4,000 tokens; an analysis agent might get 16,000. The orchestrator tracks cumulative token spend across the pipeline and adjusts subsequent allocations dynamically if early stages run over budget.
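One simple way to adjust later allocations is to scale them proportionally to whatever budget remains. The sketch below assumes that policy; the `rebalance` function, the stage names, and the 24,000-token total are hypothetical, and the source does not say which adjustment rule OpenClaw uses.

```python
def rebalance(planned: dict[str, int], spent: dict[str, int],
              total_budget: int) -> dict[str, int]:
    """Scale allocations for stages that have not run to fit the remaining budget."""
    remaining_budget = total_budget - sum(spent.values())
    pending = {s: t for s, t in planned.items() if s not in spent}
    planned_pending = sum(pending.values())
    if not pending or planned_pending <= remaining_budget:
        return pending  # on track: keep the original plan
    scale = max(remaining_budget, 0) / planned_pending
    return {s: int(t * scale) for s, t in pending.items()}

planned = {"scrape": 4000, "analyze": 16000, "report": 4000}
# The scraping stage used 9,000 tokens instead of its planned 4,000.
adjusted = rebalance(planned, spent={"scrape": 9000}, total_budget=24000)
# adjusted == {"analyze": 12000, "report": 3000}
```

Here 15,000 tokens remain for stages that planned to use 20,000, so each pending allocation is scaled by 0.75.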

Google DeepMind's Gemini 1.5 benchmarks (February 2024) demonstrated that even with a 1-million-token context window, models perform significantly better on needle-in-a-haystack retrieval tasks when relevant information is surfaced to the beginning of the context rather than buried in the middle. This finding reinforces the case for active context injection over passive context accumulation even when large windows are available.

State Consistency Across Parallel Agents

Parallel sub-agents create a state consistency challenge: if two agents simultaneously produce results that update the same field in session state, which result wins? OpenClaw uses optimistic concurrency control adapted from database engineering: each sub-agent reads a versioned snapshot of session state at dispatch time and writes its results back with a version tag. The orchestrator's integration phase detects version conflicts and resolves them by applying a merge strategy appropriate to the data type — last-write-wins for scalar values, union for lists, and human escalation for contradictory factual claims.
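The versioned write-back and the three merge strategies can be sketched with a small state class. The names (`SessionState`, `commit`, the `factual` flag) are hypothetical, and per-key version tracking is one possible way to detect conflicts, assumed here for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class SessionState:
    version: int = 0
    data: dict = field(default_factory=dict)
    written_at: dict = field(default_factory=dict)   # key -> version of its last write
    escalations: list = field(default_factory=list)  # conflicts awaiting human review

    def commit(self, read_version: int, key: str, value, factual: bool = False) -> None:
        """Write `value`, merging if the key changed after the writer's snapshot."""
        conflict = self.written_at.get(key, 0) > read_version
        current = self.data.get(key)
        if conflict and factual and value != current:
            # Contradictory factual claims: keep the current value, ask a human.
            self.escalations.append((key, current, value))
            return
        if conflict and isinstance(current, list) and isinstance(value, list):
            value = current + [v for v in value if v not in current]  # union for lists
        self.data[key] = value  # no conflict, or last-write-wins for scalars
        self.version += 1
        self.written_at[key] = self.version
```

Two agents that both read version 0 and then each write a `tags` list end up with the union of their lists. The same situation on a field marked `factual=True` lands in `escalations` instead.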

🎯 Advanced · Lesson 3 Quiz

Quiz: Context & Memory Management

3 questions — free, untracked, retake anytime.

1. What specific problem did Microsoft Research's "LongAgent" paper address in September 2023?
✓ Correct! LongAgent found that naive systems passed entire conversation history to every sub-agent, causing overflow on tasks beyond 128k tokens. Scoped context delivery reduced failures by 94%.
Not quite. The paper specifically addressed context window overflows caused by passing full conversation history to every sub-agent. Context scoping — passing only relevant slices — was the solution.
2. In OpenClaw's three memory tiers, which tier is destroyed when a sub-agent completes its task?
✓ Correct! Working memory is ephemeral: it is the sub-agent's active context window and is destroyed when the sub-agent finishes. Session and persistent memory survive beyond individual sub-agent runs.
Not quite. Working memory is the sub-agent's context window — ephemeral and destroyed on completion. Session memory survives the pipeline run; persistent memory survives across runs.
3. How does OpenClaw resolve a version conflict when two parallel sub-agents update the same field in session state with contradictory factual claims?
✓ Correct! OpenClaw uses different merge strategies per data type: last-write-wins for scalars, union for lists, and human escalation specifically for contradictory factual claims.
Not quite. Merge strategy depends on data type. Contradictory factual claims — where both values may be "correct" from different contexts — are specifically escalated to human review rather than resolved algorithmically.
🎯 Advanced · Lab 3

Lab: Design a Memory Architecture

Map the three memory tiers to a real pipeline and design the context injection strategy.

Your Mission

You'll design the memory architecture for the competitor monitoring pipeline. This means deciding what lives in each memory tier and how the orchestrator injects context into each sub-agent.

  1. Ask the AI to help you map each piece of data in the pipeline to the correct memory tier (working, session, or persistent).
  2. Design the context injection block that the orchestrator would send to the analysis sub-agent.
  3. Identify at least one scenario where parallel agents could create a state conflict, and how to resolve it.
Start by asking: "Help me design the memory architecture for a competitor monitoring pipeline. Map each data type to working, session, or persistent memory, and show me what a context injection block looks like for the analysis sub-agent."
Building AI Agents IV — OpenClaw · Module 5 · Lesson 4

L4: Failure & Recovery

Advanced concepts, real-world applications, and practical implications
Core Concepts

This lesson explores failure and recovery: the key principles, real-world applications, and implications for practitioners working in this domain.

Understanding this topic requires both theoretical grounding and practical awareness of how these concepts manifest in deployed systems. The frameworks covered in earlier lessons provide the foundation; this lesson connects them to implementation reality.

Practical Applications

The transition from theory to practice reveals challenges that pure conceptual frameworks don't capture. Real-world deployment introduces constraints, trade-offs, and edge cases that demand nuanced judgment rather than rigid rule-following.

Effective practitioners in this space develop the ability to reason across multiple frameworks simultaneously, recognizing when different perspectives apply and how to resolve conflicts between competing priorities.

Looking Forward

As this field continues to evolve, the principles covered in this module will remain foundational even as specific technologies and implementations change. The ability to think critically about these topics — rather than simply memorizing current best practices — is what separates effective practitioners from those who merely follow checklists.

Lesson 4 Quiz

L4: Failure & Recovery
What is the primary focus of L4: Failure & Recovery?
✓ Correct. This lesson bridges theory and practice, focusing on real-world implementation.
Review the lesson — the focus is on connecting frameworks to practical reality.
Why does real-world deployment introduce challenges that pure theory doesn't capture?
✓ Correct. Real deployment requires judgment, not just framework application.
Practice doesn't invalidate theory — it reveals complexities that require nuanced application of theoretical principles.
What separates effective practitioners from those who merely follow checklists?
✓ Correct. Critical thinking and adaptability matter more than memorized procedures.
The key differentiator is critical thinking ability, not experience or resources alone.
🎯 Advanced · Lesson 4 Lab

Lab: Apply What You've Learned

Synthesize concepts from L4: Failure & Recovery through guided AI conversation

Your Task

Use the AI below to explore the concepts from Lesson 4 in depth. Ask questions, challenge assumptions, and work through practical scenarios related to failure and recovery.

Try: "How would the concepts from this lesson apply to a real-world scenario in this field?"

Module 5 Test

Orchestration and Sub-Agent Management · 15 Questions · 70% to Pass
1. What is the core objective of Orchestration and Sub-Agent Management?
2. How should practitioners approach applying concepts from this module?
3. Which best describes the relationship between theory and practice in Building AI Agents IV — OpenClaw?
4. What distinguishes expert practitioners from novices in this field?
5. How does Orchestration and Sub-Agent Management build on previous modules?
6. What role do constraints play in practical implementation?
7. When applying frameworks from this module, what is most important?
8. How should practitioners handle conflicting perspectives in this field?
9. What makes the concepts in Orchestration and Sub-Agent Management relevant beyond their immediate context?
10. How should practitioners continue developing expertise after completing this module?
11. What is the relationship between understanding Building AI Agents IV — OpenClaw concepts and making decisions?
12. How do the lessons from this module apply to novel situations?
13. What is the value of understanding multiple perspectives on the topics in this course?
14. How should practitioners evaluate new information or developments in this field?
15. What is the ultimate goal of learning Orchestration and Sub-Agent Management?