When Toru Iwatani's team at Namco shipped Pac-Man in 1980, each ghost ran its own tiny FSM cycling between three states: Chase, Scatter, and Frightened. The transition between Scatter and Chase fired on a hard-coded timer: on the first level the ghosts scattered for 7 seconds, chased for 20, then repeated, with the scatter phases shortening and eventually stopping. That mechanical rhythm, documented by Jamey Pittman in his 2009 "Pac-Man Dossier," is why the game feels fair: players learn the pattern, then exploit it. More than four decades later, FSMs remain part of the foundational vocabulary of NPC logic, and most major game engines still ship state-machine tooling.
A Finite State Machine (FSM) is a computational model describing a system that can exist in exactly one of a fixed number of states at any moment. Transitions between states are triggered by specific events or conditions. The machine is "finite" because the full list of possible states is known and bounded at design time.
In game NPC terms: an enemy guard is either Patrol, Alert, Chase, or Attack. It cannot be in two states simultaneously. When the player crosses a sensor radius, the condition "player spotted" fires a transition from Patrol → Alert. Once the guard confirms the sighting, Alert → Chase. When the guard closes within melee range, another condition fires Chase → Attack. When the player stays outside the detection cone for five seconds, Attack → Patrol, and the same timeout should also return the guard from Alert or Chase.
Each state contains its own entry action (what the NPC does when it arrives), update action (what it does every frame while inside the state), and exit action (cleanup before leaving). This three-part structure shows up directly in engine APIs: Unity's Animator state machines expose OnStateEnter, OnStateUpdate, and OnStateExit callbacks through StateMachineBehaviour, and Unreal Engine's Behavior Tree task nodes follow a similar start, tick, and finish lifecycle.
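The guard described above can be sketched in a few lines of Python. This is a minimal illustration, not any engine's API: the `Guard` class, the `MELEE_RANGE` value, and the choice to log the entry and exit hooks are all assumptions made for the example, and the transition wiring is just one plausible reading of the four states.

```python
from enum import Enum, auto


class GuardState(Enum):
    PATROL = auto()
    ALERT = auto()
    CHASE = auto()
    ATTACK = auto()


class Guard:
    """Hypothetical guard NPC that is always in exactly one state."""

    MELEE_RANGE = 2.0      # assumed distance units
    LOSE_SIGHT_SECS = 5.0  # the five-second escape window

    def __init__(self):
        self.state = GuardState.PATROL
        self.unseen_for = 0.0
        self.log = []  # records entry/exit hooks so they can be inspected
        self._enter(self.state)

    def _enter(self, state):
        # Entry action: a real game might start an animation here.
        self.log.append(f"enter {state.name}")

    def _exit(self, state):
        # Exit action: cleanup before leaving the state.
        self.log.append(f"exit {state.name}")

    def _change(self, new_state):
        self._exit(self.state)
        self.state = new_state
        self._enter(new_state)

    def update(self, dt, player_visible, distance):
        """Update action: track visibility, then test transitions."""
        self.unseen_for = 0.0 if player_visible else self.unseen_for + dt
        s = self.state
        if s is GuardState.PATROL and player_visible:
            self._change(GuardState.ALERT)
        elif s is GuardState.ALERT and player_visible:
            self._change(GuardState.CHASE)
        elif s is GuardState.CHASE and distance <= self.MELEE_RANGE:
            self._change(GuardState.ATTACK)
        elif (s is not GuardState.PATROL
              and self.unseen_for >= self.LOSE_SIGHT_SECS):
            self._change(GuardState.PATROL)
```

Calling `update` once per frame with the elapsed time, a visibility flag, and the distance to the player is enough to drive the machine. Because every transition is one branch of one `if` chain, a missing or mis-conditioned transition is easy to spot.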
FSMs dominated NPC design through the 1990s and 2000s for three concrete reasons. First, predictability: designers can enumerate every possible state transition on a whiteboard. If a bug appears (say, the guard walks into a wall forever), the designer can narrow it down to a specific missing or mis-conditioned transition. Second, performance: a switch statement over an enum costs almost nothing at runtime, a critical property when both CPU time and memory were scarce. Third, toolability: FSMs map directly to visual node graphs, enabling non-programmers to wire NPC logic. Valve's Half-Life (1998) used state-driven logic extensively for its HECU soldiers, and the readable clarity of such systems makes it practical for level designers, not just engineers, to tune enemy aggression by adjusting timer thresholds.
The canonical limitation of FSMs is the state explosion problem. As NPC behavior grows richer, the number of required states multiplies. An enemy that must handle patrol, alert, chase, combat, low-health retreat, flanking, cover-seeking, and group-coordination needs dozens of states and potentially hundreds of transitions. Managing that graph becomes error-prone. This limitation directly motivated the invention of Hierarchical FSMs, Behavior Trees, and utility AI — all topics in this module.
State: a discrete mode of behavior (Patrol, Chase, Attack). Transition: a conditional edge between states. Event: the trigger that causes a transition condition to be evaluated. Entry/Update/Exit Actions: the three lifecycle hooks within each state. Understanding this vocabulary is a prerequisite for every NPC system covered in this module.
A Hierarchical FSM (HFSM) nests states inside parent states. A parent state "Combat" might contain children: Melee, Ranged, and Retreat. Any transition defined at the Combat level — such as "player leaves the room → exit Combat" — applies automatically to all children without being duplicated on each child. This dramatically reduces the number of explicit transitions required.
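One way to picture the inheritance of parent transitions is a lookup that walks up the hierarchy. The sketch below is an assumption-laden illustration: the `State` class, the event names, and the `step` helper are hypothetical and do not correspond to any particular engine.

```python
class State:
    """A node in a hypothetical state hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.transitions = {}  # event name -> target State

    def on(self, event, target):
        self.transitions[event] = target
        return self

    def resolve(self, event):
        """Look for a handler here, then in each ancestor in turn."""
        node = self
        while node is not None:
            if event in node.transitions:
                return node.transitions[event]
            node = node.parent
        return None  # nobody handles this event


patrol = State("Patrol")
combat = State("Combat")
melee = State("Melee", parent=combat)
ranged = State("Ranged", parent=combat)
retreat = State("Retreat", parent=combat)

# Declared once on the parent, inherited by all three children.
combat.on("player_left_room", patrol)
# Child-specific transitions still live on the child.
melee.on("player_out_of_reach", ranged)
ranged.on("low_health", retreat)


def step(current, event):
    """Return the next state, or stay put if nothing handles the event."""
    target = current.resolve(event)
    return target if target is not None else current
```

With three children, the "player leaves the room" rule is written once instead of three times. The saving grows with every child added under Combat.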
Halo: Combat Evolved (Bungie, 2001) is commonly described as using hierarchical state logic for its Covenant AI, an approach Bungie's engineers discussed in conference talks. Grunt enemies maintained a top-level state of Engage or Flee, and within Engage they had sub-states for Covering, Throwing Grenade, and Advancing. The hierarchy meant a single "leader killed" event handled at the top level moved a Grunt from Engage to Flee no matter which sub-state it was in, rather than requiring a transition coded on each individual child node. Halo's AI was widely cited in the game industry press of the early 2000s as a benchmark for believable NPC behavior.
When designing an FSM, start with the minimum viable set of states — three to five — and add states only when a behavior cannot be expressed as a transition or an entry/exit action of an existing state. Premature state proliferation is the most common FSM design error in student game projects.
You're designing a stealth-game guard NPC. Work with the AI assistant to define states, transitions, entry/exit actions, and edge cases for your FSM. The assistant will ask probing questions, suggest improvements, and flag common design errors like state explosion or missing transitions.
Describe your initial state list, then iterate based on feedback. Try to reach at least three full design exchanges.
By 2004, Bungie's engineers knew their HFSM approach for Halo: Combat Evolved was reaching its limits. For Halo 2, they restructured their AI pipeline around modular, composable logic trees. Lead AI engineer Damián Isla's 2005 GDC presentation "Handling Complexity in the Halo 2 AI" described the core problem: adding a new behavior to an HFSM required touching dozens of existing transitions. The new system — a direct ancestor of what the industry would later standardize as Behavior Trees — allowed a designer to add a "use vehicle" behavior without modifying any existing combat logic. The modularity was the breakthrough. Isla's talk seeded behavior tree adoption across the industry.
A Behavior Tree (BT) is a directed acyclic graph where the root node is ticked every frame (or at a set interval), and execution flows downward through parent nodes to leaf nodes. Each node returns one of three statuses: Success, Failure, or Running. The tree is evaluated top-down, and parent nodes use their children's return values to decide what to do next.
The four fundamental node types are:
Sequence nodes execute children left-to-right and return Failure the moment any child fails; they return Success only when every child has succeeded. Think of them as logical AND: "move to cover AND crouch AND shoot." If any step fails, the whole sequence fails.
Selector nodes (also called Fallback nodes) execute children left-to-right and return Success the moment any child succeeds; they return Failure only when every child has failed. Think of them as logical OR: "try melee; if that fails, try ranged; if that fails, try flee." The first successful option wins.
Decorator nodes wrap a single child and modify its behavior — for example, Inverter (flip Success/Failure), Repeater (run N times), or UntilFail (run until child fails).
Leaf nodes are the actual actions (Move, Attack, PlayAnimation) and conditions (IsPlayerVisible, HasAmmo, HealthBelow50Percent). They perform real work and report whether they succeeded or are still running.
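The four node types can be sketched in a small amount of Python. This is a minimal illustration under stated assumptions: the class names, the `world` dictionary, and the `condition` and `action` helper factories are hypothetical, and real engines add memory, aborts, and tooling on top of this core.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class Leaf:
    """Wraps a function that takes the world state and returns a Status."""

    def __init__(self, fn):
        self.fn = fn

    def tick(self, world):
        return self.fn(world)


class Sequence:
    """AND: stop at the first child that does not succeed."""

    def __init__(self, *children):
        self.children = children

    def tick(self, world):
        for child in self.children:
            status = child.tick(world)
            if status is not Status.SUCCESS:
                return status  # FAILURE or RUNNING propagates up
        return Status.SUCCESS


class Selector:
    """OR: stop at the first child that does not fail."""

    def __init__(self, *children):
        self.children = children

    def tick(self, world):
        for child in self.children:
            status = child.tick(world)
            if status is not Status.FAILURE:
                return status  # SUCCESS or RUNNING propagates up
        return Status.FAILURE


class Inverter:
    """Decorator: swap SUCCESS and FAILURE, pass RUNNING through."""

    def __init__(self, child):
        self.child = child

    def tick(self, world):
        status = self.child.tick(world)
        if status is Status.SUCCESS:
            return Status.FAILURE
        if status is Status.FAILURE:
            return Status.SUCCESS
        return status


def condition(key):
    """Build a condition leaf that checks a boolean in the world dict."""
    return Leaf(lambda w: Status.SUCCESS if w.get(key) else Status.FAILURE)


def action(name):
    """Build an action leaf that records its name and succeeds."""
    def run(world):
        world.setdefault("done", []).append(name)
        return Status.SUCCESS
    return Leaf(run)


# "try melee; if that fails, try ranged; if that fails, flee"
tree = Selector(
    Sequence(condition("in_melee_range"), action("melee")),
    Sequence(condition("has_ammo"), action("ranged")),
    action("flee"),
)
```

Ticking `tree` with a world where only `has_ammo` is true runs the ranged branch. With an empty world, both guarded branches fail and the Selector falls through to flee.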
The critical difference between a BT and an FSM is that a BT has no explicit state-to-state transitions. In the classic formulation, every tick starts fresh from the root, and the only thing carried over is which node, if any, is still Running. The NPC's "current behavior" emerges from which branches succeed on that tick, based on the current world state. This makes BTs far easier to extend: new behaviors are new branches, not new states with new transitions wired to every existing state.
Unreal Engine 4 and 5 ship a built-in Behavior Tree editor as a first-class tool. Epic's documentation describes their BT implementation as "event-driven," meaning nodes only re-evaluate when their observed conditions change — a significant performance optimization over naive per-tick traversal. The Unreal BT system pairs with a Blackboard: a shared memory key-value store where the NPC writes and reads world-state values like "last known player position," "current health," and "is cover available." Conditions in the tree read from the Blackboard rather than querying the world directly each frame.
The AI system of The Last of Us (Naughty Dog, 2013), described by lead AI programmer Max Dyckhoff in a GDC 2014 talk, combined behavior trees with a sophisticated perception system. Clickers navigated entirely on sound cues written to a shared perception blackboard. The BT's condition nodes read echolocation hit data; action nodes triggered path-finding and lunge attacks. The result was enemies that felt genuinely reactive without any machine learning.
The most frequent error in student-designed behavior trees is the "god sequence" — a single Sequence node with twenty children that attempts to encode an entire NPC's behavior in one linear chain. When the fourth child fails, the NPC does nothing for that tick, leaving it standing still mid-combat. The fix is to restructure logic into a Selector at the top level, with separate Sequences for each major behavioral mode (combat, search, patrol), so that failure in one mode falls through to the next.
A second common mistake is missing Running status propagation. If a Move action takes three seconds to complete, it must return Running on every intermediate tick. Forgetting to handle Running causes trees to restart the move action each frame, producing jittering movement or immediate failure.
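A long-running action has to keep its progress between ticks and report Running until it finishes. The sketch below shows one way to do that for a move along a single axis; the `MoveTo` class, the dictionary-based NPC, and the `run_until_done` helper are assumptions made for illustration.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class MoveTo:
    """Hypothetical move action along one axis at a fixed speed."""

    def __init__(self, target, speed):
        self.target = target
        self.speed = speed

    def tick(self, npc, dt):
        remaining = self.target - npc["x"]
        step = self.speed * dt
        if abs(remaining) <= step:
            npc["x"] = self.target
            return Status.SUCCESS      # arrived on this tick
        # Progress lives on the NPC, so the next tick continues from here
        # instead of restarting the move.
        npc["x"] += step if remaining > 0 else -step
        return Status.RUNNING          # still travelling; tick me again


def run_until_done(node, npc, dt, max_ticks=1000):
    """Tick a node until it stops returning RUNNING."""
    for ticks in range(1, max_ticks + 1):
        status = node.tick(npc, dt)
        if status is not Status.RUNNING:
            return status, ticks
    return Status.RUNNING, max_ticks
```

Moving 3 units at 1 unit per second with a 0.5-second tick returns Running on the first five ticks and Success on the sixth. A parent node that treated those intermediate Running results as Failure would abandon the move after the first tick.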
Structure your BT root as a Selector containing up to five prioritized Sequences: Dead, Injured-Retreat, Combat, Alert-Search, and Patrol. Priority descends left to right. This ensures a dead NPC never starts patrolling, and a wounded NPC always retreats before attacking. Priority-ordered Selectors of this kind are a common layout in Unreal Engine behavior trees.
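Stripped of tree machinery, a root Selector over guarded Sequences reduces to "run the first branch whose guard passes." The sketch below flattens that idea into an ordered list. The guard conditions, including the health threshold of 25, are assumptions chosen for the example.

```python
# Ordered highest priority first, mirroring left-to-right Selector children.
# Each entry stands in for a Sequence of [guard condition, behavior].
BRANCHES = [
    ("Dead",            lambda w: w["health"] <= 0),
    ("Injured-Retreat", lambda w: w["health"] < 25),   # assumed threshold
    ("Combat",          lambda w: w["player_visible"]),
    ("Alert-Search",    lambda w: w["heard_noise"]),
    ("Patrol",          lambda w: True),               # fallback always passes
]


def select_branch(world):
    """Return the name of the first branch whose guard condition passes."""
    for name, guard in BRANCHES:
        if guard(world):
            return name
    return None
```

A wounded NPC that can see the player still selects Injured-Retreat, because that branch sits to the left of Combat. Reordering the list is all it takes to change the priorities.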
Design a behavior tree for an enemy soldier NPC in a third-person shooter. You need to handle: patrol, player detection, taking cover, attacking, retreating when low health, and calling for reinforcements. Work with the AI assistant to structure your Selector/Sequence hierarchy and identify your leaf-node conditions and actions.
The assistant will critique your proposed tree structure, suggest missing nodes, and explain when to use Sequence vs. Selector. Aim for at least three substantive exchanges.
Will Wright's The Sims shipped in 2000 with no FSM and no behavior tree. Instead, each Sim evaluated every possible action — Sleep, Eat, Socialize, Use Toilet, Watch TV — by computing a utility score for each, then selected the highest scorer. The score for "Sleep" factored in the Sim's current Energy motive, time of day, and proximity to a bed. The score for "Eat" factored in Hunger motive and food availability. This system, described by Wright and lead programmer Jamie Doornbos in Gamasutra's 2000 postmortem, produced emergent comedic behaviors — Sims collapsing from exhaustion mid-conversation rather than politely excusing themselves — precisely because no designer hard-coded the priority order between needs. The scores determined everything.
A utility AI system assigns every available action a real-number score based on the current game state. The NPC selects the action with the highest score (or, in some designs, samples probabilistically from the top N). The score for an action is computed by a utility function — a formula that takes one or more input variables and maps them to a 0.0–1.0 range.
The input variable is called a consideration. A consideration for "Attack Player" might be: how close is the player? A consideration for "Flee to Cover" might be: how low is current health? Each consideration is passed through a response curve — a mathematical function that shapes how the raw value maps to a score contribution. Common curve shapes are linear (score rises proportionally), logistic (S-curve — slow at extremes, fast in the middle), and exponential (score accelerates rapidly as input increases).
Multiple considerations are multiplied together to produce a final action score. Multiplication (rather than addition) ensures that if any single consideration is near zero — "health is fine but player is invisible" — the action score is suppressed even if other considerations are high. This prevents nonsensical behaviors like attacking an unseen enemy just because health happens to be high.
Action Score = C₁(x₁) × C₂(x₂) × C₃(x₃) × … × Cₙ(xₙ) — where each Cᵢ is a response curve applied to input variable xᵢ. The highest-scoring action is selected. Response curve shapes are tunable parameters, giving designers fine control over NPC decision thresholds without writing conditional logic.
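The product formula translates almost directly into code. The following is a minimal sketch, assuming every input has already been normalized to the 0.0 to 1.0 range; the action names, input names, and curve parameters are hypothetical.

```python
import math


def clamp01(x):
    return max(0.0, min(1.0, x))


def linear(x):
    return x


def inverse(x):
    return 1.0 - x


def logistic(x, steepness=10.0, midpoint=0.5):
    return 1.0 / (1.0 + math.exp(-steepness * (x - midpoint)))


# Each action lists (input name, response curve) pairs.
ACTIONS = {
    "attack_player": [("player_visibility", linear),
                      ("health", linear),
                      ("closeness", logistic)],
    "flee_to_cover": [("health", inverse),
                      ("cover_nearby", linear)],
}


def action_score(considerations, inputs):
    """Multiply the curve output of every consideration together."""
    score = 1.0
    for name, curve in considerations:
        score *= clamp01(curve(clamp01(inputs[name])))
    return score


def choose_action(inputs):
    """Score every action and return the winner plus all scores."""
    scores = {a: action_score(cs, inputs) for a, cs in ACTIONS.items()}
    return max(scores, key=scores.get), scores
```

Setting `player_visibility` to 0.0 drives the attack score to exactly zero however healthy or close the NPC is. That is the veto behavior multiplication is chosen for. An additive scheme would still leave the attack score well above zero in the same situation.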
Guerrilla Games' Killzone 2 (2009) used a utility-based system for its Helghast soldiers. At GDC 2009, lead AI programmer Alex Champandard and engine director Arjen Barten presented "AI Postmortem of Killzone 2," describing how each Helghast evaluated a pool of tactical actions — advance, suppress, flank, take cover — and selected the highest scorer. The system made Helghast feel tactically adaptive without scripted sequences: a Helghast running low on ammo would score "take cover and reload" highly, while one in open terrain would score "advance and suppress" highly.
Alien: Isolation (Creative Assembly, 2014) took a different approach for the Alien itself: two cooperating AI layers, often described as the Alien's own AI and a director above it. The Alien's own AI controlled its visible behavior using hand-authored rules. The director used utility-style scoring to decide roughly where the Alien should be when it was off-screen, factoring in time since last encounter, player stress level (estimated from movement speed and hiding behavior), and area of the ship. The result was an Alien that felt as if it was genuinely hunting the player, shifting its patrols as the player moved through the ship.
The practical skill in utility AI design is curve authoring. A linear curve for "distance to player" means the attack score changes by the same amount for every meter of distance, so the step from 5 to 10 meters matters exactly as much as the step from 45 to 50. An exponential curve means the attack score explodes only when the player is very close. Dave Mark, who developed the Infinite Axis Utility System (IAUS) architecture and wrote extensively about utility AI in "Behavioral Mathematics for Game AI" (2009), advocates for providing designers with a library of named curves (Linear, InverseLinear, Quadratic, Logistic, Step) and a visual editor to tune them, rather than requiring code changes per behavior adjustment. Open-source implementations of the IAUS pattern have been used in several Unity projects.
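A named curve library can be as small as a dictionary of functions. The sketch below compares how different shapes score the same distances. The 50-meter sensor range, the logistic steepness, and the step threshold are assumptions, and the Quadratic curve stands in for the "accelerating" family rather than a true exponential.

```python
import math

# Named curves over a normalized input x in [0, 1].
CURVES = {
    "Linear":        lambda x: x,
    "InverseLinear": lambda x: 1.0 - x,
    "Quadratic":     lambda x: x * x,
    "Logistic":      lambda x: 1.0 / (1.0 + math.exp(-10.0 * (x - 0.5))),
    "Step":          lambda x: 1.0 if x >= 0.5 else 0.0,
}

MAX_RANGE_M = 50.0  # assumed sensor range


def closeness(distance_m):
    """Normalize distance so 1.0 means adjacent and 0.0 means out of range."""
    return max(0.0, min(1.0, 1.0 - distance_m / MAX_RANGE_M))


def attack_score(distance_m, curve_name):
    """Score attacking at a given distance under a named curve."""
    return CURVES[curve_name](closeness(distance_m))


# Print one row per distance to compare the curve shapes side by side.
for d in (5.0, 25.0, 50.0):
    row = {name: round(attack_score(d, name), 3) for name in CURVES}
    print(f"{d:>4.0f} m  {row}")
```

At the midpoint of the range the Linear curve still scores 0.5 while the Quadratic curve has already dropped to 0.25. Swapping the curve name makes the NPC noticeably less eager to attack at medium range without touching any conditional logic.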
Utility AI excels when many actions compete on a continuum — social NPCs, survival AI, tactical soldiers. Behavior Trees excel when logic is explicitly hierarchical and priority ordering is clear — platformer enemies, stealth guards, scripted encounters. Many production systems combine both: a Behavior Tree at the top level to handle major state transitions, with utility scoring inside specific action-selection leaf nodes.
Design a utility AI system for a survival-game NPC (a friendly companion AI). Your companion must evaluate actions: Attack Threat, Gather Resources, Heal Player, Take Cover, and Forage for Food. For each action, define the considerations (input variables), their response curve shapes, and the rationale for multiplying them together.
The AI assistant will push you to justify your curve shapes, identify missing considerations, and suggest how to balance scores so no single action dominates in all situations. Aim for three or more design exchanges.
When Monolith Productions shipped F.E.A.R. in 2005, the game's AI attracted immediate attention from the press and industry alike. Enemies flanked the player, laid down suppressing fire while teammates advanced, communicated cover positions, and dove through windows to avoid grenades. Lead AI programmer Jeff Orkin documented the system in detail: it used GOAP (Goal-Oriented Action Planning) rather than a traditional FSM or behavior tree. Each AI agent held a goal (KillPlayer, TakeCover, FleeFromGrenade) and a library of actions. A planner searched for the lowest-cost sequence of actions that would satisfy the current goal, replanning dynamically when circumstances changed. The result was tactics that emerged from the planning system rather than being hand-authored move by move, producing behavior that felt genuinely intelligent.
Each system covered in this module excels in specific conditions but struggles in others. FSMs are predictable and cheap but explode in state count as behavior complexity grows. Behavior Trees are modular and prioritizable but can become unwieldy when many behaviors compete on a continuum rather than a clear hierarchy. Utility AI handles continuous, competing actions elegantly but lacks the natural priority-override structure that hierarchical logic provides.
Production games rarely use any single system in isolation. The standard approach is layered architecture: different systems handle different levels of abstraction, each doing what it does best.
Layer 1 — High-Level State (FSM): A simple FSM governs the NPC's major mode. States might be Combat, Patrol, Flee, and Idle. Transitions are infrequent and triggered by large-scale events — player spotted, health below 25%, squad all dead. This layer is cheap and predictable. Designers can enumerate every possible top-level state and guarantee the NPC is never simultaneously in Combat and Patrol.
Layer 2 — Tactical Decisions (Behavior Tree): Inside the Combat state, a behavior tree governs moment-to-moment tactics. A root Selector chooses between Attack, TakeCover, Reload, and CallForReinforcements based on tree traversal order. The tree's modularity makes it easy to add a new tactic, such as ThrowGrenade, without rearchitecting the state machine. Damián Isla's documented approach from Halo 2 is a well-known example of this kind of modular tree.
Layer 3 — Action Selection (Utility AI): Within certain behavior tree leaf nodes, particularly where several comparable options exist, utility scoring picks among them. When the NPC could advance to cover A, B, or C, a utility function scoring travel distance, exposure, and proximity to the player selects the best option. This avoids hard-coded priority rules for decisions that are genuinely continuous in nature.
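The cover choice in Layer 3 can be sketched as a small scoring function called from inside a TakeCover leaf. Everything named here is an assumption for illustration: the `CoverPoint` fields, the 30-meter travel limit, and the 15-meter ideal engagement range.

```python
from dataclasses import dataclass


@dataclass
class CoverPoint:
    name: str
    travel_m: float       # path distance from the NPC to this cover
    exposure: float       # 0.0 fully hidden .. 1.0 fully exposed
    player_dist_m: float  # distance from this cover to the player


MAX_TRAVEL_M = 30.0   # assumed: farther cover is not worth running to
IDEAL_RANGE_M = 15.0  # assumed: preferred engagement distance


def clamp01(x):
    return max(0.0, min(1.0, x))


def cover_score(c):
    """Multiply three considerations, each mapped to the 0..1 range."""
    reachability = clamp01(1.0 - c.travel_m / MAX_TRAVEL_M)
    protection = clamp01(1.0 - c.exposure)
    # Peaks at the ideal range and falls off linearly on either side.
    engagement = clamp01(
        1.0 - abs(c.player_dist_m - IDEAL_RANGE_M) / IDEAL_RANGE_M)
    return reachability * protection * engagement


def best_cover(points):
    """Return the highest-scoring cover point."""
    return max(points, key=cover_score)


candidates = [
    CoverPoint("A", travel_m=5.0, exposure=0.8, player_dist_m=15.0),
    CoverPoint("B", travel_m=12.0, exposure=0.1, player_dist_m=18.0),
    CoverPoint("C", travel_m=28.0, exposure=0.0, player_dist_m=15.0),
]
```

Under these numbers the nearest point (A) loses because it is badly exposed, and the safest point (C) loses because it is too far to reach. The middle option (B) wins, a trade-off that would be awkward to express as a fixed priority list.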
Goal-Oriented Action Planning (GOAP) is an alternative to both behavior trees and utility AI for complex tactical behavior. The NPC defines goals (reach objective, eliminate threat) and a library of actions with preconditions and effects. A planning algorithm — typically A* over a state space — finds the cheapest action sequence that achieves the goal. GOAP is powerful because new behaviors emerge from new actions added to the library, without any designer needing to manually wire how they connect. F.E.A.R.'s squad AI used GOAP to produce flanking and suppression behavior that Monolith's Jeff Orkin described as requiring minimal hand-authoring of specific tactics.
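A toy planner makes the precondition-and-effect idea concrete. The sketch below searches forward from the current world state using uniform-cost search, which is A* with a zero heuristic. It is an illustration only and does not reproduce F.E.A.R.'s planner; the action names, costs, and facts are hypothetical.

```python
import heapq
from itertools import count

# Each action: (name, cost, preconditions, facts added, facts removed).
ACTIONS = [
    ("draw_weapon",   1, set(),               {"armed"},       set()),
    ("move_to_cover", 2, set(),               {"in_cover"},    set()),
    ("reload",        2, {"armed"},           {"loaded"},      set()),
    ("shoot",         1, {"armed", "loaded"}, {"threat_down"}, {"loaded"}),
    ("melee",         6, set(),               {"threat_down"}, set()),
]


def plan(start, goal, actions=ACTIONS):
    """Return (total cost, action names) for the cheapest plan, or None."""
    start, goal = frozenset(start), frozenset(goal)
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(0, next(tie), start, [])]
    best = {start: 0}
    while frontier:
        cost, _, state, steps = heapq.heappop(frontier)
        if goal <= state:           # every goal fact holds
            return cost, steps
        if cost > best.get(state, float("inf")):
            continue                # stale queue entry
        for name, a_cost, pre, add, remove in actions:
            if not pre <= state:
                continue            # preconditions not met
            nxt = frozenset((state - remove) | add)
            new_cost = cost + a_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(
                    frontier, (new_cost, next(tie), nxt, steps + [name]))
    return None
```

Starting unarmed with the goal `{"threat_down"}`, the planner chains draw_weapon, reload, and shoot for a total cost of 4 rather than paying 6 for melee. Nobody wired those three actions together; the chain falls out of their preconditions and effects, and removing the reload action would make the planner fall back to melee on its own.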
The most recent development in NPC AI architecture adds a fourth layer above the behavioral stack: LLM-based dialogue. The FSM/BT/utility system governs what the NPC does: how it moves, fights, and reacts. The LLM governs what the NPC says about it. An enemy soldier whose behavior tree has just switched to its TakeCover branch might simultaneously be running an LLM dialogue system that produces a taunting remark contextually appropriate to the player's last action.
Ubisoft's NEO NPC prototype (GDC 2024) pointed toward this pattern: its conversational characters were built with Inworld AI's language-model technology and Nvidia's Audio2Face rather than a traditional dialogue tree. In a layered design of this kind, the two systems are loosely coupled: the dialogue system receives structured state signals (health, location, current objective) from the behavior system and incorporates them into natural-language context. This architecture preserves the reliability of hand-authored behavioral logic while adding the flexibility of generative dialogue.
The key engineering constraint is latency budget. A behavior tree tick runs in microseconds; an LLM response takes 0.5–3 seconds. In practice, LLM dialogue is triggered asynchronously on specific events (entering a new zone, player performing a notable action) rather than every frame, so the behavioral system never blocks on the LLM response.
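The non-blocking pattern can be sketched with a worker thread and a poll call. Nothing here talks to a real model: `fake_llm_reply` is a stand-in that just sleeps, and the `DialogueLayer` class, its method names, and the frame loop are assumptions made for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_llm_reply(context):
    """Stand-in for a slow network call; NOT a real LLM client."""
    time.sleep(0.2)
    return f"Taking cover near {context['location']}!"


class DialogueLayer:
    """Hypothetical wrapper that keeps dialogue off the game thread."""

    def __init__(self, generate=fake_llm_reply):
        self._generate = generate
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None

    def request(self, context):
        """Fire and forget; ignore the event if a line is in flight."""
        if self._pending is None:
            self._pending = self._pool.submit(self._generate, dict(context))

    def poll(self):
        """Return a finished line, or None without waiting."""
        if self._pending is not None and self._pending.done():
            line, self._pending = self._pending.result(), None
            return line
        return None

    def close(self):
        self._pool.shutdown(wait=True)


def run_frames(dialogue, frames=60, dt=0.01):
    """Simulated game loop: behavior ticks every frame, speech arrives late."""
    spoken = []
    for frame in range(frames):
        if frame == 0:  # notable event, e.g. the NPC just took cover
            dialogue.request({"location": "the crates", "health": 40})
        # The behavior tree would tick here; it never waits on dialogue.
        line = dialogue.poll()
        if line:
            spoken.append((frame, line))
        time.sleep(dt)
    return spoken
```

The request is issued on the frame of the event and the line is picked up some frames later, whenever the worker finishes. The loop keeps running at its normal rate in between. If the reply never arrives, the NPC simply stays silent while continuing to act.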
When choosing which system to use at each architectural layer, ask: Is the decision categorical (yes/no, mode A/mode B) or continuous (how much, which of many)? Categorical decisions favor FSMs and BT priority ordering. Continuous decisions favor utility scoring. Multi-step problems with many possible action sequences favor GOAP. No single answer is universally correct; the right architecture depends on the specific behavior you need to produce.
Apply and extend the concepts from this lesson through guided conversation with an AI assistant.
Use this lab to explore how the concepts from Lesson 4 apply to your own questions and interests. The AI assistant is here to help you think through complex scenarios.