Games shipped the first commercial AI in the 1970s — the ghosts in Pac-Man, the bots in Quake, the opponent in chess programs on home computers. For fifty years, game AI lived in its own world, mostly disconnected from academic AI: scripted, heuristic, small, fast, fun.
That separation is ending. The same transformers that write text and generate images are now generating quests, voicing NPCs, simulating opponents that learn from the player, procedurally creating worlds that respond to the player's specific choices, and producing music that scores the player's specific journey. Games are becoming infinite in ways that were impossible a decade ago.
This course is the first of two on AI in game design. It covers the history of game AI, the key AI techniques game designers now use, how generative models change what a game can be, and the craft of designing AI-powered experiences that feel alive without feeling uncanny.
If you finish every module, here's who you become:
When Wolfgang von Kempelen unveiled his chess-playing automaton — the Mechanical Turk — before Empress Maria Theresa, audiences assumed they were witnessing a thinking machine. The cabinet's gears whirred; a turbaned figure moved pieces with mechanical precision. It fooled Napoleon Bonaparte and Benjamin Franklin. It was, of course, a fraud: a hidden human chess master operated it from inside. But the fraud mattered. It proved that people desperately wanted to believe machines could think — and that desire would drive a field of engineering for two centuries.
The real reckoning came in 1950, when Claude Shannon, a Bell Labs mathematician, published "Programming a Computer for Playing Chess" in Philosophical Magazine. Shannon had never seen a computer play chess. He was theorizing from first principles. In that paper he described two strategies — what he called Type A (exhaustive search) and Type B (selective, heuristic search) — frameworks that would structure game AI development for the next fifty years.
Shannon's 1950 paper is the founding document of game AI. He was the first to calculate the game-tree complexity of chess at roughly 10120 possible games — a number he called the "Shannon number." No computer could brute-force that. The implication was profound: any game AI would need to prioritize, not just calculate. That insight — that useful AI must be selective — is the ancestor of every enemy pathfinding algorithm, every difficulty-scaling system, and every procedurally generated dungeon in existence.
Shannon distinguished between minimax search (assume your opponent plays optimally, pick the move that minimizes their best outcome) and evaluation functions (a scoring system that rates board positions without searching all the way to checkmate). Both concepts appear verbatim in game design today. When a first-person-shooter enemy "decides" whether to take cover or advance, it is running a descendant of Shannon's evaluation function.
The paper also introduced the concept of alpha-beta pruning in spirit, though the formal algorithm came from John McCarthy in 1956. Pruning means cutting off branches of a game tree that cannot possibly be better than what you've already found — dramatically reducing computation. Modern real-time strategy game AI, including the pathfinding trees in StarCraft II's built-in bot opponents, uses pruned search trees.
In 1951, at the Festival of Britain, Ferranti Ltd. demonstrated Nimrod — a dedicated computer built solely to play the mathematical game Nim. Nimrod could not play anything else. It had no screen; it used indicator lights. But it beat most human opponents, and thousands of Londoners queued to play it. The machine proved that a computer could master a rule-bound game completely — a concept that would later be called solved games.
That same year, Cambridge PhD student Alexander Douglas created OXO (noughts and crosses / tic-tac-toe) as part of his dissertation on human-computer interaction. OXO ran on the EDSAC computer and displayed its board on a cathode-ray tube oscilloscope — making it arguably the first computer game with a graphical display. Its AI used a simple perfect-play algorithm: because tic-tac-toe is fully solved, the computer never lost when playing optimally. Douglas wasn't trying to entertain; he was studying how humans interacted with machines. But in doing so he created the template of a game where the computer is your opponent.
These early systems shared a key limitation: they were domain-specific. Nimrod could only play Nim. OXO could only play tic-tac-toe. The challenge that would define the next decade was creating AI flexible enough to adapt to arbitrary rule sets — a challenge that maps directly onto the challenge of making NPCs that behave sensibly across varied game scenarios.
An evaluation function assigns a numeric score to a game state without playing the game to completion. In chess, a simple evaluation function might count material (pawns, rooks, etc.) and add positional bonuses. In a modern game, an enemy AI's evaluation function might weigh line-of-sight, health, ammo, and distance to cover. The better the evaluation function, the smarter the agent — regardless of how much search depth it uses.
Between 1955 and 1965 a cluster of AI milestones established the techniques that video game designers would later adopt wholesale. Arthur Samuel at IBM built a checkers-playing program in 1956 that used machine learning — it improved its own evaluation function by playing thousands of games against itself. This was the first documented instance of a game AI that got better through self-play, a technique reinvented at scale by DeepMind's AlphaGo in 2016.
John McCarthy's development of the LISP programming language in 1958 gave AI researchers a tool flexible enough to represent game rules symbolically. Many early game AI programs were written in LISP precisely because its list-processing structure matched the branching nature of game trees.
By 1962, MIT student Steve Russell and colleagues had created Spacewar! — the first widely distributed video game. Its "Poor Man's Coriolis" autopilot mode, added by Dan Edwards and Peter Samson, gave computer-controlled ships a basic targeting heuristic. It was a tiny step, but it was the first time game AI had to work in real time, under tight computational constraints, against a human player — exactly the conditions that would define commercial game AI for the next six decades.
Arthur Samuel's checkers program defeated the Connecticut state checkers champion in 1962 — the first time a computer program had beaten an expert human at a game that required genuine skill. Samuel published his learning algorithm in the IBM Journal of Research and Development. The paper introduced the term "machine learning" in print for the first time.
In this lab you'll explore the historical milestones from Lesson 1 in depth. Ask your AI partner about specific events, why they mattered, and how they connect to AI features you see in games today. Try to make at least 3 exchanges to complete the lab.
When Toru Iwatani designed Pac-Man for Namco in 1979–80, he made a decision that would become one of the most-studied examples in game AI: he gave each of the four ghosts a distinct behavioral algorithm. Blinky always chased Pac-Man's current tile. Pinky targeted four tiles ahead of Pac-Man's direction (with a famous bug: when Pac-Man faced up, Pinky actually targeted four tiles up and four tiles left simultaneously due to a signed-integer overflow). Inky used a complex vector calculation involving both Pac-Man's position and Blinky's. Clyde alternated between chasing and fleeing. Iwatani later said the goal was to make the ghosts feel like personalities, not algorithms — and he succeeded so completely that players anthropomorphized them without knowing the code.
The behavioral logic controlling Pac-Man's ghosts is a Finite State Machine (FSM) — one of the oldest and most durable structures in game AI. An FSM is a system with a fixed set of states (Chase, Scatter, Frightened, Eaten) and clearly defined transitions between them triggered by specific conditions. Each ghost at any moment is in exactly one state. When Pac-Man eats a power pellet, all ghosts transition to Frightened. When the timer expires, they transition back to Chase or Scatter based on the current phase.
FSMs were not invented for games. They come from automata theory — the mathematical study of abstract machines — formalized by Warren McCulloch and Walter Pitts in their 1943 paper on neural nets, and later by Stephen Kleene in 1951. But game designers discovered FSMs were perfectly suited to their needs: they are predictable, debuggable, and computationally cheap. On the limited hardware of arcade boards, a ghost running a 4-state FSM used negligible resources.
The key innovation Iwatani's team introduced was the Chase / Scatter phase alternation. Ghosts do not always chase. They periodically enter Scatter mode, retreating to a fixed corner. This prevents the game from becoming unsolvable — always-chasing ghosts would corner players too efficiently. The phase timer (documented in Jamey Pittman's exhaustive 2009 reverse-engineering study "The Pac-Man Dossier") runs on a fixed schedule: 7 seconds scatter, 20 seconds chase, repeating with decreasing scatter time across levels. This rhythm is what makes Pac-Man feel "alive" rather than mechanically relentless.
Pinky's targeting bug — where "up" maps to "up-and-left" due to a pointer offset in the original 8080 assembly — was never patched. Iwatani's team either didn't notice or left it in. The bug makes Pinky slightly less effective than intended, which accidentally improves game balance. It's one of the earliest documented cases of an AI bug becoming a design feature — a theme that recurs throughout game AI history.
Released in 1978, Taito's Space Invaders — designed by Tomohiro Nishikado — contains no AI in the decision-making sense. The invaders follow a fixed left-right-down marching pattern. They do not respond to the player's position. Yet Space Invaders feels challenging and progressively difficult, and this is because Nishikado exploited a hardware limitation as a design mechanic: the TaIto 8080 chip could only animate a limited number of sprites simultaneously, so as the player shot invaders, the remaining ones rendered faster. The game sped up as you won.
This was an accident Nishikado preserved deliberately. He had intended to give the invaders more complex movement at constant speed, but the speed-up created better tension. It demonstrated a principle that game designers still invoke: the appearance of intelligence (escalating threat, adaptive difficulty) can be achieved through mechanical constraints rather than actual decision-making algorithms. Space Invaders "adapts" to your performance without any AI at all.
Both Pac-Man and Space Invaders belong to the golden age of arcade games (roughly 1978–1983), a period when game AI was defined by working within severe hardware limits. The techniques developed then — FSMs, phase alternation, scripted escalation — remain in active use in mobile games, indie titles, and educational game engines precisely because they are resource-efficient and well-understood.
Donkey Kong (Nintendo, 1981, designed by Shigeru Miyamoto) introduced scripted obstacle AI: barrels rolled down ramps following physics-like rules, fireballs pursued Mario using simple proximity detection. Each hazard type had a different behavioral script, creating the feeling of a varied, antagonistic environment from a small amount of code. The "moving enemy with one targeting rule" paradigm from Donkey Kong appears in countless platformers today.
Galaga (Namco, 1981) extended Space Invaders' formation concept by adding attack formations — pre-scripted dive patterns that individual enemies executed on a schedule. The famous "capture" mechanic, where a Galaga boss could tractor-beam the player's ship, was a scripted multi-state behavior: approach, activate beam, wait, return. It created the illusion of a sophisticated adversary through a carefully choreographed sequence, not adaptive decision-making.
The lesson across all these titles: arcade AI was theater, not intelligence. It created the experience of facing a thinking opponent through pattern design, phase scheduling, and exploitation of hardware quirks. Understanding this distinction — between actual decision-making and the performance of decision-making — is foundational to designing game AI today, because both approaches remain valid depending on the game's needs.
Finite State Machine (FSM): A computational model consisting of a fixed number of states, an initial state, and rules (transitions) that determine how the system moves from one state to another based on inputs or conditions. In game AI, each state represents a behavior (patrol, chase, attack, flee) and transitions represent triggers (player spotted, health below 25%, etc.). FSMs remain the most widely used NPC behavioral architecture in commercial games.
In this lab, work with the AI assistant to design a Finite State Machine for an NPC of your choosing. Describe a game concept, then ask the assistant to help you define states, transitions, and how each state affects gameplay feel. You can also ask the assistant to critique existing FSM designs from arcade history. Complete at least 3 exchanges.
When Garry Kasparov resigned Game 6 of his match against Deep Blue on May 11, 1997, the front pages declared that machines had beaten humanity at chess. What they missed was the more consequential story: how Deep Blue won. The IBM RS/6000 SP system evaluated 200 million chess positions per second using custom chess chips, pruned search trees, and an evaluation function hand-tuned by Grandmaster Joel Benjamin. It was not artificial general intelligence. It was a spectacularly optimized special-purpose machine — exactly the kind of domain-specific, hand-crafted AI that commercial games were also building, at smaller scale, throughout the 1990s.
The games industry watched. The lesson they took was not "neural networks." It was: better search, better evaluation, better pruning. These were lessons directly applicable to the RTS and FPS games exploding in popularity on PCs and consoles across the decade.
A* (pronounced "A-star") was published by Peter Hart, Nils Nilsson, and Bertram Raphael at the Stanford Research Institute in 1968, in a paper titled "A Formal Basis for the Heuristic Determination of Minimum Cost Paths." It is an extension of Dijkstra's shortest-path algorithm that uses a heuristic — a cost estimate to the goal — to search more efficiently. A* is guaranteed to find the shortest path if the heuristic never overestimates the true cost (the admissibility condition).
Game developers adopted A* in the late 1980s and it became ubiquitous in the 1990s with the rise of real-time strategy games. Warcraft: Orcs & Humans (Blizzard, 1994) used A* on a tile grid to move units around the map. Baldur's Gate (BioWare, 1998) used A* across an isometric environment with hundreds of simultaneously pathfinding NPCs. Age of Empires II (Microsoft, 1999) used a modified A* with flow-field optimizations to handle large armies.
A* is so computationally efficient and well-understood that it remains the dominant pathfinding algorithm in games today. Unity's NavMesh and Unreal Engine's NavigationMesh both use A* variants under the hood. The algorithm's combination of optimality (finds the best path) and efficiency (ignores clearly bad paths via the heuristic) makes it ideal for real-time game applications where pathfinding must complete in milliseconds per frame.
A* maintains two lists: an open list (nodes to be evaluated) and a closed list (nodes already evaluated). For each node it calculates f(n) = g(n) + h(n), where g(n) is the actual cost from start to the current node, and h(n) is the heuristic estimate from the current node to the goal. The algorithm always expands the open-list node with the lowest f(n) score. In a game context, nodes are tiles or waypoints; g(n) is movement cost; h(n) is usually straight-line (Euclidean) or Manhattan distance to target.
Real-time strategy games created a new AI problem: not one smart agent, but hundreds of simultaneous agents, each needing to pathfind, target, and make decisions in real time. The approaches that emerged in the 1990s RTS era established techniques still in use.
Flocking algorithms, based on Craig Reynolds' 1986 "Boids" simulation (published at SIGGRAPH), gave groups of units cohesion, separation, and alignment behaviors that made armies feel organic rather than mechanical. Reynolds demonstrated that three simple rules (steer toward group center, avoid neighbors, match neighbor velocity) produced realistic flock behavior — an early example of emergent AI complexity from simple rules.
Influence maps, used in titles like Total Annihilation (1997) and later in the AI of StarCraft (1998), overlaid a strategic heat map on the game world — tracking enemy presence, resource concentration, and danger zones. The AI could "read" this map to make higher-level strategic decisions (where to attack, where to defend, where to expand) without simulating every detail. This separation of tactical AI (how do I move this unit?) from strategic AI (what should my army do?) became a standard architectural pattern.
StarCraft's AI deserves special mention. Blizzard's 1998 RTS featured distinct AI personalities for each of three races (Terran, Protoss, Zerg), each with different macro-strategies hard-coded into their decision trees. The game was balanced well enough that its built-in AI remained a legitimate challenge for casual players for years — and its complexity made StarCraft the benchmark problem for academic game AI research for over a decade, culminating in DeepMind's AlphaStar project in 2019.
Doom (id Software, 1993) used no pathfinding at all — demons ran directly toward the player using simple compass-direction logic, unable to navigate around walls without direct line-of-sight. But Quake (1996) introduced waypoint-based navigation: hand-placed points connected in a graph, along which enemies could move. It was crude (developers had to manually place hundreds of waypoints per level), but it produced enemies that could navigate 3D space.
The leap came with Half-Life (Valve, 1998). Valve's AI team, led by Mike Abrash and Yahn Bernier, built soldiers (the HECU Marines) that took cover, threw grenades to flush players from cover, suppressed fire to limit player movement, and coordinated flanking maneuvers. These behaviors used a combination of FSMs and scripted sequences, but the marine behavior felt genuinely tactical — players reported feeling genuinely outwitted. Half-Life's AI was discussed extensively in the games press as a new benchmark for believable enemies.
The technical foundation was the concept of cover nodes: points in the environment tagged as valid cover positions. The marine AI evaluated available cover nodes by line-of-sight to the player and distance, then navigated to the best one. This was A* pathfinding plus environmental metadata — a pattern that became the standard for tactical shooter AI and remains in use in modern games like The Last of Us Part II (2020), which extends cover-node AI with vocal coordination and emotional state modeling.
Craig Reynolds' "Boids" simulation (SIGGRAPH 1986) proved that complex group behavior — schooling fish, flocking birds, marching soldiers — could emerge from three simple local rules: separation (avoid crowding neighbors), alignment (steer toward average heading of neighbors), and cohesion (steer toward average position of neighbors). Reynolds won a technical Academy Award for this work in 1998 for its influence on CGI. Its influence on game AI group movement is equally profound.
Use the AI below to explore Lesson 3 concepts in depth. Challenge assumptions and work through scenarios.
The history of AI and the history of games are not parallel tracks — they are the same track. The problems game designers needed to solve (how do you search a vast decision space? how do you evaluate a position without playing it out completely? how do you learn from experience?) were the foundational problems of AI research. Games provided the benchmarks, the tractable problem sizes, and the public drama that drove funding and attention. The algorithms that emerged from that crucible now power recommendation systems, robotics, protein folding prediction, and the large language models that are reshaping game design itself.
Shannon's 1950 minimax formulation — assume both players play optimally; minimize the opponent's maximum outcome — became the structural template for all adversarial AI reasoning. The game tree that minimax navigates is not merely a chess abstraction; it is a general model of sequential decision-making under adversarial conditions. Every planning algorithm in modern AI that reasons forward through possible futures (including Monte Carlo Tree Search, used in AlphaGo) is a descendant of Shannon's game tree.
Alpha-beta pruning, formalized by John McCarthy and colleagues in 1956 and refined by Donald Knuth and Ronald Moore in 1975, demonstrated that intelligent search requires knowing what not to compute. Alpha-beta cuts branches that cannot improve on already-found solutions — effectively halving the search depth achievable in fixed compute time. This principle — that efficient intelligence requires principled ignorance — recurs throughout modern AI. Attention mechanisms in transformers are, in a sense, an evolved form of the same insight: compute more on the relevant, less on the irrelevant.
The chess program lineage from Shannon through Deep Thought to IBM's Deep Blue also established the concept of hardware-algorithm co-design as a legitimate AI strategy. Deep Blue used custom VLSI chips optimized for chess move generation. This approach — designing hardware around the specific computational structure of an algorithm — prefigures the GPU revolution in deep learning and the development of Google's TPUs for neural network inference.
The most direct throughline from game AI to modern AI runs through reinforcement learning (RL). In 1992, Gerald Tesauro at IBM Research trained TD-Gammon — a backgammon-playing neural network — using temporal-difference learning, a technique rooted in Arthur Samuel's 1956 self-play checkers work. TD-Gammon learned to play at a near-expert level by playing hundreds of thousands of games against itself, updating its neural network weights based on the difference between predicted and actual outcomes. It was not programmed with backgammon strategy; it discovered strategy through experience.
TD-Gammon was not merely a game curiosity. Tesauro's paper introduced the broader AI research community to the practical viability of using games as RL training environments — a methodology that would define the next thirty years of RL research. The game environment provides the three components RL requires: an agent (the program), an environment (the game), and a reward signal (win/loss/score). Games are RL research laboratories by design.
AlphaGo, released by DeepMind in 2016, synthesized the entire preceding history of game AI into a single system. It combined: deep convolutional neural networks (for evaluating board positions), Monte Carlo Tree Search (the probabilistic descendant of minimax), and reinforcement learning through self-play (the direct descendant of TD-Gammon). When AlphaGo defeated Lee Sedol 4-1 in March 2016, it demonstrated that a system trained entirely through self-play — with no human-provided game strategies — could surpass the world's best human player in one of the most complex games ever devised. The successor system, AlphaZero, generalized this approach to chess, shogi, and Go from scratch, reaching superhuman performance in all three games within 24 hours of training.
TD learning updates predictions based on the difference between successive predictions, rather than waiting for the final outcome. In backgammon terms: if TD-Gammon predicts a 60% win probability at move 20 and 45% at move 21, it adjusts its move-20 evaluation toward 45% — without waiting for the game to end. This bootstrapping from intermediate predictions was Gerald Tesauro's key insight. TD-learning is the foundation of modern Q-learning, which underlies the deep RL systems that train game-playing AIs today.
A* pathfinding, which game developers adopted in the 1980s and made ubiquitous through RTS and action games in the 1990s, turned out to be one of the most practically valuable algorithms in all of applied computing. The same properties that made it ideal for navigating game maps — optimality, efficiency through heuristics, flexibility across different cost functions — make it essential in domains far removed from entertainment.
Robotics uses A* and its descendants (D* for dynamic environments, RRT for continuous space) for autonomous navigation. Self-driving vehicle systems path-plan using graph search algorithms directly derived from the game AI literature. Google Maps and similar routing systems use Dijkstra variants and bidirectional A* to compute driving routes across graphs with hundreds of millions of nodes — the same fundamental algorithm that moved units across Warcraft's tile grids in 1994.
The influence runs in both directions. Academic robotics research on motion planning fed back into game AI: the navigation mesh (NavMesh) — a polygon-based representation of navigable space that replaced tile grids in 3D games — was developed collaboratively between academic robot motion planning researchers and game AI programmers in the late 1990s. Unity and Unreal Engine both ship NavMesh systems as first-class engine features today.
Games have served as the primary benchmark environment for AI evaluation precisely because they offer unambiguous, reproducible, scalable measures of performance. The ELO rating system, invented for chess, is now used to evaluate large language models. Atari games were used as the benchmark suite in DeepMind's landmark 2015 paper introducing DQN (deep Q-networks) — the first demonstration that a single neural network architecture could learn to play diverse games from raw pixel input, without game-specific programming.
The pattern continues: StarCraft II served as the benchmark for DeepMind's AlphaStar (2019), which demonstrated that RL agents could handle partial information, long time horizons, and large action spaces simultaneously — properties that distinguish real-world planning problems from the clean abstractions of board games. Minecraft, through the MineRL competition series, became the benchmark for hierarchical RL — learning to compose skills at multiple timescales. OpenAI Five (2019) demonstrated multi-agent RL at scale by fielding a team of neural networks that defeated the world champions at Dota 2.
In every case, the game came first as entertainment; the AI benchmark came second as science. The game designer's craft — building environments that are challenging, fair, measurable, and fun — turns out to be almost identical to the AI researcher's need for training environments that are challenging, fair, measurable, and tractable. That alignment was not accidental. It reflects the deep structural fact that both activities are about creating and navigating systems of meaningful decisions.
Game AI began as a simplified testbed for techniques too computationally expensive to deploy in the real world. It is now the origin point for techniques that are reshaping that real world — and those techniques are returning to games. The same transformer architectures that learned to predict text are now generating NPC dialogue, quest narratives, and procedural game worlds. The reinforcement learning methods that mastered Go are now training game characters that adapt to individual player behavior. The feedback loop that Shannon initiated in 1950 is accelerating.
Apply and extend the concepts from this lesson through guided conversation with an AI assistant.
Use this lab to explore how the concepts from Lesson 4 apply to your own questions and interests. The AI assistant is here to help you think through complex scenarios.
15 questions covering all lessons — free, untracked, retake anytime.