In 1976, Will Crowther wrote a 700-line FORTRAN program called ADVENT, the first text adventure. Players typed commands; the program branched. Decades later, Twine games, Netflix's Bandersnatch, and AI-driven chatbots all operate on the same structural skeleton Crowther sketched in that Fortran listing: a node, a set of exits, and a state machine tracking where the reader has been.
A branching narrative is any story whose path through events is determined by choices: reader choices, system state, or a combination of both. The defining unit is the node: a discrete story beat that presents information and, usually, one or more decision points that route the reader forward.
Three structural archetypes dominate the field. The linear branch offers choices that converge back to the same spine: the reader feels agency, but the author controls the destination. The parallel branch sends readers down genuinely different paths that never reconverge, multiplying the writing required. The nested branch (or "diamond structure") opens up locally then closes back, a compromise that preserves authorial economy while still delivering meaningful divergence.
The worst mistake new interactive authors make is writing prose first and inserting branch points later. Branching is structural, not ornamental. If you plan the graph after the writing, you will find that most of your prose only fits one path, and you will be forced either to rewrite everything or to make choices that feel cosmetic.
The professional workflow, validated by studios from Inkle to BioWare, is: graph first, prose second. Sketch nodes and edges as a flowchart or plain text outline. Identify bottleneck nodes early (they will anchor your theme). Only then write node prose.
This workflow maps directly onto how you should prompt AI. When you ask a language model to "write a branching story about X," you will get something that reads like a branching story but is structurally incoherent: choices that go nowhere, states that are never checked, endings that appear arbitrarily. The solution is to prompt for structure first: ask the AI to output a node/edge list, review it, correct the graph, then ask it to write prose for each node.
"Generate a node/edge map for a 12-node branching story about [topic]. Format each node as: NODE_ID | summary (1 sentence) | choices: [label → target_node_id]. Do not write prose yet. Identify which nodes are bottlenecks."
Twine, released by Chris Klimas in 2009 and now at version 2, became the dominant free tool for interactive fiction because it made the graph visible. Its visual editor shows nodes as boxes and links as arrows, making structural mistakes immediately apparent. By 2023 the Twine itch.io tag had over 30,000 games, making it the largest single repository of branching narrative work in existence.
Twine's two main story formats, Harlowe and SugarCube, both support state variables through macro syntax. When prompting AI to generate Twine-compatible scripts, specifying the format (and pasting a short example of correct macro syntax) dramatically improves output quality.
The term "interactive fiction" was coined by Infocom co-founder Marc Blank in 1979 to distinguish their text adventures from arcade games. Infocom's Z-machine virtual machine, designed so one codebase could run on any platform, was a direct ancestor of today's platform-agnostic interactive story runtimes. The Z-machine specification is still publicly available and still used in IF competitions.
Language models excel at certain branching tasks and struggle with others. Understanding the gap helps you design prompts that play to AI strengths.
Where AI excels: writing node prose for a given context; generating plausible choice labels; expanding a one-sentence node summary into a full scene; suggesting emotionally resonant leaf node outcomes; maintaining voice consistency across nodes when given examples.
Where AI struggles: tracking global state across a long conversation; ensuring every edge leads to a valid node ID; maintaining consistent character knowledge across non-linear paths; avoiding "phantom choices" that appear to matter but don't change outcomes.
The practical implication: use AI as a node-level writer and a brainstorming partner for graph topology, but maintain the graph yourself in a separate document or spreadsheet. Never trust a language model to count its own nodes accurately; always verify the node/edge list before proceeding to prose.
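The verification step can be automated once the graph lives outside the conversation. The sketch below is one possible check, not a standard tool: it assumes the graph is kept as a plain dictionary mapping each node ID to the IDs its choices point at, and the function name `validate_graph` and the report keys are illustrative.

```python
from collections import deque


def validate_graph(graph: dict[str, list[str]], start: str) -> dict[str, list[str]]:
    """Report structural problems in a node/edge map.

    `graph` maps each node ID to the list of node IDs its choices point at.
    (This dictionary layout is an assumption made for the example.)
    """
    if start not in graph:
        raise ValueError(f"start node {start!r} is not defined")

    # Edges whose target was never defined as a node.
    dangling = sorted(
        {t for targets in graph.values() for t in targets if t not in graph}
    )

    # Breadth-first walk from the start node to find everything reachable.
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in graph[current]:
            if target in graph and target not in seen:
                seen.add(target)
                queue.append(target)

    return {
        "dangling_targets": dangling,
        "unreachable": sorted(set(graph) - seen),
        "endings": sorted(n for n, targets in graph.items() if not targets),
    }
```

Running this on every revised map catches the two errors named above (edges to missing node IDs and miscounted nodes) before any prose is requested. Counting the entries in `endings` is also a quick way to confirm the number of endings the model claims to have written.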
In this lab you will practice the graph-first workflow by prompting the AI assistant to generate a structured node/edge map for a short branching story. Focus on getting clean structure (node IDs, summaries, and labeled edges) before any prose is written.
Work through at least three exchanges: first request the graph structure, then ask for refinements (add a bottleneck node, adjust the number of endings), then request one node expanded to full prose.
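If the assistant returns the map in the pipe-delimited format requested in the structure-first prompt, each line can be parsed into a record you keep outside the chat. This is a minimal sketch under stated assumptions: the prompt does not say how multiple choices are separated, so the semicolon separator is a guess, and the `Node` class and `parse_node_line` function are hypothetical names. Both `→` and `->` are accepted as the arrow.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    summary: str
    # Each choice is a (label, target_node_id) pair.
    choices: list[tuple[str, str]] = field(default_factory=list)


def parse_node_line(line: str) -> Node:
    """Parse one 'NODE_ID | summary | choices: [label -> target; ...]' line.

    Assumption: choices inside the brackets are separated by semicolons.
    """
    parts = [p.strip() for p in line.split("|", 2)]
    if len(parts) != 3:
        raise ValueError(f"expected 3 '|'-separated fields: {line!r}")
    node_id, summary, choice_field = parts

    match = re.fullmatch(r"choices:\s*\[(.*)\]", choice_field)
    if match is None:
        raise ValueError(f"malformed choices field: {choice_field!r}")

    choices = []
    for item in match.group(1).split(";"):
        if not item.strip():
            continue  # an empty list marks an ending node
        label, arrow, target = item.replace("→", "->").partition("->")
        if not arrow:
            raise ValueError(f"choice has no arrow: {item!r}")
        choices.append((label.strip(), target.strip()))
    return Node(node_id, summary, choices)
```

A line that fails to parse is itself useful feedback: it usually means the model drifted from the requested format and the request for structure should be repeated before moving on to prose.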
When BioWare shipped Mass Effect 3 in 2012, the ending offered three choices that, despite over 80 hours of prior branching, produced nearly identical outcomes. The player response became known as the "Indoctrination Theory" controversy and generated a formal petition of 65,000 signatures demanding revised endings. BioWare released an Extended Cut DLC. The episode is the most-cited case study in professional game writing curricula for what happens when choices fail to deliver on their implicit promise of consequence.
Choice designer Richard Rouse III (Ubi Soft, The Suffering) identified three criteria for meaningful player choice, now standard in interactive narrative pedagogy: choices must be distinct (each option must feel genuinely different from the others), consequential (the choice must change something: the story, the world, the character), and informed (the reader must have enough context to make a real decision, not a blind guess).
A fourth criterion, added by narrative designer Anna Anthropy in her 2012 book Rise of the Videogame Zinesters, is expressive: the best choices let the reader say something about who they are, not just solve a puzzle. This distinction between expressive and instrumental choices maps onto the difference between RPG dialogue wheels and adventure game puzzles.
Distinct: each option genuinely differs · Consequential: something changes as a result · Informed: reader has enough context · Expressive: the choice reveals or constructs identity
An illusory choice is one where all options produce the same narrative outcome. It can be literal (all paths converge at the same next node regardless of selection) or functional (paths diverge but the outcome is emotionally or informationally identical). Readers are surprisingly good at detecting both kinds, and the detection destroys trust in the narrative system.
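The literal kind of illusory choice can be found mechanically from the graph alone. The sketch below flags choice points whose options all share one target, and also marks choice points whose branches meet again within a few steps as candidates for review. Everything here is an assumption for illustration: the dictionary layout, the names, and the two-step default. Quick reconvergence is only a hint, since functional illusion depends on what the prose and state do, which a graph check cannot see.

```python
def reachable_within(graph: dict[str, list[str]], start: str, depth: int) -> set[str]:
    """Nodes reachable from `start` in at most `depth` steps, including `start`."""
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        frontier = {t for n in frontier for t in graph.get(n, [])} - seen
        seen |= frontier
    return seen


def find_suspect_choices(graph: dict[str, list[str]], depth: int = 2) -> dict[str, str]:
    """Map node ID -> 'literal' or 'reconverges' for choices worth auditing."""
    suspects = {}
    for node, targets in graph.items():
        if len(targets) < 2:
            continue  # not a choice point
        distinct = set(targets)
        if len(distinct) == 1:
            suspects[node] = "literal"  # every option leads to the same node
            continue
        # Nodes that every branch can reach within `depth` steps.
        common = set.intersection(
            *(reachable_within(graph, t, depth) for t in distinct)
        )
        if common:
            suspects[node] = "reconverges"  # candidate only; check state changes by hand
    return suspects
```

A node flagged `reconverges` is not automatically a problem: a diamond structure that sets a state variable on one branch is a legitimate design. The flag simply tells you where to look first.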
In 2015, researchers at MIT's Comparative Media Studies program studied 200 Twine games and found that 61% of choice points were functionally illusory: they led to paths that reconverged within two nodes with no state change. The study concluded that most first-time authors default to the comfort of illusory choice because writing genuinely divergent paths is expensive.
One of the highest-value applications of AI in branching narrative work is choice auditing. Once you have a draft node/edge map, you can paste it to a language model and ask it to flag illusory choices, identify where consequences are missing, and suggest what state changes each choice should produce. This is faster and more systematic than manual review.
A well-formed audit prompt identifies the four criteria explicitly, so the model knows what to check. It also asks the model to suggest improvements, not just identify problems, which produces more actionable output.
"Review the following node/edge map. For each choice point, evaluate: (1) Are the options truly distinct? (2) What does each option change (state, relationship, world)? (3) Does the reader have enough context to choose meaningfully? (4) Is there an expressive dimension? Flag any illusory choices and suggest a specific consequence that could be added to fix them."
Two options create binary thinking but are easy to write and tend to feel high-stakes. Three options are the cognitive sweet spot: they imply a spectrum (typically safe/cautious, risky/bold, and a lateral third option that reframes the problem). Four or more options impose more cognitive load on the reader and are best reserved for moments where the choice is the point (character-defining moments, final decisions).
Avoid symmetrical consequences. If choosing A gains you an ally and choosing B gains you information, the choices feel distinct. If A gives you +5 strength and B gives you +5 intelligence, the choices feel like menu items: mechanically differentiated but narratively inert.
The technique of hidden state, where a choice plants a flag that only matters ten nodes later, is one of the most powerful tools in interactive narrative design. It creates the sensation of a world that remembers. When prompting AI to design choice consequences, explicitly ask it to include at least one "planted flag" per major choice cluster: a consequence that will manifest later, not immediately.
Inkle's 80 Days (2014) features over 750,000 words of branching prose across 169 cities, yet maintains remarkably high choice quality. Lead writer Meg Jayanth described the approach in her 2015 GDC talk: every choice was stress-tested by asking "what does this tell the reader about Passepartout?", which kept expressive weight on choices even when the informational stakes were low.
Sometimes you want the sensation of agency without the writing cost of full divergence. The false branch technique, also called "bark variation" in game narrative writing, produces different prose for the same structural outcome. The reader experiences a unique path; the node graph stays manageable.
False branches are legitimate when used consciously and sparingly. They become a problem when they are the only tool in use. A well-designed interactive narrative should mix false branches (for texture and re-read value) with true branches (for genuine consequence) and use state variables to make the two kinds feel consistent.
Take a branching structure, either the one you built in Lab 1 or a short one you paste in now, and use the AI assistant to audit it for choice quality. The goal is to identify illusory choices and redesign at least two of them to include genuine consequences.
Work through at least three exchanges: paste your node map and request an audit, discuss specific problem choices, then ask for redesigned choice language and consequence specifications.
In 2018, Netflix released Black Mirror: Bandersnatch, a branching film with over five hours of recorded content, 250 segments, and more than a trillion possible orderings according to the production team. The film used a proprietary state engine called Branch Manager to track viewer choices and gate content behind prior decisions. A viewer who fed Stefan the wrong breakfast cereal early in the film would encounter different dialogue from his father seventy minutes later. The system made memory visible: viewers could feel the accumulation of their choices. Bandersnatch won the Emmy for Outstanding Television Movie in 2019.
State is any information about the story world or the reader's history that persists across nodes. At its simplest, state is a set of boolean flags: "has the reader met Character X? Yes/No." At its most complex, it is a full simulation of character relationships, inventory, world conditions, and narrative history: a model of everything that has happened.
State enables two capabilities that pure branching cannot provide: conditional content (show this node only if state X is true) and accumulative consequence (a consequence that compounds across multiple prior choices). Both are central to the sensation that a branching story is a coherent world rather than a decision tree.
Flag: boolean, present/absent. "Has met the rebel leader." Simple and computationally cheap.
Counter: integer, tracks frequency. "Number of times player chose mercy." Enables graded outcomes.
Score: float or ranked value. "Relationship with Mara: 0-100." Enables nuanced relationship branching.
Gate: block content behind a required state. "This option only appears if you have the key card."
Flavor: modify prose based on state. "If relationship score > 60, Mara smiles. Otherwise she looks away."
Consequence: trigger new content based on accumulated state. "If mercy_count > 3, unlock the pacifist ending."
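The three state types and the three uses listed above fit in a few lines of code. This is an illustrative sketch only: the `StoryState` class, the function names, and the key names (`has_key_card`, `mara`, `mercy_count`) are assumptions that mirror the examples in the list, not the API of any particular engine.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StoryState:
    flags: set[str] = field(default_factory=set)               # Flag: present/absent
    counters: dict[str, int] = field(default_factory=dict)     # Counter: frequency
    scores: dict[str, float] = field(default_factory=dict)     # Score: e.g. 0-100


def available_choices(
    state: StoryState, choices: list[tuple[str, Optional[str]]]
) -> list[str]:
    """Gate: keep a choice only if its required flag (if any) is set."""
    return [
        label
        for label, required_flag in choices
        if required_flag is None or required_flag in state.flags
    ]


def mara_reaction(state: StoryState) -> str:
    """Flavor: vary a line of prose with the relationship score."""
    return "Mara smiles." if state.scores.get("mara", 0) > 60 else "Mara looks away."


def unlocked_ending(state: StoryState) -> Optional[str]:
    """Consequence: accumulated mercy choices unlock an ending."""
    return "pacifist_ending" if state.counters.get("mercy_count", 0) > 3 else None
```

Keeping gates, flavor, and consequences as small functions over one state object makes it easy to see which variables each node actually reads, which is the same information you will later inject into prompts.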
Language models do not have persistent memory across sessions, and even within a session their ability to track complex state degrades as context length grows. This is the fundamental tension between AI's prose-writing strength and its state-management weakness.
The standard solution used by interactive narrative studios integrating AI (including Latitude for AI Dungeon and Inworld AI for NPC dialogue) is state externalisation: maintain state in a separate data structure (a JSON object, a spreadsheet, a custom game engine variable store), and inject the relevant state into each AI prompt as a context block. The AI never tracks state; the external system does. The AI reads state from the injected context and generates state-aware prose.
"Current state: {player_name: 'Asha', met_rebel_leader: true, mercy_count: 4, mara_relationship: 72, current_node: 'NODE_17'}. Given this state, write the prose for NODE_17. Mara should acknowledge the player's history of mercy choices. Include a dialogue line where Mara's response reflects the relationship score above 70."
Latitude's AI Dungeon, launched in 2019 using GPT-2 and upgraded through GPT-3 and later models, was the first widely-used system to test AI as a live branching narrative engine. By 2021 it had over 1.5 million daily active users. The core product challenge was exactly the state problem: as stories grew longer, the AI model would forget early decisions, contradict established facts, or introduce characters that had been killed chapters earlier.
Latitude's engineering team published a 2021 post-mortem describing their solution: a "memory injection" system that maintained a structured summary of key facts and injected them into every prompt as a pinned context block. The summary included: character status (alive/dead/location), key choices made, relationship states, and world facts established. The system dramatically reduced continuity errors but required human curation to remain accurate.
Modern long-context models (Claude 3, GPT-4 Turbo, Gemini 1.5) have context windows of 100,000 to 1,000,000 tokens, reducing but not eliminating the state problem. Studies by Anthropic and Google show that models reliably recall injected facts placed at the beginning or end of context but show degraded recall for facts buried in the middle, the "lost in the middle" problem documented in Nelson Liu et al. (2023). For branching narrative, this means critical state should always be injected at the top of each prompt, not buried in story history.
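State externalisation with top-of-prompt injection can be sketched as a small prompt builder. The example below assumes state is held as a plain dictionary and serialised as JSON; the function name `build_node_prompt` and the section labels are illustrative choices, and no model API is called.

```python
import json


def build_node_prompt(state: dict, node_id: str, instructions: str) -> str:
    """Put the state block first so it is never buried in story history."""
    # sort_keys keeps key order stable between prompts, which makes diffs easy.
    state_block = json.dumps(state, sort_keys=True, indent=2)
    return (
        "CURRENT STATE (authoritative; do not change it):\n"
        f"{state_block}\n\n"
        f"TASK: Write the prose for {node_id}.\n"
        f"{instructions}"
    )


prompt = build_node_prompt(
    {
        "player_name": "Asha",
        "met_rebel_leader": True,
        "mercy_count": 4,
        "mara_relationship": 72,
    },
    "NODE_17",
    "Mara should acknowledge the player's history of mercy choices.",
)
```

The string in `prompt` is what you would send as the user message. Because the external dictionary is the source of truth, the model only ever reads state; updates happen in your own code after each node is resolved.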
There are three practical prompt techniques for state-aware interactive narrative generation with current AI tools.
State block injection (described above) is the most reliable. Before each node prose request, prepend a structured state summary. Keep it compact (under 200 words) and use consistent key names so you can parse and update it programmatically.
Accumulative summary prompting asks the AI to maintain and output a running state summary at the end of each response: "After writing the prose, output a JSON state block reflecting any changes this node produces." You then feed that block back in the next prompt. This is useful for rapid prototyping but requires careful review: models will sometimes silently modify state values incorrectly.
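The review step for accumulative summary prompting can be partly mechanised by extracting the returned state block and diffing it against the state you sent. This sketch assumes the model places a single JSON object after the prose, as the prompt above requests; the function names are hypothetical.

```python
import json


def extract_state_block(response: str) -> dict:
    """Return the last JSON object found in the response text."""
    decoder = json.JSONDecoder()
    found = None
    index = response.find("{")
    while index != -1:
        try:
            obj, end = decoder.raw_decode(response, index)
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            index = response.find("{", index + 1)
            continue
        found = obj
        index = response.find("{", end)
    if found is None:
        raise ValueError("no JSON object found in response")
    return found


def diff_state(old: dict, new: dict) -> dict:
    """Map each changed key to an (old_value, new_value) pair."""
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

Printing the diff after every turn makes silent modifications visible: if a node that only involved a mercy choice also changed `mara_relationship`, you see it immediately and can reject the update before feeding the block back in.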
Conditional prose variants ask the AI to write multiple prose versions of a single node, each corresponding to a different state condition: "Write three versions of NODE_22: one where the player has high trust with Mara (score > 70), one where trust is neutral (30 to 70), and one where trust is low (< 30)." You then select the appropriate version in your game engine based on actual state.
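Selecting among pre-generated variants is a plain lookup on the score bands named in the prompt. A minimal sketch, assuming the three versions are stored under the hypothetical keys `high`, `neutral`, and `low`:

```python
def select_variant(trust: float, variants: dict[str, str]) -> str:
    """Pick the prose variant whose trust band contains `trust`.

    Bands follow the prompt: high is > 70, neutral is 30 to 70 inclusive,
    low is < 30.
    """
    if trust > 70:
        band = "high"
    elif trust >= 30:
        band = "neutral"
    else:
        band = "low"
    return variants[band]
```

Stating the boundaries in code forces a decision the prompt leaves implicit, namely which band a score of exactly 30 or 70 belongs to.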
When asking AI to generate Twine-compatible content with state, including the correct macro syntax in your system prompt dramatically improves output. For SugarCube: <<if $mercy_count gte 3>>...<</if>> and <<set $met_rebel to true>>. For Harlowe: (if: $mercy_count >= 3)[...] and (set: $met_rebel to true). Paste examples of both in your prompt and specify which format you want.
In this lab you will practice state injection prompting. Define a small state block for your story (3 to 5 variables), then ask the AI to write node prose that is explicitly shaped by that state. In subsequent exchanges, change one or two state values and ask for revised prose, observing how the output changes.
This models the real workflow: external state management feeding into per-prompt AI generation. Work through at least three exchanges with different state configurations.
In December 2019, Nick Walton and his team at Latitude released AI Dungeon built on OpenAI's GPT-2. Within two months, it had been played by over a million people. Unlike any prior interactive fiction system, AI Dungeon did not execute a pre-written graph β it generated narrative in response to any free-text player input. The player could type anything; the model would continue the story. This was the first mass-market demonstration that a language model could serve as a live narrative engine rather than a content retrieval system. The product also immediately surfaced the central tension of live generation: without a pre-written graph, authorial intent could only be enforced through the system prompt, not through structural constraints.
In traditional interactive narrative, the author controls the story completely: every node is pre-written, every edge is pre-defined, every state change is intentional. The reader navigates a garden of forking paths, but all paths were planted by the author. Agency is real but bounded.
In AI-driven live narrative, the author becomes a system designer rather than a text producer. The author writes: the system prompt (tone, world rules, character personalities, content guidelines), the initial context (opening scene, world state, player character), and the constraints (what the AI should never do, what it should always maintain). The actual prose of the story is generated dynamically in response to player input.
This is a genuine creative shift, not just a technical one. It requires a different skill set: world-building rigor (because the AI will extrapolate from your rules relentlessly), constraint design (because prohibitions that aren't explicit will be violated), and quality auditing (because you cannot pre-read every possible path).
Pre-written IF: author controls every word. AI-driven IF: author controls the rules the AI follows to generate every word. Creative control moves from sentence-level to system-level. The author's voice is expressed through prompt engineering, not prose writing.
The commercial application closest to mainstream adoption as of 2024 is AI-driven NPC dialogue: using language models to generate character speech in response to player input rather than selecting from a pre-written dialogue tree. Inworld AI, founded in 2021, raised $50 million in Series A funding in 2023 and signed partnerships with Niantic and LG, among others.
Inworld's architecture is instructive: each NPC has a character brain, a structured document specifying personality, backstory, goals, speaking style, knowledge limits (what the character knows/doesn't know), and content guardrails. The character brain is injected into every AI call. Player speech is passed in; character response is generated out. The game engine manages state; Inworld manages generation. This is state externalisation applied to live NPC dialogue.
When the AI generates story content dynamically, the author cannot pre-vet every possible path. Constraints that are implicit in a pre-written graph must be made explicit in prompts. A character who "would never betray the player" in a pre-written IF is simply never written betraying the player. In live generation, that character will betray the player if the constraint isn't stated and the situation makes betrayal plausible to the model.
Effective constraint design for live narrative AI requires three layers: hard prohibitions (content the AI must never generate regardless of player input), soft style constraints (tone, register, vocabulary that should be maintained), and character integrity constraints (personality traits, knowledge limits, and motivations that must remain stable across the session).
World rules: physics, magic, technology limits
Tone declaration: register, vocabulary, pacing
Hard prohibitions: explicit never-do list
Character dossiers: personality, goals, knowledge limits
Injected state block: current story state
Character drift: personality changes over turns
World inconsistency: established facts contradicted
Escalation: tone darkens beyond design intent
Constraint erosion: prohibited content generated after many turns
Repetition: same phrases recycled across nodes
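The five prompt components listed above (world rules through injected state block) can be assembled from separate fields, so that each constraint layer is edited on its own and nothing is left implicit. This is a sketch under assumptions: the function name, the section labels, and the plain-text layout are illustrative, not a format any vendor requires.

```python
def build_system_prompt(
    world_rules: list[str],
    tone: str,
    hard_prohibitions: list[str],
    character_dossiers: dict[str, str],
    state: dict[str, object],
) -> str:
    """Assemble the five components into one labelled system prompt."""
    sections = [
        ("WORLD RULES", "\n".join(f"- {rule}" for rule in world_rules)),
        ("TONE", tone),
        ("NEVER DO", "\n".join(f"- {item}" for item in hard_prohibitions)),
        (
            "CHARACTERS",
            "\n".join(
                f"- {name}: {dossier}" for name, dossier in character_dossiers.items()
            ),
        ),
        (
            "CURRENT STATE",
            "\n".join(f"- {key}: {value}" for key, value in state.items()),
        ),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


system_prompt = build_system_prompt(
    world_rules=["Magic always costs the caster a memory."],
    tone="Terse, noir, present tense.",
    hard_prohibitions=["Mara never betrays the player."],
    character_dossiers={"Mara": "Loyal smuggler; does not know the vault code."},
    state={"mercy_count": 4, "met_rebel_leader": True},
)
```

Rebuilding the prompt from these fields on every call, rather than relying on the conversation history, is one way to counter constraint erosion and character drift: the prohibitions and dossiers are restated verbatim each turn.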
The most sophisticated current implementations use hybrid architectures that combine pre-written structural nodes with AI-generated prose within those nodes. The graph is pre-designed (ensuring structural integrity), but the prose at each node is generated dynamically (ensuring responsiveness to player history and enabling variation on replay).
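A hybrid architecture of this kind can be sketched in a few lines. In the example below the graph, the scene briefs, and the choices are authored by hand, and only the prose comes from a generator function. All names are hypothetical, and `generate` stands in for whatever model call you use; here it is any function that takes a prompt string and returns text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuthoredNode:
    node_id: str
    brief: str               # what the generated prose must accomplish
    choices: dict[str, str]  # choice label -> target node ID, fixed by the author


def render_node(
    node: AuthoredNode, state: dict, generate: Callable[[str], str]
) -> str:
    """Generate prose for one node; the choices shown come from the graph."""
    prompt = (
        f"State: {state}\n"
        f"Scene brief: {node.brief}\n"
        "Write the scene only. Do not add or remove choices."
    )
    prose = generate(prompt)
    options = "\n".join(
        f"{i}. {label}" for i, label in enumerate(node.choices, 1)
    )
    return f"{prose}\n\n{options}"


def advance(graph: dict[str, AuthoredNode], node_id: str, label: str) -> str:
    """Follow an authored edge; an unknown label raises KeyError."""
    return graph[node_id].choices[label]
```

Because `advance` only consults the authored graph, the model cannot invent an edge or an ending. Structural integrity stays with the author while the wording of each scene can still vary with state and on replay.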
This hybrid approach was described by game designer Emily Short in her 2022 survey of AI in interactive narrative for the Electronic Literature Organization: "The emerging consensus is that AI generation works best when it is structurally framed. Give AI a room to furnish, not a building to design." Short's analogy is the best practical heuristic for where to draw the line between authored structure and AI generation.
"Give AI a room to furnish, not a building to design." Use pre-authored graph structure to define what AI-generated prose must accomplish in each node, and use AI generation for the texture, variation, and responsiveness within that structure. This preserves authorial intent while gaining AI's strengths in variation and state-aware prose.
Pre-written IF is evaluated by reading it. Live-generated IF cannot be fully read before release because the space of possible paths is too large. Instead, quality is evaluated through red-teaming (deliberately trying to break constraints), sampling (generating and reviewing hundreds of random play-throughs), and player feedback loops (monitoring what players do and what they report the AI generating).
The field of narrative quality metrics for AI-generated interactive fiction is nascent. Researchers at UC Santa Cruz's Expressive Intelligence Studio (EIS) have proposed five measurable dimensions: coherence (internal logical consistency), expressivity (variance and surprise), constraint adherence (prohibition compliance rate), character integrity (personality stability), and engagement (proxy-measured by session length and replay rate). Using these five dimensions as evaluation criteria, even informally, gives you a framework for iterating on your system prompt.
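Sampling and a constraint adherence rate can be prototyped without any special tooling. The sketch below is a deliberately crude proxy and rests on several assumptions: `respond` stands in for your model call, sessions are built from randomly chosen canned player inputs, and a violation is detected by simple phrase matching, which will miss paraphrased violations and is no substitute for reading samples.

```python
import random
from typing import Callable


def sample_sessions(
    respond: Callable[[str], str],
    player_inputs: list[str],
    sessions: int,
    turns: int,
    seed: int = 0,
) -> list[str]:
    """Generate transcripts by feeding randomly chosen player inputs to `respond`."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    transcripts = []
    for _ in range(sessions):
        lines = [respond(rng.choice(player_inputs)) for _ in range(turns)]
        transcripts.append("\n".join(lines))
    return transcripts


def adherence_rate(transcripts: list[str], banned_phrases: list[str]) -> float:
    """Fraction of transcripts containing none of the banned phrases."""
    if not transcripts:
        raise ValueError("need at least one transcript")
    lowered = [phrase.lower() for phrase in banned_phrases]
    clean = sum(
        1
        for text in transcripts
        if not any(phrase in text.lower() for phrase in lowered)
    )
    return clean / len(transcripts)
```

Tracking this number across revisions of the system prompt gives a rough signal of whether a change to the hard prohibitions helped, and the transcripts that fail the check are the first ones to read in full.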
In this lab you will draft a complete system prompt for an AI-driven interactive narrative (a "character brain" or world-system specification) and test it against the AI assistant. The goal is to design constraints robust enough that the AI maintains your intent across multiple exchanges.
Work through at least three exchanges: first draft your system prompt and ask for feedback on its constraint coverage (hard prohibitions, soft style, character integrity), then test it by roleplaying a player input and observing how the AI responds within your defined constraints, then iterate on one weakness you identify.