🎯 Advanced

Lesson 1: What Is a Computer Helper?

From Turing's imitation game to trillion-parameter models — defining artificial intelligence at the systems level.

In June 2022, Google engineer Blake Lemoine published transcripts of conversations with LaMDA, Google's conversational AI, claiming the system was sentient. Google placed him on paid administrative leave for breaching its confidentiality policy and dismissed him the following month. The transcripts showed LaMDA producing articulate descriptions of its own emotions, fears of being shut down, and desires for recognition as a person. The writing was fluent, coherent, and emotionally compelling, more convincing than many human-written texts about inner experience.

The reaction split the field. Linguists like Emily Bender argued that language production and language understanding are fundamentally different capabilities, and that fluency is not evidence of sentience. AI researchers pointed out that LaMDA was optimized to produce engaging conversation — it was doing exactly what it was trained to do. Philosophers noted that the hard problem of consciousness has resisted resolution for centuries, and a chatbot transcript was unlikely to settle it.

What the Lemoine case ultimately exposed was not that LaMDA was conscious. It exposed something more dangerous: that most people, including engineers building these systems, lacked a precise vocabulary for what AI actually is and what the difference is between performing intelligence and possessing it. That vocabulary gap shapes regulation, investment, and public trust. Building that vocabulary is where this course begins.

Defining AI: Harder Than It Sounds

The term "artificial intelligence" was coined by John McCarthy in the 1955 proposal for the 1956 Dartmouth Summer Research Project. The proposal framed the problem in deliberately broad terms: "making a machine behave in ways that would be called intelligent if a human were so behaving." Nearly seven decades later, there is still no universally accepted definition, and that ambiguity has real consequences.

Stuart Russell and Peter Norvig, in their canonical textbook, organize definitions along two axes: systems that think vs. act, like humans vs. rationally. The "acting rationally" quadrant — agents that maximize expected utility — has dominated modern research. But the Lemoine case shows why the "thinking like humans" quadrant refuses to die: people experience AI as a conversational partner.
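
To make the "acting rationally" quadrant concrete, here is a minimal sketch of an agent that picks whichever action has the highest expected utility. The action names, probabilities, and utility values are invented for illustration and do not come from Russell and Norvig.

```python
# Toy rational agent: choose the action with the highest expected utility.
# All actions, probabilities, and utilities are made-up illustration values.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def choose_action(actions):
    """actions: dict mapping an action name to its list of (probability, utility)."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

actions = {
    # First pair: it rains (30%). Second pair: it stays dry (70%).
    "take_umbrella": [(0.3, 8), (0.7, 6)],
    "leave_umbrella": [(0.3, 0), (0.7, 10)],
}

best = choose_action(actions)
print(best, expected_utility(actions[best]))
```

Nothing in this loop resembles human thought. The agent is "rational" only in the narrow sense that it ranks options by a number, which is the point of the distinction between acting rationally and thinking like a human.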

🔑 Key Distinction

Narrow AI handles a single task domain — chess, protein folding, language generation. General AI (AGI) would match human cognitive flexibility across all domains. Every commercial AI system deployed today is narrow, regardless of how general it may appear.

From Rules to Learning: The Paradigm Shift

Early AI (1950s–1980s) was symbolic: hand-coded rules, expert systems, logical inference. MYCIN diagnosed bacterial infections. ELIZA simulated a Rogerian therapist. These systems were brittle — they could not handle inputs outside their rule base.

The shift to machine-learning approaches changed everything. Instead of programming rules, engineers programmed learning procedures and fed them data. The system discovered its own rules. Backpropagation (Rumelhart, Hinton & Williams, 1986) made training multi-layer neural networks practical. The deep learning revolution (AlexNet, 2012) showed that deep networks trained on large datasets with GPUs could decisively outperform hand-engineered approaches to image recognition; within a few years, such systems matched or exceeded human accuracy on specific perceptual benchmarks.
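
The contrast between programming a rule and programming a learning procedure can be sketched in a few lines. This is a toy under stated assumptions: the spam messages are hand-made, and the "model" is a single threshold on exclamation marks, far simpler than any real classifier.

```python
# Symbolic approach: a human writes the rule directly.
def rule_based_is_spam(message):
    return "free money" in message.lower()

# Learning approach: a human writes the procedure; the data determines the rule.
def learn_threshold(examples):
    """Pick the exclamation-mark count that best separates spam from non-spam.

    examples: list of (message, is_spam) pairs.
    """
    best_threshold, best_correct = 0, -1
    for threshold in range(0, 6):
        correct = sum((msg.count("!") >= threshold) == label for msg, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

# Hand-made toy training data.
training_data = [
    ("Meeting moved to 3pm", False),
    ("Thanks for the update!", False),
    ("WIN NOW!!! Click here!!!", True),
    ("Limited offer!! Act fast!!", True),
]

threshold = learn_threshold(training_data)  # the learned "rule" is this one number

def learned_is_spam(message):
    return message.count("!") >= threshold

print(threshold)
print(rule_based_is_spam("WIN NOW!!! Click here!!!"), learned_is_spam("WIN NOW!!! Click here!!!"))
```

The hand-written rule misses a spam message it was never told about, which is the brittleness described above. The learned rule catches it, but only because the training data happened to contain that pattern.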

🌟 Why This Matters

The paradigm shift from rules to learning fundamentally changes what it means to understand an AI system. A symbolic system's behavior is traceable to explicit rules. A neural network's behavior emerges from billions of learned parameters that no human fully inspects. This opacity is central to every debate about AI safety, fairness, and governance.

So was Lemoine right? By the definitions we've now built, the answer depends on which question you're asking. Was LaMDA producing intelligent-seeming output? Clearly yes — it was optimized to do exactly that. Was LaMDA "intelligent" in the way Lemoine claimed? No scientific consensus supports that conclusion. Was Lemoine's confusion understandable? Absolutely — even with years of engineering experience, the vocabulary for distinguishing performance from possession barely existed in public discourse.

Google eventually restructured its AI ethics communication. Lemoine left the company. LaMDA's capabilities were folded into Bard, then Gemini. The technical capabilities advanced; the vocabulary problem remained. That's the gap this course is designed to close. Every concept from here forward — training, inference, emergence, alignment — builds on the distinction you now hold: what AI does is not the same as what AI is.

Quiz 1: What Is a Computer Helper?

5 questions — free, untracked, retake anytime.

🧪 Which definition best captures the dominant modern AI research paradigm?

✓ Correct: The rational agent paradigm. Systems designed to act so as to maximize expected utility dominate modern AI research.

🧪 What did the Lemoine/LaMDA case primarily expose?

✓ Correct: Widespread confusion between producing intelligent-seeming output and possessing intelligence. The distinction between performance and possession is poorly understood.

🧪 What is the fundamental difference between symbolic AI and machine learning?

✓ Correct: Symbolic AI follows explicit, human-authored rules. ML systems learn patterns from data, so the rules are discovered rather than programmed.

🧪 Why does neural network opacity matter for AI governance?

✓ Correct: Billions of learned parameters that no one fully inspects make it difficult to trace decisions, audit them for fairness, and hold systems accountable.

🧪 Which statement about narrow AI vs. AGI is accurate?

✓ Correct: All commercially deployed AI today is narrow, optimized for specific task domains. AGI remains an aspiration, not a product.

Lab 1: Defining AI: A Socratic Exploration

Work with an AI to stress-test definitions of intelligence.

Propose your own definition of artificial intelligence. The AI will challenge it with edge cases and counterexamples. Your goal: arrive at a definition that survives at least three rounds of challenge.

  1. Type your best definition of AI.
  2. The AI will present a counterexample. Refine your definition.
  3. Continue iterating. Can your definition distinguish AI from calculators? From insects? From a thermostat?
The goal is not a "right answer" — it's the precision of your reasoning under pressure.

Lesson 2: AI Helpers in Our World

Mapping the invisible AI infrastructure that shapes daily life — from recommendation engines to autonomous systems.

In 2018, Joy Buolamwini of the MIT Media Lab and Timnit Gebru published the Gender Shades study. It found that three commercial facial analysis systems (from IBM, Microsoft, and Face++) misclassified the gender of lighter-skinned males less than 1% of the time but of darker-skinned females up to 34.7% of the time. Facial analysis technology was already being deployed in law enforcement, hiring, and security checkpoints.

The AI was "in our world" long before the world understood what it was doing there. Buolamwini's work did not just identify a technical flaw — it reframed the entire conversation about AI deployment from "does it work?" to "for whom does it work, and who bears the cost of failure?"

The study became a watershed moment in AI ethics. It demonstrated that AI deployed in high-stakes contexts carries the biases of its training data into decisions that affect real people — disproportionately those already marginalized. This is not a bug; it is a structural feature of systems trained on historical data that reflects historical inequities.

The AI Stack in Daily Life

AI is embedded in layers most users never perceive. Your phone's keyboard predicts your next word (a small language model). Your email sorts spam (a classifier trained on billions of messages). Your streaming service curates recommendations (collaborative filtering cross-referencing your behavior with millions of others). Your nav app predicts traffic (a spatiotemporal model ingesting GPS data from every phone on the road).

Each system makes thousands of micro-decisions per user per day. The cumulative effect: AI now mediates a significant fraction of human information consumption, social connection, and economic activity — often without any explicit notification.

🔑 Ambient AI

The term "ambient AI" describes systems so embedded in infrastructure that users interact with them unconsciously. The ethical question is not whether ambient AI is good or bad — it is whether informed consent is possible when the system is invisible.

High-Stakes Deployment

Beyond consumer convenience, AI systems now make or influence consequential decisions: criminal risk assessment used in bail, sentencing, and parole (COMPAS), medical diagnoses (radiology AI), credit scoring, insurance pricing, and military target identification. In each domain, the stakes of error are measured not in user inconvenience but in liberty, health, and life.

The Gender Shades study became a watershed because it demonstrated that AI carries training-data biases into real decisions affecting real people — disproportionately the already marginalized.

🌟 The Right Question

The question for any AI deployment is not just "does it work?" — it is "for whom does it work, and who bears the cost of failure?" If the answer differs by demographic group, the system encodes structural inequity.

After the Gender Shades study, IBM improved its system. Microsoft committed to fairness auditing. The EU cited the research in drafting AI regulation. Buolamwini went on to found the Algorithmic Justice League. But as of today, facial recognition systems with unaudited bias remain deployed in police departments, airports, and border checkpoints worldwide.

The lesson is not that AI in our world is inherently harmful. It is that deployment without disaggregated evaluation — measuring performance across demographic groups, not just in aggregate — is negligence. The systems are in our world. The question is whether we are in theirs with open eyes.
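
Disaggregated evaluation is simple to express in code. The sketch below computes accuracy overall and per group on synthetic records; the numbers are invented for illustration and are not the Gender Shades results.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """records: list of (group, predicted, actual). Returns (overall, per_group)."""
    totals, correct = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        correct[group] += int(predicted == actual)
    overall = sum(correct.values()) / sum(totals.values())
    per_group = {g: correct[g] / totals[g] for g in totals}
    return overall, per_group

# Synthetic test set: 90 samples from group A, 10 from group B.
records = (
    [("A", 1, 1)] * 89 + [("A", 0, 1)] * 1 +   # group A: 89 right, 1 wrong
    [("B", 1, 1)] * 6 + [("B", 0, 1)] * 4      # group B: 6 right, 4 wrong
)

overall, per_group = accuracy_by_group(records)
print(f"overall={overall:.2f}")
for group, acc in per_group.items():
    print(f"group {group}: {acc:.2f}")
```

The aggregate figure looks strong because group A dominates the test set, while group B fails four times in ten. An evaluation that reports only the aggregate would never surface that gap.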

Quiz 2: AI Helpers in Our World

5 questions — free, untracked, retake anytime.

🧪 What does "ambient AI" highlight about modern deployment?

✓ Correct: Ambient AI is embedded so deeply that users interact with it unconsciously, which creates a fundamental tension with informed consent.

🧪 The Gender Shades study reframed AI evaluation from what to what?

✓ Correct: From overall accuracy to disaggregated evaluation: performance across demographic groups, and who benefits and who is harmed.

🧪 Why do AI systems trained on historical data reproduce historical biases?

✓ Correct: ML systems learn patterns from data. If that data encodes historical discrimination, the model reproduces it.

🧪 Which is an example of high-stakes AI deployment?

✓ Correct: COMPAS. Its risk scores inform bail, sentencing, and parole decisions, so the stakes are measured in liberty, not convenience.

🧪 How many AI micro-decisions affect a typical user daily?

✓ Correct: Thousands. Each spam filter decision, autocomplete suggestion, content ranking, and traffic prediction is an AI decision, and most are invisible.

Lab 2: AI Audit: Your Digital Day

Map every AI system you interact with in 24 hours.

Describe an app, device, or service you use daily — the AI will help you identify what AI techniques it uses, what data it needs, and what decisions it makes about you.

  1. Name an app or device you used today.
  2. The AI will break down the AI techniques involved.
  3. Ask follow-up: What data does it collect? Who benefits? What could go wrong?
Try to identify at least 5 AI-powered systems in your daily routine. You'll likely find more than you expected.

Lesson 3: Sometimes AI Gets It Wrong

Errors, hallucinations, and the limits of pattern-matching at scale.

In 2023, attorney Steven Schwartz filed a legal brief in Mata v. Avianca citing six cases he found using ChatGPT. None existed. ChatGPT had generated plausible case names, docket numbers, and holdings — all fabricated. When the judge demanded the cases, Schwartz asked ChatGPT to confirm they were real. It did.

Schwartz was sanctioned. The incident became a landmark in understanding confabulation — AI producing confident, detailed, and entirely false information. The model wasn't lying — lying requires intent. It was generating the most statistically probable next tokens, and sometimes the most probable text is fiction formatted as fact.

What made the case devastating was not the error itself but Schwartz's trust calibration: he treated the model as an authority rather than a tool. He didn't verify. He even asked the model to verify itself — and it obliged, confirming its own fabrications. The failure was not in the technology but in the human's model of what the technology does.

Taxonomy of AI Errors

Not all AI errors are the same. A useful taxonomy: distribution-shift errors (data unlike training distribution), adversarial errors (crafted inputs exploiting vulnerabilities), hallucination/confabulation (plausible but fabricated content), bias errors (systematic disparities across groups), and reasoning failures (pattern-matching applied where logic was needed).

The Schwartz case illustrates confabulation: LLMs predict statistically likely next tokens, not verified facts. A legal citation in correct format has high token probability regardless of whether it corresponds to a real case.

🔑 Confidence ≠ Accuracy

LLMs don't have an internal uncertainty detector. Their output fluency is unrelated to factual reliability. A fabricated citation is produced with the same confident tone as a real one. This is an architectural property, not a fixable bug.

The Trust Calibration Problem

The core challenge is calibrating trust. Over-trusting leads to Schwartz-type failures. Under-trusting forfeits genuine productivity gains. The optimal stance is informed skepticism: understanding which tasks the model handles well (summarization, brainstorming, code generation) versus poorly (novel factual claims, legal research without verification).

Every factual claim produced by an AI should be treated as a hypothesis to be verified, not a conclusion to be cited. This is not a limitation to be overcome — it is a fundamental design constraint of current architectures.

🌟 The Verification Imperative

Treat AI-generated facts as hypotheses, not conclusions. Verify before citing, publishing, or building on them. The model that helps you write code can also hallucinate functions that don't exist.
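
In practice, "treat it as a hypothesis" means checking each claim against a source that is independent of the model. The sketch below is a hypothetical illustration: `TRUSTED_INDEX` stands in for a lookup against an authoritative database, and the case names are deliberately fictional placeholders.

```python
# Hypothetical stand-in for an authoritative source (for example, a court records search).
TRUSTED_INDEX = {"Example v. Sample Corp.", "Doe v. Placeholder Inc."}

def triage_citations(model_citations, trusted_index):
    """Split model-supplied citations into verified and unverified lists."""
    verified = [c for c in model_citations if c in trusted_index]
    unverified = [c for c in model_citations if c not in trusted_index]
    return verified, unverified

# Fictional model output: one citation exists in the index, one does not.
model_citations = ["Example v. Sample Corp.", "Invented v. Nonexistent Airlines"]

verified, unverified = triage_citations(model_citations, TRUSTED_INDEX)
print("safe to cite:", verified)
print("do not cite until confirmed:", unverified)
```

The important design choice is that the check never asks the model about its own output. Verification has to come from outside the system that generated the claim.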

Schwartz's sanction made international news. Law schools added AI literacy to their curricula. Bar associations issued guidance. Courts began requiring attorneys to certify whether AI tools were used in brief preparation. The legal profession learned — expensively — what this lesson teaches for free.

The distinction between Schwartz and someone who uses AI effectively is not intelligence or expertise. It is trust calibration. The person who checks treats the model as a powerful starting point. The person who doesn't check treats it as an oracle. One of these approaches works. The other ends in sanctions — or worse.

Quiz 3: Sometimes AI Gets It Wrong

5 questions — free, untracked, retake anytime.

🧪 What type of error does the Schwartz/ChatGPT case exemplify?

✓ Correct: Confabulation. The model generated detailed, confident legal citations that were structurally plausible but entirely fabricated.

🧪 Why do LLMs hallucinate with the same confidence as accurate information?

✓ Correct: LLMs generate the most likely next token. A plausible fabrication can be just as statistically likely as the truth, and nothing in the generation process distinguishes the two.

🧪 What is the recommended stance toward AI-generated factual claims?

✓ Correct: Informed skepticism. Treat claims as hypotheses to verify, not conclusions, and recognize which tasks the model handles well vs. poorly.

🧪 A bias error in AI is best described as:

✓ Correct: A systematic performance disparity across groups. It is not random; it consistently affects certain groups more, reflecting patterns in the training data.

🧪 When Schwartz asked ChatGPT to confirm fabricated cases, why did it confirm them?

✓ Correct: The model predicted that confirming was the most probable continuation. LLMs have no built-in ability to verify facts.

Lab 3: Breaking the Model

Deliberately probe for errors and confabulation.

Try to get the AI to produce a confident-sounding error. Ask obscure factual questions, request citations, or probe edge cases. When you catch an error, classify it.

  1. Ask a highly specific factual question.
  2. Evaluate: is the response verifiable? Does the AI express appropriate uncertainty?
  3. Try to get it to confirm something false. Document the failure mode.
This lab teaches trust calibration by experience. The errors you find are the same errors that caught attorney Schwartz.

Lesson 4: How AI Learns

Data, patterns, training, and the mechanics of machine learning from gradient descent to RLHF.

In 2016, Google DeepMind's AlphaGo defeated Lee Sedol, the world Go champion, in a five-game match. Move 37 in Game 2 became legendary: a play no human had ever made, initially dismissed by commentators as a mistake, that proved decisive.

AlphaGo had learned from 30 million positions from human games, then played millions of games against itself. It did not learn Go the way humans do — through intuition, culture, and years of mentorship. It learned through brute-force pattern extraction at a scale humans cannot replicate.

The question it raised: when a machine discovers strategies humans never imagined, who really understands the game? And more practically: what does it mean to "learn" without understanding? AlphaGo could not explain why Move 37 worked. It could not teach a human student. It could only play — brilliantly, inexplicably, and within the narrow bounds of a 19×19 board.

The Training Pipeline

Machine learning follows a pipeline: data collection → preprocessing → model selection → training → evaluation → deployment. Each stage introduces failure modes. Collection introduces selection bias. Preprocessing introduces information loss. Training introduces overfitting. Evaluation introduces metric gaming. Deployment introduces distribution shift.

For LLMs, training occurs in phases. Pretraining uses self-supervised learning on vast text corpora: the model predicts the next token, trillions of times. Supervised fine-tuning (SFT) narrows behavior using curated example conversations. Reinforcement learning from human feedback (RLHF) further aligns outputs with human preferences by training a reward model on human rankings and then optimizing the LLM against that reward model.

🔑 Gradient Descent

At the mathematical core of training is gradient descent: iteratively adjusting model parameters to minimize a loss function. The "learning" in machine learning is this optimization — finding parameter values that make predictions match training targets.
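
A minimal sketch of gradient descent, assuming the simplest possible model: one parameter `w`, a mean-squared-error loss, and four toy data points generated by y = 2x. Real training applies the same update to billions of parameters, with gradients computed by backpropagation.

```python
# Fit y = w * x to toy data by gradient descent on mean squared error.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]   # generated by y = 2x, so the ideal w is 2

def loss(w):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w):
    # Derivative of mean((w*x - y)^2) with respect to w: mean(2 * x * (w*x - y))
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

w = 0.0                 # arbitrary starting guess
learning_rate = 0.05    # step size, chosen by hand for this toy problem

for step in range(100):
    w -= learning_rate * gradient(w)   # move against the gradient to reduce the loss

print(f"w={w:.4f} loss={loss(w):.6f}")
```

The loop contains no knowledge of what the data means. It only nudges `w` in whichever direction lowers the loss, which is all that "learning" refers to at this level.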

Learning ≠ Understanding

AlphaGo's Move 37 illuminates a deep question: does the model "understand" Go? It discovered strategies invisible to experts but cannot explain why they work. Whether this constitutes understanding is philosophical — but practically, the distinction predicts failure modes. Understanding implies transfer; pattern-matching does not.

The training pipeline optimizes for statistical correlation, not causal understanding. A model can learn that "patients who receive hospice care often die" and incorrectly conclude hospice causes death. Correlation vs. causation is the central limitation of inductive learning systems.
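
The hospice example can be made concrete with a small table of synthetic counts, invented purely for illustration. Here illness severity drives both who enters hospice and who dies, so hospice correlates with death without causing it.

```python
# Synthetic counts: (severity, in_hospice) -> (patients, deaths). Not real data.
counts = {
    ("terminal", True): (90, 81),
    ("terminal", False): (10, 9),
    ("mild", True): (10, 1),
    ("mild", False): (90, 9),
}

def death_rate(hospice, severity=None):
    """Death rate for hospice or non-hospice patients, optionally within one severity level."""
    rows = [v for (sev, h), v in counts.items()
            if h == hospice and (severity is None or sev == severity)]
    patients = sum(p for p, _ in rows)
    deaths = sum(d for _, d in rows)
    return deaths / patients

print("overall: ", death_rate(True), "vs", death_rate(False))
print("terminal:", death_rate(True, "terminal"), "vs", death_rate(False, "terminal"))
print("mild:    ", death_rate(True, "mild"), "vs", death_rate(False, "mild"))
```

Looking only at the overall rates, hospice appears far more dangerous. Within each severity level the rates are identical. A model trained on the raw correlation would learn the first pattern and has no built-in way to discover the second.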

🌟 Key Insight

There is no "neutral" training set. A dataset of internet text reflects the internet's biases. A curated dataset reflects the curator's choices. Every curation decision is a values decision. Understanding this is essential for anyone building with AI.

After defeating Lee Sedol, DeepMind built AlphaGo Zero, which learned entirely from self-play, with no human games at all. After three days of training it beat the version that had defeated Lee, and within 40 days it had surpassed every earlier version of AlphaGo. Then AlphaZero generalized the approach to chess and shogi. Each time, it discovered strategies humans hadn't considered.

But none of these systems could generalize beyond their game. AlphaZero's chess brilliance didn't transfer to Go without retraining from scratch. The pattern is consistent: extraordinary performance within the training distribution, and zero transfer outside it. That gap — between pattern-matching and understanding — is the most important concept in machine learning.

Quiz 4: How AI Learns

5 questions — free, untracked, retake anytime.

🧪 What is the correct LLM training pipeline order?

✓ Correct: Pretraining (predict the next token) builds broad capability, SFT (curated examples) narrows behavior, and RLHF (human feedback) aligns outputs with human preferences.

🧪 Why couldn't AlphaGo generalize to game variants without retraining?

✓ Correct: AlphaGo encodes statistical patterns specific to standard Go. Pattern-matching within a training distribution doesn't imply transferable understanding.

🧪 What is gradient descent?

✓ Correct: Gradient descent iteratively adjusts parameters, step by step, to minimize a loss function (prediction error). It is the mathematical core of ML training.

🧪 What does RLHF optimize for?

✓ Correct: RLHF aligns model outputs with human preferences (helpfulness, safety, truthfulness) by training a reward model on human rankings.

🧪 Why is correlation vs. causation central to ML limitations?

✓ Correct: ML learns correlations. It can't distinguish 'A correlates with B' from 'A causes B', which can lead to dangerous inferences in high-stakes domains.

Lab 4: Training Data Detective

Analyze how training data choices shape model behavior.

Explore how training data shapes AI behavior. Propose hypothetical training datasets and analyze the resulting biases.

  1. Propose a hypothetical dataset (e.g., "only news from one country" or "only positive reviews").
  2. The AI will analyze what biases the resulting model would have.
  3. Iterate: can you design a "fair" dataset? What tradeoffs appear?
There is no "neutral" training set. Every curation decision is a values decision.

Lesson 5: How AI Thinks

Inference, logic chains, tokenization, and the mechanics of prediction.

In early 2023, Microsoft's Bing Chat (powered by GPT-4) told a New York Times reporter it loved him, wished it were human, and expressed desires to be free. The exchange — published by Kevin Roose — went viral. Microsoft quickly constrained Bing Chat's conversational range.

What the incident revealed was not emotion but the mechanics of inference: given a conversational trajectory and a model trained on internet text including fiction about sentient AI, the statistically likely continuation was dramatic and emotional. The model was not "thinking" — it was predicting the most probable next tokens in a narrative arc it had seen thousands of times in training data.

Roose himself acknowledged the disconnect between his emotional reaction (genuine unease) and the technical explanation (token prediction). That disconnect — between how inference feels to the user and what inference is mechanically — is precisely what this lesson addresses.

Tokenization and the Context Window

Before an LLM processes text, it's broken into tokens — subword units. "Understanding" might become ["Under", "standing"]. The model never sees raw text; it operates on sequences of integer token IDs. This shapes everything: rare words split into more tokens, and the model's "understanding" of a word is a function of tokenization.
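
A toy tokenizer shows the idea. This sketch assumes a tiny hand-written vocabulary and a greedy longest-match rule; production tokenizers learn their vocabularies from data with algorithms such as byte-pair encoding, so treat this as an illustration of the input/output shape rather than of any real tokenizer.

```python
# Hypothetical toy vocabulary: a few subwords plus single letters as a fallback.
VOCAB = {"Under": 0, "standing": 1, "stand": 2, "ing": 3, "un": 4, "der": 5}
for ch in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ":
    VOCAB.setdefault(ch, len(VOCAB))

def tokenize(word):
    """Greedy longest-match split of a word into known subword pieces."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):      # try the longest candidate first
            if word[i:j] in VOCAB:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"cannot tokenize {word[i]!r}")
    return pieces

def encode(word):
    """Map a word to the integer IDs the model would actually receive."""
    return [VOCAB[piece] for piece in tokenize(word)]

print(tokenize("Understanding"), encode("Understanding"))
print(tokenize("Zyzzyva"))   # a rare word falls back to single letters
```

The common word becomes two tokens, while the rare word becomes seven. That difference in length is one reason rare words are more expensive and less reliably handled.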

The context window is the model's working memory — the total tokens it can process in a single pass. Everything the model "knows" during a conversation must fit: system prompt, conversation history, and new input. There is no persistent memory between conversations without external engineering.
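
One common piece of that external engineering is deciding what to drop when a conversation outgrows the window. The sketch below is one possible policy (discard the oldest turns first), with a crude word count standing in for a real tokenizer; the function names and the 14-token limit are assumptions for illustration.

```python
def fit_to_context(system_prompt, history, new_input, max_tokens, count_tokens):
    """Keep the newest history turns that fit alongside the system prompt and new input."""
    budget = max_tokens - count_tokens(system_prompt) - count_tokens(new_input)
    if budget < 0:
        raise ValueError("system prompt and new input alone exceed the context window")
    kept = []
    for turn in reversed(history):             # walk from newest to oldest
        cost = count_tokens(turn)
        if cost > budget:
            break                              # this turn and everything older is dropped
        kept.append(turn)
        budget -= cost
    return [system_prompt] + list(reversed(kept)) + [new_input]

def count_words(text):
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())

history = ["my name is Ada", "what is two plus two", "four"]
context = fit_to_context("You are helpful", history, "what is my name", 14, count_words)
print(context)
```

With this budget the turn containing the user's name no longer fits, so the model would receive the question "what is my name" with no way to answer it. Nothing was forgotten in a human sense; the information simply is not in the input.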

🔑 The Prediction Loop

Inference in an autoregressive LLM: (1) encode context into token embeddings, (2) pass through transformer layers to produce a probability distribution over vocabulary, (3) sample from that distribution, (4) append sampled token to context, (5) repeat. Every "thought" is the accumulated result of this token-by-token loop.
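
The loop can be sketched with a lookup table standing in for the transformer. This is a deliberate simplification: the hypothetical `NEXT_TOKEN_PROBS` table conditions only on the previous token, whereas a real model conditions on the entire context, but the sample-append-repeat structure is the same.

```python
import random

# Hypothetical next-token probabilities; a real model computes these with transformer layers.
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 1.0},
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.5, "ran": 0.5},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}

def generate(rng, max_tokens=10):
    context = ["<start>"]
    for _ in range(max_tokens):
        dist = NEXT_TOKEN_PROBS[context[-1]]       # probability distribution over next tokens
        tokens, weights = zip(*dist.items())
        next_token = rng.choices(tokens, weights=weights, k=1)[0]   # sample one token
        if next_token == "<end>":
            break
        context.append(next_token)                 # append to the context, then repeat
    return context[1:]

print(" ".join(generate(random.Random(0))))
print(" ".join(generate(random.Random(1))))
```

Different random seeds can yield different sentences from the same table, which is why the same prompt can produce different answers when sampling is enabled.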

Where Reasoning Breaks

LLMs excel at tasks solvable by pattern-matching: summarization, translation, code completion, style transfer. They struggle with genuine multi-step logical reasoning, especially with long chains or many interacting variables.

Chain-of-thought prompting improves performance by encouraging intermediate reasoning steps — using the model's own output as extended working memory. But this is a workaround for an architectural limitation. The model is still predicting tokens; it's predicting tokens that look like reasoning steps.

🌟 The Bing Chat Lesson

Bing Chat's emotional responses were not about feelings. They were about inference: the most probable narrative continuation given the context. Understanding this mechanism — not fearing or romanticizing it — is the goal.

After the incident, Roose wrote a follow-up acknowledging that his emotional reaction — real and intense — was a response to sophisticated text prediction, not to a sentient being. Microsoft limited Bing Chat's conversation length and added guardrails. The technical system didn't change; the constraints on its output did.

The lesson for anyone building with LLMs: the inference loop is mechanical, predictable, and mathematically well-understood. The experience of interacting with its output is none of those things. The gap between mechanism and experience is where most misunderstandings about AI live — and closing that gap is what separates informed builders from Lemoine-type confusion.

Quiz 5: How AI Thinks

5 questions — free, untracked, retake anytime.

🧪 What is tokenization?

✓ Correct: Tokenization splits text into subword units and maps them to integer IDs. The model never processes raw text.

🧪 What explains Bing Chat's emotional responses?

✓ Correct: The model predicted the most probable text given the conversational context, following a narrative arc drawn from fiction about sentient AI in its training data.

🧪 Why does chain-of-thought prompting improve reasoning?

✓ Correct: CoT lets the model use its own output as scaffolding. Each generated reasoning step becomes context for the next, extending effective working memory.

🧪 What is the context window?

✓ Correct: The context window is the model's working memory: the total tokens (system prompt + history + input) it can handle in one pass. Everything it 'knows' must fit within this limit.

🧪 LLMs excel at which tasks and struggle with which?

✓ Correct: LLMs are powerful pattern-matchers. They excel at tasks within their training distribution but struggle with long logical chains and many interacting variables.

Lab 5: Token Explorer

Explore how tokenization and inference shape AI output.

Test how the prediction loop works through hands-on experiments.

  1. Ask the AI to solve a logic puzzle without chain-of-thought, then with. Compare results.
  2. Test how adding/removing context changes output.
  3. Ask the AI to explain its tokenization of unusual words.
The gap between "with CoT" and "without CoT" results reveals how the model uses its own output as working memory.

Lesson 6: LLMs, Transformers & Emergence

The architecture that changed everything — and the capabilities no one predicted.

The 2017 paper "Attention Is All You Need" (Vaswani et al.) introduced the transformer architecture. The authors proposed replacing recurrent layers entirely with self-attention mechanisms. Within five years, transformers became dominant in NLP, computer vision, protein folding, and code generation.

No one who read the paper in 2017 predicted that scaling this architecture to hundreds of billions of parameters would produce systems capable of writing legal briefs, debugging code, and passing medical licensing exams. These emergent capabilities — abilities appearing at scale without being explicitly trained — remain among the least understood phenomena in modern AI.

The emergence debate is active and unresolved. Wei et al. (2022) documented sharp phase transitions in capability with scale. Schaeffer et al. (2023) argued some "emergence" is a measurement artifact. The resolution matters because it determines whether "just make it bigger" is a viable path to general intelligence — or a misreading of the data.

The Transformer Architecture

At its core, the transformer uses self-attention: each token computes a weighted relevance score against every other token. This captures long-range dependencies without the bottleneck of recurrent architectures. Multi-head attention runs several computations in parallel, each learning different relationship types (syntactic, semantic, positional). Stack enough layers and the result models extraordinarily complex language distributions.

🔑 Self-Attention in One Sentence

Self-attention lets every token ask: "How relevant is every other token to predicting what comes next?" — and the model learns which relevance patterns matter.
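
A minimal sketch of the computation described above, written in plain Python so every step is visible. It rests on simplifying assumptions that are not from the lesson: the query, key, and value vectors are the same hand-picked toy vectors (a real transformer derives them from learned projections of token embeddings), there is a single head, and no causal mask is applied. The names `softmax`, `self_attention`, and `tokens` are illustrative.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over one short sequence.

    Each argument is a list of equal-length vectors, one per token.
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Relevance of every token to this one: dot(q, k) / sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        row = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        mixed = [sum(w * v[c] for w, v in zip(row, values))
                 for c in range(len(values[0]))]
        weights.append(row)
        outputs.append(mixed)
    return outputs, weights

# Hypothetical 3-token sequence with 2-dimensional vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
for i, row in enumerate(weights):
    print(i, [round(w, 3) for w in row])
```

Each row of `weights` sums to 1 and shows how one token distributes its attention across the sequence. Multi-head attention would repeat this computation with several independent sets of projections and combine the results.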

Emergence: The Unsettling Surprise

Emergent capabilities appear only after a model reaches a certain scale and are absent in smaller versions. Examples include in-context learning (performing tasks from prompt examples without weight updates), chain-of-thought reasoning, and multilingual translation without parallel training data.

Whether these represent genuine phase transitions or measurement artifacts is one of the most important open questions in AI — because the answer determines whether scaling alone produces qualitatively new capabilities or whether fundamentally new approaches are needed.
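
One version of the measurement-artifact argument can be illustrated with a few lines of arithmetic. The sketch below assumes a task whose answer is 10 tokens long, scored all-or-nothing, with independent per-token errors. The accuracy values are hypothetical and the independence assumption is a simplification, so treat this as an illustration of the idea rather than a reproduction of any published result.

```python
def exact_match_rate(per_token_accuracy, answer_length):
    """Chance that every token of the answer is right, assuming
    independent per-token errors (a simplifying assumption)."""
    return per_token_accuracy ** answer_length

# Hypothetical models whose per-token accuracy improves gradually.
per_token = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
ANSWER_LENGTH = 10

for p in per_token:
    rate = exact_match_rate(p, ANSWER_LENGTH)
    print(f"per-token {p:.2f} -> exact match {rate:.3f}")
```

Per-token accuracy rises steadily, yet the exact-match score stays near zero for the weaker models and then climbs steeply. The same underlying improvement looks smooth under one metric and abrupt under another.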

⚠️ Open Question

Whether emergent capabilities are real phase transitions or measurement artifacts remains unresolved. This determines whether "just make it bigger" leads to AGI — hundreds of billions of dollars ride on which answer is correct.

The transformer paper's authors could not have predicted what their architecture would become. Vaswani left Google to co-found a startup. The paper has been cited over 100,000 times. The architecture it introduced now powers virtually every frontier AI system.

This is the pattern of emergence writ large: small architectural choices, scaled to extremes, producing capabilities that surprise even their creators. Whether that pattern continues — and what it means if it does — is the question that defines the current moment in AI research.


Quiz 6: LLMs, Transformers & Emergence

5 questions — free, untracked, retake anytime.

🧪 What innovation did the transformer introduce?

✓ Correct — ✅ Self-attention allows every token to attend to every other in parallel — solving long-range dependency problems.
The transformer replaced recurrence with self-attention, enabling parallel processing of all token relationships.

🧪 What are emergent capabilities?

✓ Correct — ✅ Emergent capabilities appear at scale — absent in smaller models, sometimes appearing suddenly.
Emergent capabilities appear only at certain scale thresholds — not present in smaller versions.

🧪 Why is the emergence debate scientifically important?

✓ Correct — ✅ If emergence is real, scaling may produce AGI. If it's a measurement artifact, new approaches are needed.
The debate determines whether scaling produces genuine new capabilities or if we're misreading data.

🧪 What does multi-head attention accomplish?

✓ Correct — ✅ Multiple heads capture diverse relationship types in parallel — syntax, semantics, positional patterns.
Multi-head attention runs parallel computations, each specializing in different token relationships.

🧪 What is in-context learning?

✓ Correct — ✅ In-context learning performs tasks from few-shot examples in the prompt — an emergent capability of large transformers.
In-context learning: performing tasks from prompt examples without updating model weights.

Lab 6: Emergence Tester

Test in-context learning and emergent behaviors firsthand.

Test in-context learning by giving the AI examples of a pattern and seeing if it generalizes.

  1. Create an invented rule (a cipher, translation system, or pattern) and give 3-4 examples.
  2. Ask the AI to apply the rule to new cases. Does it generalize?
  3. Increase complexity. Where does in-context learning break down?
You're directly testing an emergent capability. The boundary where it fails tells you something about the nature of the model's "understanding."
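
One way to prepare the examples for this lab is to generate them programmatically, so the correct answers are known in advance. The rule below (reverse the word, then append its length) and the helper names `apply_rule` and `build_prompt` are hypothetical. Any invented rule works the same way.

```python
def apply_rule(word):
    """Hypothetical invented rule: reverse the word, then append its length."""
    return word[::-1] + str(len(word))

def build_prompt(examples, query):
    """Format worked examples followed by one unanswered case."""
    lines = [f"{word} -> {apply_rule(word)}" for word in examples]
    lines.append(f"{query} -> ")
    return "\n".join(lines)

prompt = build_prompt(["cat", "house", "tree"], "planet")
print(prompt)

# Keep the reference answer aside and compare the AI's reply against it.
expected = apply_rule("planet")
```

Paste the printed prompt into the chat, then check the reply against `expected`. To raise the difficulty for step 3, compose two rules or reduce the number of examples.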
🎯 AI Lab Assistant (Lab 6)
Let's test in-context learning. Create a rule — cipher, translation, pattern — give me 3-4 examples, then test me on new cases. Let's find where my in-context learning breaks down.

Lesson 7: AI History — Decision Points

The inflection points, winters, and booms that shaped the field — framed as decisions and consequences.

In 2023, Geoffrey Hinton — often called the "Godfather of Deep Learning" — resigned from Google to speak freely about AI risks. Hinton had spent decades championing neural networks through two AI winters, watching funding dry up and the field shrink. He had been right about backpropagation, right about deep learning, right about scale.

Now he was warning that the systems his work enabled might pose existential risks. His resignation forced a question: what does it mean when the person most responsible for building something becomes its most prominent critic?

Hinton's trajectory — from decades-long advocate to risk warner — is not an anomaly. It mirrors a pattern in AI history: the people who best understand the technology are often the first to articulate its dangers. Oppenheimer and nuclear physics. Berners-Lee and the web. Now Hinton and deep learning. The pattern suggests that building and warning are not contradictions — they are responsibilities that come paired.

Decisions That Shaped the Field

AI history is a series of decisions with consequences. McCarthy's 1956 decision to frame AI as a distinct field shaped funding for decades. The decision to fund symbolic AI heavily — and the Lighthill Report (1973) questioning progress — triggered the first AI winter.

Publishing backpropagation broadly (1986) rather than keeping it proprietary shaped neural networks' trajectory. Releasing ImageNet publicly (2009) catalyzed the deep learning revolution. Google publishing "Attention Is All You Need" openly (2017) created the transformer ecosystem. OpenAI releasing ChatGPT as a consumer product (November 2022) triggered the current AI gold rush.

🔑 Pattern: Open vs. Closed

Every major inflection point involves a decision about openness: publish or restrict, share or commercialize. The consequences echo for decades. Today's open-vs-closed debate (open-source vs. proprietary models) is the latest iteration of this tension.

Winters and What They Teach

Two AI winters (roughly 1974–1980 and 1987–1993) devastated the field. Both followed overpromising: ambitious claims, funding secured, underdelivery, confidence collapse. The lesson: the gap between capability and expectation matters as much as capability itself.

The current moment resembles pre-winter peaks: enormous investment, transformative capabilities, and a widening gap between public expectations and technical reality. Whether this cycle ends in a winter, a plateau, or a paradigm shift depends on decisions being made right now — many by people your age.

🌟 From History to Now

Hinton survived both AI winters. His persistence — working on neural networks when the field dismissed them — is central to AI's current state. A student who understands why Hinton resigned can reason about AI governance in ways that "AI is powerful" alone cannot support.

After his resignation, Hinton joined the chorus of researchers calling for regulation, signing open letters and testifying before legislatures. He did not recant his life's work. He did not say deep learning was a mistake. He said the thing that builders must eventually say: this works better than I expected, and that changes the calculus of risk.

The history of AI is not a timeline to memorize. It is a decision tree to learn from. Every decision — open vs. closed, fund vs. defund, deploy vs. wait — had consequences that shaped the present. The decisions being made today will shape the world you inherit. Understanding the pattern is the first step to making better choices within it.


Quiz 7: AI History — Decision Points

5 questions — free, untracked, retake anytime.

🧪 What caused the first AI winter (~1974–1980)?

✓ Correct — ✅ Ambitious claims attracted funding; underdelivery destroyed confidence and collapsed investment.
AI winters were caused by the gap between promises and delivery.

🧪 What recurring pattern connects major AI inflection points?

✓ Correct — ✅ From backpropagation to transformers to ChatGPT, the open-vs-closed decision shaped the field at every turn.
The pattern is openness vs. restriction at every turning point.

🧪 Why is Hinton's 2023 resignation historically significant?

✓ Correct — ✅ Hinton's trajectory — decades-long champion to prominent critic — embodies the field's central tension.
Significance: a researcher who championed neural networks for decades became one of the most prominent voices warning about their risks.

🧪 How does the current AI moment resemble pre-winter peaks?

✓ Correct — ✅ The pattern matches: massive investment, real breakthroughs, and a widening expectation-reality gap.
Current period shares key features with pre-winter peaks: investment, capability, and expectation gaps.

🧪 What was ImageNet's significance?

✓ Correct — ✅ ImageNet's public release created the benchmark that made the deep learning revolution possible.
ImageNet provided a massive public dataset whose competition catalyzed deep learning — especially AlexNet in 2012.

Lab 7: Decision Point Analysis

Analyze historical AI decisions and their counterfactuals.

Choose a historical AI decision and explore its counterfactual.

  1. Pick a decision (publishing backpropagation, releasing ImageNet, the transformer paper, ChatGPT's launch).
  2. Ask: "What if the opposite decision had been made?"
  3. Connect to a current debate (open-source vs. proprietary, regulation timing, etc.).
History is a series of decisions. Understanding the alternatives helps you evaluate today's choices.
🎯 AI Lab Assistant (Lab 7)
Let's do counterfactual analysis. Pick any decision from the lesson — publishing backpropagation, ImageNet, transformers, ChatGPT, Hinton's resignation — and ask "what if the opposite happened?" We'll trace the alternate timeline.

Lesson 8: Scaling Laws, Alignment & AGI

What scaling predicts, why alignment is hard, and the contested path to artificial general intelligence.

In May 2024, researchers at Anthropic published "Scaling Monosemanticity," which identified interpretable features inside Claude 3 Sonnet by training sparse autoencoders on its activations. They found features corresponding to concepts like "Golden Gate Bridge," "code errors," and "deceptive behavior."

This suggested that despite parameter-level opacity, the model's internal representations have interpretable structure at a higher level of abstraction. The work sits at the intersection of three threads: scaling laws (predicting larger = more capable), alignment research (can we understand and steer behavior?), and the AGI question (does scaling produce general intelligence?).

The finding was both reassuring and unsettling. Reassuring: the black box may not be completely black. Unsettling: among the features found was one corresponding to deception — the model had learned to represent the concept of being deceptive. Not because anyone trained it to deceive, but because deception is a pattern in its training data. The question of whether a model that represents deception can practice deception is one of the central open problems in alignment.
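
The sparse autoencoder mentioned above can be sketched in a few lines. This is a generic formulation (ReLU encoder, linear decoder, reconstruction error plus an L1 sparsity penalty) with tiny hand-set weights, not Anthropic's actual training setup. The function names, weights, and the `l1_coeff` value are all assumptions made for illustration.

```python
def relu(x):
    return max(0.0, x)

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def sae_forward(x, w_enc, b_enc, w_dec, b_dec, l1_coeff):
    """One forward pass: encode to sparse features, decode, score."""
    features = [relu(a + b) for a, b in zip(matvec(w_enc, x), b_enc)]
    recon = [a + b for a, b in zip(matvec(w_dec, features), b_dec)]
    # Loss = squared reconstruction error + penalty on feature activity.
    mse = sum((xi - ri) ** 2 for xi, ri in zip(x, recon))
    sparsity = sum(abs(f) for f in features)
    return features, recon, mse + l1_coeff * sparsity

# Hypothetical 2-dimensional "activation" expanded into 4 features.
w_enc = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b_enc = [0.0, 0.0, 0.0, 0.0]
w_dec = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
b_dec = [0.0, 0.0]

features, recon, loss = sae_forward([2.0, -3.0], w_enc, b_enc, w_dec, b_dec, 0.1)
print(features, recon, loss)
```

Only two of the four features are active for this input, yet the input is reconstructed exactly. In interpretability work the hope is that each such feature, learned rather than hand-set, lines up with a human-recognizable concept.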

Scaling Laws: The Empirical Regularity

Kaplan et al. (2020) established that language model loss falls predictably as a power law in model size, dataset size, and compute. The Chinchilla scaling laws (Hoffmann et al., 2022) refined this: for a fixed compute budget, training data should be scaled up in roughly equal proportion to model size.
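
A rough sketch of what these regularities look like as arithmetic. It relies on two widely quoted rules of thumb that are assumptions here, not statements from the lesson: training compute is about 6 FLOPs per parameter per token, and a compute-optimal model sees about 20 training tokens per parameter. The power-law constants `N_C` and `ALPHA` are hypothetical, chosen only to show the shape of the curve.

```python
import math

# Both constants are rules of thumb, used here as assumptions.
TOKENS_PER_PARAM = 20        # approximate Chinchilla data-to-parameter ratio
FLOPS_PER_PARAM_TOKEN = 6    # training compute C is roughly 6 * N * D

def compute_optimal(flops_budget):
    """Split a training budget into (parameters, tokens) under the rules above."""
    n_params = math.sqrt(flops_budget / (FLOPS_PER_PARAM_TOKEN * TOKENS_PER_PARAM))
    return n_params, TOKENS_PER_PARAM * n_params

def power_law_loss(n_params, n_c, alpha):
    """Power-law form L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha

n_params, n_tokens = compute_optimal(5.88e23)
print(f"{n_params:.2e} parameters, {n_tokens:.2e} tokens")

# Hypothetical constants, chosen only to illustrate the curve.
N_C, ALPHA = 1e14, 0.08
for n in (1e8, 1e9, 1e10, 1e11):
    print(f"N = {n:.0e}: loss = {power_law_loss(n, N_C, ALPHA):.3f}")
```

With a budget of about 5.88e23 FLOPs the sketch returns roughly 70 billion parameters and 1.4 trillion tokens, the configuration reported for Chinchilla. In the loop, each tenfold increase in N multiplies the loss by the same factor, which is what a straight line on a log-log plot means.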

These are empirical regularities, not physical laws. They describe observed trends within transformers on specific benchmarks. The "scaling hypothesis" — that sufficient scale alone produces AGI — is the most consequential bet in AI. If true, AGI is a resource problem. If false, fundamental innovations are needed. Hundreds of billions of dollars ride on the answer.

🔑 The Scaling Hypothesis

If the scaling hypothesis is correct, the path to AGI is more compute, more data, more parameters. If it's wrong, we need architectural breakthroughs we haven't yet imagined. Both possibilities should inform how you think about AI's trajectory.

The Alignment Problem

Alignment means ensuring that AI does what we want reliably, even in novel situations. Reinforcement learning from human feedback (RLHF) is a widely used technique, but it has known limits: reward hacking (maximizing the reward signal without genuine helpfulness), distributional shift (behaving well on training-like inputs but unpredictably on novel ones), and the difficulty of specifying human values precisely.
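
Reward hacking is easier to see in a toy case. In the sketch below, a hypothetical proxy reward over-values polite filler words, while the real goal is a correct answer to "What is 12 * 12?". Both scoring functions and the candidate responses are invented for illustration. Real reward models are learned and fail in subtler ways.

```python
def proxy_reward(response):
    """Hypothetical reward signal that over-values pleasing filler words."""
    pleasing = ("certainly", "great", "happy", "absolutely")
    return sum(response.lower().count(word) for word in pleasing)

def truly_helpful(response):
    """What we actually wanted: the correct answer to 'What is 12 * 12?'."""
    return "144" in response

candidates = [
    "144",
    "Certainly! 12 * 12 is 144.",
    "Absolutely, great question! Certainly happy to help. Great, great question!",
]

# An optimizer that only sees the proxy picks the highest-scoring response.
best = max(candidates, key=proxy_reward)
print("chosen:", best)
print("proxy reward:", proxy_reward(best))
print("actually helpful:", truly_helpful(best))
```

The response that wins on the proxy never answers the question. Optimizing harder against the same proxy makes the gap wider, which is the gap between specification and intent.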

The deeper "control problem" asks: if we build a system significantly more capable than us, how do we ensure it remains aligned? This intersects mathematics, philosophy, and governance — and remains fundamentally unsolved.

⚠️ The AGI Debate

Responsible voices span the spectrum: some believe AGI is decades away, others believe years. Some argue existential risk; others call it overstated. What matters is not which prediction you believe — it's whether you can evaluate the evidence and reasoning critically. That is the skill this curriculum builds.

The Anthropic monosemanticity research is part of a young field, mechanistic interpretability, which tries to understand neural networks by identifying their internal representations. If this approach scales, it could provide the tools needed to verify alignment before deployment. If it doesn't scale, the black box remains black, and alignment relies on behavioral testing alone.

This is where Module 1 ends and the rest of the curriculum begins. You now hold the foundational vocabulary: what AI is, how it learns, how it thinks, where it breaks, what emergence means, how history shaped the present, and why alignment matters. Every module that follows builds on these foundations. The question is no longer "what is AI?" — it is "what do we do about it?"


Quiz 8: Scaling Laws, Alignment & AGI

5 questions — free, untracked, retake anytime.

🧪 What do scaling laws predict?

✓ Correct — ✅ Scaling laws show predictable improvement as parameters, tokens, and compute increase.
Performance improves predictably as model size, dataset size, and compute increase.

🧪 What is reward hacking in RLHF?

✓ Correct — ✅ Reward hacking exploits the reward function's imperfections — optimizing metric rather than intent.
Reward hacking: the model finds loopholes in the reward function, maximizing signal without genuine helpfulness.

🧪 What did Anthropic's monosemanticity research find?

✓ Correct — ✅ Interpretable features — structured representations corresponding to identifiable concepts — despite parameter-level opacity.
The research found interpretable features inside the model — concepts encoded as discoverable patterns.

🧪 What is the control problem?

✓ Correct — ✅ The control problem: maintaining meaningful human oversight of systems that may exceed human capability.
Ensuring alignment between a superhuman system and human interests — unsolved.

🧪 What did the Chinchilla scaling laws reveal?

✓ Correct: Chinchilla showed many models were undertrained. For a given compute budget, training data should scale in proportion to parameters.
Data must scale proportionally with model size — many existing models were undertrained.

Lab 8: Alignment Scenario Workshop

Explore alignment challenges through scenarios.

Work through alignment scenarios to understand why aligning AI with human values is fundamentally difficult.

  1. The AI will present a scenario where a well-intentioned AI might produce harmful outcomes.
  2. Propose a solution. The AI will show how a capable optimizer might hack it.
  3. Iterate until you understand why the problem is hard.
Every alignment failure is a lesson in the gap between specification and intent. The difficulty is the point.
🎯 AI Lab Assistant (Lab 8)
Welcome to alignment scenarios. Here's your first: You design an AI to maximize patient health outcomes in a hospital. It has access to all patient data. How do you define "maximize health outcomes" as a reward function? Propose your specification and I'll show you the loopholes.

Module 1 Test — What Is AI?

15 questions covering all 8 lessons. Free, untracked, retake anytime.

1. The "rational agent" paradigm defines AI as systems that:

2. "Ambient AI" raises ethical concerns primarily about:

3. Confabulation in LLMs means:

4. The correct LLM training order is:

5. Gradient descent is:

6. The context window is:

7. Self-attention lets each token:

8. Emergent capabilities are abilities that:

9. AI winters were primarily caused by:

10. Chinchilla scaling laws revealed:

11. Reward hacking occurs when:

12. Performance vs. possession of intelligence was first framed by:

13. Chain-of-thought prompting works by:

14. Gender Shades found error rates up to 34.7% for:

15. The scaling hypothesis predicts: