Module 3 · Lesson 1

When AI Gets It Wrong

Confident, fluent, and completely fabricated — how do you tell the difference?

If an AI speaks with total confidence, does that make it trustworthy?

Steven Schwartz had been a lawyer for thirty years. In early 2023, facing a tight deadline on a personal-injury case, he used ChatGPT to help research relevant court decisions. The AI delivered a clean list of cases: full names, citation numbers, confident legal summaries. The citations went into a brief that was filed with the court.

The opposing lawyers could not find the cases. The judge could not find them either. When pressed, Schwartz checked — and discovered that every single case ChatGPT had cited was invented. Made up. The names, the citation numbers, the summaries — all fabricated, with no real legal precedent behind them.

Schwartz later said he had no idea the AI could generate "non-existent cases." He had asked ChatGPT whether the cases were real, and it said yes. The judge sanctioned him — a formal legal punishment — and the story made headlines around the world. A thirty-year legal career, nearly destroyed by trusting an AI that spoke like it knew exactly what it was talking about.

What Just Happened There?

ChatGPT did not lie in the way a person lies. It did not look up real cases, decide to hide them, and invent fake ones instead. What it did was stranger than lying: it generated text that sounded like legal citations — because that's what legal citations look like in the text it was trained on — without having any actual knowledge of whether those cases existed.

This is called a hallucination. When an AI produces text that is fluent and confident but factually wrong — or entirely made up — that is a hallucination. The word is borrowed from psychology, where it means perceiving something that isn't there. AI hallucinations are not random gibberish. They are polished, plausible, well-formatted falsehoods.

Here is the key thing that makes hallucinations so dangerous: an AI cannot tell you when it is hallucinating. It doesn't know. When Schwartz asked ChatGPT "are these real cases?" the AI said yes — not because it was trying to deceive him, but because it generates the most likely-sounding response, and "yes, these are real" was a more likely-sounding response than "actually, I just made those up."

Hallucination: When an AI produces output that sounds confident and correct but is factually wrong or entirely invented. The AI has no way to flag this as an error.

Why Does This Happen?

Language models like ChatGPT, Claude, and Gemini are trained to predict what comes next in a sequence of words. They are extraordinarily good at this. If you give them the beginning of a legal citation, they will produce a plausible-looking ending — because they have read millions of legal citations and learned the pattern.

But knowing a pattern is not the same as knowing the facts. Think of it this way: if you practiced copying someone's handwriting for years, you could eventually write in their style very convincingly. But you would not know what they were actually thinking when they originally wrote. The AI knows the style of a legal citation. It does not know whether the case behind that citation actually exists in a courthouse somewhere.

This is not a bug that engineers will eventually fix. It is built into how these systems work. The models are pattern-matchers at a massive scale. Hallucinations happen most often when the AI is asked about specific facts — names, dates, citations, statistics — where getting the pattern right and getting the fact right are two completely different things.
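The gap between pattern and fact can be seen in a toy next-word predictor. The sketch below is a deliberately tiny stand-in, not how ChatGPT, Claude, or Gemini are actually built: it only counts which word tends to follow which. The corpus, the function names `train_bigrams` and `complete`, and the prompt are all invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny made-up "training corpus". Every sentence in it is true.
CORPUS = [
    "paris is the capital of france",
    "rome is the capital of italy",
    "madrid is the capital of spain",
    "lima is the capital of peru",
]

def train_bigrams(sentences):
    """Count which word follows which word in the training text."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def complete(prompt, follows, max_words=10):
    """Extend the prompt by repeatedly picking the most common next word."""
    words = prompt.split()
    for _ in range(max_words):
        options = follows.get(words[-1])
        if not options:
            break  # nothing ever followed this word in training
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

model = train_bigrams(CORPUS)
# "berlin" never appeared in training, but the pattern still completes.
print(complete("berlin is", model))
```

The completion is grammatical and has exactly the shape of the training sentences, yet it names a country that Berlin is not the capital of. Nothing in the code checks the claim against the world, because the only thing it learned was which words tend to follow which.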

Where Hallucinations Are Most Common

Legal citations. Scientific paper references. Historical dates and quotes. Statistics and percentages. Names of specific people and what they said or did. Any time an AI gives you a very specific fact, that's the moment to verify it independently — not because the AI is always wrong, but because you have no way to tell when it is.

The Confidence Problem

The thing that makes hallucinations so hard to catch is that they come wrapped in confidence. AI models don't say "I'm not sure, but maybe..." the way a person who is guessing might. They produce smooth, fluent text in exactly the same tone whether they are absolutely correct or completely fabricating something.

Humans use confidence as a signal. When someone speaks slowly, hesitates, or says "I think..." we know to check. When someone speaks quickly and clearly, we're wired to take it as evidence they know what they're talking about. That rule of thumb is a heuristic (a word that means a mental shortcut you use to make fast judgments), and AI breaks it. The AI's confidence level tells you nothing about its accuracy.

This is why the Schwartz case happened. He was using a shortcut that works perfectly well with human experts: if someone speaks like an expert, they probably are one. With AI, that shortcut fails.

You Now See What Most People Miss

When most people use an AI and get a confident, well-formatted answer, they stop there. You now understand that fluency and confidence are features of how the text was generated — they say nothing about whether the facts are real. Every specific claim an AI makes is a claim you should be able to verify from a separate source. That is not paranoia. That is how professionals who actually understand these tools use them.

An Ethical Question With No Clean Answer

Here's the hard part: AI tools genuinely help people. They help lawyers draft briefs faster. They help students understand concepts. They help doctors find research they might have missed. Banning them outright would mean losing those benefits.

But the Schwartz case shows the cost of using them carelessly. And here's what makes it genuinely complicated: not everyone has the time, resources, or expertise to verify everything an AI tells them. A student in a school with one overtaxed teacher might rely on AI for homework help and have no realistic way to fact-check every sentence. Should the responsibility fall on the user who trusted a confident-sounding tool? Or on the company that built a tool capable of confident fabrication and deployed it widely?

There is no clean answer. Think about where you stand on that — and whether your answer would change depending on who the user was.

Lesson 1 Quiz

When AI Gets It Wrong

1. Steven Schwartz's ChatGPT-generated legal citations were dangerous primarily because they were:

Correct. The danger was exactly that they looked and sounded like real citations. That's what makes AI hallucinations so hard to catch — they're not obviously broken.

Not quite. The reason they caused so much damage was that they were indistinguishable from real citations — fluent, formatted, and completely fabricated.

2. When ChatGPT told Schwartz that its invented cases were real, this happened because:

Correct. The AI doesn't check facts — it generates text that fits the pattern of how a response would sound. "Yes, those are real" fits that pattern better than an admission of fabrication.

Incorrect. The AI has no intent to deceive. It produces the most statistically likely-sounding text — and confirmation sounds more natural than "I made that up."

3. A student uses an AI to research a historical speech and gets a detailed quote attributed to Abraham Lincoln with a specific date. What should they do?

Exactly right. Specific names, dates, and quotes are exactly the category most likely to be hallucinated. Verify against a primary source — a real document, a reliable archive.

Specificity doesn't signal accuracy in AI outputs. And asking the AI to confirm its own answer is circular — it will still generate likely-sounding text. Verify independently.

4. The word "heuristic" in this lesson means:

Correct. A heuristic is a fast judgment rule. "Confident speaker = knows what they're talking about" is a heuristic — useful with humans, broken with AI.

A heuristic is a mental shortcut — a fast rule for making judgments. The lesson used it to explain why Schwartz trusted the AI: he applied a human heuristic to a non-human system.

5. Which of these would be the MOST reliable way to check whether an AI-generated statistic is accurate?

Correct. Multiple AI tools can all hallucinate the same wrong statistic — they may share training data. The only real check is finding the original source.

AI confidence ratings aren't reliable, multiple AI tools can share the same hallucination, and something sounding reasonable is a heuristic the lesson specifically warns against. Go to the source.

Lab 1

The Hallucination Investigator

Your job: interrogate an AI about its own outputs

Your Role: Fact-Check Auditor

You've just been handed five "facts" that an AI generated for a student's history essay. Your job is to figure out which ones are real and which might be hallucinated — using the methods from Lesson 1.

Talk to the AI lab assistant below. It will play the role of an AI that generated some questionable outputs. Challenge it, ask how it knows things, demand sources. It won't always give you satisfying answers — that's part of the exercise.

Start by asking: "What's a hallucination red flag I should look for in AI-generated history facts?" — then push back on whatever it tells you.

AI Lab Partner Hallucination Auditor Mode

You're the auditor. I'm an AI that just produced a set of "facts" for a history essay. Some are real. Some might be completely invented. Your job is to figure out which are which — and how. Where do you want to start?

Module 3 · Lesson 2

The Machine Learned It From Us

AI bias is not a glitch. It's a mirror.

If an AI learned everything it knows from human decisions, whose fault is it when it discriminates?

In 2016, a journalist at ProPublica named Julia Angwin published an investigation that shook the American criminal justice system. Across the country, judges were using a software tool called COMPAS — made by a company called Northpointe — to help them make sentencing decisions. COMPAS analyzed data about a defendant and produced a "recidivism risk score": a number from 1 to 10 predicting how likely that person was to commit another crime.

Angwin's team analyzed the COMPAS scores for over 7,000 people arrested in Broward County, Florida, and matched them against who actually re-offended over the next two years. What they found was stark: Black defendants were nearly twice as likely as white defendants to be falsely flagged as high-risk when they went on to commit no further crime. White defendants who did re-offend were more often incorrectly labeled low-risk.

The algorithm had never been told anyone's race. It used factors like employment history, residential stability, age at first arrest. But those factors were themselves shaped by decades of unequal policing, economic inequality, and systemic discrimination. The AI had learned the patterns in biased historical data — and faithfully reproduced those patterns as a "risk score" that real judges used to determine how long real people went to prison.

What Bias Actually Means Here

When most people hear the word "bias," they imagine a person who consciously dislikes someone. But AI bias is different — and in some ways more insidious (a word that means harmful in a way that's hard to see). The COMPAS algorithm was not trying to discriminate. Its designers at Northpointe were not consciously trying to disadvantage Black defendants. But the data it learned from was not neutral.

Here's the mechanism: training data is the historical information that an AI learns from. If that data reflects real-world inequalities — who gets arrested, who gets hired, who gets loans — then the AI learns those inequalities as if they were facts of nature, not facts of history. It treats patterns in discriminatory data as reliable predictors.

There is a phrase for this: garbage in, garbage out. If the data going in carries a bias, the decisions coming out will carry that bias too, even if the algorithm itself has no explicit rule about race, gender, or any other characteristic.

Training Data Bias: When an AI's historical learning data reflects real-world inequalities, the AI will reproduce those inequalities in its outputs, even without being explicitly programmed to do so.
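A small sketch can make the mechanism concrete. The hiring records below are entirely made up, and the scoring method (rate each word by how often past resumes containing it were hired) is a simplification chosen for this example, not a description of any real system. Gender is never an input, but one word in the data happens to track it.

```python
from collections import defaultdict

# Hypothetical past hiring records: (words on the resume, was the person hired?).
# Gender is never recorded, but one word happens to track it.
HISTORY = [
    ({"python", "chess", "captain"}, True),
    ({"python", "robotics"}, True),
    ({"java", "chess"}, True),
    ({"python", "women's", "chess", "captain"}, False),
    ({"java", "women's", "robotics"}, False),
    ({"java", "robotics"}, True),
]

def learn_word_scores(history):
    """Score each word by the hire rate of past resumes that contained it."""
    seen = defaultdict(int)
    hired = defaultdict(int)
    for words, was_hired in history:
        for word in words:
            seen[word] += 1
            hired[word] += was_hired  # True counts as 1, False as 0
    return {word: hired[word] / seen[word] for word in seen}

def score_resume(words, scores):
    """Average the learned scores of the words the model recognizes."""
    known = [scores[w] for w in words if w in scores]
    return sum(known) / len(known) if known else 0.0

scores = learn_word_scores(HISTORY)
same_skills = {"python", "chess", "captain"}
print(score_resume(same_skills, scores))               # resume without the word
print(score_resume(same_skills | {"women's"}, scores)) # same skills, one extra word
```

The second resume lists the same skills and scores lower, only because past decisions in the data went against resumes containing that word. No rule about gender was written anywhere; the penalty was learned from the history.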

This Is Not Just a Justice System Problem

The COMPAS case is dramatic because the stakes were literally years of someone's freedom. But the same mechanism appears in everyday AI systems:

In 2018, Reuters reported that Amazon had scrapped an AI hiring tool that was downgrading resumes containing the word "women's," as in "women's chess club captain." The AI had been trained on ten years of Amazon's historical hiring data, which predominantly reflected male hires in technical roles. It learned that "male-pattern resumes" predicted success and penalized patterns associated with women.

In 2015, Google Photos auto-tagged photos of Black people as "gorillas." This horrific failure is widely attributed to training data in which dark-skinned faces were badly underrepresented. A system that has seen far fewer examples of some faces than others is far more likely to make catastrophic errors on them.

These are not three separate bugs. They are three examples of the same problem: AI systems trained on data from an unequal world will reflect that inequality back — often in the areas where it hurts people the most.

The Amazon Resume Case — 2018

Amazon's AI hiring tool was quietly downgrading applications from women without anyone telling it to. According to Reuters, which broke the story in 2018, the company noticed the problem internally and eventually shut the project down. Amazon never deployed it publicly, but for companies that did deploy similar tools without rigorous auditing, the same bias may have been affecting real hiring decisions without anyone knowing.

The Accountability Gap

Here is what makes the COMPAS story particularly important for understanding where AI is heading: when Julia Angwin's team published their findings, Northpointe disputed the methodology. They argued the algorithm was fair by a different mathematical definition of fairness. And here's the thing: both sides were using valid math. Different mathematical definitions of "fair" genuinely conflict with each other. When groups have different underlying re-offense rates in the data, you cannot satisfy all of them at once. This is not a theoretical problem. It is a real constraint.

This means that when someone tells you an AI has been tested for bias and found to be fair, you should ask: fair by which definition? For which group? Under what conditions? Fairness in AI is not a single thing you either have or don't have. It is a set of competing trade-offs, and someone made choices about those trade-offs — choices that are often buried in technical documentation that no one outside the company ever reads.
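A worked example shows how two reasonable definitions of "fair" can disagree about the same tool. All counts below are invented for illustration and are not the Broward County figures. The two measures are simplified stand-ins for the kinds of definitions at issue: one asks whether a high-risk flag means the same thing in each group, the other asks whether people who did not re-offend were wrongly flagged at the same rate.

```python
from dataclasses import dataclass

@dataclass
class Outcomes:
    """Counts for one group. All numbers used below are invented."""
    flagged_reoffended: int   # flagged high-risk and did re-offend
    flagged_did_not: int      # flagged high-risk but did not re-offend
    cleared_reoffended: int   # labeled low-risk but did re-offend
    cleared_did_not: int      # labeled low-risk and did not re-offend

    def precision(self):
        """Of the people flagged high-risk, what share re-offended?"""
        flagged = self.flagged_reoffended + self.flagged_did_not
        return self.flagged_reoffended / flagged

    def false_positive_rate(self):
        """Of the people who did not re-offend, what share were flagged anyway?"""
        did_not = self.flagged_did_not + self.cleared_did_not
        return self.flagged_did_not / did_not

# Group A: 50 of 100 people re-offended. Group B: 20 of 100 did.
group_a = Outcomes(30, 10, 20, 40)
group_b = Outcomes(12, 4, 8, 76)

for name, group in [("A", group_a), ("B", group_b)]:
    print(name, round(group.precision(), 2), round(group.false_positive_rate(), 2))
```

In this made-up data a high-risk flag is right 75% of the time in both groups, so by the first definition the tool treats them equally. Yet 20% of Group A's non-re-offenders were flagged, compared with 5% of Group B's, so by the second definition it does not. Both calculations are correct; they answer different questions.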

What You Can Now See

Most people hear "AI bias" and imagine a programmer typing something racist into a computer. You now understand the real mechanism: biased historical data, faithfully learned and reproduced. This changes how you evaluate every AI-powered decision that affects people — hiring tools, school admissions software, content recommendation engines. The question is never just "is the AI racist?" The question is: "what data did it learn from, and what inequalities were baked into that data?"

An Ethical Question About Who Decides

The COMPAS algorithm is still being used in some U.S. jurisdictions. Northpointe has never been required by law to publicly release the algorithm so it can be independently audited. Judges who use its scores are not always told how the score was generated.

Here's the ethical question — and it has no clean answer: should an algorithm that affects criminal sentencing be publicly auditable? The company argues their formula is a trade secret — protected intellectual property. Civil rights advocates argue that if a tool is used to restrict someone's liberty, the public has a right to understand how it works.

What do you think? And does your answer change if the tool works reasonably well on average, even if it fails badly for specific groups?

Lesson 2 Quiz

The Machine Learned It From Us

1. The COMPAS algorithm produced biased results against Black defendants primarily because:

Correct. The algorithm never used race directly, but it learned from data shaped by decades of unequal policing and economic inequality — and reproduced those patterns as "predictions."

The designers were not consciously biased, and race was not a direct input. The problem is more subtle: the proxy factors the AI used were themselves products of systemic inequality.

2. Amazon's AI hiring tool began penalizing resumes from women because:

Exactly. The AI learned "what a successful hire looks like" from ten years of data that skewed heavily male. It then treated male-pattern resumes as predictors of success.

The AI had no opinions and the data quality wasn't the issue. It faithfully learned patterns from historical hires — which happened to be predominantly male in technical roles.

3. A school district uses an AI to predict which students are "at risk" of dropping out, trained on five years of student data. A civil rights group claims the tool flags Black and Latino students at higher rates. The most accurate explanation to investigate first would be:

Correct. This is exactly the COMPAS pattern. Historical data reflecting unequal discipline, resource allocation, and support gets encoded as "risk signals." The AI reproduces structural inequality as individual prediction.

Apply the lesson's framework: if the training data reflects real-world inequality (unequal school funding, harsher discipline for minority students), the AI will learn and reproduce those patterns.

4. When Northpointe and ProPublica disagreed about whether COMPAS was "fair," what was the actual cause of the disagreement?

Correct. This is one of the most important ideas in AI ethics: fairness is not one thing. Different valid mathematical definitions of fairness are mutually exclusive, and choices between them involve values, not just math.

Both sides used valid math. The conflict was deeper: competing mathematical definitions of "fairness" genuinely contradict each other, and you cannot satisfy all of them simultaneously.

5. The phrase "garbage in, garbage out" applied to AI bias means:

Correct. No matter how sophisticated the algorithm, if the data it learned from encodes historical inequalities, those inequalities will appear in the outputs.

"Garbage in, garbage out" specifically means that the quality and fairness of training data determines the quality and fairness of outputs. Biased data produces biased AI.

Lab 2

The Bias Audit

You're the auditor. Demand answers.

Your Role: AI Ethics Auditor

A city government wants to deploy an AI tool that predicts which neighborhoods need more police patrols, based on historical crime data. You've been hired to audit it before deployment.

Talk to the AI lab assistant below. It's playing the role of the company trying to sell this tool. Your job: ask the hard questions about training data, fairness definitions, and what happens when the tool gets it wrong.

Start with: "Tell me about the training data this system was built on" — and don't accept a vague answer.

AI Lab Partner Bias Audit Mode

I represent the company that built the predictive patrol tool. We're confident it will reduce crime and make your city safer. What questions do you have for us?

Module 3 · Lesson 3

The Trust Test

Not "should I trust AI?" — but "which parts, and how much?"

How do you decide how much to rely on an AI tool when you can't see inside it?

In 2023, driverless robotaxis operated by Cruise, a subsidiary of General Motors, were involved in separate incidents in San Francisco within months of each other. In one, a Cruise vehicle ran a red light. In another, that October, a Cruise robotaxi struck a pedestrian who had already been hit by another vehicle, then dragged her about twenty feet before stopping.

The California DMV suspended Cruise's permits. But what investigators later discovered made the story considerably worse: Cruise had withheld footage from regulators and submitted an incomplete account of what its vehicle had done. The AI had behaved in a way its operators hadn't fully anticipated — and when it did, the company's response was to conceal, not disclose.

The public had been told that these vehicles were safe enough to operate on city streets without a human driver. That claim was based on millions of miles of test data. But none of that data was available to the people sharing those streets. San Francisco residents had been asked to trust a system whose reliability they had no independent means of verifying.

Trust Is Not a Feeling — It's a Structure

The Cruise case illustrates something that matters far beyond self-driving cars: trust in AI should be structured, not instinctive. When you decide how much to rely on an AI system, that decision should be based on what you actually know about the system — not on how confident it sounds, how impressive its track record seems, or how much you want it to work.

Researchers who study how humans and AI systems work together have identified a concept called appropriate reliance. This means relying on an AI system exactly as much as its actual reliability warrants — no more, no less. Over-relying on a weak system leads to disasters like the Schwartz case. Under-relying on a strong system means losing the benefits it could provide.

The problem is that most of the time, we don't know where an AI system's actual reliability ceiling is. Companies don't always publish detailed accuracy data. Errors are often not reported publicly. The gap between what an AI system is marketed as and what it actually does in edge cases can be enormous.

Appropriate Reliance: Trusting an AI system exactly as much as its demonstrated, verified reliability warrants, not based on marketing, aesthetics, or instinct.

The Trust Test: Four Questions

Here is a practical framework for calibrating how much to rely on any AI-generated output. These four questions won't give you a definitive answer, but they will help you think clearly about what kind of trust is warranted:

1. What kind of claim is this? Is it a pattern or a specific fact? AI is generally more reliable at identifying patterns ("this essay has good structure") than at asserting specific facts ("this treaty was signed on March 4, 1847"). Specific facts need verification.

2. What are the consequences of being wrong? If you use an AI's movie recommendation and it's bad, the cost is two hours. If a doctor uses an AI's diagnostic suggestion and it's wrong, the cost may be someone's life. Higher stakes demand more scrutiny.

3. Is there a way to check this independently? Can you verify the claim against a primary source, an expert, a dataset you can actually see? If yes, do it. If no — meaning the AI is your only source — factor that uncertainty into how you use the output.

4. Who built this, and what do they know about how it fails? Has the system been independently audited? Is its error rate documented? Did the company respond to past failures with transparency or concealment? The Cruise case is a reminder that how a company behaves when things go wrong is as important as how often things go right.
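The four questions can be written down as a simple checklist. The sketch below is one possible encoding, not part of the framework itself: the class name `TrustCheck`, the yes/no simplification of each question, and the wording of the advice are all assumptions made for illustration. Real situations rarely reduce to four booleans.

```python
from dataclasses import dataclass

@dataclass
class TrustCheck:
    """One AI output, described by the four Trust Test questions."""
    is_specific_fact: bool        # Q1: a specific fact rather than a pattern?
    high_stakes: bool             # Q2: serious consequences if it is wrong?
    can_verify: bool              # Q3: is an independent source available?
    builder_is_transparent: bool  # Q4: audited, with documented failures?

def advise(check):
    """Return a list of cautions. The wording is this sketch's own."""
    advice = []
    if check.is_specific_fact:
        advice.append("Specific fact: verify it before relying on it.")
    if check.high_stakes:
        advice.append("High stakes: apply extra scrutiny.")
    if check.can_verify:
        advice.append("A check is possible: do it.")
    else:
        advice.append("No independent check: treat the output as uncertain.")
    if not check.builder_is_transparent:
        advice.append("Unknown failure record: lower your reliance.")
    return advice

# Example: an AI summary of a textbook chapter, used to study for a test.
summary = TrustCheck(is_specific_fact=True, high_stakes=True,
                     can_verify=True, builder_is_transparent=False)
for line in advise(summary):
    print(line)
```

The value of writing it this way is that every question gets asked every time, including the ones that are easy to skip when an answer looks polished.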

The Automation Paradox

Research by cognitive scientist Lisanne Bainbridge, published in 1983, identified what she called the "ironies of automation." The more reliable an automated system is, the harder it becomes for human operators to maintain the skills needed to take over when it fails, because they rarely need to intervene. Pilots who fly mostly on autopilot become less sharp at manual flying. The better AI gets, the more we have to deliberately practice the skill of questioning it.

What Overtrust Looks Like in Practice

In 2018, a Boeing 737 MAX aircraft operated by Lion Air crashed into the Java Sea, killing all 189 people on board. A second 737 MAX crash followed five months later in Ethiopia, killing 157 more. Both crashes were caused partly by an automated flight control system — the MCAS — pushing the nose of the aircraft down based on faulty sensor data. Pilots fought to regain control, but had not been given adequate training about how MCAS worked or how to override it.

The investigation found that Boeing had partly withheld information about MCAS from pilots and regulators, similar to Cruise's concealment. But the pilots' situation illustrates the human dimension: they were trained to trust the aircraft's systems. The system was wrong. And the gap between "this system is usually reliable" and "this system is reliable right now in this situation" cost 346 lives.

Overtrust does not mean stupidity. It means a rational trust calibration — built on a genuine track record — that fails in the exact edge case the training didn't prepare you for.

A Shift in How You Read the News

You now have a framework for reading any story about an AI system failing. The questions to ask: What kind of claim was the system making? What were the stakes? Was independent verification possible? And how did the company respond when it went wrong? Most news coverage of AI failures misses at least two of these four questions. Knowing the framework puts you ahead of most of the reporting you'll read.

The Ethical Tension: Transparency vs. Trade Secrets

Here's the uncomfortable truth: the four questions in the Trust Test often can't be answered, because the information needed to answer them isn't public. You can't know how often an AI hiring tool misclassifies candidates if the company won't release error rates. You can't know what data a medical AI was trained on if it's proprietary. You can't evaluate what an AI knows about how it fails if it never publishes failure data.

This is an institutional-level problem. Individual users making good personal decisions don't fix it. It requires either regulatory requirements for transparency or industry standards that make auditing normal. As of 2024, neither exists consistently anywhere in the world.

So the ethical tension is real: we are being asked to make trust decisions about systems we cannot fully evaluate. The question is not whether that's okay. It's: who should have the responsibility to change it?

Lesson 3 Quiz

The Trust Test

1. "Appropriate reliance" on an AI system means:

Correct. Appropriate reliance is calibrated trust — not blind faith, not blanket rejection. It's grounded in what you actually know about the system's performance.

Appropriate reliance is specifically about calibration — trusting proportional to demonstrated reliability. Trusting because of scale, or checking AI against AI, both miss the point.

2. The Cruise robotaxi incidents were made significantly worse by what factor beyond the technical failure itself?

Correct. Concealment after failure is a key trust factor. A company that hides what went wrong makes it impossible for regulators, the public, or the system designers to learn from the failure and prevent the next one.

The lesson specifically highlights that Cruise withheld footage from regulators. How a company responds to failures is as important to trustworthiness as how often failures occur.

3. Using the Trust Test framework: a student uses an AI to summarize a textbook chapter for a test tomorrow. Which concern ranks HIGHEST?

Correct. Applying the framework: this involves specific facts (hallucination risk), non-trivial consequences (test grade), and a verifiable source exists (the textbook itself). That combination warrants cross-checking.

Apply the four Trust Test questions: What kind of claim? (specific facts — high hallucination risk). What are the stakes? (a test — real consequences). Can you check it? (yes — open the textbook). That combination demands verification.

4. The "automation paradox" identified by Lisanne Bainbridge suggests that:

Correct. Bainbridge's insight: high reliability reduces practice, reduced practice erodes skill, and when the reliable system finally fails, humans are least prepared to respond. It's a structural irony of trust.

Bainbridge's paradox is about skill erosion: reliable automation removes opportunities to practice override skills, so when automation fails at a critical moment, human operators are least prepared to take over.

5. Why can't individuals fully apply the Trust Test framework even if they understand it well?

Correct. The lesson's key institutional insight: even with perfect critical thinking, you can't evaluate what you can't see. Transparency is a structural problem that requires regulatory solutions, not just better individual judgment.

The lesson specifically identifies this: the Trust Test questions often can't be answered because the necessary information is proprietary. This is why individual vigilance isn't sufficient — it's a structural, institutional problem.

Lab 3

Trust Calibration Challenge

Apply the four-question framework to real scenarios

Your Role: Risk Assessor

Five AI applications have been proposed for use in your school or community. For each one, you need to assess whether the level of trust being placed in it is appropriate. Use the four Trust Test questions from Lesson 3.

The AI lab assistant will challenge your reasoning — it won't just agree with you. It will push you to go deeper. Take a position and defend it.

Start by picking one scenario: "An AI grading essays for a class where the grade affects the student's transcript." Tell me: is this appropriate reliance or not?

AI Lab Partner Trust Calibration Mode

I'm here to challenge your reasoning on AI trust. Pick a scenario and take a position — I'm going to push back, so make sure you can back it up with the four questions from the lesson. What are you starting with?

Module 3 · Lesson 4

Reading AI Like a Professional

The difference between a user and a critical user is what you do after you get the answer.

What does it actually look like to use AI well — not just often?

In 2023, a team of researchers at Stanford University School of Medicine published a study in JAMA Internal Medicine testing whether large language models could answer clinical medical questions accurately enough to assist doctors. They ran hundreds of medical questions through GPT-4 and evaluated the answers against expert physician responses.

GPT-4 performed remarkably well on many standard clinical scenarios — sometimes at or near the level of a practicing physician. The researchers were impressed. They were also careful. In the same paper, they documented specific categories where the model failed consistently: rare diseases, up-to-date drug interactions, and any scenario requiring knowledge of events after its training cutoff. It was brilliant in the center, unreliable at the edges.

A doctor who read only the headline — "GPT-4 performs like a physician" — might use the tool too broadly. A doctor who read the actual study knew precisely where the tool could be trusted and where it could not. The difference between those two doctors is not whether they use AI. It's how carefully they read what the research actually says.

The Skill Is in the Reading, Not Just the Using

The Stanford story points at something that matters as AI becomes more woven into daily life: knowing how to use AI well is a reading skill. It involves reading the output critically. It involves reading the claims people make about AI critically. And it involves understanding what the absence of information tells you.

Professional users of AI — doctors, lawyers, researchers, journalists — have developed informal practices for working with AI outputs. They tend to share some habits: they treat AI output as a first draft or a starting point, not a finished answer. They notice when claims are highly specific (dates, names, statistics) and flag those for separate verification. They are alert to what the AI didn't say — things that should be in a complete answer but aren't.

These habits are not complicated. But they require you to stay active and critical even when an answer looks finished and polished. The look of completion is part of what makes AI output dangerous to accept uncritically.

Critical User: Someone who treats AI output as a starting point rather than a conclusion, actively checks specific claims, and notices what is missing as well as what is present.

The Five Habits of Critical AI Users

These five habits come from observing how professionals actually work when they rely on AI without being captured by it:

1. Separate structure from facts. AI is usually more reliable at organizing ideas — outlining, summarizing, structuring an argument — than at producing accurate specific facts. Trust the structure. Verify the facts.

2. Treat specificity as a red flag, not a green one. A very specific claim ("the treaty was ratified on September 2, 1783") feels more credible than a vague one. In AI output, it's the opposite: specificity is exactly where hallucinations hide. The more specific the claim, the more urgently it needs an independent source.

3. Ask what's missing. An AI response that doesn't mention a key counterargument, a major complication, or a well-known exception is not necessarily wrong — but it may be incomplete in a consequential way. Completeness requires your own judgment about what ought to be there.

4. Know the training cutoff. Every language model has a knowledge cutoff, a date after which it knows nothing. Ask the model when its data ends, and check that against the provider's documentation, since a model can be wrong about its own cutoff. Any answer involving recent events, current laws, or up-to-date statistics may be outdated by months or years.

5. Source the key claims. For anything important, trace the claim to a source that exists outside of the AI system. Not another AI. A primary document, a peer-reviewed study, an official record. This is the single most effective habit for catching errors before they matter.
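
Habit 2 can be partly mechanized. The Python sketch below is one possible illustration, not a tool the lesson prescribes: it scans a piece of AI output for dates, years, percentages, and direct quotations, then lists them for manual checking. The function name `flag_specific_claims`, the patterns, and the sample sentence (including its invented statistic and quotation) are all assumptions made up for this example. A script like this can only point at what to check. It cannot tell you whether a claim is true.

```python
import re

# Hypothetical patterns for "specific" details worth checking by hand.
# Illustrative assumptions only; this list is far from exhaustive.
SPECIFIC_PATTERNS = {
    "date": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December)\s+\d{1,2},\s+\d{4}\b"
    ),
    "year": re.compile(r"\b(?:1[5-9]|20)\d{2}\b"),
    "percentage": re.compile(r"\b\d+(?:\.\d+)?%"),
    "quotation": re.compile(r'"[^"]+"'),
}


def flag_specific_claims(text):
    """Return (kind, matched_text) pairs for details to verify independently."""
    flags = []
    for kind, pattern in SPECIFIC_PATTERNS.items():
        for match in pattern.finditer(text):
            flags.append((kind, match.group(0)))
    return flags


# Made-up sample output; the statistic and the quotation are invented.
sample = (
    "The treaty was ratified on September 2, 1783, and 62% of "
    'delegates called it "a fair peace."'
)

for kind, detail in flag_specific_claims(sample):
    print(f"VERIFY [{kind}]: {detail}")
```

Every flagged item still needs habit 5: trace it to a primary document, a peer-reviewed study, or an official record before relying on it.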

The Misinformation Ecosystem

AI hallucinations don't stay private. When students submit hallucinated facts in essays, those essays sometimes get published. When journalists use AI to research stories and miss a fabricated detail, it enters the public record. In 2023, several news outlets discovered that AI-written articles they had published contained factual errors that passed through editorial review. The errors were fluent and plausible — exactly the kind that human editors are trained to catch in amateur writing but not in polished copy.

Reading Claims About AI Critically

Being a critical AI user also means reading about AI critically — not just reading its outputs. The claims made about AI in press releases, news coverage, and even academic papers are subject to the same scrutiny you'd apply to anything else.

When a company announces that their AI "surpasses human performance" at some task, the critical questions are: Which humans? Under what conditions? On which specific benchmark? Who funded the study? A model that surpasses human radiologists at detecting one type of cancer lesion in a specific dataset from one hospital is not the same as a model that's better than radiologists in general. But the headline often doesn't say that.

In 2021, researchers at DeepMind, the AI lab owned by Google's parent company, published a paper on AlphaFold, a system widely hailed as having "solved" the protein folding problem, a decades-long challenge in molecular biology. The achievement was celebrated globally. The word "solved" was also an overstatement: AlphaFold solved one version of the problem for many proteins, but not all, and not under all conditions. Scientists who read the actual paper knew the difference. People who read only the headlines did not.

This Changes How You Read Everything

You now have a complete framework for engaging with AI as a critical user — not a passive receiver. The five habits apply whether you're using an AI for homework, reading a news story about AI, or eventually making decisions in a professional context where AI tools are part of the workflow. Most people who use AI never develop these habits. The fact that you can name them, and know why each one matters, puts you in a genuinely different position from the majority of AI users — at any age.

The Bigger Picture: What This Module Was Really About

Across these four lessons, a single thread runs through every story: the gap between what an AI system appears to be and what it actually is. Hallucinations make an AI appear to know things it doesn't know. Bias makes an AI system appear neutral when it carries the weight of historical inequality. Overtrust makes a reliable system appear infallible. And polished marketing makes a product that fails in its edge cases appear to work in all cases.

The skills in this module — identifying hallucination risks, tracing bias to its source in training data, calibrating trust with the four-question framework, and reading AI output with professional habits — are not skills about being skeptical of technology. They're skills about using technology with clear eyes. The people who do the most valuable work with AI are not the ones who trust it most or least. They are the ones who know what it is.

That is a harder position to hold than either "AI is amazing and will solve everything" or "AI is dangerous and we should be afraid." It requires staying curious, staying specific, and never treating the surface of an output as the whole truth. That is what you are now equipped to do.

Lesson 4 Quiz

Reading AI Like a Professional

1. According to the Stanford Medical School study, GPT-4 was LEAST reliable in which category?

Correct. The lesson specifically names these three categories of failure. The model performed well in the center of common clinical knowledge and failed at the edges — exactly the pattern of a well-trained but bounded system.

The Stanford study found GPT-4 performed well on standard clinical questions but failed consistently on rare diseases, current drug interactions, and anything requiring post-training-cutoff knowledge.

2. Why should a highly specific AI claim ("This law was passed on November 14, 1994") be treated as a RED FLAG rather than a sign of reliability?

Correct. Specificity in AI outputs is the exact site of highest hallucination risk — and the polish of a specific-sounding claim makes the fabrication harder to detect. It feels credible precisely because it sounds precise.

The five habits from the lesson specifically identify this: treat specificity as a red flag, not a green one. Specific claims are where hallucinations hide — and they're the hardest to catch because precision feels like accuracy.

3. A journalist uses an AI to draft a profile of a scientist, then publishes it. A reader later finds two sentences with fabricated quotes. Who bears responsibility here?

Correct. The lesson doesn't assign responsibility to only one party. The journalist failed to apply the verification habit that quotes specifically require. The company deployed a hallucination-capable system without adequate user guidance. Both contributed.

The lesson points at distributed responsibility. A critical user should have verified specific quotes — that's a five-habits failure. But the company's role in deploying a confident fabrication machine is also real. Both matter.

4. A press release says an AI tool "outperforms human doctors" at diagnosing a rare skin condition. A critical reader should immediately ask:

Correct. The lesson uses the radiologist example to make exactly this point: "surpasses human performance" always contains hidden specificity about which humans, which benchmark, under what conditions. Those qualifiers are what the claim actually means.

The lesson's guidance on reading AI claims critically: ask which humans, on which dataset, under what conditions, and who funded the study. "Outperforms" is always a claim about a specific, bounded comparison — not a general superiority.

5. What is the single most effective habit for catching AI errors before they cause harm?

Correct. The lesson explicitly names this as the single most effective habit. Two AI systems can share the same hallucination. An AI can explain flawed reasoning fluently. But a claim that doesn't exist in any primary source cannot survive that check.

The lesson specifically identifies sourcing key claims to outside-AI primary sources as the most effective single habit. AI self-ratings are unreliable, and two AI systems may share hallucinations — only a primary source check is independent.

Lab 4

The Critical User Challenge

Apply all five habits to a real AI output

Your Role: Professional Reviewer

You've been handed an AI-generated briefing document about climate change policy. Your job is to read it like a professional — applying the five habits from Lesson 4 — and identify every place where the critical user habits demand action.

The AI lab assistant below will give you sections of the briefing to analyze. Tell it what you notice, what you'd verify, and what's missing. It will push back on shallow analysis and ask you to go deeper.

Start by asking for the first section of the briefing — then tell me what a critical user would flag immediately.

AI Lab Partner · Critical User Mode

Ready to work through this briefing? I'll give you sections to analyze — but I want specifics, not generalities. "This might be wrong" isn't enough. Tell me which of the five habits applies, what you'd actually check, and why. Ask for the first section when you're ready.

Module 3 Test

Bias, Errors, and the Trust Test — 15 questions · Pass at 80%

1. In the Steven Schwartz case (2023), what was the fundamental cause of the legal disaster?

Correct. The case is a failure of verification — Schwartz trusted AI output on a type of claim (specific legal citations) that is exactly where hallucinations occur most reliably.

The core failure was relying on AI output for specific factual claims (legal citations) without independent verification. That failure brought severe professional consequences.

2. An AI hallucination is specifically defined as:

Correct. The defining characteristics of a hallucination are its fluency and confidence — it doesn't look like an error, which is what makes it dangerous.

Hallucinations are specifically fluent, confident-sounding outputs that are wrong or fabricated. The AI has no awareness it is hallucinating and no way to flag it.

3. Why can an AI NOT reliably tell you when it is hallucinating?

Correct. The AI generates text that fits the pattern of likely responses. "Yes, this is accurate" is a more likely-sounding response than "I just fabricated this" — regardless of actual truth.

The mechanism is architectural: language models predict likely next text without internal access to factual truth. They have no fact-checking layer. "Seems real" and "is real" are indistinguishable to the model.

4. The COMPAS algorithm produced biased risk scores without being explicitly programmed to discriminate because:

Correct. This is the core mechanism of training data bias: inequality in the historical record gets encoded as predictive patterns, with no explicit instruction needed.

COMPAS never used race directly. The bias emerged from proxy factors (employment history, residential stability, prior arrests) that were themselves shaped by systemic inequality. That's the "garbage in, garbage out" mechanism.

5. When Northpointe and ProPublica disagreed about whether COMPAS was "fair," both sides were using valid mathematics. This illustrates that:

Correct. This is one of the most important ideas in AI ethics: fairness is not a single, universally agreed-upon mathematical property. The trade-offs between definitions involve value choices, not just technical ones.

Both parties used valid math — the conflict was that they used different definitions of fairness, which are mathematically incompatible. This means "is this AI fair?" is always a question that requires asking "fair by which definition?"

6. "Appropriate reliance" means trusting an AI system:

Correct. Appropriate reliance is calibrated, evidence-based trust — not instinct, not marketing, not peer usage. It requires knowing what the system actually gets right and where it fails.

Appropriate reliance is specifically calibrated to verified reliability — not reviews, not user count, not your instinct. High stakes should actually trigger more scrutiny, not more AI reliance.

7. The Boeing 737 MAX crashes in 2018–2019 relate to AI trust because they illustrate:

Correct. The crashes demonstrate the structure of overtrust failure: a reliable system, rational trust built over time, and catastrophic failure in an edge case where human override skills had eroded due to rare use.

The lesson uses this case to illustrate overtrust: pilots had built rational trust in the aircraft's automation based on its track record, but weren't trained to override MCAS in the specific failure mode that occurred. That's not individual error; it's a system design failure.

8. The "automation paradox" identified by Lisanne Bainbridge warns that:

Correct. Bainbridge's 1983 insight remains relevant: the more reliable the automation, the less you practice taking manual control, and the worse you'll be at it when you need to. AI makes this problem more acute, not less.

The paradox is specifically about skill erosion: reliable automation removes the occasions for practice, and infrequent practice means poor performance in the exact moment — system failure — when skills are most needed.

9. In the Stanford Medical School GPT-4 study, what was the key lesson for doctors who read the actual research paper (rather than just headlines)?

Correct. The value of reading the actual study was precision: understanding not just "this AI performs well" but "this AI performs well here and fails there." That precision enables appropriate reliance.

The study showed GPT-4 was strong in many areas and consistently weak in specific ones (rare diseases, current interactions, post-cutoff events). The lesson was: appropriate reliance requires knowing where the edges are.

10. A critical user treats AI output primarily as:

Correct. The defining characteristic of a critical user is treating output as a starting point — especially for specific claims — rather than a finished, authoritative answer.

A critical user's defining characteristic is treating AI output as a starting point, not a conclusion. Specific claims, regardless of stakes, warrant independent verification before use.

11. Of the five critical user habits, which specifically addresses the AI's knowledge boundary in time?

Correct. Every language model has a knowledge cutoff — a date after which it knows nothing. Questions about current events, recent laws, or up-to-date statistics require knowing when the model's knowledge ends.

"Know the training cutoff" is the habit specifically about temporal boundaries. Every AI model has a date after which its knowledge is zero — and that cutoff may be months or years in the past.

12. Google Photos tagging photos of Black people as "gorillas" in 2015 is best explained as:

Correct. This is a classic training data coverage failure: when the AI had seen far fewer examples of dark-skinned faces, it couldn't recognize them reliably and made catastrophic errors in a high-visibility product.

The Google Photos failure was a training data representation problem — overwhelmingly white training images meant the model was undertrained on dark-skinned faces. The same mechanism as COMPAS, different domain, catastrophic result.

13. Why is "checking one AI's answer against another AI's answer" NOT a reliable verification method?

Correct. Shared training data means shared hallucinations are possible. Agreement between two AI systems is not independent verification — it may just mean both learned the same fabrication.

AI models can share hallucinations because they may share training data. Two models agreeing on a fabricated fact doesn't make it real — it may just mean both learned the same false pattern from the same sources.

14. The institutional-level problem identified in Lesson 3 is that the Trust Test framework often cannot be fully applied because:

Correct. The lesson specifically names this: even perfect individual critical thinking can't compensate for missing data. Transparency about AI system performance requires regulatory or industry-level solutions, not just user vigilance.

The structural problem identified in Lesson 3 is that essential evaluation data — error rates, training data, audit results — is frequently proprietary. This is a transparency problem that requires systemic solutions, not better individual habits.

15. Across all four lessons, what is the common thread that connects hallucinations, training data bias, overtrust, and misleading AI claims?

Correct. The module's central insight: AI systems present a polished surface that regularly conceals real limitations. Critical users see through the surface — not to reject AI, but to use it with clear eyes.

The module's unifying theme is the gap between appearance and reality in AI systems. They appear more knowledgeable than they are (hallucinations), more neutral than they are (bias), more reliable than they are (overtrust), and more capable than they are (misleading claims).