Steven Schwartz had been a lawyer for thirty years. In early 2023, facing a tight deadline on a personal-injury case, he used ChatGPT to help research relevant court decisions. The AI delivered a clean list of cases: full names, citation numbers, confident legal summaries. The citations went into a brief filed with the court.
The opposing lawyers could not find the cases. The judge could not find them either. When pressed, Schwartz checked, and discovered that the cases nobody could find had been invented by ChatGPT. Made up. The names, the citation numbers, the summaries: all fabricated, with no real legal precedent behind them.
Schwartz later said he had no idea the AI could generate "non-existent cases." He had asked ChatGPT whether the cases were real, and it said yes. In June 2023 the judge sanctioned him, a formal legal punishment, and the story made headlines around the world. A thirty-year legal career, nearly destroyed by trusting an AI that spoke like it knew exactly what it was talking about.
ChatGPT did not lie in the way a person lies. It did not look up real cases, decide to hide them, and invent fake ones instead. What it did was stranger than lying: it generated text that sounded like legal citations, because that's what legal citations look like in the text it was trained on, without having any actual knowledge of whether those cases existed.
This is called a hallucination. When an AI produces text that is fluent and confident but factually wrong, or entirely made up, that is a hallucination. The word is borrowed from psychology, where it means perceiving something that isn't there. AI hallucinations are not random gibberish. They are polished, plausible, well-formatted falsehoods.
Here is the key thing that makes hallucinations so dangerous: an AI cannot tell you when it is hallucinating. It doesn't know. When Schwartz asked ChatGPT "are these real cases?" the AI said yes, not because it was trying to deceive him, but because it generates the most likely-sounding response, and "yes, these are real" was a more likely-sounding response than "actually, I just made those up."
Language models like ChatGPT, Claude, and Gemini are trained to predict what comes next in a sequence of words. They are extraordinarily good at this. If you give them the beginning of a legal citation, they will produce a plausible-looking ending, because they have read millions of legal citations and learned the pattern.
But knowing a pattern is not the same as knowing the facts. Think of it this way: if you practiced copying someone's handwriting for years, you could eventually write in their style very convincingly. But you would not know what they were actually thinking when they originally wrote. The AI knows the style of a legal citation. It does not know whether the case behind that citation actually exists in a courthouse somewhere.
This is not a bug that engineers will eventually fix. It is built into how these systems work. The models are pattern-matchers at a massive scale. Hallucinations happen most often when the AI is asked about specific facts (names, dates, citations, statistics), where getting the pattern right and getting the fact right are two completely different things.
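To make the gap between pattern and fact concrete, here is a minimal sketch in Python. It is a toy, not how ChatGPT actually works: real language models are neural networks that predict pieces of words, not slot-fillers. Every name and number in it is invented for illustration, including the case names and the "Ex. Rep." reporter, and `learn_pieces` and `all_plausible_citations` are hypothetical helpers. The sketch learns the shape of a citation from three examples, recombines the pieces, and then checks how many of the results match a known record.

```python
import itertools
import re

# Invented examples in a made-up citation format ("Ex. Rep." is not a real reporter).
KNOWN_RECORDS = [
    "Alder v. Northway Air, 512 Ex. Rep. 101 (2008)",
    "Brook v. Harbor Lines, 340 Ex. Rep. 77 (2003)",
    "Caldwell v. Summit Rail, 88 Ex. Rep. 245 (1996)",
]

# The "shape" of a citation: the kind of pattern that can be picked up from examples.
CITATION_SHAPE = re.compile(
    r"(?P<plaintiff>\w+) v\. (?P<defendant>[\w ]+), "
    r"(?P<volume>\d+) Ex\. Rep\. (?P<page>\d+) \((?P<year>\d{4})\)"
)

def learn_pieces(examples):
    """Collect the distinct values seen in each slot of the pattern."""
    pieces = {name: [] for name in CITATION_SHAPE.groupindex}
    for text in examples:
        match = CITATION_SHAPE.fullmatch(text)
        for name, value in match.groupdict().items():
            if value not in pieces[name]:
                pieces[name].append(value)
    return pieces

def all_plausible_citations(pieces):
    """Recombine the learned pieces into every well-formed citation."""
    slots = ["plaintiff", "defendant", "volume", "page", "year"]
    for p, d, v, pg, y in itertools.product(*(pieces[s] for s in slots)):
        yield f"{p} v. {d}, {v} Ex. Rep. {pg} ({y})"

pieces = learn_pieces(KNOWN_RECORDS)
plausible = list(all_plausible_citations(pieces))
well_formed = [c for c in plausible if CITATION_SHAPE.fullmatch(c)]
on_record = [c for c in plausible if c in KNOWN_RECORDS]

print(len(plausible), "citations can be built from the learned pieces")
print(len(well_formed), "of them fit the citation pattern")
print(len(on_record), "of them match a known record")
# 243 citations can be built from the learned pieces
# 243 of them fit the citation pattern
# 3 of them match a known record
```

Every one of the 243 outputs looks equally valid. Nothing in the pattern itself separates the 3 that are on record from the 240 that are not. Only the lookup against an outside source does that.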
Watch for these in particular: legal citations, scientific paper references, historical dates and quotes, statistics and percentages, and names of specific people and what they said or did. Any time an AI gives you a very specific fact, that's the moment to verify it independently, not because the AI is always wrong, but because you have no way to tell when it is.
The thing that makes hallucinations so hard to catch is that they come wrapped in confidence. AI models don't say "I'm not sure, but maybe..." the way a person who is guessing might. They produce smooth, fluent text in exactly the same tone whether they are absolutely correct or completely fabricating something.
Humans use confidence as a signal. When someone speaks slowly, hesitates, or says "I think..." we know to check. When someone speaks quickly and clearly, we're wired to take it as evidence they know what they're talking about. AI breaks that heuristic. (A heuristic is a mental shortcut you use to make fast judgments.) The AI's confidence level tells you nothing about its accuracy.
This is why the Schwartz case happened. He was using a shortcut that works perfectly well with human experts: if someone speaks like an expert, they probably are one. With AI, that shortcut fails.
When most people use an AI and get a confident, well-formatted answer, they stop there. You now understand that fluency and confidence are features of how the text was generated. They say nothing about whether the facts are real. Every specific claim an AI makes is a claim you should be able to verify from a separate source. That is not paranoia. That is how professionals who actually understand these tools use them.
Here's the hard part: AI tools genuinely help people. They help lawyers draft briefs faster. They help students understand concepts. They help doctors find research they might have missed. Banning them outright would mean losing those benefits.
But the Schwartz case shows the cost of using them carelessly. And here's what makes it genuinely complicated: not everyone has the time, resources, or expertise to verify everything an AI tells them. A student in a school with one overtaxed teacher might rely on AI for homework help and have no realistic way to fact-check every sentence. Should the responsibility fall on the user who trusted a confident-sounding tool? Or on the company that built a tool capable of confident fabrication and deployed it widely?
There is no clean answer. Think about where you stand on that, and whether your answer would change depending on who the user was.
You've just been handed five "facts" that an AI generated for a student's history essay. Your job is to figure out which ones are real and which might be hallucinated, using the methods from Lesson 1.
Talk to the AI lab assistant below. It will play the role of an AI that generated some questionable outputs. Challenge it, ask how it knows things, demand sources. It won't always give you satisfying answers. That's part of the exercise.
In 2016, a journalist at ProPublica named Julia Angwin published an investigation that shook the American criminal justice system. Across the country, judges were using a software tool called COMPAS, made by a company called Northpointe, to help them make sentencing decisions. COMPAS analyzed data about a defendant and produced a "recidivism risk score": a number from 1 to 10 predicting how likely that person was to commit another crime.
Angwin's team analyzed the COMPAS scores for over 7,000 people arrested in Broward County, Florida, and matched them against who actually re-offended over the next two years. What they found was stark: Black defendants were nearly twice as likely as white defendants to be falsely flagged as high-risk when they went on to commit no further crime. White defendants who did re-offend were more often incorrectly labeled low-risk.
The algorithm had never been told anyone's race. It used factors like employment history, residential stability, age at first arrest. But those factors were themselves shaped by decades of unequal policing, economic inequality, and systemic discrimination. The AI had learned the patterns in biased historical data, and faithfully reproduced those patterns as a "risk score" that real judges used to determine how long real people went to prison.
When most people hear the word "bias," they imagine a person who consciously dislikes someone. But AI bias is different, and in some ways more insidious (a word that means harmful in a way that's hard to see). The COMPAS algorithm was not trying to discriminate. Its designers at Northpointe were not consciously trying to disadvantage Black defendants. But the data it learned from was not neutral.
Here's the mechanism: training data is the historical information that an AI learns from. If that data reflects real-world inequalities (who gets arrested, who gets hired, who gets loans), then the AI learns those inequalities as if they were facts of nature, not facts of history. It treats patterns in discriminatory data as reliable predictors.
There is a phrase for this: garbage in, garbage out. If the data going in carries a bias, the decisions coming out will carry that bias too, even if the algorithm itself has no explicit rule about race, gender, or any other characteristic.
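Here is a minimal sketch of that mechanism in Python, using a loan example. All of the numbers are invented, and the groups, the zip areas, and the `train` and `predict` helpers are hypothetical. It is not how COMPAS or any real product works. The point it illustrates is narrow: the rule below never looks at which group a person belongs to, yet its decisions still split sharply by group, because zip area stands in for group in the historical data.

```python
from collections import defaultdict

# Hypothetical loan history: (group, zip_area, approved). All numbers are invented.
history = (
    [("A", "north", True)] * 8 + [("A", "north", False)] * 2
    + [("A", "south", True)] * 1 + [("A", "south", False)] * 1
    + [("B", "north", True)] * 1 + [("B", "north", False)] * 1
    + [("B", "south", True)] * 3 + [("B", "south", False)] * 7
)

def train(records):
    """Learn the past approval rate for each zip area. Group is never used."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for _group, zip_area, was_approved in records:
        total[zip_area] += 1
        approved[zip_area] += was_approved
    return {z: approved[z] / total[z] for z in total}

def predict(model, zip_area):
    """Approve when the area's historical approval rate is at least 50%."""
    return model[zip_area] >= 0.5

model = train(history)

# Audit: how often does the trained rule approve people from each group?
outcomes = defaultdict(list)
for group, zip_area, _ in history:
    outcomes[group].append(predict(model, zip_area))

for group in sorted(outcomes):
    rate = sum(outcomes[group]) / len(outcomes[group])
    print(group, round(rate, 2))
# A 0.83
# B 0.17
```

Removing the group column from the training step did not remove the bias. The audit at the end is only possible because the group labels were kept alongside the data, which is one reason independent audits need access to more than the model's inputs.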
The COMPAS case is dramatic because the stakes were literally years of someone's freedom. But the same mechanism appears in everyday AI systems:
In 2018, Reuters reported that Amazon had scrapped an AI hiring tool that was downgrading resumes containing the word "women's", as in "women's chess club captain." The AI had been trained on ten years of Amazon's historical hiring data, which predominantly reflected male hires in technical roles. It learned that "male-pattern resumes" predicted success and penalized patterns associated with women.
In 2015, Google Photos auto-tagged photos of Black people as "gorillas", a horrific failure widely attributed to training data in which the faces were overwhelmingly white. The AI had seen far fewer examples of dark-skinned faces and made catastrophic errors.
These are not three separate bugs. They are three examples of the same problem: AI systems trained on data from an unequal world will reflect that inequality back, often in the areas where it hurts people the most.
Amazon's AI hiring tool was quietly downgrading applications from women without anyone telling it to. Reuters reported in 2018 that the company had discovered the problem and shut the project down. They never deployed it publicly, but for companies that did deploy similar tools without rigorous auditing, the same bias may have been affecting real hiring decisions without anyone knowing.
Here is what makes the COMPAS story particularly important for understanding where AI is heading: when Julia Angwin's team published their findings, Northpointe disputed the methodology. They argued the algorithm was fair by a different mathematical definition of fairness. And here's the thing: both sides were using valid math. Different mathematical definitions of "fair" genuinely conflict with each other. You cannot satisfy all of them at once. This is not a theoretical problem. It is a real constraint.
This means that when someone tells you an AI has been tested for bias and found to be fair, you should ask: fair by which definition? For which group? Under what conditions? Fairness in AI is not a single thing you either have or don't have. It is a set of competing trade-offs, and someone made choices about those trade-offs, choices that are often buried in technical documentation that no one outside the company ever reads.
Most people hear "AI bias" and imagine a programmer typing something racist into a computer. You now understand the real mechanism: biased historical data, faithfully learned and reproduced. This changes how you evaluate every AI-powered decision that affects people: hiring tools, school admissions software, content recommendation engines. The question is never just "is the AI racist?" The question is: "what data did it learn from, and what inequalities were baked into that data?"
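A small worked example can show how two definitions of "fair" pull apart. The sketch below is in Python and uses invented counts for two hypothetical groups, X and Y. These are not the Broward County figures, and the two measures are only a rough stand-in for the kinds of definitions each side appealed to. The first measure asks: of the people flagged high-risk, what share went on to re-offend? The second asks: of the people who did not re-offend, what share were flagged anyway?

```python
from dataclasses import dataclass

@dataclass
class Outcomes:
    """Counts for one group. All numbers used below are invented."""
    true_pos: int   # flagged high-risk, did re-offend
    false_pos: int  # flagged high-risk, did not re-offend
    false_neg: int  # flagged low-risk, did re-offend
    true_neg: int   # flagged low-risk, did not re-offend

    def precision(self):
        """Of the people flagged high-risk, what share re-offended?"""
        return self.true_pos / (self.true_pos + self.false_pos)

    def false_positive_rate(self):
        """Of the people who did not re-offend, what share were flagged?"""
        return self.false_pos / (self.false_pos + self.true_neg)

# Group X: 50 of 100 people re-offend. Group Y: 20 of 100 re-offend.
group_x = Outcomes(true_pos=40, false_pos=10, false_neg=10, true_neg=40)
group_y = Outcomes(true_pos=16, false_pos=4, false_neg=4, true_neg=76)

for name, g in [("X", group_x), ("Y", group_y)]:
    print(name, g.precision(), g.false_positive_rate())
# X 0.8 0.2
# Y 0.8 0.05
```

By the first measure the tool treats both groups identically: a high-risk flag is right 80% of the time for each. By the second it does not: a person in group X who never re-offends is four times as likely to be wrongly flagged as a person in group Y. In this example the difference comes entirely from the two groups having different underlying re-offense rates, so fixing one measure leaves the other unequal.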
The COMPAS algorithm is still being used in some U.S. jurisdictions. Northpointe has never been required by law to publicly release the algorithm so it can be independently audited. Judges who use its scores are not always told how the score was generated.
Here's the ethical question, and it has no clean answer: should an algorithm that affects criminal sentencing be publicly auditable? The company argues their formula is a trade secret, protected as intellectual property. Civil rights advocates argue that if a tool is used to restrict someone's liberty, the public has a right to understand how it works.
What do you think? And does your answer change if the tool works reasonably well on average, even if it fails badly for specific groups?
A city government wants to deploy an AI tool that predicts which neighborhoods need more police patrols, based on historical crime data. You've been hired to audit it before deployment.
Talk to the AI lab assistant below. It's playing the role of the company trying to sell this tool. Your job: ask the hard questions about training data, fairness definitions, and what happens when the tool gets it wrong.
In 2023, two driverless robotaxis operated by Cruise, a subsidiary of General Motors, were involved in separate incidents in San Francisco within weeks of each other. In one, a Cruise vehicle ran a red light. In another, on October 2, a Cruise robotaxi struck a pedestrian who had already been hit by another vehicle, then dragged her about twenty feet before stopping.
The California DMV suspended Cruise's permits. But what investigators later discovered made the story considerably worse: Cruise had withheld footage from regulators and submitted an incomplete account of what its vehicle had done. The AI had behaved in a way its operators hadn't fully anticipated, and when it did, the company's response was to conceal, not disclose.
The public had been told that these vehicles were safe enough to operate on city streets without a human driver. That claim was based on millions of miles of test data. But none of that data was available to the people sharing those streets. San Francisco residents had been asked to trust a system whose reliability they had no independent means of verifying.
The Cruise case illustrates something that matters far beyond self-driving cars: trust in AI should be structured, not instinctive. When you decide how much to rely on an AI system, that decision should be based on what you actually know about the system, not on how confident it sounds, how impressive its track record seems, or how much you want it to work.
Researchers who study how humans and AI systems work together have identified a concept called appropriate reliance. This means relying on an AI system exactly as much as its actual reliability warrants: no more, no less. Over-relying on a weak system leads to disasters like the Schwartz case. Under-relying on a strong system means losing the benefits it could provide.
The problem is that most of the time, we don't know where an AI system's actual reliability ceiling is. Companies don't always publish detailed accuracy data. Errors are often not reported publicly. The gap between what an AI system is marketed as and what it actually does in edge cases can be enormous.
Here is a practical framework for calibrating how much to rely on any AI-generated output. These four questions won't give you a definitive answer, but they will help you think clearly about what kind of trust is warranted:
1. What kind of claim is this? Is it a pattern or a specific fact? AI is generally more reliable at identifying patterns ("this essay has good structure") than at asserting specific facts ("this treaty was signed on March 4, 1847"). Specific facts need verification.
2. What are the consequences of being wrong? If you use an AI's movie recommendation and it's bad, the cost is two hours. If a doctor uses an AI's diagnostic suggestion and it's wrong, the cost may be someone's life. Higher stakes demand more scrutiny.
3. Is there a way to check this independently? Can you verify the claim against a primary source, an expert, a dataset you can actually see? If yes, do it. If no, meaning the AI is your only source, factor that uncertainty into how you use the output.
4. Who built this, and what do they know about how it fails? Has the system been independently audited? Is its error rate documented? Did the company respond to past failures with transparency or concealment? The Cruise case is a reminder that how a company behaves when things go wrong is as important as how often things go right.
Research by cognitive scientist Lisanne Bainbridge, published in 1983, identified what she called the "ironies of automation." The more reliable an automated system is, the harder it becomes for human operators to maintain the skills needed to take over when it fails, because they rarely need to intervene. Pilots who fly mostly on autopilot become less sharp at manual flying. The better AI gets, the more we have to deliberately practice the skill of questioning it.
In 2018, a Boeing 737 MAX aircraft operated by Lion Air crashed into the Java Sea, killing all 189 people on board. A second 737 MAX crash followed five months later in Ethiopia, killing 157 more. Both crashes were caused partly by an automated flight control system, the Maneuvering Characteristics Augmentation System (MCAS), pushing the nose of the aircraft down based on faulty sensor data. Pilots fought to regain control, but had not been given adequate training about how MCAS worked or how to override it.
The investigation found that Boeing had partly withheld information about MCAS from pilots and regulators, similar to Cruise's concealment. But the pilots' situation illustrates the human dimension: they were trained to trust the aircraft's systems. The system was wrong. And the gap between "this system is usually reliable" and "this system is reliable right now in this situation" cost 346 lives.
Overtrust does not mean stupidity. It means a rational trust calibration, built on a genuine track record, that fails in the exact edge case the training didn't prepare you for.
You now have a framework for reading any story about an AI system failing. The questions to ask: What kind of claim was the system making? What were the stakes? Was independent verification possible? And how did the company respond when it went wrong? Most news coverage of AI failures misses at least two of these four questions. Knowing the framework puts you ahead of most of the reporting you'll read.
Here's the uncomfortable truth: the four questions in the Trust Test often can't be answered, because the information needed to answer them isn't public. You can't know how often an AI hiring tool misclassifies candidates if the company won't release error rates. You can't know what data a medical AI was trained on if it's proprietary. You can't evaluate what an AI knows about how it fails if it never publishes failure data.
This is an institutional-level problem. Individual users making good personal decisions don't fix it. It requires either regulatory requirements for transparency or industry standards that make auditing normal. Neither exists consistently anywhere in the world right now, as of 2024.
So the ethical tension is real: we are being asked to make trust decisions about systems we cannot fully evaluate. The question is not whether that's okay. It's: who should have the responsibility to change it?
Five AI applications have been proposed for use in your school or community. For each one, you need to assess whether the level of trust being placed in it is appropriate. Use the four Trust Test questions from Lesson 3.
The AI lab assistant will challenge your reasoning. It won't just agree with you. It will push you to go deeper. Take a position and defend it.
In 2023, a team of researchers at Stanford University School of Medicine published a study in JAMA Internal Medicine testing whether large language models could answer clinical medical questions accurately enough to assist doctors. They ran hundreds of medical questions through GPT-4 and evaluated the answers against expert physician responses.
GPT-4 performed remarkably well on many standard clinical scenarios, sometimes at or near the level of a practicing physician. The researchers were impressed. They were also careful. In the same paper, they documented specific categories where the model failed consistently: rare diseases, up-to-date drug interactions, and any scenario requiring knowledge of events after its training cutoff. It was brilliant in the center, unreliable at the edges.
A doctor who read only the headline, "GPT-4 performs like a physician," might use the tool too broadly. A doctor who read the actual study knew precisely where the tool could be trusted and where it could not. The difference between those two doctors is not whether they use AI. It's how carefully they read what the research actually says.
The Stanford story points at something that matters as AI becomes more woven into daily life: knowing how to use AI well is a reading skill. It involves reading the output critically. It involves reading the claims people make about AI critically. And it involves understanding what the absence of information tells you.
Professional users of AI (doctors, lawyers, researchers, journalists) have developed informal practices for working with AI outputs. They tend to share some habits: they treat AI output as a first draft or a starting point, not a finished answer. They notice when claims are highly specific (dates, names, statistics) and flag those for separate verification. They are alert to what the AI didn't say: things that should be in a complete answer but aren't.
These habits are not complicated. But they require you to stay active and critical even when an answer looks finished and polished. The look of completion is part of what makes AI output dangerous to accept uncritically.
These five habits come from observing how professionals who rely on AI without being captured by it actually work:
1. Separate structure from facts. AI is usually more reliable at organizing ideas (outlining, summarizing, structuring an argument) than at producing accurate specific facts. Trust the structure. Verify the facts.
2. Treat specificity as a red flag, not a green one. A very specific claim ("the treaty was ratified on September 2, 1783") feels more credible than a vague one. In AI output, it's the opposite: specificity is exactly where hallucinations hide. The more specific the claim, the more urgently it needs an independent source.
3. Ask what's missing. An AI response that doesn't mention a key counterargument, a major complication, or a well-known exception is not necessarily wrong, but it may be incomplete in a consequential way. Completeness requires your own judgment about what ought to be there.
4. Know the training cutoff. Every language model has a knowledge cutoff: a date after which it knows nothing. Check the developer's documentation for when its data ends, or ask the model and treat its answer as approximate. Any answer involving recent events, current laws, or up-to-date statistics may be outdated by months or years.
5. Source the key claims. For anything important, trace the claim to a source that exists outside of the AI system. Not another AI. A primary document, a peer-reviewed study, an official record. This is the single most effective habit for catching errors before they matter.
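Habit 2 is mechanical enough to sketch in code. The Python below is a rough illustration, not a reliable detector: the three patterns are assumptions chosen for this example, `flag_specific_claims` is a hypothetical helper, and the sample sentence is invented (it reuses the treaty claim from habit 2 and adds a made-up percentage). It scans a piece of text and lists the details that are specific enough to need an outside source.

```python
import re

# Rough patterns for "very specific" details. These are illustrative assumptions,
# not a complete or reliable detector.
PATTERNS = {
    "date": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December) \d{1,2}, \d{4}\b"
    ),
    "percentage": re.compile(r"\b\d+(?:\.\d+)?%"),
    "quote": re.compile(r'"[^"]+"'),
}

def flag_specific_claims(text):
    """Return (kind, snippet) pairs for details worth checking by hand."""
    flags = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            flags.append((kind, match.group()))
    return flags

# Invented sample sentence, in the style of an AI-written essay.
sample = "The treaty was ratified on September 2, 1783, and 40% of delegates objected."
for kind, snippet in flag_specific_claims(sample):
    print(kind, "->", snippet)
# date -> September 2, 1783
# percentage -> 40%
```

The script cannot say whether either detail is true. It only builds the to-do list. The checking still has to happen against a source outside the AI, which is habit 5.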
AI hallucinations don't stay private. When students submit hallucinated facts in essays, those essays sometimes get published. When journalists use AI to research stories and miss a fabricated detail, it enters the public record. In 2023, several news outlets discovered that AI-written articles they had published contained factual errors that passed through editorial review. The errors were fluent and plausible, exactly the kind that human editors are trained to catch in amateur writing but not in polished copy.
Being a critical AI user also means reading about AI critically, not just reading its outputs. The claims made about AI in press releases, news coverage, and even academic papers are subject to the same scrutiny you'd apply to anything else.
When a company announces that their AI "surpasses human performance" at some task, the critical questions are: Which humans? Under what conditions? On which specific benchmark? Who funded the study? A model that surpasses human radiologists at detecting one type of cancer lesion in a specific dataset from one hospital is not the same as a model that's better than radiologists in general. But the headline often doesn't say that.
In 2021, researchers at DeepMind, an AI lab owned by Google's parent company Alphabet, published a paper on AlphaFold, an AI system widely hailed as having "solved" the protein folding problem, a decades-long challenge in molecular biology. The claim was celebrated globally. It was also an overstatement: AlphaFold solved one version of the problem for many proteins, but not all, and not under all conditions. Scientists who read the actual paper knew the difference. People who read only the headlines did not.
You now have a complete framework for engaging with AI as a critical user, not a passive receiver. The five habits apply whether you're using an AI for homework, reading a news story about AI, or eventually making decisions in a professional context where AI tools are part of the workflow. Most people who use AI never develop these habits. The fact that you can name them, and know why each one matters, puts you in a genuinely different position from the majority of AI users, at any age.
Across these four lessons, a single thread runs through every story: the gap between what an AI system appears to be and what it actually is. Hallucinations make an AI appear to know things it doesn't know. Bias makes an AI system appear neutral when it carries the weight of historical inequality. Overtrust makes a reliable system appear infallible. And polished marketing makes a product that fails in its edge cases appear to work in all cases.
The skills in this module (identifying hallucination risks, tracing bias to its source in training data, calibrating trust with the four-question framework, and reading AI output with professional habits) are not skills about being skeptical of technology. They're skills about using technology with clear eyes. The people who do the most valuable work with AI are not the ones who trust it most or least. They are the ones who know what it is.
That is a harder position to hold than either "AI is amazing and will solve everything" or "AI is dangerous and we should be afraid." It requires staying curious, staying specific, and never treating the surface of an output as the whole truth. That is what you are now equipped to do.
You've been handed an AI-generated briefing document about climate change policy. Your job is to read it like a professional, applying the five habits from Lesson 4, and identify every place where the critical user habits demand action.
The AI lab assistant below will give you sections of the briefing to analyze. Tell it what you notice, what you'd verify, and what's missing. It will push back on shallow analysis and ask you to go deeper.