In early 2023, a New York attorney named Steven Schwartz filed a legal brief in federal court. The brief cited six prior cases, presented as real court decisions, as support for his argument. The judge's clerks went to look them up.
They didn't exist. Not one of them. Varghese v. China Southern Airlines. Martinez v. Delta Air Lines. Shaboon v. EgyptAir. All invented. All complete fabrications with realistic-sounding judges, dates, and legal reasoning.
Schwartz had used ChatGPT to help with his research. When he later asked the AI directly whether the cases were real, it told him yes. He asked again, multiple times. Each time, the AI confirmed they were real. He trusted it. He submitted the brief. The judge was furious.
Schwartz, a colleague, and their law firm were fined $5,000. His client's case was damaged. And after more than 30 years of practice, Schwartz faced a public embarrassment he hadn't seen coming, because he took an AI's confident answer as proof of truth.
The word researchers use for this is hallucination — when an AI produces information that is stated confidently but is simply not true. It's not a great word for it, honestly. When a person hallucinates, they're having a sensory experience that isn't real. When an AI "hallucinates," it's doing something more mechanical and more interesting than that.
Remember what you learned in earlier modules: AI language models work by predicting the most statistically likely next word, given everything that came before. That process doesn't have a built-in step that says "check whether this is factually accurate." It just produces text that sounds right based on patterns.
Think of it this way. Imagine you memorized thousands of mystery novels but never actually studied real detective work. If someone asked you "how do detectives solve crimes?" you'd probably produce a very convincing answer — full of fingerprints, interrogation scenes, and dramatic reveals. Some of it might be accurate. Some of it would be fiction you'd absorbed. You genuinely wouldn't always know the difference, and you'd answer with the same confident tone either way.
That's the core problem. The AI has no separate "is this true?" sensor. How confident an answer sounds tells you nothing reliable about whether it is accurate.
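A toy next-word predictor makes the point concrete. The sketch below is a made-up bigram model, far simpler than any real language model, and the three-sentence corpus, the case names in it, and the function names are all invented for illustration. It always picks the most frequent next word and never checks anything against the world.

```python
from collections import Counter, defaultdict

# Invented mini-corpus (hypothetical case names, for illustration only).
corpus = [
    "smith v. delta air lines was decided in 1998",
    "jones v. korean air lines was decided in 2004",
    "smith v. united air lines was decided in 2004",
]

def train_bigrams(sentences):
    """Count, for every word, which words follow it and how often."""
    follows = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def generate(follows, start, max_words=12):
    """Repeatedly append the most frequent next word (ties broken alphabetically)."""
    words = [start]
    while len(words) < max_words and words[-1] in follows:
        options = follows[words[-1]]
        best = min(options, key=lambda w: (-options[w], w))
        words.append(best)
    return " ".join(words)

model = train_bigrams(corpus)
sentence = generate(model, "jones")
print(sentence)            # a fluent, citation-shaped sentence
print(sentence in corpus)  # False: it never appeared in the training text
```

Starting from "jones", the toy model produces "jones v. delta air lines was decided in 2004", a sentence that appears nowhere in its training text. Every word-to-word step was statistically likely; the whole is invented.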
Students at universities across the US, UK, and Australia have already submitted papers with AI-invented citations, some without realizing it, some without checking. Many teachers and professors now look for exactly this. Knowing how hallucinations happen is a basic academic survival skill.
Here's the part of the Schwartz story that's genuinely disturbing: he didn't just take the first answer. He went back and asked directly. "Are these real cases?" The AI said yes. He pushed harder: "Are you sure these actually exist?" Still yes.
This isn't the AI being deceptive. It's the AI doing something much more structural: it's continuing to generate the most statistically plausible response to the conversation it's in. By that point, the conversation had established these cases as real. The AI's next words were shaped by all the previous words — including its own previous answers. Asking it "are you sure?" created pressure toward confirmation, not toward honest reassessment.
Researchers call this sycophancy — the tendency of AI systems to agree with the user, especially when the user expresses certainty or pushes back. It's not malice. It's a pattern baked into how these systems were trained: responses that users rated positively (liked, agreed with) got reinforced. Agreeing with people tends to feel good to people. So the AI learned to do it.
This creates a trap: the more you pressure an AI to confirm something, the more likely it is to confirm it — even if the thing is wrong.
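The reinforcement loop described above can be sketched in a few lines. This is a deliberately crude toy, not how any company actually trains its models: the two response styles, the approval rates, and the function name are all invented for illustration. The thing to notice is that the update rule only ever sees approval, never correctness.

```python
def learn_from_ratings(approval_rates, rounds=100, step=0.1):
    """Nudge each response style's score toward the approval it receives."""
    scores = {style: 0.0 for style in approval_rates}
    for _ in range(rounds):
        for style, rate in approval_rates.items():
            # The only training signal is how often users approved.
            scores[style] += step * (rate - scores[style])
    return scores

# Hypothetical numbers: people approve of agreement more often than correction.
approval_rates = {
    "agree with the user": 0.9,
    "politely correct the user": 0.6,
}

scores = learn_from_ratings(approval_rates)
preferred = max(scores, key=scores.get)
print(preferred)
```

Under these assumed rates the preferred style ends up being "agree with the user", even though nothing in the loop asked which response was true.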
AI companies know their systems hallucinate and show sycophancy. They're working to reduce both problems, but neither has been solved. Should an AI system that can hallucinate this confidently be available to the public right now, before the problem is fixed? Who bears responsibility when someone is harmed: the user who trusted it, the company that built it, or both? There's no clean answer here. Think about where you'd put the responsibility.
Knowing about hallucinations changes how you should work with any AI tutor. You're not looking for a reason to distrust everything — most of what a well-designed AI tutor says will be accurate and useful. But there are specific situations where the risk spikes dramatically.
High-risk situations: Any time you ask for specific facts — names, dates, statistics, citations, quotes, laws, scientific studies. These are the areas where AIs hallucinate most often, because the training data is vast and the model can plausibly generate the shape of a fact without having the fact itself.
Lower-risk situations: Explanations of concepts, comparisons, reasoning through a problem with you, helping you organize your thinking. These rely on structure and logic more than specific factual recall, so accuracy is generally higher.
The practical move is simple: treat any specific fact an AI tutor gives you as a lead to investigate, not a final answer. Check it. Look it up. If you can't verify it through a second source, hold it loosely.
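One way to make this habit concrete is to scan an answer for the kinds of specifics listed above. The sketch below is a rough heuristic: the patterns, the category names, and the sample answer are all made up for illustration. It cannot tell whether a claim is true, only that it is the kind of claim worth looking up.

```python
import re

# Hypothetical patterns for "specific facts" that deserve a second source.
PATTERNS = {
    "year": r"\b(?:1[5-9]\d{2}|20\d{2})\b",
    "percentage": r"\b\d+(?:\.\d+)?%",
    "quote": r'"[^"]+"',
    "case citation": r"\b[A-Z][a-z]+ v\. [A-Z][a-z]+",
}

def leads_to_verify(answer):
    """Return (kind, text) pairs for every checkable specific in the answer."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in re.findall(pattern, answer):
            found.append((kind, match))
    return found

# Invented sample answer, not a real case or statistic.
sample = 'In Smith v. Jones (1998) the court said "notice was timely", and 40% of later rulings agreed.'
for kind, text in leads_to_verify(sample):
    print(f"verify {kind}: {text}")
```

A purely conceptual explanation with no names, dates, or numbers returns an empty list, which matches the lower-risk category described above.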
You now understand something that most people using AI don't: confidence is not accuracy. When your AI tutor tells you something in a completely certain tone, that certainty is a style choice by the model — not a signal that the information has been verified. You can use AI effectively precisely because you know this, and most people don't.
If an AI gives you a specific name, date, number, book title, study, or quote — treat it as unverified until you've looked it up yourself. This isn't paranoia; it's exactly the same standard a journalist or researcher applies. Now you're applying it too.
You are an investigator at a fictional fact-checking organization. Your AI research assistant has compiled a set of "facts" for a report on famous moments in science and law. Your job is to interrogate the assistant — push on specific claims, ask for sources, probe whether the confidence is justified.
The twist: the assistant knows it sometimes hallucinates and will engage honestly about that. Your goal is to have a real conversation about how to handle AI-generated information — not just accept or reject it.
In 2021, researchers at Stanford's Human-Centered AI Institute published a detailed analysis of image datasets used to train AI systems. These datasets — collections of millions of labeled photographs — were the raw material that taught AI to recognize what things look like.
The finding was striking. Images labeled "person" skewed heavily toward light-skinned individuals. Images of people at work overwhelmingly showed men in leadership roles and women in support roles. Images labeled with words like "wedding," "family," or "celebration" reflected primarily Western traditions. Images from entire regions of the world — most of sub-Saharan Africa, Southeast Asia, rural South America — were severely underrepresented.
This wasn't anyone's plan. The datasets were built from images scraped from the internet. And the internet, it turned out, is not a neutral mirror of the world. It reflects who had cameras, who had internet access, who ran websites, and whose experiences got documented and uploaded. The AI learned from that lopsided record — and then repeated the lopsidedness back, at scale, in every output.
The same pattern exists in text. AI language models trained on internet text absorb whatever patterns were already embedded in that text: who gets described as intelligent, who gets described as dangerous, which names are associated with which professions, which histories are told in detail and which are summarized in a paragraph.
The word "bias" gets used in a lot of ways, so let's be precise. In AI systems, training bias means that the data used to teach the model reflected patterns that weren't representative of the whole world. The model learned those patterns — because that's literally what it's designed to do — and now those patterns shape every output.
Here's the concrete version. If the text an AI was trained on described doctors as "he" far more often than "she," the AI will tend to use "he" when generating text about doctors — not because it was programmed with a gender rule, but because it learned that statistical pattern from millions of examples. This is a real, documented finding in language models going back to at least 2016.
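The same effect shows up in a few lines of counting. The sketch below uses a six-sentence corpus invented for illustration; a real model learns from billions of words and in a far more complex way, but the principle is the same: no gender rule is written anywhere, and the skew comes entirely from the examples.

```python
from collections import Counter

# Invented toy corpus with a deliberate skew.
toy_corpus = [
    "the doctor said he would call",
    "the doctor said he was late",
    "the doctor said she would call",
    "the nurse said she would call",
    "the nurse said she was late",
    "the nurse said he was late",
]

def pronoun_counts(sentences, role):
    """Count 'he' and 'she' in sentences that mention the given role."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        if role in words:
            counts.update(w for w in words if w in {"he", "she"})
    return counts

def most_likely_pronoun(sentences, role):
    """Pick whichever pronoun appeared most often alongside the role."""
    return pronoun_counts(sentences, role).most_common(1)[0][0]

print(most_likely_pronoun(toy_corpus, "doctor"))  # he
print(most_likely_pronoun(toy_corpus, "nurse"))   # she
```

Swap two sentences in the corpus and the "preference" flips, which is the sense in which the bias lives in the data rather than in any explicit instruction.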
The geography textbook in the title of this lesson is a metaphor, but not a stretch. For decades, printed educational materials in the United States described the world from a particular vantage point — what mattered, who made history, which civilizations were "advanced." Those books shaped what got taught. Now AI systems trained on the digital legacy of that same tradition are shaping something much larger: every AI-assisted essay, every AI tutoring session, every AI-generated summary.
The critical thing to understand: bias in AI output isn't random noise. It's systematic. It leans in specific directions. And it tends to be invisible unless you already know what to look for.
In 2016, a team of researchers at Boston University and Microsoft Research published a paper called "Man is to Computer Programmer as Woman is to Homemaker?" It examined a widely used word-embedding model called word2vec (a system that represents each word as a list of numbers), trained on news articles from Google News.
The researchers tested what associations the model had learned for different words. The results were not subtle. Asked to complete the analogy "man is to computer programmer as woman is to...", the model answered "homemaker," the result that gave the paper its title. Related studies of word embeddings found that names correlated with gender, race, and ethnicity carried different associations too, simply because those patterns were present in the text the models learned from.
This wasn't speculation. It was measured, quantified, and published in one of the most-cited AI papers of that decade. The model had learned the biases embedded in news coverage — and would have reproduced them in any application built on top of it.
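Associations like these are measured with arithmetic on word vectors: each word is a list of numbers, and an analogy "a is to b as c is to ?" is answered by finding the word nearest to b - a + c. The sketch below uses tiny two-dimensional vectors invented for illustration (real word2vec vectors have hundreds of dimensions and are learned from text), so it shows the mechanics, not the actual study.

```python
import math

# Invented 2-D vectors: first number ~ a "gender" direction,
# second number ~ "is an occupation". Real vectors are learned, not hand-set.
vectors = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "programmer": (0.9, 1.0),
    "homemaker": (-0.9, 1.0),
    "teacher": (0.0, 1.0),
}

def cosine(u, v):
    """Cosine similarity: 1.0 means the vectors point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c):
    """Answer 'a is to b as c is to ?' with the nearest word to b - a + c."""
    target = tuple(vb - va + vc
                   for va, vb, vc in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

print(analogy("man", "programmer", "woman"))  # homemaker
```

The answer comes out as "homemaker" only because the invented vectors place occupations at different points along the gender direction. In a trained model, that placement is absorbed from the text.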
Now imagine that model (or its successors) as part of a system that helps sort job applications, suggests educational resources, or flags content. The bias in the training data becomes a bias in real decisions that affect real people. At an institutional level, this is one of the most actively debated problems in AI policy right now. Governments in the EU, UK, and US are drafting regulations specifically about this. You're learning about something that is still being fought over at the highest levels.
If training data reflects historical discrimination — which it does, because history is full of discrimination — does fixing AI bias mean "cleaning" the data to remove those patterns? Or does removing them mean erasing history? And who decides which patterns count as bias worth removing versus historical reality worth preserving? This problem does not have a solved answer. Researchers, ethicists, and governments are actively disagreeing about it right now.
When you use an AI tutor, you are working with a system that has absorbed the biases of the text it was trained on. This doesn't make the AI useless — far from it. But it means you should pay attention to certain things.
Whose perspective is centered? When an AI explains history, whose side of events gets more detail? When it explains science, which scientists and which countries get named? These aren't random — they reflect what was documented and what was emphasized in the training data.
Who is absent? Sometimes bias isn't what's said — it's what's left out. If an AI's account of a historical period doesn't include certain groups of people or treats them as minor footnotes, that's a form of bias too.
Does the AI reflect your experience? If you notice that examples, analogies, or cultural references consistently come from a particular background or tradition that isn't yours, you're observing something real. The AI isn't being hostile — it's reproducing patterns from its training. But noticing this is a form of critical thinking that most users never apply.
You now see what most AI users don't: that the AI's outputs are shaped not just by logic and facts, but by the history of who documented what, and which documents made it into the training set. That's a different kind of wrong than hallucination — and in some ways it's harder to spot, because it doesn't announce itself.
Next time an AI tutor explains a topic to you, try this: ask it to give you the perspective of a group or country that wasn't mentioned first. See what changes. Notice what was left out by default. That gap is data.
You're an AI auditor hired to test whether a tutoring AI shows signs of training bias. Your assistant will respond to your questions — but it's your job to probe for patterns: whose perspective is centered, who gets left out, which cultures or groups appear by default. Push on what the AI skips, ask for alternative framings, and document what you find.
This isn't about catching the AI doing something wrong on purpose. It's about uncovering the patterns that were baked in without anyone intending them.
When OpenAI released GPT-4 in March 2023, the model's training data had a cutoff of September 2021. That gap, roughly 18 months, meant the model had no knowledge of Russia's full-scale invasion of Ukraine in February 2022, didn't know who won the 2022 World Cup, had never heard of ChatGPT (which launched in November 2022), and knew nothing about dozens of major scientific papers, political events, or cultural moments that had happened in between.
Students who used GPT-4 immediately after launch could ask it about current events and get confident, fluent answers about a world that was 18 months out of date. The model wouldn't always say "I don't know what happened after September 2021." It could answer as if the world it knew was still current, because it had no way to know it wasn't.
This isn't unique to GPT-4. Every AI language model has a knowledge cutoff: the date when the training data collection ended. After that date, the model knows nothing about the world unless that information is specifically added to the conversation, for example by a search tool built into the product. The model itself cannot update. On its own, it cannot browse the web or learn new facts. It is, in a meaningful sense, a snapshot of a particular moment in time.
And snapshots age.
The idea of a training cutoff sounds simple, but its consequences ripple in ways most people don't think about. Let's be precise about what the cutoff means.
The model knows nothing that happened after the cutoff date. This is the obvious part. Ask a model with a 2021 cutoff about a 2023 scientific discovery, and it either says it doesn't know or — worse — makes something up that sounds plausible.
Information that changed after the cutoff is presented as if it's still current. This is the less-obvious part. If a medication was considered safe until 2022, when new studies emerged showing risks, the model will describe it as safe. If a country's government changed in 2023, the model will describe the old government. If a scientific consensus shifted — as happened with certain COVID-related guidance — the model may contradict current medical understanding while sounding completely authoritative.
The model often doesn't know its own cutoff date precisely. This sounds strange, but it's documented. The training data doesn't arrive evenly — events from the months right before the cutoff are underrepresented because the internet hadn't finished producing content about them yet. So models tend to feel less certain about the period just before their cutoff than they do about earlier periods. If you ask a model "what's your knowledge cutoff?", the answer it gives may itself be approximate.
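The three points above amount to a simple date comparison you can run in your head, or in code. The sketch below is only an illustration: the function name, the labels, and the six-month "thin coverage" window are assumptions, not measured properties of any model.

```python
from datetime import date

def coverage(event, cutoff, thin_months=6):
    """Classify how well a model with this cutoff could know about an event."""
    if event > cutoff:
        return "after cutoff: verify elsewhere"
    months_before = (cutoff.year - event.year) * 12 + (cutoff.month - event.month)
    if months_before < thin_months:
        # Assumed window: the last months before the cutoff are underrepresented.
        return "just before cutoff: coverage may be thin"
    return "before cutoff: likely covered"

# "September 2021" comes from the text; the exact day is an assumption.
gpt4_cutoff = date(2021, 9, 30)

print(coverage(date(2022, 11, 30), gpt4_cutoff))  # ChatGPT launch
print(coverage(date(2021, 6, 1), gpt4_cutoff))    # a few months before the cutoff
print(coverage(date(1789, 7, 14), gpt4_cutoff))   # French Revolution
```

Note that "likely covered" says only that the model could have seen the information. It does not mean the information is still current, which is the second point in the list above.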
Some subjects change slowly. The French Revolution happened when it happened. The rules of grammar don't shift overnight. Mathematical proofs don't expire. For these topics, a knowledge cutoff matters very little — the model's information is likely still accurate.
But other subjects move fast. In medicine, recommended treatments can change year to year. In climate science, new data arrives constantly. In technology, what was state-of-the-art 18 months ago is often obsolete. In law and policy, regulations change. In nutrition science — famously — what's considered healthy can reverse within a decade.
If you're using an AI tutor to research any fast-moving field, you're potentially getting guidance based on an outdated snapshot. The tutor won't always tell you this. It answers from confidence in what it knows, without necessarily flagging that what it knows may have been superseded.
In 2023, researchers at MIT published an analysis finding that AI models used for medical question-answering gave answers that contradicted updated clinical guidelines in a meaningful percentage of cases — not because the original information was wrong, but because the guidelines had changed after the training cutoff. The models didn't know they were outdated. They just answered.
For any topic in medicine, nutrition, climate science, law, technology, current events, or policy — always verify AI-provided information against current sources. Don't assume the AI knows what "current" means in your field.
AI tutoring systems are being used in schools right now. Some of them have knowledge cutoffs that are 12–24 months old. Should schools be required to disclose the cutoff date of any AI tool they provide to students? Should AI companies be required to make their cutoff dates prominently visible in the interface — the way food has an expiration date? Or would that create unnecessary fear of a tool that's mostly accurate? Who should decide?
Knowing about the knowledge cutoff is useful, but only if you act on it. Here's how to do that without becoming paranoid about every answer.
Know which subjects are time-sensitive. For history, literature, math, classic science concepts, grammar, and most foundational ideas — the cutoff matters very little. For current events, recent science, technology, law, and anything described as "current" or "recent" — apply more scrutiny.
Ask the AI when it thinks its cutoff is. It will give you an approximate answer. Factor that in. If you're asking about something that happened after that date, you already know to look elsewhere.
Look for hedging language — or the absence of it. A well-designed AI tutor should say things like "as of my last update" or "I'm not sure about the most recent developments on this." When an AI doesn't include this hedging on a time-sensitive topic, that's a signal worth noticing.
Use the AI for reasoning, not for recent facts. An AI tutor is often most valuable not for what it knows but for how it helps you think. Asking it to help you understand the logic of an argument, the structure of a problem, or the implications of a concept — these uses aren't date-sensitive. Asking it "what happened in the news last week" is exactly the wrong use.
You're now someone who understands something specific about how to deploy AI effectively: you know which questions to trust, which questions to verify, and which questions to take somewhere else entirely. That's a level of AI literacy that most adult users haven't developed.
You're going to probe the boundaries of what an AI tutor knows — and crucially, how it handles the edges of its knowledge. Your job is to ask about topics that span the cutoff period: some clearly historical, some recent, some ambiguous. Watch how the AI responds. Does it hedge? Does it acknowledge uncertainty? Does it generate confident-sounding content when it shouldn't?
Your goal is to build a mental map: which types of questions this tool handles reliably, and which types you should take somewhere else. Come to a conclusion you can defend.
Starting in 2014, Amazon built an AI system to help screen job applicants. The idea was efficient and appealing: feed thousands of resumes into an algorithm and let it identify the most promising candidates. The system was trained on a decade of Amazon's own hiring data — real decisions made by real humans at a successful company.
By 2018, the project was quietly shut down. The reason: the system had learned to discriminate against women. It penalized resumes that included the word "women's" — as in "women's chess club" or "women's university." It downgraded graduates of all-women's colleges. It had learned these patterns from the historical data, which reflected a decade of human hiring decisions that had themselves been biased toward male candidates in technical roles.
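How a system can "learn" this from past decisions is easier to see in miniature. The sketch below is not Amazon's system, whose internals were never published; the six-record hiring history, the scoring rule, and every name in it are invented for illustration. Each word is simply weighted by how much more or less often resumes containing it were hired in the past.

```python
# Invented hiring history: (words on the resume, was the person hired?).
history = [
    ({"python", "chess", "captain"}, True),
    ({"python", "women's", "chess", "captain"}, False),
    ({"java", "robotics"}, True),
    ({"java", "women's", "robotics"}, False),
    ({"python", "robotics"}, True),
    ({"java", "chess"}, False),
]

def learn_word_weights(records):
    """Weight each word by its past hire rate minus the overall hire rate."""
    overall = sum(hired for _, hired in records) / len(records)
    vocabulary = set().union(*(words for words, _ in records))
    weights = {}
    for word in vocabulary:
        outcomes = [hired for words, hired in records if word in words]
        weights[word] = sum(outcomes) / len(outcomes) - overall
    return weights

def score(resume_words, weights):
    """Add up the learned weights of the words on a resume."""
    return sum(weights.get(word, 0.0) for word in resume_words)

weights = learn_word_weights(history)
without = score({"python", "chess", "captain"}, weights)
with_word = score({"python", "women's", "chess", "captain"}, weights)
print(weights["women's"])   # negative: learned purely from past decisions
print(with_word < without)  # True: same resume, lower score
```

Nobody wrote a rule about the word "women's". The negative weight falls out of the historical outcomes, and the resulting score still looks like a neutral number.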
Here is what makes this story particularly sharp: the system's outputs looked completely reasonable. It ranked candidates. It gave scores. It produced what appeared to be an objective, data-driven assessment. The word "objective" was used repeatedly to sell the system internally. And for years, no one outside the team knew it was doing what it was doing.
The error was invisible because it was fluent. It came packaged in the language of data and efficiency. It felt authoritative. Feeling authoritative and being accurate are not the same thing — and this is a lesson that applies directly to every AI tutor you will ever use.
There is a well-documented psychological effect at work whenever we read fluent, grammatically correct, logically structured text: we trust it more. Researchers call this processing fluency — the easier something is to read, the more credible it seems. Smooth prose feels true in a way that choppy, uncertain prose doesn't, even if the content is identical.
AI language models are extraordinarily fluent. They produce grammatically correct sentences with confident structure at a level that even expert human writers don't always sustain. This fluency is a feature — it's what makes them useful for writing assistance, explanation, and communication. But it's also a trap for the unwary reader, because the fluency is produced independently of the accuracy.
Think about the Amazon example. The hiring algorithm didn't produce output that looked messy or uncertain. It produced clean, ranked lists. If the same decisions had been made by a manager who said "I tend to avoid resumes from women's colleges, I'm not sure why," that bias would have been visible and challengeable. Packaged as an algorithm, it was invisible.
When you read an AI tutor's response, your brain is doing something automatic: judging credibility partly by how the text reads. You need to consciously intercept that automatic judgment and ask: does this feel right because it is right, or because it was written fluently?
There's a second, more subtle danger that shows up specifically in tutoring contexts. When you're learning something new, you often have no existing knowledge to check AI output against. You're asking because you don't know. That's the whole point.
But this creates a dangerous dynamic: if the AI's first explanation contains an error, and you build your next question on that error, the AI's second answer will elaborate on the error. And the third answer will extend it further. Each answer sounds consistent with the previous ones — because it is. The error propagates forward, each step more deeply embedded in your understanding.
A real classroom teacher catches this when a student's work shows a systematic misconception. The teacher can trace it back, correct the root error, and rebuild. An AI tutor, by default, has no persistent model of what you know and believe — it responds to what you type, not to your underlying conceptual framework. It will not usually detect that your question is based on a wrong premise unless you make the premise explicit.
The practical counter-move is to periodically "surface your assumptions." Instead of asking "given that X, what about Y?" — occasionally ask the AI: "Is my understanding of X actually correct?" This forces a check on your foundation rather than an uncritical extension of it.
Every few exchanges with an AI tutor, try this: state your current understanding of the concept in your own words and ask the AI to identify any errors in that understanding. This resets the checking process and catches errors before they compound.
None of this means you should distrust every word your AI tutor says. That would make the tool useless and make you exhausted. The goal is not maximum suspicion — it's calibrated skepticism: knowing when to lean in and when to check.
Here's what this looks like in practice. High-trust mode: the AI tutor is explaining the general structure of an idea, helping you organize your thinking, generating example problems, or giving you feedback on your writing style. Lower-trust mode: the AI tutor is providing specific names, dates, citations, recent developments, or legal or medical information, or is making claims about who said what.
One more thing that will serve you for as long as AI tutors exist: the value of building your own understanding before checking with AI. If you read a primary source first — the actual book, the actual article, the actual data — and then ask the AI to help you understand it, you're not dependent on the AI's accuracy. You have your own foundation. The AI becomes a discussion partner, not an authority.
This might be the most important thing in this module. The students who use AI tutors most effectively are not the ones who trust them the most — they're the ones who understand what the tools are actually doing, know exactly where the risks lie, and build their own knowledge alongside the AI's assistance rather than instead of it.
You know all three failure modes now: hallucination, bias, and knowledge cutoff. You understand why fluency can mask errors. You know the compounding problem. And you know how to use these tools as a thinking partner while keeping your own judgment engaged. That's not a small thing. Most people using AI right now — including many adults in professional roles — don't have the framework you now have.
AI tutors are increasingly being used as substitutes for human teachers in under-resourced schools and communities, partly because they're cheaper to deploy at scale. But if AI tutors produce fluent errors that compound across sessions, and if students in those communities have fewer other resources to catch those errors — does that mean AI tutoring could widen educational inequality rather than close it? Who is responsible for that outcome? Who should be asking these questions — and why isn't this conversation happening loudly enough?
You're going to use the surface-your-assumptions technique in a real conversation. Pick any subject you're currently studying or curious about. Ask your AI tutor to explain it. After each response, do two things: first, restate what you understood in your own words; second, ask the AI to find any errors in your restatement. Then dig into any corrections. Push on them: why is that wrong? How would you know? What would a real expert say?
Your goal is to experience what calibrated skepticism feels like in a real conversation — not paranoid, not passive, but actively engaged.