In 2023, a lawyer named Steven Schwartz filed a legal brief in a New York federal court. The brief cited six prior court cases as evidence, the kind of thing lawyers do every day to support their arguments. But when the judge asked to see those cases, something strange happened.
None of them existed.
Not one. The case names sounded real. The rulings sounded plausible. The citations had the right format. But they were entirely made up, generated by ChatGPT, which Schwartz had used to help research the brief. When he asked the AI if the cases were real, it told him yes. It was wrong. Schwartz had not verified the citations himself.
The judge sanctioned him, meaning Schwartz faced formal punishment and public humiliation. The incident made headlines around the world. And it raised a question that nobody had a clean answer to: Who was responsible? The lawyer who trusted the AI? The AI that confidently invented fake citations? Or the legal system that hadn't yet figured out rules for any of this?
There's a specific name for what ChatGPT did to Steven Schwartz. It's called a hallucination, and it's one of the most important things you can understand about how AI language models work.
Here's the concrete picture: imagine you ask a friend to name the capital of a country they've never studied. They don't want to say "I don't know," so they guess, but they say it with total confidence, as if it's obvious. That's roughly what a hallucinating AI does. It generates text that sounds correct based on patterns it learned, even when the specific fact is wrong or invented.
AI language models (like ChatGPT, Claude, and Gemini) are trained on massive amounts of text. They learn to predict what word comes next in a sentence, over and over, billions of times. They get extremely good at producing plausible-sounding text. But "plausible-sounding" is not the same as "true." On its own, the model doesn't look things up. It doesn't check a database of real court cases. It generates what a court case citation would look like. Sometimes that's right, and sometimes it's entirely fictional.
The reason this matters so much is that AI hallucinations don't come with warning labels. A hallucinated fact sounds exactly like a real fact. The same confident tone. The same sentence structure. No asterisk, no "I'm not sure," no flicker of hesitation, unless the model is specifically designed to express uncertainty, which many are not.
After the Schwartz story, it would be easy to conclude that AI is just dangerous and should be avoided. That would be the wrong takeaway. The real lesson is more precise: AI helps learning in some situations and undermines it in others, and the difference isn't random.
Here are the situations where AI genuinely helps:
Explaining concepts in a different way. If your teacher explains something and it doesn't click, asking an AI to rephrase it, or explain it using a different analogy, can be legitimately useful. The AI isn't creating facts here; it's just reorganizing ideas you can verify.
Brainstorming and generating options. When you need a list of ideas, angles, or possibilities (for an essay topic, a project approach, an argument structure), AI is a solid brainstorming partner. It won't do your thinking, but it can unstick you when you're blank.
Practicing skills with instant feedback. Using AI as a conversation partner for a language you're learning, or to simulate a debate, or to practice explaining a concept out loud: these are high-value uses because the process of practicing is what builds the skill, and AI can be an infinitely patient sparring partner.
Summarizing long texts you've already read. If you've read a chapter and want a condensed version to review before a test, asking AI to summarize it gives you a study tool. The key phrase is "texts you've already read": the summary helps you remember what you read; it does not replace the reading.
In every "helps" case above, the human is still doing the hard cognitive work: comparing, deciding, practicing, remembering. AI is amplifying the process, not replacing it.
The flip side is just as clear, and it's where most students (and, apparently, lawyers) get into trouble.
Using AI output as a finished product. When you copy an AI's essay, explanation, or answer and submit it as your work, you've skipped the part that builds your brain. The thinking process (the struggling, the drafting, the revising) is where learning happens. AI output bypasses all of that. You get a grade; you build nothing.
Trusting AI for facts without checking. This is the Schwartz problem. AI is not a search engine. It doesn't retrieve facts from a database; it generates text that sounds like facts. Anything specific (dates, names, statistics, citations, scientific claims) needs to be verified against a primary source.
Using AI to avoid the hard part of a subject. Math is a good example. If you use AI to solve every problem, you never build the mental model that lets you solve the next harder problem. There's a difference between getting the answer and understanding the method. AI hands you the answer and skips the method.
Becoming dependent on AI prompts to generate your own opinions. When you always ask AI "what should I think about this?" before forming your own view, you're training yourself to outsource your reasoning. That's a habit that compounds: the longer you do it, the weaker your independent thinking gets.
Schwartz's firm argued that the AI's confident wrong answers were a kind of deception. The judge disagreed and held the lawyers responsible. But here's the real tension: if AI systems are designed to sound confident even when wrong, and if users reasonably trust them, who carries the moral weight of the error: the person, the company that built the AI, or both? There's no agreed-upon answer. The law is still being written.
Most people who use AI have a rough intuition that "it might get things wrong sometimes." You now have something more precise than intuition. You know the mechanism: what hallucination is, why it happens, and what categories of tasks expose you to it versus what tasks route around it. That's a real edge.
When a classmate says "I asked ChatGPT and it said X" as evidence in an argument, you're now equipped to ask: "Is X the kind of claim AI generates reliably, or the kind it makes up?" That's not skepticism for its own sake. That's information literacy, and it's increasingly rare.
The rule of thumb going forward: AI is a thinking tool, not a knowing tool. It helps you think. It doesn't reliably know. Keep that distinction active every time you use it.
AI helps learning when you're doing the thinking and AI is the scaffold. AI undermines learning when AI does the thinking and you're just the delivery mechanism.
You're going to act as an auditor, someone whose job is to find where AI gives unreliable output. Your lab partner (the AI below) plays the role of a knowledgeable peer who will push back on your reasoning, not just agree with you.
Work through the scenario together. You'll need to take a position and defend it.
In 2023, researchers at Harvard's Graduate School of Education published a paper on what they called the "fluency illusion" in AI-assisted learning. They had run a study in which students used AI tools to help them understand a difficult physics concept: specifically, how objects move in the presence of gravity. The AI explanations were genuinely good. Clear. Well-organized. Students who read them reported feeling confident they understood the material.
Then the researchers tested them. The students who had only read the AI explanations, without working through problems themselves, scored significantly lower than students who had struggled through problems with minimal AI help. But here was the unsettling part: the AI-assisted students predicted they would score higher. They felt more confident but performed worse.
The researchers called this the fluency illusion: when information is presented clearly and smoothly, your brain mistakes the ease of reading for the depth of understanding. AI, which is extraordinarily good at presenting information clearly, may be especially good at triggering this illusion.
Here's the concrete anchor: think about learning to ride a bike. You could read a perfect, detailed explanation of how balance works, how to shift your weight, when to pedal harder. Understanding the explanation is easy. Getting on the bike and not falling is a different skill, and reading alone doesn't build it.
Cognitive science, the study of how minds work, has known for decades that retrieval practice (pulling information out of your memory) builds stronger memory than re-reading (putting information in again). Struggling with a problem activates different brain processes than reading a solution. The struggle literally changes how your brain stores the information.
AI-generated explanations are seductive because they remove the struggle. That's what makes them feel helpful. But the struggle wasn't the problem; it was the process. When AI removes it, the learning often goes with it.
The fluency illusion shows up in a predictable set of situations. Knowing them gives you a practical map for where to be careful:
Using AI summaries to "study" before a test. If you ask AI to summarize a chapter you haven't fully read, the summary feels like studying. It isn't. You're re-reading organized information in a form that's easy to follow. You haven't tested whether you can recall any of it on your own. The test will ask you to recall, not follow.
Asking AI to explain why you got a problem wrong. This one is nuanced. Reading an AI explanation of why you got a math problem wrong can feel like learning. But if you don't then do three more similar problems yourself, you've only understood the explanation; you haven't rebuilt the method. The explanation replaced the repair work.
Using AI to draft an essay, then "improving" it. When you edit an AI draft, you're reacting to someone else's argument structure. You're not building your own. This matters because writing an essay is an exercise in constructing a logical argument, and that construction process is what develops your thinking. Editing an existing structure mostly develops your editing.
If a student uses AI explanations to pass a test (meaning the grade accurately reflects that she understood the material) but she doesn't retain the knowledge afterward, has she learned? Schools grade the test, not what you remember two months later. Does that mean the grade is dishonest? Or just that grades measure the wrong thing? There's no consensus here.
The goal isn't to avoid AI; it's to use it in ways that keep the brain-building work in your hands. Here are strategies that researchers have found actually work:
Try first, then consult AI. Attempt the problem, the essay, the explanation on your own before asking AI anything. Your attempt, even if wrong, primes your brain to learn from the feedback in a way that jumping straight to AI doesn't. This is called the "generation effect" and it's been documented since the 1970s.
Use AI as a quiz generator, not an answer generator. Instead of asking "explain X to me," ask "give me five questions to test whether I understand X." Then answer the questions without looking. That's retrieval practice, and it's dramatically more effective than re-reading.
Explain back to AI. Write your own explanation of something you just learned, then ask the AI to identify gaps or errors. You're doing the hard work of generating the explanation; AI is checking it. This flips the dynamic in the right direction.
Most students who use AI to "study" don't know about the fluency illusion. They genuinely believe the good feeling of following a clear explanation means they've learned. You now understand the mechanism well enough to not fool yourself. That's a real cognitive advantage.
You've been asked to advise a student who's preparing for a biology exam on genetics, a topic with a lot of new vocabulary and conceptual links. She's planning to spend two hours asking AI to explain each concept until she "gets it."
Your job: tell your lab partner what's wrong with her plan and propose a better one. The AI will challenge you to be specific; vague advice won't cut it.
In 2023, several school districts in Georgia ran a controlled pilot with an AI writing tutor. Students who struggled with essay writing were split into two groups. One group had access to an AI that would give detailed, specific feedback on every paragraph: what to cut, what to expand, how to restructure sentences. The other group received only high-level prompts: "What do you think your reader needs to know here? Is this your strongest argument?"
After eight weeks, both groups had improved essays. But then the researchers took the AI away. Both groups were asked to write a new essay cold, with no help. The group that had received detailed, specific AI feedback regressed sharply: their cold essays were only marginally better than their originals eight weeks earlier. The group that had received only guiding questions continued to improve. Their cold essays were significantly stronger.
The researchers' interpretation: detailed AI feedback taught students to respond to corrections. Guiding questions taught students to think. Only one of those skills transfers when the AI is gone.
There's a useful word from education theory here: scaffolding. Construction scaffolding holds up a building while it's being built; then it comes down, and the building stands on its own. Good educational scaffolding works the same way. It supports you during the learning process and is designed to be removed.
The problem with very detailed AI feedback is that it functions as load-bearing scaffolding that never gets removed. Every paragraph gets fixed for you. Every weak sentence gets replaced. You become skilled at accepting corrections, not at generating good writing from scratch.
Compare it to learning to drive. If your instructor grabs the wheel every time you drift slightly, you never develop the reflexes to correct yourself. You become good at car rides, not driving. The Georgia study showed this exact dynamic in writing.
The Georgia finding isn't equally true in every subject. Understanding where AI help builds transferable skill versus where it creates dependency loops is genuinely useful.
Writing and argumentation: High dependency risk. The Georgia study shows why. The skill of writing is inseparable from the process of deciding what to say and how to structure it. If AI makes those decisions, you're not practicing the skill; you're practicing acceptance of decisions.
Mathematics: Very high dependency risk. Math is a sequence of learned methods. If AI solves each step, you never build the method. You can't "use" a math method you've only watched; you have to practice it until it's automatic. Using AI to check your work after you've done it: fine. Using AI to show you the work: not fine.
Reading comprehension: Medium risk. Using AI to summarize a text you haven't read replaces the reading entirely. Using AI to check your interpretation after you've read builds on your work. The outcome depends completely on whether you do the reading first.
Research and information gathering: Low risk if you verify. AI can help you find directions to look, questions to ask, and background context. This amplifies your research process rather than replacing your thinking, as long as you verify claims against primary sources.
If a school adopts an AI tutor that improves student grades but secretly reduces their long-term capability, and the school doesn't know this yet, is the school doing something wrong? The administrators are trying to help. The students are happier. The grades are better. But the skill development is being hollowed out. Who is responsible for checking whether that's happening? The school? The AI company? Researchers? Parents? There's no institution currently required to answer this question.
The Georgia experiment gives us a concrete design principle: prefer AI interactions that ask you questions over ones that give you answers.
In practice, this means you can redesign how you prompt AI. Instead of "fix this paragraph," try "what question would a skeptical reader ask about this paragraph?" Instead of "rewrite this sentence to be clearer," try "what is unclear about this sentence, and why?" The AI's response gives you information, but the work of fixing it stays yours.
This also means you should pay attention to how you feel after an AI interaction. If you feel like the work got better but you didn't do anything, that's a warning sign. If you feel like you worked hard and the AI helped you see something you missed, that's healthy scaffolding.
The long-term question worth sitting with: skills you're building right now (writing, analyzing, arguing, problem-solving) are going to matter for decades. Any tool that improves your grades in the short term but atrophies those skills is making a trade you haven't agreed to. Knowing this changes how you read every headline about "AI improving student outcomes." The question to ask is always: outcomes measured how, and over what time period?
When you see a headline like "AI tutor improves student test scores by 22%," you now know the right follow-up question: "What happens to those students' performance when the AI is removed?" That question is almost never in the headline. You know to ask it.
A school board is deciding whether to adopt an AI writing tool for middle school students. The tool works like this: a student submits a paragraph, and the AI automatically rewrites it with corrections highlighted, showing the "before" and "after" side by side. Students can accept or reject each change.
Your job: advise the school board. Your lab partner will challenge your reasoning from multiple angles, and will not let you give a simple yes/no without defending your position with specific mechanisms.
On January 3, 2023, the New York City Department of Education became one of the first major school systems in the United States to officially ban ChatGPT from its networks and devices. The reasoning was straightforward: students could use it to cheat on assignments, and it might harm their critical thinking development. The ban applied to all NYC public schools, over a million students.
By May 2023, the ban was quietly being walked back. By August 2023, the NYC DOE reversed its position almost entirely, announcing a new pilot program to explore how AI could be used in classrooms, rather than whether it should be blocked. They published a report acknowledging that the ban had been both ineffective and potentially counterproductive: students were accessing the tools on personal devices anyway, and teachers who wanted to use AI responsibly had been blocked from doing so.
What's notable about this isn't that the city changed its mind. It's that the reversal happened in under eight months, an extremely fast policy cycle for a bureaucracy as large as NYC's school system. The speed suggests how unprepared institutions were, and still are, for questions that this technology forces into the open.
The NYC reversal isn't unique. In 2023, similar policy whiplash happened at universities and school districts across the United States, Australia, and the United Kingdom. The pattern was consistent: ban, observe that the ban doesn't work, reverse the ban, attempt a more nuanced policy.
The core problem is that most rules about AI in education are trying to answer a question that hasn't been properly defined yet: what counts as using AI appropriately?
This is harder than it sounds. Compare four students. Student A asks AI to write her entire essay. Student B asks AI to explain a concept she's confused about, then writes the essay herself. Student C asks AI to brainstorm counterarguments, argues against them herself to sharpen her thinking, then writes the essay. Student D uses an AI grammar checker to catch typos before submitting. Are all of these "using AI"? Which ones are acceptable? Most current school policies haven't drawn these lines clearly, partly because the technology moved faster than anyone was prepared to think about it.
Policies about AI in schools are being written right now, at school boards, state education departments, and national governments. The people writing those policies largely did not grow up with these tools. The people who will live under those policies, for the next decade or more, are currently in school. That asymmetry has real consequences for who understands what they're deciding.
Let's be direct about something: the academic integrity question is genuinely hard, and the adults in the room often don't have a good answer either.
The traditional argument for not using AI to write your essays is that grades are meant to represent your understanding, not an AI's. Submitting AI work as your own is a misrepresentation: it's a false signal to whoever reads your transcript about what you can do. That argument has real weight.
But consider the counterargument: students have always used tools, including calculators, spell-checkers, Grammarly, Wikipedia, even tutors who heavily guide revision. Each new tool prompted the same debate. The line between "tool" and "doing the work for you" has always been fuzzy and contested.
The most honest position: the rules vary by context and keep changing. Your teacher's policy is the operative rule in their classroom, and it's worth knowing specifically. But underneath the rule is a principle that's more stable: the purpose of schoolwork is to develop your ability to think, not to produce outputs. Any tool use that bypasses that development is cheating yourself as much as it's cheating the system, regardless of what the current rule technically allows.
If two students produce identical-quality work, one by thinking hard for two hours and one by editing an AI draft for twenty minutes, and both get an A, is that fair? The students develop different capabilities. The grades don't reflect this. Schools are currently unable to measure what matters most. Should they try harder? Or does that level of surveillance create worse problems? Reasonable people genuinely disagree.
Here is what you can actually control, even while institutions are still working this out:
You can decide what AI use means for your own development. The external rule matters, but the internal question matters more: is this use building my thinking, or substituting for it? That question is yours to answer every time, regardless of what the policy says.
You can pay attention to your own dependency. Notice when you feel uncomfortable starting something without AI. Notice when you can't explain an AI-generated answer in your own words. These are signals that the tool is running ahead of your understanding.
You can become someone who can operate with and without AI. The most professionally valuable skill in the next decade won't be "knowing how to use AI"; everyone will know that. It will be being a strong thinker who can also use AI effectively. The first part requires practicing without the tool, not just with it.
The NYC Department of Education went from ban to embrace in eight months because they didn't have a clear framework for what they were actually trying to protect. You now have a framework. That's not a small thing: it means you can navigate this more intelligently than the policy debates that are still happening around you.
The institutions making rules about AI in schools right now are doing so largely without a clear model of how AI affects learning. You've spent this module building exactly that model: what AI does well, what it undermines, what dependency looks like, and what healthy use looks like. You're more informed than most of the people writing the policies. That's not arrogance; it's a responsibility to use that understanding well.
You've been asked to draft a one-paragraph AI use policy for a middle school. The policy needs to be specific enough to be useful, fair enough to be defensible, and grounded in what you now know about how AI affects learning.
Your lab partner will play the role of a skeptical school board member who will challenge vague language, push for evidence behind your claims, and ask whether your policy actually protects learning or just looks like it does.