The school year had barely resumed after winter break when David Banks, Chancellor of New York City Public Schools (the largest school district in the United States, serving over one million students), issued a network-wide block. As of that first week of January 2023, ChatGPT was cut off from every school WiFi network and every school device in NYC's 1,600-plus schools. The decision was announced quietly, almost as if speed were the goal rather than deliberation.
The reasoning was straightforward: students might use it to cheat. The solution was equally straightforward: block access. Within days, education reporters were calling it the first major US school district ban on AI. Within weeks, critics, including many of the district's own teachers, were pointing out that the block did almost nothing. Students still had phones on cellular data. The tool was still reachable from any home network. The ban had blocked the school's WiFi; it had not blocked ChatGPT.
By May 2023, fewer than five months later, New York City reversed course entirely. The same administration that issued the ban announced a new initiative: a pilot program to explore how AI could be used responsibly in classrooms. Chancellor Banks told reporters the initial ban had been "wrong" and "the knee-jerk reaction." The city that had moved fastest to block AI was now moving to teach students how to use it.
Here's the first thing you need to understand about the NYC story: the ban was not irrational. The people who made the decision weren't foolish. They were responding to a real problem (students submitting AI-generated work as their own) with the fastest tool they had available, which was a network filter. It's the same logic a school uses when it blocks gaming sites during class hours. Block the distraction, restore focus.
The difference is that a gaming site and a general-purpose AI tool are not the same kind of thing. You can block a gaming site and students lose access to that game. You block ChatGPT on the school network and students switch to their phone's data connection in about four seconds. The tool is too accessible, too embedded in ordinary life outside school walls, for a network block to meaningfully restrict it. What the block actually communicated was not "you may not use AI." It communicated "we, your school, are not going to help you navigate this."
That gap, between what a rule signals and what it actually achieves, is the central design problem of any classroom AI policy. A rule that cannot be enforced is not a policy. It is a statement about what the administration wishes were true.
The New York City situation sits at the exact boundary between these two things. The January decision was a rule: "block it." The May reversal was the beginning of a policy: "figure out how to integrate it responsibly." The first was fast and felt decisive. The second was slower and required actually thinking through what responsible use means in a classroom.
While NYC was reversing its ban, something else was happening across the country that didn't make as many headlines. Individual teachers, without waiting for district guidance, were already building their own classroom-level approaches. Some were banning AI use outright in their specific assignments, redesigning assessments so they required things AI genuinely couldn't produce: in-class handwritten responses, oral defenses of written work, documented revision processes. Others were doing the opposite: explicitly building AI into assignments, requiring students to submit both their AI-assisted draft and a written reflection on what the AI got wrong or missed.
Neither of these teacher-level decisions appeared on any official policy document. They were responses to a gap: the school or district hadn't decided what to do, so individual teachers made judgment calls. This is normal in education. Teachers constantly make micro-policy decisions in their classrooms. The difference in 2023 was that those decisions now had major implications for whether students were developing skills or bypassing them, and whether a student's work could be fairly graded against a peer's work made with different tools.
If one teacher bans AI use and another allows it, students in different classrooms are developing different skills and learning different norms, even within the same school. A policy that exists only at the classroom level creates a lottery: your relationship with AI tools depends on which teacher's room you end up in.
This is why designing AI policy is a genuinely hard problem. It's not hard because the technology is complicated. It's hard because the decisions interact with fairness, with what counts as authentic learning, with what it means to develop a skill, and with who gets to decide all of the above. Those questions don't have clean answers, and anyone who tells you they do is probably selling something.
Before you can design something, you have to know what problem you're solving. The phrase "AI-safe classroom" sounds like it means "a classroom where AI is blocked." But after the NYC story, you should be skeptical of that framing. Blocking AI doesn't make a classroom safe from AI's effects; it just makes the school blind to how students are actually working.
A better definition: an AI-safe classroom is one where students know what AI can and can't do, understand why certain uses undermine their own development, and have clear norms that make it possible for honest work to be recognized as honest. That's a lot more complex than a network filter. It requires students to understand something, not just comply with something.
Most of the people writing AI policies for schools right now are focused on detection: how do we catch students using AI? You can now see why detection is the wrong frame. Detection is reactive. A real policy is proactive: it defines what honest work looks like before the assignment starts, not after suspicion arises. Knowing this puts you ahead of most of the current policy debate.
Over the next three lessons, you're going to build that understanding piece by piece. We'll look at what "academic honesty" actually means in an AI era, because the traditional definition doesn't quite fit anymore. We'll examine how other institutions (universities, law firms, newsrooms) have tried to write AI policies, and what worked versus what fell apart. And in Lesson 4, you'll design a complete policy framework of your own.
The ethical question you should be sitting with as you move forward: If a student uses AI to help organize their ideas but writes every word themselves, have they cheated? What if the AI suggested which ideas to include? What if it wrote the transitions? At what point does help become replacement, and does that point move depending on the subject, the grade level, or the purpose of the assignment? There is no universal answer. But the policy you design will have to pick a position.
Your school's principal has shared a draft AI policy and asked you to poke holes in it before it goes to the school board. You've been brought in because you've studied how the NYC ban played out. The AI assistant here is a fellow student auditor: skeptical, direct, and not going to accept vague answers.
The draft policy reads: "Students may not use AI-generated text in any submitted schoolwork. Violations will be treated as academic dishonesty."
In the spring of 2023, a high school student in the Boston suburb of Wellesley, Massachusetts submitted a five-paragraph essay for an AP English class. The teacher ran it through Turnitin's AI detection feature, which had been released just weeks earlier, in April 2023. The detection software flagged the essay as 94% likely to have been AI-generated. The teacher reported it as academic dishonesty. The student was called into the principal's office.
The student insisted the essay was theirs. They had used ChatGPT, they said, but only to help them organize their outline. Every sentence in the submitted draft had been written by the student themselves. The AI had helped them figure out what to say, but not how to say it. The teacher said the detection score spoke for itself. The student's parents hired a lawyer.
Cases like this one played out in variations across the country throughout 2023. By August of that year, Turnitin itself acknowledged that its AI detection tool had a false positive rate of roughly 4%, meaning that out of every hundred human-written essays it checked, about four would be wrongly flagged as AI-written. In a school where a thousand students each submit their own honest work, that's about forty wrongly flagged students per submission cycle. The company recommended that the tool be used as one signal among many, not as definitive proof. Entire school districts had already been using it as definitive proof.
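The arithmetic behind a false positive rate is worth working through once. A false positive rate is the share of honest, human-written essays that a detector wrongly flags. The Python sketch below computes the expected number of wrongly flagged essays and, under purely illustrative assumptions, the share of all flags that land on honest students. Only the 4% rate comes from the Turnitin figure above; the 10% share of AI-written essays and the 90% detection rate are made-up inputs, and the function names are hypothetical.

```python
def expected_false_flags(honest_essays: int, false_positive_rate: float) -> float:
    """Expected number of human-written essays wrongly flagged as AI-written."""
    return honest_essays * false_positive_rate


def share_of_flags_that_are_wrong(
    total_essays: int,
    ai_written_share: float,     # ASSUMPTION: fraction of essays that really are AI-written
    detection_rate: float,       # ASSUMPTION: fraction of AI-written essays the tool catches
    false_positive_rate: float,  # fraction of human-written essays wrongly flagged
) -> float:
    """Fraction of all flagged essays that were actually written by humans."""
    ai_essays = total_essays * ai_written_share
    honest_essays = total_essays - ai_essays
    true_flags = ai_essays * detection_rate
    false_flags = honest_essays * false_positive_rate
    return false_flags / (true_flags + false_flags)


if __name__ == "__main__":
    # A thousand honest essays at a 4% false positive rate.
    print(expected_false_flags(1000, 0.04))

    # Illustrative inputs only: 10% AI-written, 90% of those caught.
    wrong_share = share_of_flags_that_are_wrong(1000, 0.10, 0.90, 0.04)
    print(f"{wrong_share:.0%} of flagged essays were human-written")
```

With these assumed inputs, 36 of the 126 flagged essays (roughly 29%) belong to students who wrote their own work, which is one way to see why a flag alone cannot serve as proof.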
Academic honesty policies at most schools were written decades ago, updated periodically, and designed around a specific threat: one student copying another student's work, or copying from a published source without citation. The concept at the center of those policies is called plagiarism: representing someone else's words or ideas as your own without attribution.
AI creates a genuinely new problem for that definition. When a student uses ChatGPT to write a paragraph, whose words are they copying? ChatGPT doesn't have an author. Its output isn't taken from a specific source that can be cited. It was synthesized from millions of sources simultaneously. Traditional plagiarism detectors, which work by matching text to a database of existing documents, often find nothing, because the AI output doesn't match any single document. The concept of plagiarism, as written, may simply not apply.
The distinctions between different kinds of AI use matter because they have to appear somewhere in a real policy. A policy that simply says "no AI" doesn't capture the difference between a student who had AI write their whole essay and a student who asked AI "does this paragraph make sense?" Those are not the same act. Any honest policy has to distinguish between them.
The Turnitin story reveals a second problem: detection technology is unreliable, and schools that treat detection scores as verdicts are creating a system where an algorithm can end a student's academic record. That's worth sitting with. Turnitin's own documentation says its detection tool is not designed to be used as evidence of wrongdoing. It is designed to flag work for human review. But when administrators or teachers under time pressure receive a 94% AI score, the temptation to treat that as a verdict is real.
Several researchers have documented that AI detection tools flag non-native English speakers at disproportionately high rates. Students who write in clear, structured prose (a skill many students spend years developing) are sometimes penalized by detection software that reads clarity as machine-like. This is not a minor edge case. It's a systematic bias embedded in a tool that schools are using to make consequential decisions.
A school uses AI detection software that it knows has a 4% false positive rate. School leadership decides this is acceptable because it catches most cheaters. If 40 innocent students per semester are falsely accused, and the school knows this, is using the tool ethical? Does it matter how severe the consequences of a false accusation are? Does it matter if there's no better alternative?
This question doesn't have a clean answer, and you shouldn't expect one. But it has to be answered somehow, because every school that uses detection software has implicitly answered it, whether they've thought about it carefully or not.
Several universities moved faster than high schools on this problem. MIT, in a policy guidance document released in August 2023, distinguished between AI use that undermines the learning objectives of an assignment and AI use that does not. The key question wasn't "did you use AI?" It was "does your submitted work demonstrate your own mastery of the skills this assignment was designed to develop?"
This reframe is significant. It shifts the question from "what tools did you use?" to "what can you actually do?" An oral defense of written work answers that second question directly. A revision log showing how the student's thinking developed answers it. A written reflection on what the AI got wrong, submitted alongside the AI-assisted draft, answers it. These are not tricks to catch cheaters; they're pedagogical tools that make the student's thinking visible regardless of what tools they used.
When you read a school's academic honesty policy now, you can immediately ask: was this written before 2022? Does it mention AI specifically? Does it distinguish between AI ghostwriting and AI assistance? Does it address detection reliability? Most policies you encounter will fail all four questions. You can see the gap that most adults haven't noticed yet: the definition of "honest work" needs to be rebuilt for an era when AI can produce text on demand.
The goal of an AI-era academic honesty policy isn't to catch cheaters after the fact. It's to design assignments and norms that make cheating less useful and honest work more visible. That's a design problem, not a policing problem. And design is what you're going to practice in the labs ahead.
You've been handed a school's existing academic honesty statement: "Students shall not represent the work of others as their own. Violations include copying, plagiarism, and submitting work produced by other persons."
Your task: propose specific new language that addresses AI use, distinguishing AI ghostwriting from AI assistance, without making the policy so complicated that no one can follow it. The AI here will push back on vague phrasing and ask you to be more precise.
In the spring of 2023, Harvard Law School faced a problem it hadn't anticipated. Students were using AI tools to draft legal memos, foundational assignments that teach students how to construct legal arguments from scratch. The school's honor code prohibited submitting work "not your own," but legal memo writing had always existed in a gray zone: students could ask classmates to review their drafts, visit writing centers, and incorporate feedback from instructors. The question was whether using AI crossed a line that these other forms of help did not.
Harvard's initial response was a faculty committee. The committee spent most of the spring semester deliberating. Meanwhile, individual professors made individual calls: one banned AI from all course submissions; another required students to disclose any AI use in a footnote; a third designed a new assessment format, a timed in-person exercise, and stopped worrying about take-home assignments altogether. By the end of spring 2023, Harvard Law had a draft policy framework but no final policy. The professors had effectively each built their own.
Across town, the Boston Globe newsroom was navigating something similar. In March 2023, the Globe issued internal guidance that journalists could use AI for research and background summarization but not for drafting published text. Within weeks, reporters discovered that the line between "research summarization" and "drafting" was impossible to enforce in practice: if a journalist asked an AI to summarize five sources and then wrote from those summaries, was the resulting text AI-influenced or not? The guidance created a compliance gray zone that nobody knew how to navigate consistently.
The Harvard and Globe cases share a structural problem: they drew a line where the activity on both sides of the line looks almost identical from the outside. "Using AI for research is okay; using AI for drafting is not" sounds meaningful. But the moment you try to apply it to real work, you discover that research and drafting blur together constantly. Writers research by drafting. They draft by synthesizing research. The categories weren't designed for a tool that can do both simultaneously.
Policies that draw lines in blurry places fail for a specific reason: they rely on every person making consistent judgment calls about where the line is, and judgment is precisely what varies most from person to person. When a policy requires consistent judgment that the policy itself hasn't defined, it essentially outsources the policy to individual interpretation, which is what you had before the policy existed.
Neither gray-zone policies nor pure bright-line rules solve the AI problem perfectly. A bright-line rule ("no AI text in any submission, ever") is enforceable in theory but ignores important distinctions between uses. A gray-zone policy ("use AI responsibly") acknowledges those distinctions but can't actually enforce them. The institutions that have had the most success are the ones that redesigned what they were measuring rather than trying to regulate the tool.
Not everything broke. Three institutional responses from 2023 are worth examining because they produced policies that held up, at least initially.
In August 2023, the University of Michigan released a faculty guide that organized AI policies into three clear tiers: AI-prohibited (the assignment is specifically designed to assess unaided human capability), AI-disclosed (AI use is allowed but must be documented and cited like any other source), and AI-integrated (AI is part of the assignment design; students are expected to use and evaluate it). The innovation was that faculty had to declare which tier applied at the start of each assignment, so students weren't left guessing.
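One way to picture the three-tier idea is as a small data structure in which an assignment cannot exist without a declared tier. The Python sketch below is only an illustration: the tier names come from the description above, but `Assignment`, `check_submission`, and the specific checks are hypothetical and not taken from the Michigan guide.

```python
from dataclasses import dataclass
from enum import Enum


class AITier(Enum):
    PROHIBITED = "AI-prohibited"  # assesses unaided human capability
    DISCLOSED = "AI-disclosed"    # AI allowed, but must be documented and cited
    INTEGRATED = "AI-integrated"  # AI use is part of the assignment design


@dataclass(frozen=True)
class Assignment:
    title: str
    tier: AITier  # no default value: the tier must be declared up front


def check_submission(assignment: Assignment, used_ai: bool, disclosed_ai: bool) -> str:
    """Return "ok" or a short reason the submission breaks the declared tier.

    ASSUMPTION: these rules are a simplified reading of the three tiers.
    """
    if assignment.tier is AITier.PROHIBITED and used_ai:
        return "violation: AI use on an AI-prohibited assignment"
    if assignment.tier is AITier.DISCLOSED and used_ai and not disclosed_ai:
        return "violation: AI use was not disclosed"
    # AI-integrated assignments expect AI use, so there is nothing to flag.
    return "ok"


if __name__ == "__main__":
    essay = Assignment("Research essay", AITier.DISCLOSED)
    print(check_submission(essay, used_ai=True, disclosed_ai=False))
    print(check_submission(essay, used_ai=True, disclosed_ai=True))
```

Because `tier` has no default, leaving it out raises an error when the assignment is created, which mirrors the requirement that the tier be declared before students start work.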
The AP, one of the world's major news agencies, issued guidance in August 2023 that avoided the "research vs. drafting" distinction that broke the Globe's policy. Instead, AP's guidance focused on function: AI could not be used to generate content that would be published under a journalist's byline. What the journalist did with AI in their own research process was their professional judgment, but the published output had to represent the journalist's own reporting and writing. This worked because it created a clear, verifiable endpoint (the published text) rather than trying to regulate an invisible process.
In September 2023, Corvallis High School in Corvallis, Montana (enrollment: approximately 350 students) piloted a different approach entirely. Rather than issuing a policy and enforcing it, teachers spent two weeks at the start of the school year running classroom discussions about what AI was, what it could do, and why certain uses would undermine students' own development. Students then co-created classroom agreements: not rules handed down from administrators, but norms the students articulated themselves. Teacher Rebecca Hanson reported that honor code violations dropped significantly and that students were more likely to ask openly when they weren't sure what was allowed.
All three successful cases share something: they gave the person making the decision (the student, the journalist, the teacher) a clear way to know where they stood before they acted. Michigan's tiers were declared upfront. AP's bright line was about the published endpoint. Montana's co-created norms were internalized, not just announced. Compare this to the policies that failed: they all required people to make ambiguous calls after the fact, under pressure, without clear criteria.
There's one more thing the Harvard and Montana cases reveal when you put them side by side. Harvard Law School, with its enormous resources, its dedicated faculty committees, and its long tradition of rigorous policy deliberation, still ended in a draft framework and no final policy after a full semester. A rural Montana high school with 350 students, no committee, and two weeks of classroom conversation produced something that worked. Scale and prestige did not predict effectiveness.
What predicted effectiveness was specificity of context. The Montana teachers knew their students. They could run a real conversation. They could iterate when something wasn't working. Harvard's committee was writing for a hypothetical average student, in hypothetical average conditions. The gap between policy-as-written and policy-as-lived is always largest when the people writing the policy are far from the people living under it.
When a school, company, or government body releases an AI policy, you can now read it with a specific set of questions: Does it draw clear lines or blurry ones? Does it regulate the process or the output? Did the people who will live under it have any voice in writing it? Most policies you'll encounter in the next few years will fail at least one of these. Knowing the failure modes in advance is what lets you design something better.
In Lesson 4, you're going to put all of this together. You'll design a full policy framework for a real-ish scenario. The question you should be holding now: If the people most affected by a policy have the most accurate information about what will actually work, but the least institutional power to write it, whose job is it to close that gap?
A school board is choosing between three draft AI policies for their district. They've asked you to identify which policies have hidden failure modes before the vote. Study each one, then tell the AI which you think will fail and why. The AI will challenge your reasoning and may not agree.
In February 2024, Auckland Grammar School in New Zealand took an unusual step. Rather than issuing an AI policy from the top down, the school's senior leadership team, Principal Timothy O'Connor among them, convened a working group that included students alongside teachers. The students weren't there to rubber-stamp an already-written policy. They were given actual drafting authority over several clauses. They were asked: what AI uses do you think are genuinely helpful to your learning? Which ones do you feel are replacing learning you should be doing yourself? What would it take for you to trust that a policy was fair?
The resulting document, released in March 2024, was unusual in several ways. It specified, by subject area, which AI uses were permitted in each kind of assessment. It included a student "right to ask": any student could ask their teacher before an assignment whether a specific AI use was permitted, and the teacher was required to give a written answer. And it contained a built-in review clause: the policy would be revisited every six months because the technology was changing fast enough that a policy written today might be irrelevant by the end of the year.
Was it perfect? No. Some teachers found the subject-by-subject specificity burdensome. Some students felt the "right to ask" created awkward dynamics when peers assumed asking meant you planned to cheat. But the school had built something that its community understood, had partly authored, and was designed to evolve. The policy's biggest success was not that it got every decision right; it was that it gave everyone a common language for the conversation.
The Auckland example, along with everything you've read in the previous three lessons, points to five elements that appear in every AI policy that has actually held up over time. A policy that's missing any of these will usually develop a specific, predictable problem.
The first element is a purpose statement. This is not "AI is prohibited" or "AI is allowed." The purpose statement answers: what is this policy trying to protect? Usually that's something like the development of specific skills, the fairness of assessment, or the integrity of feedback loops between teacher and student. Without this, every specific rule is disconnected from its reason, and rules without reasons get followed resentfully and abandoned easily.
The second element is differentiation by assignment type. Different assignments have different purposes. A timed in-class writing assessment is measuring something different from a month-long research project. A policy that treats all assignments the same will either be too restrictive for projects or too permissive for assessments. The University of Michigan's three-tier system (AI-prohibited / AI-disclosed / AI-integrated) works because it maps to the actual diversity of academic work.
The third element is an observable enforcement point. Whatever the policy prohibits or requires must be checkable at a specific, visible moment, not buried in an invisible process. The AP's "published byline text" is an observable enforcement point. A required disclosure footnote is observable. "Using AI responsibly" is not observable at any specific moment. Design your enforcement around what can actually be seen.
The fourth element is a formal way to resolve ambiguity. Auckland's "right to ask" was not just a courtesy; it was a signal that ambiguity was expected and that the proper response to ambiguity is a conversation, not a violation. Policies that assume every situation is already covered by the written rules create a culture where students who are genuinely unsure choose not to ask, because asking feels like an admission of intent. A formal mechanism for clarifying gray zones reduces this pressure.
The fifth element is a scheduled review. AI capabilities are changing every few months. A policy written in September 2023 was partially obsolete by March 2024. Any AI policy written without a review date is building in its own obsolescence. Six months is a reasonable review interval. Annual review is the bare minimum. A policy that cannot be updated is not a living document; it's a historical artifact waiting to become irrelevant.
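The five elements can also be treated as a checklist to run against any draft. The Python sketch below is one hypothetical way to do that: the field names and element labels are shorthand chosen for this example, not official terms, and a real review would judge the quality of each element rather than just whether it is present.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class PolicyDraft:
    # All field names are hypothetical labels for the five elements.
    purpose_statement: Optional[str] = None      # what the policy is trying to protect
    assignment_tiers: Tuple[str, ...] = ()       # how different assignment types are treated
    enforcement_point: Optional[str] = None      # the visible moment where rules are checked
    clarification_process: Optional[str] = None  # how students resolve gray zones
    next_review: Optional[date] = None           # when the policy will be revisited


def missing_elements(draft: PolicyDraft) -> List[str]:
    """List which of the five elements the draft does not yet contain."""
    checks = {
        "purpose statement": bool(draft.purpose_statement),
        "assignment tiers": len(draft.assignment_tiers) > 0,
        "enforcement point": bool(draft.enforcement_point),
        "clarification process": bool(draft.clarification_process),
        "review date": draft.next_review is not None,
    }
    return [name for name, present in checks.items() if not present]


def review_is_overdue(draft: PolicyDraft, today: date) -> bool:
    """A draft with no review date, or one already past, counts as overdue."""
    return draft.next_review is None or today > draft.next_review


if __name__ == "__main__":
    # A blanket ban checked only at submission has just one of the five elements.
    blanket_ban = PolicyDraft(enforcement_point="text of submitted schoolwork")
    print(missing_elements(blanket_ban))
```

Run against a one-sentence blanket ban, the checklist reports four missing elements, which matches the failure pattern described in the earlier lessons.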
Designing a policy means making genuine choices where reasonable people disagree. There is no technically correct answer to these questions, but every policy has to answer them, either explicitly or by default.
The first choice is detection versus trust. Will your policy rely on AI detection software? If so, how will you handle false positives? If not, what establishes that submitted work is authentic? You cannot fully avoid this choice: even a "no detection" policy is choosing trust as the default mechanism, with all the vulnerabilities that implies.
The second choice is prohibition versus redesign. Some AI uses can be blocked at the assignment level by changing what's being assessed. Others can't. A blanket prohibition requires enforcement; an assessment redesign builds authenticity into the task itself. Most effective policies use some combination, but you have to decide the ratio.
The third choice is student participation. Will students participate in creating the norms? The Montana and Auckland cases suggest that participation increases compliance and reduces resentment. But it also takes time, creates negotiating complexity, and requires teachers to share some authorship of classroom norms. That trade-off is real.
You have now read more carefully about AI policy design than most administrators who are currently writing these policies. You understand why bans fail, why detection is unreliable, why gray-zone policies produce inconsistency, and what five elements make a policy durable. When you encounter a school AI policy (your own school's, a college you're considering, a workplace in the future), you can evaluate it with the same framework you'd bring to any other institutional design question. That's not a small thing. Most people never learn to read policy this way.
The final ethical question this module leaves open: A student who uses AI to compensate for a documented learning disability (getting help with organization, sentence structure, or expression of ideas they genuinely have) is using AI very differently from a student who simply doesn't want to do the work. Should a single classroom policy treat these two uses the same way? If not, how does a teacher know which is which, and what happens when they guess wrong? There is no clean answer. But the policy you design in the lab will have to take a position.
You're designing an AI policy for Ridgeline Middle School: 600 students, grades 6–8, mix of urban and suburban families, one-to-one laptop program, teachers with varying comfort levels around technology. The school board wants a policy ready for the start of next semester. You have the five elements. The AI here is your skeptical co-designer; it will stress-test everything you propose.
Your goal is to produce a policy framework covering at least three of the five elements. You don't have to solve every problem; you have to make honest choices and be able to defend them.