Real or Rendered: Spot the Fake · Introduction

The Camera Always Lied: We Just Didn't Know It Yet

A new skill is now necessary for anyone who wants to understand the world they're actually living in.

In March 2023, a photo went viral on Twitter showing Pope Francis wearing a massive white puffer jacket, the kind a streetwear brand might release for $400. Millions of people shared it, laughed at it, commented on it. Some news outlets briefly treated it as real. The image had been generated by a Chicago man named Pablo Xavier using an AI tool called Midjourney, and he'd spent maybe twenty minutes making it. He was not a graphic designer. He had no special skills. He was just a person with a phone and a free account.

That image is this course in miniature. Not because faked images are new: photo manipulation goes back to the 1860s, when photographers were already stitching Civil War battle scenes together from separate negatives. But because, for the first time in history, anyone can produce a convincing fake image, video, or voice recording in minutes, without any technical training. The gap between "I want to deceive someone" and "I successfully deceived someone" has collapsed. That changes something fundamental about how information works.

This course will give you a specific, practical skillset: how to look at an image, a video, or an audio clip and ask the right questions about whether it's real. You won't become a forensics expert. But you'll develop habits of attention that most adults (including journalists, politicians, and teachers) still don't have. That's not an exaggeration. By the end of this module, you will see things in images that you currently scroll past without noticing.

Real or Rendered · Lesson 1 of 4

When Seeing Stopped Meaning Believing

The moment trust in images broke, and why it happened faster than anyone expected.
If a photograph can no longer be trusted, what does that do to everything photographs have ever proved?

On January 21, 2024, two days before the state's presidential primary, voters in New Hampshire received phone calls from a voice that sounded unmistakably like President Joe Biden. The voice urged them not to vote in the primary, saying that voting then would only help Republicans, and told them to save their vote for November. Thousands of calls went out. To an ordinary listener, the voice was hard to tell apart from Biden's real recordings. Except it wasn't him. A political consultant named Steve Kramer, who had done work for a rival primary campaign, had hired a vendor who used AI voice cloning to fabricate the entire message. The operation reportedly cost roughly $500. The potential to suppress votes across an entire state cost five hundred dollars.

The New Hampshire Attorney General launched an investigation. The FCC eventually moved to ban AI-generated voices in robocalls. But the technology that made the call was already widely available to anyone. The genie, as they say, was not going back in the bottle. And what made this case different from every political dirty trick before it was one specific thing: there was no human performance involved. No actor practiced Biden's voice in a studio. A machine listened to his public speeches and learned to reproduce him (his cadence, his pauses, his slight Delaware accent) without ever hearing him in person. That capability, which would have been science fiction in 2019, was a $500 vendor service in 2024.

How We Got Here: A Very Fast History

To understand why 2023 and 2024 felt like a rupture, you need to understand how slowly this was building, and then how suddenly it accelerated.

Photography was invented in the 1830s. Within decades, photographers were already manipulating images. In 1865, a famous portrait of Abraham Lincoln circulating in the United States turned out to be Lincoln's head pasted onto the body of the Southern politician John C. Calhoun. Nobody noticed for nearly a century. The manipulation was discovered in 1961. For most of photography's history, faking an image required physical skill: darkroom chemistry, careful cutting, precise alignment of negatives. Only experts could do it convincingly.

Adobe Photoshop changed this in 1990. Suddenly, digital manipulation was possible without a darkroom. But it still required skill, time, and software training. A professional retoucher might spend days on a convincing composite image. The barrier wasn't gone; it had just moved. Detecting fakes became a specialized job: forensic image analysts who could spot clone-stamped textures, lighting inconsistencies, and metadata anomalies.

Then, between 2017 and 2022, something different happened. Researchers developed a new class of AI system, called a generative model, that didn't manipulate existing images. It created new ones from scratch, pixel by pixel, guided only by a text description. By 2022, tools like DALL-E 2, Midjourney, and Stable Diffusion were available to the general public. You typed a sentence. You got a photorealistic image. No skill required.

The same acceleration happened with audio. In 2023, tools like ElevenLabs allowed anyone to clone a voice from a 30-second audio sample. Video synthesis followed. By early 2024, companies were offering "talking head" video generation (realistic lip-synced video of a person saying words they never said) for subscription prices comparable to a Netflix account.

Generative model: An AI system trained to produce new content (images, audio, video, text) by learning patterns from enormous amounts of existing content. It doesn't copy; it generates something new that fits those patterns.
Voice cloning: Using AI to analyze a real person's voice recordings and produce new audio that sounds like them saying anything, including things they never said.

What Changed and What Didn't

Here's something important that often gets lost in the panic: humans have always lied. Propaganda has existed for thousands of years. Staged photographs go back to the Crimean War in the 1850s, when photographer Roger Fenton is widely believed to have rearranged cannonballs on a road to make a more dramatic image. The desire to deceive with images is not new.

What changed is the cost and the skill floor. Before 2022, creating a convincing fake required one of three things: money (hire a professional), time (learn the skills yourself), or access (work at a studio or media organization). Those barriers weren't perfect, but they filtered out casual deception. Most people who wanted to spread a fake image had to either use an obvious fake (the kind that falls apart under mild scrutiny) or spend significant resources on a good one.

Generative AI removed those barriers almost entirely. Creating a convincing fake image now takes seconds. A convincing fake voice takes minutes. A basic deepfake video takes hours but requires no professional equipment. The result is that the volume of synthetic media in the world is increasing faster than any detection system can handle. In a widely cited 2022 study by researchers at Lancaster University and UC Berkeley, participants asked to tell real faces from AI-generated ones were correct only about 48% of the time. That's basically a coin flip.

This is the world you are growing up in. Not a world where some sophisticated state actor occasionally produces a fake to deceive millions (though that still happens) but a world where any bored, curious, or malicious person can produce convincing fakes at scale, for free, in minutes.

The Number That Matters

Researchers at the cybersecurity firm Deeptrace (later renamed Sensity) estimated around 2019 and 2020 that the amount of deepfake video online was roughly doubling every six months. By 2024, detection companies were identifying over 500,000 synthetic media pieces per day across major platforms. Most were never labeled as fake.
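
To get a feel for what "doubling every six months" implies, here is a minimal arithmetic sketch in Python. It assumes the doubling rate stays perfectly steady, which real trends rarely do, and the function name `growth_factor` is made up for this illustration.

```python
def growth_factor(months: float, doubling_period_months: float = 6.0) -> float:
    """How many times larger a quantity becomes after `months` of steady doubling."""
    return 2 ** (months / doubling_period_months)


# Assumption: the six-month doubling period never changes over the whole span.
for years in (1, 2, 3, 5):
    factor = growth_factor(12 * years)
    print(f"after {years} year(s): about {factor:,.0f}x the starting amount")
```

The point of the sketch is the shape of the curve, not the exact figures: steady doubling turns a small number into a very large one within a few years.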

The Trust Collapse, and What It Actually Threatens

You might expect the biggest danger from AI fakes to be specific incidents: a fake video of a president declaring war, a fabricated recording of a CEO committing fraud. Those things are real risks. But researchers who study information warfare argue that the more insidious danger is something subtler: generalized distrust.

If people believe that any image, video, or audio clip could be fake, they stop trusting evidence entirely. This has already started happening. After the New Hampshire Biden robocall case in January 2024, journalists noticed a new phenomenon in their comments sections: even real, verified recordings of public figures were being dismissed as "probably AI." The fake didn't need to fool everyone. It just needed to make everyone doubt everything.

Researchers call this the "liar's dividend": the idea that the existence of AI fakes benefits liars not just by creating false evidence, but by giving real liars a new defense. Any time a real recording catches someone in a lie, they can simply claim it's AI-generated. This was already happening in court cases by 2024, with defendants' lawyers raising AI-fakery arguments against genuine video evidence.

This is the part that should genuinely concern you, not just as a future voter or consumer, but right now, as someone who shares content online. Every time you share something without checking whether it's real, you're part of this system. Understanding how synthetic media works is no longer optional if you want to be an honest participant in public life.

You Now See What Most People Miss

Most people who encounter a suspicious image ask one question: "Does this look fake?" You now know that's the wrong question. The right questions are: Who made this, when, with what tool, and what do they want me to believe? The technology exists to make almost anything look real. The skill is in asking better questions, not just looking harder.

An Ethical Question Without an Easy Answer

In October 2023, a group of researchers at the Massachusetts Institute of Technology published an AI-generated photo of a destroyed city to illustrate an article about climate disaster risk. The image was clearly labeled "AI-generated illustration." Their argument: the image conveyed the emotional reality of a possible future more vividly than any existing photograph. It told a true story about a thing that could happen, using a fake image of a thing that hadn't happened yet.

On the other side: critics argued that using synthetic images to represent possible futures, even labeled, trains readers to accept fabricated scenes as emotionally legitimate evidence. That once you've felt the reality of a fake disaster image, you can't fully "unsee" it. That the emotional response it produces is the same regardless of the label, and that emotion, not the label, is what drives belief and action.

Here is the ethical question, and it does not have a clean answer: Is it acceptable to use a fake image to tell a true story, if the image is labeled? What changes if the label is small? What changes if the story is genuinely important? What changes if there are no real photographs of the thing being described? There are serious, thoughtful people on both sides of this. You will have to decide where you stand, and that decision will matter for how you create and share content for the rest of your life.

Lesson 1 Quiz

5 questions. Reason through each one. Not all answers are obvious.
1. The New Hampshire Biden robocall in January 2024 was significant primarily because:
Correct. The case mattered because of the cost and accessibility barrier that collapsed, not because imitation itself was new. Political voice imitation is old; doing it convincingly for $500 with no actor is new.
Not quite. The lesson's key point was about the cost and accessibility of the deception, not whether it was historically unprecedented or perfectly foolproof. Re-read the opening case.
2. Which of the following best explains the "liar's dividend"?
Exactly. The liar's dividend isn't just about creating fakes; it's about using the existence of fakes to cast doubt on real evidence. This is already appearing in legal cases.
The liar's dividend refers to how real liars benefit from general distrust of media, not about production quality or cost. Look back at the "Trust Collapse" section.
3. A school newspaper editor receives a photo of the principal apparently cheating at a trivia contest. The image looks realistic and comes from an anonymous source. Applying what you learned in Lesson 1, what is the MOST important first question to ask?
Right. The lesson explicitly reframes the key question away from "does this look fake?" toward the provenance and motive questions. Visual realism is no longer a reliable test.
Appearance-based and technical pixel-level checks are no longer reliable first steps. The lesson argued you need to ask about source, timing, tool, and motive before anything else.
4. Before generative AI tools like Midjourney became widely available around 2022, what was the main barrier to creating convincing fake images?
Correct. The lesson's historical section emphasizes the "cost and skill floor": money, time, or expertise. Generative AI collapsed all three barriers simultaneously.
The lesson's historical arc focuses on the practical barriers of cost, time, and skill, not legal barriers or internet infrastructure. Review the "How We Got Here" section.
5. MIT researchers used a labeled AI-generated image to illustrate a climate disaster article. A critic argues this is harmful even with a label. Which argument best supports the critic's position?
This is the strongest version of the critic's argument as presented in the lesson: emotional responses don't check labels. The concern is that the feeling of reality persists even when the viewer knows intellectually it's fake.
The critic's strongest argument isn't about visual quality or label size; it's about how emotional responses to images work independently of rational knowledge about their source. Re-read the ethical question section.

Lab 1: The Provenance Investigator

You're the investigator. Your job isn't to look at images; it's to interrogate where they came from.

Your Assignment

You've been handed a suspicious image by a student journalist at your school. The image appears to show a local city council member accepting cash in a parking lot. The source is anonymous. The image is photorealistic. Before your paper publishes anything, you need to decide: what questions do you ask, and in what order?

Your lab partner RENN is an experienced media investigator. They won't tell you what to do; they'll push back on weak reasoning and ask you to defend your choices. You need to take a position and argue for it. At least 3 exchanges are required before your investigation is complete.

Start by telling RENN: what is the first thing you would do upon receiving this image, and why? Don't just name a step; explain the reasoning behind it.
RENN, Media Investigator
You've got an anonymous image, photorealistic, politically explosive. Clock is ticking: the student who sent it says another outlet might publish first. What's your first move, and what's the thinking behind it?
Real or Rendered · Lesson 2 of 4

How AI Generates Images, and Why That Matters for Spotting Them

Understanding the machine tells you exactly where it fails, and that's where you look.
If an AI makes an image by learning patterns from millions of photographs, why does it still get hands wrong?

On March 20, 2023, a set of images began circulating on Twitter showing former President Donald Trump being physically arrested by New York police officers: dragged down steps, pinned against a car, in apparent chaos. The images were strikingly vivid. They were shared by hundreds of thousands of people before anyone identified them as fake. When users looked carefully, they found the telltale artifacts: a police officer with six fingers, a bystander whose face dissolved into a blur at the edges, brickwork that repeated in an impossible pattern. The images had been generated by a journalist named Eliot Higgins, founder of the investigative outlet Bellingcat, using Midjourney v5. He said he made them to demonstrate how convincing the technology had become. The demonstration worked, possibly too well.

What Higgins' images showed was something important: the AI made mistakes in specific, predictable places. Not randomly, but in exactly the places you'd expect if you understood how the system works. The six-fingered hand, the dissolved face at the edge of the frame, the repeating brickwork: these weren't random glitches. They were the fingerprints of how generative image models process and produce visual information. Once you understand the mechanism, you know where to look. The machine's failure modes are not random. They follow a logic.

What a Diffusion Model Actually Does

Most modern AI image generators use a technology called a diffusion model. The name sounds technical, but the idea is surprisingly understandable once you have the right analogy.

Imagine you have a photograph. Now imagine you add random noise to it (static, like a TV without signal) until the original image is completely buried under the noise. Now imagine you train a neural network to reverse that process: given a noisy image, predict what the slightly less-noisy version looked like. Do this thousands of times, in smaller and smaller steps, and the network learns to "denoise" images all the way back to clarity. That's diffusion.
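
The "bury the picture under static" half of that analogy can be sketched in a few lines of Python. This is a toy under stated assumptions, not how any particular image generator is implemented: the "photograph" is a single row of brightness values, and the function names `add_noise` and `correlation` are invented for this example. The square-root blending weights are a common way to mix signal and noise so the overall brightness range stays stable.

```python
import math
import random


def add_noise(signal, noise_level, rng):
    """Blend a signal with Gaussian static.

    noise_level = 0.0 returns the original values; 1.0 returns pure noise.
    """
    keep = math.sqrt(1.0 - noise_level)
    mix = math.sqrt(noise_level)
    return [keep * x + mix * rng.gauss(0.0, 1.0) for x in signal]


def correlation(a, b):
    """Pearson correlation: 1.0 means same shape, near 0.0 means unrelated."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)
# Toy "photograph": one row of 2,000 brightness values forming a smooth wave.
picture = [math.sin(i / 50.0) for i in range(2000)]

for level in (0.0, 0.5, 0.9, 1.0):
    noisy = add_noise(picture, level, rng)
    similarity = correlation(picture, noisy)
    print(f"noise level {level:.1f}: similarity to original = {similarity:+.2f}")
```

As the noise level rises, the similarity score falls toward zero: the original is still "in there" at moderate noise, and gone at full noise. Training teaches a network to walk that path in the opposite direction.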

When you give a diffusion model a text prompt, such as "a photo of a city council member accepting cash in a parking lot at night," it starts with pure noise and progressively denoises it toward an image that matches your description. It's not searching through stored photographs. It's synthesizing a new image by applying learned patterns about how photographic elements relate to each other in space. It knows that "parking lot at night" means certain colors, certain light sources, certain textures. It assembles those patterns.
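
The generation loop described above (start from static, refine a little at each step) can be shown with a deliberately simplified Python sketch. The big assumption is loud and important: `stand_in_denoiser` is a hypothetical placeholder that already knows the target pattern, whereas a real diffusion model has no stored target and instead predicts each cleaner version from patterns it learned in training. Only the shape of the loop is meant to carry over.

```python
import random


def stand_in_denoiser(current, target, strength):
    """Hypothetical placeholder for the trained network.

    A real model predicts a cleaner image from learned patterns and the prompt;
    this toy just moves each value a fraction of the way toward a known target.
    """
    return [c + strength * (t - c) for c, t in zip(current, target)]


def mean_abs_error(a, b):
    """Average distance between two equal-length lists of values."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
target = [0.0, 0.2, 0.9, 1.0, 0.9, 0.2, 0.0, 0.0]  # pattern the "prompt" asks for
image = [rng.gauss(0.0, 1.0) for _ in target]       # step 0: pure static

errors = [mean_abs_error(image, target)]
for step in range(30):
    image = stand_in_denoiser(image, target, strength=0.2)
    errors.append(mean_abs_error(image, target))

print(f"distance from target at start: {errors[0]:.3f}")
print(f"distance from target after 30 steps: {errors[-1]:.3f}")
```

Each pass removes a fraction of the remaining static, so the picture emerges gradually rather than all at once.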

This process is extraordinarily good at capturing average visual relationships: the things that appear most consistently across the billions of training images. A face, centered in frame, well-lit, in a common pose: the model has seen this configuration millions of times and can render it flawlessly. But it struggles at structured complexity: things where there are explicit rules about how elements relate to each other that aren't purely visual.

Diffusion model: An AI system that generates images by learning to reverse the process of adding noise to a picture, starting from random static and progressively building toward a coherent image guided by a text description.

Why Hands, Text, and Edges Fail

Human hands are one of the most structurally complex objects a generative model encounters. A hand has five fingers, each with three joints, arranged according to biological rules that are rigid: four fingers extend from the palm in a specific fan pattern; the thumb opposes them from the side; finger lengths follow a fixed proportion. The model doesn't know these rules explicitly; it only knows what hands tend to look like in the training images. And when the hand is partially obscured, at an unusual angle, or in motion, the model has seen fewer examples and has to extrapolate. The extrapolation produces extra fingers, melded knuckles, or fingers that bend in anatomically impossible directions.

Text inside an AI-generated image fails for a related reason. The model learned what text looks like (the visual shapes of letters) but not what text means. So it produces arrangements of letter-like shapes that follow the visual rhythm of text without encoding any actual words. Zoom in on a Midjourney image and read the street signs, the newspapers, the storefront lettering: they're typically gibberish that looks like language without being any language.

Edge artifacts (faces that blur at the boundaries of a frame, backgrounds that become inconsistent near the edges of a subject) occur because diffusion models generate images holistically rather than layering objects in physical space. A real photograph has a lens, a focal plane, and a consistent physics of light. A generated image has learned to look like a photograph without having a lens. Near the edges where the training data provided less guidance, the model's confidence drops and artifacts appear.

Reflections are another reliable tell. Mirrors and reflective surfaces require exact geometric consistency: the reflected image must be the mirror-reverse of the real object at the correct angle. The model has seen reflections but doesn't understand the geometry. It produces plausible-looking reflections that, on inspection, reflect different objects than what's in the frame, or reflect from the wrong angle.

The Inspection Checklist

When evaluating a potentially AI-generated image, examine in this order: (1) Hands and fingers: count them, check anatomy. (2) Text in the scene: can you read it? (3) Background edges near the subject: do they smear or repeat? (4) Reflective surfaces: do they reflect correctly? (5) Lighting: does the same light source hit all objects consistently?
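
One way to keep yourself honest about the order is to write the checklist down as data and record what you found for each item. The Python sketch below is only an illustration: the short keys (`hands`, `text`, and so on) and the `review` function are invented here, and the output is a note-taking aid, not a verdict on whether the image is real.

```python
# The five checks, in the order given above. Keys are made-up short names.
CHECKLIST = [
    ("hands", "Hands and fingers: count them, check anatomy"),
    ("text", "Text in the scene: can you read it?"),
    ("edges", "Background edges near the subject: do they smear or repeat?"),
    ("reflections", "Reflective surfaces: do they reflect correctly?"),
    ("lighting", "Lighting: does the same light source hit all objects consistently?"),
]


def review(findings):
    """Split the checklist into items with a noted problem and items not yet checked.

    `findings` maps a key to True (problem seen) or False (looked fine).
    A key that is absent means that check has not been done yet.
    """
    flagged = [label for key, label in CHECKLIST if findings.get(key) is True]
    unchecked = [label for key, label in CHECKLIST if key not in findings]
    return flagged, unchecked


flagged, unchecked = review({"hands": True, "text": True, "edges": False})
print("Problems noted:")
for label in flagged:
    print("  -", label)
print("Still to check:")
for label in unchecked:
    print("  -", label)
```

Separating "looked fine" from "not checked yet" matters: an unfinished inspection should never be read as a clean one.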

The Arms Race: Detection and Generation

Here is where things get complicated, and where you need to hold a difficult idea in mind: the very artifacts you just learned to look for are becoming less reliable as indicators, because the generators are improving.

Midjourney v4, released in late 2022, produced six-fingered hands routinely. Midjourney v6, released in December 2023, handles hands significantly better. As detection researchers identify specific failure modes and publish their findings, the model developers train on that feedback and patch the failures. This is not a conspiracy; it's just how any technology improves. But it means that the visual tells that were reliable in 2022 are less reliable in 2024, and the tells that are reliable in 2024 will be less reliable in 2026.

This is why understanding the mechanism matters more than memorizing the tells. The specific artifacts will change. The fundamental logic (that these systems generate by pattern-matching rather than by understanding physical reality) will persist for the foreseeable future. A system that generates by pattern-matching will always struggle more with structured complexity than with common configurations. The specific places it struggles will shift, but the underlying reason it struggles will remain.

There is also a separate detection technology: AI systems trained specifically to identify generated images. Companies like Hive Moderation, Illuminarty, and AI or Not operate AI detectors. These work by identifying statistical patterns in how pixels are distributed in generated versus photographed images. They are useful but not reliable: as of 2024, detection accuracy drops significantly when the generated image has been compressed, cropped, or run through a filter, all things that happen routinely when images are shared on social media.
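
A short worked calculation shows why a detector flag is a reason to investigate rather than a conclusion. The Python sketch below applies Bayes' rule. Every number in it is a made-up illustration value, not a measured figure for any real detector, and the function name `chance_flag_is_right` is invented for this example.

```python
def chance_flag_is_right(share_generated, detection_rate, false_alarm_rate):
    """Bayes' rule: probability an image is AI-generated, given the detector flagged it.

    share_generated:  fraction of all images checked that really are AI-generated
    detection_rate:   fraction of AI-generated images the detector flags
    false_alarm_rate: fraction of real photographs the detector wrongly flags
    """
    true_flags = share_generated * detection_rate
    false_flags = (1.0 - share_generated) * false_alarm_rate
    return true_flags / (true_flags + false_flags)


# Hypothetical scenario A: original, uncompressed files.
clean = chance_flag_is_right(
    share_generated=0.05, detection_rate=0.95, false_alarm_rate=0.02
)
# Hypothetical scenario B: files after social-media compression and cropping.
compressed = chance_flag_is_right(
    share_generated=0.05, detection_rate=0.70, false_alarm_rate=0.10
)

print(f"uncompressed: a flag is right about {clean:.0%} of the time")
print(f"compressed:   a flag is right about {compressed:.0%} of the time")
```

Under these invented numbers, a flag on an uncompressed file is right roughly 71% of the time, and a flag on a compressed file only about 27% of the time. When most images being checked are real, even a modest false-alarm rate means many flags land on genuine photographs.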

What You Now Understand That Changes Everything

Knowing how diffusion models work means you're no longer just looking at images; you're understanding why certain elements are harder for the machine to fake. You have a theory, not just a checklist. Theories survive when the checklist changes. Every time you see a generated image with perfect hands, you'll know: either the generator improved, or whoever made it reviewed and corrected the hands manually. Both of those facts tell you something.

An Ethical Question Without an Easy Answer

In December 2023, the government of Belarus published a series of photographs showing what it claimed were Ukrainian military atrocities: images of soldiers committing acts of violence against civilians. Independent researchers at Bellingcat analyzed the images and identified them as AI-generated composites. Belarus denied this. The images were used in domestic propaganda broadcasts.

Here is the question: should social media platforms automatically remove images that AI detectors flag as synthetic, even knowing that AI detectors have significant error rates and might remove real documentary photographs of genuine atrocities? What is worse: leaving AI-generated propaganda up, or mistakenly removing evidence of real violence? Who should make that decision, and on what timeline? These are decisions being made right now by engineers at Meta, Google, and TikTok, and they affect what billions of people see. There is no version of this problem that doesn't harm someone.

Lesson 2 Quiz

5 questions. Apply the mechanism, not just the vocabulary.
1. A diffusion model generates an image by:
Correct. Diffusion works by reversing a noise-adding process, building an image from random static guided by learned patterns. It synthesizes; it doesn't search or assemble from libraries.
Diffusion models don't search, stitch, or assemble from libraries. They synthesize by denoising: starting with static and building toward coherence. Review the "What a Diffusion Model Actually Does" section.
2. Why do diffusion models consistently struggle more with hands than with faces?
Right. It's the structured complexity (biological rules that are strict) combined with the model's purely visual (not rule-based) knowledge. When the pose is unusual, it has fewer matching training examples and must extrapolate.
The answer isn't about frequency in training data or viewer perception; it's about the structural rigidity of hands versus the model's purely visual learning. The model doesn't know the rules; it knows appearances. Re-read the "Why Hands, Text, and Edges Fail" section.
3. You're examining an AI-generated image and the text on a storefront sign reads "GRFKTL MRNS." What does this tell you about how the model processed language?
Exactly. The model knows the visual pattern of text (how letters are spaced, how signs look) without having semantic understanding of language. It produces visual language without meaning.
The garbled text isn't intentional obfuscation or translation; it's a failure of understanding. The model knows what text looks like without knowing what text means. Re-read the section on why text fails.
4. A 2024 AI image generator produces a photo of a politician at a protest with perfect hands and readable signs. This most likely means:
Right. The lesson explicitly warns that specific tells become less reliable as generators improve. Absence of obvious artifacts is not confirmation the image is real; it may just mean the generator got better, or someone corrected the outputs.
The lesson specifically warns against concluding an image is real because the old tells are absent. Generators improve. Human post-processing corrects outputs. The absence of 2022 artifacts does not confirm authenticity in 2024.
5. An AI detection tool flags a photograph from a war correspondent as "likely AI-generated." The correspondent insists it's real and was taken on their camera. What should a news editor do first?
Correct. AI detectors have significant error rates, especially after compression. The provenance chain (original file, metadata, chain of custody) is more reliable than a detector flag alone. The flag is a reason to investigate, not a conclusion.
Neither automatic removal nor ignoring the flag is appropriate. The lesson noted that detectors are useful but unreliable, especially after image compression. The right response is to investigate provenance independently.

Lab 2: The Artifact Auditor

You understand the mechanism. Now you audit someone else's reasoning about it.

Your Assignment

A classmate has examined an image and concluded it's AI-generated. Their reasoning: "The hands look a little weird and one finger seems long." They want to post their analysis online as a debunk. You need to audit their reasoning before they publish, and either strengthen it, challenge it, or identify what's missing.

RENN is your forensics partner. They'll pressure-test your analysis and ask you to go deeper. Minimum 3 exchanges to complete this lab.

Start by telling RENN: Is "the hands look weird" sufficient reasoning to conclude an image is AI-generated in 2024? Defend your answer with what you know about how generators work and how the technology has changed.
RENN, Forensics Partner
Your classmate's analysis is about to go public. One weird-looking finger. Is that your case? Walk me through your thinking: is this evidence enough to debunk something, and if not, what would be?
Real or Rendered · Lesson 3 of 4

Deepfakes: When the Person on Screen Isn't There

Video felt like the last frontier of proof. It isn't anymore, and the implications are stranger than you think.
If a video can show a real person saying anything, what happens to the idea of video evidence?

On October 7, 2023, the same day Hamas launched its attack on southern Israel, a video began circulating on social media showing what appeared to be a senior Israeli military official announcing a policy of collective punishment against Palestinian civilians. The official's face moved naturally. His mouth matched the words. He looked directly into camera. The Israeli military issued a denial within hours and identified the video as a deepfake: a synthetically generated video using the official's likeness. Independent fact-checkers reportedly confirmed this. But the video had been viewed millions of times before any correction reached a comparable audience. In the early hours of a conflict, when people are most frightened and most eager to understand what is happening, a fabricated video of a military official had shaped the information environment for millions of viewers.

This was not an isolated event. Throughout the Russia-Ukraine conflict beginning in 2022, both sides accused each other of circulating deepfake videos of military and political officials. In March 2022, a video appeared showing Ukrainian President Volodymyr Zelensky apparently telling Ukrainian soldiers to lay down their arms and surrender. It was identified as a deepfake within hours (partly because Zelensky's neck looked oddly proportioned, a common failure mode) but not before it was broadcast on a hacked Ukrainian news website and viewed widely. Ukrainian officials responded quickly, but the speed of correction didn't match the speed of spread. It almost never does.

How Deepfake Video Works

The word "deepfake" was coined on Reddit in 2017, when a user began posting realistic face-swapped celebrity videos using a technique derived from deep learning research. The name stuck even as the technology evolved far beyond simple face swapping.

Modern deepfake video works through one of two main methods. The first is face replacement: taking a video of one person and mapping the face of a different person onto it, matching lighting, skin tone, and head movement. This requires a significant amount of source footage of the target person β€” at least several minutes of varied facial expressions and angles. Until 2022, this was the primary limitation: you could deepfake celebrities and politicians who had extensive public video footage, but not private individuals.

The second method is audio-driven face synthesis, sometimes called "talking head" generation. Here, you start with a single photograph of the target person and a new audio recording. The system synthesizes video of the person's face moving to match the audio β€” generating realistic lip movements, micro-expressions, and subtle head movements that match the emotional content of the speech. By 2024, tools like HeyGen and Synthesia were offering this capability commercially for entirely legitimate purposes β€” mostly corporate training videos and multilingual content. The same tools can be misused.

The tell-tale signs of deepfakes follow from these mechanisms. Face replacement often shows inconsistencies at the jaw and neck boundary, the seam where the swapped face meets the original body. Audio-driven synthesis struggles with tooth visibility (teeth are structurally complex and move in ways that are hard to predict from audio alone), with eye blinking patterns (blink rates become statistically abnormal), and with the subtle physics of how skin moves with underlying muscle.

Deepfake: A synthetic video in which a real person's face, voice, or both have been replaced or generated using AI, making them appear to say or do something they never actually did.
Talking head synthesis: An AI technique that generates realistic video of a person speaking from just a single photograph and a new audio track, producing lip movements and expressions that match the new audio.

What to Look For in Suspicious Video

Video analysis is harder than image analysis for an obvious reason: you're evaluating many frames per second rather than a single frame. But this also gives you more data. Inconsistencies that might not appear in any single frame become visible across time.

The most reliable indicators of deepfake video as of 2024 include: unnatural blinking patterns (either too frequent, too infrequent, or blinks that don't fully close); facial boundary issues, especially around the jaw, ears, and hairline where the face swap boundary is hardest to blend; inconsistent lighting, where the subject's face is lit differently from the rest of the scene or the lighting doesn't match the claimed location; and lip-sync imprecision, where the mouth movements don't quite match the consonant sounds in ways that become obvious if you watch the mouth while listening closely.
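To make the blink indicator concrete, here is a minimal Python sketch. It assumes someone has already noted the moments (in seconds) at which the subject blinks, for example by stepping through the clip by hand. The function names and the 8-to-30 blinks-per-minute range are illustrative assumptions, not clinical thresholds or part of any detection tool, and short clips give noisy estimates.

```python
def blinks_per_minute(blink_times_s, clip_length_s):
    """Convert a list of blink timestamps (seconds) into a per-minute rate."""
    if clip_length_s <= 0:
        raise ValueError("clip length must be positive")
    return len(blink_times_s) * 60.0 / clip_length_s


def blink_rate_flag(blink_times_s, clip_length_s, low=8.0, high=30.0):
    """Label the blink rate against an ASSUMED plausible range (low..high).

    The default range is a placeholder for illustration only.
    """
    rate = blinks_per_minute(blink_times_s, clip_length_s)
    if rate < low:
        return "unusually low"
    if rate > high:
        return "unusually high"
    return "within assumed range"


# Hypothetical one-minute clip in which the subject blinks only twice.
print(blink_rate_flag([2.0, 31.0], 60.0))
```

A flag from a check like this is a reason to look harder at the clip, not proof that it is synthetic.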

There is also a behavioral tell that doesn't require any technical knowledge: sudden high-stakes statements from powerful people appearing first on anonymous social media accounts rather than official channels. Real announcements by military officials, presidents, and executives are almost always made first through official channels: press releases, verified accounts, press conferences. If a major statement by a named official appears first as a viral video clip from an anonymous account, that's not a technical artifact. It's a provenance red flag that any careful reader can notice.

This behavioral check is actually more reliable right now than technical visual analysis, because visual quality is improving faster than behavioral conventions are changing. Institutions still have communication norms. Deepfakes can't easily fake an institutional process.

The 30-Second Rule

Before accepting any dramatic video of a public figure saying something consequential: pause 30 seconds. Ask: Did this appear first on an official channel? Is any major news organization reporting this through their own independent sourcing? If the answer to both is no, hold your judgment. Real news travels through verifiable institutional channels. Deepfakes can't fake those.
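The rule can be written down as a tiny decision procedure. This Python sketch is only an illustration: the `VideoClaim` fields and the returned labels are made-up names, and the rule itself only says what to do when both answers are no, so the other branch is an assumption.

```python
from dataclasses import dataclass


@dataclass
class VideoClaim:
    """Answers to the two questions in the 30-Second Rule (hypothetical fields)."""
    on_official_channel: bool      # Did it appear first on an official channel?
    independently_reported: bool   # Is a major outlet reporting it from its own sourcing?


def thirty_second_rule(claim: VideoClaim) -> str:
    """Return "hold" when both answers are no, as the rule states."""
    if not claim.on_official_channel and not claim.independently_reported:
        return "hold"
    # Assumption: a yes on either question means there is something to follow up,
    # not that the video is confirmed.
    return "follow up on the source"


# A viral clip with no official post and no independent reporting.
print(thirty_second_rule(VideoClaim(False, False)))
```

Notice that nothing in the procedure depends on how realistic the video looks.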

When Deepfakes Aren't Used for Politics

Most public discussion of deepfakes focuses on political disinformation. But the majority of deepfakes that actually circulate online, by volume, are not political. According to a 2019 report by Deeptrace (the company later renamed Sensity AI), 96% of the deepfake videos it catalogued were non-consensual intimate imagery: synthetic videos of real people, mostly women, generated without their knowledge or consent.

This is a different harm than political disinformation, but in some ways a more immediate one. In 2023 and 2024, several schools in New Jersey, Pennsylvania, and Spain reported incidents where students had generated non-consensual deepfake images of classmates using free tools available through standard app stores. The victims, almost exclusively girls, experienced documented psychological harm. Legislation in several US states moved to criminalize non-consensual deepfake imagery, but enforcement lagged significantly behind the technology's accessibility.

This matters for this course in a specific way: the ethical stakes of deepfake technology are not abstract or distant. They exist at the scale of your school. They affect people your age. Understanding how the technology works is not just about reading news more carefully. It's about understanding a capability that already exists in the hands of people in your immediate environment, and that has been used to cause specific, documented harm to teenagers.

The Institutional Channel Test

Knowing about the institutional channel test puts you ahead of most social media users. You don't need sophisticated video analysis software to apply it. You just need to ask one question before sharing: did this appear first through a verified, institutional channel? That question alone would stop most deepfake political disinformation from spreading. The reason it doesn't is that most people don't think to ask it. You now will.

An Ethical Question Without an Easy Answer

In 2024, the Indian film industry used AI deepfake technology to restore the performance of actor Irrfan Khan, who died in 2020, for a film he had been scheduled to appear in. His family consented. The filmmakers argued they were honoring his legacy and fulfilling a creative collaboration he had agreed to. Critics argued that no one, whether family members or studio executives, can meaningfully consent on behalf of a dead person's likeness, and that normalizing posthumous AI performance creates a framework in which studios will eventually use deceased actors without family approval at all.

Here is the question: does consent from a living family member make AI resurrection of a deceased person's likeness ethically acceptable? What if the person left a will saying they wanted their likeness to be used this way? What if they said nothing? What if they actively said they didn't want it, but their estate, which controls the rights, disagrees? There is currently no legal consensus and no industry standard. These decisions are being made film by film, contract by contract. Where would you draw the line?

Lesson 3 Quiz

5 questions: test your ability to apply deepfake detection reasoning to new situations.
1. The Zelensky deepfake in March 2022 was identified relatively quickly partly because of a visual artifact at his neck. What does this artifact tell us about how that deepfake was likely made?
Correct. Neck and jaw boundary issues are characteristic of face-replacement deepfakes, where the technical challenge is blending the swapped face onto an existing body. The seam is hardest to hide at the jaw and neck.
The neck artifact points to face-replacement technique, the hardest part of which is blending at the face-body boundary. This is a structural consequence of how face swapping works, not a training data or synchronization issue.
2. Why is the "institutional channel test" considered more reliable right now than visual deepfake detection?
Right. The insight is asymmetric: visual tells are a moving target as generators improve, but the social and institutional norms around how real announcements are made are stable. Exploiting that asymmetry is smart detection strategy.
The lesson's argument is about asymmetric rates of change: visual quality improves faster than institutional norms change. The institutional test exploits something the technology can't easily fake: a press release, a verified account, an independent corroborating source.
3. You receive a video of your school principal apparently announcing that exams are canceled this Friday. The video looks realistic. What should you do before telling your friends?
Exactly right, and this applies at the school level just as much as at the national level. Real institutional announcements travel through official channels first. A viral video that doesn't appear anywhere official is a major red flag regardless of visual quality.
Technical analysis (blink rates, detector tools, visual comparison) is harder and less reliable than the institutional channel check. Ask the simpler question first: did this appear through any official school channel?
4. The Sensity AI research finding that 96% of deepfakes by volume are non-consensual intimate imagery is significant for this course because:
Right. The lesson uses this statistic to make the stakes personal and immediate: this technology's primary harm is being done at a personal level, including in schools, not primarily in geopolitical contexts.
The lesson uses this statistic to reframe the stakes, away from abstract political harm toward immediate personal harm that already exists in high school environments. It's about proximity of the threat, not technical sophistication comparisons.
5. A filmmaker wants to use AI to create a posthumous performance of a deceased actor in their biopic. The actor's adult child has given written consent. A critic argues this is still ethically problematic. What is the strongest version of the critic's argument?
This is the strongest version as the lesson presents it: the concern is about the slippery slope of the precedent being set, not about this specific case alone. Consent frameworks established today shape what becomes acceptable tomorrow.
The strongest critic argument isn't about technical quality or financial conflicts. It's about the precedent being set. The lesson frames the concern as: whose consent counts, and what happens when the commercial incentive to use deceased likenesses outgrows whatever consent framework was established.

Lab 3: The Video Analyst

You've got a suspicious video, a deadline, and a partner who won't accept shallow reasoning.

Your Assignment

A video has appeared on social media showing the mayor of a mid-sized US city apparently admitting to taking bribes. The video has 2 million views. The mayor's official social accounts have posted nothing. No major news outlet has independently reported on it. The video looks realistic, with no obvious glitches. Your task is to decide: publish your analysis, or hold?

RENN is your video analysis partner. They expect you to use the institutional channel test, identify what additional evidence you'd need, and take a position on what to do. Minimum 3 exchanges to complete this lab.

Tell RENN: based on what you know, what is your current assessment of whether this video is likely real or fabricated, and what is the single most important piece of additional information you'd want before publishing anything?
RENN: Video Analysis Partner
Lab 3
Two million views, no official confirmation, no news outlet corroboration. The clock is ticking and everyone is asking you about it. Walk me through your thinking: real or fabricated, and what do you need to know before you say anything publicly?
Real or Rendered · Lesson 4 of 4

Building a Detection Mindset, Not Just a Checklist

Checklists expire. The habit of asking better questions doesn't.
In a world where any piece of media could be synthetic, what does it mean to be a reliable source of information for the people around you?

On August 31, 2023, photographs appeared across multiple social media platforms showing massive fires destroying large sections of Maui, Hawaii: specifically, the town of Lahaina, which had been devastated by genuine wildfires earlier that month. Some of the images were real. Some were AI-generated. Some were real photographs from different wildfires in different countries, misattributed to Lahaina. Within 72 hours, fact-checkers at Reuters, AFP, and the AP had sorted through hundreds of images, identifying fabrications and misattributions. They used reverse image search to trace provenance, checked metadata where available, geolocated images against satellite data, and cross-referenced the lighting and vegetation against known Lahaina geography.

What was striking about their methodology was how rarely they used AI detection tools. Instead, they applied a discipline of provenance tracking, asking not "does this look real?" but "where did this image come from, what is its history, and does that history check out?" The same photograph of a burning hillside could be real Maui footage or a 2021 wildfire in Greece; both look equally convincing. Only tracing the image's origin reveals which it is. The professional fact-checkers weren't better at seeing fake images. They were better at not trusting their eyes in the first place.

The Four Questions That Don't Expire

Visual tells change as technology improves. Detection software lags behind generation quality. No checklist of artifacts will remain valid for more than a couple of years. What remains stable is a set of questions about provenance (the origin and history of a piece of media) that apply regardless of what the technology looks like.

Question 1: Where did this first appear? Not who shared it with you, but where in the chain of distribution it originated. An image that first appeared on a verified news organization's account has a different credibility profile than one that first appeared on an anonymous account created three days ago. Most people never ask this question. Asking it makes you unusual.

Question 2: What is the claimed context, and can it be independently verified? An image of a fire means nothing without knowing where and when. A video of a public figure saying something means nothing without knowing when and where it was recorded. Can you find independent corroboration of the claimed context from a source that isn't just re-sharing the original?

Question 3: Who benefits if this is believed? This doesn't tell you whether it's real, but it tells you who has a motive to produce it. A deepfake of a political candidate saying something embarrassing right before an election benefits their opponent. Understanding motive doesn't prove fabrication, but it tells you where to look hardest.

Question 4: Have I seen the original? "Going to the source" sounds obvious but almost nobody does it. Many viral images and videos are screenshots of screenshots, compressed and cropped and re-uploaded until any metadata that might reveal their origin is long gone. Asking for the original file, and thinking about whether it's available, is a basic discipline that professional fact-checkers apply automatically.

Provenance: The documented history of where a piece of media came from: who created it, when, where it was first published, and how it traveled to where you encountered it. Provenance is more reliable than visual analysis for assessing authenticity.

Reverse Image Search and Other Tools

Alongside the four questions, there are practical tools worth knowing, not because they're infallible, but because they're fast and often decisive.

Reverse image search, available through Google Images, TinEye, and Bing, lets you upload an image and find other places it has appeared online. This is the fastest way to catch misattributed real images: photographs from old events or different countries being relabeled as something current. In the Maui wildfires case, several "Lahaina fire" images were identified within minutes as Greek wildfire photographs from 2021 via reverse image search. The technique doesn't identify AI-generated images (which may not appear anywhere else online), but it does catch the most common form of visual misinformation: real images misused out of context.

Image metadata (called EXIF data) can reveal the camera model, GPS coordinates, and timestamp of a photograph, but only if it hasn't been stripped. Most social media platforms automatically strip EXIF data when images are uploaded, so this is useful primarily for images received directly as files rather than screenshots shared from feeds.
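As a rough illustration of what "stripped" means, the Python sketch below checks whether a JPEG file still carries an Exif block at all. It assumes a simplified JPEG layout (a start marker followed by length-prefixed segments), and the function name is hypothetical. It only reports presence; it does not decode the camera model, GPS coordinates, or timestamp.

```python
import struct


def has_exif_segment(jpeg_bytes: bytes) -> bool:
    """Return True if the bytes look like a JPEG containing an Exif (APP1) segment."""
    if not jpeg_bytes.startswith(b"\xff\xd8"):  # every JPEG starts with this marker
        return False
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return False  # not positioned on a segment marker; stop (simplification)
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of compressed image data; metadata comes before this
            return False
        # The two-byte big-endian length counts itself plus the payload.
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False


# Tiny hand-built examples (not real photographs).
with_exif = b"\xff\xd8\xff\xe1" + struct.pack(">H", 12) + b"Exif\x00\x00" + b"\x00" * 4 + b"\xff\xda"
stripped = b"\xff\xd8\xff\xe0" + struct.pack(">H", 7) + b"JFIF\x00" + b"\xff\xda\x00\x02"
print(has_exif_segment(with_exif), has_exif_segment(stripped))
```

To try it on a file you received directly, read the bytes with `open(path, "rb").read()` and pass them in. A `False` result usually means the metadata was removed somewhere along the way, which is common and says nothing about whether the image is authentic.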

Geolocation, which means cross-referencing visual details in an image (building styles, signage, vegetation, street layout) against satellite imagery, is a skill developed by professional OSINT (Open Source Intelligence) analysts. Organizations like Bellingcat have used geolocation to verify or debunk dozens of high-stakes images in conflict zones. For most everyday purposes, this level of analysis isn't necessary, but knowing it exists matters when stakes are high.

AI detection tools like Hive Moderation and Illuminarty are useful as a first-pass signal, not a verdict. Use them to flag an image for further investigation, not to conclude it's fake. As of 2024, their false-positive rates on compressed images are high enough that a positive detection result should increase your scrutiny, not end your inquiry.
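One way to see why a positive result is a signal rather than a verdict is to work through the arithmetic. The Python sketch below applies Bayes' rule. Every number in the example is a made-up assumption chosen for illustration, not a measured rate for Hive Moderation, Illuminarty, or any other detector.

```python
def probability_fake_given_flag(prior_fake, true_positive_rate, false_positive_rate):
    """Chance an image is AI-generated, given that a detector flagged it.

    prior_fake: assumed share of images like this one that are AI-generated
    true_positive_rate: assumed share of AI images the detector flags
    false_positive_rate: assumed share of real images the detector flags anyway
    """
    flagged_and_fake = prior_fake * true_positive_rate
    flagged_and_real = (1 - prior_fake) * false_positive_rate
    total_flagged = flagged_and_fake + flagged_and_real
    if total_flagged == 0:
        raise ValueError("with these rates the detector never flags anything")
    return flagged_and_fake / total_flagged


# Hypothetical: 10% of such images are fake, the detector catches 90% of fakes,
# and it wrongly flags 20% of real (for example, heavily compressed) images.
print(round(probability_fake_given_flag(0.10, 0.90, 0.20), 2))
```

Under these assumed numbers, a flagged image is AI-generated only about one time in three, which is why a flag should raise your scrutiny rather than settle the question.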

The Minimum Viable Check

If you only ever do one thing before sharing a suspicious image: run it through Google Images reverse search. It takes 15 seconds. It catches the most common category of visual misinformation: real images misrepresented out of context. It won't catch AI-generated images, but it handles a very large share of what actually circulates.

What You Owe the People Around You

Here is a framing that most media literacy courses avoid because it sounds preachy, but which is actually just accurate: when you share something, you are making an implicit claim that it's worth other people's attention. You are vouching for it, at least at the level of "this seemed interesting and plausible enough to pass on." In a world of mass social media, that vouching multiplies rapidly. A single share by someone with 200 followers, if re-shared by three people with larger followings, can contribute to something reaching tens of thousands of people by morning.
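The arithmetic behind that last sentence is simple enough to sketch in Python. The follower counts for the three re-sharers are hypothetical, and the result is an upper bound: it assumes no overlap between audiences and that every follower actually sees the post.

```python
def potential_reach(first_sharer_followers, resharer_followers):
    """Upper bound on audience: the first sharer's followers plus each re-sharer's."""
    return first_sharer_followers + sum(resharer_followers)


# 200 followers, then three re-shares by accounts with larger (made-up) followings.
print(potential_reach(200, [8_000, 12_000, 15_000]))  # 35200
```

Each further round of re-sharing adds another sum like this one, which is how a small account's post ends up in front of a very large audience.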

This doesn't mean you should never share anything unless you've verified it. That's an impossible standard. What it means is that there's a proportional responsibility: the more dramatic the claim, the more harmful the implications if it's wrong, and the more shareable the content, the more scrutiny you owe it before passing it on. A funny meme about a celebrity looking weird at an award show? Low stakes, share away. A video showing a politician committing a crime right before an election? That warrants the 30 seconds of checking that most people skip.

The people who get this right aren't the ones who are paralyzed by skepticism. They're the ones who've developed a fast, automatic habit: before sharing something that makes me feel something strong (outrage, shock, vindication), pause and ask the four questions. Strong emotional reactions to media are exactly the condition in which fabrications are most effective, because emotion and scrutiny don't coexist easily. Training yourself to apply scrutiny precisely when you feel the most certain is the core skill this entire course is trying to build.

What You Can Now Do That Most People Cannot

You now have a framework, four questions about provenance, that doesn't expire as technology changes. You know the specific places AI image generators fail and why. You know the institutional channel test for video. You know what reverse image search can and cannot catch. You know what AI detectors can and cannot reliably conclude. Most adults, including many journalists and politicians, don't have this framework. That's not an exaggeration. This specific combination of conceptual understanding and practical habits is genuinely uncommon. Use it responsibly, and teach it to someone else.

An Ethical Question Without an Easy Answer

In early 2024, a researcher at Stanford published a proposal arguing that all AI-generated media should be required by law to carry an invisible digital watermark: a code embedded in every pixel that identifies the content as synthetically generated, readable by any detection tool. The proposal had significant support from the AI safety research community. It also had significant opposition from civil liberties lawyers, who argued that mandating watermarks on all AI-generated content creates a government-controlled registry of who is producing synthetic media and when, a surveillance infrastructure that could be used to track political speech, artistic expression, and private communications.

Here is the question: should governments require all AI-generated images and videos to carry an identifying watermark, knowing that this creates a detection infrastructure that also enables surveillance of who creates what? What if watermarks can be stripped? What if only democratic governments implement it, and authoritarian ones don't, giving them an advantage in disinformation? What if the watermark system is controlled by a private company rather than a government? There are no easy answers here. These debates are happening in actual legislative chambers right now. Where would you weigh in?

Lesson 4 Quiz

5 questions: apply the detection mindset, not just the tools.
1. Professional fact-checkers responding to the Maui wildfire image crisis primarily used provenance tracking rather than AI detection tools. What does this reveal about the limits of detection tools?
Exactly. The Maui case illustrates that misattributed real images are at least as common as AI-generated fakes, and AI detectors can't identify them. Provenance catches what detectors miss: the question of whether a real image is being used honestly.
The lesson's key insight from the Maui case is that detection tools solve only one problem (identifying synthetic content) while provenance tracking solves a broader one (whether any image, real or fake, is being used honestly in context).
2. Of the four provenance questions introduced in Lesson 4, which one would MOST directly help you catch a real photograph from a 2019 protest being shared as if it happened today?
Right. "Where did this first appear?" combined with reverse image search directly reveals that the image was circulating with different attribution previously. This is exactly how misattributed images get caught.
For a misattributed real photograph, the most powerful question is origin: where it first appeared. Reverse image search answers that question and reveals prior circulation with different context. EXIF data cannot flag the photograph as AI-generated, because it is a real photograph.
3. You feel a powerful surge of outrage watching a video that confirms something you already strongly believed about a political figure. According to Lesson 4, this emotional state should make you:
Correct. The lesson explicitly argues this: "Strong emotional reactions to media are exactly the condition in which fabrications are most effective, because emotion and scrutiny don't coexist easily." Train yourself to check hardest when you feel most certain.
The lesson inverts the intuitive response: emotional certainty should trigger more scrutiny, not less, and certainly not immediate action. Fabricators design content to provoke exactly the emotions that suppress critical evaluation. That's the mechanism to resist.
4. Reverse image search is reliable for catching which type of misinformation, and unreliable for which type?
Exactly as the lesson states. Reverse image search is powerful for misattributed real photographs (they have an existing online history) and limited against AI-generated images (which are new and have no prior circulation to trace).
The lesson is specific about this: reverse image search traces prior circulation, which catches misattributed real images. It doesn't identify synthetic content, because AI-generated images are new and have no prior history to find.
5. A researcher proposes mandatory AI watermarks in all generated content. A civil liberties lawyer opposes this. Which concern most directly challenges the watermark proposal on grounds beyond just "it could be wrong sometimes"?
This is the strongest civil liberties objection as the lesson presents it: the concern isn't just about whether watermarks work accurately. It's that the infrastructure required to implement them is itself a surveillance system, regardless of its stated purpose.
The civil liberties argument in the lesson isn't about cost or competitive disadvantage. It's about the surveillance infrastructure the system creates. Building the ability to track all synthetic content creation is a different kind of risk than accuracy errors.

Lab 4: The Policy Designer

You've learned to detect. Now you have to decide what to do about it at scale.

Your Assignment

You've been asked by a fictional city council to advise on a local policy: should the city require that all AI-generated images used in official city communications (press releases, social media, reports) carry a visible label identifying them as AI-generated? The council wants your recommendation and the reasoning behind it.

RENN is your policy analysis partner. They'll push back on weak arguments and ask you to consider second-order effects: what happens as a result of your policy that you didn't intend. You need to defend a clear position. Minimum 3 exchanges.

Start by giving RENN your recommendation (should the city require visible AI labels on official communications?) and your primary reason. Be specific about what you're trying to prevent or enable, not just what sounds good.
RENN: Policy Analysis Partner
Lab 4
The council is waiting. Mandatory AI labels on official city communications: yes or no, and what's driving that decision? Don't give me a committee answer. Give me a position and the reasoning that holds it up.

Module Test: Real or Rendered, Module 1

15 questions across all four lessons. Pass at 80% or above to complete the module.
1. The core reason the New Hampshire Biden robocall was a significant event was:
Correct. The significance was economic and technical: not that deception occurred, but that the barrier to sophisticated deception dropped to $500 and no specialized skills.
The lesson emphasizes the collapsed cost and skill barrier, not historical novelty or the investigative response. Review Lesson 1's opening case.
2. The "liar's dividend" refers to:
Right. The liar's dividend is a second-order effect: the existence of convincing fakes doesn't just enable new lies. It provides a defense for old ones by making all evidence deniable.
This concept is about how genuine evidence can now be denied using the existence of AI fakes as cover. Review the "Trust Collapse" section of Lesson 1.
3. What was the key barrier to convincing fake images before generative AI tools became publicly available around 2022?
Correct. The cost, time, and skill floor was the key barrier, and generative AI collapsed all three simultaneously.
The lesson focuses on practical barriers: cost, time, skill. Not legal or technical infrastructure barriers. Review the "How We Got Here" section.
4. A diffusion model generates images by:
Right. Diffusion reverses a noise-adding process to synthesize new images that fit learned visual patterns. It doesn't search or assemble from stored examples.
Diffusion models synthesize by denoising, not by searching or assembling. Review Lesson 2's explanation of how diffusion models work.
5. AI image generators struggle more with hands than with forward-facing portraits primarily because:
Correct. It's structured complexity plus extrapolation under uncertainty. The model knows appearances, not rules, and when the appearance is unusual, it fails.
The answer is about structured complexity and rule-based anatomy that the model doesn't explicitly understand. Review the "Why Hands, Text, and Edges Fail" section.
6. Text inside AI-generated images is typically unreadable gibberish because:
Right. The model learned the visual pattern of text without semantic understanding, so it produces the appearance of language without its meaning.
This failure is about the distinction between visual pattern learning and semantic understanding. The model knows what text looks like, not what text means. Review Lesson 2.
7. The Zelensky deepfake video in March 2022 was identified partly because of an artifact at his neck. What deepfake technique does this artifact suggest was used?
Correct. Jaw and neck boundary issues are characteristic indicators of face-replacement deepfakes, where the seam between swapped face and original body is the most technically challenging area to hide.
Neck artifacts in deepfakes point to face-replacement technique: the seam between face and body. This is a structural consequence of how face swapping works. Review Lesson 3's section on deepfake techniques.
8. The "institutional channel test" is currently considered more reliable than visual deepfake detection primarily because:
Right. The key insight is the asymmetric rate of change: technology improves fast, institutional norms change slowly. Exploit the stable element for detection.
The lesson's argument is about asymmetric rates of change between technology and institutional norms. Review the "institutional channel test" section in Lesson 3.
9. According to Sensity AI research cited in Lesson 3, the majority of deepfakes by volume that circulate online are:
Right. This statistic was included to make the stakes immediate and personal: this is the primary real-world harm from the technology, and it occurs at the scale of schools.
The 96% figure referred to non-consensual intimate imagery. The lesson used this to reframe where the most common harm is actually occurring. Review Lesson 3.
10. Provenance tracking is more reliable than AI detection tools for identifying misinformation because:
Exactly right. Provenance tracking is broader (it catches misattributed real images too, not just synthetic ones), and it doesn't suffer the accuracy degradation that hits detection tools after image compression.
The lesson's argument is that provenance tracking solves a broader problem (including misattributed real images) and is more robust to the image processing that defeats detection tools. Review Lesson 4.
11. Reverse image search is most useful for which specific category of misinformation?
Right. Reverse image search works by finding prior circulation of an image, which only helps when the image has existed before. Misattributed real photographs have prior history; new AI-generated images don't.
Reverse image search traces prior circulation. AI-generated images are new and have no prior history to find. It's specifically useful for misattributed real images. Review Lesson 4.
12. A friend sends you a video of a celebrity apparently saying something outrageous, and it makes you feel vindicated about something you already believed. Lesson 4 says your primary response should be to:
Right. The lesson explicitly argues that emotional certainty should trigger heightened scrutiny, not sharing. Fabricators design content to produce exactly the emotions that suppress evaluation.
The lesson's key behavioral insight is that strong emotional reactions, especially confirmation of existing beliefs, are the condition in which fake content works best. That's exactly when to slow down, not speed up. Review Lesson 4.
13. The strongest civil liberties objection to mandatory AI watermarking of all generated content is:
Correct. The civil liberties concern isn't about effectiveness or commercial impact. It's about what the necessary infrastructure enables beyond its stated purpose.
The lesson's civil liberties argument is about surveillance infrastructure, not effectiveness or commercial fairness. The concern is what the system enables beyond detection. Review the ethical question in Lesson 4.
14. An AI image shows a city official at what appears to be an undisclosed meeting. The image has perfect hands, readable text on a nearby sign, and no visible background artifacts. What should you conclude?
Exactly. Lessons 2 and 4 both make this point: visual tells are a moving target, and their absence is not evidence of authenticity. Provenance questions apply regardless of how the image looks.
The lessons specifically warn that absence of visual artifacts does not confirm authenticity as generators improve. And detection tools have significant error rates, especially on processed images. Apply provenance questions regardless. Review Lessons 2 and 4.
15. Which combination of the four provenance questions from Lesson 4 would be most relevant if you received a short video clip, appearing to show a school administrator making a controversial statement, from an anonymous classmate?
Correct. All four questions scale down to school-level incidents just as well as national ones. The institutional channel test applies (did it come through official channels?), motive is relevant, the original file matters, and independent verification of context is possible.
The provenance framework applies at all scales, including school incidents. The four questions are not limited to national media. All four are relevant here. Review Lesson 4's "Four Questions" section.