In October 2021, a 17-year-old in New Jersey posted a sarcastic tweet during a school lockdown drill — the kind of dark joke teenagers make when they're bored and a little anxious. Within hours, the tweet had been screenshotted, stripped of its context, and shared by thousands of strangers who had no idea it was a joke. Her school called her parents. News stations picked up the story. She received death threats from adults across the country. The post was up for less than six hours before she deleted it. The consequences lasted years. That's the world this course is about — not the world your parents grew up in, and not the world the adults writing internet-safety pamphlets imagined.
Right now, AI systems are being used to scan social media, analyze text for emotional tone, build profiles of people based on their posts, and flag behavior that algorithms decide is suspicious. Schools are buying software that monitors students' online activity. Employers search candidates' social histories going back a decade. The things you say online don't just reach your followers — they feed systems designed to draw conclusions about who you are. Most people have no idea this is happening. You are about to.
This course won't tell you to "stay safe online" or to never post anything you wouldn't want your grandmother to see. That advice was outdated in 2010. What it will do is show you the actual mechanics — why posts go viral against the poster's will, what a digital footprint actually means when AI is reading it, how online identities get stolen or distorted, and what privacy means when nothing is ever truly deleted. By the end, you'll have a framework for making smarter choices — not because someone told you to, but because you understand what's actually going on.
On the morning of December 20, 2013, a 30-year-old public-relations director named Justine Sacco boarded a flight from London to Cape Town, South Africa. Before takeoff, she posted a tweet to her 170 followers — a clumsy attempt at ironic humor about AIDS and race that was, by most readings, offensive. Then she put her phone on airplane mode and fell asleep for eleven hours.
While she slept, the internet did not. A writer named Sam Biddle saw the tweet, retweeted it, and added: "Justine Sacco, PR executive, @JustineSacco, is tweeting this from JFK right now." Her original post had been made to 170 people. Within two hours it had been seen by millions. A hashtag — #HasJustineLandedYet — trended globally. Strangers coordinated to find her flight number. Someone drove to the Cape Town airport just to photograph her expression when she turned on her phone and saw what had happened. By the time she landed, she had lost her job. This all occurred between approximately 11 a.m. and 10 p.m. on a single Friday, while she was completely unreachable in the air.
Sacco's case is not unique. It is, however, one of the most precisely documented examples of what journalist Jon Ronson later called a "public shaming event" — and it happened before modern AI content-amplification systems existed. Today, those systems are running. What took the 2013 internet a few hours of human effort now takes algorithms a few minutes. The question this lesson explores is not whether Justine was right or wrong. The question is: how does a post escape the person who made it?
Most people think content goes viral because it's popular. That's only half true. Content goes viral because platforms are designed to make it go viral — and the design works on emotions, not logic.
Every major social platform uses a system called a recommendation algorithm. Think of it as a machine whose job is to keep you scrolling. It learns what makes you stop and engage — what makes your thumb pause, what makes you tap like, what makes you comment in anger. Then it shows you more of that. Not because it cares about truth or fairness, but because engagement equals time-on-app, and time-on-app equals advertising money.
The emotions that drive the most engagement are not happiness or calm. Research led by William Brady at New York University, published in 2017, found that moral outrage spreads faster than almost any other kind of content. In that study's sample of tweets about polarizing political topics, each additional word signaling moral emotion (anger, disgust, perceived injustice) increased the likelihood that a post was shared by about 20%. A post that makes someone feel righteous, as if they're punishing someone who deserves it, spreads almost automatically.
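To see how a per-word effect like this adds up, here is a minimal sketch. It assumes the increase compounds with each additional outrage word, which is a simplification, and the function name and the 20% rate are placeholders you can swap for the study's figure.

```python
def relative_share_rate(outrage_words: int, per_word_boost: float) -> float:
    """Expected sharing relative to a post with no outrage words (baseline = 1.0).

    Assumption: the per-word increase compounds multiplicatively.
    """
    if outrage_words < 0:
        raise ValueError("outrage_words must be non-negative")
    return (1.0 + per_word_boost) ** outrage_words


# Placeholder rate for illustration; substitute the figure reported by the study.
ILLUSTRATIVE_BOOST = 0.20

for words in range(5):
    rate = relative_share_rate(words, ILLUSTRATIVE_BOOST)
    print(f"{words} outrage words -> {rate:.2f}x the baseline share rate")
```

Under these assumptions, three outrage words put a post at roughly 1.7 times the baseline, before any algorithm has touched it.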
Justine Sacco's tweet was tailor-made for this, even though she never intended it to be. It combined a recognizable name, a clear "villain" role, a moral transgression, and an irresistible dramatic element — she didn't know. The not-knowing made it a story. And once it was a story, the platform's mechanics did the rest.
In 2013, amplification was driven mostly by humans retweeting. Today, AI systems decide within milliseconds whether to push content to thousands of extra people. A post that gets a few angry replies in the first ten minutes can be auto-promoted to hundreds of thousands of users before any human reviewer sees it. The window between "posted" and "viral" is shorter than it has ever been.
There is a concept in communication theory called context collapse. Here's what it means in plain terms: when you talk to different people in real life, you naturally adjust. You talk differently to your best friend than you do to your teacher, your grandparent, or a stranger on the street. You're not being fake — you're just reading the room. Every human does this. It's called code-switching, and it's a sign of social intelligence, not dishonesty.
Online, that adjustment is almost impossible. When you post on a public platform, you are technically speaking to everyone at once — your close friends, your relatives, people who hate you, people who've never heard of you, journalists, future employers, and AI systems that crawl the internet to build databases. All of those audiences collapse into one. The thing you posted for your three closest friends can be read by your principal within the hour.
Context collapse was named and described by researcher danah boyd (she spells her name in lowercase) in her 2014 book It's Complicated: The Social Lives of Networked Teens. Boyd studied thousands of teenagers' online behavior over a decade and found that the most consistent problem wasn't that young people were being reckless. It was that they had no adequate vocabulary for the fact that their intended audience and their actual audience were completely different things.
Justine Sacco was posting to 170 followers, most of whom presumably understood her sense of humor. Her actual audience, by nightfall, was tens of millions of strangers — none of whom had any context for who she was or what she meant. That gap — between who you were talking to and who actually hears you — is context collapse. And AI systems that automatically scrape, index, and redistribute content make that gap permanent.
When someone says "just don't post anything offensive," they're treating context collapse as a content problem. It's actually an audience problem. The exact same words that are funny to your friend group can be career-ending to a stranger with no context. Understanding this means you're thinking about communication at a level most adults haven't caught up to yet.
In 2013, the viral spread of Justine Sacco's tweet was driven by human decisions — each retweet was a person choosing to share. What's different now is that AI recommendation systems have inserted themselves into that process, and they don't make choices the way humans do. They optimize.
Here's a concrete example. In 2021, Facebook's own internal research — later leaked by whistleblower Frances Haugen — revealed that the platform's algorithm had been modified in 2018 to prioritize content that received "angry" reactions. The goal was to increase engagement. The result, as Facebook's own researchers documented, was that the algorithm systematically amplified outrage, misinformation, and divisive content because those got more clicks. The AI wasn't trying to make people angry. It just learned that anger kept people on the app, and it did its job.
This matters for understanding your own digital life because the AI systems running these platforms don't know you. They don't know your tone, your irony, your context, or your intent. They know engagement signals. A post you made sarcastically gets treated identically to one made sincerely. A joke that your friends understood gets amplified to strangers who don't get it. The AI is a powerful distribution machine that is completely indifferent to meaning.
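A toy ranking function makes the "indifferent to meaning" point concrete. This is only a sketch: the signal names and weights are hypothetical, not any real platform's formula. The `intent` field exists in the data, but the scoring function never looks at it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    text: str
    intent: str            # known only to the author; the ranker never reads it
    likes: int
    comments: int
    shares: int
    angry_reactions: int


# Hypothetical weights chosen for illustration only.
WEIGHTS = {"likes": 1.0, "comments": 3.0, "shares": 5.0, "angry_reactions": 5.0}


def engagement_score(post: Post) -> float:
    """Score a post from its engagement counts alone."""
    return sum(weight * getattr(post, signal) for signal, weight in WEIGHTS.items())


def rank_feed(posts: list[Post]) -> list[Post]:
    """Highest-scoring posts first."""
    return sorted(posts, key=engagement_score, reverse=True)


sarcastic = Post("Best day ever.", "sarcastic", likes=40, comments=25, shares=10, angry_reactions=30)
sincere = Post("Best day ever.", "sincere", likes=40, comments=25, shares=10, angry_reactions=30)
calm = Post("Nice walk today.", "sincere", likes=60, comments=2, shares=1, angry_reactions=0)

for post in rank_feed([calm, sarcastic, sincere]):
    print(f"{engagement_score(post):6.1f}  {post.intent:9}  {post.text}")
```

The sarcastic and sincere posts get the same score because their signals are identical, and both outrank the calm post even though it has more likes.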
By 2023, short-video platforms like TikTok were running recommendation systems so refined that they could identify a user's emotional state from their scrolling patterns and adjust content delivery accordingly — slowing down when a user seemed to pause longer on certain topics, speeding up when they were disengaged. These systems have no concept of "this post might ruin someone's life." They have only the concept of "this post drives engagement."
If a platform's AI system amplifies your post to millions of people who misunderstand it and those people send you death threats — who is responsible? You made the post. The AI did the amplifying. The strangers sent the threats. At what point does the platform bear responsibility for what its own algorithm does to real people? This question is being debated right now in courts and legislatures around the world, and nobody has a settled answer yet.
There is a cultural myth that going viral is good. Sometimes it is. Musicians have been discovered, businesses have been launched, injustices have been exposed because the right content reached the right people at the right time. But the same mechanics that create a lucky break can create a catastrophe — and the person it's happening to rarely knows which one it is until it's over.
Consider what happened to Ghyslain Raza, a 15-year-old Canadian student who in 2002 filmed himself swinging a golf ball retriever like a lightsaber in a school studio and forgot to erase the tape. Classmates found it, digitized it, and posted it online. By 2006 it had been viewed over 900 million times — he became known as "Star Wars Kid," one of the first major internet viral figures. Raza did not benefit from this. He was bullied so severely that his parents pulled him from school and he required psychiatric care. His family sued the classmates who posted the video. In 2013, he spoke publicly about the experience for the first time: "No matter how hard I tried to ignore people telling me to commit suicide, I couldn't help but feel worthless, like my life had no value."
Raza's case predates AI-driven amplification by almost two decades. The video spread through early message boards and manual sharing — slow by today's standards. In a 2023 environment, with algorithmic amplification, that same video would have reached a billion views in days, not years. The human cost — the bullying, the shame, the psychological harm — would have been compressed into weeks instead of years, with no off-ramp.
Knowing this changes how you see every piece of content that "blows up." Behind almost every viral post about a private individual is a real person who did not design their life to be public entertainment. This doesn't mean viral content is always wrong. It means you now have a frame — a way of seeing the person inside the post — that most people scrolling past them don't have.
Virality is not a feature of content. It is a feature of systems — systems designed by companies, optimized by AI, and activated by human emotion. You are not powerless inside those systems, but you are also not in control of them. The gap between what you intended and what actually happens to your post can be enormous. Lesson 2 explores what happens after the damage — and why the internet's memory is longer than you think.
You are a digital media auditor. Your job is to analyze viral events: not to decide if someone was a good or bad person, but to identify the mechanics that made their content spread. The AI you're working with is a fellow analyst: skeptical, direct, and ready to push back if your reasoning is weak.
Use the case of Justine Sacco (2013) or Ghyslain Raza (2002) — or bring in another case you know about. Identify: which virality mechanics were active, whether context collapse occurred, what role AI/algorithmic amplification played (if any), and whether the spread was proportionate to what the person actually did.
In 1998, a Spanish lawyer named Mario Costeja González had his home repossessed and auctioned to pay a social security debt. As required by law, the auction was published in a newspaper, La Vanguardia. The debt was paid. The auction happened. Life moved on. Then the internet arrived. By 2009, if you searched Mario's name on Google, the first result was still that 1998 auction announcement — a fact from his past that he had legally resolved over a decade earlier, now permanently attached to his professional identity.
Mario complained to Google and to the Spanish data protection authority. Google refused to remove the link, arguing it was public information published by a newspaper. In 2014, the Court of Justice of the European Union ruled against Google. The court established what is now known as the "right to be forgotten" — the legal principle that individuals have the right, under certain conditions, to request that search engines remove links to information about them that is outdated, irrelevant, or harmful to their reputation. By 2020, Google had received over 845,000 such removal requests from Europeans.
The United States has no equivalent law. If you are in the US and something embarrassing or damaging was published about you online, it stays — indexed, searchable, and available to any AI system that crawls the web. Mario's case shows that the fight over digital memory is real, legal, and ongoing. It also shows something more immediate: the internet doesn't forget just because you do.
The phrase "digital footprint" gets used so often in internet-safety talks that it's lost most of its meaning. So let's be precise about what it actually includes, because the real list is probably longer than you think.
Your digital footprint has two layers. The first is your active footprint — things you deliberately created: posts, comments, messages you sent, accounts you made, photos you uploaded, reviews you wrote. These feel like choices, and they are. The second layer is your passive footprint — data generated about you without you consciously creating it. This includes: every search you've ever typed, every website you've visited, how long you spent on each page, your location when you accessed an app, what time of day you're most active, what you looked at for more than three seconds, which ads you hovered over without clicking.
The passive footprint is almost always larger than the active one. And it's almost entirely invisible to you. You can't see it. You can't easily delete it. And it's being read by AI systems constantly.
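One way to picture the two layers is as labels on a log of events. The sketch below is illustrative only: the event names are hypothetical labels for the items listed above, and the "session" is invented.

```python
from collections import Counter
from enum import Enum


class Layer(Enum):
    ACTIVE = "active"    # deliberately created by the user
    PASSIVE = "passive"  # generated about the user as a side effect


# Hypothetical event names standing in for the examples in the text.
EVENT_LAYER = {
    "post": Layer.ACTIVE,
    "comment": Layer.ACTIVE,
    "photo_upload": Layer.ACTIVE,
    "review": Layer.ACTIVE,
    "search_query": Layer.PASSIVE,
    "page_visit": Layer.PASSIVE,
    "time_on_page": Layer.PASSIVE,
    "location_ping": Layer.PASSIVE,
    "ad_hover": Layer.PASSIVE,
}


def summarize(events: list[str]) -> Counter:
    """Count how many logged events fall into each layer."""
    return Counter(EVENT_LAYER[kind] for kind in events)


# An invented few minutes of phone use: one deliberate post, many side-effect records.
session = ["page_visit", "time_on_page", "search_query", "page_visit",
           "time_on_page", "ad_hover", "location_ping", "post"]
counts = summarize(session)
print(f"active: {counts[Layer.ACTIVE]}, passive: {counts[Layer.PASSIVE]}")
```

In this made-up session, one thing was created on purpose and seven records were generated without the user doing anything deliberate.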
In 2013, researchers at the University of Cambridge published a study showing that from Facebook likes alone, without any other data, machine learning models could predict a person's political views, religion, and sexual orientation with striking accuracy. A 2015 follow-up found that, given enough likes, a model could judge someone's personality about as accurately as, and sometimes better than, people who had known that person for years. The likes were passive data. Most people never thought of liking a post as a statement about who they are. The AI read it that way anyway.
Before going further: think about one thing you've searched for online in the past week that you'd prefer nobody else to know. That search is stored somewhere. It was processed by an algorithm that used it to build a profile of your interests. This isn't hypothetical — it is the actual business model of most free online services.
In 1996, a nonprofit organization called the Internet Archive began systematically copying and storing the public web. Automated bots crawl billions of web pages and save snapshots, revisiting some pages every few months and others daily. By 2023, the archive contained over 800 billion web pages. It is accessible to anyone, for free, at web.archive.org, through a search tool known as the Wayback Machine.
This means that a post you made on a public forum in 2017, deleted in 2019, may still exist in the Wayback Machine's archive from 2017. Journalists regularly use it to recover deleted statements. So do opposition researchers in political campaigns. So do employers doing background checks. And increasingly, AI systems training on internet data have ingested archived web content — meaning something you deleted years ago may have already been read and processed by AI models that will continue using that training data for years.
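If you want to check whether a page has been archived, the Internet Archive offers a public availability lookup. The sketch below builds a query and reads the reply. The page address `example.com/forum/post-123` is hypothetical, the sample reply is hand-written in the shape the service documents, and `lookup` needs network access, so it is defined but not called here.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_query_url(page_url: str, timestamp: str = "") -> str:
    """Build an availability query; timestamp (e.g. YYYYMMDD) is optional."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urllib.parse.urlencode(params)}"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def lookup(page_url: str, timestamp: str = "") -> Optional[str]:
    """Query the live service (requires network access)."""
    with urllib.request.urlopen(build_query_url(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))


# Offline demonstration with a hand-written reply, not real archive data.
sample_reply = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20170601000000/http://example.com/",
            "timestamp": "20170601000000",
            "status": "200",
        }
    }
}
print(build_query_url("example.com/forum/post-123", "20170601"))
print(closest_snapshot(sample_reply))
print(closest_snapshot({"archived_snapshots": {}}))
```

An empty `archived_snapshots` object means no copy was found for that address, which is not the same as proof that no copy exists anywhere else.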
In 2019, a high school student from Florida named Kyle Kashuv, who had become a prominent gun-rights activist after surviving the 2018 Parkland shooting, had his admission to Harvard University rescinded. The reason: private messages from two years earlier, when he was 16, had been screenshotted and shared publicly. The messages contained racial slurs. Kashuv apologized publicly and said he had grown significantly since writing them. Harvard revoked admission anyway. The institution was deciding who he was in 2019 based on what he'd written as a 16-year-old in 2017.
This is not a lesson about whether Harvard was right or wrong. That's genuinely complicated. It is a lesson about the gap between the person who wrote something and the record that remains after they've changed.
Kyle Kashuv wrote racist messages when he was 16. He says he changed. Should a permanent digital record define who someone is years later, when they were young and the context has shifted? How long should a record be held against someone? Is there a difference between a public figure and a private person in this regard? There's no agreed answer — but this exact debate is shaping privacy law, college admissions policies, and HR practices right now.
Here is something that surprises most people: modern AI language models — the kind that power chatbots and writing assistants — were trained on enormous amounts of text scraped from the internet. That includes old forum posts, deleted blog entries, social media content, news articles, and archived web pages. When you "delete" a post, you remove it from the platform's visible interface. But if that post was already scraped by a crawler — by the Internet Archive, by a search engine's index, by an AI training dataset — the deletion didn't reach those copies.
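The gap between deleting a post and deleting its copies can be shown with a toy model. Both classes below are hypothetical stand-ins, not how any real platform or crawler is built.

```python
class Platform:
    """A toy platform: posts live in a dict and can be deleted by their author."""

    def __init__(self) -> None:
        self.posts: dict[int, str] = {}

    def publish(self, post_id: int, text: str) -> None:
        self.posts[post_id] = text

    def delete(self, post_id: int) -> None:
        self.posts.pop(post_id, None)


class Crawler:
    """A toy crawler: it keeps its own copy of whatever was visible when it visited."""

    def __init__(self) -> None:
        self.copies: dict[int, str] = {}

    def crawl(self, platform: Platform) -> None:
        self.copies.update(platform.posts)


platform, crawler = Platform(), Crawler()
platform.publish(1, "a post the author later regrets")
crawler.crawl(platform)      # the copy is made while the post is still public
platform.delete(1)           # the author deletes the original

print(1 in platform.posts)   # False: gone from the platform
print(1 in crawler.copies)   # True: the crawler's copy is untouched
```

The `delete` call only has access to the platform's own storage. Nothing in it can reach into the crawler, which is the whole problem.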
This created a legal crisis in 2023 that is still unresolved. Authors discovered their published books had been included in datasets used to train AI models without permission. In July 2023, Sarah Silverman, Christopher Golden, and Richard Kadrey filed a lawsuit against Meta, arguing that its LLaMA AI model had been trained on illegally obtained copies of their copyrighted work. Similar suits were filed against OpenAI. The underlying question — who owns data once it's online, and what can AI companies do with it — has no settled legal answer.
For you, the practical implication is this: the internet is not a whiteboard. It is closer to a stone carving. You can paint over the carving. You can put a cloth over it. But underneath, the marks remain — and sophisticated tools, including AI systems, can see through paint.
Every conversation about "just delete it" assumes that deletion is the end of the story. Knowing what you now know about the Wayback Machine, AI training data, and the difference between visible and indexed content, you understand that deletion is often just the beginning of a much longer story. This knowledge bears on real decisions being made right now about AI copyright law, data privacy regulations, and whether the US should adopt its own "right to be forgotten."
If you can't truly erase the past, what can you actually do? This is a real question with real, if imperfect, answers.
The first strategy is proactive publication — the idea that the best way to control what comes up when someone searches your name is to actively create good content that ranks higher than bad content. Search engines show results in order of relevance and authority. A well-maintained public profile, a portfolio of work, a consistent positive presence can push older or negative results down — not eliminate them, but make them less visible.
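Here is a rough sketch of why publishing new material pushes an old result down. Real search ranking is far more complicated. This assumes each result has a single score standing in for relevance and authority, and every title and number is invented.

```python
def rank(results: dict[str, float]) -> list[str]:
    """Order result titles by score, highest first."""
    return sorted(results, key=results.get, reverse=True)


def position(results: dict[str, float], title: str) -> int:
    """1-based position of a result in the ranked list."""
    return rank(results).index(title) + 1


# Hypothetical results for a search on one person's name.
before = {"old embarrassing forum thread": 0.60, "namesake's bakery": 0.40}
after = {
    **before,
    "personal portfolio site": 0.90,
    "school robotics team profile": 0.75,
    "published essay": 0.70,
}

print(position(before, "old embarrassing forum thread"))  # 1
print(position(after, "old embarrassing forum thread"))   # 4
```

The old thread is still in the list with the same score. It has moved from first to fourth only because stronger results now sit above it.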
The second strategy is understanding the legal landscape. In Europe, the "right to be forgotten" established in the 2014 Costeja González ruling means you can formally request that Google and other search engines remove links to outdated personal information. In the US, the Children's Online Privacy Protection Act (COPPA) provides some protections for users under 13. Several US states — California, Virginia, Colorado — have passed their own data privacy laws that give residents more rights to request data deletion. These are imperfect tools, but they exist.
The third strategy is the hardest: accepting that some things persist and managing expectations about future audiences accordingly. This doesn't mean never posting anything personal. It means occasionally asking: if someone who doesn't know me at all sees this five years from now with no context, what might they conclude? That question isn't about being paranoid. It's about the gap between your intended audience and your actual audience — the context collapse you learned about in Lesson 1, now viewed through the lens of time rather than space.
You are a digital footprint investigator. A college admissions office has asked you to assess what a candidate's online presence says about them — not based on their application, but on publicly available data. Your AI partner helps you think through what data would actually be findable and what it would reveal.
The twist: you are investigating yourself (hypothetically) or a fictional 17-year-old who has been online since age 10. What active and passive data would exist? What would AI systems infer from it? Is that inference fair?
In late 2023, actress Scarlett Johansson discovered that her likeness had been used in AI-generated advertisements being run on major social media platforms: ads she had never agreed to, for products she had never endorsed. The ads used deepfake technology to superimpose her face on a spokesperson, then used AI voice cloning to add narration in a voice designed to sound like hers. The ads ran for weeks before being taken down. Similar incidents happened to Tom Hanks, who in September 2023 publicly warned his followers on Instagram: "There's a video out there promoting some dental plan with an AI version of me. I have nothing to do with it."
Then in May 2024, OpenAI released a voice for its ChatGPT assistant called "Sky." Johansson, who had voiced the AI character Samantha in the 2013 film Her, said it sounded so similar to her own voice that friends and her own agents contacted her assuming she had agreed to the deal. She had not. She had in fact declined a direct offer from OpenAI's CEO. OpenAI paused the Sky voice while the dispute was ongoing. The legal question has no settled answer: does a person own the right to their voice and likeness in the AI era, even when the AI never literally copied them but only mimicked them?
These cases involve celebrities because celebrities have public profiles and legal resources. But the underlying technology is available to anyone. In 2023, the cost of training an AI model to mimic a specific person's writing style, voice, or visual appearance had dropped to near zero for anyone with moderate technical skill. The question of who you are online is no longer only about what you post. It's about what AI can construct from what you've posted — and whether you have any say in the result.
Most people think of identity theft as stealing a credit card number or a Social Security number. That version still exists. But in the AI era, digital identity theft has added several new forms that are harder to detect and harder to recover from.
The first new form is account takeover: gaining unauthorized access to someone's actual accounts through password theft, phishing (fake login pages designed to capture your credentials), SIM-swapping (convincing a phone carrier to transfer someone's number to the thief's device, which defeats text-message two-factor authentication), or social engineering (tricking a company's own employees into handing over access). In July 2020, a 17-year-old from Tampa, Florida named Graham Ivan Clark was arrested for taking over the Twitter accounts of Barack Obama, Joe Biden, Elon Musk, Bill Gates, and dozens of others simultaneously, and using them to run a bitcoin scam that netted over $100,000 in a single afternoon. According to Twitter, the attackers got in by manipulating Twitter employees over the phone to reach internal tools, not by breaking into each account one at a time.
The second new form is synthetic identity creation — using publicly available data about you to build a fake version of you. This might mean creating a fake social media account using your photos, writing in your style based on your posts, and using it to say things you never said. In 2022, researchers at Georgetown University documented cases where AI-generated "sock puppet" accounts — fake accounts designed to look like real people — were being used to spread political messages that the real people whose identities were being mimicked would never have supported.
The third form — the newest and least legally defined — is AI voice and likeness cloning. Given fifteen to thirty seconds of someone's recorded voice, widely available AI tools can generate unlimited new audio in that person's voice saying anything. Given enough photos, AI can create video of someone doing things they never did.
Here is a distinction that changes how you think about everything: there is a difference between you and your digital profile. You are a person — complex, changeable, full of context, capable of explaining yourself. Your digital profile is a collection of data points — interpreted by algorithms, indexed by search engines, and read by AI systems that have no capacity to ask you what you meant.
This distinction matters because systems that make decisions about you (hiring algorithms, credit scoring systems, college admissions software, social media safety tools) do not interact with you. They interact with your profile. In 2014, Amazon began building an AI hiring tool intended to automatically screen job applications. As Reuters reported in 2018, the company discovered that the model had learned to penalize resumes that included the word "women's" (as in "women's chess club") and to downgrade graduates of all-women's colleges. Amazon scrapped the tool. The AI hadn't been told to discriminate. It had just learned patterns from a decade of Amazon's previous hiring decisions, and those decisions had been made by humans who had, consciously or not, favored male candidates.
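The way a model can pick up bias without being told to can be shown with a deliberately tiny example. This is not Amazon's model. The "historical decisions" below are entirely made up, with the skew planted on purpose, and the learning rule (hire rate for a keyword minus the overall hire rate) is a simplification chosen for readability.

```python
from collections import defaultdict

# Made-up history: (keywords on the resume, was the person hired?).
# The skew against resumes containing "women's" is planted deliberately.
HISTORY = [
    ({"python", "chess club"}, True),
    ({"java", "chess club"}, True),
    ({"python", "robotics"}, True),
    ({"java", "robotics"}, False),
    ({"python", "women's", "chess club"}, False),
    ({"java", "women's", "robotics"}, False),
    ({"python", "women's", "robotics"}, False),
    ({"java", "chess club"}, True),
]


def learn_token_weights(history):
    """Weight = hire rate among resumes containing the token, minus the overall hire rate."""
    overall = sum(hired for _, hired in history) / len(history)
    seen, hires = defaultdict(int), defaultdict(int)
    for tokens, hired in history:
        for token in tokens:
            seen[token] += 1
            hires[token] += hired
    return {token: hires[token] / seen[token] - overall for token in seen}


def score(tokens, weights):
    """Score a new resume by summing the learned weights of the tokens it contains."""
    return sum(weights.get(token, 0.0) for token in tokens)


weights = learn_token_weights(HISTORY)
flagged = "women's"
print(f"learned weight for {flagged!r}: {weights[flagged]:+.2f}")

resume_a = {"python", "chess club"}
resume_b = {"python", "chess club", "women's"}
print(f"resume A: {score(resume_a, weights):+.2f}")
print(f"resume B: {score(resume_b, weights):+.2f}")
```

No line of this code mentions gender. The penalty comes entirely from the pattern in the past decisions, and two resumes that differ by one word end up with different scores.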
Your digital profile will be read by systems like this. Understanding that the profile and the person are different things — and that the profile can be wrong, incomplete, or actively biased — is one of the most practically useful things you'll take from this course.
When a system makes a decision about you based on your digital profile, and the decision is wrong, the instinct is to say "the AI made a mistake." The deeper truth is that the AI made a decision based on incomplete data — data that can never capture who you actually are. Knowing this distinction means you can challenge those decisions more effectively, because you know what the system is reading and what it's missing.
In April 2023, a journalist named Kashmir Hill at The New York Times reported on an experiment: she asked ChatGPT to write a biography of herself based on publicly available information about her. The biography contained several invented facts — conferences she had never attended, positions she had never held, articles she had never written. The AI wasn't lying in any meaningful sense. It was doing what language models do: generating plausible-sounding text based on patterns. But those invented facts were presented with the same confident tone as the accurate ones. If someone read that biography without knowing Kashmir Hill personally, they would have no way to know which parts were real.
This phenomenon — AI generating false information about real people stated confidently and specifically — is called hallucination. It's a known limitation of current AI systems. And it creates a practical problem: AI-generated profiles of real people are already appearing on the internet, being indexed by search engines, and potentially feeding back into the training data of future AI models. A hallucinated fact about you, repeated often enough across the internet, could become a durable part of your digital profile — attributed to you, searchable under your name, and very difficult to correct.
In 2023, a Georgia radio host named Mark Walters sued OpenAI after a ChatGPT-generated legal summary falsely accused him of embezzlement. The summary was produced in response to a journalist's question. It named him specifically, described crimes in detail, and was entirely fabricated. This was the first defamation lawsuit filed against an AI company in the US. The case was ongoing as of 2024. The question at its center — whether an AI company is legally responsible for falsehoods its model invents about real people — has no settled precedent.
If an AI model hallucinates false criminal accusations about a real person and those accusations spread online, who is responsible? The company that built the model? The person who asked the question? The platform that hosted the output? The answer determines who you would sue, who would pay, and whether the harm could ever be undone. Courts around the world are working on this right now — without consensus.
There's a realistic version of control and an unrealistic one. The unrealistic version is: manage everything about your online identity so that no one can ever misrepresent you. That's not possible. The realistic version is: understand what systems are reading about you, reduce unnecessary exposure, correct errors when you find them, and build a strong enough authentic presence that misrepresentations are harder to sustain.
Specifically: review what appears when you search your own name. If there are results you didn't create, know that they exist. In Europe, you can request removal under GDPR. In the US, you can often request removal directly from websites (with varying success) or from data broker services that aggregate personal information. Companies like Spokeo, Whitepages, and BeenVerified collect and sell profiles of private individuals — most allow opt-out requests.
More importantly: the most durable protection for your digital identity is a well-documented, authentic one. Not because it prevents bad actors — it doesn't — but because people who know who you actually are, from your own documented record, are more resistant to being fooled by a fake version of you. A person with no real online presence is easier to impersonate than a person whose real presence is clear and consistent.
And finally: understand that the gap between your digital profile and your actual self is not a flaw in you. It is a structural feature of how these systems work. You are not reducible to your posts. Neither is anyone else.
You are an identity auditor working with a law firm that handles cases where AI-generated false information has damaged someone's reputation. Your job is to identify the specific mechanisms by which an AI profile diverges from reality — not just that it's wrong, but how and why it gets things wrong.
The fictional subject is Alex Chen, a 19-year-old college student who has been online since age 9. Alex's digital footprint includes: a gaming YouTube channel from ages 10–14 (now private), a Twitter account active 2016–2020 (deleted), current Instagram (private), and three years of Discord server activity (semi-public). An AI was asked to generate a profile of Alex for a job application background check.
In January 2020, journalist Kashmir Hill (the same reporter from Lesson 3) published a story in The New York Times that most readers found genuinely unsettling. A company called Clearview AI, founded in 2017, had built a facial recognition database containing over three billion photographs — scraped without permission from Facebook, Venmo, YouTube, and millions of other websites. The database allowed law enforcement clients to upload a photo of an unknown person and receive back a list of results showing that person's social media profiles, along with links to the pages where their photos appeared.
Clearview had sold this service to over 600 law enforcement agencies in the US and abroad, including the FBI and Interpol, before the Times story ran. Every photograph in the database had been technically public — posted voluntarily on social media, visible to anyone who visited those pages. Clearview argued that scraping public photos was legal, just as anyone could manually search someone's public social media. The company's attorney compared it to "a super-Google for faces." Privacy advocates argued that there is a fundamental difference between a photo being visible to the people who encounter it naturally and a company aggregating billions of such photos into a searchable system that could track any individual's movements, relationships, and history.
By 2022, Clearview had been banned from selling its services to private companies in the US, fined over $9 million in the UK, and ordered to delete data on European citizens. The US government continued to use it. The core legal question — whether scraping and aggregating public data crosses a privacy line even if each individual piece of data was technically public — is still being resolved in courts. Clearview did not invent this question. It just made it impossible to ignore.
The traditional definition of privacy is "the right to be left alone." That definition comes from an 1890 Harvard Law Review article by Samuel Warren and Louis Brandeis, written in response to newspapers publishing society gossip, which was the cutting-edge privacy threat of the time. It's not a bad definition. But it was written for a world where information was scarce and required human effort to collect and distribute.
Legal scholar Helen Nissenbaum proposed a more useful framework in her 2010 book Privacy in Context. Her concept, called contextual integrity, argues that privacy is not about secrecy; it's about appropriate information flow. Information flows appropriately when it matches the norms of the context in which it was originally shared. A doctor sharing your medical information with another doctor is appropriate. A doctor sharing that same information with your employer is a violation, not because the information was ever a complete secret, but because it moved outside the context where it was supposed to stay.
This framework is far more useful for thinking about Clearview. The photos in Clearview's database were public. But they were posted in a specific context — social media profiles, where the expected audience is people who encounter the profile naturally. Aggregating those photos into a facial recognition database for law enforcement fundamentally violates the original context. The information moved outside the norms of the context where it was shared. Under Nissenbaum's framework, that's a privacy violation — even if each individual photo was technically visible.
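One way to make the idea concrete is to treat each movement of information as a "flow" and check it against the norms of the context it came from. The short Python sketch below is only an illustration of that reasoning, not a real legal test: the `Flow` class, the `NORMS` table, and the recipient labels are all invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One movement of information from its original context to a recipient."""
    info: str
    source_context: str
    recipient: str


# Hypothetical norms: who is an expected recipient in each context.
# Real norms are social and contested; this table is an assumption.
NORMS = {
    "medical visit": {"treating doctor", "consulting doctor"},
    "social media profile": {"profile visitor"},
}


def is_appropriate(flow: Flow) -> bool:
    """A flow is appropriate if the recipient fits the original context's norms."""
    allowed = NORMS.get(flow.source_context, set())
    return flow.recipient in allowed


referral = Flow("diagnosis", "medical visit", "consulting doctor")
leak = Flow("diagnosis", "medical visit", "employer")
scrape = Flow("profile photo", "social media profile", "face recognition database")

for flow in (referral, leak, scrape):
    verdict = "appropriate" if is_appropriate(flow) else "violation"
    print(f"{flow.info} -> {flow.recipient}: {verdict}")
```

Notice that the check never asks whether the information is secret or public. It only asks whether the recipient belongs to the context where the information was first shared, which is the core of Nissenbaum's argument.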
Many privacy lawyers, regulators, and technologists now use contextual integrity to argue for new AI regulations. The EU's AI Act, passed in 2024, draws partly on this logic to restrict certain uses of biometric data. This is the kind of thinking happening at policy levels right now, and it started as an academic concept developed by one person at NYU.
There's a version of surveillance most people recognize: cameras in stores, police following someone, a government tapping a phone. And then there's the version that most people experience constantly without recognizing it as surveillance at all.
In 2018, an investigation by the Associated Press found that Google was recording users' location even when they had turned off "Location History" in their account settings. The data was being stored through a separate setting called "Web & App Activity," which was enabled by default. Google updated its disclosures after the story, but the underlying practice (collecting data through systems users don't know are running) had been operating for years. In 2022, Google agreed to pay $391.5 million to settle an investigation by 40 state attorneys general over the practice.
In 2019, a report by researchers at Oxford found that the average website contains 7 tracking technologies — third-party scripts that report your behavior back to advertising networks. If you visited 20 websites in a day, your behavior across those sites was likely reported to dozens of separate companies, each adding it to a profile that they sell to advertisers, data brokers, and in some cases, government agencies. None of these sites asked you specifically if this was acceptable. Most mentioned it somewhere in a privacy policy that nobody reads.
If you are 12 and have been online since age 7, this means there are likely profiles of your interests, behavioral patterns, emotional responses to content, and social connections held by companies you have never heard of, built over five-plus years, that you never consented to and cannot easily access or delete.
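The tracker arithmetic above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not measured data: the figures of 20 sites and 7 trackers per site come from the text, but the pool of 40 tracker companies and the way trackers are assigned to sites are made up purely for illustration.

```python
SITES_VISITED = 20        # from the example in the text
TRACKERS_PER_SITE = 7     # the average reported in the text
COMPANY_POOL = 40         # assumption: size of a hypothetical tracker market


def trackers_on_site(site_index: int) -> set[int]:
    """Hypothetical assignment: each site loads 7 trackers from the pool,
    overlapping partly with the trackers on neighbouring sites."""
    return {
        (site_index * 3 + offset) % COMPANY_POOL
        for offset in range(TRACKERS_PER_SITE)
    }


total_reports = 0
companies_seen: set[int] = set()

for site in range(SITES_VISITED):
    trackers = trackers_on_site(site)
    total_reports += len(trackers)   # every tracker on the page gets a report
    companies_seen |= trackers       # the same company may appear on many sites

print("Reports sent in one day:", total_reports)
print("Distinct companies that heard about you:", len(companies_seen))
```

The useful point is the difference between the two numbers. Because the same companies appear on many sites, each one sees your behavior across several pages, which is what lets it stitch separate visits into a single profile.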
Most free online services are free because they sell your data or your attention to advertisers. If you use Google Search, Gmail, YouTube, TikTok, or Instagram for free, the business model involves your data. Is this a fair trade? You get access to powerful tools; they get data about you. Some people argue this is a reasonable exchange — you can always pay for alternatives. Others argue that the exchange isn't transparent, that young users can't meaningfully consent, and that the power imbalance between a teenager and a trillion-dollar company makes "consent" meaningless. Where do you land?
Privacy law in the US is fragmented, inconsistent, and significantly behind the technology it's supposed to govern. Here's the honest picture.
The most relevant federal law for young users is the Children's Online Privacy Protection Act (COPPA), passed in 1998. It requires websites to obtain parental consent before collecting data from children under 13. This is why you must be 13 to sign up for most social platforms. It also means that if you were under 13 when you signed up — using a fake birthdate, as millions of children do — the platform may claim it had no legal obligation to protect your data, because you technically lied about your age.
Beyond COPPA, federal privacy protections in the US are sparse. The US does not have a comprehensive national privacy law — unlike the EU, which has the General Data Protection Regulation (GDPR), in force since 2018. The GDPR gives EU residents specific rights: to access their data, correct it, delete it, and object to certain uses. California has the California Consumer Privacy Act (CCPA), in force since 2020, which provides similar rights to California residents. Several other states have followed. But if you live in a state without specific privacy legislation, you have far fewer formal rights over your data than someone in Germany or France.
The EU's AI Act, passed by the European Parliament in March 2024, goes further — banning real-time facial recognition in public spaces for most purposes, prohibiting AI systems that exploit psychological vulnerabilities, and requiring transparency disclosures when AI is used to make consequential decisions about people. These regulations do not apply to US companies operating in the US — but they affect US companies operating in Europe, which is most of them.
The gap between US and EU privacy law is not abstract — it affects what companies can do with your data right now, depending on where you are. Knowing that this gap exists, that it's the subject of ongoing legislative debate, and that the rules are actively changing means you're reading every story about "AI and privacy" with a frame that most adults don't have. The EU AI Act is not just a European story — it is reshaping how every major tech company builds its products globally.
The most important thing to understand at the end of this course is that you are not a passive subject of these systems. You are a person making choices inside them — choices with real consequences, but also real possibilities.
The practical moves: use privacy-protecting browsers and search engines (Firefox with uBlock Origin, Brave, DuckDuckGo) where you have a choice. Review app permissions on your phone — many apps request access to your microphone, location, and contacts far beyond what their function requires. Regularly search your own name to know what's out there. Use two-factor authentication that doesn't rely solely on SMS. Know what COPPA and CCPA entitle you to. If you're in California, you can formally request that data brokers delete your data — and some advocacy groups run tools to help with that process automatically.
The bigger moves: the people writing the rules that govern how AI handles your data are, right now, regulators, lobbyists, technologists, and academics. Almost none of them are teenagers. The EU's approach to privacy was shaped significantly by Max Schrems, an Austrian who filed a complaint against Facebook in 2011 while he was still a law student in his early twenties. He did it because he was curious and persistent, not because he had special access. The legal fight that began with that complaint eventually led to the invalidation of the EU-US Privacy Shield agreement, one of the most significant privacy law outcomes of the decade. You are not too young to have an opinion about these rules, or to make noise about them.
You now understand something consequential: the mechanics of virality, the permanence of digital memory, the gap between your profile and your person, and the contested landscape of privacy law. That understanding doesn't give you control over every system. But it means you are navigating these systems with your eyes open — and that is not a small thing.
You have been hired to design the privacy policy for a new AI-powered app called "Pulse" — a journaling app that analyzes your entries and provides emotional pattern insights, suggesting when you might be stressed or anxious. Pulse is targeted at teenagers aged 13–17. It collects: journal text, emotional tone analysis results, usage timestamps, device location, and it shares anonymized aggregate data with university researchers.
Your analyst is a privacy rights advocate. They will push back on every policy choice you make. Your goal is not to make a perfect policy — it's to think through the real trade-offs and defend the choices you make with specific reasoning.