Lesson 1 · Module 4

What Is Data Privacy — and Why Did It Disappear?

From the newspaper morgue to the server farm: how scale shattered an old bargain.

When every tap, search, and pause is recorded forever, what exactly have we lost?

In 2013, Aleksandr Kogan, a Cambridge University psychology researcher, built a personality-quiz app called thisisyourdigitallife. About 270,000 Facebook users installed it and consented, in vague terms buried deep in a user agreement, to share their data for "academic research." What they did not read, and what Facebook's API freely permitted at the time, was that Kogan's app also harvested the profile data of those users' Facebook friends. The final haul: personal information on as many as 87 million people, the vast majority of whom had never clicked "agree" to anything.

Kogan sold that dataset to a political consultancy called Cambridge Analytica. By 2016 the firm was modeling voter psychology in the United States and United Kingdom, micro-targeting political ads at people whose only connection to the project was often a friend who had taken a personality quiz. The data never traveled with a warning label. Most of the 87 million people had no idea.

Privacy Before the Digital Age

For most of human history, privacy was protected by friction. Collecting information about someone required physical effort: knocking on doors, visiting courthouses, filing through paper records. This friction was not a designed feature — it was an accidental byproduct of analog media. It meant that surveillance was expensive, information was localized, and memory was imperfect. Most embarrassing or sensitive facts simply faded with time.

Legal scholars Samuel Warren and Louis Brandeis articulated "the right to be let alone" in an 1890 Harvard Law Review article — a response to the then-new technology of the camera and the gossipy penny press. Their argument: individuals had a natural interest in controlling what others knew about them. That idea underpinned privacy law for a century.

The digital era erased the friction. When Facebook launched in 2004, storing one user's profile cost fractions of a cent. Copying it cost nothing. Analyzing it across millions of people simultaneously became routine by 2010. The old bargain — you share some things publicly, the rest quietly disappears — collapsed entirely. Everything is now recorded, cross-referenced, and retained indefinitely.

The Anatomy of a Privacy Violation

Privacy scholars distinguish between different kinds of harm. Informational harm occurs when data about you reaches someone who was not supposed to have it. Decisional harm occurs when that information is used to make a consequential decision about you — a loan denial, a targeted ad designed to manipulate a vote, a medical insurance premium hike. Relational harm occurs when your relationships are damaged because private communications are exposed.

In the Cambridge Analytica case, all three operated simultaneously. Voter profiles were built from friend data (informational harm), used to craft psychologically targeted political messaging (decisional harm), and the revelations about the data collection eroded public trust in Facebook as a platform for genuine social connection (relational harm).

Why AI Makes This Worse

AI systems can now infer information you never shared. In 2000, Carnegie Mellon researcher Latanya Sweeney estimated that 87% of Americans could be uniquely identified using only their five-digit ZIP code, date of birth, and sex, three data points available on most voter rolls. Researchers have since reported models that infer sexual orientation from facial images, mental health status from social-media posting patterns, and political affiliation from purchase histories. You do not have to volunteer sensitive facts for them to be inferred.
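To make the re-identification claim concrete, here is a minimal Python sketch that measures how many records in a table are unique on the combination of ZIP code, birthdate, and sex. The records and the function name `unique_fraction` are invented for illustration; no real dataset is used, and the resulting percentage says nothing about the US population.

```python
from collections import Counter

# Hypothetical toy records: (zip_code, birthdate, sex). Invented for illustration.
RECORDS = [
    ("02139", "1985-03-14", "F"),
    ("02139", "1985-03-14", "F"),   # shares all three values with the row above
    ("02139", "1990-07-02", "M"),
    ("60614", "1985-03-14", "F"),
    ("60614", "1972-11-30", "M"),
]

def unique_fraction(records):
    """Return the share of records whose three-field combination appears exactly once."""
    if not records:
        return 0.0
    counts = Counter(records)
    unique = sum(1 for record in records if counts[record] == 1)
    return unique / len(records)

if __name__ == "__main__":
    share = unique_fraction(RECORDS)
    print(f"{share:.0%} of records are unique on ZIP + birthdate + sex")
```

A record that is unique on these three fields can be linked to a named person by anyone holding a second list, such as a voter roll, that carries the same fields alongside a name.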

Key Concepts

Contextual integrity: The principle (philosopher Helen Nissenbaum, 2004) that privacy is violated when information flows outside the social context in which it was shared. Medical data shared with a doctor should not flow to an employer; personal photos shared with friends should not flow to advertisers.

The aggregation problem: Individually harmless facts (name, employer, neighborhood, gym schedule) combine into a profile that enables stalking, harassment, or manipulation. AI excels at aggregation at scale.

Surveillance capitalism: Social psychologist Shoshana Zuboff's term (2019) for the economic logic under which human experience is converted into behavioral data, turned into predictions, and sold as a commodity. The product is not the app; it is your predicted future behavior.
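The aggregation problem defined above can be sketched in a few lines of Python: three sources, each harmless on its own, are merged on a shared name into one profile. The person, the sources, and every value below are hypothetical, and real-world linkage is usually messier than an exact name match.

```python
from collections import defaultdict

# Three hypothetical public sources, each keyed by a person's name.
EMPLOYER_PAGE = {"Alex Doe": {"employer": "Acme Corp"}}
PROPERTY_RECORDS = {"Alex Doe": {"neighborhood": "Riverside"}}
FITNESS_APP = {"Alex Doe": {"gym_schedule": "Mon/Wed 6am"}}

def aggregate(*sources):
    """Merge per-person facts from every source into one profile per name."""
    profiles = defaultdict(dict)
    for source in sources:
        for name, facts in source.items():
            profiles[name].update(facts)
    return dict(profiles)

if __name__ == "__main__":
    merged = aggregate(EMPLOYER_PAGE, PROPERTY_RECORDS, FITNESS_APP)
    for name, profile in merged.items():
        print(name, profile)
```

No single source says where and when to find this person; the merged profile does.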

The Core Tension

AI systems are more useful when they have more data. A medical AI trained on millions of patient records catches diagnoses a solo physician might miss. A navigation AI that knows where millions of drivers are right now routes you around accidents in real time. The ethical challenge of this module is not to reject data collection entirely — it is to ask: who benefits, who bears the risk, and who decided?

Quiz · Lesson 1

What Is Data Privacy?

Five questions · Select the best answer for each.

1. How many people's data did Cambridge Analytica ultimately obtain via Aleksandr Kogan's quiz app?

Correct. Kogan's app exploited Facebook's friend-data API, extending the harvest far beyond the 270,000 who directly consented, reaching roughly 87 million profiles.

Not quite. The 270,000 figure counts only direct installers. The API also collected their friends' data, multiplying the total to ~87 million people who never consented.

2. Helen Nissenbaum's concept of "contextual integrity" holds that privacy is violated when:

Correct. Nissenbaum argues that medical data shared with a doctor should not flow to employers, and social data shared with friends should not flow to political campaigns — even if users technically "consented."

Not quite. Contextual integrity is specifically about information flowing outside its original social context. Storage and profit are separate concerns.

3. Before the digital era, what primarily protected everyday privacy from mass surveillance?

Correct. Collecting information about people in the analog era required physical effort and expense. This friction — not law — was the main practical barrier to mass surveillance.

Analog privacy was protected mainly by practical friction, not by legal frameworks or technical encryption: gathering, copying, and searching paper records was time-consuming and expensive.

4. Shoshana Zuboff's term "surveillance capitalism" describes an economic model in which:

Correct. Zuboff's 2019 analysis argues that the core product of platforms like Google and Facebook is not a service but a prediction: your predicted future behavior, sold to advertisers and others.

Zuboff's concept is broader and more systemic. She argues the entire economic logic converts lived human experience — every search, click, and pause — into behavioral data that is then sold as predictions of future behavior.

5. The "aggregation problem" in data privacy refers to:

Correct. Your name, employer, neighborhood, and gym schedule are each innocuous alone. Combined by an AI, they can enable stalking, manipulation, or targeted discrimination.

The aggregation problem is about combination of individual facts. Harmless data points — name, location, schedule — become dangerous when merged into a detailed profile by AI systems.

Lab · Lesson 1

The Aggregation Detective

Explore how harmless data points combine into sensitive profiles.

Your Mission

You will play the role of a privacy investigator. The AI will give you a realistic scenario involving data collection. Your job is to identify which individually harmless facts, when combined, create a privacy risk — and explain why contextual integrity is violated.

Have at least three exchanges. Push the AI to give you harder cases.

Start by asking: "Give me a scenario where five pieces of public data combine to create a serious privacy problem."

Privacy Lab — Data Aggregation

AI Ethics · M4 L1

Welcome to the Aggregation Detective lab. I'll present privacy scenarios drawn from real AI-era cases, and you'll analyze how individually harmless data points combine into serious risks. Ready? Ask me for your first scenario — or describe one you've read about and I'll help you break it down.

Lesson 2 · Module 4

Consent: The Word That Lost Its Meaning

When clicking "I agree" became the world's least meaningful act.

Is consent real if the alternative is being locked out of modern life?

In January 2012, Facebook conducted a now-infamous experiment. Without notifying users, researchers manipulated the emotional content of 689,003 people's News Feeds — some saw more positive posts, some saw more negative ones — to test whether emotions spread through social networks. The results were published in the Proceedings of the National Academy of Sciences in 2014 under the title "Experimental evidence of massive-scale emotional contagion through social networks."

When the experiment became public, the backlash was fierce. Had users consented? Facebook pointed to a line in its 2012 Data Use Policy permitting data use for "internal operations, including troubleshooting, data analysis, testing, research and service improvement." Buried there, the company argued, was consent. Adam Kramer, the lead researcher, later wrote that he and his coauthors were sorry for the anxiety the paper had caused, and chief operating officer Sheryl Sandberg conceded that the experiment "was poorly communicated." No user had been asked. No user had the practical ability to say no.

The Terms-of-Service Fiction

In 2008, Carnegie Mellon researcher Lorrie Faith Cranor calculated that reading every privacy policy a typical American encounters in a year would take 76 work days — roughly 250 hours. That figure has only grown. The policies are written by lawyers, for lawyers, in language deliberately designed to maximize corporate latitude while minimizing user comprehension.

Informed consent — the ethical gold standard borrowed from medical research — requires three elements: the person must understand what they are agreeing to, the agreement must be voluntary, and they must have the capacity to consent. Platform consent violates all three. Users understand almost nothing about algorithmic profiling. Consent is not voluntary when the alternative is social and professional exclusion. And the sheer cognitive overload of modern data agreements tests the limits of practical capacity.

The GDPR Experiment

In May 2018, the European Union's General Data Protection Regulation took effect, requiring "freely given, specific, informed and unambiguous" consent for data processing. Companies responded with elaborate cookie-consent banners. Research published in 2020 by Midas Nouwens and colleagues found that only 11.8% of the consent pop-ups they examined on popular UK websites met the minimal requirements of European law, and dark patterns (interfaces designed to steer users toward "accept all") were ubiquitous. Regulation existed; meaningful consent still did not.

Dark Patterns and Manufactured Consent

UX researcher Harry Brignull coined the term "dark patterns" in 2010 to describe interface designs that trick users into actions they didn't intend. In the data-consent context, dark patterns are endemic:

Confirmshaming: "No thanks, I don't want better recommendations" forces users to feel foolish for declining.

Hidden opt-outs: The "Accept All" button is large and brightly colored; "Manage Preferences" requires navigating three sub-menus.

Moving goalposts: Facebook changed its privacy settings interface 12 times between 2005 and 2015, each redesign making it harder to restrict data sharing.

Forced bundling: You cannot use Google Maps without location tracking; you cannot use WhatsApp without agreeing to share your phone contacts with Facebook's systems.

These are not accidents. A/B testing has proven, repeatedly, that friction reduces opt-outs. The interfaces are engineered to maximize data collection by minimizing meaningful choice.

Models for Better Consent

Researchers and regulators have proposed several alternatives. Dynamic consent — developed in the biomedical research context — allows participants to update their consent preferences at any time via a persistent online interface, receiving real-time notifications when their data is accessed. Layered consent presents a simple one-paragraph summary up front, with detailed options available for users who want them. Opt-in by default — the GDPR's preferred approach — requires explicit affirmative action before data is collected, rather than requiring users to hunt for opt-out controls.

The challenge is that each of these models, if implemented honestly, would reduce data collection — and therefore revenue. The economic incentive structure consistently works against genuine consent.

Key Distinction

There is a difference between notice and consent. A privacy policy is notice — it tells you something is happening. Consent requires that you understand it, that you have a real choice, and that saying no is a viable option. Most platform "consent" is really just notice with a button to acknowledge it.

Key Terms

Informed consent: Agreement that is genuinely understood, voluntarily given, and based on sufficient information. This is the standard from medical ethics, rarely met in digital data collection.

Dark patterns: Interface designs that manipulate users into decisions they did not intend, typically by making the privacy-preserving choice difficult or embarrassing.

Opt-in vs. opt-out: Opt-out systems collect data by default and require users to act to stop it. Opt-in systems collect nothing until users affirmatively agree. Default settings determine most actual behavior, since the vast majority of users never change them.
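The opt-in default, together with the always-available withdrawal that dynamic consent calls for, can be sketched as a small Python class. `ConsentLedger` and the purpose name `ad_targeting` are hypothetical names chosen for this illustration, not part of any real platform or library.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentLedger:
    """Tracks the purposes a user has explicitly agreed to. Starts empty (opt-in)."""
    granted: set = field(default_factory=set)

    def grant(self, purpose: str) -> None:
        # Only an explicit user action adds a purpose.
        self.granted.add(purpose)

    def withdraw(self, purpose: str) -> None:
        # Withdrawal never fails, even if the purpose was never granted.
        self.granted.discard(purpose)

    def allows(self, purpose: str) -> bool:
        return purpose in self.granted

if __name__ == "__main__":
    ledger = ConsentLedger()
    print(ledger.allows("ad_targeting"))   # nothing is allowed until the user acts
    ledger.grant("ad_targeting")
    print(ledger.allows("ad_targeting"))
    ledger.withdraw("ad_targeting")
    print(ledger.allows("ad_targeting"))
```

An opt-out system would be the same class with `granted` pre-filled; the design choice is entirely in the default, which is why defaults decide most real behavior.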

Quiz · Lesson 2

Consent and Its Failures

Five questions · Select the best answer for each.

1. In Facebook's 2012 emotional-contagion experiment, what justification did the company use for not obtaining explicit user consent?

Correct. Facebook's legal team pointed to a broadly worded clause covering "internal operations, including… research." Critics argued that buried contractual language does not substitute for meaningful informed consent.

Facebook's defense was contractual — a clause in the Data Use Policy. No explicit individual consent was sought or obtained. The IRB issue was actually a subsequent controversy: the Cornell IRB had reviewed only their portion of the analysis, not the manipulation itself.

2. Carnegie Mellon researcher Lorrie Faith Cranor estimated that reading all the privacy policies a typical American encounters annually would require approximately:

Correct. Cranor's 2008 study found that reading just the privacy policies a typical user encounters — not understanding them, just reading — would consume approximately 76 full work days per year.

The correct figure is approximately 76 work days, or roughly 250 hours per year — just for reading, not for understanding or evaluating the terms.

3. The EU's GDPR requires that consent for data processing be all of the following EXCEPT:

Correct. The GDPR requires consent to be freely given, specific, informed, and unambiguous. It does not require notarization — that would make everyday digital consent practically impossible.

The GDPR's four consent requirements are: freely given, specific, informed, and unambiguous. Notarization is not among them — the correct answer identifies the requirement that does NOT appear in the regulation.

4. "Confirmshaming" is a dark pattern in which:

Correct. A classic confirmshaming example: "No thanks, I prefer paying full price" is the opt-out button for a discount offer. The phrasing makes declining feel irrational, steering users toward acceptance.

Confirmshaming works through button or link wording that makes the privacy-preserving option feel socially or personally embarrassing — "No thanks, I don't want to learn more" — without any external communication.

5. Which best describes the core difference between "notice" and "consent" in data privacy?

Correct. A privacy policy is notice — it discloses practices. Genuine consent requires comprehension, voluntariness, and a real ability to say no without significant consequence. Most platform "consent" is really just notice with a required button click.

The key distinction is substantive: notice is disclosure, while consent is meaningful agreement with real alternatives. They are not legally synonymous under the GDPR, which sets specific consent standards.

Lab · Lesson 2

Consent Form Autopsy

Identify dark patterns and design better consent mechanisms.

Your Mission

The AI will present you with excerpts from real-style platform terms and consent interfaces. Your job is to identify the dark patterns present, explain which of the three informed-consent requirements they violate, and then redesign the consent moment to be genuinely ethical.

Have at least three exchanges. Ask for progressively trickier examples.

Start by asking: "Show me a realistic consent banner with at least two dark patterns hidden in it."

Consent Design Lab

AI Ethics · M4 L2

Welcome to the Consent Form Autopsy lab. I'll present realistic consent scenarios and interface excerpts — based on patterns documented in real platforms — and you'll dissect the dark patterns and ethical failures. You can also propose redesigns and I'll evaluate them. What would you like to start with?

Lesson 3 · Module 4

Surveillance at Scale: From Clearview to Clearances

How AI transformed face recognition from a science-fiction prop into a global infrastructure of identification.

When AI can match your face to your name in 0.3 seconds, is anonymity in public still possible?

In late 2019, Hoan Ton-That, a 31-year-old Australian entrepreneur living in New York, quietly approached police departments with an offer: a facial recognition app that could search a database of three billion photographs scraped from Facebook, Instagram, Twitter, LinkedIn, Venmo, and millions of other websites — all without the permission of any of those platforms or the people pictured. Officers would take a photo of a suspect, upload it, and within seconds receive potential matches with links to the originating web pages.

The company was called Clearview AI. By January 2020, when a New York Times investigation by Kashmir Hill exposed the operation, over 600 law enforcement agencies, including the FBI and the Department of Homeland Security, had used the tool. Clearview had never asked for a single person's consent. It had never disclosed its database to those pictured. It had built the world's largest facial recognition system out of photographs people had shared in contexts they believed were social (tagged at a graduation, photographed at a rally, captured at a restaurant), not submitted to a government biometric registry.

The Unique Risk of Biometric Data

Facial recognition operates on biometric data — physical characteristics that cannot be changed. You can change your password. You can get a new credit card number. You cannot replace your face. This makes biometric privacy violations categorically more serious than most data breaches: the damage is permanent, the subject cannot mitigate it, and every future encounter with any face-recognition system using the same database reproduces the original violation.

Illinois recognized this in 2008 with the Biometric Information Privacy Act (BIPA), requiring companies to obtain written consent before collecting fingerprints, retina scans, or facial geometry. In 2021, a federal court approved Facebook's $650 million settlement of a BIPA class action over its Tag Suggestions feature, which had used facial recognition to identify users in uploaded photos without their consent. In 2022, Clearview settled a BIPA lawsuit brought by the ACLU, agreeing to restrict sales of its face database to most private companies, though law enforcement use continued elsewhere.

Error Rates and the Stakes of Being Wrong

In 2019, the National Institute of Standards and Technology (NIST) tested 189 facial recognition algorithms and found that the best systems were 99.5% accurate on high-quality images of cooperative subjects. But in real-world conditions (low-resolution surveillance footage, varied lighting, partial occlusion) error rates climbed dramatically. Critically, many systems showed significantly higher false-positive rates for darker-skinned faces, women, and elderly individuals. Robert Williams, a Black man in Detroit, was wrongly arrested in 2020 based on a faulty facial recognition match. He was detained for 30 hours before the case collapsed. His was the first documented US case of a wrongful arrest caused by facial recognition. It was not the last.

Corporate and State Surveillance Convergence

The Clearview model illustrates a structural shift: AI enables private companies to build surveillance infrastructure that previously only nation-states could maintain. China's system — combining mandatory national ID, ubiquitous cameras, and state-controlled AI — is often cited as the extreme case. But the components of comparable capacity now exist in the private sector in liberal democracies: Amazon Ring cameras create a privately-owned street-level surveillance network; Amazon Rekognition offers facial recognition APIs to any paying customer; Palantir aggregates law enforcement databases across jurisdictions.

The legal frameworks governing these systems lag dramatically. The US has no federal facial recognition law. The EU's AI Act (formally adopted in 2024) bans real-time remote biometric identification in public spaces by law enforcement — with exceptions for terrorism, missing children, and serious criminal investigation — but enforcement architecture is still being built.

The Chilling Effect

Privacy is not only about preventing individual harm. It is about preserving the social conditions for free thought, dissent, and democratic participation. When people know — or suspect — that their face is being scanned at a political rally, a union meeting, or a mosque, many choose not to attend. This chilling effect on lawful behavior is a documented consequence of mass surveillance, studied extensively in the context of NSA bulk data collection after Edward Snowden's 2013 disclosures. The facial recognition layer makes the chilling effect more immediate: unlike phone metadata, it applies to the physical world without any device interaction at all.

Design Question

San Francisco banned government use of facial recognition technology in 2019, followed by Boston, Portland, and other cities. The bans are not permanent scientific judgments — they are temporary governance tools, creating space for democratic deliberation about whether and how such systems should be used. Is a technology moratorium the right response when accuracy is inadequate and governance is absent? Or does it delay benefits — like finding missing persons — that could save lives?

Key Terms

Biometric data: Physical or behavioral characteristics that uniquely identify an individual (fingerprints, facial geometry, iris patterns, gait) and which cannot be changed if compromised.

False positive rate: The frequency with which a system incorrectly identifies an individual as matching a target. In facial recognition, a high false positive rate can result in wrongful arrest or other serious consequences.

Chilling effect: The deterrence of lawful behavior (political participation, religious observance, protest) caused by the knowledge or suspicion of surveillance.
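A short worked calculation shows why even a small false positive rate matters at database scale. The sketch below uses the three-billion-image figure from this lesson, but the per-comparison false match rate is a hypothetical number chosen for illustration, not a NIST result. It also assumes every comparison is independent, which is a simplification; deployed systems apply score thresholds and return ranked candidate lists.

```python
def expected_false_matches(gallery_size: int, false_match_rate: float) -> float:
    """Expected number of wrong candidates when one probe photo is compared
    against every image in the gallery, assuming independent comparisons."""
    return gallery_size * false_match_rate

if __name__ == "__main__":
    gallery = 3_000_000_000   # database size cited in the lesson
    rate = 1e-6               # hypothetical: one false match per million comparisons
    count = expected_false_matches(gallery, rate)
    print(f"{count:,.0f} expected false matches per search")
```

Under these assumptions a single search would surface about 3,000 innocent look-alikes, which is why a human reviewing "the match" is reviewing a shortlist of strangers.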

Quiz · Lesson 3

Surveillance at Scale

Five questions · Select the best answer for each.

1. Clearview AI built its facial recognition database primarily by:

Correct. Clearview AI scraped approximately three billion photographs from Facebook, Instagram, Twitter, Venmo, and millions of other sites — without the consent of those platforms or the individuals pictured.

Clearview scraped public web content without licensing or consent agreements. The company did not purchase, partner with DMVs, or use donated footage to build its initial database.

2. Illinois's Biometric Information Privacy Act (BIPA), passed in 2008, requires companies to:

Correct. BIPA's core requirement is prior written consent before biometric data collection. This law enabled major lawsuits against Clearview AI and Facebook's Tag Suggestions feature.

BIPA's primary mechanism is a consent requirement — written permission before collection. It does not mandate annual registration, 30-day deletion, or restriction to law enforcement use.

3. What made Robert Williams's wrongful arrest in Detroit in 2020 historically significant?

Correct. Robert Williams was detained for 30 hours before the facial recognition error was identified. His case is historically documented as the first known US instance of wrongful arrest due to a false positive from facial recognition AI.

The Williams case is significant as the first documented wrongful arrest from a facial recognition false positive in the US — not for subsequent legislation or damages awarded.

4. Why is biometric data considered categorically more sensitive than most other types of personal data?

Correct. The irreversibility is the key ethical and practical problem. A password breach allows you to reset. A facial geometry breach is permanent — every future use of any database containing that data reproduces the original violation.

The critical distinction is permanence. Unlike passwords or account numbers, you cannot replace your face, fingerprints, or iris pattern. A biometric breach is therefore irreversible in a way that other data breaches are not.

5. The "chilling effect" of surveillance on civil liberties refers to:

Correct. Research following the Snowden NSA revelations confirmed measurable drops in web searches for sensitive topics and participation in some civil activities. Facial recognition in physical spaces produces the same deterrent effect without requiring any device interaction.

The chilling effect is a civil-liberties concept: when people believe they are being watched, they self-censor lawful behavior. Facial recognition extends this effect into physical public space, since no phone or device is required for identification.

Lab · Lesson 3

Surveillance System Audit

Evaluate facial recognition deployments against ethical and legal frameworks.

Your Mission

The AI will describe a realistic facial recognition deployment scenario — a city transit system, a stadium security operation, a retail loss-prevention program. Your job is to evaluate it: What are the accuracy risks? Who bears the harm if the system errs? Is consent possible? Does it produce a chilling effect? What governance rules should apply?

Have at least three exchanges. Request a new scenario once you've thoroughly analyzed the first.

Start by asking: "Describe a realistic facial recognition deployment in public transit and help me audit its ethical risks."

Surveillance Audit Lab

AI Ethics · M4 L3

Welcome to the Surveillance System Audit lab. I'll present realistic facial recognition scenarios based on actual deployments — from transit systems to retail environments to law enforcement — and help you build an ethical evaluation framework. I'll play devil's advocate when useful to sharpen your analysis. Ready to audit a system?

Lesson 4 · Module 4

Rights, Remedies, and Responsible Design

From legal frameworks to privacy-by-design: what protection actually looks like when it works.

If privacy is a right, who is responsible for protecting it — and what tools do they have?

On May 25, 2018, the EU's General Data Protection Regulation took effect. Within hours, Max Schrems — an Austrian lawyer who had spent years fighting Facebook's data practices through European courts — filed four complaints against Google, Facebook, Instagram, and WhatsApp. The total claimed damages: 3.9 billion euros. His argument: that the platforms' "forced consent" — accept our terms or don't use the service — was not legally valid consent under GDPR's requirements. The cases would drag through regulatory bodies for years, but they established a new reality: data protection law had teeth, and someone was willing to use them.

By 2022, the Irish Data Protection Commission — the lead regulator for most major US tech companies whose European headquarters sit in Dublin — had issued fines totaling over €900 million against Meta alone. In 2023, the DPC fined Meta a record €1.2 billion for transferring European users' data to US servers in violation of GDPR rules on international data transfers. The era of consequence-free data exploitation was, at least in Europe, ending.

The Regulatory Landscape

Privacy law today operates at three levels. At the international level, the EU's GDPR is the most comprehensive, establishing data subjects' rights — access, erasure ("right to be forgotten"), portability, and objection to automated decision-making. The Council of Europe's Convention 108+ extends similar principles to non-EU members. The OECD Privacy Guidelines, while non-binding, influence national frameworks globally.

At the national level, the US remains fragmented: sector-specific laws (HIPAA for health data, COPPA for children's data, FERPA for education records) coexist with the California Consumer Privacy Act (CCPA, 2020) and its successor the California Privacy Rights Act (CPRA, 2023), which extended GDPR-style rights to California residents. As of 2024, 19 US states have passed comprehensive consumer privacy laws, though no federal framework exists.

At the organizational level, corporate privacy programs vary enormously — from genuinely privacy-conscious firms that build data minimization into their architectures, to those that implement the minimum required to avoid liability while maximizing data extraction.

The Right to Be Forgotten

In 2014, the Court of Justice of the European Union ruled in Google Spain v. AEPD that individuals have the right to request removal of search results linking to personal information that is "inadequate, irrelevant, or no longer relevant." This "right to erasure" is now codified in GDPR Article 17. By 2023, Google had received over 5 million erasure requests and complied with approximately 46% of them. Critics argue the right is under-enforced; others argue it enables suppression of legitimate journalism. The tension between privacy and public interest in accurate information is unresolved.

Privacy by Design

Canadian privacy regulator Ann Cavoukian developed the Privacy by Design (PbD) framework in the 1990s, and its core idea is now codified in GDPR Article 25. PbD's seven principles require that privacy protection be built into systems from the start: not bolted on after launch, not traded against other design goals, but embedded as the default condition. In practice this means data minimization (collect only what is necessary), purpose limitation (use data only for its stated purpose), default privacy (the most private setting is the automatic one), and full functionality (privacy should not degrade service).

DuckDuckGo's search engine is a functional example: it provides comparable search results to Google without storing user IP addresses, search histories, or building behavioral profiles. Apple's App Tracking Transparency framework, introduced in iOS 14.5 in 2021, requires apps to obtain explicit permission before cross-app tracking — a change that cost Meta an estimated $10 billion in 2022 revenue by eliminating much of its behavioral targeting capability. Privacy by design, implemented at infrastructure scale, produces real economic consequences for surveillance capitalism.

What Individuals Can Actually Do

Individual privacy protection operates at the technical, legal, and social levels. Technically: privacy-respecting browsers (Firefox, Brave), tracker-blocking extensions (uBlock Origin), end-to-end encrypted messaging (Signal), and VPNs reduce — though do not eliminate — passive data collection. Legally: GDPR and CCPA give residents of covered jurisdictions rights to access, delete, and opt out of sale of their data — rights exercisable through companies' formal request mechanisms. Socially: supporting organizations like the Electronic Frontier Foundation, the Privacy Rights Clearinghouse, or the ACLU's Privacy Project contributes to advocacy for systemic change.

The honest caveat: individual protective action is necessary but insufficient. A single user blocking trackers while everyone around them remains unprotected does not fix the system. Privacy is a collective infrastructure problem, not a personal hygiene problem. The companies that profit from data collection have teams of engineers, lawyers, and lobbyists working to maintain the status quo. Individual choices matter, but structural reform matters more.

The Accountability Gap

The core unsolved problem: the entities that collect and profit from personal data are not the entities that bear the consequences of privacy violations. A data broker sells your location history to an abusive ex-partner's private investigator and faces no direct liability. A facial recognition company provides a false match that leads to wrongful arrest and may face only civil claims under Illinois's Biometric Information Privacy Act (BIPA). Until the economic and legal incentive structure changes, so that the costs of privacy violations land on those who caused them, technological capability will continue to outrun ethical constraint.

Key Terms

Right to erasure: The GDPR right (Article 17) allowing individuals to request deletion of personal data held by a company, subject to exemptions for public interest, legal obligations, and freedom of expression.

Data minimization: A privacy-by-design principle requiring that systems collect only the minimum personal data necessary to fulfill their stated purpose.

Accountability gap: The structural misalignment in which those who profit from data collection do not bear the direct costs of the harms that collection enables.

Quiz · Lesson 4

Rights, Remedies, and Responsible Design

Five questions · Select the best answer for each.

1. In 2023, the Irish Data Protection Commission fined Meta a record amount for what violation?

Correct. The €1.2 billion fine — the largest in GDPR history at that time — concerned Meta's transfer of EU personal data to US servers under a legal mechanism (Standard Contractual Clauses) the DPC ruled was inadequate to protect European data subjects' rights.

The record 2023 DPC fine was specifically about international data transfers — moving EU user data to the US under legal frameworks the regulator found inadequate under GDPR's transfer rules.

2. Ann Cavoukian's Privacy by Design framework requires that privacy be:

Correct. PbD's core principle is proactive embedding: privacy protection is built into the system architecture from the design phase, not bolted on later, and the default setting is always the most private available option.

Privacy by Design rejects the retrofitting approach. Its principles require privacy to be embedded from the beginning — as an architectural default — not managed reactively after deployment.

3. Apple's App Tracking Transparency (ATT) framework, introduced in iOS 14.5, was estimated to cost Meta approximately how much in 2022 revenue?

Correct. Meta told investors in February 2022 that ATT would cost it approximately $10 billion in revenue that year. The framework cut off the cross-app behavioral tracking that powered Meta's targeted advertising for iOS users who declined tracking.

Meta publicly estimated the ATT impact at approximately $10 billion in 2022 — a concrete demonstration that privacy-by-design decisions at infrastructure scale produce real economic consequences for surveillance-based business models.

4. The EU's "right to be forgotten" (GDPR Article 17) allows individuals to request:

Correct. The right to erasure applies to search result links to personal information meeting specific criteria — it does not require complete deletion of all online content or compensation for past collection, and it is subject to public-interest and journalistic exemptions.

The right is more specific and limited: it applies to search results linking to personal information that meets the inadequacy or irrelevance criteria, with significant exemptions for public interest and journalism. It does not guarantee complete removal from the web.

5. The "accountability gap" in data privacy refers to the structural problem that:

Correct. The accountability gap is the central incentive problem: a data broker, advertiser, or AI company that enables a privacy harm typically faces minimal direct liability, while individuals bear the full personal cost of the violation.

The accountability gap specifically describes the misaligned incentive structure — those who profit from collection are not those who pay when collection causes harm. Until this alignment problem is fixed, technological capability will outpace ethical constraint.

Lab · Lesson 4

Privacy by Design Architect

Redesign a real product to embed privacy from the ground up.

Your Mission

You are a privacy architect advising a product team. The AI will describe a product or service that currently has significant privacy problems — drawn from real documented cases. You will apply Privacy by Design principles to redesign it: proposing data minimization measures, default settings, consent mechanisms, and accountability structures.

Have at least three exchanges. Push for concrete, implementable design decisions, not abstract principles.

Start by asking: "Describe a real product with documented privacy problems that I can redesign using Privacy by Design principles."

Privacy Architecture Lab

AI Ethics · M4 L4

Welcome to the Privacy by Design Architecture lab. I'll present real products with documented privacy failures — based on cases from Google, Amazon, fitness tracking companies, and others — and you'll serve as the privacy architect proposing concrete redesigns. I'll push back on vague proposals and ask you to get specific about trade-offs. Ready to start building?

Module Test · Module 4

Privacy, Consent, and Your Data

15 questions · Score 80% or higher to pass the module.

1. Kogan's Facebook quiz app in 2013 collected data from ~87 million people even though only 270,000 installed it. This occurred because:

Correct. Facebook's Graph API v1 permitted third-party apps to collect profile data of app installers' friends by default — a policy Facebook changed after the Cambridge Analytica scandal became public.

The mechanism was Facebook's own API design: apps could harvest friend data as a standard feature, not through hacking or explicit commercial agreements. This policy was later changed.

2. Helen Nissenbaum's "contextual integrity" principle would be violated when:

Correct. Medical data shared in a clinical context with an expectation of doctor-patient confidentiality flowing to an employment context is the paradigmatic contextual integrity violation — information used outside the social norms of its original sharing.

Contextual integrity is specifically about information flowing outside the social norms and expectations of the original sharing context. Medical data to an employer crosses that contextual boundary clearly.

3. What did Shoshana Zuboff mean by "behavioral surplus"?

Correct. Zuboff's concept: companies collect far more behavioral data than they need to improve their services. This "surplus" — the excess — is the raw material of surveillance capitalism, processed into predictions sold to advertisers, insurers, and political campaigns.

Behavioral surplus is Zuboff's term for the excess behavioral data — beyond what improves the service — that companies harvest and sell as predictions of future behavior. It is the economic engine of surveillance capitalism.

4. Facebook's 2012 emotional-contagion experiment was controversial primarily because:

Correct. The manipulation of nearly 700,000 users' emotional environments — without notification, let alone genuine informed consent — raised fundamental questions about the ethics of platform experimentation and the limits of terms-of-service consent clauses.

The core controversy was the non-consensual manipulation of users' emotional feeds. Facebook's contractual defense — a vague clause covering "internal research" — did not satisfy critics who applied medical research informed-consent standards.

5. A 2019 UCL study of major UK websites found that, following GDPR implementation, only what percentage presented consent options meeting the regulation's own standards?

Correct. Midas Nouwens and colleagues at UCL found that despite GDPR being in force, only 11.8% of major UK websites presented cookie consent in a way that technically complied with the regulation's own standards — with dark patterns ubiquitous in the remainder.

The UCL study found 11.8% compliance — a striking demonstration that regulation on paper does not automatically produce meaningful consent in practice, especially when dark patterns are economically incentivized.

6. "Forced bundling" as a dark pattern in consent design means:

Correct. Using Google Maps requires accepting location tracking; using WhatsApp required accepting contact-sharing with Facebook. These are forced bundles: the core service is unavailable unless you accept data collection beyond what the service strictly requires.

Forced bundling is the practice of making a desired service conditional on accepting ancillary data collection — you cannot use the app unless you accept the tracking, even if the tracking is not necessary for the core function.

7. Clearview AI's database of ~3 billion photographs was originally exposed to the public by:

Correct. Kashmir Hill's January 2020 New York Times investigation revealed Clearview's operations, database scale, and law enforcement client list, triggering regulatory scrutiny, lawsuits, and global coverage of the company's practices.

Kashmir Hill's reporting in the New York Times in January 2020 brought Clearview's operations to public attention. The subsequent regulatory and legal consequences followed from that journalistic investigation.

8. The NIST facial recognition algorithm testing found that real-world error rates were significantly higher for which demographic groups?

Correct. NIST's 2019 testing of 189 algorithms found significantly higher false positive rates for darker-skinned faces, women, and elderly individuals — disparities that translate into higher rates of wrongful identification in real-world law enforcement use.

NIST testing documented higher false positive rates for darker-skinned individuals, women, and elderly people — not uniform performance across demographics. These disparities have direct consequences for who is most at risk of wrongful arrest.

9. Illinois's BIPA law enabled major lawsuits against which two companies for facial recognition practices?

Correct. Clearview AI settled BIPA claims by accepting restrictions on its commercial sales in Illinois, and Meta paid $650 million to settle a BIPA class action over its Tag Suggestions facial recognition feature. Both suits were enabled by the 2008 Illinois biometric privacy law.

Illinois BIPA was most prominently deployed against Clearview AI (settled, restricting commercial sales in Illinois) and Facebook/Meta ($650 million settlement over Tag Suggestions facial recognition).

10. Research on the "chilling effect" of surveillance, conducted after Edward Snowden's 2013 NSA disclosures, found:

Correct. Research published following the Snowden disclosures documented measurable self-censorship in online search behavior — people avoiding searches for topics they feared might attract government scrutiny, even when their activity was entirely lawful.

Research documented real behavioral changes: reduced searches for sensitive keywords and shifts in civic participation. The chilling effect is empirically documented, not hypothetical. Facial recognition extends this into physical space.

11. Max Schrems's GDPR complaints filed on May 25, 2018, argued that platform consent was invalid because:

Correct. Schrems's argument was that take-it-or-leave-it terms are not "freely given" consent under GDPR, because the power imbalance between a dominant platform and a user makes refusal practically impossible for most people.

Schrems challenged the validity of "forced" platform consent under GDPR's requirement that consent be "freely given." When the alternative to consent is exclusion from a dominant social or professional platform, freedom of choice is illusory.

12. GDPR Article 25, which codifies Privacy by Design, requires that the most privacy-protective option be:

Correct. "Privacy by default" means the most protective setting is the one users receive automatically without taking any action. This is the opposite of most current platform design, where maximum data collection is the default.

GDPR Article 25's "data protection by default" requires that systems automatically apply the most privacy-protective settings — not make them available, not document them, but implement them as the starting state for all users.

13. Apple's App Tracking Transparency change specifically required apps to:

Correct. ATT requires apps to ask users with a standardized prompt before tracking them across other apps and websites using Apple's IDFA identifier. When most users declined, it eliminated the behavioral tracking that powered Meta's targeted advertising for iOS.

ATT specifically addressed cross-app and cross-website tracking — requiring explicit permission (via a standardized system prompt) before apps could use Apple's IDFA to track users across other services.

14. The "data minimization" principle in Privacy by Design means a system should:

Correct. Data minimization requires that systems be designed to collect no more than what is necessary for the specific, declared purpose — the opposite of the "collect everything and figure out uses later" approach that characterizes surveillance capitalism.

Data minimization is a design constraint: collect only what is necessary for the stated purpose, nothing more. It is not about compression, user-visible profiles, or mandatory anonymization.

15. Which statement best captures the "accountability gap" as a structural problem in data privacy?

Correct. The accountability gap is an incentive misalignment: a data broker who sells location data that enables stalking faces minimal direct liability; the stalking victim bears all the harm. Until costs land on those who cause them, economic incentives favor continued data extraction over privacy protection.

The accountability gap is specifically about incentive misalignment: profits from data collection go to companies; harms from that collection fall on individuals. This structural asymmetry is why voluntary self-regulation has generally failed to produce meaningful privacy protection.