Session 1 of 8

How AI Uses Personal Data

Data collection pipelines, training data, inference logging, and what AI systems actually know about you
~60 minutes · Instructor-presented

Learning Objectives

  • Explain the difference between training data collection and real-time inference data collection
  • Identify the major categories of personal data consumed by modern AI systems
  • Describe how data flows from end-users through AI pipelines and where it is retained
  • Articulate why "public" data does not mean "safe to train on" from a privacy standpoint

Session Overview

This opening session sets the conceptual foundation for the entire course. Before students can protect their privacy, they need an accurate mental model of what AI systems actually do with data — and that picture is considerably more complex than most people assume. The goal is to replace vague anxiety with specific, accurate understanding.

Start by drawing a clear distinction between two separate data lifecycles: the massive, one-time ingestion of training data, and the ongoing, often invisible collection of inference-time data every time a user interacts with a deployed model. Both matter for privacy, but for different reasons and with different mitigations. Ground the discussion in systems students already use — search engines, voice assistants, recommendation feeds — before moving to more specialized AI applications.

Key Teaching Points

  • Training data is the foundation. Large language models and other AI systems are trained on billions of data points scraped from the open web, licensed datasets, and sometimes user-contributed content. Deleting the source data afterward does not undo its influence: what the model learned is encoded in its weights, and removing one person's contribution generally means retraining. In practice, individuals whose data was included have essentially no way to "opt out" after the fact.
  • Inference logging is the ongoing risk. Every prompt you type, every image you upload, every voice command you give may be logged, reviewed by human annotators, and fed back into future training rounds. Most users do not read the data retention clauses in terms of service and are unaware this is happening.
  • Metadata is as revealing as content. Even if an AI provider doesn't store the exact text of your queries, metadata — timing, session length, device fingerprint, query frequency — can reveal sensitive information about health conditions, relationships, financial stress, and political beliefs.
  • Third-party data brokers feed AI training pipelines. AI companies often purchase data from brokers who have aggregated consumer profiles across hundreds of sources. This means your AI assistant may already "know" things about you that you never directly provided to it.
  • Federated learning doesn't fully solve the problem. Techniques like federated learning keep raw data on-device but still transmit model gradients to a central server — gradients that can sometimes be used to reconstruct portions of the original training data through gradient inversion attacks.
  • "Public" does not mean "consented." Scraping publicly available social media posts, forum comments, or photos for AI training exploits a legal gray zone. Users who posted that content had a reasonable expectation it would be read by humans, not used to train commercial AI systems at scale.
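The federated learning point above can be made concrete with a toy case. The following Python sketch assumes a one-layer linear model with a bias term and a client that sends the gradient for a single private example; the function names and feature values are hypothetical. Under those assumptions the server can recover the private input by dividing the weight gradient by the bias gradient. Published gradient inversion attacks on deep networks are iterative and only approximate, so treat this as an illustration of why gradients leak, not as a real attack.

```python
def client_gradient(weights, bias, features, label):
    """Gradient of 0.5 * (prediction - label)**2 for one private example."""
    prediction = sum(w * x for w, x in zip(weights, features)) + bias
    error = prediction - label
    grad_weights = [error * x for x in features]  # dL/dw_i = error * x_i
    grad_bias = error                             # dL/db   = error
    return grad_weights, grad_bias


def reconstruct_features(grad_weights, grad_bias):
    """Server-side recovery: divide each weight gradient by the bias gradient."""
    if grad_bias == 0:
        raise ValueError("zero error: this gradient carries no signal")
    return [g / grad_bias for g in grad_weights]


# Hypothetical private record that never leaves the device
private_features = [37.0, 1.0, 0.0, 182.5]

# The client sends only these gradients to the server
grad_w, grad_b = client_gradient([0.1, -0.2, 0.3, 0.05], 0.5, private_features, 1.0)

# The server recovers the raw features from the gradients alone
recovered = reconstruct_features(grad_w, grad_b)
print([round(value, 6) for value in recovered])
```

Averaging gradients over many examples, or adding noise before upload, makes this direct recovery harder, which is why deployed systems typically combine federated learning with other protections.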

Discussion Prompts

  • Think about an AI product you use regularly. What data do you think it collects about you during each interaction? What data might it have collected about you before you ever signed up?
  • If a company scrapes your publicly available LinkedIn posts to train a hiring AI, is that a privacy violation? Does your answer change if the AI is used to screen you out of jobs?
  • How would you explain "inference logging" to a family member who uses a voice assistant every day? What would you want them to know?
  • Should AI companies be required to disclose exactly which data sources were used to train their models? What would the practical challenges be?

Instructor Notes

Open with a show-of-hands poll: "Who has read the full privacy policy of an AI product they use?" This reliably generates laughter and sets a candid tone. Have a printed excerpt from a major AI provider's data retention clause ready to read aloud — the contrast between casual use and what users actually agreed to is consistently surprising even to technically aware audiences. Avoid spending too long on federated learning; surface it briefly so students know it exists, then move on — deep technical treatment belongs in later sessions.

Timing Guide

0–10 min: Opening poll & framing
10–30 min: Training vs. inference data
30–48 min: Key teaching points & metadata
48–60 min: Discussion & Q&A

Session 2 of 8

Privacy Risks and Threats

Re-identification, model inversion, data leakage, and the risks of AI-powered surveillance
~60 minutes · Instructor-presented

Learning Objectives

  • Define and distinguish the major AI-specific privacy attack types: re-identification, model inversion, membership inference, and data poisoning
  • Explain how anonymized datasets can be de-anonymized using AI techniques
  • Describe how AI-powered surveillance technologies amplify traditional privacy threats
  • Assess the real-world likelihood and impact of these threats for ordinary people

Session Overview

Session 2 moves from how AI collects data to the concrete harms that can result when that data is misused, breached, or actively exploited. Students often arrive with vague fears about "AI privacy" — this session gives those fears precise names and teaches students to distinguish high-probability everyday risks from lower-probability but high-impact attack scenarios.

Anchor each threat type in a documented real-world case. The Netflix Prize de-anonymization (Narayanan and Shmatikov, 2008), the AOL search query re-identification (2006), and facial recognition misidentification cases in law enforcement all make abstract concepts visceral. This session should leave students with a concrete threat model they can apply to their own situation.

Key Teaching Points

  • Anonymization is much weaker than it sounds. Research consistently shows that datasets stripped of obvious identifiers (name, email, SSN) can often be re-identified from a handful of remaining attributes. In one widely cited study of mobile phone location data, four location-and-time points were enough to single out about 95% of individuals. What a company calls "anonymized data" may be readily linkable to real identities by anyone with access to a second dataset.
  • Model inversion attacks reconstruct private training data. By querying a trained model many times with carefully crafted inputs, attackers can sometimes recover approximations of the private data the model was trained on. This is especially concerning for medical AI trained on patient records or facial recognition systems trained on private photo sets.
  • Membership inference tells attackers whether you were in the training set. An adversary can query a model to determine with statistical confidence whether a specific individual's data was used during training — which can itself be sensitive information (e.g., confirming someone was a patient at a specific hospital).
  • AI-powered surveillance dramatically lowers the cost of mass monitoring. Facial recognition, gait analysis, voice identification, and behavioral biometrics allow state and corporate actors to monitor populations at scale with far fewer human operators than traditional surveillance required. The chilling effects on free speech and assembly are well-documented.
  • Data leakage through model outputs is an underappreciated risk. Large language models trained on proprietary or sensitive data can "memorize" and regurgitate verbatim passages from their training set when prompted correctly — a problem that has already caused real-world exposure of PII and trade secrets.
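The re-identification point above can be sketched as a simple linkage attack. This Python example is a simplified illustration in the spirit of the Netflix Prize case, not the published method (which used approximate, weighted matching on ratings and dates). All identifiers, film titles, and dates are made up, and the exact-match scoring rule is an assumption chosen for brevity.

```python
# "Anonymized" release: names removed, but each record keeps (item, date) pairs
anonymized = {
    "user_4821": {("Film A", "2005-03-01"), ("Film B", "2005-03-04"), ("Film C", "2005-04-10")},
    "user_1907": {("Film A", "2005-03-02"), ("Film D", "2005-05-21"), ("Film E", "2005-06-02")},
    "user_7733": {("Film B", "2005-03-04"), ("Film F", "2005-07-15")},
}

# Second dataset: a few reviews someone posted publicly under their own name
public_reviews = {
    "alice_example": {("Film A", "2005-03-01"), ("Film C", "2005-04-10")},
}


def best_match(known_points, anonymized_records):
    """Return the pseudonymous ID that uniquely shares the most points, else None."""
    scores = {pid: len(known_points & points) for pid, points in anonymized_records.items()}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top_id, top_score), (_, runner_up_score) = ranked[0], ranked[1]
    # Only claim a match when one record clearly stands out from the rest
    return top_id if top_score > runner_up_score else None


for name, points in public_reviews.items():
    print(name, "->", best_match(points, anonymized))
```

Two public data points are enough here because the combination of item and date is rare. A single common point, such as one popular film on one date, would match several records and the function would decline to guess.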

Discussion Prompts

  • The city you live in deploys AI-powered cameras that can identify individuals by gait (walking pattern) without ever capturing a clear face. Is this a privacy violation? Should it require a warrant?
  • A hospital trains an AI on patient records and then licenses the model to a pharmaceutical company. If a model inversion attack recovers a patient's diagnosis, who is liable — the hospital or the pharma company?
  • You learn that your personal data was included in a training set without your consent. The company says the data was "anonymized." What recourse do you have, and what would you do?
  • Which of the threats covered today do you think is the biggest risk for ordinary people in the next five years, and why?

Instructor Notes

The Netflix Prize re-identification case is your anchor. Walk through it step by step because it's intuitive, well-documented, and genuinely surprising to most students. If you have a technically mixed audience, be careful not to spend too long on the mathematical mechanics of membership inference; instead, focus on what it means practically: knowing someone was in a dataset can be harmful even if you can't extract their record. Protect the discussion block at the end (12 minutes in the timing guide); threat model conversations are where students connect this material to their own lives.

Timing Guide

0–8 min: Recap session 1 & orient
8–30 min: Attack types with case studies
30–48 min: AI surveillance landscape
48–60 min: Discussion & threat modeling

Session 3 of 8

Protecting Your Digital Footprint

Practical tools and strategies to limit AI data collection and protect your online privacy
~60 minutes · Instructor-presented

Learning Objectives

  • Apply a layered defense approach to reducing personal data exposure across AI-powered services
  • Evaluate the privacy trade-offs of common tools including VPNs, browser extensions, and private search engines
  • Use platform opt-out controls and data deletion requests effectively
  • Create a realistic, sustainable personal privacy practice rather than an all-or-nothing approach

Session Overview

After two sessions focused on threats, students are typically ready for agency — they want to know what they can actually do. This session delivers practical, actionable guidance while being honest about the limits of individual action. The framing matters: privacy protection is not about achieving perfect anonymity (which is both impossible and often counterproductive), but about raising the cost and difficulty of surveillance to a level that makes you a lower-value target than the next person.

Structure the session around three levels of effort: low-friction changes anyone can make today, medium-effort tools for users who want stronger protection, and high-effort configurations for those with specific high-stakes needs. This tiered approach respects the audience's time constraints and avoids the common failure mode where perfect becomes the enemy of good.

Key Teaching Points

  • Browser isolation is your first line of defense. Switching to a privacy-focused browser (Firefox with uBlock Origin, Brave, or LibreWolf), enabling tracking protection, and compartmentalizing browsing activity across separate browser profiles prevents the cross-site tracking that feeds the largest advertising AI systems.
  • Search engines are surveillance engines in disguise. Queries to the major search engines are typically logged, tied to an account or device, and used to build advertising profiles. Privacy-focused alternatives (DuckDuckGo, Kagi, Brave Search, self-hosted SearXNG) aim to deliver useful results without building a permanent query history linked to your identity.
  • Opt-out controls are real but require effort to find. Major AI providers (OpenAI, Google, Meta, Apple) offer some combination of data controls, such as training opt-outs, deletion requests, and download exports, though the options vary by product and region and are often buried several menus deep. Walk students through the actual UI steps for at least one major platform.
  • Alias email addresses break identity graphs. Using unique alias addresses for each service (Apple Hide My Email, SimpleLogin, AnonAddy) prevents data brokers from correlating your activity across platforms using email as the join key — one of the most powerful de-anonymization techniques in practice.
  • VPNs protect transport, not content. A VPN hides your traffic from your ISP and changes your apparent IP address, but the VPN provider itself sees your traffic, and your behavior on platforms remains fully visible to those platforms. VPNs are a useful tool but widely misunderstood as providing more protection than they deliver.
  • Data broker opt-outs reduce but don't eliminate exposure. Services like DeleteMe, Privacy Bee, and manual opt-outs to individual brokers (there are hundreds) can significantly reduce your profile's availability. The process requires periodic repetition since data re-aggregates over time.
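The alias email point above is easier to see with a toy join. This Python sketch assumes a broker that merges records from several services using a normalized email address as the only join key; the service names, addresses, and `link_profiles` helper are hypothetical. Real brokers also join on phone numbers, device identifiers, and postal addresses, so aliases remove one key rather than all of them.

```python
def link_profiles(*service_exports):
    """Group records from several services by normalized email address."""
    profiles = {}
    for export in service_exports:
        for record in export:
            join_key = record["email"].strip().lower()  # trivial normalization
            profiles.setdefault(join_key, set()).add(record["service"])
    return profiles


# Same person, same address everywhere (differences in case or spacing do not help)
shared_email = [
    [{"service": "pharmacy", "email": "Sam@Example.com"}],
    [{"service": "dating_app", "email": "sam@example.com"}],
    [{"service": "job_board", "email": "sam@example.com "}],
]

# Same person, one hypothetical alias per service
aliased_email = [
    [{"service": "pharmacy", "email": "pharm.x7q@alias.example"}],
    [{"service": "dating_app", "email": "date.k2m@alias.example"}],
    [{"service": "job_board", "email": "jobs.p9z@alias.example"}],
]

linked = link_profiles(*shared_email)
unlinked = link_profiles(*aliased_email)

print(len(linked), "profile(s) when one address is reused")
print(len(unlinked), "profile(s) when each service gets its own alias")
```

With a reused address the three services collapse into a single profile; with aliases the broker sees three unrelated people unless another identifier ties them together.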

Discussion Prompts

  • You've implemented several privacy tools and your experience online is noticeably more inconvenient — broken site features, more CAPTCHAs, fewer relevant recommendations. How do you decide when the trade-off is worth it?
  • Should the default setting for AI training opt-outs be opt-in (you must actively consent) rather than opt-out (you must actively refuse)? Who benefits from the current default?
  • Is individual privacy protection meaningful when the data brokers already have decades of your history? Or is this too defeatist a view?

Instructor Notes

This session benefits from live demonstration — if you have a laptop connected to a display, walk through finding the training opt-out control on ChatGPT or Google's AI settings in real time. Students find this far more memorable than slide descriptions. Be prepared for the "is it even worth trying?" question — it comes up every time. Validate the frustration, then redirect to the practical reality that raising friction matters even when it doesn't provide perfect protection. Have the EFF's Cover Your Tracks tool (coveryourtracks.eff.org) ready to show browser fingerprinting in action — it's a reliable audience moment.

Timing Guide

0–5 min: Frame the session
5–30 min: Tools demo & browser layer
30–48 min: Opt-outs, aliases, VPNs
48–60 min: Discussion & personal action plans

Session 4 of 8

AI Governance and Regulation

GDPR, CCPA, the EU AI Act, and how regulatory frameworks shape AI data practices
~60 minutes · Instructor-presented

Learning Objectives

  • Summarize the key privacy rights granted under GDPR, CCPA, and the EU AI Act
  • Explain the concept of lawful basis for processing and why it matters for AI systems
  • Compare the regulatory philosophies of the EU, US, and other major jurisdictions
  • Identify the practical limits of current regulation in governing AI data practices

Session Overview

Regulation is the structural layer of privacy protection — where individual choices hit their limits, law either fills the gap or fails to. This session gives students a working knowledge of the major frameworks without becoming a law school lecture. The goal is functional literacy: students should be able to read a privacy notice, understand what rights they actually have, and recognize when a company is making compliance claims that don't hold up under scrutiny.

Organize the session around three frameworks: GDPR (the global gold standard, even for non-EU residents because of its extraterritorial reach), CCPA/CPRA (the US's strongest state-level analog), and the EU AI Act (the first comprehensive attempt to regulate AI systems specifically, with significant implications for high-risk AI applications). Acknowledge frankly that US federal AI privacy law remains thin, and that enforcement of existing law is spotty and under-resourced.

Key Teaching Points

  • GDPR created the modern privacy rights template. The right to access, right to erasure ("right to be forgotten"), right to portability, right to object to automated decision-making — these rights exist because of GDPR and have influenced legislation globally. However, enforcement depends on the resources and political will of national Data Protection Authorities, which varies enormously.
  • Lawful basis is the linchpin of GDPR compliance. Every processing operation requires a lawful basis — consent, legitimate interest, contractual necessity, legal obligation, vital interests, or public task. AI companies frequently invoke "legitimate interest" as a catch-all, which GDPR requires to be balanced against the data subject's interests — a test that is frequently performed loosely.
  • CCPA/CPRA gives California residents meaningful opt-out rights. The right to know what data is collected, the right to delete, the right to opt out of sale, and new CPRA protections for sensitive personal information (including AI-inferred information about health and sexual orientation) represent the strongest US privacy protections, though they apply only to qualifying businesses and California residents.
  • The EU AI Act introduces risk-tiered regulation of AI systems specifically. High-risk AI applications (hiring, credit scoring, biometric identification, critical infrastructure) face strict conformity assessments, transparency requirements, and human oversight mandates. Prohibited practices include social scoring and, subject to narrow exceptions, real-time remote biometric identification in public spaces by law enforcement. The Act also has significant extraterritorial reach for systems deployed in the EU.
  • Automated decision-making protections are underenforced. GDPR Article 22 gives individuals the right not to be subject to purely automated decisions with significant effects — which should apply to AI-driven hiring, lending, and insurance decisions. In practice, companies argue that humans are "in the loop" in ways that regulators have only recently begun to scrutinize seriously.

Discussion Prompts

  • A company's privacy notice says it uses "legitimate interests" as the lawful basis for training its AI on user data. What questions would you ask to evaluate whether that claim is valid?
  • The EU AI Act bans real-time remote biometric surveillance in public spaces (with limited exceptions). The US has no equivalent federal law. What does this difference tell you about the underlying political and economic priorities of each jurisdiction?
  • GDPR's right to erasure says companies must delete your personal data on request. But a large language model has "learned" from your data — it can't forget you the way a database can delete a row. How should the law handle this?
  • Should compliance with privacy law be sufficient proof that an AI system is treating users fairly? Or is legal compliance a floor, not a ceiling?

Instructor Notes

Students often arrive with the assumption that GDPR = Europe and therefore irrelevant to them. Spend a few minutes on GDPR's extraterritorial scope: a company that offers goods or services to people in the EU, or monitors their behavior, is subject to GDPR regardless of where the company is based, which means virtually every major tech company operates under GDPR to some degree. For US-based students, CCPA is the most immediately actionable framework, at least for those in California; walk through how to actually submit a CCPA data access or deletion request. The EU AI Act section will be the newest material for most students, so lean on concrete examples of what counts as "high-risk" rather than abstract legal language.

Timing Guide

0–5 min: Regulatory landscape overview
5–25 min: GDPR deep dive
25–42 min: CCPA/CPRA & EU AI Act
42–60 min: Discussion & enforcement limits

Session 5 of 8

Workplace Privacy and AI Monitoring

Employee surveillance, productivity tracking, and the acceptable limits of AI in the workplace
~60 minutes · Instructor-presented

Learning Objectives

  • Describe the range of AI monitoring technologies now deployed in workplace settings
  • Analyze the legal framework governing employee monitoring in the US and EU
  • Evaluate the ethical tensions between legitimate business interests and employee privacy
  • Identify what disclosures employees should expect and what questions to ask their employer

Session Overview

Workplace AI monitoring has accelerated dramatically since the pandemic-driven shift to remote work, and it represents one of the most immediate and personal privacy issues for most working adults. The technology deployed by employers now includes keystroke logging, screenshot capture, application usage tracking, attention monitoring via webcam, communications sentiment analysis, and AI-generated productivity scores. Students need to understand both what is legally permissible and what is ethically defensible — and recognize that the gap between those two things is often large.

This session is best run as a case-study-heavy discussion. The legal picture is straightforward in the US (employers have broad latitude, especially for employer-owned devices), but the ethical conversation is rich. Push students to think from both sides: as employees, what monitoring would they find acceptable versus degrading? As managers or business owners, what monitoring would they implement and how would they justify it to their teams?

Key Teaching Points

  • The legal standard for employee monitoring in the US is permissive. Employees generally have very limited privacy expectations on employer-owned equipment and networks. The Electronic Communications Privacy Act (ECPA) contains business-use and consent exceptions that allow employers to monitor business communications. A few states, including Connecticut, Delaware, and New York, require employers to notify employees of electronic monitoring, but the bar for employer monitoring is low by international standards.
  • EU and UK workers have substantially stronger protections. Under GDPR, employee monitoring must have a lawful basis, be proportionate, and employees must be clearly informed. The UK ICO has issued explicit guidance that covert monitoring of employees is rarely justified and that keystroke logging and screenshot capture require careful justification.
  • Productivity surveillance often measures activity rather than performance. AI productivity systems frequently track proxies (mouse movement, keystrokes, meeting attendance) rather than actual work quality. Research suggests these systems correlate poorly with job performance and can drive counterproductive gaming behavior — employees optimize for the metric rather than the goal.
  • Biometric monitoring in the workplace introduces heightened risks. Emotion AI and attention-monitoring systems (marketed as tools for detecting driver fatigue or student engagement) have been shown to perform worse for darker-skinned individuals and those with certain disabilities. Deploying these systems at work raises discrimination liability beyond privacy concerns.
  • Transparency and proportionality are the key ethical tests. A useful heuristic: employees should be clearly told what is monitored, why, how long data is retained, and who has access. If an employer would not be comfortable telling employees exactly what is monitored in plain language, that is a signal the monitoring may not survive ethical scrutiny.

Discussion Prompts

  • Your employer installs software that takes a screenshot of your computer every five minutes during work hours. The handbook says "your work device may be monitored." Is this disclosure sufficient? Would your answer change if you are working from home?
  • A hospital uses AI to analyze nurses' communications for signs of burnout and proactively intervenes with support resources. Is this a legitimate use of workplace AI? Does it matter whether nurses opted in?
  • As a manager, you are offered an AI tool that assigns each employee a daily "performance score" based on their digital activity. What information would you want before deciding whether to deploy it?
  • Where should the line be between acceptable quality-assurance monitoring (call centers recording customer interactions, for example) and intrusive surveillance of employees?

Instructor Notes

This session often generates strong personal reactions — many students will have direct experience with workplace monitoring, either as employees who resented it or as managers who implemented it. Create space for both perspectives without letting the discussion become a grievance session. The dual-perspective exercise (what would you accept as an employee / what would you implement as a manager) is particularly effective for generating nuanced thinking. Be prepared for the "I have nothing to hide" response — the counter-question is whether they would be comfortable with their manager reading every message they sent to a coworker, since that's often what these systems capture.

Timing Guide

0–8 min: Survey of monitoring tech
8–25 min: Legal framework US vs. EU
25–45 min: Case studies & dual perspectives
45–60 min: Discussion & ethical heuristics

Session 6 of 8

Children's Privacy and AI

COPPA, FERPA, and the heightened risks of AI data collection targeting minors
~60 minutes · Instructor-presented

Learning Objectives

  • Explain the key protections granted under COPPA and FERPA and their application to AI systems
  • Identify the specific risks that AI data collection poses for children and adolescents
  • Evaluate the adequacy of current legal protections for minors in AI-powered educational and consumer environments
  • Describe practical steps parents, educators, and institutions can take to protect minors' privacy in AI contexts

Session Overview

Children occupy a unique position in privacy law — they are recognized as a class deserving heightened protection because they cannot meaningfully consent to data collection, may not understand the long-term consequences of data exposure, and are developmentally susceptible to manipulation by AI systems designed to maximize engagement. The laws intended to protect them, however, were written before the AI era and show significant gaps.

Ground this session in the concrete environments where children encounter AI: educational technology (AI tutoring, adaptive learning platforms, proctoring software), social media (algorithmic feeds, deepfake generation, recommendation systems), consumer AI assistants (smart speakers, AI companions), and gaming platforms. Each environment has different legal coverage and different risk profiles. Students in this session often include parents, educators, and school administrators — tailor discussion accordingly.

Key Teaching Points

  • COPPA requires verifiable parental consent for data collection on children under 13. The Children's Online Privacy Protection Act prohibits operators of websites and online services from collecting personal information from children under 13 without verifiable parental consent. In practice, most platforms enforce this only through age gates that children bypass trivially — a compliance fiction that regulators have only recently begun to take seriously.
  • FERPA protects education records but has significant loopholes for EdTech. The Family Educational Rights and Privacy Act gives parents (and students over 18) rights to access and control education records. However, FERPA's "school official" exception allows schools to share student data with EdTech vendors as long as the vendor is acting on the school's behalf — meaning student data flows to commercial AI companies under contracts students and parents have no visibility into.
  • AI recommendation systems pose specific risks to adolescent mental health. Internal research at major platforms (revealed through whistleblower disclosures) indicated that recommendation algorithms can amplify content linked to eating disorders, self-harm, and depression for adolescent users. AI systems optimized for engagement do not distinguish between healthy and harmful engagement.
  • AI-generated deepfakes of minors represent an emerging and severe harm. The proliferation of AI image generation tools has enabled the creation of non-consensual intimate imagery (NCII) of minors at scale. This is a federal crime under US law, but enforcement is resource-constrained and the technology outpaces regulatory response.
  • AI proctoring software in schools raises profound civil liberties concerns. Systems that use webcams to infer cheating through eye movement, facial expression, and environmental scanning have been shown to produce higher false-positive rates for students of color and students with disabilities — and collect biometric data from minors under institutional coercion.

Discussion Prompts

  • Your child's school deploys an AI tutoring platform that adapts to their learning style. The platform's terms of service allow it to use student interaction data to improve its commercial product. Is this an acceptable trade-off for educational benefit?
  • A 14-year-old signs up for a social media platform by lying about their age. When the platform's AI recommendation system drives them toward harmful content, who bears responsibility?
  • Should there be a minimum age for using AI companions and chatbots? How would you set and enforce such a limit?
  • COPPA sets the age threshold at 13, a number chosen in 1998. Given what we now know about adolescent brain development and social media's effects, should this threshold be raised?

Instructor Notes

This session reliably generates the most emotionally engaged discussion of the course — particularly when parents are in the room. Be prepared to hold space for genuine distress about risks to children without catastrophizing. Keep the discussion grounded in evidence: the connection between algorithmic recommendation and adolescent mental health is real and documented, but the causal picture is more complex than headlines suggest. The FERPA/EdTech loophole is a policy area where students can take concrete action — advocating for stronger contract terms at the school board level is realistic civic participation. Know your local state law: many states have enacted stronger children's privacy protections than federal law requires.

Timing Guide

0–8 min: Frame the unique vulnerabilities of minors
8–28 min: COPPA & FERPA with gaps
28–46 min: EdTech, social media & deepfakes
46–60 min: Discussion & practical steps

Session 7 of 8

Privacy by Design

Building AI systems that protect privacy from the ground up — not as an afterthought
~60 minutes · Instructor-presented

Learning Objectives

  • Apply the seven foundational principles of Privacy by Design to AI system development
  • Describe practical privacy-enhancing technologies (PETs) relevant to AI systems
  • Evaluate the trade-offs between privacy-preserving techniques and AI system utility
  • Identify where in the AI development lifecycle privacy protections must be built in to be effective

Session Overview

Privacy by Design (PbD) — the framework developed by Ann Cavoukian in the 1990s and now embedded in GDPR as a legal requirement — holds that privacy protection must be baked into systems from the initial design phase rather than bolted on afterward. For AI systems, this principle is especially important because retrofitting privacy protections into a trained model is technically far harder than building them in from the start. This session is oriented toward students who build, procure, or oversee AI systems, not just use them.

Structure the session around three layers: data minimization and purpose limitation at the data collection stage, technical privacy-enhancing technologies (differential privacy, federated learning, homomorphic encryption, synthetic data) at the training and inference stage, and governance mechanisms (Privacy Impact Assessments, data protection officers, audit logs) at the organizational level. Be honest that each technique involves real trade-offs with model utility and development cost.

Key Teaching Points

  • Data minimization is the most powerful privacy protection. The most effective way to prevent AI systems from misusing data is not to collect it in the first place. "Collect everything and figure out what to do with it later" is the dominant industry practice and the root cause of most AI privacy failures. Purpose limitation — collecting data only for a specific, defined purpose — requires organizational discipline but pays dividends in reduced breach exposure and regulatory risk.
  • Differential privacy adds mathematically calibrated noise to protect individuals. Differential privacy (introduced by Cynthia Dwork and colleagues) is a formal mathematical framework that adds carefully calibrated noise to query results or model training so that the output is nearly the same whether or not any one individual's data is included, which bounds what can be learned about that person. Apple uses differential privacy in its usage statistics collection; Google has used it in Chrome. The trade-off is reduced statistical accuracy, and the privacy budget (epsilon) must be managed carefully because every additional release spends more of it.
  • Synthetic data can replace real personal data for many AI training tasks. Synthetic datasets generated to match the statistical properties of real datasets can train capable AI models without directly exposing real individuals' data. The technique is not a silver bullet: synthetic data can still encode biases present in the original data, and a generator that memorizes its source records can reproduce them. Used carefully, though, it substantially reduces re-identification risk.
  • Privacy Impact Assessments should be mandatory before deployment. A Privacy Impact Assessment (PIA) — or Data Protection Impact Assessment (DPIA) under GDPR — is a structured process for identifying privacy risks in a new system before it is deployed. GDPR makes DPIAs mandatory for high-risk processing. Many organizations treat PIAs as a compliance checkbox; effective ones treat them as genuine design reviews.
  • Default settings must protect privacy, not expose it. One of the core PbD principles is "privacy as the default" — systems should be configured to protect privacy by default, with users who want to share more data opting in rather than users who want to protect their privacy opting out. This principle inverts the dominant commercial model and is where regulation and industry interest conflict most directly.
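
For technical audiences, the differential privacy point above can be illustrated with a minimal sketch of the Laplace mechanism applied to a counting query. This is an illustration under stated assumptions, not how Apple or Google implement it: the names `private_count` and `PrivacyBudget` are hypothetical, the budget uses basic sequential composition (epsilons simply add up), and the sample data is made up.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent Exponential(1) draws is Laplace(0, 1).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


class PrivacyBudget:
    """Hypothetical tracker: total epsilon spent under sequential composition."""

    def __init__(self, total_epsilon: float):
        self.total = total_epsilon
        self.spent = 0.0

    def charge(self, epsilon: float) -> None:
        # Refuse the query if it would push cumulative epsilon over the limit.
        if self.spent + epsilon > self.total + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def private_count(values, predicate, epsilon, budget, rng):
    """Release a noisy count that satisfies epsilon-differential privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    budget.charge(epsilon)
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Illustrative usage with made-up ages.
rng = random.Random(0)
budget = PrivacyBudget(total_epsilon=1.0)
ages = [23, 35, 41, 29, 52, 67, 38]
noisy_over_40 = private_count(ages, lambda a: a >= 40, 0.5, budget, rng)
print(f"noisy count: {noisy_over_40:.2f}, epsilon spent: {budget.spent}")
```

A smaller epsilon means a larger noise scale and stronger privacy, which is the accuracy trade-off in concrete form. Once the budget is spent, further queries are refused rather than answered with weaker guarantees.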

Discussion Prompts

  • A startup building a healthcare AI argues that data minimization would make their model less accurate and therefore less helpful to patients. How do you evaluate this claim, and what questions would you ask?
  • Your organization must conduct a DPIA before launching a new AI hiring tool. Who should be involved in that assessment, and what would you consider a genuinely independent review?
  • Is "privacy by design" compatible with the current business models of AI companies that monetize user data? Or does it require a fundamentally different economic model?
  • Which of the privacy-enhancing technologies discussed today do you think has the most realistic near-term adoption potential, and why?

Instructor Notes

This is the most technically dense session in the course, so calibrate the depth of your differential privacy and federated learning explanations to your audience. For non-technical audiences, the key message is: "These techniques exist and work, but they involve real trade-offs, and privacy must be a design requirement, not a nice-to-have." For technical audiences, go deeper into the mathematics of the privacy budget and the practical limitations of synthetic data. The PIA/DPIA discussion is useful for any audience because it is process-oriented rather than technical. If you have time, ask students to sketch a PIA for a hypothetical AI system they might deploy; even a rough exercise makes the concept concrete.
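
If you do go into the mathematics of the privacy budget, the following standard definitions are one possible starting point. The notation is an assumption of this sketch (M is a randomized mechanism, D and D' are datasets differing in one person's record, Δf is the sensitivity of a numeric query f), and only basic sequential composition is shown; tighter composition results exist but are beyond this session.

```latex
% epsilon-differential privacy, for every set of outputs S
\[
  \Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S]
\]

% Laplace mechanism: noise scale grows as epsilon shrinks
\[
  M(D) = f(D) + \mathrm{Lap}\!\left(\frac{\Delta f}{\varepsilon}\right)
\]

% Basic sequential composition: k releases on the same data add up
\[
  \varepsilon_{\text{total}} = \sum_{i=1}^{k} \varepsilon_i
\]
```

The last line is what "managing the budget" means in practice: each additional release on the same data adds its epsilon to the total, so the guarantee weakens with every query.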

Timing Guide

0–8 min: PbD principles overview
8–30 min: Data minimization & PETs
30–46 min: PIAs & default settings
46–60 min: Discussion & design exercise

Session 8 of 8

Responding to Privacy Incidents

Data breach response, notification requirements, and recovery strategies for AI-related privacy failures
~60 minutes · Instructor-presented

Learning Objectives

  • Describe the key phases of an effective data breach response plan and the roles involved
  • Explain breach notification requirements under GDPR, US state laws, and sector-specific regulations
  • Apply a structured approach to assessing whether an AI-related incident constitutes a notifiable breach
  • Identify the long-term recovery steps — technical, legal, and reputational — required after an AI privacy incident

Session Overview

Privacy incidents happen even to organizations that have done most things right. The question is not whether your AI system will ever be involved in a privacy incident, but whether your organization is prepared to respond effectively when it is. This final session shifts from prevention to response, giving students a practical playbook grounded in real incident patterns.

AI-related privacy incidents have some unique characteristics that distinguish them from traditional data breaches: the "leaked" data may be embedded in model weights rather than a database, the breach may be discovered through a model's outputs rather than an intrusion alert, and remediation may require retraining the model rather than patching a system. Cover these AI-specific nuances alongside the established incident response frameworks students may already know from other security training.

Key Teaching Points

  • Incident response starts before the incident. Organizations that respond effectively to breaches have prepared in advance: they have a documented incident response plan, designated roles (incident commander, legal counsel, communications lead), pre-approved notification templates, and relationships with outside forensic and legal counsel. The worst time to design these is after a breach has occurred.
  • Breach notification timelines are strict and vary by jurisdiction. GDPR requires notification to the supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware of a breach, unless the breach is unlikely to result in a risk to individuals. That is a very short window. US state laws vary: California requires "expedient" notification, while other states have windows from 30 to 90 days. Sector-specific rules (such as HIPAA's 60-day outer limit for covered entities) add further complexity. Organizations operating across jurisdictions need a triage process to identify the most demanding applicable timeline.
  • AI-specific breaches include model memorization incidents. When a language model regurgitates training data containing PII (credit card numbers, medical records, personal addresses), this is a privacy incident, and potentially a notifiable breach, even though no external attacker accessed a database. Any resulting investigation, remediation, and notification obligations are the same as for a traditional data breach, but the technical response (retraining, output filtering, model replacement) is different.
  • Scope determination is the hardest part of AI breach response. In a traditional database breach, you can usually determine which records were exposed. When PII leaks through a model's outputs, determining which individuals are affected, how many queries surfaced sensitive data, and what was actually disclosed requires careful log analysis — which is only possible if organizations have maintained adequate audit logs of model inputs and outputs.
  • Reputational recovery requires transparency, not minimization. Organizations that respond to AI privacy incidents by minimizing the scope, blaming users, or obscuring the cause consistently fare worse in the long run than those that disclose clearly, acknowledge failures, and commit to specific remediation steps. The FTC and state attorneys general are increasingly focused on whether post-breach conduct is evidence of systemic privacy failures warranting enforcement action.
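
The triage process mentioned in the notification point above can be sketched as a small deadline calculator. This is a teaching illustration, not legal guidance: the `RULES` table and the function name are hypothetical, the GDPR and HIPAA windows are the ones named in this session, and the 30-day state entry is a placeholder for whichever state laws actually apply.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class NotificationRule:
    regime: str
    window: timedelta  # maximum time from awareness to notification


# Assumed, simplified rule table for illustration only.
RULES = {
    "GDPR": NotificationRule("GDPR (supervisory authority)", timedelta(hours=72)),
    "HIPAA": NotificationRule("HIPAA (covered entity)", timedelta(days=60)),
    "STATE_30_DAY": NotificationRule("Hypothetical US state, 30-day window", timedelta(days=30)),
}


def notification_deadlines(aware_at: datetime, applicable: list[str]):
    """Return (regime, deadline) pairs sorted with the most demanding first."""
    deadlines = [
        (RULES[key].regime, aware_at + RULES[key].window) for key in applicable
    ]
    return sorted(deadlines, key=lambda pair: pair[1])


# Illustrative usage: the clock starts when the organization becomes aware.
aware_at = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
for regime, deadline in notification_deadlines(aware_at, ["HIPAA", "STATE_30_DAY", "GDPR"]):
    print(f"{deadline.isoformat()}  {regime}")
```

Sorting by deadline puts the tightest obligation at the top of the list, which is usually the one that should drive the first days of the response.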

Discussion Prompts

  • Your company's customer-facing AI chatbot has been found to occasionally reproduce text that contains real customers' email addresses. How do you determine whether this is a notifiable breach, and what do you do in the first 24 hours?
  • GDPR's 72-hour notification window is designed to push organizations to act quickly. Critics argue it is too short to properly scope a complex AI incident. Do you agree, and what would a better standard look like?
  • After a major privacy incident, your organization's CEO wants to issue a statement saying "fewer than 1% of users were affected." What questions do you ask before approving that statement?
  • Looking back across all eight sessions: what is the single most important thing you are taking away from this course that will change how you think or act?

Instructor Notes

Close the course with the final discussion question as a capstone: go around the room and ask each student to name one thing they will actually do differently. This creates accountability and sends students out with a concrete action rather than a diffuse sense that "privacy matters." The chatbot PII leakage scenario is realistic and close to incidents that have actually occurred; if you have a real incident you can discuss (as far as any NDA permits, or through public case studies), use it in place of the hypothetical. For the 72-hour question, there is no consensus right answer. The point is to get students thinking about the tension between speed and accuracy in incident response, which is genuinely hard.
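
For technical audiences, the chatbot scenario can be made concrete with a rough sketch of scope determination from audit logs. Everything here is an assumption for illustration: the log record fields (`request_id`, `prompt`, `output`), the function name, and the sample records are hypothetical, and the email pattern is deliberately simple rather than a complete validator.

```python
import re
from collections import defaultdict

# Simple illustrative pattern; real PII detection needs broader coverage.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def scope_email_leaks(log_records):
    """Map each leaked email address to the request IDs whose output exposed it."""
    hits = defaultdict(list)
    for rec in log_records:
        # An address the user typed into the prompt is echoed, not leaked.
        in_prompt = {e.lower() for e in EMAIL_RE.findall(rec["prompt"])}
        for email in EMAIL_RE.findall(rec["output"]):
            email = email.lower()
            if email not in in_prompt and rec["request_id"] not in hits[email]:
                hits[email].append(rec["request_id"])
    return dict(hits)


# Illustrative usage with made-up log records.
logs = [
    {"request_id": "r1", "prompt": "What is your refund policy?",
     "output": "Please email alice@example.com for refunds."},
    {"request_id": "r2", "prompt": "My address is bob@example.org, please confirm",
     "output": "Confirmed: bob@example.org"},
    {"request_id": "r3", "prompt": "Who handled ticket 12?",
     "output": "Alice (ALICE@example.com) and carol@example.net"},
]
for email, request_ids in scope_email_leaks(logs).items():
    print(email, request_ids)
```

The sketch only works if inputs and outputs were logged in the first place, which is why audit logging belongs in the design phase rather than in the response plan.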

Timing Guide

0–8 min: Incident response framework
8–28 min: Notification requirements & timelines
28–46 min: AI-specific breach patterns
46–60 min: Capstone discussion & course close