AI Ethics: Right and Wrong · Introduction

Every Powerful Tool Arrives Before We Know How to Use It Wisely

This course exists because the people building AI are not the only ones who should be deciding what it does.

In 1844, Samuel Morse tapped the first long-distance telegraph message from Washington to Baltimore: "What hath God wrought?" The question was half celebration, half genuine dread. Within fifteen years the telegraph had reorganized financial markets, military command, and journalism. It had also enabled a new wave of fraud, stock manipulation, and surveillance that legislators scrambled to address decades after the damage was done. The pattern was consistent: the technology arrived, the benefits were obvious and immediate, and the harms emerged slowly, unevenly, and fell mostly on people who had no hand in building the system.

Today that pattern is repeating at machine speed. Between 2022 and 2024, large language models moved from academic curiosity to tools embedded in hiring pipelines, medical triage software, criminal risk-scoring systems, and school assessments. The warning signs predate them. A hiring algorithm built at Amazon, trained on ten years of male-dominated tech resumes, began systematically downgrading applications from women; the company quietly shelved it, and the story became public only in 2018. A healthcare resource-allocation algorithm sold by Optum, one of a class of tools applied to roughly 200 million Americans each year, was found in 2019 to assign lower risk scores to Black patients than to white patients who were equally sick, diverting care away from those who needed it more. The decisions were not made by a person intending harm. They emerged from choices about what data to use, which outcomes to optimize, and whose interests to weigh, choices that seemed purely technical at the time.

This course is about learning to see those choices for what they are: ethical decisions disguised as engineering ones. You will not finish it with a universal formula for right and wrong. No one has one. What you will leave with is a set of frameworks, questions, and historical anchors that let you examine any AI decision — one you encounter, one you are asked to build, one that affects you — and reason about it with rigor and honesty. That is a skill the next decade will demand of almost everyone.

AI Ethics: Right and Wrong · Lesson 1

What Makes an AI Decision Ethical?

Three ethical frameworks, one concrete case, and why "the algorithm decided" is never a complete answer.

When a machine makes a decision that harms a person, who is morally responsible — and by what standard do we even judge it?

Until early 2020, Dutch authorities, including the city of Rotterdam, were using an AI risk-scoring system called SyRI (the System Risk Indication) to flag citizens for welfare fraud investigations. The system ingested seventeen categories of government data: tax records, employment history, debt registers, vehicle ownership, even energy usage. It produced a numerical score. High scorers were referred to human investigators. No one who received a high score was told they had been flagged. No one could appeal a score they didn't know existed. The system was deployed overwhelmingly in lower-income, ethnically diverse neighborhoods. In February 2020, a Dutch court ruled that SyRI violated the European Convention on Human Rights, specifically Article 8, the right to private life, and ordered it shut down. The government argued the system was merely a neutral tool for allocating investigative resources. The court found that neutrality was a fiction: the choice of which data to weight, and which neighborhoods to deploy in, encoded assumptions that could not be separated from the decision itself.

The SyRI case is not unusual. It is a template. It illustrates something that will recur throughout this course: AI systems do not make value-free decisions. Every system embeds the values of its designers, even when — especially when — those designers insist they have kept values out of it.

Why Ethics Applies to AI Decisions

An ethical decision is one that affects the interests, rights, or welfare of people — and where the choice between options is not purely factual. Whether a bridge will hold a certain weight is an engineering question. Whether a welfare algorithm should weigh debt history more heavily than employment history is not. The second question looks technical; it is actually moral. It asks: whose circumstances do we take seriously, and whose do we discount?

AI systems face ethical dimensions at three distinct junctures: design (what problem to solve and how to define success), training (what data to learn from and how to handle its biases), and deployment (who is affected and how their interests are weighed). Ethical analysis is not a checklist appended at the end of development. It is a discipline that must be active at all three stages — which is why it requires a vocabulary.

Three Frameworks for Ethical Reasoning

Moral philosophers have debated the foundations of ethics for more than two millennia. Three frameworks dominate practical applied ethics today, and all three are relevant to AI. None of them provides a simple answer to every case. Each illuminates a different dimension of the same decision.

Consequentialism: Judges actions by their outcomes. The right decision maximizes overall welfare (or minimizes harm). Associated with Jeremy Bentham and John Stuart Mill. In AI terms: does this system, on net, produce better outcomes across all affected parties?

Deontology: Judges actions by whether they respect rules, duties, and rights, regardless of outcome. Associated with Immanuel Kant. In AI terms: does this system treat people as ends in themselves, or does it use them as data points for someone else's benefit?

Virtue Ethics: Judges actions by whether they reflect the character of a good agent. Associated with Aristotle. In AI terms: what kind of institution are we becoming by building and deploying this system? Would a person of good character be comfortable with this decision?

Framework in Tension — SyRI Revisited

A consequentialist might ask: did SyRI actually detect more fraud than traditional investigation at lower cost? If yes, and the total harm from fraud exceeded the harm from misidentification, was it justified? A deontologist asks a different question entirely: did citizens have a right to know they were being scored and to contest that scoring? The Dutch court in effect answered yes: it found that the secret scoring violated residents' rights and shut the system down regardless of its detection rate. A virtue ethicist asks: what kind of government secretly scrutinizes its poorest residents while exempting wealthier ones? What does that say about institutional character?

The Moral Responsibility Problem

When an AI system produces a harmful outcome, responsibility does not automatically attach to any single actor, and this diffusion is not accidental. Engineers say they followed specifications. Product managers say they relied on engineers' judgment. Executives say they relied on legal review. Legal teams say the system complied with current law. And the law, as often as not, has not caught up with the technology. The philosopher Helen Nissenbaum, borrowing a term coined by the political theorist Dennis Thompson, called this "the problem of many hands": harm that emerges from a system no single person designed to be harmful.

Understanding this does not mean no one is responsible. It means responsibility must be traced carefully — upstream to design choices, sideways to deployment decisions, and outward to the institutions that permitted the system to operate without oversight. One of the practical goals of this course is to teach you to trace that chain.

Core Principle

"The algorithm decided" is never a complete answer to an ethical question about an AI system. Algorithms are designed, trained, deployed, and maintained by people operating within institutions that have interests, incentives, and choices. Every step in that chain involves moral agency — and moral accountability.

What Counts as an "AI Decision"?

Not every computation is an ethical decision in the relevant sense. A spam filter classifying email as junk is making a decision, but the stakes are low and mistakes are easily corrected. A risk-scoring algorithm that informs which defendants are released before trial, as COMPAS does in jurisdictions across the United States, shapes decisions with potentially years of someone's life at stake. The ethical weight grows with: (1) the severity of the consequence, (2) how hard the outcome is to reverse, (3) the vulnerability of those affected, and (4) the lack of available recourse.

ProPublica's 2016 investigation into COMPAS found that the system falsely flagged Black defendants as future criminals at nearly twice the rate it falsely flagged white defendants. The company that made COMPAS, Northpointe, disputed the methodology. What the dispute revealed — and this is the lesson — is that the very definition of "fairness" for a risk-scoring system is contested, and the contest is not statistical. It is ethical. You cannot resolve it with more data.

Key Distinction

Technical accuracy and ethical soundness are different properties. A system can be accurate on its training metric and still produce ethically unacceptable outcomes. Optimizing for accuracy at predicting recidivism using historically biased arrest data does not produce a fair system. It produces a precise replica of historical injustice.
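The gap between accuracy and fairness can be seen in a few lines of arithmetic. The sketch below uses invented confusion-matrix counts (they are not COMPAS or ProPublica data, and the class and group names are hypothetical) to show two groups that a model scores with identical overall accuracy while one group is falsely flagged twice as often as the other.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # flagged high risk, did reoffend
    fp: int  # flagged high risk, did not reoffend
    fn: int  # flagged low risk, did reoffend
    tn: int  # flagged low risk, did not reoffend

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total

    def false_positive_rate(self) -> float:
        # Among people who did NOT reoffend, the share wrongly flagged.
        return self.fp / (self.fp + self.tn)


# Hypothetical counts chosen for illustration only.
groups = {
    "group_a": Confusion(tp=70, fp=40, fn=30, tn=60),
    "group_b": Confusion(tp=50, fp=20, fn=50, tn=80),
}

for name, c in groups.items():
    print(f"{name}: accuracy={c.accuracy():.2f}, "
          f"false positive rate={c.false_positive_rate():.2f}")
```

Reporting only the accuracy figure would make the two groups look interchangeable. Which of the two numbers should govern a deployment decision is the ethical question, and the code cannot answer it.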

Lesson 1 Quiz

Five questions · Select the best answer for each

1. The Dutch court ruled SyRI illegal primarily because it violated which right?

Correct. The court found that SyRI's opaque scoring of citizens' private data, without their knowledge or right of appeal, violated Article 8 — the right to respect for private and family life.

Not quite. The court's ruling centered on Article 8 of the European Convention on Human Rights — the right to private life — because citizens were scored on personal data without their knowledge or ability to contest the result.

2. Which ethical framework asks "does this system treat people as ends in themselves, not merely as means?"

Correct. Deontology, particularly Kant's categorical imperative, holds that persons must never be treated merely as instruments for others' ends — a principle directly relevant to AI systems that process people as data inputs.

Not quite. This formulation — treating persons as ends, not merely means — is the signature of Kantian deontology, which evaluates actions by whether they respect duties and rights rather than by their outcomes.

3. ProPublica's 2016 investigation found that the COMPAS recidivism algorithm disproportionately affected which group?

Correct. ProPublica's analysis found that Black defendants who did not go on to reoffend were labeled high risk at roughly twice the rate of white defendants who did not reoffend.

The investigation found that Black defendants were falsely flagged as high risk at nearly twice the rate of white defendants, a finding that sparked a major public debate about algorithmic fairness.

4. What does philosopher Helen Nissenbaum's concept of "the problem of many hands" describe in the context of AI?

Correct. Nissenbaum's concept captures how complex systems produce harms that no individual designed, because responsibility is spread across engineers, managers, executives, and legal teams — each of whom can credibly deny full ownership of the outcome.

Nissenbaum's "problem of many hands" refers specifically to the diffusion of moral responsibility in complex sociotechnical systems — where harm emerges from a chain of individually defensible decisions and no single actor can be held solely accountable.

5. Which statement best captures the relationship between technical accuracy and ethical soundness in an AI system?

Correct. COMPAS illustrated this precisely: the system could be statistically "accurate" at predicting recidivism while encoding racial disparities baked into historical arrest data. Accuracy and fairness are distinct properties that can and do conflict.

These are distinct properties. A system optimized for accuracy on biased training data will accurately reproduce historical injustice — making it technically precise and ethically problematic at the same time.

Lab 1 · Applying Ethical Frameworks

Practice applying consequentialism, deontology, and virtue ethics to real AI decision scenarios

Your Task

You have encountered three ethical frameworks — consequentialism, deontology, and virtue ethics — and two real cases: the SyRI welfare fraud system and the COMPAS recidivism algorithm. In this lab, you will practice applying those frameworks to new scenarios and defending your reasoning.

The AI tutor will present you with a scenario and ask you to analyze it through one or more frameworks. There are no trick questions. The goal is to practice structured ethical reasoning, not to arrive at a predetermined answer.

Start by telling the tutor which of the three frameworks you found most compelling in Lesson 1, and why. Then ask for a new scenario to analyze.

Ethics Reasoning Tutor


Welcome to Lab 1. We've covered consequentialism, deontology, and virtue ethics — three lenses for examining AI decisions. Tell me which framework resonated most with you after reading about SyRI and COMPAS, and briefly explain why. Then I'll give you a fresh scenario to work through together.

AI Ethics: Right and Wrong · Lesson 2

Bias, Fairness, and the Data We Trust

Why "garbage in, garbage out" understates the problem — and how bias survives even clean data.

If an AI is trained on data that reflects historical injustice, can its outputs ever be fair — and what does "fair" even mean?

In 2014, Amazon began building an automated résumé screening tool designed to take the human bottleneck out of technical hiring. Engineers trained it on ten years of submitted résumés, which reflected, accurately, the fact that Amazon's technical workforce was predominantly male. By 2015 the company had found that the system was downgrading résumés that contained the word "women's" (as in "women's chess club" or "women's college") and penalizing graduates of all-women's universities. Amazon tried and failed to correct it and quietly disbanded the project; the episode became public in 2018. The system had not been programmed to discriminate. It had learned to discriminate from data that encoded a decade of male-dominated hiring patterns, and then reproduced those patterns at machine scale.

The engineers called this a data problem. It was, in one sense. But it was also a question problem: when you define "a good hire" as "looks like our past hires," you have already made an ethical choice — and disguised it as an engineering one.

Where Bias Enters the Pipeline

Algorithmic bias does not require a biased programmer. It requires only a biased world — and data collected from that world. Researchers have identified five points in the AI development pipeline where bias can enter and compound:

1. Historical bias — the world the data describes was already unequal. Arrest records, loan approvals, medical diagnoses: all reflect who was policed, who was given credit, who was taken seriously by doctors.

2. Representation bias — certain groups are underrepresented in training data, so the system performs worse on them. A 2018 study by Joy Buolamwini and Timnit Gebru (the "Gender Shades" paper) found that commercial facial recognition systems from IBM, Microsoft, and Face++ had error rates up to 34 percentage points higher for darker-skinned women than for lighter-skinned men.

3. Measurement bias — the proxy metric chosen to represent the outcome of interest is more accurate for some groups than others. Using credit score as a proxy for creditworthiness, when credit scores are themselves shaped by decades of discriminatory lending, embeds that discrimination in the new system.

4. Aggregation bias — building a single model for a diverse population obscures important within-group variation. A diabetes prediction model trained on aggregate data may perform well on average and poorly for specific ethnic groups with distinct physiological patterns.

5. Deployment bias — a system built for one context is applied in another where its assumptions no longer hold. A hiring tool trained on one industry's norms applied to another sector will carry its first industry's assumptions invisibly.

The Gender Shades Study — 2018

Joy Buolamwini at MIT and Timnit Gebru, then at Microsoft Research, published a landmark audit of commercial facial recognition systems in 2018. IBM's system had a 99.7% accuracy rate on lighter-skinned men and a 65.3% accuracy rate on darker-skinned women, a gap of 34.4 percentage points. Microsoft's showed a similar pattern. The companies did not dispute the results. IBM improved its system within months. What the study demonstrated was that companies had been deploying systems they had not audited for performance across demographic groups, on the assumption that high overall accuracy meant universal reliability.
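A short calculation shows how a high headline number can hide a gap of this size. The sketch below takes the two IBM accuracy figures quoted above and combines them using a hypothetical test set made up of 90% lighter-skinned men and 10% darker-skinned women. That split, and the reduction to two subgroups, are assumptions made for illustration, not figures from the study.

```python
# Accuracy figures for IBM's system as quoted in this lesson.
subgroup_accuracy = {
    "lighter_skinned_men": 0.997,
    "darker_skinned_women": 0.653,
}

# Hypothetical composition of an unbalanced test set (an assumption).
test_set_share = {
    "lighter_skinned_men": 0.90,
    "darker_skinned_women": 0.10,
}

# Overall accuracy is the share-weighted average of subgroup accuracy.
overall_accuracy = sum(
    subgroup_accuracy[group] * test_set_share[group]
    for group in subgroup_accuracy
)

# The disparity only appears when results are reported per subgroup.
gap_points = 100 * (max(subgroup_accuracy.values())
                    - min(subgroup_accuracy.values()))

print(f"overall accuracy: {overall_accuracy:.1%}")
print(f"gap between subgroups: {gap_points:.1f} percentage points")
```

Under these assumed proportions the aggregate figure stays above 96%, which is why an audit has to break results out by subgroup rather than rely on a single average.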

The Fairness Impossibility Result

In 2016, in direct response to the COMPAS controversy, computer scientists Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan proved something that has since reshaped the field: under realistic conditions, it is mathematically impossible to satisfy all common definitions of algorithmic fairness simultaneously. You can have a system whose error rates (false positives and false negatives) are equal across groups, or one whose risk scores are equally well calibrated across groups (so that a positive prediction is equally likely to be right in every group), but not both, unless the predictor is perfect or base rates are equal across groups, which in historically unequal societies they typically are not.

This is not a technical problem awaiting a better algorithm. It is a values problem dressed in mathematical clothes. Choosing which fairness criterion to satisfy is a choice about whose risks matter more — and that choice belongs in the domain of ethics and democratic deliberation, not engineering alone.
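The conflict can be checked with arithmetic that follows directly from the definitions. For any binary classifier, positive predictive value (PPV) equals p*TPR / (p*TPR + (1-p)*FPR), where p is the group's base rate, TPR its true positive rate, and FPR its false positive rate. Solving for FPR gives FPR = p/(1-p) * (1-PPV)/PPV * TPR. The sketch below is an illustration of that identity, not a reproduction of the published proof; the function name and all numbers are hypothetical.

```python
def implied_false_positive_rate(base_rate: float, ppv: float, tpr: float) -> float:
    """False positive rate forced by the definitions once the other three are fixed.

    Derived by solving PPV = p*TPR / (p*TPR + (1-p)*FPR) for FPR.
    """
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * tpr


# Hold PPV and TPR equal for both groups (illustrative values).
ppv, tpr = 0.6, 0.7

# Hypothetical base rates that differ between the two groups.
fpr_higher_base_rate = implied_false_positive_rate(0.5, ppv, tpr)
fpr_lower_base_rate = implied_false_positive_rate(0.3, ppv, tpr)

print(f"FPR, base rate 0.5: {fpr_higher_base_rate:.3f}")
print(f"FPR, base rate 0.3: {fpr_lower_base_rate:.3f}")
```

With predictive value and true positive rate held equal, the group with the higher base rate necessarily ends up with the higher false positive rate. No choice of model changes this; only the choice of which quantity to equalize does.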

Equal opportunity: All qualified applicants (or defendants, or patients) have the same chance of a favorable outcome regardless of group membership. Focus on true positive rates being equal across groups.

Demographic parity: Outcomes are distributed proportionally across groups regardless of underlying qualification rates. Can conflict directly with merit-based selection when historical opportunity has been unequal.

Calibration: A risk score of 70% means the same thing (70% probability of the outcome) regardless of which group the person belongs to. Northpointe argued COMPAS was fair by this definition; ProPublica argued it was not fair by equal opportunity.
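Each of the three definitions above corresponds to a different quantity that can be measured per group. The sketch below computes one for each on a tiny invented dataset. Because the toy predictions are yes/no rather than probabilities, positive predictive value is used as a coarse stand-in for calibration; that substitution, the data, and the function names are all assumptions made for illustration.

```python
def selection_rate(predicted: list[int]) -> float:
    """Demographic parity compares this: share of the group predicted positive."""
    return sum(predicted) / len(predicted)


def true_positive_rate(predicted: list[int], actual: list[int]) -> float:
    """Equal opportunity compares this: share of actual positives predicted positive."""
    hits = [p for p, a in zip(predicted, actual) if a == 1]
    return sum(hits) / len(hits)


def positive_predictive_value(predicted: list[int], actual: list[int]) -> float:
    """Coarse calibration check: share of positive predictions that were right."""
    outcomes = [a for p, a in zip(predicted, actual) if p == 1]
    return sum(outcomes) / len(outcomes)


# Invented data: 1 = positive prediction / positive outcome.
data = {
    "group_a": {"predicted": [1, 1, 1, 0, 0, 0, 0, 0],
                "actual":    [1, 1, 0, 1, 0, 0, 0, 0]},
    "group_b": {"predicted": [1, 1, 1, 1, 0, 0, 0, 0],
                "actual":    [1, 1, 1, 0, 1, 1, 0, 0]},
}

report = {
    name: {
        "selection_rate": selection_rate(d["predicted"]),
        "true_positive_rate": true_positive_rate(d["predicted"], d["actual"]),
        "positive_predictive_value": positive_predictive_value(d["predicted"], d["actual"]),
    }
    for name, d in data.items()
}

for name, metrics in report.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

On this data the two groups differ on all three measures, and they do not differ in the same direction. Which gap counts as the unfairness depends on which definition has been chosen.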

The Core Lesson

Choosing a fairness definition is not a neutral technical act. It is a decision about which harms are acceptable and to whom. That decision requires explicit ethical justification — and it should involve, at minimum, the communities most likely to be affected by the system.

What Can Be Done

The impossibility result does not license passivity. Several concrete interventions reduce bias without requiring its elimination: pre-processing methods that transform training data to reduce historical disparities before training begins; in-processing constraints that build fairness criteria directly into the optimization objective; post-processing thresholds that adjust decision boundaries differently for different groups to equalize error rates. Each of these has trade-offs. Each requires an explicit choice about which trade-off to accept.
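Of the three interventions, post-processing is the simplest to illustrate. The sketch below is a minimal example under stated assumptions: the scores, labels, shared cutoff of 0.7, and target true positive rate of 0.75 are invented, and the helper names are hypothetical. It picks a separate threshold for each group so that both reach the same true positive rate.

```python
def true_positive_rate(scores: list[float], labels: list[int], threshold: float) -> float:
    """Share of actual positives whose score meets the threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    return sum(s >= threshold for s in positive_scores) / len(positive_scores)


def threshold_for_target_tpr(scores: list[float], labels: list[int], target_tpr: float) -> float:
    """Highest observed score that, used as a cutoff, reaches the target TPR."""
    for candidate in sorted(set(scores), reverse=True):
        if true_positive_rate(scores, labels, candidate) >= target_tpr:
            return candidate
    return min(scores)


# Invented model scores and true outcomes for two groups.
groups = {
    "group_a": {"scores": [0.9, 0.8, 0.7, 0.6, 0.4, 0.3], "labels": [1, 1, 1, 0, 1, 0]},
    "group_b": {"scores": [0.8, 0.6, 0.5, 0.45, 0.3, 0.2], "labels": [1, 1, 0, 1, 1, 0]},
}

SHARED_THRESHOLD = 0.7  # one cutoff for everyone (assumed)
TARGET_TPR = 0.75       # the rate both groups should reach (assumed)

results = {}
for name, g in groups.items():
    own_threshold = threshold_for_target_tpr(g["scores"], g["labels"], TARGET_TPR)
    results[name] = {
        "tpr_shared": true_positive_rate(g["scores"], g["labels"], SHARED_THRESHOLD),
        "own_threshold": own_threshold,
        "tpr_adjusted": true_positive_rate(g["scores"], g["labels"], own_threshold),
    }
    print(name, results[name])
```

The trade-off is visible in the same data: lowering group_b's cutoff equalizes true positive rates, but it also places one of that group's two actual negatives above the line. Equalizing one error rate moved another, which is the explicit choice the paragraph above describes.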

Perhaps more importantly: the decision about whether to deploy a system at all is always available. Amazon disbanded its résumé tool. That decision — not building a better tool, but declining to deploy a harmful one — is also an ethical choice, and often the correct one.

Lesson 2 Quiz

Five questions · Select the best answer for each

1. Amazon's résumé-screening tool penalized applications from women primarily because of what?

Correct. The system learned from historical hiring patterns. Since successful hires had been predominantly male, it reproduced that pattern — not because it was told to, but because the data it learned from encoded a decade of gender imbalance.

The system was not explicitly programmed to discriminate. It learned to discriminate by training on ten years of predominantly male hiring outcomes, and then reproduced those patterns in its recommendations.

2. The "Gender Shades" study by Buolamwini and Gebru found error rate gaps as large as how many percentage points between lighter-skinned men and darker-skinned women?

Correct. IBM's system showed a 34.4 percentage-point gap between its accuracy on lighter-skinned men (99.7%) and darker-skinned women (65.3%) — revealing that high overall accuracy concealed severe disparities for underrepresented groups.

The gap was approximately 34 percentage points. IBM's facial recognition achieved 99.7% accuracy on lighter-skinned men and 65.3% on darker-skinned women — a finding that demonstrated how high average accuracy can mask serious performance disparities.

3. The fairness impossibility result (Kleinberg, Mullainathan, Raghavan, 2016) showed that:

Correct. The mathematical proof showed that common fairness criteria, such as calibration and equal error rates across groups, cannot all be satisfied at once when group base rates differ, which means choosing a fairness definition is unavoidably a values decision.

The impossibility result proved that when base rates differ across groups (as they typically do in historically unequal societies), it is mathematically impossible to simultaneously satisfy all common definitions of algorithmic fairness.

4. "Measurement bias" in an AI pipeline refers to:

Correct. Measurement bias occurs when the chosen proxy metric (e.g., credit score as a proxy for creditworthiness) is itself shaped by historical inequities — so the model learns and reproduces those inequities even when trained correctly on that metric.

Measurement bias refers to the use of a proxy variable that performs differently across groups — for example, using arrest rates as a proxy for criminal propensity when arrest rates themselves reflect unequal policing practices.

5. Which statement about algorithmic bias interventions is most accurate?

Correct. Pre-processing, in-processing, and post-processing interventions all carry trade-offs — between accuracy and fairness, between different fairness definitions, between affected groups. The choice of which trade-off to accept is inherently normative, not purely technical.

Each intervention — whether pre-processing data, constraining training, or adjusting thresholds post-hoc — involves trade-offs. Deciding which trade-off is acceptable requires explicit ethical reasoning, not just technical optimization.

Lab 2 · Diagnosing Bias in a Scenario

Practice identifying which type of bias is present and what intervention it calls for

Your Task

You have learned five points where bias enters an AI pipeline: historical bias, representation bias, measurement bias, aggregation bias, and deployment bias. You have also learned three intervention approaches: pre-processing, in-processing, and post-processing.

The tutor will describe a scenario involving a biased AI system. Your job is to: (1) identify which type(s) of bias are present, (2) explain the mechanism by which harm is produced, and (3) propose an intervention and defend it.

Start by asking the tutor for your first scenario. Be specific in your diagnosis — name the bias type and trace how it enters the pipeline.

Bias Diagnosis Tutor


Ready when you are. Ask me for a scenario and I'll describe an AI system with a bias problem. Your job is to identify which type of bias is at work, explain how the harm is generated, and propose a concrete intervention. Let's begin.

AI Ethics: Right and Wrong · Lesson 3

Transparency, Explainability, and the Right to Know

When an AI makes a decision that affects your life, do you have a right to understand why — and is that even possible?

What does it mean to demand an explanation from an algorithm, and who bears the burden of providing one?

In 2011, a woman named Jeanette Yarger received notice that her Medicaid-funded home care hours had been cut — from twenty hours per week to less than seven. She had cerebral palsy. Without those hours, she could not live independently. The state of Idaho had implemented a new algorithmic assessment system to allocate home care hours, and it had flagged her case for reduction. When her legal advocates asked state officials to explain the algorithm's reasoning, they were told the methodology was proprietary. The vendor refused to disclose the logic. Yarger sued. In 2016, a federal appeals court ruled in her favor: the state had violated due process by cutting benefits through an opaque system that recipients could not meaningfully contest.

The same pattern emerged in Arkansas in 2016, in New York in 2018, and in Louisiana in 2019 — all involving algorithmic benefit-assessment systems, all generating unexplained reductions in care hours for people with disabilities, all ultimately challenged in court on due process grounds. The explainability problem was not academic. It was the difference between living independently and institutional care.

The Transparency-Explainability Distinction

These two terms are often used interchangeably but describe different things. Transparency refers to the availability of information about a system: what data it uses, how it was trained, what its performance metrics are, who is accountable for it. You can be fully transparent about a system without being able to explain any particular decision it makes. Explainability refers to the ability to give a meaningful account of why a specific output was produced for a specific input. A decision tree is both transparent and explainable. A large neural network may be fully transparent (its architecture and weights disclosed) and still not explainable — no one can reliably trace why input X produced output Y.

This distinction matters for policy. The European Union's General Data Protection Regulation (GDPR), adopted in 2016 and applicable since May 2018, restricts decisions made solely by automated means when they significantly affect a person (Article 22), and is widely described as creating a "right to explanation." That phrase appears only in a non-binding recital, not in Article 22 itself. What "explanation" means technically, and whether current AI systems can provide it, is genuinely contested.

GDPR Article 22 — The Right Not to Be Subject to Automated Decisions

The GDPR gives people in the EU the right not to be subject to decisions based solely on automated processing that produce legal or similarly significant effects. Related provisions (Articles 13 to 15) require controllers to provide "meaningful information about the logic involved." The precise scope of these rights, whether they require full explainability or merely a high-level summary, has been disputed in courts and regulatory guidance since 2018, and interpretations vary across member states.

The Interpretability-Accuracy Trade-off

For much of machine learning history, there was a rough empirical trade-off: models that were highly interpretable (logistic regression, decision trees) were less accurate than models that were opaque (deep neural networks, ensemble methods). This gave organizations a financial incentive to choose accuracy over explainability and a convenient technical rationale for doing so. Research from 2018–2024 has complicated this picture significantly. Rudin et al. have demonstrated that for many high-stakes structured data problems — exactly the domain where explainability matters most, such as criminal risk, medical diagnosis, and credit scoring — interpretable models can achieve accuracy comparable to black-box models. The trade-off, in many real cases, is smaller than claimed.

This matters because "we can't explain it without losing accuracy" is frequently offered as the final word on transparency requests. The evidence suggests this answer is often wrong — and that when it is offered in high-stakes contexts, it should be treated with skepticism.

Post-hoc explanation: An approximation of a complex model's behavior, generated after the fact by a separate interpretability tool (e.g., LIME, SHAP). Describes which input features influenced the output, but is an approximation, not a direct causal account.

Inherently interpretable model: A model whose decision logic can be directly read and understood by a human (e.g., a decision tree, scorecard, or constrained linear model). No separate explanation tool required.

Algorithmic audit: Independent examination of an AI system's inputs, logic, and outputs to assess accuracy, fairness, and compliance. The Gender Shades study was an external audit; some jurisdictions now require audits before deployment in high-stakes domains.
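What "no separate explanation tool required" means in practice is easiest to see in a scorecard. The sketch below is a hypothetical credit scorecard: the rules, point values, cutoff, and field names are invented for illustration and do not describe any real lender's model. The point is that the list of rules that fired is the explanation, because it is the whole computation.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    description: str
    applies: Callable[[dict], bool]
    points: int


# Hypothetical rules and weights; a real scorecard would be fitted and validated.
RULES = [
    Rule("two or more missed payments in the past year",
         lambda a: a["missed_payments"] >= 2, 3),
    Rule("less than one year at current job",
         lambda a: a["years_employed"] < 1, 2),
    Rule("debt above half of annual income",
         lambda a: a["debt"] > 0.5 * a["income"], 2),
]
DECLINE_AT = 4  # assumed cutoff: decline at this many risk points or more


def decide(applicant: dict) -> tuple[str, list[str]]:
    """Return the decision together with every rule that contributed to it."""
    fired = [rule for rule in RULES if rule.applies(applicant)]
    total = sum(rule.points for rule in fired)
    decision = "decline" if total >= DECLINE_AT else "approve"
    reasons = [f"+{rule.points} points: {rule.description}" for rule in fired]
    return decision, reasons


applicant = {"missed_payments": 2, "years_employed": 0.5,
             "debt": 10_000, "income": 40_000}
decision, reasons = decide(applicant)
print(decision)
for reason in reasons:
    print(" ", reason)
```

A person declined by this model can see exactly which facts counted against them and dispute any that are wrong. A post-hoc tool applied to an opaque model can only estimate that list.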

Who Has the Right to an Explanation, and Who Has the Duty to Provide One?

The due process cases from Idaho, Arkansas, and other states establish a legal answer for government benefit decisions: when a public agency uses an algorithm to cut legally required benefits, the affected person has a constitutional right to a meaningful opportunity to contest the decision. An explanation is a precondition for that contest. You cannot challenge a decision you cannot understand.

The ethical answer is broader. Even in private-sector contexts — a hiring algorithm, a credit decision, an insurance pricing model — people whose life opportunities are affected have a strong moral claim to understanding why. The absence of a legal mandate does not eliminate the ethical obligation. Organizations that deploy consequential AI systems and then invoke trade secrecy to avoid accountability are not behaving neutrally. They are making a choice about whose interests to prioritize — and it is not the person whose life is affected.

Core Principle

Explainability is not merely a technical feature. It is a condition of legitimacy for consequential automated decisions. A decision-maker — human or algorithmic — who cannot explain their reasoning to the person most affected has not met the minimum standard for a fair process.

Lesson 3 Quiz

Five questions · Select the best answer for each

1. In the Idaho Medicaid case, what was the central legal and ethical problem with the algorithmic system?

Correct. The federal appeals court ruled that using an opaque system to cut legally required benefits — without providing recipients an understandable explanation they could contest — violated constitutional due process guarantees.

The core problem was procedural: the algorithm's reasoning was proprietary and unexplained, leaving recipients with no meaningful basis to challenge decisions that drastically reduced their care hours. The court found this violated due process.

2. What is the key difference between transparency and explainability in an AI system?

Correct. A system can be fully transparent — its architecture and weights publicly disclosed — and still not be explainable: no one may be able to trace reliably why a specific input produced a specific output. The two properties are distinct and both matter for accountability.

These are distinct. Transparency refers to disclosing information about a system (data sources, training methodology, performance metrics). Explainability refers to giving a meaningful account of a specific individual decision — which is a harder problem and not automatically achieved by transparency.

3. What does EU GDPR Article 22 establish regarding automated decision-making?

Correct. Article 22 gives people in the EU the right not to be subject to solely automated decisions with significant effects, and related provisions require that controllers provide meaningful information about the logic involved, though the precise technical meaning of "explanation" has been disputed in subsequent regulatory guidance.

Article 22 of the GDPR grants people in the EU the right not to be subject to solely automated decisions with significant legal or similar effects, and the regulation requires controllers to provide meaningful information about the system's logic. It is not a blanket ban, but a meaningful accountability requirement.

4. Research by Rudin et al. on interpretability challenged what common assumption in machine learning?

Correct. Rudin and colleagues demonstrated that for high-stakes structured data tasks, the exact domain where explainability matters most, interpretable models can often match the accuracy of black-box models. That finding challenges the frequent claim that interpretability requires sacrificing performance.

The research challenged the widespread assumption that choosing interpretability necessarily costs accuracy. For many high-stakes structured data problems, interpretable models can match black-box performance — meaning the accuracy trade-off is often overstated as a reason to avoid explainable systems.

5. A "post-hoc explanation" (e.g., LIME or SHAP) for a neural network decision is best described as:

Correct. Post-hoc explanation tools like LIME and SHAP approximate a complex model's behavior by analyzing which features were most influential for a specific output. They are useful but are approximations — not direct causal accounts of the computation — and can themselves be unreliable or gamed.

Post-hoc tools like LIME and SHAP provide approximations: they examine which input features appear to have driven a specific output. They are not direct causal traces of the model's computation, and they can be imprecise or manipulated. Understanding this limitation matters when evaluating explainability claims.

Lab 3 · Arguing for Explainability

Practice making and evaluating the case for transparency in a specific high-stakes AI deployment

Your Task

You have learned the distinction between transparency and explainability, the due process cases from Idaho and Arkansas, the GDPR Article 22 framework, and the research challenging the accuracy-interpretability trade-off. Now you will apply this to a role-play scenario.

The tutor will play the role of a product manager at a company that has just deployed an algorithmic system in a high-stakes domain (hiring, benefits, or credit). Your role is to make the case — ethically and practically — for why the system must be explainable to the people it affects. You must also anticipate and respond to objections.

Ask the tutor to describe the system you'll be arguing about. Then make your case for explainability — and be ready to respond to pushback about trade secrets and accuracy trade-offs.

Explainability Debate Tutor

I'm playing the product manager defending our new automated hiring screening system. Ask me to describe it, then make your case for why it must be explainable to rejected applicants. I'll push back with real objections — trade secrecy, accuracy claims, legal sufficiency. Your job is to hold the ethical line while engaging seriously with the objections.

AI Ethics: Right and Wrong · Lesson 4

Consent, Autonomy, and the Person in the Data

AI systems are built from data about people. What do those people owe the system — and what does the system owe them?

When your data trains a system that makes decisions about others, did you consent to participate in that system — and does consent even solve the problem?

In 2014, a researcher named Aleksandr Kogan built a Facebook quiz app called "thisisyourdigitallife." About 270,000 users installed it and completed the quiz, agreeing in the terms of service to share their Facebook data for academic purposes. What the terms did not prominently disclose, and what Kogan did not adequately explain, was that the app also harvested the data of every Facebook friend of each user who installed it. The resulting dataset contained profiles of approximately 87 million people, nearly all of whom had consented to nothing at all. Kogan sold the dataset to Cambridge Analytica, which used it to build psychographic profiles and micro-targeted political advertising for the 2016 U.S. presidential election and the Brexit referendum. Facebook's data-sharing policies, which allowed third-party apps to access friends' data by default, had transformed 270,000 individual consents into an 87-million-person dataset without the knowledge of roughly 86.7 million of the people in it.

The incident revealed something important and uncomfortable: individual consent is not sufficient protection when data about one person is simultaneously data about others. The friend who did not install the app had consented to share data with Facebook, not with Kogan, not with Cambridge Analytica, and not with political campaigns targeting their psychological vulnerabilities. The consent framework broke at exactly the point where it was needed most.

What Consent Actually Requires

In bioethics — where informed consent has been a legal and ethical standard since the 1947 Nuremberg Code — valid consent requires four elements: disclosure (what will be done and why), comprehension (the person actually understands), voluntariness (no coercion), and competence (the person has the capacity to decide). The terms-of-service model of digital consent fails on at least three of these four dimensions in nearly every deployed case.

A 2008 study by Carnegie Mellon researchers Lorrie Cranor and Aleecia McDonald estimated that reading the privacy policies of every website an average American visits would require approximately 76 work days per year. Consent is meaningless here not because people are lazy but because the system is designed to be unreadable. Genuinely informed consent to data collection and use is the exception, not the norm; what usually stands in for it is a click-through agreement dressed in legal language to look like consent.

The Social Network Problem

Cambridge Analytica illustrated that data is relational. Your data is simultaneously data about your friends, your family, your colleagues, and anyone with whom you have interacted. This means individual consent frameworks are structurally insufficient for social network data — and by extension, for any AI trained on data that describes human relationships. The person who did not join the platform still appears in it.
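The relational point can be made concrete with a toy example. The sketch below is a minimal illustration, assuming a hypothetical six-person friendship graph and a hypothetical helper named `exposed_people`; none of the names or data come from the actual Cambridge Analytica dataset. It shows how a single consent can expose everyone connected to the person who gave it.

```python
from __future__ import annotations


def exposed_people(friends: dict[str, set[str]], consenting: set[str]) -> set[str]:
    """Return everyone whose data becomes reachable: consenting users plus their friends."""
    exposed = set(consenting)
    for user in consenting:
        # An app with friend-level access sees each installer's whole friend list.
        exposed |= friends.get(user, set())
    return exposed


# Hypothetical toy friendship graph (illustrative names only).
friends = {
    "ana": {"ben", "chi", "dev"},
    "ben": {"ana", "eli"},
    "chi": {"ana"},
    "dev": {"ana", "fay"},
    "eli": {"ben"},
    "fay": {"dev"},
}

consenting = {"ana"}  # only one person installs the app and agrees to the terms
exposed = exposed_people(friends, consenting)
never_asked = exposed - consenting

print(sorted(never_asked))  # ['ben', 'chi', 'dev']
```

In this toy graph one consent exposes three people who were never asked. The same structure, at the scale of a real social network, is how 270,000 consents became a dataset of roughly 87 million profiles.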

Autonomy Beyond Consent

Even when consent is genuine and informed, it does not resolve all autonomy concerns. Consider predictive systems: a loan algorithm that uses your past spending behavior, your zip code, and your social connections to infer your future behavior is, in a meaningful sense, overriding your autonomy — deciding what you will do before you have done it, and then making it harder to do otherwise. You might have consented to share spending data. You did not consent to have your creditworthiness determined by inferences about your social network's spending habits.

The philosopher Martha Nussbaum's capabilities approach offers one useful frame: people have the right to the practical conditions necessary for self-directed lives. An AI system that forecloses opportunities based on predictions about group membership — rather than individual choices — undermines that autonomy regardless of whether any technical consent was given. The ethical question is not only "did the person agree?" but "does this system treat them as a self-directing agent, or as a data point to be acted upon?"

Contextual integrity: A framework developed by Helen Nissenbaum, which holds that an information flow is appropriate when it matches the norms of the context in which the information was originally shared. Medical data shared with a doctor flows appropriately to other treating physicians; it does not flow appropriately to an insurance underwriter. Most data-driven AI violates contextual integrity by repurposing data across contexts.

Data minimization: A GDPR principle (Article 5) requiring that personal data be adequate, relevant, and limited to what is necessary for the stated purpose. Collecting and retaining data "because it might be useful later" violates this principle and increases privacy risk.

Differential privacy: A mathematical privacy technique that adds carefully calibrated noise to statistical outputs (or to the data behind them) so that the results reveal almost nothing about whether any one individual's record was included. Apple and the U.S. Census Bureau have implemented versions of differential privacy to allow aggregate analysis without exposing individual records.
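One common way to realize the differential privacy idea is the Laplace mechanism: compute the true statistic, then add noise whose scale is the query's sensitivity divided by a privacy parameter epsilon. The sketch below is a minimal illustration under stated assumptions: the query is a simple count (so adding or removing one record changes it by at most 1), the records are made up, and the function names `laplace_noise` and `private_count` are hypothetical. It is not a description of how Apple or the Census Bureau implement the technique.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponential draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of records matching predicate (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    # Assumption: a counting query has sensitivity 1, because adding or
    # removing one person's record changes the count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical records: (age, has_condition). Illustrative data only.
records = [(34, True), (51, False), (29, True), (62, True), (45, False)]

rng = random.Random(42)  # seeded only so the example is repeatable
noisy = private_count(records, lambda r: r[1], epsilon=0.5, rng=rng)
print(f"true count = 3, released count = {noisy:.2f}")
```

A smaller epsilon means more noise and stronger privacy; a larger epsilon means a more accurate but more revealing answer. Choosing epsilon is therefore a values decision about how much accuracy to trade for protection, not a purely technical one.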

Structural Solutions Beyond Individual Consent

Because individual consent cannot bear the ethical weight placed on it, researchers and regulators have proposed structural alternatives. Data trusts — independent bodies that hold data on behalf of communities and negotiate data-use agreements with third parties — allow collective representation that no individual consent form can achieve. Algorithmic impact assessments, analogous to environmental impact assessments, would require organizations to evaluate systemic effects before deployment. The EU AI Act, passed in 2024, mandates risk assessments and conformity checks for high-risk AI systems before they reach market — a structural intervention that does not rely on any individual's ability to read a terms-of-service document.

The Cambridge Analytica episode ended in a $5 billion FTC fine against Facebook in 2019 — the largest in the commission's history at that point. Cambridge Analytica itself declared bankruptcy in 2018. The fine did not undo the psychological profiling of 87 million people or its downstream effects on electoral processes. Individual recourse after the fact is not a substitute for structural protection before it.

Core Principle

Consent is a floor, not a ceiling, for ethical data practice. A system that satisfies legal consent requirements can still violate autonomy, misuse contextual expectations, harm third parties who gave no consent, and produce outcomes that no reasonable person would have agreed to had the full consequences been explained. The ethical standard must be higher than the legal minimum.

Lesson 4 Quiz

Five questions · Select the best answer for each

1. Approximately how many people's data did Cambridge Analytica ultimately access through the "thisisyourdigitallife" app, despite only about 270,000 installing it?

Correct. By accessing the Facebook friend networks of each app installer, Kogan collected data on approximately 87 million people — the vast majority of whom had consented to nothing. This demonstrated that individual consent frameworks are structurally insufficient when data is relational.

The dataset reached approximately 87 million people, because Facebook's policies allowed third-party apps to access not just the installing user's data but also the data of all their Facebook friends — none of whom had consented to participate.

2. According to the 2008 Carnegie Mellon study, how long would it take an average American to read the privacy policies of all websites they visit annually?

Correct. The Cranor and McDonald estimate — 76 work days per year — illustrated that the consent framework is not failing because users are inattentive. It is failing because the volume of policies is structurally unreadable, making "informed consent" to data practices a legal fiction for most people.

The study estimated approximately 76 work days per year — not a reflection of user negligence, but evidence that the terms-of-service consent model is structurally incompatible with genuine informed consent at internet scale.

3. Helen Nissenbaum's concept of "contextual integrity" holds that:

Correct. Contextual integrity holds that what matters is not just whether data is shared, but whether it is shared in a way that matches the norms of the context in which it was originally disclosed. Medical data shared with a physician flows appropriately to other treating doctors — not to insurance underwriters, even if the data itself is identical.

Contextual integrity says the ethical question is whether a data flow matches the norms of its original context — not just whether any sharing agreement exists. Data shared in one context (a doctor's visit) carries norms that are violated when it is repurposed in another context (insurance underwriting) without appropriate expectations.

4. What was the FTC fine imposed on Facebook in 2019 related to the Cambridge Analytica incident?

Correct. The $5 billion fine was the largest in FTC history at that time. Critics noted that even this substantial penalty did not undo the data harvesting, the psychographic profiling, or its potential effects on electoral outcomes — highlighting why after-the-fact fines are an inadequate substitute for structural pre-deployment protections.

The FTC imposed a $5 billion fine — the largest in its history at that point. The lesson drawn in the course is that this did not reverse the data collection or its downstream effects, illustrating why structural pre-deployment protections matter more than retrospective penalties.

5. Which of the following best describes why individual consent is structurally insufficient for social-network data?

Correct. Social network data is relational: your posts, messages, and connections contain information about people who did not consent to share it. This structural property means that even robust individual consent cannot protect non-participants — requiring collective or structural approaches beyond individual terms-of-service agreements.

The fundamental issue is structural: data is relational. When 270,000 people consented to share their Facebook data, they inadvertently exposed the data of 87 million friends who made no such agreement. Individual consent cannot protect people who never had the opportunity to consent.

Lab 4 · Designing for Consent and Autonomy

Apply contextual integrity and structural consent principles to a real-world AI data scenario

Your Task

You have studied the Cambridge Analytica case, the structural limits of individual consent, contextual integrity, and structural alternatives such as data trusts and algorithmic impact assessments. In this lab, you will take on the role of an ethics reviewer examining a proposed AI data collection and training plan.

The tutor will describe a product team's data collection and training proposal. You will evaluate it for: (1) whether consent is genuine and informed, (2) whether contextual integrity is respected, (3) which populations are exposed to risk without their knowledge, and (4) what structural safeguards you would require before approving the proposal.

Ask the tutor to present you with the data collection proposal you'll be reviewing. Be specific — name the contextual integrity violations and propose concrete structural safeguards, not vague recommendations.

Data Ethics Review Tutor

I'm ready to present a data collection and AI training proposal for your ethics review. Ask me for the proposal, then evaluate it systematically: genuine consent, contextual integrity, unexposed third-party risks, and structural safeguards you'd require. I'll challenge you to be more specific wherever your recommendations are vague. Let's begin.

AI Ethics: Right and Wrong · Module 1

Module Test

15 questions covering all four lessons · 80% required to pass

1. Which of the following best describes why "the algorithm decided" is an ethically insufficient answer to a harm produced by an AI system?

Correct. Algorithms do not exist independently of the humans who design them, select their training data, define their objectives, and deploy them in specific contexts. Each of those steps involves choices that carry moral weight.

The key issue is that every stage of AI development — design, data selection, training objective, deployment context — involves human choices that carry moral agency. Responsibility can be traced through those choices, even when the system operates automatically.

2. The SyRI system in the Netherlands was shut down in 2020 primarily because:

Correct. The Dutch court found that secret scoring of citizens using seventeen categories of personal data, with no notification or right of contest, violated Article 8 of the European Convention on Human Rights.

The court's ruling centered on Article 8 of the European Convention on Human Rights — the right to private life. Citizens were scored on extensive personal data without their knowledge or ability to contest the result, which the court found incompatible with that right.

3. Consequentialism would evaluate an AI hiring system primarily by asking:

Correct. Consequentialism evaluates actions by their outcomes — maximizing welfare or minimizing harm across all affected parties. A consequentialist analysis of a hiring algorithm would tally costs and benefits across applicants, employers, and society.

The consequentialist question is always about outcomes and their distribution across affected parties. The other options describe deontological (rules and duties), virtue ethics (character), and legal compliance approaches respectively.

4. ProPublica's 2016 COMPAS investigation found that Black defendants were falsely flagged as high risk at what rate compared to white defendants?

Correct. ProPublica found that Black defendants who did not go on to reoffend were falsely labeled high risk at nearly twice the rate of comparable white defendants, a disparity that sparked a major debate about how to define algorithmic fairness.

ProPublica's analysis found Black defendants were falsely flagged at nearly twice the rate of white defendants — a finding the system's maker disputed by pointing to a different fairness metric, illustrating that the choice of fairness definition is inherently a values question.

5. "Aggregation bias" in an AI pipeline occurs when:

Correct. Aggregation bias occurs when a model trained on aggregate population data performs well on average but poorly for subgroups with distinct patterns — for example, a diabetes prediction model that is accurate overall but unreliable for specific ethnic groups with different physiological characteristics.

Aggregation bias arises when one model is built for a heterogeneous population and high average accuracy masks poor performance for specific subgroups. A diabetes model accurate "on average" may fail systematically for particular groups whose patterns differ from the population mean.

6. The mathematical fairness impossibility result (Kleinberg et al., 2016) implies that:

Correct. The impossibility result shows that the choice between competing fairness definitions is not a technical question with a correct answer — it is a normative question about whose risks matter more, which requires ethical reasoning and ideally democratic input from affected communities.

The result does not license inaction — it shows that the choice of which fairness criterion to optimize is inherently a values decision. That decision cannot be made by the algorithm; it must be made by humans through explicit ethical reasoning and ideally with input from those most affected.

7. The difference between "transparency" and "explainability" in an AI system is:

Correct. A neural network can be fully transparent — its architecture, weights, and training data disclosed — while remaining unexplainable: no one can reliably trace why a specific input produced a specific output. Both properties are important but they are distinct.

These are distinct. Transparency concerns the availability of information about a system. Explainability concerns the ability to give a meaningful account of a specific decision for a specific person — a harder problem that transparency alone does not solve.

8. Federal courts in the Idaho Medicaid home care case ruled that cutting benefits through an opaque algorithmic system without explanation violated:

Correct. The federal appeals court found that cutting legally required benefits through a system whose reasoning recipients could not access or contest violated due process — because a meaningful opportunity to challenge a government decision requires understanding why it was made.

The court ruled on constitutional due process grounds: when a government agency uses an opaque system to cut legally mandated benefits, the affected person's right to a meaningful opportunity to contest that decision is violated if the decision cannot be explained.

9. The Gender Shades study is significant to AI ethics primarily because it demonstrated:

Correct. The study's key contribution was showing that companies had been deploying commercial systems without auditing for demographic performance gaps — and that aggregate accuracy metrics had concealed error rate disparities of more than 34 percentage points for darker-skinned women.

The study showed that a commercial system could be 99.7% accurate for lighter-skinned men (IBM's result) while performing at only 65.3% for darker-skinned women, a gap of 34.4 percentage points. Aggregate performance metrics are therefore insufficient for evaluating equitable AI deployment.

10. Which of the following is a core requirement for valid informed consent under the bioethics standard established post-Nuremberg Code?

Correct. The four-part bioethics standard requires: disclosure (what will happen and why), comprehension (the person actually understands), voluntariness (no coercion), and competence (capacity to decide). Most digital terms-of-service consent fails on at least three of these four dimensions.

Valid informed consent requires all four elements: disclosure of what will be done and why, genuine comprehension by the person consenting, voluntariness (no coercion or undue pressure), and competence to make the decision. Standard click-through consent typically fails on comprehension and voluntariness at minimum.

11. The "problem of many hands," as Helen Nissenbaum applied it to computing, is most relevant to AI ethics because it explains:

Correct. The problem of many hands describes how harm emerges when engineers, product managers, executives, and legal teams each make individually defensible choices that together produce an outcome no single person designed — making it possible for everyone to deny responsibility while the harm persists.

Nissenbaum's concept explains how harmful AI outcomes can emerge from chains of individually defensible decisions — each team member following their role appropriately — so that moral responsibility is diffused across many actors and no single person can be held fully accountable.

12. What does the principle of "contextual integrity" require for ethical data flows?

Correct. Contextual integrity holds that the appropriateness of a data flow depends on whether it matches the norms of the context in which the data was originally disclosed — not just whether any formal permission exists. Medical data shared with a physician flows appropriately to other treating doctors, not to insurance underwriters.

Contextual integrity requires that data flows respect the norms of the context where data was originally shared. Medical data shared in a clinical context carries different expectations than the same data shared in an insurance context — and violating those expectations is an ethical harm even if no law is broken.

13. Amazon's résumé-screening tool is an example of which type of bias?

Correct. The system learned from historical hiring data that reflected a predominantly male technical workforce. By treating past successful hires as the standard for future hiring, it replicated the historical gender imbalance without any explicit instruction to do so.

This is a case of historical bias: the training data accurately described ten years of hiring outcomes that happened to be heavily male-dominated. The system learned to reproduce those patterns because they were the ground truth it was optimizing toward.

14. The EU AI Act (2024) addresses the limits of individual consent in AI contexts primarily by:

Correct. The EU AI Act takes a structural approach: rather than relying on individual consent, it requires that high-risk AI systems undergo risk assessments and conformity checks before reaching the market — a pre-deployment safeguard that does not depend on individuals' ability to protect themselves through consent agreements.

The AI Act takes a structural approach: requiring risk assessments and conformity checks before deployment of high-risk systems. This shifts protection from the individual-consent model to systemic pre-market oversight, recognizing that individuals cannot protect themselves from complex AI harms through consent alone.

15. Which of the following statements best captures the core ethical principle across all four lessons of this module?

Correct. This is the through-line of the module: every technical decision in the AI development pipeline — what data to use, what to optimize for, which fairness definition to adopt, how much transparency to provide — is an ethical decision with real consequences for real people. Responsibility for those decisions cannot be avoided by pointing at the system.

The module's through-line is that AI systems embed values at every stage of development and deployment, and that moral responsibility for those values cannot be transferred to the algorithm. The people who design, train, and deploy systems remain accountable for the choices embedded in those systems.