In 1844, Samuel Morse tapped the first long-distance telegraph message from Washington to Baltimore: "What hath God wrought?" The question was half-celebratory, half-genuinely terrified. Within fifteen years the telegraph had reorganized financial markets, military command, and journalism, and had also enabled a new wave of fraud, stock manipulation, and surveillance that legislators scrambled to address decades after the damage was done. The pattern was consistent: the technology arrived, the benefits were obvious and immediate, and the harms emerged slowly, unevenly, and mostly to people who had no hand in building the system.
Today that pattern is repeating at machine speed. Between 2022 and 2024, large language models moved from academic curiosity to tools embedded in hiring pipelines, medical triage software, criminal risk-scoring systems, and school assessments. A hiring algorithm built by Amazon, trained on ten years of male-dominated tech résumés, began systematically downgrading applications from women; the problem was caught internally, the project was quietly shelved, and the story became public in 2018. A healthcare resource-allocation algorithm sold by Optum, of a kind applied to roughly 200 million Americans each year, was found in 2019 to assign lower risk scores to Black patients than to white patients who were equally sick, diverting care away from those who needed it more. The decisions were not made by a person intending harm. They emerged from choices (about what data to use, which outcomes to optimize, whose interests to weigh) that seemed purely technical at the time.
This course is about learning to see those choices for what they are: ethical decisions disguised as engineering ones. You will not finish it with a universal formula for right and wrong. No one has one. What you will leave with is a set of frameworks, questions, and historical anchors that let you examine any AI decision (one you encounter, one you are asked to build, one that affects you) and reason about it with rigor and honesty. That is a skill the next decade will demand of almost everyone.
Until early 2020, the Dutch government was using an AI risk-scoring system called SyRI (System Risk Indication) in cities including Rotterdam to flag citizens for welfare fraud investigations. The system ingested seventeen categories of government data: tax records, employment history, debt registers, vehicle ownership, even energy usage. It produced a numerical score. High scorers were referred to human investigators. No one who received a high score was told they had been flagged. No one could appeal a score they didn't know existed. The system operated overwhelmingly in lower-income, ethnically diverse neighborhoods. In February 2020, a Dutch court ruled that SyRI violated the European Convention on Human Rights, specifically Article 8, the right to private life, and ordered it shut down. The government argued the system was merely a neutral tool for allocating investigative resources. The court found that neutrality was a fiction: the choice of which data to weight, and which neighborhoods to deploy in, encoded assumptions that could not be separated from the decision itself.
The SyRI case is not unusual. It is a template. It illustrates something that will recur throughout this course: AI systems do not make value-free decisions. Every system embeds the values of its designers, even when, and especially when, those designers insist they have kept values out of it.
An ethical decision is one that affects the interests, rights, or welfare of people, and where the choice between options is not purely factual. Whether a bridge will hold a certain weight is an engineering question. Whether a welfare algorithm should weigh debt history more heavily than employment history is not. The second question looks technical; it is actually moral. It asks: whose circumstances do we take seriously, and whose do we discount?
AI systems face ethical dimensions at three distinct junctures: design (what problem to solve and how to define success), training (what data to learn from and how to handle its biases), and deployment (who is affected and how their interests are weighed). Ethical analysis is not a checklist appended at the end of development. It is a discipline that must be active at all three stages, which is why it requires a vocabulary.
Moral philosophers have debated the foundations of ethics for more than two millennia. Three frameworks dominate practical applied ethics today, and all three are relevant to AI. None of them provides a simple answer to every case. Each illuminates a different dimension of the same decision.
A consequentialist might ask: did SyRI actually detect more fraud than traditional investigation at lower cost? If yes, and the total harm from fraud exceeded the harm from misidentification, was it justified? A deontologist asks a different question entirely: did citizens have a right to know they were being scored and to contest that scoring? The Dutch court answered that question with a firm yes and shut the system down regardless of its detection rate. A virtue ethicist asks: what kind of government secretly scrutinizes its poorest residents while exempting wealthier ones? What does that say about institutional character?
When an AI system produces a harmful outcome, responsibility does not automatically attach to any single actor, and this diffusion is not accidental. Engineers say they followed specifications. Product managers say they relied on engineers' judgment. Executives say they relied on legal review. Legal teams say the system complied with current law. And the law, as often as not, has not caught up with the technology. The philosopher Helen Nissenbaum, borrowing a term from the political theorist Dennis Thompson, called this "the problem of many hands": harm that emerges from a system no single person designed to be harmful.
Understanding this does not mean no one is responsible. It means responsibility must be traced carefully: upstream to design choices, sideways to deployment decisions, and outward to the institutions that permitted the system to operate without oversight. One of the practical goals of this course is to teach you to trace that chain.
"The algorithm decided" is never a complete answer to an ethical question about an AI system. Algorithms are designed, trained, deployed, and maintained by people operating within institutions that have interests, incentives, and choices. Every step in that chain involves moral agency and moral accountability.
Not every computation is an ethical decision in the relevant sense. A spam filter classifying email as junk is making a decision, but the stakes are low and easily corrected. A risk-scoring algorithm determining which defendants receive bail before trial, as COMPAS does in jurisdictions across the United States, is making a decision with potentially years of someone's life at stake. The ethical weight scales with: (1) the severity of the consequence, (2) the reversibility of the outcome, (3) the vulnerability of those affected, and (4) the availability of recourse.
ProPublica's 2016 investigation into COMPAS found that the system falsely flagged Black defendants as future criminals at nearly twice the rate it falsely flagged white defendants. The company that made COMPAS, Northpointe, disputed the methodology. What the dispute revealed, and this is the lesson, is that the very definition of "fairness" for a risk-scoring system is contested, and the contest is not statistical. It is ethical. You cannot resolve it with more data.
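To make the "falsely flagged" comparison concrete, here is a minimal sketch of how a false positive rate can be computed separately for each group. The records, the group labels "A" and "B", and the function name are invented for illustration; they are not COMPAS data and do not reproduce ProPublica's analysis.

```python
from collections import defaultdict

def false_positive_rate_by_group(records):
    """Return {group: FPR}, where FPR = flagged non-reoffenders / all non-reoffenders.

    Each record is (group, flagged_high_risk, reoffended). All values are hypothetical.
    """
    false_pos = defaultdict(int)  # flagged high risk but did not reoffend
    negatives = defaultdict(int)  # everyone who did not reoffend
    for group, flagged, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if flagged:
                false_pos[group] += 1
    return {g: false_pos[g] / negatives[g] for g in negatives}

# Toy data: both groups have 10 people who did not reoffend,
# but group A has twice as many of them wrongly flagged.
records = (
    [("A", True, False)] * 4 + [("A", False, False)] * 6 + [("A", True, True)] * 5
    + [("B", True, False)] * 2 + [("B", False, False)] * 8 + [("B", True, True)] * 5
)
rates = false_positive_rate_by_group(records)
print(rates)
```

In this toy data the overall number of correct flags is identical for both groups, yet the burden of wrongful flags falls twice as heavily on group A. Whether that disparity counts as "unfair" depends on which error you decide matters most, which is the ethical question the dispute turned on.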
Technical accuracy and ethical soundness are different properties. A system can be accurate on its training metric and still produce ethically unacceptable outcomes. Optimizing for accuracy at predicting recidivism using historically biased arrest data does not produce a fair system. It produces a precise replica of historical injustice.
You have encountered three ethical frameworks (consequentialism, deontology, and virtue ethics) and two real cases: the SyRI welfare fraud system and the COMPAS recidivism algorithm. In this lab, you will practice applying those frameworks to new scenarios and defending your reasoning.
The AI tutor will present you with a scenario and ask you to analyze it through one or more frameworks. There are no trick questions. The goal is to practice structured ethical reasoning, not to arrive at a predetermined answer.
In 2014, Amazon built an automated résumé screening tool designed to take the human bottleneck out of technical hiring. Engineers trained it on ten years of submitted résumés, which reflected, accurately, the fact that Amazon's technical workforce was predominantly male. By 2015 the system was downgrading résumés that contained the word "women's" (as in "women's chess club" or "women's college") and penalizing graduates of all-women's universities. Amazon noticed the problem, tried and failed to correct it, and quietly disbanded the project; the story became public in 2018. The system had not been programmed to discriminate. It had learned to discriminate from data that encoded a decade of discriminatory hiring patterns, and then reproduced those patterns at machine scale.
The engineers called this a data problem. It was, in one sense. But it was also a question problem: when you define "a good hire" as "looks like our past hires," you have already made an ethical choice and disguised it as an engineering one.
Algorithmic bias does not require a biased programmer. It requires only a biased world and data collected from that world. Researchers have identified five points in the AI development pipeline where bias can enter and compound:
1. Historical bias: the world the data describes was already unequal. Arrest records, loan approvals, medical diagnoses: all reflect who was policed, who was given credit, who was taken seriously by doctors.
2. Representation bias: certain groups are underrepresented in training data, so the system performs worse on them. A 2018 study by Joy Buolamwini and Timnit Gebru (the "Gender Shades" paper) found that commercial facial recognition systems from IBM, Microsoft, and Face++ had error rates up to 34 percentage points higher for darker-skinned women than for lighter-skinned men.
3. Measurement bias: the proxy metric chosen to represent the outcome of interest is more accurate for some groups than others. Using credit score as a proxy for creditworthiness, when credit scores are themselves shaped by decades of discriminatory lending, embeds that discrimination in the new system.
4. Aggregation bias: building a single model for a diverse population obscures important within-group variation. A diabetes prediction model trained on aggregate data may perform well on average and poorly for specific ethnic groups with distinct physiological patterns.
5. Deployment bias: a system built for one context is applied in another where its assumptions no longer hold. A hiring tool trained on one industry's norms applied to another sector will carry its first industry's assumptions invisibly.
Joy Buolamwini at MIT and Timnit Gebru, then at Microsoft Research, published a landmark audit of commercial facial recognition systems in 2018. IBM's system had a 99.7% accuracy rate on lighter-skinned men and a 65.3% accuracy rate on darker-skinned women, a gap of 34.4 percentage points. Microsoft's showed a similar pattern. The companies did not dispute the results. IBM improved its system within months. What the study demonstrated was that companies had been deploying systems they had not audited across demographic groups, on the assumption that high overall accuracy meant universal reliability.
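The gap between "high overall accuracy" and "universal reliability" is easy to see with a little arithmetic. The sketch below takes the two IBM accuracy figures quoted above and combines them under a hypothetical test-set composition (90% lighter-skinned men, 10% darker-skinned women). That 90/10 split is an assumption chosen for illustration, not the composition of any real benchmark, and the function name is invented.

```python
def overall_accuracy(group_accuracy, group_share):
    """Weighted average of per-group accuracy by each group's share of the test set."""
    return sum(group_accuracy[g] * group_share[g] for g in group_accuracy)

# Per-group accuracies are the IBM figures quoted in the text.
group_accuracy = {"lighter-skinned men": 0.997, "darker-skinned women": 0.653}
# Hypothetical test-set composition: an assumption for illustration only.
group_share = {"lighter-skinned men": 0.90, "darker-skinned women": 0.10}

headline = overall_accuracy(group_accuracy, group_share)
gap = group_accuracy["lighter-skinned men"] - group_accuracy["darker-skinned women"]

print(f"overall accuracy: {headline:.1%}")
print(f"gap between groups: {gap * 100:.1f} percentage points")
```

Under that assumed split the headline figure comes out above 96 percent, even though the system fails roughly one darker-skinned woman in three. A single aggregate number cannot reveal the disparity; only a disaggregated audit can.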
In 2016, in direct response to the COMPAS controversy, researchers Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan proved something that has since reshaped the field: under realistic conditions, it is mathematically impossible to satisfy all common definitions of algorithmic fairness simultaneously. You can have a system where the false positive and false negative rates are equal across groups, or one where the positive predictive value is equal across groups, but not both, unless the predictor is perfect or base rates in the population are equal across groups. In historically unequal societies, base rates typically are not equal.
This is not a technical problem awaiting a better algorithm. It is a values problem dressed in mathematical clothes. Choosing which fairness criterion to satisfy is a choice about whose risks matter more, and that choice belongs in the domain of ethics and democratic deliberation, not engineering alone.
Choosing a fairness definition is not a neutral technical act. It is a decision about which harms are acceptable and to whom. That decision requires explicit ethical justification, and it should involve, at minimum, the communities most likely to be affected by the system.
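The arithmetic behind the impossibility result can be checked directly. The sketch below relies on an identity that follows from the definitions of the confusion-matrix rates; it appears in Alexandra Chouldechova's closely related 2017 analysis rather than in the text above. The base rates, predictive value, and true positive rate are hypothetical numbers chosen for illustration, and the function name is invented.

```python
def implied_false_positive_rate(base_rate, ppv, tpr):
    """False positive rate forced by the other three quantities.

    Identity: FPR = base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr,
    where tpr = 1 - false negative rate.
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr

# Hypothetical: the same positive predictive value and true positive rate
# for both groups, but different base rates.
ppv, tpr = 0.6, 0.7
fpr_group_a = implied_false_positive_rate(base_rate=0.5, ppv=ppv, tpr=tpr)
fpr_group_b = implied_false_positive_rate(base_rate=0.3, ppv=ppv, tpr=tpr)

print(round(fpr_group_a, 3), round(fpr_group_b, 3))
```

Once predictive value and true positive rate are held equal, the false positive rate is no longer free: it is determined by the base rate. With unequal base rates, the group with the higher base rate necessarily absorbs more false positives. No amount of tuning escapes this; you can only choose which quantity to let differ.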
The impossibility result does not license passivity. Several concrete interventions reduce bias without requiring its elimination: pre-processing methods that transform training data to reduce historical disparities before training begins; in-processing constraints that build fairness criteria directly into the optimization objective; post-processing thresholds that adjust decision boundaries differently for different groups to equalize error rates. Each of these has trade-offs. Each requires an explicit choice about which trade-off to accept.
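Of the three interventions, post-processing is the simplest to illustrate. The sketch below picks a separate decision threshold for each group so that false positive rates match a chosen target. The scores, labels, group names, target of 0.25, and function names are all hypothetical; real systems use far more data and more careful methods.

```python
def false_positive_rate(scores, labels, threshold):
    """Share of true negatives (label 0) scored at or above the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s >= threshold for s in negatives) / len(negatives)

def pick_threshold(scores, labels, target_fpr):
    """Lowest observed score whose false positive rate does not exceed target_fpr."""
    for t in sorted(set(scores)):
        if false_positive_rate(scores, labels, t) <= target_fpr:
            return t
    return float("inf")  # no observed score meets the target, so flag no one

# Hypothetical risk scores; label 1 means the predicted outcome actually occurred.
groups = {
    "A": {"scores": [0.2, 0.5, 0.6, 0.7, 0.8, 0.9], "labels": [0, 0, 0, 0, 1, 1]},
    "B": {"scores": [0.1, 0.2, 0.3, 0.6, 0.5, 0.9], "labels": [0, 0, 0, 0, 1, 1]},
}

# One shared threshold of 0.5 produces unequal false positive rates.
shared = {g: false_positive_rate(d["scores"], d["labels"], 0.5) for g, d in groups.items()}

# Group-specific thresholds chosen to bring each group's FPR down to the target.
thresholds = {g: pick_threshold(d["scores"], d["labels"], 0.25) for g, d in groups.items()}
adjusted = {
    g: false_positive_rate(d["scores"], d["labels"], thresholds[g])
    for g, d in groups.items()
}

print(shared, thresholds, adjusted)
```

Equalizing one error rate this way means applying different cut-offs to different groups, and it may shift other error rates in the process. That is the trade-off in miniature: the code can enforce whichever criterion you choose, but it cannot choose the criterion for you.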
Perhaps more importantly: the decision about whether to deploy a system at all is always available. Amazon disbanded its résumé tool. That decision (not building a better tool, but declining to deploy a harmful one) is also an ethical choice, and often the correct one.
You have learned five points where bias enters an AI pipeline: historical bias, representation bias, measurement bias, aggregation bias, and deployment bias. You have also learned three intervention approaches: pre-processing, in-processing, and post-processing.
The tutor will describe a scenario involving a biased AI system. Your job is to: (1) identify which type(s) of bias are present, (2) explain the mechanism by which harm is produced, and (3) propose an intervention and defend it.
In 2011, a woman named Jeanette Yarger received notice that her Medicaid-funded home care hours had been cut, from twenty hours per week to less than seven. She had cerebral palsy. Without those hours, she could not live independently. The state of Idaho had implemented a new algorithmic assessment system to allocate home care hours, and it had flagged her case for reduction. When her legal advocates asked state officials to explain the algorithm's reasoning, they were told the methodology was proprietary. The vendor refused to disclose the logic. Yarger sued. In 2016, a federal appeals court ruled in her favor: the state had violated due process by cutting benefits through an opaque system that recipients could not meaningfully contest.
The same pattern emerged in Arkansas in 2016, in New York in 2018, and in Louisiana in 2019: all involving algorithmic benefit-assessment systems, all generating unexplained reductions in care hours for people with disabilities, all ultimately challenged in court on due process grounds. The explainability problem was not academic. It was the difference between living independently and institutional care.
These two terms are often used interchangeably but describe different things. Transparency refers to the availability of information about a system: what data it uses, how it was trained, what its performance metrics are, who is accountable for it. You can be fully transparent about a system without being able to explain any particular decision it makes. Explainability refers to the ability to give a meaningful account of why a specific output was produced for a specific input. A decision tree is both transparent and explainable. A large neural network may be fully transparent (its architecture and weights disclosed) and still not explainable: no one can reliably trace why input X produced output Y.
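To see why a decision tree counts as explainable, consider a toy version written out by hand. The sketch below returns not only a decision but the exact branch conditions that produced it. The feature names, cut-offs, and outcomes are invented for illustration and do not describe any real assessment system.

```python
def assess(applicant):
    """Toy decision tree: returns (decision, reasons), one reason per branch taken."""
    reasons = []
    if applicant["months_employed"] >= 12:
        reasons.append("months_employed >= 12")
        if applicant["missed_payments"] == 0:
            reasons.append("missed_payments == 0")
            return "approve", reasons
        reasons.append("missed_payments > 0")
        return "manual review", reasons
    reasons.append("months_employed < 12")
    return "manual review", reasons

decision, reasons = assess({"months_employed": 30, "missed_payments": 2})
print(decision, reasons)
```

Every output comes with a complete account of why it was produced, and the affected person can contest any step in that account. A large neural network offers no equivalent list of reasons, even when all of its weights are published.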
This distinction matters for policy. The European Union's General Data Protection Regulation (GDPR), which took effect in 2018, restricts decisions based solely on automated processing under Article 22 and is widely read as creating a right to an explanation when such a decision significantly affects a person. What "explanation" means technically, and whether current AI systems can provide it, is genuinely contested.
The GDPR gives EU residents the right not to be subject to decisions made solely by automated processing that produce "legal or similarly significant effects." Controllers must provide "meaningful information about the logic involved." The precise scope of this right (whether it requires full explainability, or merely a high-level summary) has been disputed in courts and regulatory guidance since 2018, and interpretations vary across member states.
For much of machine learning history, there was a rough empirical trade-off: models that were highly interpretable (logistic regression, decision trees) were less accurate than models that were opaque (deep neural networks, ensemble methods). This gave organizations a financial incentive to choose accuracy over explainability and a convenient technical rationale for doing so. Research from 2018 to 2024 has complicated this picture significantly. Rudin et al. have demonstrated that for many high-stakes structured data problems (exactly the domain where explainability matters most, such as criminal risk, medical diagnosis, and credit scoring) interpretable models can achieve accuracy comparable to black-box models. The trade-off, in many real cases, is smaller than claimed.
This matters because "we can't explain it without losing accuracy" is frequently offered as the final word on transparency requests. The evidence suggests this answer is often wrong, and that when it is offered in high-stakes contexts, it should be treated with skepticism.
The due process cases from Idaho, Arkansas, and other states establish a legal answer for government benefit decisions: when a public agency uses an algorithm to cut legally-required benefits, the affected person has a constitutional right to a meaningful opportunity to contest the decision. An explanation is a precondition for that contest. You cannot challenge a decision you cannot understand.
The ethical answer is broader. Even in private-sector contexts (a hiring algorithm, a credit decision, an insurance pricing model) people whose life opportunities are affected have a strong moral claim to understanding why. The absence of a legal mandate does not eliminate the ethical obligation. Organizations that deploy consequential AI systems and then invoke trade secrecy to avoid accountability are not behaving neutrally. They are making a choice about whose interests to prioritize, and it is not the person whose life is affected.
Explainability is not merely a technical feature. It is a condition of legitimacy for consequential automated decisions. A decision-maker, human or algorithmic, who cannot explain their reasoning to the person most affected has not met the minimum standard for a fair process.
You have learned the distinction between transparency and explainability, the due process cases from Idaho and Arkansas, the GDPR Article 22 framework, and the research challenging the accuracy-interpretability trade-off. Now you will apply this to a role-play scenario.
The tutor will play the role of a product manager at a company that has just deployed an algorithmic system in a high-stakes domain (hiring, benefits, or credit). Your role is to make the case, ethically and practically, for why the system must be explainable to the people it affects. You must also anticipate and respond to objections.
In 2014, a researcher named Aleksandr Kogan built a Facebook quiz app called "thisisyourdigitallife." About 270,000 users installed it and completed the quiz, agreeing in the terms of service to share their Facebook data for academic purposes. What the terms did not prominently disclose, and what Kogan did not adequately explain, was that the app also harvested the data of every Facebook friend of each user who installed it. The ultimate dataset contained profiles of approximately 87 million people, nearly all of whom had consented to nothing at all. Kogan sold the dataset to Cambridge Analytica, which used it to build psychographic profiles and micro-targeted political advertising for the 2016 U.S. presidential election and, by some contested accounts, the Brexit referendum. Facebook's data-sharing policies, which allowed third-party apps to access friends' data by default, had transformed 270,000 individual consents into an 87-million-person dataset without the knowledge of 86.7 million of the people in it.
The incident revealed something important and uncomfortable: individual consent is not sufficient protection when data about one person is simultaneously data about others. The friend who did not install the app had consented to share data with Facebook, not with Kogan, not with Cambridge Analytica, and not with political campaigns targeting their psychological vulnerabilities. The consent framework broke at exactly the point where it was needed most.
In bioethics, where informed consent has been a legal and ethical standard since the 1947 Nuremberg Code, valid consent requires four elements: disclosure (what will be done and why), comprehension (the person actually understands), voluntariness (no coercion), and competence (the person has the capacity to decide). The terms-of-service model of digital consent fails on at least three of these four dimensions in nearly every deployed case.
A 2008 study by Carnegie Mellon researchers Lorrie Cranor and Aleecia McDonald estimated that reading the privacy policies of every website an average American visits would require approximately 76 work days per year. The consent is not meaningless because people are lazy. It is meaningless because the system is designed to be unreadable. Informed consent to data collection and use is not the norm; it is the exception, dressed in legal language to appear otherwise.
Cambridge Analytica illustrated that data is relational. Your data is simultaneously data about your friends, your family, your colleagues, and anyone with whom you have interacted. This means individual consent frameworks are structurally insufficient for social network data, and by extension, for any AI trained on data that describes human relationships. The person who did not join the platform still appears in it.
Even when consent is genuine and informed, it does not resolve all autonomy concerns. Consider predictive systems: a loan algorithm that uses your past spending behavior, your zip code, and your social connections to infer your future behavior is, in a meaningful sense, overriding your autonomy: deciding what you will do before you have done it, and then making it harder to do otherwise. You might have consented to share spending data. You did not consent to have your creditworthiness determined by inferences about your social network's spending habits.
The philosopher Martha Nussbaum's capabilities approach offers one useful frame: people have the right to the practical conditions necessary for self-directed lives. An AI system that forecloses opportunities based on predictions about group membership (rather than individual choices) undermines that autonomy regardless of whether any technical consent was given. The ethical question is not only "did the person agree?" but "does this system treat them as a self-directing agent, or as a data point to be acted upon?"
Because individual consent cannot bear the ethical weight placed on it, researchers and regulators have proposed structural alternatives. Data trusts, independent bodies that hold data on behalf of communities and negotiate data-use agreements with third parties, allow collective representation that no individual consent form can achieve. Algorithmic impact assessments, analogous to environmental impact assessments, would require organizations to evaluate systemic effects before deployment. The EU AI Act, passed in 2024, mandates risk assessments and conformity checks for high-risk AI systems before they reach market, a structural intervention that does not rely on any individual's ability to read a terms-of-service document.
The Cambridge Analytica episode ended in a $5 billion FTC fine against Facebook in 2019, the largest in the commission's history at that point. Cambridge Analytica itself declared bankruptcy in 2018. The fine did not undo the psychological profiling of 87 million people or its downstream effects on electoral processes. Individual recourse after the fact is not a substitute for structural protection before it.
Consent is a floor, not a ceiling, for ethical data practice. A system that satisfies legal consent requirements can still violate autonomy, misuse contextual expectations, harm third parties who gave no consent, and produce outcomes that no reasonable person would have agreed to had the full consequences been explained. The ethical standard must be higher than the legal minimum.
You have studied the Cambridge Analytica case, the structural limits of individual consent, contextual integrity, and structural alternatives such as data trusts and algorithmic impact assessments. In this lab, you will take on the role of an ethics reviewer examining a proposed AI data collection and training plan.
The tutor will describe a product team's data collection and training proposal. You will evaluate it for: (1) whether consent is genuine and informed, (2) whether contextual integrity is respected, (3) which populations are exposed to risk without their knowledge, and (4) what structural safeguards you would require before approving the proposal.