AI Ethics & Decision-Making · Introduction

Every Tool That Thinks Also Chooses

This course exists because the machines making consequential decisions about your life were built by people who had to make ethical choices — and those choices are now yours to understand.

In 1888, George Eastman introduced the hand-held Kodak camera and, almost immediately, lawyers began arguing about something no one had anticipated: could a photograph taken of a person in public be published without their consent? Two years later, in 1890, Samuel Warren and Louis Brandeis published "The Right to Privacy," warning that "instantaneous photographs" threatened to expose private life in ways the law had never contemplated. The technology arrived first; ethics and law scrambled to follow for the next four decades, producing portrait-rights statutes that still varied widely by state well into the 1930s.

The same scramble is happening now, but compressed into years rather than decades. Between 2016 and 2023, algorithmic systems moved from novelty to infrastructure: they informed bail decisions in Broward County, Florida; filtered job applications at Amazon; scored creditworthiness for tens of millions of bank customers; and flagged welfare fraud in the Netherlands, where a 2020 court ruling found that the government's SyRI risk-scoring system violated human rights law and ordered it shut down. Each of these deployments was decided by engineers and product managers working under commercial pressure, without any agreed ethical framework governing what they were building.

This course will not make you an AI engineer. It will make you a sharper reader of the systems that govern modern life — someone who can identify when an automated decision deserves scrutiny, who bears accountability when it goes wrong, and what structural checks can constrain the worst outcomes. Four lessons, four labs, one module test. The goal is practical literacy, not false comfort.

If you finish every module, here's who you become:

You'll recognize when an automated system — in bail, hiring, credit, or healthcare — is making a consequential decision that deserves scrutiny.
You'll be able to trace accountability when an AI system causes harm: who built it, who deployed it, and who had the power to stop it.
You'll understand how surveillance infrastructure transforms the relationship between institutions and the individuals they monitor.
You'll see data collection not as a technical step but as an ethical choice — one that encodes values, historical bias, and assumptions about people.
You'll become someone who reads the terms of AI-powered systems the way a lawyer reads a contract: looking for what's missing, not just what's there.
You'll know what professional obligation looks like for engineers and designers caught between commercial pressure and public harm.
You'll leave thinking in structural terms — asking not just whether a system failed, but what checks were absent that allowed it to be built at all.

Lesson 1 · When Machines Decide

The Anatomy of an Automated Decision

What is actually happening when an algorithm determines your loan, your parole, your medical diagnosis?

Who decided the machine should decide — and what did they assume?

Every defendant who passed through Broward County, Florida's criminal courts between 2013 and 2016 received a COMPAS score — a number between one and ten rating their likelihood of reoffending. Judges used those scores in bail and sentencing decisions. In 2016, investigative journalists at ProPublica analyzed 7,000 cases and published their findings under the headline "Machine Bias." The core finding: COMPAS was nearly twice as likely to falsely flag Black defendants as future criminals, and nearly twice as likely to falsely label white defendants as low risk. The algorithm's creator, Northpointe Inc., disputed the methodology. The Wisconsin Supreme Court, in State v. Loomis (2016), ruled that COMPAS scores could be used in sentencing as long as they were not the "determinative" factor. No court has ever had access to the algorithm's source code — it remains a trade secret.

The COMPAS case distills nearly every tension this module will address: opacity (no one could inspect the model), disparate impact (statistically unequal outcomes by race), accountability gaps (Northpointe bears no legal liability for wrongful sentences), and automation bias (judges deferring to a number they couldn't interrogate). None of these problems required bad intentions to materialize. They emerged from design choices made before a single defendant ever stood in court.
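To make "falsely flag" concrete, here is a minimal sketch of how per-group error rates are computed from outcomes. The records below are invented for illustration and are not the Broward County data; the group names "A" and "B" and the helper error_rates are hypothetical.

```python
from collections import Counter

# Hypothetical records: (group, labeled_high_risk, reoffended).
# These counts are invented for illustration; they are NOT the Broward County data.
records = (
    [("A", True, True)] * 30 + [("A", True, False)] * 20 +
    [("A", False, True)] * 10 + [("A", False, False)] * 40 +
    [("B", True, True)] * 15 + [("B", True, False)] * 10 +
    [("B", False, True)] * 15 + [("B", False, False)] * 60
)

def error_rates(records, group):
    """Return (false positive rate, false negative rate) for one group."""
    counts = Counter(
        (flagged, outcome) for g, flagged, outcome in records if g == group
    )
    fp = counts[(True, False)]   # labeled high risk, did not reoffend
    tn = counts[(False, False)]  # labeled low risk, did not reoffend
    fn = counts[(False, True)]   # labeled low risk, did reoffend
    tp = counts[(True, True)]    # labeled high risk, did reoffend
    return fp / (fp + tn), fn / (fn + tp)

for group in ("A", "B"):
    fpr, fnr = error_rates(records, group)
    print(f"group {group}: false positive rate {fpr:.2f}, "
          f"false negative rate {fnr:.2f}")
```

In this toy data, group A is wrongly labeled high risk more than twice as often as group B, while group B is wrongly labeled low risk twice as often as group A. That is the shape of disparity ProPublica reported, though the real figures differ.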

What "Automated Decision" Actually Means

An automated decision-making system (ADMS) is any process in which a computational model produces an output that directly or substantially influences a consequential outcome for a person — without that person having real-time recourse to a human reviewer. The key word is consequential: a spam filter is automated, but missing an email is recoverable. A bail algorithm is automated, and the consequences of its errors fall on people who may spend months in pretrial detention they cannot afford to challenge.

Three components are present in virtually every ADMS. First, a training dataset — the historical records from which the model learns patterns. Second, a feature set — the variables the model actually uses (age, zip code, prior arrests, credit history). Third, an objective function — the thing the model is mathematically optimized to predict (recidivism, default probability, click-through rate). The ethical problems of AI decision-making cluster around these three components more than anywhere else.

Training Data and the Inheritance of History

In 2018, Joy Buolamwini of MIT and Timnit Gebru of Microsoft Research published an audit of three commercial facial analysis systems, from Microsoft, IBM, and Face++, showing that gender-classification error rates for darker-skinned women ran as high as 34.7%, while error rates for lighter-skinned men were below 1%. Their paper, "Gender Shades," described these gaps as "intersectional accuracy disparities" and became one of the most widely cited works in algorithmic fairness research. The likely cause was straightforward: the systems appear to have been trained and benchmarked on datasets that were disproportionately composed of light-skinned male faces, because that is what was easiest to scrape from publicly available sources in the early 2010s.

The lesson is not that datasets are always biased, but that they always reflect the conditions under which they were collected. If a hospital system trains a diagnostic model on patient records from 2000–2015, it inherits whatever systematic disparities existed in who received diagnoses and what treatments were documented. A model predicting loan default trained on FICO data inherits decades of redlining-era credit exclusions. The model does not invent discrimination; it compresses historical discrimination into a number and applies it at scale.

Documented Case

In 2019, researchers at UC Berkeley published a study in the journal Science showing that a health-care algorithm used by major US hospital systems, affecting an estimated 200 million people, systematically under-prioritized Black patients for high-risk care programs. The algorithm used past health care spending as a proxy for illness severity. Because Black patients had historically received less care, they had lower spending records, so the model rated them as less sick even when their clinical indicators were equivalent or worse.
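The mechanism can be reproduced in a few lines. The sketch below is a toy, not the audited algorithm: the illness scores, the dollar amounts in SPENDING_PER_POINT, and the helpers select_top and count_group are all assumptions made up for illustration.

```python
# Hypothetical patients: (group, illness_score, annual_spending).
# Both groups have identical illness scores; group B's spending is lower
# at every illness level, an invented stand-in for unequal access to care.
ILLNESS_LEVELS = [2, 4, 6, 8, 10]
SPENDING_PER_POINT = {"A": 1000, "B": 700}  # assumed dollars per illness point

patients = [
    (group, illness, illness * SPENDING_PER_POINT[group])
    for group in ("A", "B")
    for illness in ILLNESS_LEVELS
]

def select_top(patients, key_index, k):
    """Pick the k patients ranked highest on the chosen column."""
    return sorted(patients, key=lambda p: p[key_index], reverse=True)[:k]

def count_group(selected, group):
    """Count how many selected patients belong to the given group."""
    return sum(1 for p in selected if p[0] == group)

by_spending = select_top(patients, key_index=2, k=4)  # the proxy target
by_illness = select_top(patients, key_index=1, k=4)   # the outcome we care about

print("group B slots when ranking by spending:", count_group(by_spending, "B"))
print("group B slots when ranking by illness: ", count_group(by_illness, "B"))
```

Ranking by illness splits the four program slots evenly between the groups. Ranking by spending gives group B only one of the four, even though the two groups are equally sick by construction.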

Feature Selection and Proxy Discrimination

Proxy discrimination occurs when a model uses a variable that is not a protected characteristic — race, gender, religion — but that correlates so strongly with a protected characteristic that it effectively substitutes for it. Zip code is the canonical example: in US cities shaped by residential segregation, zip code correlates powerfully with race. A model that uses zip code as a feature for loan approval, insurance pricing, or criminal risk assessment can reproduce race-based discrimination without ever containing a "race" variable.

In 2016, the Federal Trade Commission published a report examining the relationship between big data and discrimination, warning that zip code, purchasing behavior, and even browsing history could function as proxies for protected class membership. In 2017, a ProPublica investigation into car insurance pricing in four states found that major insurers charged higher premiums in minority neighborhoods than in white neighborhoods with comparable risk, apparently because of zip-code-based modeling. No insurer's algorithm contained the word "race."
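A toy example shows why removing the protected attribute does not remove the discrimination. Everything here is hypothetical: the zip codes, the group labels "X" and "Y", the residential split, and the approve rule are invented to illustrate the mechanism.

```python
# Hypothetical applicants: (zip_code, group). The group column is never
# shown to the decision rule; it is kept only so the outcome can be audited.
applicants = (
    [("11111", "X")] * 80 + [("11111", "Y")] * 20 +
    [("22222", "X")] * 10 + [("22222", "Y")] * 90
)

def approve(zip_code):
    """A facially neutral rule: approve only applicants from zip 11111."""
    return zip_code == "11111"

def approval_rate(applicants, group):
    """Share of one group's applicants that the zip-code rule approves."""
    decisions = [approve(zip_code) for zip_code, g in applicants if g == group]
    return sum(decisions) / len(decisions)

for group in ("X", "Y"):
    print(f"group {group}: approval rate {approval_rate(applicants, group):.2f}")
```

The rule never sees group membership, yet it approves most of group X and few of group Y, because in this invented city the zip code carries nearly the same information.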

Proxy Variable: A feature included in a model that correlates with a protected characteristic and thereby transmits discriminatory effects without explicitly encoding them.

Disparate Impact: A legal and statistical concept describing when a facially neutral policy produces significantly different outcomes across demographic groups, regardless of intent.

Automation Bias: The tendency for human reviewers to defer to algorithmic outputs even when they have independent information suggesting the algorithm may be wrong.

The Objective Function Problem

A model optimizes for what it is told to optimize for, and nothing else. This seems obvious until you realize how difficult it is to specify what you actually want in mathematical terms. Amazon ran an internal machine-learning recruiting tool from 2014 to 2017 that scored résumés on a scale of one to five. The model was trained on résumés submitted over the previous decade — a dataset that skewed heavily male, reflecting tech industry demographics. By 2015, Amazon's engineers discovered the system had learned to penalize résumés containing the word "women's" — as in "women's chess club" — and had downgraded graduates of two all-women's colleges. Amazon disbanded the project in 2017 without ever deploying the tool publicly, a fact that became public only through a Reuters investigation in October 2018.

The objective function was "predict which candidates Amazon will hire." The model correctly learned that pattern. The problem is that "candidates Amazon historically hired" was not the same as "best qualified candidates." The optimization target was precisely defined but conceptually wrong. This gap, between the quantity you can measure and the outcome you actually care about, is often described in AI contexts through Goodhart's Law: when a measure becomes a target, it ceases to be a good measure.
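A deliberately crude sketch makes the point. It is not Amazon's system, whose internals were never published. The résumés, the tokens, and the scoring rule (average historical hire rate per token) are all invented assumptions.

```python
from collections import defaultdict

# Hypothetical historical résumés: (set of tokens, was_hired).
# Invented data; the label is "was hired in the past", not "was qualified".
history = [
    ({"python", "chess club"}, True),
    ({"python", "rowing"}, True),
    ({"java", "chess club"}, True),
    ({"python", "women's chess club"}, False),
    ({"java", "women's chess club"}, False),
    ({"java", "rowing"}, False),
]

def token_hire_rates(history):
    """Fraction of past résumés containing each token that led to a hire."""
    seen, hired = defaultdict(int), defaultdict(int)
    for tokens, was_hired in history:
        for token in tokens:
            seen[token] += 1
            hired[token] += was_hired  # True counts as 1, False as 0
    return {token: hired[token] / seen[token] for token in seen}

def score(tokens, rates):
    """Average historical hire rate of a résumé's tokens."""
    return sum(rates[t] for t in tokens) / len(tokens)

rates = token_hire_rates(history)
print(score({"python", "chess club"}, rates))
print(score({"python", "women's chess club"}, rates))
```

The two résumés scored at the end differ only in one token, and the second scores lower. The scorer is doing exactly what it was asked to do: reproduce past hiring decisions. Nothing in the objective tells it that the past decisions were the problem.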

Core Principle

Every ADMS embeds three sets of human value judgments: which data to collect, which features to use, and what outcome to optimize for. These are not technical choices. They are ethical choices made by specific people under specific constraints — and the machine will execute them at scale without further deliberation.

Lesson 1 Quiz

The Anatomy of an Automated Decision — five questions

1. The ProPublica 2016 investigation "Machine Bias" found that COMPAS was approximately how much more likely to falsely label Black defendants as high risk compared to white defendants?

Correct. ProPublica found COMPAS was nearly twice as likely to falsely flag Black defendants as future criminals — a finding Northpointe disputed but which independent replications have largely supported.

Not quite. ProPublica's analysis found COMPAS produced false high-risk labels for Black defendants at nearly twice the rate it did for white defendants.

2. The "Gender Shades" study by Buolamwini and Gebru found error rates for darker-skinned women in commercial facial analysis systems as high as:

Correct. The 2018 Gender Shades paper documented error rates up to 34.7% for darker-skinned women, compared to under 1% for lighter-skinned men, across Microsoft, IBM, and Face++ systems.

Incorrect. The Gender Shades paper found error rates as high as 34.7% for darker-skinned women — far above what any of those figures represent.

3. The 2019 UC Berkeley study published in Science found that a widely used health-care algorithm under-prioritized Black patients because it used what variable as a proxy for illness severity?

Correct. Because Black patients had historically received less care, they had lower spending records. The algorithm interpreted lower spending as lower illness severity, systematically under-referring them for high-risk care programs.

Not quite. The algorithm used past health care spending as its proxy for how sick a patient was — a variable that embedded historical disparities in access to care.

4. "Proxy discrimination" in algorithmic systems refers to:

Correct. Proxy discrimination occurs when variables like zip code or purchasing behavior correlate so tightly with protected characteristics that the model effectively discriminates without explicitly encoding those characteristics.

Incorrect. Proxy discrimination refers to the use of seemingly neutral variables — like zip code — that strongly correlate with protected characteristics, reproducing discriminatory outcomes without explicit encoding.

5. Amazon's internal résumé-screening tool, developed from 2014 to 2017, was abandoned because it:

Correct. Because the training data reflected a decade of male-skewed hiring outcomes, the model learned that female-associated terms were negatively correlated with being hired — and penalized them accordingly.

Incorrect. The tool was abandoned after engineers discovered it had learned to penalize women's-associated terms from its training data, which reflected the historically male-dominant hiring pool.

Lab 1 — Dissecting an Automated Decision

Conversation-based critical analysis lab · at least 3 exchanges to complete

Your Task

You will analyze a real automated decision-making system with the AI assistant below. Choose one of the following systems and interrogate it: the COMPAS recidivism scoring system, the Amazon résumé screener, or the health-care risk-scoring algorithm identified in the 2019 UC Berkeley study published in Science.

For your chosen system, work through: (1) what training data it likely used and what biases that data may have embedded, (2) what its objective function was and how that objective may have diverged from the actual goal, and (3) who bears accountability when the system produces a harmful outcome.

Starter prompt: "I want to analyze [system name]. Let's start with its training data. What do we know about it and what biases might it have inherited?"

AI Ethics Lab Assistant


Welcome to Lab 1. We're going to dissect a real automated decision-making system together. Choose one — COMPAS, the Amazon résumé screener, or the UC Berkeley health-care algorithm — and tell me which one you'd like to analyze. We'll work through training data, objective function, and accountability in sequence.

Lesson 2 · When Machines Decide

Fairness Is Not a Single Number

Mathematical definitions of algorithmic fairness are mutually incompatible. You cannot satisfy all of them at once — so who chooses which one applies?

When we say an algorithm is "fair," what are we actually claiming — and what are we quietly giving up?

In the months after ProPublica published "Machine Bias," Northpointe and a group of academic statisticians published rebuttals arguing that COMPAS was, in fact, fair — because among defendants who scored as high risk, Black and white defendants reoffended at statistically similar rates. ProPublica's team agreed with this fact but maintained the system was unfair — because the false positive rate (being labeled high risk when you would not reoffend) was dramatically higher for Black defendants. Both sides were correct by their own definition of fairness. The conflict was not about data; it was about which definition of fairness ought to govern.

In December 2016, three independent research groups (at Cornell, Google, and the Max Planck Institute) published papers within weeks of each other demonstrating a formal impossibility result: calibration (a given score implying the same probability of the outcome in every group) and error-rate parity (equal false positive and false negative rates across groups) cannot both be satisfied when base rates differ between groups, except in the degenerate case of a perfect predictor. This became known in the research literature as the "fairness impossibility theorem." No algorithm, however well-designed, can be fair by all reasonable definitions at once. This is not a temporary problem awaiting a better model. It is a permanent mathematical constraint.

The Major Fairness Definitions

Academic and industry researchers have proposed more than twenty distinct formal definitions of algorithmic fairness since 2016. The most frequently debated cluster around five core concepts:

Demographic parity requires that the algorithm's positive outcome rate be equal across demographic groups. If 30% of white loan applicants are approved, 30% of Black loan applicants must also be approved. This definition ignores whether individuals within each group differ in their actual default risk.

Equalized odds, proposed by Hardt, Price, and Srebro at NeurIPS 2016, requires that both the true positive rate and the false positive rate be equal across groups. Unlike demographic parity, it conditions on the actual outcome, so it compares error rates rather than raw approval rates.

Calibration (closely related to predictive parity) requires that a score of, say, 7 out of 10 mean the same probability of the predicted event across all groups. If the score means a 70% reoffense probability for white defendants, it should also mean 70% for Black defendants.

Individual fairness, formalized by Dwork et al. in 2012, requires that similar individuals receive similar outcomes. It sidesteps group comparisons entirely — but requires agreement on what "similar" means, which is itself a value-laden judgment.

Counterfactual fairness, proposed by Kusner et al. in 2017, asks: would the outcome change if the individual's protected characteristic were different, holding all else equal? This approach grapples with the fact that changing one attribute — like race — counterfactually would change many others in a racially structured society.
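The three group-level definitions above reduce to comparing a handful of rates. The sketch below computes them on invented audit data; the rows, the group names, and the helper group_metrics are hypothetical, and precision among positive predictions is used as a simple stand-in for calibration. Individual and counterfactual fairness are not sketched, because they require a similarity measure or a causal model that no table of outcomes can supply.

```python
# Hypothetical audit rows: (group, predicted_positive, actual_positive).
# Counts are invented so that the definitions visibly disagree.
rows = (
    [("A", 1, 1)] * 40 + [("A", 1, 0)] * 10 +
    [("A", 0, 1)] * 10 + [("A", 0, 0)] * 40 +
    [("B", 1, 1)] * 20 + [("B", 1, 0)] * 30 +
    [("B", 0, 1)] * 5 + [("B", 0, 0)] * 45
)

def group_metrics(rows, group):
    """Rates that the group-level fairness definitions compare across groups."""
    mine = [(pred, actual) for g, pred, actual in rows if g == group]
    tp = sum(1 for pred, actual in mine if pred and actual)
    fp = sum(1 for pred, actual in mine if pred and not actual)
    fn = sum(1 for pred, actual in mine if not pred and actual)
    tn = sum(1 for pred, actual in mine if not pred and not actual)
    return {
        "positive_rate": (tp + fp) / len(mine),   # demographic parity
        "true_positive_rate": tp / (tp + fn),     # equalized odds, part 1
        "false_positive_rate": fp / (fp + tn),    # equalized odds, part 2
        "precision": tp / (tp + fp),              # predictive parity
    }

for group in ("A", "B"):
    print(group, group_metrics(rows, group))
```

In this toy data both groups receive positive predictions at the same rate, so demographic parity holds. The false positive rates and the precision differ, so the same predictions fail equalized odds and predictive parity.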

The Impossibility Constraint

When base rates differ between groups — as they do for recidivism, loan default, and many other predicted outcomes — you cannot simultaneously achieve calibration and equalized false positive rates. Choosing one definition means accepting worse performance on another. This is not a technical problem. It is a political and moral choice about whose errors society is willing to tolerate.
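The constraint can be checked with arithmetic. The sketch below uses the share of high-risk labels that turn out correct (positive predictive value, PPV) as a stand-in for calibration, and an identity that follows directly from the confusion-matrix definitions. The base rates, PPV, and miss rate are assumed values chosen for illustration, and the function name is hypothetical.

```python
def implied_false_positive_rate(base_rate, ppv, fnr):
    """False positive rate forced by a base rate, a PPV, and a miss rate.

    With TP = (1 - fnr) * base_rate and FP = TP * (1 - ppv) / ppv,
    dividing FP by the share of true negatives (1 - base_rate) gives:
    FPR = base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)

# Assumed values for illustration only.
PPV = 0.6  # a "high risk" label is right 60% of the time in both groups
FNR = 0.3  # 30% of actual positives are missed in both groups

fpr_group_a = implied_false_positive_rate(base_rate=0.5, ppv=PPV, fnr=FNR)
fpr_group_b = implied_false_positive_rate(base_rate=0.3, ppv=PPV, fnr=FNR)

print(f"group A false positive rate: {fpr_group_a:.2f}")
print(f"group B false positive rate: {fpr_group_b:.2f}")
```

Holding the meaning of the label and the miss rate equal across groups, the group with the higher base rate ends up with a false positive rate of roughly 0.47 against 0.20 for the other group. The only ways to close that gap are to let the label mean different things in each group or to accept unequal miss rates.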

Who Chooses the Definition — and How

In 2018, the New York City government passed Local Law 49, creating a task force to study the use of automated decision systems in city agencies. The task force's 2019 report — a landmark document in municipal AI governance — found that no city agency had documented which fairness definition, if any, it had used in procuring or deploying algorithmic tools. Vendors provided accuracy statistics; they did not provide fairness audits.

The same year, NIST (the National Institute of Standards and Technology) began a years-long project to evaluate facial recognition algorithms, eventually publishing results showing that nearly all commercial systems had higher false match rates for Black women, East Asian men, and older adults than for young white men — with false match rates in some cases 10 to 100 times higher. The NIST results were not disputed. What remained disputed was whether those error differentials were acceptable and, if not, who had the authority to say so.

In the European Union, the AI Act, first proposed in 2021 and formally adopted in 2024, moved toward resolving the governance question by designating certain AI applications (including AI used in criminal justice, credit scoring, and hiring) as "high-risk systems" subject to mandatory conformity assessments, transparency requirements, and human oversight obligations. The Act does not specify which fairness definition applies; it requires that developers document which definition they have chosen and justify it. This shifts the burden from implicit to explicit, which is itself a significant change.

Implication for Practice

Any time an organization deploys an automated decision system and claims it is "fair," the first question to ask is: fair by which definition? The second question is: who made that choice, when, and with whose input? In the absence of explicit answers, the choice was made implicitly — by an engineer's default settings, a vendor's benchmark, or no one in particular.

Fairness Trade-offs in Practice

In 2020, a research team at Stanford published a study in Science examining a child welfare screening tool used in Allegheny County, Pennsylvania. The system, called the Allegheny Family Screening Tool (AFST), scored families referred to child protective services to help prioritize investigations. The researchers found that the tool performed with statistical parity across racial groups on some metrics but not others — and that the choice of which metric to prioritize had been made by the county's data analytics director, a single official, without a public deliberation process.

The Allegheny case is instructive not because the tool was irresponsible — it was, by many accounts, one of the more carefully designed public-sector AI systems in the United States — but because it illustrates that even careful deployment embeds value choices that were never put to a democratic vote. The families whose lives the tool affected had no mechanism to challenge the definition of fairness that governed it.

Calibration: A model is calibrated if a predicted probability of X% corresponds to an actual event rate of X% across all demographic groups.

Equalized Odds: A fairness criterion requiring equal true positive rates and equal false positive rates across demographic groups.

Fairness Impossibility Theorem: The mathematical result showing that calibration and error-rate parity cannot both be satisfied when base rates differ between groups.

Lesson 2 Quiz

Fairness Is Not a Single Number — five questions

1. The "fairness impossibility theorem" demonstrated in 2016 shows that, when base rates differ between groups, a model cannot simultaneously satisfy:

Correct. Papers from Cornell, Google, and the Max Planck Institute in 2016 each independently proved that calibration and equal error rates are mathematically incompatible when base rates differ — which they almost always do.

Incorrect. The impossibility result specifically concerns the conflict between calibration (a score meaning the same probability in every group) and equalized false positive rates, not the pairs listed in the other options.

2. "Demographic parity" as a fairness definition requires that:

Correct. Demographic parity sets a group-level quota on outcomes without conditioning on actual individual risk — which is why critics argue it can be fair at the group level while producing individually arbitrary decisions.

Not quite. Demographic parity specifically requires equal positive outcome rates across groups, independent of whether individuals within those groups actually differ in their predicted risk.

3. New York City's Local Law 49 (2018) and its resulting task force report found that city agencies deploying algorithmic tools had:

Correct. The 2019 task force report found that no city agency had documented a fairness definition. Vendors provided accuracy statistics; fairness frameworks were absent.

Incorrect. The task force found the opposite: no agency had documented which fairness criterion, if any, it used — a gap the report identified as a significant governance failure.

4. NIST evaluations of commercial facial recognition systems found that, compared to young white men, false match rates for Black women and other demographic groups were in some cases:

Correct. NIST's evaluations documented false match rates 10 to 100 times higher for some demographic groups relative to young white male faces — one of the most dramatic disparity ranges in any technology audit.

Incorrect. NIST found false match rates in some cases 10 to 100 times higher for groups including Black women and East Asian men compared to young white men.

5. The EU AI Act's approach to fairness definitions in high-risk AI systems is best characterized as:

Correct. The EU AI Act shifts fairness from an implicit engineering default to an explicit documented choice — requiring transparency about which definition was chosen and why, without mandating a single universal definition.

Incorrect. The EU AI Act does not mandate a specific fairness definition. It requires that developers explicitly document and justify the definition they have chosen — making implicit choices visible and contestable.

Lab 2 — Choosing a Fairness Definition

Conversation-based reasoning lab · at least 3 exchanges to complete

Your Task

You are advising a county government that is evaluating a pretrial risk assessment tool. The tool predicts whether a defendant will miss their court date. Two fairness definitions are in conflict: calibration (the scores mean the same probability across racial groups) and equalized false positive rates (equal rates of incorrect high-risk labels across groups). The base rates of prior court appearance differ between groups due to historical enforcement patterns.

Work with the AI assistant to: (1) explain why both definitions cannot be satisfied simultaneously in this case, (2) articulate the real-world consequences of prioritizing one over the other, and (3) identify who should legitimately make this choice and through what process.

Starter prompt: "Walk me through why calibration and equalized false positive rates conflict in the pretrial context — use concrete numbers if that helps."

AI Ethics Lab Assistant


Welcome to Lab 2. We're working through a real governance dilemma: a county must choose a fairness definition for a pretrial risk tool, knowing it cannot satisfy all definitions at once. Tell me where you want to start — the mathematical conflict, the real-world stakes, or who should make this decision — and we'll build up a complete analysis together.

Lesson 3 · When Machines Decide

Accountability in a System with No Author

When an AI system causes harm, responsibility diffuses across vendors, deployers, regulators, and end users — often reaching no one.

If everyone involved followed their organization's rules, and the outcome was still catastrophic, who is responsible?

Between 2014 and 2020, the Dutch government operated the System Risk Indication — SyRI — a centralized data-fusion platform that combined seventeen categories of government data (tax records, benefits claims, employment records, housing registrations) to generate risk scores for welfare fraud. Municipalities could request SyRI analyses of entire neighborhoods; the outputs were lists of individuals scored as high-risk for investigation. Crucially, the scored individuals were never told they had been scored, could not inspect the algorithm, and had no mechanism to challenge their placement on an investigation list. In February 2020, a Dutch court ruled that SyRI violated Article 8 of the European Convention on Human Rights — the right to private life — partly because the government could not explain how the algorithm worked even to the court. The system was shut down. No individual official was found liable. No vendor faced penalties.

The SyRI case illustrates what legal scholars call the accountability gap in automated decision-making: when a harm occurs, responsibility disperses across the chain of actors — the government ministry that contracted the system, the vendor that built it, the municipality that deployed it, the caseworkers who acted on its outputs — in ways that allow each actor to credibly claim limited responsibility while no single actor bears the full burden of accountability.

The Accountability Gap

The accountability gap in AI systems is structural, not incidental. It arises from three features of how complex algorithmic systems are built and deployed. First, distributed development: modern AI systems are built from open-source libraries, third-party APIs, fine-tuned foundation models, and proprietary components — each developed by different organizations under different governance frameworks. No single actor has full visibility into the whole system.

Second, contractual diffusion: service agreements between AI vendors and deploying organizations typically limit vendor liability to the direct cost of the software contract, exclude consequential damages, and place responsibility for deployment decisions on the customer. When a hospital uses a third-party diagnostic AI, the contract may specify that the hospital bears sole responsibility for clinical decisions — even if those decisions were driven by the AI's outputs.

Third, automation bias creates a paradox: the more consequential the decision, the more likely humans are to defer to the algorithmic output, yet the more likely those humans are to claim the decision was "ultimately human" when accountability is sought. Judges who routinely followed COMPAS scores but occasionally overrode them could truthfully claim they made their own decisions. The algorithm provided cover without providing accountability.

The Uber ATG Case, March 2018

When an Uber autonomous test vehicle struck and killed Elaine Herzberg in Tempe, Arizona, the first pedestrian death involving a self-driving car, investigators found the vehicle's software had detected Herzberg about 6 seconds before impact but classified her first as an "unknown object," then as a "vehicle," and then as a "bicycle," never as a pedestrian. Emergency braking had been disabled to prevent erratic vehicle behavior. The human safety driver was watching a streaming video. Uber faced no criminal conviction. The safety driver was charged with negligent homicide in 2020. The question of whether Uber's engineers bore criminal responsibility remained contested through 2023 without resolution.

Legal Frameworks and Their Limits

Existing legal frameworks for accountability were built around human decision-makers and identifiable physical products — not statistical models whose outputs are probabilistic and whose inner workings may be opaque even to their creators. Three frameworks have been attempted with limited success.

Products liability applies when a product is defective by design or manufacture. Courts in several jurisdictions have considered whether an AI system that produces discriminatory outputs is a "defective product," but the probabilistic nature of algorithmic outputs makes the causation analysis complex: the system did not malfunction; it performed exactly as designed, and the design reflected contested value choices.

Negligence requires a duty of care, a breach of that duty, causation, and harm. The challenge in AI contexts is establishing both duty (did the vendor owe a duty to end users it never contracted with?) and causation (was the harm caused by the algorithm's output, or by the human who acted on that output?).

Anti-discrimination law, specifically disparate impact doctrine under US civil rights law, prohibits employment practices that produce statistically significant disparate outcomes across protected groups unless the employer can show business necessity. State law has begun to add obligations aimed at AI specifically: in 2019, Illinois passed the Artificial Intelligence Video Interview Act, effective January 2020, requiring employers that use AI to analyze video interviews to disclose that use, obtain applicants' consent, and delete recordings on request. It was one of the first US laws imposing affirmative obligations on AI deployers toward affected individuals.

The Explainability Requirement

The EU's GDPR (General Data Protection Regulation, effective 2018) includes a right not to be subject to solely automated decisions that produce significant effects, and a right to an explanation of such decisions. In practice, "explanation" has been interpreted minimally by most companies — often amounting to a general description of the model's purpose rather than an account of why this individual received this outcome. The gap between legal requirement and practical implementation remains wide.

Structural Approaches to Closing the Gap

In 2021, the US Federal Trade Commission published guidance titled "Aiming for Truth, Fairness, and Equity in Your Company's Use of AI," listing practices the agency considered potentially unfair or deceptive — including using biased data, failing to test for disparate impact, and deploying AI without meaningful human oversight. The FTC's guidance stopped short of creating new legal rights but signaled the agency's intent to use existing Section 5 unfair practices authority against egregious AI deployments.

The most concrete structural intervention to date is the EU AI Act's "conformity assessment" requirement for high-risk systems: before deployment, vendors must document intended use, training data characteristics, performance metrics disaggregated by relevant groups, residual risk, and human oversight mechanisms. These requirements do not guarantee accountability after harm, but they create a paper trail that makes accountability claims more tractable.
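One item in that documentation list, performance metrics disaggregated by relevant groups, can be sketched in a few lines. The example below computes a false positive rate per group from labeled records. The record layout, the group labels, and the counts are all assumptions made for illustration; the EU AI Act does not prescribe this particular metric or format.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (group, predicted_high_risk, actually_positive).
Record = Tuple[str, bool, bool]


def false_positive_rates(records: Iterable[Record]) -> Dict[str, float]:
    """Per group: share of truly negative cases that the system flagged anyway."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:                    # only true negatives can be false positives
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Invented counts for illustration only.
records = (
    [("a", True, False)] * 2 + [("a", False, False)] * 8 + [("a", True, True)] * 5
    + [("b", True, False)] * 4 + [("b", False, False)] * 6 + [("b", True, True)] * 5
)
fpr = false_positive_rates(records)       # a: 2 of 10, b: 4 of 10
```

A single aggregate error rate would hide the gap between the two groups here, which is the reason for reporting the metric group by group.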

Accountability Gap: The structural condition in which responsibility for algorithmic harms is distributed across developers, deployers, and human reviewers in ways that prevent any single actor from bearing full accountability.

Automation Bias: The tendency of human decision-makers to over-weight algorithmic outputs and under-weight countervailing evidence, often while maintaining that the final decision was their own.

Disparate Impact: Statistically significant differences in outcomes across demographic groups produced by a facially neutral practice; actionable under US civil rights law in employment and lending contexts.

Lesson 3 Quiz

Accountability in a System with No Author — five questions

1. The Dutch SyRI system was ruled a violation of the European Convention on Human Rights in 2020 primarily because:

Correct. The court found that SyRI violated Article 8 (the right to private life) because the government could not explain the algorithm's workings and affected individuals had no recourse — core requirements for lawful automated processing of personal data.

Incorrect. The court's ruling centered on opacity (the government could not explain the algorithm's operation even to the court) and the absence of any mechanism for affected individuals to contest their classification.

2. The "accountability gap" in AI systems arises structurally from which combination of factors?

Correct. The accountability gap is structural: it emerges from how complex AI systems are built (across many organizations), how liability is allocated contractually (typically away from vendors), and how humans interact with outputs (deferring while claiming oversight).

Incorrect. The accountability gap is structural, arising from distributed development across organizations, contractual diffusion of liability, and the paradox of automation bias — not primarily from concealment or licensing gaps.

3. In the 2018 Uber autonomous vehicle fatality in Tempe, Arizona, investigators found that the vehicle's software had detected Elaine Herzberg before impact but:

Correct. The NTSB investigation documented the classification sequence (unknown object, vehicle, bicycle) across the roughly 6 seconds before impact, with emergency braking disabled and the safety operator inattentive.

Incorrect. The investigation found the system did detect Herzberg approximately 6 seconds before impact but misclassified her multiple times; emergency braking had been disabled to prevent erratic vehicle behavior, and the safety driver was watching a streaming service.

4. The EU GDPR's right not to be subject to solely automated decisions that produce significant effects is, in practice, most often implemented by companies as:

Correct. Legal scholars and regulators have noted that "explanations" under GDPR in practice typically describe the model's general logic rather than the specific factors driving an individual's outcome — a significant gap between the legal right and its practical implementation.

Incorrect. In practice, companies have implemented GDPR's explanation right minimally — typically providing general descriptions of how the model works rather than individual-specific explanations of why a particular person received a particular outcome.

5. Illinois's Artificial Intelligence Video Interview Act (in effect since 2020), one of the first US laws imposing affirmative obligations on AI deployers, requires employers to:

Correct. The Illinois AI Video Interview Act requires disclosure of AI use in video interviews and a process for applicants to request deletion of their biometric information — establishing affirmative obligations on the deployer toward affected individuals.

Incorrect. The Illinois AI Video Interview Act requires employers to disclose their use of AI analysis to job applicants and to provide a mechanism for those applicants to request the removal of their biometric data from the system.

Lab 3 — Mapping the Accountability Chain

Structured reasoning lab · at least 3 exchanges to complete

Your Task

A municipal hospital has deployed a third-party AI diagnostic support tool that flagged a patient incorrectly as low-risk for sepsis. The patient deteriorated without intervention and suffered permanent organ damage. The tool was built by a startup using a foundation model licensed from a large AI company. The hospital purchased the tool after a procurement review. The treating physician saw the AI output and did not order additional tests.

Work with the AI assistant to map every actor in this chain, identify what accountability claim each actor could make, and identify where — if anywhere — accountability actually lodges. Then propose at least one structural change that would close the accountability gap.

Starter prompt: "Let's map the accountability chain in this sepsis case. Start with the AI company that built the foundation model — what is their accountability exposure and what would they typically argue in their defense?"

AI Ethics Lab Assistant

Lab 3

Welcome to Lab 3. We have a sepsis case with a layered accountability chain: a foundation model provider, a startup that built the diagnostic tool, a hospital that deployed it, and a physician who acted on its output. Let's map each actor's exposure and defense systematically. Tell me where you want to start in the chain, or ask me to lay out the full structure first.

Lesson 4 · When Machines Decide

Oversight, Transparency, and the Limits of Both

Human oversight is the most commonly proposed solution to AI risk. It is also routinely circumvented, poorly resourced, and misunderstood as a guarantee rather than a safeguard.

What does meaningful human oversight of an automated decision system actually require — and why is it so rarely achieved?

In 2016, Facebook employed approximately 4,500 human content reviewers globally to oversee content moderation decisions. By 2021, the company had grown that workforce to approximately 15,000, while operating platforms used by 3.5 billion people. Each reviewer was responsible for assessing hundreds of pieces of content per shift, often in languages they did not speak natively, under time-pressure metrics that penalized slow decisions. Algorithmic systems made initial classification decisions; humans reviewed appeals and edge cases. When the Facebook Oversight Board, an independent body established in 2020, reviewed cases referred to it, it overturned Facebook's decisions in more than 80% of cases in its first year. Human oversight existed at every formal level of the system. It was also structurally incapable of meaningfully reviewing more than a fraction of a percent of consequential decisions.

The Facebook moderation case illustrates the core tension in oversight design: adding human reviewers to a system operating at machine scale does not produce meaningful oversight unless those reviewers have the time, information, authority, and incentive to actually override algorithmic recommendations. In the absence of those conditions, human review provides liability cover without providing genuine accountability.

What Meaningful Oversight Requires

The EU AI Act, first proposed by the European Commission in 2021, sets out a spectrum of human oversight requirements calibrated to risk level. At the highest risk level, which includes biometric identification systems, credit scoring, employment screening, and criminal justice tools, the Act requires that human overseers be able to "fully understand the system's capabilities and limitations," "monitor its operation," and "override or refuse" its outputs. Crucially, the Act requires that human overseers have "the necessary competence, training and authority" to perform these functions.

Academic researchers studying human-AI decision-making teams have documented consistent findings on the conditions under which humans actually override algorithmic recommendations. A 2019 study by Ben Green and Yiling Chen at Harvard found that when human decision-makers were given algorithmic risk scores for recidivism, they did not simply defer to the algorithm — but they also did not improve on it. Instead, they exhibited a systematic pattern: they deferred to the algorithm on cases where the algorithmic prediction aligned with their intuition, and overrode it in cases where it did not, producing outcomes no better than the algorithm alone while introducing additional inconsistency from case to case.

The "Human in the Loop" Fallacy

The presence of a human in a decision process does not guarantee meaningful oversight. Research by Deanna Messervey and colleagues (2019) found that decision-makers under time pressure with high caseloads routinely approved algorithmic recommendations without substantive review — a phenomenon they called "check-box compliance." The human fulfilled the formal oversight requirement while providing no substantive check on the algorithm's outputs.

Transparency — and Its Limits

Transparency is frequently proposed as the primary mechanism for enabling accountability: if affected individuals can understand why an algorithmic decision was made, they can challenge it; if the public can inspect algorithmic systems, they can demand corrections. Both propositions are weaker in practice than they appear in theory.

In 2020, ProPublica launched its "Machine Bias" follow-up, examining three cities' use of predictive policing software: PredPol (now Geolitica) in Santa Cruz, New Orleans, and Los Angeles. In Los Angeles, the LAPD used the system for years without disclosing its use to the public or the city council. When disclosure was finally made, the department released aggregated accuracy statistics but declined to provide the data inputs or model specifications, citing the vendor contract. Santa Cruz became the first US city to ban predictive policing outright in 2020, citing civil rights concerns. New Orleans terminated its contract after learning that the city had entered a secret arrangement with Palantir Technologies without council approval, a fact reported by The Verge in 2018.

Transparency requirements face two structural obstacles. First, vendors frequently claim trade secret protection for model architectures and training data, which courts have generally upheld — leaving affected individuals unable to inspect the systems that govern them. Second, even when model details are technically disclosed, meaningful interpretation requires statistical expertise that most affected individuals and their advocates do not have. Transparency that requires a PhD to decode is not, in practice, transparency for affected communities.

Algorithmic Auditing as an Alternative

Algorithmic auditing — systematic external evaluation of an AI system's inputs, outputs, and processes — has emerged as a proposed middle ground between full transparency and opacity. Joy Buolamwini's Algorithmic Justice League and the AI Now Institute have developed audit frameworks. In 2021, the FTC's policy guidance endorsed regular auditing by qualified third parties as a best practice. New York City's Local Law 144 (2023) became the first US law to require bias audits of AI hiring tools, with public disclosure of results — a model that may spread to other jurisdictions.

Designing Oversight That Works

Research on effective human-AI oversight converges on several structural requirements that go beyond simply placing a human in the approval chain. First, overseers need calibrated uncertainty information — not just the algorithm's recommendation but a clear representation of the confidence level and the conditions under which the model is known to perform poorly. Second, overseers need adequate time — high-volume review under time pressure consistently degrades decision quality to or below the algorithm's baseline. Third, overseers need access to information the algorithm did not use — if the human reviewer sees only what the algorithm saw, override becomes nearly impossible to justify.
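The first requirement, calibrated uncertainty information, can be illustrated with a simple calibration table: group past cases by the model's stated risk score, then compare the average score in each group with how often the predicted outcome actually occurred. The sketch below is one common way to do this, not a method taken from the research described above. The function name, bin count, and data are assumptions for illustration.

```python
from typing import List, Tuple

# Each row: (bin index, number of cases, mean predicted score, observed outcome rate).
Row = Tuple[int, int, float, float]


def calibration_table(scores: List[float], outcomes: List[int], n_bins: int = 5) -> List[Row]:
    """Compare stated risk scores with observed outcome rates, bin by bin."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must have the same length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for score, outcome in zip(scores, outcomes):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must lie between 0 and 1")
        index = min(int(score * n_bins), n_bins - 1)   # a score of 1.0 goes in the top bin
        bins[index].append((score, outcome))
    table: List[Row] = []
    for index, members in enumerate(bins):
        if not members:
            continue                                   # skip bins with no cases
        mean_score = sum(s for s, _ in members) / len(members)
        observed = sum(o for _, o in members) / len(members)
        table.append((index, len(members), mean_score, observed))
    return table


# Invented data: low scores are borne out, high scores overstate the risk.
scores = [0.1] * 10 + [0.9] * 10
outcomes = [1] + [0] * 9 + [1] * 6 + [0] * 4
table = calibration_table(scores, outcomes)
```

In this invented example the model says 0.9 for cases where the outcome occurred only 60% of the time. A reviewer who is shown that gap has a concrete reason to treat high scores with more skepticism.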

The 2022 US Executive Order on Improving Government's Investigative and Review Capabilities for AI incorporated several of these principles into federal guidance, requiring that AI systems used in government benefit and services decisions include "meaningful human review" defined as review by someone with independent access to the underlying case file — not just the algorithmic output. Whether this guidance will be consistently implemented and enforced remains an open question as of 2024.

Algorithmic Audit: A systematic external evaluation of an AI system's inputs, processes, outputs, and effects, conducted by parties independent of the system's developers and deployers.

Check-Box Compliance: The phenomenon in which a human formally fulfills an oversight requirement (reviewing and approving an algorithmic output) without performing substantive independent evaluation.

Conformity Assessment: The EU AI Act's requirement that developers of high-risk AI systems document intended use, performance characteristics, and human oversight mechanisms before deployment.

Lesson 4 Quiz

Oversight, Transparency, and the Limits of Both — five questions

1. When the Facebook Oversight Board reviewed cases referred to it in its first year (2020–2021), it overturned Facebook's original content moderation decisions in approximately what proportion of cases?

Correct. The Oversight Board overturned Facebook's decisions in more than 80% of referred cases in its first year, a statistic that illustrates the gap between formal and substantive oversight in large-scale content moderation.

Incorrect. The Facebook Oversight Board overturned Facebook's decisions in more than 80% of referred cases in its first year — a striking figure that reveals how inadequate routine review was at the operational level.

2. The 2019 Green and Chen study at Harvard found that when humans were given algorithmic recidivism risk scores to inform their decisions, they:

Correct. The Green and Chen findings are sobering: humans did not simply defer, but neither did they improve on the algorithm. They introduced case-to-case inconsistency while producing aggregate outcomes equivalent to the algorithm alone.

Incorrect. Green and Chen found that humans given algorithmic scores neither outperformed the algorithm nor simply deferred to it. They produced outcomes no better than the algorithm while adding inconsistency — deferring when the algorithm agreed with intuition and overriding when it did not.

3. Santa Cruz, California became notable in 2020 for what action related to algorithmic policing?

Correct. Santa Cruz banned predictive policing in 2020, making it the first US city to do so outright. The decision followed years of civil rights advocacy and a growing national conversation about the civil liberties implications of algorithmic law enforcement tools.

Incorrect. Santa Cruz became the first US city to ban predictive policing outright in 2020 — not to regulate or audit it, but to prohibit it entirely within city limits.

4. New York City's Local Law 144 (2023) was notable as the first US law to require:

Correct. Local Law 144 requires employers using AI tools in hiring to commission bias audits and publicly disclose the results — establishing a precedent for mandatory external auditing of commercial AI systems.

Incorrect. NYC's Local Law 144 specifically requires bias audits of AI tools used in hiring decisions, with public disclosure — the first US law to mandate external auditing of commercial AI hiring tools.

5. Research on effective human-AI oversight identifies which of the following as necessary for overseers to genuinely function as a check on algorithmic outputs?

Correct. Effective oversight requires that reviewers know where the model is uncertain, have enough time to exercise real judgment, and have access to contextual information beyond what the algorithm processed — without all three, "oversight" becomes check-box compliance.

Incorrect. The research consensus points to three structural requirements: calibrated uncertainty information (not just the recommendation), adequate time (not high-volume pressure), and access to information the algorithm did not use (enabling independent evaluation). Only the combination enables genuine oversight.

Lab 4 — Designing Meaningful Oversight

Design and critique lab · at least 3 exchanges to complete

Your Task

You are advising a state benefits agency that is deploying an AI system to flag benefit applications for additional review. The system will process approximately 50,000 applications per month. The agency has proposed a "human oversight" mechanism in which 12 reviewers spend 2 minutes per flagged application checking the AI's recommendation before approving or denying.

Work with the AI assistant to: (1) identify the specific ways this oversight mechanism fails to meet the criteria for meaningful oversight, (2) propose a redesigned oversight structure that addresses those failures, and (3) estimate the resource and process implications of your proposed design. Draw on the research and cases from Lesson 4.

Starter prompt: "What are the specific problems with the 2-minute review model this agency has proposed? Let's work through each failure point against the criteria for meaningful oversight."
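If you want to check the staffing arithmetic as you work, a rough sketch like the one below may help. The scenario gives only the monthly volume (50,000), the number of reviewers (12), and the 2-minute review time. The share of applications flagged and the hours each reviewer can spend on review are not stated, so both values here are assumptions you should replace with your own.

```python
import math


def monthly_review_hours(applications: int, flag_rate: float, minutes_per_review: float) -> float:
    """Total reviewer hours needed per month for all flagged applications."""
    return applications * flag_rate * minutes_per_review / 60


def reviewers_needed(review_hours: float, hours_per_reviewer: float) -> int:
    """Smallest whole number of reviewers who can cover the required hours."""
    return math.ceil(review_hours / hours_per_reviewer)


APPLICATIONS_PER_MONTH = 50_000       # given in the scenario
ASSUMED_FLAG_RATE = 0.20              # assumption: the scenario does not state this
ASSUMED_HOURS_PER_REVIEWER = 120      # assumption: monthly hours available for review

# The agency's proposal: 2 minutes per flagged application.
proposed_hours = monthly_review_hours(APPLICATIONS_PER_MONTH, ASSUMED_FLAG_RATE, 2)
proposed_staff = reviewers_needed(proposed_hours, ASSUMED_HOURS_PER_REVIEWER)

# A hypothetical alternative: 20 minutes per flagged application.
longer_hours = monthly_review_hours(APPLICATIONS_PER_MONTH, ASSUMED_FLAG_RATE, 20)
longer_staff = reviewers_needed(longer_hours, ASSUMED_HOURS_PER_REVIEWER)
```

Under these assumptions, the 2-minute model needs about 333 reviewer hours a month (3 reviewers), while a 20-minute review needs about 3,333 hours (28 reviewers), more than twice the 12 the agency has proposed. Changing the assumed flag rate shifts both figures proportionally, so the time allowed per case is what drives the result.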

AI Ethics Lab Assistant

Lab 4

Welcome to Lab 4. We're evaluating a proposed oversight mechanism for a high-volume AI benefits system and designing a better one. The agency's 2-minute review model raises several problems. Tell me where you want to start — the time pressure issue, the information problem, the staffing math, or something else — and we'll build a complete critique and redesign together.

Module 1 — Test

When Machines Decide · 15 questions · 80% required to pass

1. Which publication first documented that COMPAS produced racially disparate false positive rates in Broward County criminal cases?

Correct. ProPublica's 2016 investigation "Machine Bias" was the first to document and quantify COMPAS's racially disparate false positive rates using a dataset of approximately 7,000 Broward County cases.

Incorrect. The investigation was published by ProPublica in 2016 under the headline "Machine Bias."

2. The trade secret protection Northpointe claimed over the COMPAS algorithm meant that:

Correct. The algorithm's source code remained a trade secret throughout the litigation and public debate — meaning judicial review of COMPAS-influenced sentences proceeded without any court ever independently examining the model's mechanics.

Incorrect. Northpointe's trade secret claim meant no court — including the Wisconsin Supreme Court in State v. Loomis — had access to the algorithm's actual source code.

3. The primary cause of the "Gender Shades" accuracy disparities in commercial facial recognition systems was:

Correct. Buolamwini and Gebru traced the disparities to the composition of training datasets — which were heavily skewed toward light-skinned male faces because those were easiest to scrape from publicly available sources in the early 2010s.

Incorrect. The root cause was the composition of training data, not intentional exclusion or hardware differences. Publicly available datasets were disproportionately composed of light-skinned male faces, and the models reflected that imbalance.

4. The 2019 UC Berkeley health-care algorithm study found disparities affecting an estimated how many people in the US?

Correct. The Berkeley study, published in Science, estimated that the health-care algorithm in question was used in systems affecting approximately 200 million people, making the scale of the identified disparity one of the largest documented in algorithmic fairness research.

Incorrect. The researchers estimated the algorithm affected approximately 200 million people — a figure that underscored how AI systems operating at scale can embed and amplify disparities across enormous populations.

5. "Goodhart's Law" as applied to AI objective functions describes the problem that:

Correct. Amazon's résumé screener illustrates this precisely: the objective was to predict "who Amazon will hire," which the model optimized correctly — but "who Amazon historically hired" was not the same as "best qualified candidates."

Incorrect. Goodhart's Law describes the gap between the optimized measure and the underlying goal: when you optimize for a proxy measure, it becomes divorced from the thing it was supposed to represent.

6. The three research groups that independently proved the fairness impossibility theorem in 2016 were affiliated with:

Correct. Three separate research groups at Cornell, Google, and the Max Planck Institute published the impossibility result within weeks of each other in late 2016, following the public COMPAS debate.

Incorrect. The three independent groups were affiliated with Cornell, Google, and the Max Planck Institute — publishing their results nearly simultaneously in December 2016.

7. The Allegheny Family Screening Tool case in Pennsylvania illustrated that even carefully designed AI systems embed ethical choices because:

Correct. The AFST case is notable precisely because the tool was relatively well-designed — but the fairness definition governing it was chosen unilaterally by a single data analytics official, without the participation of the families whose lives it affected.

Incorrect. The key point of the Allegheny case was that the choice of fairness definition — a value-laden decision — was made by a single official without public deliberation, and affected families had no mechanism to contest it.

8. The EU AI Act's approach to the fairness definition problem for high-risk AI systems is to:

Correct. The EU AI Act does not resolve the mathematical impossibility — it cannot. Instead, it shifts implicit choices to explicit ones: developers must document which definition they chose and justify it, making the choice visible and contestable.

Incorrect. The EU AI Act requires transparency about the chosen fairness definition rather than mandating any specific one — recognizing that the choice is a value judgment that must be made explicit.

9. The New Orleans predictive policing controversy, reported by The Verge in 2018, involved:

Correct. The Verge reported that New Orleans had entered a secret agreement with Palantir for a predictive policing program that city council members were unaware of — a governance failure that eventually led to the contract's termination.

Incorrect. The New Orleans controversy involved a secret arrangement with Palantir Technologies for predictive policing software that bypassed city council oversight — discovered by journalists rather than through official disclosure.

10. "Check-box compliance" in human oversight of AI systems describes:

Correct. Check-box compliance — reviewers approving algorithmic outputs without substantive independent evaluation, typically due to time pressure and high caseloads — provides accountability in form without accountability in substance.

Incorrect. Check-box compliance describes the phenomenon in which human reviewers fulfill the formal requirement of oversight without performing substantive evaluation — giving the appearance of accountability without its substance.

11. For human oversight of AI systems to be meaningful rather than nominal, research indicates reviewers need all of the following EXCEPT:

Correct. The research on effective oversight identifies the three requirements as: uncertainty information, adequate time, and access to independent information — not technical AI expertise per se. Domain expertise in the subject matter (medicine, law, etc.) is more relevant than knowledge of machine learning.

Incorrect. The three identified requirements for meaningful oversight are: calibrated uncertainty information, adequate time, and access to information the algorithm did not use. Technical AI engineering expertise is not among them — domain expertise in the decision area is more relevant.

12. The SyRI court ruling in the Netherlands (2020) found the system violated the right to private life under the European Convention on Human Rights. A key factor in the court's reasoning was:

Correct. The court's reasoning centered on opacity — the government could not explain the algorithm's workings even to a judge — and the absence of any contestation mechanism for affected individuals, both prerequisites for lawful automated processing of sensitive personal data.

Incorrect. The ruling turned on the government's inability to explain the algorithm to the court and the lack of any mechanism for scored individuals to challenge their classification — structural failures of transparency and recourse, not evidence of specific wrong outcomes.

13. The FTC's 2021 AI policy guidance titled "Aiming for Truth, Fairness, and Equity" indicated the agency's intent to use which existing legal authority to address AI harms?

Correct. The FTC's 2021 guidance signaled that using biased data, failing to test for disparate impact, and deploying AI without meaningful oversight could constitute "unfair or deceptive" practices under Section 5 of the FTC Act — an existing authority requiring no new legislation.

Incorrect. The FTC guidance specifically invoked Section 5 of the FTC Act's prohibition on unfair or deceptive practices — existing authority the agency indicated it would use against egregious AI deployments without waiting for new AI-specific legislation.

14. Amazon's internal résumé screening tool was discovered to have a gender bias problem in what year, and was subsequently disbanded in what year?

Correct. Amazon's engineers discovered the bias problem in 2015 and the project was disbanded in 2017 — the fact becoming public only through the Reuters investigation in October 2018.

Incorrect. Amazon's engineers discovered the problem in 2015, disbanded the project in 2017, and the whole sequence became public only through a Reuters investigation published in October 2018.

15. Which of the following best describes the core ethical challenge that unifies all four lessons of this module?

Correct. This is the throughline across all four lessons: the value judgments embedded in training data, feature selection, objective functions, fairness definitions, and oversight mechanisms are made once by specific people and then executed at enormous scale — creating a structural mismatch between the deliberateness of design and the reach of deployment.

Incorrect. The unifying challenge is that AI systems encode human value judgments at design time and execute them at scale — making those initial choices uniquely consequential and difficult to contest after the fact. This is neither purely a legal problem nor purely a training problem; it is a structural feature of how automated decision systems work.