Module 3 · Lesson 1

What Bias Actually Means

How the data we collect encodes the history we'd rather forget — and why "objective" is the most dangerous word in AI.

If an algorithm is trained only on past decisions, whose past is it learning from?

In May 2016, investigative journalists at ProPublica published a piece that would rattle the criminal justice world. They had obtained COMPAS scores — risk assessments generated by an algorithm made by a company called Northpointe — for more than 7,000 people arrested in Broward County, Florida. Then they followed up two years later to see who actually reoffended.

The results were stark. Black defendants were nearly twice as likely as white defendants to be falsely flagged as future criminals. White defendants were more likely to be incorrectly labeled low risk and go on to commit new crimes. Judges saw the algorithm's output only as a clean number, a score from one to ten, and yet it was shaping who went home and who went to prison.

Northpointe insisted their tool was accurate. ProPublica said it was biased. Both were telling the truth. That paradox is where the study of algorithmic bias begins.

What Is Algorithmic Bias?

Bias in AI systems is not a bug in the traditional sense. It is not a programmer typing the wrong symbol. It is a pattern inherited from data — data that was generated by human beings operating inside societies with documented histories of unequal treatment. When an algorithm is trained on past decisions, it learns to replicate those decisions, including the prejudices embedded in them.

The word "bias" in everyday language implies intent: someone is biased when they consciously or unconsciously prefer one group over another. In machine learning, bias often requires no such intent. A hiring algorithm trained on a decade of résumés from a company that historically hired mostly men will learn that "maleness" correlates with success — not because any engineer chose that, but because the historical data said so.

Bias in AI can be loosely divided into three entry points: the data used for training, the design choices made during development, and the deployment context in which the system is applied. Understanding which type is operating in any given situation is essential to fixing it.

Historical bias: Arises when the real world at the time data was collected was already skewed. Even a perfect snapshot of a biased society will encode that bias into any model trained on it.

Representation bias: Occurs when certain groups are underrepresented in training data, causing the model to perform worse on those groups, not from malice but from incomplete information.

Measurement bias: Emerges when the proxy used to measure a concept is not neutral. COMPAS measured "risk of reoffending" using prior arrests, but arrest rates themselves reflect unequal policing.

The COMPAS Paradox in Detail

Northpointe's rebuttal to ProPublica was mathematically coherent. They demonstrated that COMPAS was calibrated: among people who scored a 7, roughly the same percentage of Black and white defendants went on to reoffend. From one angle, that's fairness — the score means the same thing regardless of race.

But ProPublica was measuring something different: error rates across groups. A Black defendant who would not reoffend was far more likely to be labelled high risk than a white defendant who would not reoffend. From this angle, the tool was inflicting different costs on different people for the same outcome.

A landmark 2016 paper by Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan proved something that made the whole debate feel more like tragedy than scandal: when base rates of the outcome differ between groups, as they do when one group faces higher rates of policing and arrest, no imperfect predictor can be calibrated across groups and also have equal error rates across them. Short of perfect prediction, you cannot satisfy both definitions of fairness at the same time. You must choose which unfairness to accept.
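The arithmetic behind the paradox can be checked with a toy example. The sketch below uses invented numbers, not COMPAS data: a two-level score ("high" or "low") that is calibrated by construction, so 60% of high scorers and 20% of low scorers reoffend in both groups. The only difference between the hypothetical groups A and B is how many people land in each score level, which gives them different base rates.

```python
from fractions import Fraction

def error_rates(n_high, n_low, p_high=Fraction(3, 5), p_low=Fraction(1, 5)):
    """Return (base_rate, false_positive_rate, false_negative_rate) for one group.

    The score is calibrated by assumption: p_high of the people scored "high"
    reoffend and p_low of the people scored "low" reoffend, in every group.
    """
    reoffend_high = n_high * p_high        # flagged and did reoffend
    reoffend_low = n_low * p_low           # not flagged but did reoffend
    clean_high = n_high - reoffend_high    # flagged but did not reoffend
    clean_low = n_low - reoffend_low       # not flagged and did not reoffend
    base_rate = (reoffend_high + reoffend_low) / (n_high + n_low)
    fpr = clean_high / (clean_high + clean_low)
    fnr = reoffend_low / (reoffend_high + reoffend_low)
    return base_rate, fpr, fnr

# Hypothetical groups of 1,000 people each; only the score distribution differs.
group_a = error_rates(n_high=500, n_low=500)
group_b = error_rates(n_high=250, n_low=750)

for name, (base, fpr, fnr) in (("A", group_a), ("B", group_b)):
    print(f"group {name}: base rate {float(base):.2f}, "
          f"false-positive rate {float(fpr):.2f}, "
          f"false-negative rate {float(fnr):.2f}")
```

Under these assumptions the score "means the same thing" in both groups, yet a person in group A who would not reoffend is flagged about one time in three, against about one time in seven in group B. Both sides of the COMPAS dispute can point to a real property of the same score.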

Why This Matters

The COMPAS case is not a story about one bad algorithm. It is a demonstration that choosing a fairness metric is itself an ethical and political act. There is no neutral option. The question "which fairness?" is always also the question "whose interests take priority?"

Data as a Mirror — Not a Window

A common misconception about AI is that it discovers objective truth from data. More accurately, it reflects the world that generated the data. In 2015, Google Photos' image recognition system automatically labelled photos of two Black people as "gorillas." The engineers had not written racist code. The most widely cited explanation is that the training data dramatically underrepresented dark-skinned faces, leaving the system unreliable on skin tones it had rarely seen.

Google's response — reportedly blocking the word "gorilla" as a label entirely, a fix still in place years later — illustrated the difficulty of remediation. You cannot just remove the bad output; you must address the underlying cause, and the underlying cause is often the distribution of the world that produced the data in the first place.

This is why researchers like Joy Buolamwini at the MIT Media Lab introduced the term "the coded gaze" — the idea that AI systems encode the perspective of whoever built them and whoever generated the majority of the training data, often at the expense of those who did not.

Key Insight

Bias is not a property of the algorithm in isolation. It is a relationship between the algorithm, the data it was trained on, the world that generated that data, and the specific task being performed. Fixing it requires understanding all four elements together.

Module 3 · Quiz 1

What Bias Actually Means

Five questions — select the best answer for each.

1. The ProPublica investigation into COMPAS found which primary disparity?

Correct. ProPublica's 2016 analysis of Broward County data showed Black defendants faced nearly double the false-positive rate — labelled high risk when they would not reoffend — compared to white defendants.

Not quite. The central finding was a racial disparity in false-positive rates: Black defendants were mislabelled as high risk at nearly twice the rate of white defendants.

2. What is "historical bias" in the context of algorithmic systems?

Correct. Historical bias reflects the fact that data is a snapshot of a world that may already be deeply unequal. Training on that data faithfully reproduces the inequalities, even without any malicious intent.

Not quite. Historical bias refers to bias inherited from the real-world conditions that existed when the data was generated — not a technical coding error or a deliberate choice.

3. The "COMPAS paradox" demonstrated that when base rates differ between groups, it is mathematically impossible to simultaneously satisfy which two conditions?

Correct. The Kleinberg, Mullainathan, and Raghavan paper proved that when outcome base rates differ between groups, you cannot achieve both calibration (the score means the same thing across races) and equal false-positive and false-negative rates simultaneously.

Not quite. The mathematical proof by Kleinberg and colleagues showed the specific conflict is between score calibration across groups and equal error rates — you cannot satisfy both when base rates differ.

4. Google Photos' 2015 error of labelling Black people as "gorillas" is best explained by which type of bias?

Correct. The most widely cited explanation is that the system had seen too few dark-skinned faces during training to classify them reliably, a classic case of representation bias.

Not quite. This case is primarily about representation bias: the training dataset dramatically underrepresented dark-skinned faces, causing systematic errors on that group.

5. Joy Buolamwini's concept of "the coded gaze" refers to which phenomenon?

Correct. Buolamwini's "coded gaze" concept captures how the viewpoint of those who build AI systems — and whose images, voices, and behaviors dominate the training data — gets embedded into the system's outputs.

Not quite. The coded gaze is Buolamwini's term for how AI reflects the perspectives of those who built it and provided most of the data, systematically disadvantaging those who are underrepresented.

Module 3 · Lab 1

Diagnosing Bias Types

Identify and explain bias types in real scenarios with your AI discussion partner.

Your Task

You will be presented with real-world AI scenarios. For each one, identify which type of bias is most likely operating — historical, representation, or measurement — and explain your reasoning. Your AI partner will probe your thinking and provide feedback.

Complete at least 3 exchanges to finish this lab.

Start by describing a scenario where you think bias could enter an AI hiring tool, and identify which type of bias it represents and why.

Bias Diagnostics Lab

L1 · Bias Types

Welcome to the Bias Diagnostics Lab. I'm here to help you sharpen your ability to identify and explain different types of algorithmic bias. Let's start with something concrete: describe a scenario where you think bias could enter an AI hiring tool, identify which type of bias it represents, and explain your reasoning. There's no single right answer — I'm interested in how you're thinking about it.

Module 3 · Lesson 2

Facial Recognition and the Limits of Seeing

When the technology that polices our faces can't reliably read them — and whose faces it fails most.

If a tool used to identify suspects is wrong 35% of the time on Black women and 1% of the time on white men, is it one tool or two?

On January 9, 2020, Robert Williams was standing in his driveway in Farmington Hills, Michigan, when two Detroit police officers pulled up and told him he was under arrest. They had a warrant. His wife and daughters watched as he was handcuffed and driven away. He spent the night in jail before learning the charge: shoplifting watches from a Shinola store in 2018.

Williams had never been in that store. Detroit police had fed surveillance footage into a facial recognition system that matched it to a driver's license photo in a database. The match was wrong. Investigators had not sought any corroborating evidence before seeking the warrant. Williams was the first documented American to be arrested based solely on a false facial recognition match.

He was not the only one. At least two other Black men, Michael Oliver in Detroit and Nijeer Parks in New Jersey, were wrongfully arrested in 2019 after faulty facial recognition matches, in cases that became public after Williams's. Parks spent ten days in jail. All three men were Black. Independent tests have repeatedly found facial recognition systems to be less accurate on darker-skinned faces.

The Gender Shades Study

In 2018, MIT researcher Joy Buolamwini and data scientist Timnit Gebru published "Gender Shades," a landmark audit of three commercial facial analysis systems sold by IBM, Microsoft, and Face++. They tested each system's ability to classify the gender of faces across a carefully stratified dataset of 1,270 faces ranging from dark-skinned women to light-skinned men.

The results were alarming. On light-skinned men, all three systems performed near-perfectly. On dark-skinned women, error rates reached as high as 34.7%. The performance gap was not a minor calibration issue — it was a systematic failure concentrated almost entirely on the group most underrepresented in the training data: darker-skinned female faces.

After the paper's publication, IBM and Microsoft both significantly improved their systems' performance on darker-skinned faces. Face++ showed smaller improvement. The study demonstrated that independent auditing — not vendor self-reporting — is often the mechanism through which these failures come to light.

Facial recognition: A category of AI that attempts to verify or identify individuals by comparing facial features in an image against a database. Unlike face detection (is there a face?), recognition asks: whose face is this?

Differential performance: When an AI system's accuracy varies significantly across demographic subgroups, often because those subgroups were not equally represented in training data.

Audit: An independent evaluation of an AI system's performance and fairness properties, ideally conducted by parties without a financial stake in a positive result.
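Differential performance only becomes visible when results are broken out by subgroup. The sketch below shows that disaggregation step on invented audit records; the group names echo the Gender Shades categories, but the counts are hypothetical and chosen only to make the effect easy to see.

```python
from collections import defaultdict

def error_rate_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for group, truth, predicted in records:
        totals[group] += 1
        if truth != predicted:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical audit set: 100 lighter-skinned men, 20 darker-skinned women.
records = (
    [("lighter-skinned men", "M", "M")] * 99
    + [("lighter-skinned men", "M", "F")] * 1
    + [("darker-skinned women", "F", "F")] * 13
    + [("darker-skinned women", "F", "M")] * 7
)

overall = sum(truth != predicted for _, truth, predicted in records) / len(records)
by_group = error_rate_by_group(records)

print(f"overall error rate: {overall:.1%}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.1%}")
```

With these made-up counts the headline error rate is under 7%, while the error rate for the smaller group is 35%. A single aggregate accuracy figure can be true and still hide exactly the failure an audit is meant to find.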

High Stakes, Low Accuracy: Policing Use Cases

Facial recognition has been deployed in policing in dozens of U.S. cities and internationally, often without public disclosure or legislative approval. The Washington D.C. area, New Orleans, New York City, and many others have used commercial systems from vendors including Clearview AI, Amazon's Rekognition, and NEC. These systems are typically used to generate "leads" — potential matches — rather than definitive identifications. But in practice, as the Robert Williams case shows, a lead can become an arrest warrant with insufficient scrutiny.

A 2019 NIST (National Institute of Standards and Technology) study tested 189 facial recognition algorithms submitted by commercial vendors. It found that most algorithms produced false positives 10 to 100 times more often for African-American and Asian faces than for Caucasian faces, depending on the algorithm. False-positive rates, where the system incorrectly identifies someone as a match, were highest for African-American women. In a criminal justice context, a false positive is not a minor inconvenience. It can mean arrest, incarceration, and lasting reputational damage.
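To see why a tenfold difference in false-positive rates matters when a photo is searched against a large database, consider a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that every comparison is independent and uses invented per-comparison rates and an invented gallery size; none of these figures come from the NIST study.

```python
def chance_of_false_match(false_match_rate, gallery_size):
    """Probability that one search returns at least one false match.

    Simplifying assumption: each of the gallery_size comparisons is
    independent and has the same per-comparison false match rate.
    """
    return 1 - (1 - false_match_rate) ** gallery_size

GALLERY_SIZE = 10_000  # hypothetical number of photos searched

baseline = chance_of_false_match(1e-5, GALLERY_SIZE)
ten_times_worse = chance_of_false_match(1e-4, GALLERY_SIZE)

print(f"per-comparison rate 1 in 100,000: {baseline:.0%} chance of a false match")
print(f"per-comparison rate 1 in 10,000:  {ten_times_worse:.0%} chance of a false match")
```

Under these assumptions, a per-comparison error rate that sounds negligible in both cases turns into roughly a one-in-ten chance of a false lead for one group and closer to a two-in-three chance for the other.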

Amid findings like these, several cities banned or restricted government use of facial recognition: San Francisco (2019), Boston (2020), Minneapolis (2021), and others. The European Union's AI Act, finalized in 2024, prohibits most law-enforcement use of real-time facial recognition in public spaces, allowing only narrow exceptions.

Case Note — Amazon Rekognition

In 2018, the ACLU tested Amazon's Rekognition tool by running photos of all 535 members of the U.S. Congress against a database of 25,000 publicly available arrest photos. The system produced 28 false matches. Disproportionately, those misidentified were members of Congress who were people of color — despite people of color making up only 20% of the congressional membership tested. Amazon disputed the test methodology, saying the confidence threshold was set too low.

The Consent Problem

Beyond accuracy, facial recognition raises a distinct ethical issue: the collection and use of biometric data without consent. Clearview AI built a database of more than three billion facial images by scraping social media platforms — Facebook, Instagram, LinkedIn, Twitter — without permission from users or platforms. Law enforcement agencies could then upload a photo of a suspect and receive a list of potential matches with links to public posts.

In 2022, Clearview settled a lawsuit brought in Illinois under the state's Biometric Information Privacy Act, widely regarded as the most stringent biometric privacy law in the United States, agreeing to stop selling access to its database to most private businesses. A separate class-action settlement in Illinois, valued at roughly $52 million, followed in 2024. Canada, Australia, and multiple EU member states also found Clearview's practices to violate their privacy laws and ordered data deletion. The episode crystallized a broader question: even if facial recognition were perfectly accurate, should your face be searchable by anyone with a subscription?

Core Tension

Facial recognition sits at the intersection of two distinct ethical failures: a technical failure (differential accuracy by race and gender) and a consent failure (biometric data collected and used without meaningful permission). Solving one does not solve the other. A perfectly accurate system deployed without consent is still an ethical violation.

Module 3 · Quiz 2

Facial Recognition and the Limits of Seeing

Five questions — select the best answer for each.

1. Robert Williams' 2020 wrongful arrest is significant because it was the first documented case of what?

Correct. Williams' case is documented as the first known instance of a wrongful arrest in the United States where the sole basis for the arrest warrant was an incorrect facial recognition match.

Not quite. Williams is documented as the first American known to have been arrested based solely on a false facial recognition match — no corroborating evidence had been sought before the warrant was issued.

2. In the Gender Shades study by Buolamwini and Gebru, which demographic group showed the highest error rates in commercial facial analysis systems?

Correct. The Gender Shades study found that dark-skinned women — the group most underrepresented in training data — experienced error rates as high as 34.7%, compared to near-perfect performance on light-skinned men.

Not quite. The highest error rates were concentrated on dark-skinned women, the group most underrepresented in training datasets, with errors reaching 34.7% on some systems.

3. A 2019 NIST study of 189 commercial facial recognition algorithms found that most performed how much worse on African-American and Asian faces compared to Caucasian faces?

Correct. The NIST study, one of the most comprehensive independent audits of commercial facial recognition, found false-positive rates 10 to 100 times higher for African-American and Asian faces than for Caucasian faces across most algorithms tested.

Not quite. The NIST finding was far more severe: most algorithms produced false positives 10 to 100 times more often for African-American and Asian faces, not merely slightly more often.

4. How did Clearview AI build its database of over three billion facial images?

Correct. Clearview AI scraped billions of images from Facebook, Instagram, LinkedIn, Twitter, and other platforms without users' or platforms' permission — a practice multiple jurisdictions subsequently found to violate privacy laws.

Not quite. Clearview AI's database was built by scraping images from social media platforms without consent — an approach that resulted in regulatory action and a $52 million settlement in Illinois.

5. According to the lesson, which of the following best characterizes the "consent problem" with facial recognition that is distinct from its accuracy problem?

Correct. The consent problem is conceptually separate from accuracy: even if facial recognition worked perfectly, deploying it using data collected without meaningful consent — as Clearview AI did — raises independent ethical and legal violations.

Not quite. The lesson's key insight is that the consent problem is separate from accuracy: a perfectly accurate system using biometrically identifying data collected without consent still constitutes an ethical violation.

Module 3 · Lab 2

Auditing Facial Recognition Claims

Evaluate vendor claims and audit methodologies with your AI discussion partner.

Your Task

Vendors selling facial recognition tools often make claims about their systems' accuracy and fairness. In this lab, you'll practice critically evaluating those claims by applying what you've learned about audit methodology, differential performance, and consent.

Complete at least 3 exchanges to finish this lab.

Imagine a company claims their facial recognition system is "98% accurate and fair across all demographics." What questions would you ask before accepting that claim? Start with at least two specific questions.

Facial Recognition Audit Lab

L2 · Accuracy & Consent

Welcome to the Facial Recognition Audit Lab. Vendors regularly make bold claims about their systems' accuracy and fairness — and those claims deserve scrutiny. Your challenge: a company is pitching their facial recognition system to a city police department with the claim that it's "98% accurate and fair across all demographics." What questions would you ask before accepting that claim? Give me at least two specific questions you'd want answered.

Module 3 · Lesson 3

Hiring, Credit, and the Automated Gatekeepers

When algorithms decide who gets a job, a loan, or an opportunity — and the patterns they've learned from history.

If past promotion decisions discriminated against women, and an AI learns to predict "success" from those decisions, has the discrimination been automated or amplified?

In 2014, Amazon began building an AI recruiting tool that the company hoped would automate the search for talent. Engineers trained it on a decade of résumé submissions and hiring decisions — the inputs and outputs of Amazon's own past hiring process. By 2015, the system was operational. By 2017, it had been scrapped.

The reason: the system had learned to penalize résumés that included the word "women's" — as in "women's chess club" or "women's college." It also downgraded graduates of all-female colleges. The tool was not told to discriminate. It had observed that men were hired at higher rates and inferred that signals of maleness correlated with hireability. It was doing exactly what it was designed to do — finding patterns in the data. The patterns it found were the legacy of a decade of biased decisions.

Amazon quietly dissolved the team. Reuters broke the story in October 2018. The company said the tool had never been used in final hiring decisions, though the degree to which it influenced candidate screening remained disputed.

Why Hiring Algorithms Fail Women and Minorities

Amazon's case is a textbook example of historical bias feeding forward. The company's historical hiring data encoded the gender imbalance of the tech industry, and the model faithfully reproduced it. But the problem extends beyond one company. In 2019, researchers at the University of Toronto analyzed a widely used pre-employment screening tool and found that it consistently scored candidates with "White-sounding" names higher than equally qualified candidates with "Black-sounding" names. The finding echoed a famous audit study by economists Marianne Bertrand and Sendhil Mullainathan, first circulated in 2003, which sent otherwise equivalent résumés with racially coded names to employers and found that "White-sounding" names drew about 50% more callbacks.

The issue compounds when algorithms score on proxies that correlate with protected characteristics. Credit scoring systems that penalize applicants without a credit history disproportionately affect recent immigrants, young adults, and communities where formal banking access has historically been limited. The variable "no credit history" is not race — but its distribution in the population is shaped by racially differentiated access to banking infrastructure.

This is what researchers call proxy discrimination: when a facially neutral variable serves as a statistical stand-in for a protected characteristic. Zip code, school attended, employment gap, credit history — each can be predictively valid and systematically unfair at the same time.

Proxy discrimination: When a facially neutral variable, such as zip code or credit history, correlates so strongly with a protected characteristic (race, gender) that using it produces discriminatory outcomes equivalent to using that characteristic directly.

Feedback loop: A cycle in which an AI system's outputs influence the data that future models are trained on, causing initial biases to compound over time rather than attenuate.

FCRA / ECOA: U.S. federal laws governing credit decisions. The Equal Credit Opportunity Act prohibits discrimination on protected bases in lending, and the Fair Credit Reporting Act regulates how consumer credit information is reported and used. Both apply to algorithmic systems as well as human loan officers.
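Proxy discrimination is easy to reproduce in miniature. In the sketch below, the decision rule never sees group membership; it looks only at zip code. The applicants, group labels, and zip codes are all invented, with residence deliberately skewed by group to stand in for a history of segregation.

```python
# Hypothetical applicants as (group, zip_code). The rule below never reads `group`.
applicants = (
    [("group_x", "zip_1")] * 80 + [("group_x", "zip_2")] * 20
    + [("group_y", "zip_1")] * 20 + [("group_y", "zip_2")] * 80
)

def approve(zip_code):
    """A 'group-blind' rule: approve applicants from zip_1 only."""
    return zip_code == "zip_1"

def approval_rate(group):
    """Share of a group's applicants that the zip-code rule approves."""
    decisions = [approve(zip_code) for g, zip_code in applicants if g == group]
    return sum(decisions) / len(decisions)

rates = {group: approval_rate(group) for group in ("group_x", "group_y")}
for group, rate in rates.items():
    print(f"{group}: {rate:.0%} approved")
```

Removing the protected attribute from the inputs changes nothing here: because zip code tracks group membership, the rule approves 80% of one group and 20% of the other without ever naming either.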

Credit Scoring and the Geometry of Exclusion

In 2019, the U.S. Department of Housing and Urban Development filed a complaint against Facebook alleging that its ad-targeting algorithms were facilitating housing discrimination. Advertisers could show housing listings only to users Facebook classified as likely to be interested, but those classification signals included proxies for race, national origin, and religion. Settlements that followed, first with civil rights groups in 2019 and later with the U.S. Department of Justice in 2022, required Facebook to overhaul its ad system for housing, employment, and credit categories.

The same year, the New York Department of Financial Services investigated Apple's credit card — the Apple Card, issued by Goldman Sachs — after David Heinemeier Hansson, the creator of Ruby on Rails, tweeted that he had received a credit limit 20 times higher than his wife despite their sharing all assets. Dozens of similar complaints followed. Goldman Sachs maintained that its algorithm did not use gender as an input. The regulator found no legal violation — but the investigation exposed the opacity of algorithmic credit decisions and the inadequacy of existing disclosure requirements.

When asked to explain its decisions, Goldman Sachs could not provide individual applicants with a meaningful explanation of what factors drove their scores. This is not unique to Goldman. The gradient-boosted decision tree models common in credit scoring are not designed for interpretability. The right to explanation, often attributed to Europe's GDPR and partially addressed in the U.S. by adverse action notices, is difficult to satisfy in practice when a model's reasoning cannot be traced to a few human-readable factors.

The Feedback Loop Problem

When biased models make decisions — who gets a loan, who gets interviewed — those decisions create the next round of training data. People denied loans don't appear in the "successful borrower" dataset. People not hired don't appear in the "successful employee" dataset. The model's biases become invisible because the evidence that would reveal them was never generated.
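The mechanism can be shown with a deliberately stripped-down simulation. Everything in the sketch below is an assumption made for illustration: two hypothetical groups repay loans at the same true rate, the lender starts with a biased estimate for group B inherited from historical data, and outcomes are observed only for groups that actually receive loans.

```python
TRUE_REPAY_RATE = {"A": 0.8, "B": 0.8}  # identical in reality (assumed)
estimate = {"A": 0.8, "B": 0.6}         # biased starting belief about group B
THRESHOLD = 0.7                         # lend only if estimated repayment >= 70%

history = []
for round_number in range(5):
    for group in estimate:
        if estimate[group] >= THRESHOLD:
            # Loans are made, outcomes are observed, the estimate is refreshed.
            estimate[group] = TRUE_REPAY_RATE[group]
        # Otherwise no loans are made, no outcomes exist, and the
        # estimate is never confronted with evidence.
    history.append(dict(estimate))

for round_number, snapshot in enumerate(history, start=1):
    print(f"round {round_number}: {snapshot}")
```

The estimate for group B never moves, in any round, because the data that would correct it is never generated. Real systems are noisier than this sketch, but the structural problem is the same: a model cannot learn from outcomes it prevented.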

Audits, Explainability, and Regulatory Responses

New York City Local Law 144, which took effect in 2023, was the first law in the United States to regulate automated employment decision tools specifically. It requires employers using AI hiring tools to conduct annual bias audits by independent third parties and to disclose audit results publicly. Applicants must be notified when AI is being used. Enforcement has been slow, and critics have noted that employers select and pay their auditors themselves, which creates conflicts of interest. Still, the law represents the first substantive legislative effort to impose accountability on automated hiring.
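Bias audits of selection tools commonly report an impact ratio: each group's selection rate divided by the selection rate of the most-selected group. The sketch below computes that ratio for invented counts. It is a minimal illustration of the metric, not the audit procedure that Local Law 144 or any regulator prescribes, and the group names and numbers are hypothetical.

```python
def impact_ratios(selected, total):
    """Selection rate of each group divided by the highest group's selection rate."""
    rates = {group: selected[group] / total[group] for group in total}
    highest = max(rates.values())
    return {group: rates[group] / highest for group in rates}

# Hypothetical screening results: candidates advanced vs. candidates scored.
selected = {"group_a": 60, "group_b": 30}
total = {"group_a": 200, "group_b": 150}

ratios = impact_ratios(selected, total)
for group, ratio in ratios.items():
    print(f"{group}: impact ratio {ratio:.2f}")
```

Here group_b is advanced at two-thirds the rate of group_a. A common rule of thumb in U.S. employment contexts treats ratios below 0.8 as a signal worth investigating, though a ratio alone does not establish why the gap exists.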

The EU AI Act, passed in 2024, classifies AI systems used in employment, education, essential services, and credit scoring as "high risk," requiring conformity assessments, ongoing monitoring, human oversight mechanisms, and detailed documentation of training data before deployment. These requirements represent a significant structural shift: from voluntary vendor standards to binding legal obligations.

The Core Problem

When algorithms replace human gatekeepers, they do not eliminate human bias — they often amplify and entrench it by making it faster, cheaper, and harder to see. The illusion of objectivity is not a side effect. It is, for many deployers, a feature: a way to disclaim responsibility for decisions that were always going to be made.

Module 3 · Quiz 3

Hiring, Credit, and the Automated Gatekeepers

Five questions — select the best answer for each.

1. Amazon's AI recruiting tool learned to discriminate against women primarily because of what?

Correct. Amazon's tool learned that signals of maleness correlated with hiring success because men had been hired at higher rates historically. It was not programmed to discriminate — it learned to from biased past decisions.

Not quite. The tool's bias was not programmed in — it was learned from training data: a decade of Amazon's own hiring decisions, which reflected the broader gender imbalance of the tech industry.

2. What is "proxy discrimination" as the term is used in algorithmic fairness?

Correct. Proxy discrimination occurs when neutral-seeming variables — zip code, credit history, school name — are so correlated with protected characteristics that their use produces discriminatory outcomes even without any explicit reference to race, gender, or other protected traits.

Not quite. Proxy discrimination refers specifically to a facially neutral variable that correlates so tightly with a protected characteristic (like race or gender) that using it generates discriminatory results — without ever naming the protected trait.

3. The Apple Card / Goldman Sachs investigation by New York regulators in 2019 revealed what significant limitation?

Correct. A key outcome of the investigation was the exposure of algorithmic opacity: Goldman Sachs couldn't clearly explain what drove individual credit decisions — a serious problem given legal requirements around adverse action notices and the broader right to explanation.

Not quite. The investigation's notable finding was Goldman Sachs' inability to provide meaningful individual explanations for credit decisions — highlighting how complex ML models fail to meet transparency obligations that credit laws assume are achievable.

4. New York City Local Law 144, effective 2023, was significant because it was the first U.S. law to do what?

Correct. Local Law 144 requires annual bias audits by independent third parties, public disclosure of audit results, and notification to applicants when AI is used in hiring — the first such regulatory requirements in the U.S.

Not quite. Local Law 144's key provisions are annual independent bias audits with public disclosure and applicant notification — not a ban or pre-approval requirement.

5. A "feedback loop" in the context of algorithmic hiring bias refers to which problem?

Correct. The feedback loop problem is that biased decisions — who gets hired, who gets a loan — determine who appears in future training data. People excluded by a biased model don't generate the "success" data that would correct the model's errors.

Not quite. The feedback loop problem is that a biased model's decisions shape future training data: those denied opportunities don't appear in success datasets, so the model's biases become self-reinforcing rather than self-correcting.

Module 3 · Lab 3

Proxy Variables and Feedback Loops

Trace bias pathways through hiring and credit systems with your AI discussion partner.

Your Task

Proxy discrimination and feedback loops are subtle mechanisms. In this lab, you'll work through scenarios to identify proxy variables, trace how feedback loops form, and think through what interventions could break these cycles.

Complete at least 3 exchanges to finish this lab.

A fintech company is building a credit-scoring model for small business loans. They plan to use the following features: zip code, years of formal education, number of prior loans, LinkedIn profile completeness, and average daily account balance over the past year. Which of these variables might function as proxies for race or socioeconomic background? Explain your reasoning for at least two of them.

Proxy & Feedback Loop Lab

L3 · Credit & Hiring

Welcome to the Proxy and Feedback Loop Lab. Let's think carefully about how seemingly neutral data features can carry discriminatory payload. A fintech company is building a credit-scoring model for small business loans using these features: zip code, years of formal education, number of prior loans, LinkedIn profile completeness, and average daily account balance over the past year. Which of these might function as proxies for race or socioeconomic background? Walk me through your reasoning on at least two of them.

Module 3 · Lesson 4

Mitigation, Accountability, and the Limits of Technical Fixes

Debiasing techniques can help. But who is responsible when they're not enough — and what does structural fairness actually require?

If you can prove an algorithm discriminates but cannot determine why, and cannot easily fix it without degrading its usefulness — what do you do?

In February 2020, a Dutch court ordered the government of the Netherlands to immediately halt a fraud-detection system called SyRI — System Risk Indication. SyRI was an algorithm that combined data from seventeen government databases — tax records, employment data, housing registers, benefit claims — to generate risk scores for citizens suspected of welfare fraud. The system had been deployed in fourteen municipalities, overwhelmingly in low-income and ethnically diverse neighborhoods.

The court found SyRI violated the European Convention on Human Rights — specifically the right to private life under Article 8. The government had not made the risk model public. Citizens had no way to know they were being scored, no access to what data was used, and no clear mechanism to challenge or correct errors. The court ruled that opaque algorithmic surveillance of disadvantaged populations, without meaningful transparency or appeal rights, crossed a fundamental legal line.

It was among the first court rulings anywhere in the world to invoke human rights law directly against an automated government decision system — and it anticipated much of the regulatory architecture that would follow.

Technical Approaches to Bias Mitigation

Researchers and engineers have developed a range of technical approaches to reducing bias in AI systems. These generally fall into three categories based on where in the pipeline they intervene:

Pre-processing interventions modify or rebalance training data before the model is trained. Techniques include resampling underrepresented groups, synthetic data generation (creating additional examples of underrepresented cases), and removing or transforming proxy variables. The risk is that removing proxies may degrade predictive performance, and that synthesizing data for underrepresented groups may introduce artifacts.
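One way to picture a pre-processing step is reweighing, a close cousin of the resampling described above: instead of duplicating or dropping rows, each training example gets a weight so that group membership and outcome look statistically independent in the weighted data. The sketch below is illustrative only. The toy data and the function names are hypothetical, and the weight formula P(group) × P(label) / P(group, label) is one common choice, not the only one.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return one weight per example: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Hypothetical toy data: group "a" received the positive label far more often.
groups = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

def weighted_positive_rate(group):
    """Positive-label rate within one group after applying the weights."""
    num = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    den = sum(w for g, w in zip(groups, weights) if g == group)
    return num / den

for g in ("a", "b"):
    print(g, round(weighted_positive_rate(g), 3))
```

In the raw toy data, group "a" has a positive rate of 4/6 and group "b" has 1/4; after weighting, both come out at 0.5. The weights would then be passed to a learner that supports per-example weights. Note that this balances the labels the model sees; it does not remove proxy information carried by the features themselves.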

In-processing interventions modify the learning algorithm itself to include a fairness constraint — effectively penalizing the model during training if its predictions diverge too greatly across demographic groups. This requires specifying in advance which fairness metric to optimize for — and as the COMPAS paradox showed, different metrics can be mutually incompatible.
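A fairness constraint of this kind can be sketched as an extra term in the training loss. The example below is a minimal illustration under stated assumptions, not a production method: it trains a one-feature logistic regression by gradient descent and adds a penalty equal to `fairness_weight` times the squared difference in average score between two groups (a demographic-parity style metric chosen here purely for illustration). The synthetic data, the function names, and the penalty weight of 2.0 are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean_score_gap(scores, groups):
    """Mean score of group 0 minus mean score of group 1."""
    s0 = [s for s, g in zip(scores, groups) if g == 0]
    s1 = [s for s, g in zip(scores, groups) if g == 1]
    return sum(s0) / len(s0) - sum(s1) / len(s1)

def train(xs, groups, ys, fairness_weight, steps=1500, lr=0.2):
    """Gradient descent on: mean log-loss + fairness_weight * gap**2."""
    w, b = 0.0, 0.0
    n = len(xs)
    n0 = groups.count(0)
    n1 = n - n0
    for _ in range(steps):
        scores = [sigmoid(w * x + b) for x in xs]
        gap = mean_score_gap(scores, groups)
        grad_w = grad_b = 0.0
        for x, g, y, s in zip(xs, groups, ys, scores):
            d_loss = (s - y) / n  # log-loss gradient with respect to the logit
            d_gap = (1.0 / n0 if g == 0 else -1.0 / n1) * s * (1.0 - s)
            d_total = d_loss + fairness_weight * 2.0 * gap * d_gap
            grad_w += d_total * x
            grad_b += d_total
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical synthetic data: the single feature x is shifted by group,
# so an unconstrained model scores group 0 higher on average.
random.seed(1)
groups = [i % 2 for i in range(200)]
xs = [random.gauss(1.0 if g == 0 else -1.0, 1.0) for g in groups]
ys = [1 if x + random.gauss(0, 1) > 0 else 0 for x in xs]

def gap_after_training(fairness_weight):
    w, b = train(xs, groups, ys, fairness_weight)
    return abs(mean_score_gap([sigmoid(w * x + b) for x in xs], groups))

gap_plain = gap_after_training(0.0)
gap_fair = gap_after_training(2.0)
print(f"score gap without penalty: {gap_plain:.3f}, with penalty: {gap_fair:.3f}")
```

The penalized model ends up with a smaller gap in average scores, at some cost in fit to the labels. Swapping in a different metric (for example, a gap in false-positive rates) would change the penalty term and, as the lesson notes, could pull the model in a different direction.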

Post-processing interventions adjust the model's outputs after the fact, applying different decision thresholds for different groups to equalize error rates. This approach can be applied without retraining the model, but it is controversial because it explicitly treats groups differently, which is the very thing that discrimination law in most jurisdictions formally prohibits.
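To make the threshold idea concrete, the sketch below picks a separate cutoff for each group so that both groups end up with the same false-positive rate. It is a toy illustration: the scores, labels, group names, and the 0.2 target are invented, and equalizing false-positive rates is only one of several error-rate criteria that could be chosen.

```python
def false_positive_rate(scores, labels, threshold):
    """Share of true negatives (label 0) scored at or above the threshold."""
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    return sum(s >= threshold for s in negatives) / len(negatives)

def threshold_for_target_fpr(scores, labels, target_fpr):
    """Lowest observed score usable as a cutoff without exceeding the target."""
    for t in sorted(set(scores)):
        if false_positive_rate(scores, labels, t) <= target_fpr:
            return t
    return max(scores) + 1  # no cutoff meets the target, so flag no one

# Hypothetical risk scores (1 to 10) and outcomes (1 = event occurred).
data = {
    "A": ([2, 3, 4, 5, 6, 7, 8, 9], [0, 0, 0, 0, 1, 0, 1, 1]),
    "B": ([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 0, 1, 1, 1]),
}

# A single shared cutoff of 5 produces unequal false-positive rates.
shared = {g: false_positive_rate(s, y, 5) for g, (s, y) in data.items()}

# Group-specific cutoffs chosen to hit the same target rate.
thresholds = {g: threshold_for_target_fpr(s, y, 0.2) for g, (s, y) in data.items()}
adjusted = {g: false_positive_rate(*data[g], thresholds[g]) for g in data}

print("shared cutoff FPR:", shared)
print("per-group cutoffs:", thresholds)
print("adjusted FPR:", adjusted)
```

With the shared cutoff, group A's false-positive rate is 0.4 and group B's is 0.2. The per-group cutoffs come out as 6 for A and 5 for B, which brings both rates to 0.2. The two groups are now held to visibly different standards, which is exactly the legal tension described above.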

Pre-processing: Bias mitigation applied to training data before model training, including resampling, data augmentation, and proxy variable removal.

In-processing: Bias mitigation integrated into the learning algorithm itself, typically as a fairness constraint added to the optimization objective.

Post-processing: Bias mitigation applied to model outputs after training, such as adjusting classification thresholds differently across demographic groups.

Why Technical Fixes Are Not Sufficient

Every technical mitigation technique shares one limitation: it operates within the system as designed. It cannot question whether the system should exist, whether the task being automated is appropriate to automate at all, or whose interests the chosen fairness metric serves.

In 2019, a study by Obermeyer, Powers, Vogeli, and Mullainathan in Science examined a widely used healthcare algorithm that predicted which patients should be enrolled in high-risk care management programs, which provide extra resources to patients with complex needs. Tools of this kind are applied to an estimated 200 million people across the United States each year. The algorithm was found to systematically under-identify Black patients: at the same risk score, Black patients were considerably sicker than white patients, so less care was effectively allocated to Black patients with equal need.

The root cause: the algorithm used past healthcare spending as a proxy for health need. But spending is not the same as need. Black patients in the United States, due to documented barriers in healthcare access including cost, distrust, and geography, had historically spent less on healthcare for the same conditions. The algorithm had learned that Black patients "needed less," because they had historically received less.

The fix involved changing what the algorithm predicted: switching the target from spending to a more direct measure of illness burden. The researchers estimated that removing the disparity would raise the share of Black patients among those automatically flagged for care management from 17.7% to 46.5%. But it required researchers outside the vendor to identify the problem, and the system had run for years before the audit.
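The mechanism behind this case, a biased target variable, can be reproduced in a few lines. The simulation below is a hypothetical sketch and does not use data or parameters from the study: it assumes two groups with identical illness burden, assumes group B spends 30% less for the same burden, and then compares who gets flagged when patients are ranked by spending versus by burden.

```python
import random

random.seed(0)

# Assumed spending multiplier per group, for illustration only.
ACCESS = {"A": 1.0, "B": 0.7}

patients = []
for i in range(1000):
    group = "A" if i % 2 == 0 else "B"
    burden = random.randint(0, 10)  # stand-in for a count of chronic conditions
    spending = 1000 * burden * ACCESS[group] + random.gauss(0, 300)
    patients.append({"group": group, "burden": burden, "spending": spending})

def share_of_group_b(target, top_n=100):
    """Flag the top_n patients by `target` and return the fraction from group B."""
    flagged = sorted(patients, key=lambda p: p[target], reverse=True)[:top_n]
    return sum(p["group"] == "B" for p in flagged) / top_n

by_spending = share_of_group_b("spending")
by_burden = share_of_group_b("burden")
print(f"group B share when ranking by spending: {by_spending:.2f}")
print(f"group B share when ranking by burden:   {by_burden:.2f}")
```

Both groups are equally sick by construction, yet ranking by spending flags far fewer group B patients than ranking by burden. No protected attribute appears in the ranking; the disparity comes entirely from the choice of target.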

The Accountability Gap

In most jurisdictions, when an algorithmic decision harms someone, it is extraordinarily difficult to establish legal liability. Vendors argue their systems are general-purpose tools and bear no responsibility for how deployers use them. Deployers argue they relied on the vendor's representations of accuracy and fairness. The person harmed — wrongfully arrested, denied a loan, excluded from care — is left with a harm and no clear path to remedy. Closing this accountability gap is among the central challenges of AI governance.

Structural vs. Technical Solutions

The SyRI ruling, NYC Local Law 144, and the EU AI Act all point toward the same conclusion: technical debiasing cannot substitute for structural accountability. Structural solutions require transparency (you must disclose what your system does), contestability (affected individuals must have a meaningful way to challenge decisions), human oversight (consequential decisions must have a human review mechanism), and ongoing monitoring (you must continuously audit performance, not just test before deployment).
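Ongoing monitoring, the last of these requirements, is the one that translates most directly into code. As a minimal sketch, the check below recomputes each group's selection rate for every reporting period, divides it by the highest group's rate, and raises an alert when the ratio drops too low. The batch data and group names are invented, and the 0.8 alert level is an assumption borrowed from the "four-fifths" rule of thumb in US employment practice, not a requirement stated in this lesson.

```python
from collections import defaultdict

def impact_ratios(decisions):
    """decisions: iterable of (group, selected) pairs.

    Returns each group's selection rate divided by the highest group's rate.
    """
    totals, selected = defaultdict(int), defaultdict(int)
    for group, was_selected in decisions:
        totals[group] += 1
        selected[group] += int(was_selected)
    rates = {g: selected[g] / totals[g] for g in totals}
    best = max(rates.values())
    return {g: (rate / best if best > 0 else 1.0) for g, rate in rates.items()}

def audit(batches, alert_below=0.8):
    """Check every reporting period and collect (period, group, ratio) alerts."""
    alerts = []
    for period, decisions in batches.items():
        for group, ratio in impact_ratios(decisions).items():
            if ratio < alert_below:
                alerts.append((period, group, round(ratio, 2)))
    return alerts

def make_batch(selected_x, selected_y, size=100):
    """Hypothetical helper: build one period of decisions for groups x and y."""
    return (
        [("x", True)] * selected_x + [("x", False)] * (size - selected_x)
        + [("y", True)] * selected_y + [("y", False)] * (size - selected_y)
    )

batches = {
    "2024-01": make_batch(40, 38),  # roughly equal selection rates
    "2024-03": make_batch(40, 24),  # group y's rate has drifted down
}
alerts = audit(batches)
print(alerts)
```

Here the January batch passes and the March batch triggers an alert for group y, whose ratio has fallen to 0.6. A system that was only tested before deployment would never have seen the March drift.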

Algorithmic impact assessments — modelled on environmental impact assessments — are now required or recommended in several jurisdictions before high-risk AI systems can be deployed in public services. They require developers to articulate who is affected, what the expected benefits are, what the foreseeable harms are, and what mitigation measures are in place.

Critics from civil rights organizations argue that even these frameworks place too much burden on after-the-fact remediation. The more fundamental question, they argue, is whether some domains — criminal risk assessment, welfare fraud detection, predictive policing — should be automated at all given the current state of the technology and the severity of the harms when it fails. That is not a technical question. It is a political and moral one.

The Deepest Lesson

Algorithmic bias is not primarily a machine learning problem. It is a power problem: who decides what to automate, whose data is used, which fairness metric is chosen, and who bears the cost when the system is wrong. Technical tools can help. But they cannot replace the governance structures, legal accountability mechanisms, and political will required to ensure that automated systems serve everyone equitably.

Module 3 · Quiz 4

Mitigation, Accountability, and Structural Fairness

Five questions — select the best answer for each.

1. The Dutch court's 2020 ruling against SyRI was historically significant because it was among the first to do what?

Correct. The SyRI ruling was a landmark because it applied Article 8 of the European Convention on Human Rights — the right to private life — to an algorithmic government system, finding that opaque automated scoring of disadvantaged populations without transparency or appeal rights violated fundamental rights.

Not quite. The SyRI ruling's significance was in invoking human rights law — specifically Article 8 ECHR — directly against an automated government scoring system on grounds of opacity and the absence of meaningful challenge mechanisms.

2. A "post-processing" bias mitigation technique refers to which approach?

Correct. Post-processing interventions operate on the model's outputs after training — typically by applying different classification thresholds to different groups to equalize false-positive or false-negative rates.

Not quite. Post-processing refers to adjustments made to model outputs after training is complete — such as different decision thresholds per demographic group — as opposed to modifying the data or the training process itself.

3. The 2019 Obermeyer et al. study in Science found that a widely used healthcare algorithm systematically underserved Black patients. What was the root cause?

Correct. The algorithm equated spending with need — but spending reflects access as much as illness. Because Black patients faced greater barriers to care and thus spent less for equivalent conditions, the model systematically underestimated their medical need.

Not quite. The root cause was a flawed proxy: the algorithm used healthcare spending as a stand-in for health need, but spending reflects access barriers as much as illness — causing it to underestimate Black patients' needs.

4. One criticism of post-processing bias mitigation techniques from a legal perspective is that they may violate anti-discrimination law because they do what?

Correct. The legal paradox of post-processing is that equalizing outcomes by applying different thresholds to different groups is itself a form of explicit differential treatment — which anti-discrimination law in many contexts prohibits, even when the intent is to produce more equitable results.

Not quite. The legal concern is that explicitly applying different thresholds to different demographic groups — even to equalize outcomes — is itself a form of differential treatment that anti-discrimination statutes in many jurisdictions formally prohibit.

5. According to the lesson, what distinguishes "structural" solutions to algorithmic bias from "technical" solutions?

Correct. Structural solutions — like the requirements in NYC Local Law 144 and the EU AI Act — create accountability frameworks that apply regardless of technical choices: mandatory transparency, rights to contest decisions, human review requirements, and ongoing auditing obligations.

Not quite. Structural solutions are distinguished by being system-level requirements — transparency, contestability, oversight, monitoring — that apply regardless of which model is used, rather than corrections applied to a specific model's parameters or outputs.

Module 3 · Lab 4

Designing Accountability Frameworks

Apply structural thinking to real-world bias scenarios with your AI discussion partner.

Your Task

You'll be asked to design or evaluate accountability frameworks for AI systems — going beyond technical fixes to structural requirements like transparency, contestability, and human oversight. Draw on all four lessons in this module.

Complete at least 3 exchanges to finish this lab.

A city government is considering deploying an AI system to help prioritize which households receive social services (housing assistance, food support, childcare subsidies). Using the structural framework from Lesson 4 — transparency, contestability, human oversight, and ongoing monitoring — outline what accountability requirements you would demand before allowing this system to operate.

Accountability Framework Lab

L4 · Structural Fairness

Welcome to the Accountability Framework Lab. You're advising a city government that wants to deploy an AI system to prioritize which households receive social services — housing assistance, food support, childcare subsidies. Using the structural framework from Lesson 4 — transparency, contestability, human oversight, and ongoing monitoring — what accountability requirements would you insist on before allowing this system to operate? Be specific: who does what, and what happens if requirements aren't met?

Module 3 · Final Assessment

Bias: When Algorithms Get It Wrong

15 questions covering all four lessons. Score 80% or higher to pass.

1. Which of the following best describes the ProPublica finding about COMPAS that Northpointe disputed?

Correct. ProPublica's central finding was that, among defendants who did not go on to reoffend, Black defendants were nearly twice as likely as white defendants to have been incorrectly labelled high risk.

Not quite. The ProPublica finding was about false-positive rate disparities: Black defendants were nearly twice as likely to be incorrectly labelled high risk when they would not go on to reoffend.

2. Kleinberg, Mullainathan, and Raghavan's mathematical proof about fairness metrics established which key impossibility result?

Correct. The impossibility result shows that different, mathematically coherent definitions of fairness are mutually incompatible when outcome base rates differ — making the choice of fairness metric itself an ethical and political decision.

Not quite. The key result is that calibration and equal error rates cannot both be satisfied when base rates of the predicted outcome differ between groups — forcing a choice between fairness criteria.

3. The term "representation bias" in machine learning most precisely refers to what?

Correct. Representation bias specifically refers to insufficient representation of some demographic groups in training data, causing the model to generalize poorly for those groups — as seen in facial recognition's failures on darker-skinned women.

Not quite. Representation bias is the technical term for training data that underrepresents certain groups, causing systematically worse performance on those groups at inference time.

4. Joy Buolamwini and Timnit Gebru's Gender Shades study found error rates as high as 34.7% on which subgroup in commercial facial analysis systems?

Correct. The Gender Shades study's most striking finding was that dark-skinned women — the intersection of two underrepresented groups in training data — experienced error rates up to 34.7%, compared to near-perfect performance on light-skinned men.

Not quite. The highest error rates were found for dark-skinned women, reaching 34.7% on some systems — a dramatic contrast with the near-perfect performance on light-skinned men.

5. What did the NIST 2019 study of 189 commercial facial recognition algorithms conclude about false-positive rates?

Correct. The NIST study was the most comprehensive independent audit of commercial facial recognition and found false-positive rates dramatically elevated — 10 to 100 times higher — for African-American and Asian faces across most tested algorithms.

Not quite. NIST found false-positive rates 10 to 100 times higher for African-American and Asian faces compared to Caucasian faces — a finding with severe implications for law enforcement use cases where false positives can mean wrongful arrest.

6. Which legal mechanism did Illinois use that resulted in Clearview AI paying a $52 million settlement?

Correct. Illinois' Biometric Information Privacy Act was the legal vehicle for the Clearview settlement. BIPA requires informed written consent before collecting biometric data and provides a private right of action — making it uniquely powerful against companies that harvest facial data without consent.

Not quite. The Illinois Biometric Information Privacy Act (BIPA) — which requires informed written consent for biometric data collection and allows individuals to sue — was the legal basis for the $52 million Clearview settlement.

7. Amazon's AI recruiting tool penalized résumés that included the word "women's" — for example, "women's chess club." This is best explained as an instance of which type of bias?

Correct. This is historical bias: the model was trained on past decisions from an era when men dominated Amazon's technical hiring. The model faithfully replicated that history, penalizing signals of female identity that correlated in the data with lower hiring rates.

Not quite. This is historical bias: the algorithm learned that signals correlating with being female were negatively associated with hiring success — because the historical data reflected a decade of gender-unequal hiring decisions.

8. A credit-scoring algorithm that penalizes applicants with no credit history disproportionately affects recent immigrants and young adults. This is an example of what concept?

Correct. "No credit history" is facially neutral but correlates with immigrant status, age, and race due to historically unequal access to formal banking infrastructure — making it a proxy variable for protected characteristics.

Not quite. This is proxy discrimination: a variable that appears neutral (credit history status) disproportionately affects protected groups because access to credit itself has been historically unequal.

9. The 2019 Obermeyer et al. study in Science found a healthcare algorithm under-identified Black patients with equal medical need because of which specific data problem?

Correct. The flawed proxy (spending as a stand-in for need) embedded inequality because spending reflects access as much as illness. The researchers estimated that correcting the disparity would raise the share of Black patients among those flagged for care programs from 17.7% to 46.5%.

Not quite. The flaw was a proxy: healthcare spending was used as a stand-in for health need, but because Black patients faced greater access barriers, they spent less for equal conditions — causing systematic underestimation of their need.

10. Which city passed the first U.S. law specifically requiring annual independent bias audits and public disclosure for automated employment decision tools?

Correct. New York City Local Law 144, effective in 2023, was the first U.S. law to require annual independent bias audits with public disclosure for employers using automated employment decision tools, as well as applicant notification requirements.

Not quite. New York City was first, with Local Law 144 taking effect in 2023. San Francisco is known for banning government use of facial recognition, but not for this specific hiring AI requirement.

11. An "in-processing" bias mitigation technique is best described as which of the following?

Correct. In-processing techniques modify the learning algorithm itself — typically adding a fairness penalty to the optimization function during training — as distinct from pre-processing (data modification) or post-processing (output adjustment).

Not quite. In-processing refers to changes made during training — specifically to the learning algorithm's optimization objective — not to the data beforehand or outputs afterward.

12. The SyRI system was deployed in the Netherlands primarily in which types of areas, which contributed to the court's human rights concerns?

Correct. SyRI's deployment in low-income and ethnically diverse areas — combined with its opacity and lack of appeal mechanisms — was central to the court's finding that it violated the right to private life under Article 8 ECHR.

Not quite. SyRI was deployed in low-income and ethnically diverse neighborhoods, which the court found particularly troubling given the system's opacity and the lack of any meaningful way for residents to challenge or even know about their scores.

13. The EU AI Act (2024) classifies AI used in employment, credit scoring, and essential services as which risk category?

Correct. The EU AI Act places AI systems used in employment, credit, education, and essential services in the "high risk" category, triggering requirements for conformity assessments, ongoing monitoring, human oversight, and documentation of training data.

Not quite. These systems are classified as "high risk" — not prohibited — meaning they can be deployed but subject to conformity assessments, ongoing monitoring, human oversight, and extensive training data documentation requirements.

14. According to the module's core argument, what distinguishes structural solutions to algorithmic bias from technical debiasing approaches?

Correct. The key distinction is that structural solutions create accountability frameworks that are model-agnostic: they require governance practices regardless of the technical choices made, whereas technical solutions target the parameters or outputs of a specific model.

Not quite. Structural solutions are distinguished by being governance frameworks — transparency, contestability, oversight, monitoring — that apply no matter which specific algorithm is deployed, rather than corrections to a particular model.

15. Which of the following best captures the module's central argument about the nature of algorithmic bias?

Correct. The module's core argument is that algorithmic bias cannot be resolved by technical improvements alone because it is, at root, a question of power: who gets to define the problem, whose history becomes training data, and who absorbs the harm when the system is wrong.

Not quite. The module's central argument is that algorithmic bias is a power problem as much as a technical one — it reflects whose interests shape design decisions and whose harms are deemed acceptable — requiring governance responses, not just better algorithms.