When credit scoring arrived in the United States in the late 1950s (the Fair Isaac Corporation released its first scoring model in 1958), it promised to replace the loan officer's gut feeling with objective mathematics. The promise was partly kept: lending did become more consistent. But the scores encoded the same residential segregation and employment discrimination already embedded in the data they were trained on. By the 1970s, consumer advocates were documenting how ZIP codes in redlined neighborhoods systematically produced lower scores regardless of individual repayment behavior. Congress responded with the Equal Credit Opportunity Act of 1974 and the Fair Credit Reporting Act, the first major legislative attempts to govern an automated decision system. The lesson was clear: a number that looks neutral can carry old prejudices forward at industrial scale.
That pattern is repeating now, faster and less visibly. Between 2014 and 2017, Amazon built and then quietly discarded a machine-learning hiring tool because it systematically downgraded résumés that included the word "women's," as in "women's chess club." In 2016, ProPublica published an investigation into COMPAS, a recidivism-prediction algorithm used in courtrooms across the United States, showing it falsely flagged Black defendants as future criminals at roughly twice the rate of white defendants. In 2019, researchers at UC Berkeley found that online mortgage lenders using algorithmic pricing charged Black and Latino borrowers about 11 basis points more than equally qualified white borrowers. The tools changed. The outcomes did not.
This course is not an argument that AI is broken beyond repair, nor that these systems should be abolished. It is a structured examination of how bias enters algorithmic systems, how it can be measured, and what technical and institutional tools exist to reduce it. You will encounter real documented cases, real mathematical definitions, and real trade-offs, because fairness in machine learning turns out to be not one thing but several, and they cannot all be satisfied simultaneously. That tension is where the most important work is happening, and it is where this course begins.
If you finish every module, here's who you become:
In 2015, a software engineer named Jacky Alciné noticed that Google Photos had tagged photos of him and his friend, both Black, as "gorillas." The label came not from a programmer's prejudice but from a neural network trained on images that dramatically underrepresented dark-skinned faces. Google's response was to remove the gorilla category from the classifier entirely, a fix that still held in 2023 when journalists re-tested the product. The underlying problem, insufficient representation in training data, was never solved. It was patched.
The Google Photos incident is memorable because it was viscerally offensive and easily photographed. But it represents only the most visible end of a wide spectrum of algorithmic bias, a spectrum that also includes quiet disparities in loan approval rates, healthcare resource allocation scores, and predictive policing heat maps that determine where officers are deployed.
Algorithmic bias refers to systematic and repeatable errors in a computer system that create unfair outcomes for certain groups relative to others. The word "systematic" is doing important work here. Every predictive model makes errors. Bias is not about individual mistakes; it is about patterns of mistakes that fall disproportionately on people defined by race, gender, age, disability status, or other protected characteristics.
Three distinct meanings of "bias" collide in this field, and keeping them separate matters:
Bias can enter an AI system at every stage of its development. The three most consequential entry points are data collection, problem framing, and feedback loops.
Data collection. Training data is not a neutral sample of reality. It reflects the world as it was recorded: by whom, with what instruments, for what purpose. ImageNet, the image dataset that powered the deep-learning revolution starting around 2009, was assembled largely from images tagged by English-speaking, US-based internet users. A 2019 study by Vinay Prabhu and Abeba Birhane found that ImageNet's person-category images dramatically overrepresented lighter-skinned, Western subjects. Models trained on it were correspondingly worse at tasks involving darker-skinned faces, not by design but by data.
Problem framing. Before any data is collected, someone must decide what the algorithm is trying to predict. This choice embeds values. When Northpointe designed COMPAS in the 1990s, they defined "recidivism risk" as re-arrest within two years. Re-arrest and re-offending are not the same thing. Black defendants are arrested at higher rates for equivalent behavior due to differential policing. An algorithm trained on arrest data will therefore predict higher risk for Black defendants partly because policing patterns, not underlying behavior, produce more arrest records in those communities.
Feedback loops. When an algorithm's outputs influence future inputs, initial errors compound. Predictive policing tools deployed by the Santa Cruz, Chicago, and New Orleans police departments directed more officers to already over-policed areas, generating more arrests, which fed back into training data, which reinforced the model's belief that those areas required more policing. Santa Cruz became the first U.S. city to ban predictive policing software in June 2020, partly for this reason.
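The compounding described above can be sketched with a toy simulation. Everything in it is a hypothetical assumption chosen for illustration: two districts with the same true offense rate, a fixed pool of 100 patrols, and a rule that sends 70% of patrols to whichever district has more recorded arrests. It is not a model of any deployed product.

```python
def simulate_feedback(recorded, true_rate, rounds, hot_share=0.7, patrols=100):
    """Track recorded arrests in two districts under hot-spot patrol allocation.

    All parameters are hypothetical. Arrests recorded in a round are assumed
    proportional to patrols sent there times the district's true offense rate.
    """
    recorded = list(recorded)
    history = [tuple(recorded)]
    for _ in range(rounds):
        # The "model" simply flags the district with more past arrest records.
        hot = 0 if recorded[0] >= recorded[1] else 1
        allocation = [patrols * (1 - hot_share)] * 2
        allocation[hot] = patrols * hot_share
        for district in range(2):
            recorded[district] += allocation[district] * true_rate[district]
        history.append(tuple(recorded))
    return history


# Both districts have the SAME true rate; only the historical records differ slightly.
history = simulate_feedback(recorded=(52.0, 48.0), true_rate=(0.1, 0.1), rounds=20)
start_share = history[0][0] / sum(history[0])
end_share = history[-1][0] / sum(history[-1])
print(f"district 0 share of recorded arrests: {start_share:.2f} -> {end_share:.2f}")
```

Under these assumptions the first district's share of recorded arrests grows from 52% to about 64% after 20 rounds, even though the underlying offense rates never differ. The records drift toward the patrol allocation, not toward the behavior being measured.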
In 2019, a study published in Science by Ziad Obermeyer and colleagues found that a widely used healthcare algorithm, deployed by Optum and used to allocate care management resources for roughly 200 million people per year, systematically assigned lower risk scores to Black patients than to equally sick white patients. The algorithm used healthcare costs as a proxy for health need. Because Black patients had historically spent less on healthcare (due to access barriers, not lower need), the algorithm interpreted lower past costs as lower current need. The researchers estimated that correcting the bias would more than double the number of Black patients identified for extra care.
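The proxy mechanism can be illustrated with a small, deterministic sketch. The data are invented for the example and are not drawn from the study: two groups with identical distributions of health need, where group B is assumed to incur 65% of group A's cost at the same level of need. The `Patient` class and `top_k_share` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str
    need: int     # hypothetical measure of true health need
    cost: float   # past spending, used as the proxy label


def make_patients(access_factor=0.65):
    """Build 20 synthetic patients; both groups have needs 1..10."""
    patients = []
    for need in range(1, 11):
        patients.append(Patient("A", need, cost=1000.0 * need))
        # Assumption: access barriers mean group B spends less at equal need.
        patients.append(Patient("B", need, cost=1000.0 * access_factor * need))
    return patients


def top_k_share(patients, key, k, group):
    """Share of `group` among the k patients ranked highest by `key`."""
    selected = sorted(patients, key=key, reverse=True)[:k]
    return sum(p.group == group for p in selected) / k


patients = make_patients()
share_by_cost = top_k_share(patients, key=lambda p: p.cost, k=6, group="B")
share_by_need = top_k_share(patients, key=lambda p: p.need, k=6, group="B")
print(f"group B share of extra-care slots: by cost {share_by_cost:.2f}, by need {share_by_need:.2f}")
```

With these made-up numbers, ranking by cost gives group B one of the six slots, while ranking by need gives it three. The ranking rule is identical in both cases; only the choice of target variable changes.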
A common defense of algorithmic systems is that they are merely "as biased as their data," implying the problem lies upstream with society, not with the system itself. This argument has some validity but misses the amplification effect. When a human loan officer holds a biased view, that view affects the people they personally review. When an algorithmic system encodes the same view, it affects every person processed through the system, at the speed of software, with no natural brake from fatigue or social accountability.
Scale changes the moral calculus. A biased algorithm deployed at national scale can cause more harm in a week than a biased human practitioner causes in a career. This is why researchers and regulators increasingly treat algorithmic bias as a distinct category of risk, not merely a technological reflection of social problems.
Algorithmic bias is systematic, repeatable, and scalable harm: it arises from data, design choices, and deployment context, not from malicious intent. The absence of intent does not reduce the harm. Lesson 2 will examine how researchers measure these disparities mathematically.
In this lab you will examine a realistic scenario, a company deploying an AI hiring tool, and practice identifying where bias could enter at the data, framing, and feedback stages. Discuss the scenario with the AI assistant below. There are no trick questions; the goal is to think carefully and articulate your reasoning.
In May 2016, ProPublica published "Machine Bias," documenting that COMPAS assigned higher risk scores to Black defendants who did not re-offend and lower scores to white defendants who did. Northpointe, COMPAS's developer, responded two weeks later with their own analysis, arguing the tool was fair because it was equally accurate for Black and white defendants: both groups had roughly the same probability of re-offending when assigned a given risk score. Both claims were true simultaneously. They measured different things. This was not a dispute about facts; it was a dispute about which mathematical definition of fairness should govern a tool with life-altering consequences. The argument has never been fully resolved, not because the math is ambiguous, but because the choice between fairness criteria is a value judgment that mathematics alone cannot make.
All fairness metrics are built from the confusion matrix, the four-cell table tracking true positives, false positives, true negatives, and false negatives. The disagreement between ProPublica and Northpointe reduces to which cells of that matrix you require to be equal across demographic groups.
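To make the confusion-matrix framing concrete, here is a minimal sketch that tallies the four cells per group and derives three rates from them. The function name `group_rates` and the toy labels are assumptions for illustration, not data from either analysis.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return FPR, FNR, and PPV per group from binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        # First letter: was the prediction correct? Second letter: what was predicted?
        cell = ("t" if truth == pred else "f") + ("p" if pred == 1 else "n")
        counts[group][cell] += 1

    def ratio(num, den):
        # Avoid a crash when a group has no cases in the denominator.
        return num / den if den else float("nan")

    return {
        group: {
            "fpr": ratio(c["fp"], c["fp"] + c["tn"]),  # false positive rate
            "fnr": ratio(c["fn"], c["fn"] + c["tp"]),  # false negative rate
            "ppv": ratio(c["tp"], c["tp"] + c["fp"]),  # positive predictive value
        }
        for group, c in counts.items()
    }


# Hypothetical toy data: six people per group, 1 = positive outcome or prediction.
y_true = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
groups = ["A"] * 6 + ["B"] * 6

rates = group_rates(y_true, y_pred, groups)
for group, r in rates.items():
    print(group, {name: round(value, 2) for name, value in r.items()})
```

Comparing `fpr` and `fnr` across groups corresponds roughly to the error-rate view ProPublica took; comparing `ppv` corresponds roughly to the predictive-accuracy view Northpointe took.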
In 2016, researchers Chouldechova and Kleinberg et al. independently proved that no algorithm can simultaneously satisfy calibration and equalized odds unless base rates (the actual prevalence of the outcome) are equal across groups or the classifier predicts perfectly. Because recidivism rates differed between demographic groups in the data COMPAS was trained on, satisfying one definition mathematically required violating the other. ProPublica and Northpointe were both correct, but they were measuring different things, and no single tool could satisfy both simultaneously.
This is not a limitation of COMPAS specifically. It is a mathematical fact about any binary classifier. It means that deploying a predictive system requires a prior decision about which errors are most costly and to whom, a decision that is fundamentally political and ethical, not statistical.
When base rates differ between groups, you cannot have all of: (1) equal false positive rates, (2) equal false negative rates, and (3) equal positive predictive value. Any two of these can be achieved, but achieving all three requires equal base rates, which the data often does not provide. The choice of which constraint to relax is a value judgment, not a technical one.
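The constraint above follows from an identity among confusion-matrix quantities: FPR = p / (1 - p) × (1 - PPV) / PPV × (1 - FNR), where p is the group's base rate. The sketch below evaluates that identity for two groups; the base rates, PPV, and FNR are hypothetical values chosen only to show the effect.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate forced by a given base rate, PPV, and FNR.

    Derivation: TP = p * (1 - FNR) and FP = TP * (1 - PPV) / PPV (per person),
    so FPR = FP / (1 - p).
    """
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Hypothetical: hold PPV and FNR equal across groups, let base rates differ.
ppv, fnr = 0.6, 0.3
fpr_group_a = implied_fpr(base_rate=0.5, ppv=ppv, fnr=fnr)
fpr_group_b = implied_fpr(base_rate=0.3, ppv=ppv, fnr=fnr)
print(f"FPR group A: {fpr_group_a:.3f}, FPR group B: {fpr_group_b:.3f}")

# With equal base rates the same PPV and FNR imply the same FPR.
fpr_equal = implied_fpr(base_rate=0.3, ppv=ppv, fnr=fnr)
```

With these inputs the implied false positive rates are about 0.47 and 0.20. Once PPV and FNR are pinned to be equal, the difference in base rates alone determines the gap in false positive rates.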
The answer depends on the decision domain and which errors cause more harm. In criminal justice, a false positive (predicting high risk for someone who will not re-offend) results in harsher bail conditions, longer sentences, or denied parole for a person who would not have re-offended. That asymmetry argues for prioritizing equal false positive rates (one component of the equalized-odds requirement). In medical screening, a false negative (missing a disease in someone who has it) may be more costly than a false positive. Equal opportunity (equal true positive rates) may be the appropriate constraint.
The EU's AI Act, which entered into force in August 2024, requires high-risk AI systems to be tested for bias before deployment and to document which fairness metrics were used and why, an implicit acknowledgment that the choice of metric is a consequential design decision, not a technicality.
Fairness cannot be reduced to a single number. Demographic parity, equalized odds, calibration, and individual fairness are all legitimate definitions that capture different moral intuitions, and they are mathematically incompatible when base rates differ across groups. Choosing which fairness constraint to prioritize is an ethical and political decision that precedes any technical implementation.
You are advising a city government that wants to deploy a predictive algorithm to determine which residents qualify for a job-training subsidy program. The program has limited slots. Work through the fairness criteria with the assistant: decide which metric to prioritize and defend your reasoning.
In January 2020, Robert Julian-Borchak Williams was arrested at his home in Detroit while his daughters watched from the doorway. He was handcuffed, placed in a police car, and held overnight for a crime he did not commit. The identification that led to his arrest was made by a facial recognition algorithm. Detroit police had run a surveillance photo through a commercial system, later reported to be DataWorks Plus technology using an NEC algorithm, which returned a match to Williams's driver's license photo. A human investigator confirmed the match without conducting further verification. Williams was innocent; the charges were eventually dropped. He became the first documented case of a wrongful arrest caused by facial recognition in the United States, and the ACLU filed a complaint on his behalf in 2021.
The technical evidence for demographic disparities in facial recognition systems accumulated steadily before it reached public consciousness. In 2018, MIT researcher Joy Buolamwini and Timnit Gebru published "Gender Shades," testing commercial facial analysis APIs from IBM, Microsoft, and Face++ on a curated dataset of 1,270 faces with known gender labels. The systems achieved error rates below 1% on lighter-skinned male faces. For darker-skinned female faces, error rates reached 34.7% (IBM), 20.8% (Microsoft), and 34.5% (Face++). The disparity was not subtle.
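A short arithmetic sketch shows why disaggregated results like these matter: a single headline accuracy figure is a weighted average, and a small subgroup with a high error rate barely moves it. The shares and accuracies below are hypothetical, not taken from "Gender Shades" or any vendor report.

```python
def overall_accuracy(subgroups):
    """Weighted average accuracy; `subgroups` maps name -> (test-set share, accuracy)."""
    total_share = sum(share for share, _ in subgroups.values())
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("subgroup shares must sum to 1")
    return sum(share * accuracy for share, accuracy in subgroups.values())


# Hypothetical benchmark: one subgroup makes up only 5% of the test images.
hypothetical_test_set = {
    "well-represented subgroup": (0.95, 0.985),
    "underrepresented subgroup": (0.05, 0.785),
}

headline = overall_accuracy(hypothetical_test_set)
worst = min(accuracy for _, accuracy in hypothetical_test_set.values())
print(f"headline accuracy: {headline:.3f}, worst subgroup accuracy: {worst:.3f}")
```

Here a headline accuracy of about 97.5% coexists with a 21.5% error rate for the smaller subgroup, which is why an audit would ask for per-group error rates and for the composition of the test set.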
The NIST Face Recognition Vendor Test, published in December 2019, examined 189 commercial facial recognition algorithms across a dataset of 18 million images. It found that many algorithms produced false positive rates 10 to 100 times higher for Black and Asian faces than for white faces when used for one-to-one verification. The systems that performed most poorly on these demographic groups were predominantly developed in the United States and Europe. Algorithms developed in China showed smaller disparities on East Asian faces, consistent with the hypothesis that the training data's demographic distribution drives performance gaps.
By 2020, three major cities, San Francisco (May 2019), Oakland (July 2019), and Boston (June 2020), had banned government use of facial recognition technology entirely. In June 2020, IBM, Amazon, and Microsoft each announced they would halt or pause sales of facial recognition tools to law enforcement, citing accuracy disparities and the need for federal regulation.
Amazon's secret AI recruiting tool, reported by Reuters in October 2018, was built between 2014 and 2017 to automate the initial screening of job applications. The system was trained on résumés submitted to Amazon over a ten-year period, a dataset overwhelmingly composed of male applicants, because the technology industry is male-dominated. The model learned to downgrade résumés that included the word "women's" (as in "women's chess club") and to penalize graduates of two all-women's colleges. Amazon disbanded the team and scrapped the tool in 2017 after discovering these patterns. The tool was never used for actual hiring decisions, but its existence illustrated how training data composition directly shapes model outputs.
A 2019 study by the University of Washington and Princeton found that simply posting identical résumés with stereotypically white names versus stereotypically Black names on a major job platform produced different callback rates, a discrimination pattern first documented in audit studies of human reviewers in 2004 by Marianne Bertrand and Sendhil Mullainathan, and now replicated in algorithmic systems that ostensibly removed human judgment from the process.
In 2021, researchers at Vanderbilt University Medical Center published findings that a commercial algorithm used to allocate dermatology clinic appointments systematically ranked Black patients as lower priority than white patients with equivalent clinical urgency. The algorithm used insurance type as one input, and because Black patients were more likely to hold Medicaid, which reimburses at lower rates, the tool effectively encoded a financial preference as a clinical judgment. The hospital discontinued use of the algorithm after the study was published.
Apple Card, launched in August 2019, attracted scrutiny in November of that year when David Heinemeier Hansson, the creator of Ruby on Rails, publicly reported that Apple Card's algorithm offered him a credit limit twenty times higher than his wife's, despite their filing taxes jointly and her having a higher credit score. Goldman Sachs, which issued the card, stated that gender was not used as an input. Investigators from the New York Department of Financial Services opened an inquiry. The case illustrated a persistent challenge: algorithms that do not explicitly use protected characteristics can still produce discriminatory outcomes through correlated variables. Gender was not in the model; proxies correlated with gender may have been.
The UC Berkeley study from 2019, examining 30 million mortgage records, found that algorithmic lenders (those using automated underwriting without human loan officers) charged Black and Latino borrowers approximately 11 basis points more than equally qualified white borrowers. This disparity translated to roughly $765 million in excess interest payments annually. The researchers concluded the disparity likely arose from algorithms trained on data reflecting existing wealth disparities, not from explicit race discrimination.
Algorithmic bias is not a single phenomenon. In facial recognition, it manifests as higher error rates for underrepresented groups. In hiring, it arises from historically skewed training populations. In healthcare, it appears through proxy variables that correlate financial factors with clinical priority. In credit, it surfaces through variables correlated with race even when race is excluded. Each domain requires domain-specific auditing methods and regulatory frameworks.
You are conducting a bias audit of a facial recognition system proposed for use in a regional airport for access control to secure areas. The vendor has provided an accuracy report showing 97.5% overall accuracy on their test set. Work through the audit with the assistant: what questions do you ask, what additional data do you demand, and what would cause you to reject the system?
In October 2021, the U.S. Equal Employment Opportunity Commission launched an initiative specifically targeting algorithmic hiring tools, signaling that existing civil rights law (in particular, Title VII of the Civil Rights Act of 1964 and the concept of disparate impact) already applied to automated systems. The EEOC's 2022 technical assistance document on AI and disability discrimination stated plainly that employers cannot avoid liability by delegating a discriminatory decision to an algorithm. The defense that "the computer did it" had no legal standing. The regulatory landscape was catching up to the technology, even without new AI-specific legislation in the United States.
Bias mitigation techniques are typically categorized by when in the machine learning pipeline they are applied: before training (pre-processing), during training (in-processing), or after training (post-processing).
All technical mitigation strategies face the impossibility constraints covered in Lesson 2. They can shift which fairness criterion is prioritized; they cannot satisfy all criteria simultaneously. Pre-processing methods also risk reducing overall model accuracy in exchange for more equitable error distribution. These trade-offs must be documented and justified; they are not automatically worth making.
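As one illustration of the pre-processing category, the sketch below implements reweighing, a technique described by Kamiran and Calders in which each training example gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent to a learner that honors sample weights. The lesson does not prescribe this method; it is chosen here as an example, and the data and function names are hypothetical.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight for each (group, label) pair: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for (g, y) in joint_counts
    }


def weighted_positive_rate(groups, labels, weights, group):
    """Positive-label rate within `group` after applying the weights."""
    pairs = [(g, y) for g, y in zip(groups, labels) if g == group]
    total = sum(weights[pair] for pair in pairs)
    positive = sum(weights[pair] for pair in pairs if pair[1] == 1)
    return positive / total


# Hypothetical historical labels: group A was approved far more often than group B.
groups = ["A"] * 6 + ["B"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "A")
rate_b = weighted_positive_rate(groups, labels, weights, "B")
print(weights)
print(f"weighted positive rate: A={rate_a:.2f}, B={rate_b:.2f}")
```

In this toy data the raw positive rates are 4/6 for group A and 1/4 for group B; after weighting, both are 0.5. The weights change what the model is trained on, not the impossibility constraints, so error-rate and calibration disparities still need to be measured on the resulting model.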
Technical fixes applied to a poorly governed process tend to be undone by the next model update or data refresh. Organizational mitigation focuses on the institutional conditions that allow bias to accumulate and persist.
Algorithmic auditing. Independent third-party audits of AI systems have become a standard organizational practice in finance (where model risk management frameworks have existed since 2011 under OCC and Federal Reserve guidance) and are now required for high-risk AI systems under the EU AI Act. An audit examines training data composition, model performance across demographic segments, and whether documented fairness criteria match actual model behavior.
Diverse development teams. The "Gender Shades" finding that algorithms developed primarily by lighter-skinned engineers performed worst on darker-skinned faces is widely cited as evidence that team composition affects what developers notice and test. Google, after the 2015 Photos incident, began publishing annual diversity reports and publicly tracking progress on hiring. Whether team diversity alone is sufficient without structural changes to testing protocols is contested.
Impact assessments. Canada's Directive on Automated Decision-Making, which came into force in April 2019, requires federal government departments to complete an Algorithmic Impact Assessment before deploying any automated system. The assessment scores systems on risk level and mandates proportional oversight, including human review for high-impact decisions. The EU AI Act's risk classification system is built on a similar logic.
The most comprehensive AI fairness regulation currently in force is the EU AI Act (August 2024). It classifies AI systems by risk tier. Systems used in employment decisions, credit scoring, law enforcement, and social benefits allocation are classified as "high risk" and must comply with requirements including bias testing on representative datasets, logging of system outputs for accountability, and human oversight mechanisms. Real-time remote biometric identification in publicly accessible spaces is largely prohibited for law enforcement, subject to narrow exceptions.
In the United States, no equivalent federal AI fairness law exists as of 2024, but the patchwork of existing civil rights law (Title VII, the Fair Housing Act, the Equal Credit Opportunity Act) applies to algorithmic systems under the disparate impact doctrine established in Griggs v. Duke Power Co. (1971). New York City's Local Law 144 (effective July 2023) requires employers using automated employment decision tools to commission independent bias audits and publish the results. Illinois, California, and Colorado have passed similar but narrower requirements for specific sectors.
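A common first-pass quantity in disparate impact analysis is the impact ratio: each group's selection rate divided by the highest group's selection rate. The sketch below computes it and flags ratios under 0.8, the "four-fifths" rule of thumb from the EEOC's Uniform Guidelines on Employee Selection Procedures. The counts are hypothetical, the function name is an assumption, and the 0.8 cutoff is a screening heuristic, not a legal test of compliance.

```python
def impact_ratios(selected, total):
    """Selection rate of each group divided by the highest group's selection rate."""
    rates = {group: selected[group] / total[group] for group in total}
    highest = max(rates.values())
    return {group: rate / highest for group, rate in rates.items()}


# Hypothetical audit counts for an automated screening tool.
selected = {"group_a": 60, "group_b": 45}
total = {"group_a": 100, "group_b": 100}

ratios = impact_ratios(selected, total)
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print({group: round(ratio, 2) for group, ratio in ratios.items()})
print("below four-fifths heuristic:", flagged)
```

With these counts, group_b's impact ratio is about 0.75, so it is flagged for closer review. A ratio above 0.8 would not by itself show the tool is fair, and a ratio below it would not by itself establish liability.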
Bias mitigation requires action at the technical, organizational, and regulatory levels simultaneously. Technical methods can shift trade-offs but cannot eliminate them. Organizational practices (audits, diverse teams, impact assessments) create the conditions under which technical work has lasting effect. Regulation sets the floor. None of these is sufficient alone, and each requires explicit choices about which harms to prioritize reducing.
You are the AI ethics officer at a regional bank. The bank's loan underwriting algorithm has been found, through an internal audit, to approve white applicants at a rate 12 percentage points higher than equally creditworthy Black applicants. The CEO wants a mitigation plan within two weeks. Work through a structured response with the assistant below, covering technical, organizational, and regulatory dimensions.