Module 5 · Lesson 1

The Decision Framework

When a biased system surfaces, the first question isn't how to fix it — it's whether fixing is even the right move.

What criteria should determine whether a biased AI gets repaired, redesigned from scratch, or retired entirely?

In 2018, Amazon's internal recruiting tool was quietly shelved after engineers discovered it had been downgrading résumés containing the word "women's" — as in "women's chess club" or "women's college." The system had been trained on a decade of hiring data that skewed male. The response wasn't a patch. Amazon scrapped the tool entirely. The question this module asks: how do you know when that's the right call?

Three Possible Verdicts

When bias is confirmed in a deployed AI system, decision-makers face a three-way fork. Each path has distinct preconditions, costs, and downstream risks. Choosing the wrong one can make things worse — or create liability while changing nothing meaningful.

🔧

Fix It

Definition: Targeted remediation of a specific, isolated bias source. The underlying architecture and purpose are sound; the problem is a correctable flaw in data, weighting, or a single decision boundary.

🏗️

Rebuild It

Definition: The purpose is legitimate but the architecture is too contaminated to patch reliably. A full rebuild from different data, different design choices, and new governance is warranted.

🗑️

Scrap It

Definition: The task itself is inappropriate for automation in the current state of the art, or the harm caused is irreversible and disproportionate regardless of architecture. Retirement is the only ethical option.

The Five-Factor Test

No single factor determines the verdict. In practice, researchers and regulators have converged on a cluster of considerations that together drive the decision. The NIST AI Risk Management Framework (2023) and the EU AI Act's prohibited/high-risk categorization both embed versions of this logic.

Scope of Harm

Is the bias producing legally protected-class discrimination? Does it affect life-altering decisions (credit, hiring, bail, medical triage)? Narrow harms in low-stakes contexts may be fixable; pervasive harms in high-stakes domains often require scrapping.

Root Cause Locatability

Can the bias be traced to a specific, isolatable source — a mislabeled dataset, a single feature, a skewed training window? If the bias is entangled throughout the model's weights with no clear origin, patching is largely cosmetic.

Task Necessity

Is the task one that AI can perform fairly at all, given the data that actually exists? Some tasks (predicting recidivism, inferring creditworthiness from behavioral patterns) may be structurally impossible to debias because the input data itself encodes historical inequality.

Reversibility of Past Harm

Has the biased system already harmed identifiable people? Can those harms be remediated? A system that wrongly denied parole to thousands over years creates obligations that go beyond technical fixes.

Governance Capacity

Does the deploying organization have the internal capacity — technical, legal, ethical — to maintain a repaired system responsibly over time? A fix without ongoing oversight is often worse than retirement, because it provides false assurance.

Real Case — COMPAS (2016–present)

ProPublica's 2016 investigation found that the COMPAS recidivism-prediction algorithm falsely flagged Black defendants as future criminals at roughly twice the rate of white defendants. Northpointe (now Equivant) argued the tool was fair by a different statistical definition. Applying the five-factor test: scope of harm is extreme (liberty); root cause is entangled in structurally biased criminal-justice data; task necessity is contested; past harms affect tens of thousands; governance capacity in most jurisdictions is near zero. The framework points toward scrap — yet COMPAS remains in use in multiple U.S. states as of 2024.

Why "Fix" Is Often Chosen for the Wrong Reasons

Organizations routinely choose to fix biased systems not because the five-factor analysis supports it, but because fixing is cheaper in the short term, avoids admission of past wrongdoing, and allows continued operation of a revenue-generating product. Researchers at the AI Now Institute have called this "bias laundering" — superficial technical changes that create a reputational shield without addressing underlying structural problems.

The 2019 Apple Card credit algorithm controversy illustrates this: Goldman Sachs applied standard adjustments after complaints that the algorithm offered women systematically lower credit limits than men with comparable financial profiles. But because the underlying credit-scoring logic remained, follow-up analyses suggested the disparity persisted in attenuated form. "We fixed it" was announced; the structural question was not resolved.

Key Principle

The fix/rebuild/scrap decision is not primarily a technical question. It is a question of values, power, and accountability. The framework gives structure to that decision — but only if applied honestly and by people with genuine independence from the deploying organization's commercial interests.

Lesson 1 Quiz

The Decision Framework

Three questions · Select the best answer

1. Amazon scrapped its AI recruiting tool in 2018 primarily because:

Correct. Amazon's tool had learned to penalize résumés from women's colleges and organizations because it was trained on historical hiring data dominated by male hires. The bias was entangled throughout the model, making targeted repair impractical.

Incorrect. The reason for scrapping was systematic gender bias embedded in training data, not performance speed, regulatory action, or a security breach.

2. According to the Five-Factor Test, which scenario most strongly points toward "Scrap It" rather than "Fix It"?

Correct. This scenario triggers all five factors: extreme scope of harm (liberty), bias entangled throughout the data pipeline, a task of contested fairness, irreversible past harm to identifiable people, and low governance capacity.

Incorrect. The bail-risk tool scenario combines the highest possible harm scope, entangled bias, contested task necessity, and absent governance — the strongest case for scrapping rather than fixing.

3. The AI Now Institute's concept of "bias laundering" refers to:

Correct. Bias laundering describes cosmetic fixes — small adjustments, new metrics, rebranded tools — that allow organizations to claim they have addressed bias without making the structural changes needed to actually eliminate it.

Incorrect. Bias laundering is not about encryption, data washing, or outsourcing — it describes superficial remediation that provides reputational cover without genuine structural change.

Lesson 1 Lab

Apply the Five-Factor Test

AI-assisted reasoning lab · minimum 3 exchanges to complete

Your Task

You'll work through two real cases using the Five-Factor Test. For each case, the AI will guide you through the factors and push you to justify your fix/rebuild/scrap verdict with evidence.

Start by typing: "I want to analyze the COMPAS case" or "I want to analyze the Apple Card case" — or ask me to assign one.

Bias Decision Lab

Welcome to the Fix-It-or-Scrap-It decision lab. We're going to apply the Five-Factor Test to real documented cases of biased AI. Which case would you like to start with — COMPAS (recidivism scoring) or the Apple Card (credit limits)? Or tell me you'd like me to assign one.

Module 5 · Lesson 2

Debiasing Techniques — What Actually Works

The toolbox is real. The limitations are equally real. Knowing the difference is half the battle.

Which debiasing interventions have demonstrated durable results in production systems, and which are primarily cosmetic?

In 2019, a landmark study in Science by Obermeyer et al. revealed that a widely used healthcare algorithm — deployed across hundreds of U.S. hospitals — was systematically underestimating the health needs of Black patients. The algorithm used healthcare cost as a proxy for health need, not recognizing that historical inequities in healthcare access meant Black patients had generated lower costs despite being sicker. The fix was not a hyperparameter tweak. The proxy variable itself had to be replaced.

Three Categories of Debiasing

Bias interventions can happen at three stages of the AI pipeline. Each has different leverage, different costs, and different failure modes.

Stage 1

Pre-Processing

Interventions on the training data before the model is built. Includes resampling, reweighting, relabeling, and removing or replacing biased features (as in the Obermeyer healthcare case). Highest leverage — problems caught here don't propagate through training.
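To make "reweighting" concrete, here is a minimal sketch of one common scheme, often called reweighing and usually attributed to Kamiran and Calders. Each training example gets the weight P(group) × P(label) / P(group, label), so that group and label look statistically independent in the weighted data. The toy data and the function names are hypothetical, and this is one illustrative option rather than the method used in any case described in this lesson.

```python
from collections import Counter

def reweighing_weights(groups, labels):
    """Return one weight per example: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]

# Hypothetical toy data: group "a" has 3 of 4 positive labels, group "b" has 1 of 4.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

def weighted_positive_rate(group):
    """Share of a group's total weight that sits on positive labels."""
    num = sum(w for g, y, w in zip(groups, labels, weights) if g == group and y == 1)
    den = sum(w for g, y, w in zip(groups, labels, weights) if g == group)
    return num / den

print(weighted_positive_rate("a"), weighted_positive_rate("b"))
```

In this toy set the raw positive rates are 0.75 and 0.25; after weighting, both groups carry the same weighted positive rate. The weights only change what the training procedure sees. They do nothing about a badly chosen label or proxy, which is why the Obermeyer case needed a different kind of fix.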

Stage 2

In-Processing

Modifications to the training algorithm itself — adding fairness constraints, adversarial debiasing, or fairness-aware regularization. Technically complex; can involve explicit accuracy-fairness trade-offs that require value judgments, not just engineering ones.
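As a rough illustration of fairness-aware regularization, the sketch below trains a one-feature logistic regression by gradient descent and adds a penalty on the squared gap in mean predicted score between two groups. The penalty form, the toy data, and every name here are assumptions made for illustration; production systems would typically rely on an established fairness library and a carefully chosen criterion.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def group_gap(ps, groups):
    """Mean predicted score for group "a" minus the mean for group "b"."""
    return (mean(p for p, g in zip(ps, groups) if g == "a")
            - mean(p for p, g in zip(ps, groups) if g == "b"))

def train(xs, ys, groups, penalty_weight, steps=2000, lr=0.1):
    """Minimize mean log loss + penalty_weight * (score gap between groups)^2."""
    w, b = 0.0, 0.0
    sign = [1.0 if g == "a" else -1.0 for g in groups]
    size = {"a": groups.count("a"), "b": groups.count("b")}
    for _ in range(steps):
        ps = [sigmoid(w * x + b) for x in xs]
        # Gradient of the mean log loss.
        grad_w = mean((p - y) * x for p, y, x in zip(ps, ys, xs))
        grad_b = mean(p - y for p, y in zip(ps, ys))
        # Gradient of the gap, using d(sigmoid)/dz = p * (1 - p).
        gap = group_gap(ps, groups)
        dgap_w = sum(s * p * (1 - p) * x / size[g]
                     for s, p, x, g in zip(sign, ps, xs, groups))
        dgap_b = sum(s * p * (1 - p) / size[g]
                     for s, p, g in zip(sign, ps, groups))
        grad_w += 2 * penalty_weight * gap * dgap_w
        grad_b += 2 * penalty_weight * gap * dgap_b
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def fitted_gap(params, xs, groups):
    w, b = params
    return group_gap([sigmoid(w * x + b) for x in xs], groups)

# Hypothetical toy data: the feature is correlated with group membership.
xs = [2.0, 1.5, 1.0, 0.5, 0.5, 0.0, -0.5, -1.0]
ys = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a"] * 4 + ["b"] * 4

gap_plain = fitted_gap(train(xs, ys, groups, penalty_weight=0.0), xs, groups)
gap_penalized = fitted_gap(train(xs, ys, groups, penalty_weight=5.0), xs, groups)
print(gap_plain, gap_penalized)
```

The value of penalty_weight is where the value judgment lives: a larger weight buys a smaller score gap at the cost of fit to the historical labels, and nothing in the code can say which trade is right.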

Stage 3

Post-Processing

Adjusting model outputs after training — applying different thresholds by subgroup, re-ranking results, or adding human review layers. Lowest leverage; easiest to implement; most prone to creating new disparities while appearing to fix old ones.
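A minimal sketch of the "different thresholds by subgroup" idea, assuming the goal is equal selection rates across two groups. The scores, the target rate, and the function names are hypothetical. The sketch shows only the mechanics; whether group-specific cutoffs are appropriate in a given setting is a separate question it does not answer.

```python
def selection_rate(scores, threshold):
    """Share of scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)

def threshold_for_rate(scores, target_rate):
    """Cutoff that selects the top `target_rate` share of scores (assumes no ties)."""
    k = max(1, round(target_rate * len(scores)))
    return sorted(scores, reverse=True)[k - 1]

# Hypothetical model scores for two groups.
scores_a = [0.9, 0.8, 0.7, 0.6, 0.4]
scores_b = [0.7, 0.55, 0.5, 0.3, 0.2]

# One shared threshold produces very different selection rates.
shared_rates = (selection_rate(scores_a, 0.6), selection_rate(scores_b, 0.6))

# Group-specific thresholds chosen to hit the same target rate.
cut_a = threshold_for_rate(scores_a, 0.4)
cut_b = threshold_for_rate(scores_b, 0.4)
adjusted_rates = (selection_rate(scores_a, cut_a), selection_rate(scores_b, cut_b))

print(shared_rates, adjusted_rates)
```

The adjustment equalizes one metric, the selection rate, and leaves the scores themselves untouched. Error rates within each group are not examined at all here, which is exactly how a post-processing patch can look clean on the audited metric while a disparity remains on an unaudited one.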

What the Research Shows

A 2021 meta-analysis by Friedler et al. at the ACM FAccT conference reviewed dozens of debiasing interventions across domains. Their findings were sobering: post-processing interventions reduced measured bias on tested metrics but frequently introduced disparities on untested ones. Pre-processing interventions were more durable but required access to training data that deploying organizations often did not have. In-processing techniques showed the most promise for robustness but required significant expertise and extended development time.

The Google Translate gender-bias corrections, rolled out in 2018 for Turkish-to-English (where Turkish uses gender-neutral pronouns), illustrate the limits: Google added "he" and "she" alternatives for ambiguous sentences. But the underlying model still default-generated masculine forms for high-status professions and feminine for care roles. The post-processing patch surfaced the problem without resolving it.

The Accuracy-Fairness Trade-off

Most debiasing techniques reduce overall model accuracy slightly — because accuracy is typically measured on test sets that reflect the same historical distributions that produced the bias. A model that performs "accurately" on a biased test set may be performing accurately at perpetuating inequality. This is not a technical paradox — it is a values question about what "accuracy" should mean when the ground truth itself is contaminated.

Documented Cases of Successful Debiasing

Cases of genuinely durable bias remediation are rarer than press releases suggest, but they exist:

2019

Optum Healthcare Algorithm: After the Science study, the researchers worked with the algorithm's developer to replace healthcare cost as the sole prediction target with a composite that also reflected patients' health. Obermeyer et al. reported that this change reduced the measured racial bias by 84%, and estimated that removing the disparity entirely would raise the share of Black patients among those flagged for extra care from 17.7% to 46.5%. This is a genuine pre-processing fix: a poorly specified proxy replaced with a better-specified one.

2021

LinkedIn Job Recommendations: LinkedIn published research on its Economic Graph team's effort to debias job recommendation feeds that systematically surfaced fewer opportunities to women. Their approach combined reweighting training samples (pre-processing) with fairness constraints in the ranking model (in-processing). Post-deployment audits showed durable reduction across the tested demographic dimensions — though the company acknowledged the audit scope was limited.

2020

UK Home Office Visa Algorithm: After a legal challenge, the Home Office retired a visa application-ranking tool found to treat applications from certain nationalities systematically worse. Rather than repair, it was scrapped — an acknowledgment that the task (ranking visa-worthy applications from a pool including nationalities with historically different approval rates) was structurally impossible to perform fairly with the existing data governance.

The Whack-a-Mole Problem

One of the most persistent findings in bias research is that fixing a measured disparity on one metric often causes a new disparity to emerge elsewhere. This occurs because common fairness criteria are mathematically incompatible in certain configurations, a result often called the fairness impossibility theorem (Chouldechova, 2017; Kleinberg et al., 2016). When base rates differ across groups and prediction is imperfect, you cannot simultaneously achieve calibration, false positive rate parity, and false negative rate parity. Any tool that claims to have achieved all three under those conditions is misrepresenting its results.
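The incompatibility can be checked with arithmetic. From the definitions of the confusion-matrix rates, a binary classifier's false positive rate is fixed once the base rate p, the positive predictive value (PPV), and the false negative rate (FNR) are fixed: FPR = p / (1 − p) × (1 − PPV) / PPV × (1 − FNR), the identity given in Chouldechova (2017). The sketch below uses PPV as the stand-in for calibration of a binary flag; all numbers are hypothetical.

```python
def implied_fpr(base_rate, ppv, fnr):
    """False positive rate implied by base rate, PPV, and false negative rate."""
    return (base_rate / (1 - base_rate)) * ((1 - ppv) / ppv) * (1 - fnr)

# Hypothetical groups: identical PPV and FNR, different base rates.
fpr_low_base = implied_fpr(base_rate=0.3, ppv=0.6, fnr=0.35)
fpr_high_base = implied_fpr(base_rate=0.5, ppv=0.6, fnr=0.35)

print(round(fpr_low_base, 3), round(fpr_high_base, 3))
```

With PPV and FNR held equal, the group with the higher base rate necessarily gets the higher false positive rate (roughly 0.43 versus 0.19 in this example). Equalizing the false positive rates instead would force PPV or FNR apart, so the choice of which criterion to give up cannot be avoided.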

Takeaway

Debiasing techniques are tools, not solutions. They require honest pre-deployment specification of which fairness criteria matter most (and why), rigorous post-deployment auditing across multiple metrics simultaneously, and ongoing maintenance as data distributions shift. Without all three, technical debiasing is largely theatrical.

Lesson 2 Quiz

Debiasing Techniques

Three questions · Select the best answer

1. In the Obermeyer et al. (2019) healthcare algorithm case, the core fix required by the research was:

Correct. The bias arose because healthcare cost was used as a proxy for health need. Because Black patients historically had less access to care, they had generated lower costs despite being sicker. The fix required replacing the flawed proxy — a pre-processing intervention at the feature-engineering stage.

Incorrect. The fundamental problem was a flawed proxy variable — using healthcare cost to represent health need. No amount of retraining, threshold adjustment, or feature removal would fix the core issue without replacing the proxy itself.

2. The fairness impossibility theorem (Chouldechova, 2017; Kleinberg et al., 2016) states that:

Correct. This mathematical result means that organizations must choose which fairness criteria to prioritize — a values decision, not a purely technical one. Any system claiming to have satisfied all three simultaneously when base rates differ is misrepresenting its results.

Incorrect. The impossibility theorem makes a specific mathematical claim: when base rates differ, multiple fairness criteria cannot be simultaneously satisfied. It does not say AI can never be fair, impose a specific accuracy penalty, or rank intervention types.

3. Why is post-processing the least durable form of debiasing?

Correct. Post-processing patches the surface of outputs without addressing how the model actually represents and processes inputs. Research has repeatedly shown this creates a whack-a-mole dynamic — measured bias drops on the audited metric while persisting or emerging on others.

Incorrect. Post-processing is actually the easiest to implement technically, is not generally prohibited, and applies across model types. Its weakness is that it adjusts outputs without touching the model's internal representations, leaving underlying bias intact.

Lesson 2 Lab

Choose Your Intervention

AI-assisted reasoning lab · minimum 3 exchanges to complete

Your Task

You'll be given a bias scenario and asked to recommend a debiasing intervention (pre-processing, in-processing, or post-processing). The AI will probe the strengths and limitations of your choice and whether the fairness impossibility theorem applies.

Start by typing: "Give me a bias scenario to work through."

Debiasing Intervention Lab

Welcome to the Debiasing Intervention Lab. I'll present you with a documented bias scenario and you'll recommend a technical intervention — pre-processing, in-processing, or post-processing — and defend your choice against the limitations we covered in Lesson 2. Ready? Type "Give me a scenario" to begin.

Module 5 · Lesson 3

Regulatory Responses & Legal Accountability

Governments are no longer asking companies to self-regulate. The enforcement era has begun.

What legal and regulatory mechanisms now compel organizations to address AI bias — and what do they actually require?

In March 2019, the U.S. Department of Housing and Urban Development charged Facebook with violating the Fair Housing Act through its ad-targeting tools, which allowed advertisers to exclude users by race, national origin, religion, sex, and family status. ProPublica had reported the problem in 2016. A settlement reached with the Department of Justice in June 2022 required the company to build a new ad-delivery system for housing ads, which it committed to extend to employment and credit: not a patch, but a structural rebuild mandated by law.

The Regulatory Landscape (as of 2024)

AI bias has moved from an ethics conversation to a compliance obligation in multiple jurisdictions. The frameworks differ in scope, enforcement, and what exactly they require organizations to do.

EU — 2024

EU AI Act

The world's first comprehensive AI law. Creates a risk-tiered system: prohibited uses (e.g., social scoring), high-risk categories (hiring, credit, bail, education) with mandatory bias auditing, transparency requirements, and human oversight obligations before deployment. Fines up to €35M or 7% of global turnover.

USA — 2023

EEOC Guidance on AI Hiring

The Equal Employment Opportunity Commission issued guidance making clear that employers are liable for discriminatory outcomes from algorithmic hiring tools even if the bias is "unintentional." The employer, not the vendor, bears the legal responsibility. This extends existing disparate impact doctrine to AI.

USA — 2023

NYC Local Law 144

The first U.S. law specifically requiring bias audits of AI hiring tools. Employers in New York City that use automated employment decision tools must have an independent auditor conduct a bias audit every year and must publish a summary of the results. Enforcement began in July 2023.

USA — 2023

Biden AI Executive Order

Directed federal agencies to develop standards for red-teaming, bias testing, and safety evaluations for AI systems used in federal decision-making, and led to the creation of the AI Safety Institute at NIST. The order was rescinded by the Trump administration in January 2025, though the NIST AI Risk Management Framework, which predates it, remains available.

Disparate Impact Doctrine Applied to AI

The most significant legal development is not new legislation — it is the application of existing civil rights law to algorithmic systems. Disparate impact doctrine, established in Griggs v. Duke Power Co. (1971), holds that facially neutral practices that produce racially disparate outcomes can constitute illegal discrimination even without discriminatory intent.

In 2023, the Consumer Financial Protection Bureau explicitly stated that this doctrine applies to algorithmic credit decisions. The FTC has used Section 5 of the FTC Act (prohibition on unfair or deceptive practices) to pursue companies whose AI systems produced biased consumer outcomes. The Department of Justice has brought enforcement actions against AI-powered mortgage platforms under the Fair Housing Act.

Real Case — HireVue (2020)

The Electronic Privacy Information Center filed an FTC complaint in 2019 against HireVue, whose AI video-interview platform analyzed facial micro-expressions and voice patterns to score job candidates. The complaint argued the system could not be audited for bias because it was proprietary and the underlying basis for scores was unexplainable. In response to regulatory scrutiny, HireVue discontinued its facial-expression analysis feature in 2021 — not because bias was proven, but because the system was structurally inauditable. This became a template: opacity itself, in a high-stakes context, is now treated as a regulatory problem.

What Regulators Are Actually Requiring

Across jurisdictions, regulatory requirements are converging on a similar cluster of obligations:

Pre-Deployment Bias Auditing

Testing for disparate impact across protected classes before a system goes live. NYC Local Law 144 requires this annually; the EU AI Act requires it for all high-risk systems.
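A common starting point for this kind of audit is the impact ratio: each group's selection rate divided by the selection rate of the most-selected group. The sketch below computes it from hypothetical counts. The 0.8 cutoff is the "four-fifths" rule of thumb from U.S. employment selection guidance, used here only as a screening flag; treating it as the legal test under either law would be an assumption.

```python
def selection_rates(outcomes):
    """outcomes maps group -> (number selected, number of applicants)."""
    return {group: sel / total for group, (sel, total) in outcomes.items()}

def impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    return {group: rate / best for group, rate in rates.items()}

# Hypothetical audit counts.
outcomes = {"group_a": (48, 100), "group_b": (30, 100)}
ratios = impact_ratios(outcomes)

# Flag groups falling below the four-fifths rule of thumb.
flagged = [group for group, ratio in ratios.items() if ratio < 0.8]
print(ratios, flagged)
```

Here group_b's ratio is 0.625, so it is flagged for closer review. A flag is a reason to investigate, not a finding of discrimination, and a ratio above 0.8 is not proof that a tool is fair.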

Transparency and Explainability

Affected individuals must be informed when an AI is making decisions about them (EU AI Act, GDPR Art. 22). In some jurisdictions, they have the right to a human review of automated decisions. Black-box systems in high-stakes domains are increasingly legally precarious.

Ongoing Monitoring

Bias audits are not one-time events. The NIST AI RMF recommends, and the EU AI Act requires for high-risk systems, continued monitoring after deployment as data distributions and user populations shift. A system that was fair at launch can become biased as conditions change.
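One simple way to operationalize ongoing monitoring is to recompute a disparity measure on each new window of decisions and raise an alert when it crosses a limit set in advance. The sketch below does this for the gap in selection rates. The window labels, the decision logs, and the 0.2 limit are all hypothetical choices, not values taken from any framework.

```python
def rate_gap(records):
    """Largest difference in selection rate between any two groups."""
    totals, selected = {}, {}
    for group, was_selected in records:
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + int(was_selected)
    rates = [selected[g] / totals[g] for g in totals]
    return max(rates) - min(rates)

def windows_over_limit(windows, limit):
    """Labels of monitoring windows whose selection-rate gap exceeds `limit`."""
    return [label for label, records in windows if rate_gap(records) > limit]

# Hypothetical decision logs: (group, was_selected) pairs per reporting window.
launch = ([("a", True)] * 5 + [("a", False)] * 5
          + [("b", True)] * 5 + [("b", False)] * 5)
later = ([("a", True)] * 7 + [("a", False)] * 3
         + [("b", True)] * 3 + [("b", False)] * 7)

alerts = windows_over_limit([("window_1", launch), ("window_2", later)], limit=0.2)
print(alerts)
```

The system in this example shows no gap at launch and a gap of about 0.4 one window later, so only the second window triggers an alert. A single metric is used for brevity; a real monitoring plan would track several at once.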

Vendor Liability Clarification

The EEOC guidance makes clear that using a vendor's biased tool does not insulate the deploying employer from liability. Organizations must conduct due diligence on third-party AI tools they deploy — "we bought it off the shelf" is no longer a defense.

The Enforcement Gap

Despite the regulatory framework's rapid expansion, enforcement remains inconsistent. Most regulatory actions have targeted high-profile cases with clear victims and documented evidence. The vast majority of biased AI systems operating in hiring, credit, healthcare, and criminal justice have not been audited, challenged, or remediated. The law has moved faster than the institutional capacity to enforce it.

Lesson 3 Quiz

Regulatory Responses & Legal Accountability

Three questions · Select the best answer

1. New York City's Local Law 144, which took effect in 2023, specifically requires:

Correct. NYC Local Law 144 is notable for being the first U.S. law specifically targeting bias in AI hiring tools. It requires annual third-party bias audits and public disclosure of results — establishing a transparency obligation that goes beyond self-reporting.

Incorrect. Local Law 144 specifically targets employers using automated employment decision tools, requiring annual independent bias audits and publication of results. It does not create a company registry, impose a facial recognition moratorium, or require human review within 48 hours.

2. HireVue discontinued its facial-expression analysis feature in 2021 primarily because:

Correct. The EPIC FTC complaint focused heavily on the system's opacity — that it could not be audited for bias because the basis for scores was proprietary and unexplainable. This established an important precedent: structural inauditability in a high-stakes context is treated as a regulatory problem even before specific bias is proven.

Incorrect. The key issue was structural inauditability under regulatory scrutiny — a proprietary black box making hiring decisions could not demonstrate freedom from bias. No court ruling, cost issue, or gaming complaint drove the decision.

3. Under the EEOC's 2023 guidance on AI hiring tools, if a vendor's algorithm produces discriminatory outcomes:

Correct. The EEOC guidance is explicit: employers cannot offload liability by pointing to a vendor. Deploying organizations must conduct due diligence on third-party AI tools, and disparate impact liability applies regardless of intent. This extends pre-existing disparate impact doctrine fully to algorithmic systems.

Incorrect. The EEOC guidance places responsibility squarely on the deploying employer. "We used a vendor's tool" is not a defense. Disparate impact doctrine applies regardless of intent, so lack of discriminatory intent is also not a shield.

Lesson 3 Lab

Regulatory Compliance Audit

AI-assisted reasoning lab · minimum 3 exchanges to complete

Your Task

You'll assess a fictional company's AI deployment against actual regulatory requirements from the EU AI Act, NYC Local Law 144, and EEOC guidance. Identify compliance gaps and recommend specific remediation steps.

Start by typing: "Give me a compliance scenario to audit."

Regulatory Compliance Lab

Welcome to the Regulatory Compliance Lab. I'll describe a company's AI deployment and you'll audit it against the EU AI Act, NYC Local Law 144, and EEOC guidance on AI hiring tools. You'll identify the specific compliance gaps and recommend what the company must do to come into compliance. Type "Give me a compliance scenario" to begin.

Module 5 · Lesson 4

Building Accountability From the Start

The most effective bias remediation is the kind that happens before deployment, not after.

What structural practices and governance mechanisms actually prevent biased AI from reaching production in the first place?

When the MIT Media Lab's Joy Buolamwini published her 2018 Gender Shades research, she found that three major commercial facial analysis systems — from Microsoft, IBM, and Face++ — had error rates for darker-skinned women up to 34 percentage points higher than for lighter-skinned men. All three companies had extensive AI ethics frameworks on paper. None had tested their commercial products on representative demographic benchmarks before deployment. The frameworks existed; the accountability structures to enforce them did not.

Why Ethics Frameworks Alone Fail

Between 2016 and 2020, virtually every major technology company published an AI ethics framework. Google's "AI Principles." Microsoft's "Responsible AI." IBM's "Pillars of Trust." Meta's "Five Pillars." Research by the AI Now Institute and AlgorithmWatch found that the vast majority of these documents shared a common feature: they contained aspirational principles with no binding enforcement mechanisms, no independent oversight, and no consequences for violation.

Timnit Gebru's dismissal from Google's Ethical AI team in December 2020 — widely reported as retaliation for a paper on large language model risks — illustrated the specific problem: ethics teams embedded within commercial organizations face structural pressure to accommodate product timelines, not interrupt them. The conflict of interest is architectural.

Structural Mechanisms That Work

Research on organizational governance of AI — including the NIST AI RMF, Stanford HAI studies, and the UK AI Safety Institute's published guidance — has identified a cluster of structural practices that correlate with actually preventing biased systems from reaching production:

Algorithmic Impact Assessments (AIAs)

Mandatory documentation of a system's intended use, population affected, data sources, known limitations, and bias risk assessment — completed before training begins, not after. Canada's Directive on Automated Decision-Making has required AIAs for federal AI systems since 2019. The EU AI Act requires conformity assessments for high-risk AI that function similarly.

Independent Red-Teaming

Structured adversarial testing by teams with no stake in the system's success, specifically tasked with finding bias and failure modes. The Biden administration's AI Executive Order made red-teaming mandatory for high-capability AI systems at federal agencies. Anthropic, OpenAI, and Google DeepMind have all published red-teaming methodologies, though the independence of internal red teams remains debated.

Diverse Development Teams

The Gender Shades finding — that facial analysis systems performed worst on darker-skinned women — correlated directly with the near-complete absence of darker-skinned women from the benchmark datasets used for development. Diverse teams are not a social justice measure; they are a quality control measure. Teams that include people from affected communities identify failure modes that homogeneous teams miss.

Model Cards and Datasheets

Standardized documentation for AI models (Model Cards, introduced by Mitchell et al. at Google, 2019) and datasets (Datasheets for Datasets, Gebru et al., 2021). These require developers to explicitly document intended uses, out-of-scope uses, performance across demographic subgroups, and known failure modes. When required by procurement contracts or regulation, they create accountability trails.
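A model card becomes easier to enforce when it is a structured record that can be checked automatically. The sketch below is one possible shape, with field names paraphrased from the items listed above; it is not the schema from Mitchell et al., and the example values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    model_name: str
    intended_uses: list
    out_of_scope_uses: list
    known_failure_modes: list
    # Error rate per demographic subgroup, e.g. {"group_a": 0.02}.
    subgroup_error_rates: dict = field(default_factory=dict)

    def missing_sections(self):
        """Names of required sections that were left empty."""
        sections = {
            "intended_uses": self.intended_uses,
            "out_of_scope_uses": self.out_of_scope_uses,
            "known_failure_modes": self.known_failure_modes,
            "subgroup_error_rates": self.subgroup_error_rates,
        }
        return [name for name, value in sections.items() if not value]

    def largest_error_gap(self):
        """Difference between the worst and best subgroup error rates."""
        rates = self.subgroup_error_rates.values()
        return max(rates) - min(rates)

# Hypothetical card with one section left blank.
card = ModelCard(
    model_name="example-classifier",
    intended_uses=["screening support with human review"],
    out_of_scope_uses=[],
    known_failure_modes=["low-light images"],
    subgroup_error_rates={"group_a": 0.02, "group_b": 0.12},
)
print(card.missing_sections(), card.largest_error_gap())
```

A procurement contract or release checklist could refuse any model whose card reports missing sections or a subgroup gap above an agreed limit, which is what turns documentation into an accountability trail.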

Participatory Design

Involving affected communities in system design before deployment. The most successful examples come from public health AI — community advisory boards that reviewed algorithmic health resource allocation tools before deployment reduced both technical bias and community resistance. The 2022 Boston COVID-19 vaccine allocation algorithm used participatory design with community health advocates to build equity metrics into the objective function.

Case Study — Canada's AIA Directive

Canada's 2019 Directive on Automated Decision-Making requires all federal agencies to complete an Algorithmic Impact Assessment before deploying any AI system for administrative decisions. The AIA categorizes systems by risk level (I–IV) and triggers escalating requirements: higher-risk systems require peer review, bias auditing, and Deputy Head approval. Between 2019 and 2023, the Directive caused several federal AI projects to be redesigned at the design phase — before any biased system could be deployed. This pre-deployment intervention model is now being cited in EU AI Act implementation guidance as a reference standard.

The Microsoft Turnaround: A Post-Gender Shades Case

After Buolamwini's 2018 Gender Shades publication, Microsoft retested its Face API and found error rates consistent with her findings. Rather than dispute the research, Microsoft took three structural steps: it updated its benchmark dataset to include substantially more diverse facial images; it commissioned independent external audits of the updated model; and it publicly committed to ongoing demographic performance reporting. By 2019, its published error rates across demographic groups had narrowed significantly — and in 2022, Microsoft announced it would retire its public Face API's emotion inference, age estimation, and hair/makeup attributes capabilities, citing concerns about the scientific validity and potential for misuse of inferring personal attributes from faces.

This is one of the cleaner documented cases of genuine structural response: external research → honest self-assessment → benchmark improvement → independent audit → ongoing transparency → further restriction where the task itself was problematic.

The Central Lesson of Module 5

Bias in AI is not primarily a technical problem with a technical solution. It is a governance problem — a question of who has the power to decide what systems are built, who they affect, how they are tested, and who is accountable when they fail. Technical tools (debiasing methods, model cards, red-teaming) are necessary but not sufficient. They work only within governance structures that create genuine independence, real accountability, and meaningful consequences for getting it wrong. Building those structures is harder than tuning a hyperparameter. It is also the only thing that durably works.

Lesson 4 Quiz

Building Accountability From the Start

Three questions · Select the best answer

1. Joy Buolamwini's Gender Shades research (2018) found that the highest error rates in commercial facial analysis systems affected:

Correct. Buolamwini found error rates for darker-skinned women were up to 34 percentage points higher than for lighter-skinned men across three commercial systems — from Microsoft, IBM, and Face++. The disparity traced directly to underrepresentation in training and benchmark datasets.

Incorrect. Gender Shades specifically documented that darker-skinned women experienced the highest error rates — up to 34 percentage points above lighter-skinned men. The disparity was demographic, not universal, and traced to dataset underrepresentation.

2. Canada's 2019 Directive on Automated Decision-Making is notable because it:

Correct. Canada's directive is significant as an early, working example of pre-deployment governance — mandatory AIAs that must be completed before systems go live, with risk-tiered requirements that have demonstrably caused federal AI projects to be redesigned at the design stage rather than patched post-deployment.

Incorrect. The directive did not ban AI or create a registry or a court. It established a pre-deployment Algorithmic Impact Assessment requirement for federal agencies, with tiered requirements based on risk level — and has demonstrably caused systems to be redesigned before deployment.

3. The research finding most relevant to why "diverse development teams are a quality control measure, not just a social justice measure" is:

Correct. Gender Shades documented the measurable part of the chain: darker-skinned women were underrepresented in training and benchmark data, and the systems performed worst on them. The lesson's argument is that teams lacking those perspectives are less likely to notice such gaps. Diverse teams identify the failure modes that matter to people not otherwise in the room, which is precisely what technical quality requires.

Incorrect. The relevant evidence is from Gender Shades: demographic underrepresentation in datasets went hand in hand with the worst performance outcomes, a gap that homogeneous teams are less likely to catch. This grounds the quality-control argument in technical evidence, not in business cases or regulatory mandates.

Lesson 4 Lab

Design an Accountability Structure

AI-assisted reasoning lab · minimum 3 exchanges to complete

Your Task

You'll design a pre-deployment accountability structure for a specific high-risk AI deployment. The AI will challenge you to make the governance mechanisms concrete, independent, and enforceable — not just aspirational.

Start by typing: "I want to design accountability for a healthcare AI" or "Give me a deployment scenario to govern."

Governance Design Lab

Welcome to the Governance Design Lab. You're going to design a pre-deployment accountability structure for a high-risk AI system — drawing on Algorithmic Impact Assessments, red-teaming, model cards, and participatory design. I'll push you to make each mechanism concrete and genuinely enforceable. What type of AI deployment do you want to design governance for? (Examples: healthcare triage AI, AI hiring tool, bail-risk scoring, loan approval algorithm.) Or type "Assign me a scenario."

Module 5 · Final Assessment

Fix It or Scrap It? — Module Test

15 questions · Score 80% or higher to pass

1. The Five-Factor Test for fix/rebuild/scrap decisions includes all of the following EXCEPT:

Correct. The five factors are: scope of harm, root cause locatability, task necessity, reversibility of past harm, and governance capacity. Vendor revenue is not a factor in an ethical decision framework, though it often drives organizational choices in practice.

Incorrect. The Five-Factor Test consists of: scope of harm, root cause locatability, task necessity, reversibility of past harm, and governance capacity. Vendor revenue is not among them.

2. Amazon's 2018 scrapping of its AI recruiting tool is best classified as a response to what type of bias?

Correct. The model learned from historical hiring data in which most successful candidates were male, encoding historical hiring bias as a predictive signal. This is the classic historical bias pattern — the model was accurate at predicting past decisions, which were themselves biased.

Incorrect. Amazon's tool reproduced historical bias from its training data — a decade of hiring patterns skewed toward male candidates taught the model to treat features associated with women as negative signals.

3. Pre-processing bias interventions are considered highest leverage because:

Correct. Pre-processing intervenes before the model learns from biased data. By cleaning, reweighting, or replacing the data or features at this stage, bias is prevented from being embedded in the model's weights — rather than being patched after the fact in outputs that already encode it.

Incorrect. Pre-processing's advantage is causal: catching problems before they are encoded into model weights. It typically requires more data access and expertise, not less, and its advantage is independent of regulation or fairness constraints.
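The reweighting mentioned in this answer can be made concrete. The sketch below shows one common pre-processing approach, usually called reweighing (Kamiran and Calders); it is not necessarily the method this module has in mind, and the function names and toy data are hypothetical. Each training example gets the weight P(group) * P(label) / P(group, label), so that group membership and outcome are statistically independent in the weighted data.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per example: P(group) * P(label) / P(group, label)."""
    if len(groups) != len(labels) or not groups:
        raise ValueError("groups and labels must be non-empty and the same length")
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels (label == 1) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    return positive / total


# Hypothetical historical data: group "a" was hired 4 times out of 6,
# group "b" only 1 time out of 4.
groups = ["a"] * 6 + ["b"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]
weights = reweighing_weights(groups, labels)

for name in ("a", "b"):
    print(name, round(weighted_positive_rate(groups, labels, weights, name), 3))
```

With these toy numbers the unweighted hire rates are about 0.67 and 0.25, while the weighted rates come out equal for both groups. A model trained with these sample weights no longer sees group membership as predictive of the historical outcome, although proxies for group membership in other features can still carry the signal.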

4. The fairness impossibility theorem implies that when base rates differ across groups, an organization deploying a risk-scoring tool must:

Correct. The impossibility theorem makes it mathematically clear that choosing between fairness criteria is unavoidable when base rates differ. This means the choice must be made explicitly and transparently — with community and stakeholder input — rather than left implicit in a technical specification.

Incorrect. The impossibility theorem means organizations must explicitly choose which fairness criteria to prioritize, making this a values decision requiring transparency and accountability rather than a pure technical problem.
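The arithmetic behind this answer can be checked directly. The sketch below assumes the theorem meant here is the result usually associated with Chouldechova (2017) and Kleinberg et al. (2016); the function name and the numbers are illustrative and not drawn from any real tool. For a binary classifier, the confusion-matrix definitions imply FPR = p / (1 - p) * (1 - PPV) / PPV * TPR, where p is a group's base rate. If two groups share the same PPV and TPR but have different base rates, their false positive rates are forced apart.

```python
def implied_false_positive_rate(base_rate, ppv, tpr):
    """False positive rate forced by a base rate, precision (PPV), and recall (TPR)."""
    if not (0 < base_rate < 1 and 0 < ppv <= 1 and 0 <= tpr <= 1):
        raise ValueError("base_rate must be in (0, 1), ppv in (0, 1], tpr in [0, 1]")
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * tpr


# Hypothetical risk tool that is equally precise (PPV) and equally sensitive (TPR)
# for two groups whose base rates differ.
shared_ppv, shared_tpr = 0.6, 0.7
fpr_high = implied_false_positive_rate(0.5, shared_ppv, shared_tpr)
fpr_low = implied_false_positive_rate(0.3, shared_ppv, shared_tpr)

print(f"base rate 0.5 -> false positive rate {fpr_high:.3f}")
print(f"base rate 0.3 -> false positive rate {fpr_low:.3f}")
```

In this example the group with the higher base rate ends up with more than twice the false positive rate of the other group. Equalizing the false positive rates instead would mean giving up equal PPV or equal TPR, which is the trade-off the organization has to choose openly.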

5. In the 2019 Optum healthcare algorithm case, Black patients made up 17.7% of those flagged as high-risk, against an estimated 46.5% had the algorithm been unbiased. The disparity was substantially reduced by:

Correct. The fix was a pre-processing proxy replacement. Healthcare cost encoded historical inequities in healthcare access; replacing it with direct clinical measures of health need eliminated the mechanism through which the bias operated.

Incorrect. The Optum fix was a proxy replacement — substituting healthcare cost with clinical health indicators. This was a pre-processing intervention that eliminated the mechanism of bias rather than patching its outputs.

6. "Bias laundering" as described by the AI Now Institute most closely resembles which real-world pattern?

Correct. The Apple Card case is a textbook example of bias laundering: adjustments were announced, providing reputational cover, while the underlying credit-scoring model that produced gendered disparities was not structurally changed. Follow-up analyses suggested the disparity persisted in attenuated form.

Incorrect. Bias laundering describes superficial fixes that provide reputational cover without structural change. The Apple Card case — threshold adjustments without changing the underlying model — fits this pattern. Publishing data, retiring problematic features, or requiring pre-deployment assessment are the opposite.

7. The Facebook/Meta Fair Housing Act settlement (2022) required the company to:

Correct. The settlement, reached with the U.S. Department of Justice after a HUD charge, required a structural rebuild of the ad-delivery infrastructure for the regulated ad categories of housing, employment, and credit. This is notable as a case where regulatory enforcement mandated the "rebuild" verdict rather than allowing the company to choose a cheaper patch.

Incorrect. The settlement required a full structural rebuild of the ad-delivery system for housing, employment, and credit — not a fine, source code release, or blanket targeting ban.

8. COMPAS, the recidivism-prediction tool at the center of ProPublica's 2016 investigation, remains in use in multiple U.S. states as of 2024 primarily because:

Correct. COMPAS demonstrates that the fix/scrap framework is not self-executing, even when it points clearly toward scrapping. Organizational inertia, institutional dependency, and the absence of enforcement capacity have allowed the tool to persist.

Incorrect. COMPAS's continued use reflects institutional inertia, not ethical or technical justification. The tool fails the Five-Factor Test on multiple dimensions, has not been shown to be fair by neutral analysis, and has not been subject to enforced retirement — despite ongoing criticism.

9. NYC Local Law 144 specifically addresses which type of AI system?

Correct. Local Law 144 specifically targets automated employment decision tools — AI or algorithm-based systems used to substantially assist in employment decisions. It requires annual independent bias audits and public disclosure of results.

Incorrect. NYC Local Law 144 specifically targets automated employment decision tools — AI systems used in hiring and promotion — requiring annual third-party bias audits and public reporting.
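A bias audit of this kind typically reports selection rates and impact ratios per demographic category. The sketch below assumes the impact ratio is a category's selection rate divided by the selection rate of the most-selected category, which is how the city's implementing rules are generally summarized. The function name and applicant counts are hypothetical, and the sketch applies no pass/fail threshold.

```python
def impact_ratios(selected, applicants):
    """Map each category to its selection rate divided by the highest selection rate."""
    rates = {}
    for category, total in applicants.items():
        if total <= 0:
            raise ValueError(f"no applicants recorded for {category!r}")
        rates[category] = selected.get(category, 0) / total
    top_rate = max(rates.values())
    if top_rate == 0:
        raise ValueError("no category had any selections")
    return {category: rate / top_rate for category, rate in rates.items()}


# Hypothetical audit counts: applicants screened and candidates advanced per category.
applicants = {"group_a": 100, "group_b": 100}
selected = {"group_a": 50, "group_b": 30}

ratios = impact_ratios(selected, applicants)
for category, ratio in ratios.items():
    print(category, round(ratio, 2))
```

The most-selected category always has a ratio of 1.0, and lower values show how far other categories fall behind it. Publishing these figures is what turns the audit into an accountability record rather than an internal check.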

10. Why was HireVue's discontinuation of facial expression analysis in 2021 significant as a regulatory precedent?

Correct. The HireVue case established that opacity itself — being unable to audit a high-stakes AI system for bias — is treated as a compliance problem under emerging regulatory frameworks. Systems that cannot demonstrate freedom from bias are legally precarious even if bias hasn't been formally proven.

Incorrect. The HireVue precedent is about inauditability: a proprietary, unexplainable system making high-stakes hiring decisions cannot defend itself against bias allegations because it cannot be audited. Opacity in high-stakes contexts is now itself a regulatory risk.

11. Model Cards (Mitchell et al., 2019) and Datasheets for Datasets (Gebru et al., 2021) contribute to accountability primarily by:

Correct. Model Cards and Datasheets are documentation frameworks, not technical tools. Their accountability function depends on their being required — by procurement contracts, by regulation, or by institutional policy — creating traceable records of what developers knew and disclosed before deployment.

Incorrect. Model Cards and Datasheets are standardized documentation frameworks. Their value is in creating accountability trails — records of what was known and disclosed — not in encryption, audit replacement, or automated detection.

12. The Timnit Gebru incident at Google in 2020 illustrates which structural governance problem?

Correct. The Gebru case illustrates that ethics teams embedded inside organizations with commercial interests in the systems they are supposed to critique face a structural conflict of interest. Independence — not just good intentions — is required for effective governance.

Incorrect. The Gebru case illustrates a structural, not individual, problem: ethics teams inside commercial organizations face inherent conflicts of interest. The team had the skills; the architecture of their role prevented them from exercising independent judgment without institutional consequences.

13. Canada's Algorithmic Impact Assessment requirement caused several federal AI projects to be redesigned before deployment. This is significant because it demonstrates:

Correct. This is the core argument for pre-deployment accountability: intervening during design is structurally cheaper, more effective, and less harmful than trying to patch deployed systems that are already affecting real people. Canada's AIA is one of the few documented cases of this actually working as intended.

Incorrect. Canada's AIA demonstrates that mandatory pre-deployment governance can prevent biased systems from going live — intervening when changes are cheapest, before harm occurs, rather than requiring post-deployment patches that may be structurally inadequate.

14. The EU AI Act's approach to high-risk AI systems — hiring, credit, bail, education — primarily requires organizations to:

Correct. The EU AI Act creates a risk-tiered compliance system. High-risk systems must meet conformity assessment requirements — bias auditing, transparency, human oversight — before deployment, with substantial financial penalties for non-compliance. It does not ban high-risk AI; it conditions deployment on meeting those requirements.

Incorrect. The EU AI Act requires pre-deployment conformity assessments (bias auditing, transparency documentation, human oversight) for high-risk AI systems, with fines of up to €15M or 3% of global annual turnover for breaching those obligations; the Act's top tier of €35M or 7% is reserved for prohibited practices. It does not require per-decision approval, mandate open source, or require retirement of AI in favor of humans.

15. The central lesson of Module 5 — "Fix It or Scrap It?" — is that AI bias is fundamentally:

Correct. This is the module's core thesis. Debiasing techniques, model cards, red-teaming, and AIAs are necessary but not sufficient. They require governance structures — independent oversight, binding accountability, meaningful consequences — that go beyond technical specification. Building those structures is harder than engineering, and it is the only thing that durably works.

Incorrect. While data quality, technical sophistication, and engineer education matter, the module's central conclusion is that AI bias is primarily a governance problem. Technical tools are only effective inside governance structures that create genuine independence from commercial pressures and real accountability for failure.