Module 4 · Lesson 1

When Algorithms Decide Who Gets Hired

Automated hiring tools promise efficiency — and embed discrimination at scale.

If an AI system rejects 75% of qualified women for software roles, whose responsibility is it to act — and how?

In 2014, Amazon's machine learning team built a résumé-screening engine trained on a decade of successful hires. The historical data reflected a tech industry that had long skewed male. By 2015, internal audits revealed the model was systematically downgrading CVs that included the word "women's" (as in "women's chess club") and penalising graduates of all-women colleges. The engineers who discovered this were staff inside the same company that deployed the tool. Amazon disbanded the team by early 2017 and scrapped the system; the episode became public through a Reuters report in October 2018. The tool had been used, in some form, by recruiters in the interim.

No external regulator mandated the shutdown. An internal professional decision — by engineers willing to surface an uncomfortable audit result — ended the system's use. That decision carried career risk. It also prevented ongoing harm to an unknown number of candidates.

Why Hiring AI Encodes Historical Bias

Machine learning systems trained on historical hiring data learn to replicate past decisions, including discriminatory ones. If an industry hired mostly men for engineering roles over a 10-year window, the training signal treats "maleness" as a latent predictor of success — not because of any explicit instruction, but because correlation is extracted regardless of causation.
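The mechanism can be shown with a toy sketch. Everything in it is an assumption made for illustration: the eight résumés, their keywords, and the hire labels are invented, and the per-keyword "weight" (hire rate with the keyword minus hire rate without it) is a deliberately crude stand-in for whatever a real screening model learns. Gender is never an input, yet the keyword "women's" ends up with the most negative weight purely because of how past decisions were distributed.

```python
# Hypothetical historical data: (keywords on the résumé, hired?).
# Gender is NOT recorded anywhere; "women's" is just another keyword.
HISTORY = [
    ({"python", "captain"}, True),
    ({"python", "java"}, True),
    ({"java", "captain"}, True),
    ({"python", "women's"}, False),
    ({"java", "women's"}, False),
    ({"python", "women's", "captain"}, True),
    ({"java"}, False),
    ({"python"}, True),
]


def hire_rate(rows):
    """Fraction of rows that ended in a positive hiring decision."""
    return sum(1 for _, hired in rows if hired) / len(rows)


def keyword_weight(keyword, history):
    """Hire rate with the keyword minus hire rate without it."""
    with_kw = [row for row in history if keyword in row[0]]
    without_kw = [row for row in history if keyword not in row[0]]
    return hire_rate(with_kw) - hire_rate(without_kw)


# A naive "model": one learned weight per keyword.
weights = {
    kw: keyword_weight(kw, HISTORY)
    for kw in ("python", "java", "captain", "women's")
}

# Print keywords from most penalised to most rewarded.
for kw, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{kw:10s} {weight:+.2f}")
```

The weight says nothing about whether the keyword causes better or worse job performance. It only summarises who was hired before, which is exactly the signal a model trained on historical decisions inherits.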

This is not a hypothetical edge case. A 2019 audit by HireVue's own external reviewers found that candidate scores from facial-expression analysis in video interviews correlated with lighting quality and webcam resolution, which are proxies for socioeconomic background rather than competence. HireVue removed facial analysis from its products in 2021 following sustained pressure from the Electronic Privacy Information Center (EPIC) and a coalition of civil-rights organisations.

The pattern is consistent: automated hiring tools trained on biased historical data amplify past discrimination unless actively audited and corrected. The professional question is not whether this happens — it does — but what obligations a developer, deployer, or HR professional carries when they suspect or confirm it.

Documented Impact

A 2020 study published in Science Advances (Lambrecht & Tucker) found that Facebook's ad-delivery algorithm showed software engineering job ads to 20% fewer women than men — not because of any advertiser instruction, but because the platform's optimisation engine found female users more expensive to reach in competitive advertising auctions. The discrimination emerged from the algorithm's cost-minimisation logic, not explicit bias.

The Professional's Position in the Hiring AI Chain

Responsibility in automated hiring is distributed across a long chain: the data scientist who selects training data, the engineer who builds the scoring model, the product manager who decides which outputs to surface, the HR director who chooses to deploy the vendor's tool, and the recruiter who acts on the score. Each link in this chain can either amplify or interrupt discriminatory outcomes.

The key professional obligations that have emerged from documented cases include:

Audit before deployment. The Illinois Artificial Intelligence Video Interview Act (2019) requires employers using AI video-interview tools to notify candidates and conduct annual audits for racial and gender bias before expanding use — the first US law imposing pre-deployment bias auditing on hiring AI.

Document decisions. New York City's Local Law 144 (2023) requires bias audits of automated employment decision tools and also requires employers to post summary results publicly, so that professional accountability is externally visible rather than kept internal.

Maintain a human override. Best-practice frameworks from the EEOC, the UK's Equality and Human Rights Commission, and the EU AI Act all converge on one point: high-stakes employment decisions must retain a meaningful human review path that cannot be blocked by algorithm output alone.

Disparate Impact: A facially neutral policy that disproportionately harms a protected group, regardless of intent. US law (Title VII) prohibits it; proving it requires statistical evidence of differential outcomes.

Proxy Discrimination: When an AI uses a variable (e.g. zip code, college name, résumé gap) that correlates strongly with a protected characteristic (race, gender, disability), producing discriminatory outcomes without using the protected variable directly.

Algorithmic Auditing: Systematic testing of a model's outputs across demographic groups to detect differential error rates, differential approval rates, or other evidence of biased performance.
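A minimal sketch of the simplest audit these definitions imply: compare selection rates across groups and compute each group's impact ratio. The 0.8 threshold is the "four-fifths" rule of thumb from the EEOC's Uniform Guidelines, which the text above does not mention, and the applicant counts are hypothetical. A ratio below the threshold is a flag for closer investigation, not a legal finding.

```python
from collections import Counter

FOUR_FIFTHS = 0.8  # rule-of-thumb threshold, an assumption added for illustration


def selection_rates(records):
    """records: iterable of (group, was_selected) pairs."""
    totals, selected = Counter(), Counter()
    for group, was_selected in records:
        totals[group] += 1
        if was_selected:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratios(rates):
    """Each group's selection rate divided by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group had any selections; ratio is undefined")
    return {group: rate / best for group, rate in rates.items()}


# Hypothetical screening outcomes: 100 applicants per group.
records = (
    [("men", True)] * 30 + [("men", False)] * 70
    + [("women", True)] * 15 + [("women", False)] * 85
)

rates = selection_rates(records)
ratios = impact_ratios(rates)
flagged = [group for group, ratio in ratios.items() if ratio < FOUR_FIFTHS]

print(rates)    # selection rate per group
print(ratios)   # impact ratio per group
print(flagged)  # groups below the four-fifths threshold
```

With small samples the ratio is noisy, so in practice it is usually paired with a significance test before drawing conclusions.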

Professional Principle

The Amazon and HireVue cases share a structure: people working on or reviewing the system identified the problem before any regulator ordered action. The ethical moment is not the deployment decision; it is the point at which someone with knowledge of the problem chooses whether to act on it. Silence at that moment is itself a professional choice.

What Responsible Deployment Looks Like

LinkedIn's 2019 Fairness-Aware Ranking paper, authored by its engineering team, describes how the platform adjusted its job-recommendation algorithm after discovering it over-represented male candidates for certain roles. The team introduced a fairness constraint into the ranking objective — accepting a slight reduction in click-through rate in exchange for more equitable exposure across genders. The paper was published openly, making the trade-off visible.

This represents the positive case: professionals who identify bias, document it, implement a correction with known trade-offs, and publish their reasoning. The approach was not perfect — fairness constraints in ranking remain technically contested — but it demonstrated that commercial pressure and ethical obligation are not always mutually exclusive when professionals choose to treat them as joint constraints.
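To make the idea of a fairness constraint in ranking concrete, here is a small sketch of greedy re-ranking with a minimum-representation floor. It is not LinkedIn's implementation: the function name `fair_rerank`, the candidate data, and the 50/50 target shares are all hypothetical. At each position k, any group that would otherwise hold fewer than floor(share × k) of the first k slots gets priority; otherwise the highest-scoring remaining candidate is taken.

```python
import math
from collections import defaultdict


def fair_rerank(candidates, target_share):
    """Greedy re-ranking with a minimum-representation constraint.

    candidates:   list of (candidate_id, group, score)
    target_share: dict mapping group -> desired share of every top-k prefix
    """
    # One queue per group, best score first.
    queues = defaultdict(list)
    for cand in sorted(candidates, key=lambda c: c[2], reverse=True):
        queues[cand[1]].append(cand)

    ranked, counts = [], defaultdict(int)
    for k in range(1, len(candidates) + 1):
        available = [g for g in queues if queues[g]]
        # Groups that would fall below their floor at prefix length k.
        behind = [
            g for g in available
            if counts[g] < math.floor(target_share.get(g, 0) * k)
        ]
        pool = behind or available
        # Among eligible groups, take the one whose best candidate scores highest.
        group = max(pool, key=lambda g: queues[g][0][2])
        ranked.append(queues[group].pop(0))
        counts[group] += 1
    return ranked


# Hypothetical candidates: (id, group, relevance score).
candidates = [
    ("m1", "M", 0.95), ("m2", "M", 0.90), ("m3", "M", 0.85),
    ("m4", "M", 0.80), ("m5", "M", 0.75), ("m6", "M", 0.70),
    ("w1", "W", 0.78), ("w2", "W", 0.72), ("w3", "W", 0.65), ("w4", "W", 0.60),
]

by_score = sorted(candidates, key=lambda c: c[2], reverse=True)
reranked = fair_rerank(candidates, {"M": 0.5, "W": 0.5})

print([c[0] for c in by_score[:4]])
print([c[0] for c in reranked[:4]])
```

The trade-off described above is visible in the output: the re-ranked top four has a lower total score than the score-only top four, in exchange for balanced exposure.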

Lesson 1 Quiz

When Algorithms Decide Who Gets Hired — 5 questions

1. Amazon's résumé-screening AI downgraded candidates primarily because it was trained on data that:

Correct. The model learned correlations from historical data in which most successful hires were male — no explicit gender tag was needed for bias to emerge.

Not quite. The discrimination arose from learned correlations in biased historical outcomes, not explicit labels or deliberate manipulation.

2. The term "proxy discrimination" in AI hiring refers to:

Correct. Proxy discrimination uses variables like zip code or college name — which correlate with race or gender — to discriminate without directly using protected attributes.

Proxy discrimination is specifically about using correlated variables, not explicit protected-characteristic inputs or human overrides.

3. HireVue removed facial analysis from its video-interview product in 2021 primarily due to:

Correct. External civil-society pressure following an independent audit — not a direct legal order — drove HireVue's decision to remove facial analysis.

The Illinois Act requires notification and audits, but HireVue's removal of facial analysis was driven primarily by civil-society pressure from EPIC and coalition partners.

4. New York City's Local Law 144 (2023) requires employers using automated employment decision tools to:

Correct. Local Law 144 mandates bias audits and public posting of summary audit results — making accountability externally visible.

Local Law 144's key requirement is bias auditing with public disclosure of results, not a blanket consent or per-application human review mandate.

5. The Lambrecht & Tucker (2020) finding about Facebook's ad-delivery algorithm demonstrated that gender-discriminatory outcomes can arise from:

Correct. The algorithm found female users more expensive to reach in competitive auctions and de-prioritised them — discrimination as a side-effect of cost optimisation, not explicit intent.

No advertiser instruction or training-data bias was the cause here. The discrimination emerged from the platform's own cost-minimisation objective in ad auctions.

Lab 1 — Auditing a Hiring Algorithm

Practical AI ethics conversation · Complete 3 exchanges to finish

Your Scenario

You are an HR data analyst at a mid-size technology company. Your team has just deployed a third-party résumé-screening tool. After two months, a junior recruiter notices that only 9% of the candidates advancing past the first screen are women, despite women making up 38% of applicants. You have been asked to investigate and recommend next steps.

Use this AI assistant to work through: (1) how to determine whether the disparity is statistically significant, (2) what types of proxy variables might be driving the gap, and (3) what you should recommend to leadership — including whether to suspend the tool pending investigation.
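As a starting point for question (1), the sketch below runs an exact one-sided binomial test. The scenario gives only percentages, so the count of 200 advancing candidates is a hypothetical assumption, as are the variable names. The test asks: if each advancing candidate were drawn from the applicant pool without regard to gender, how likely is it that 9% or fewer would be women when women are 38% of applicants?

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


APPLICANT_SHARE_WOMEN = 0.38  # from the scenario
N_ADVANCED = 200              # hypothetical: the scenario does not give a count
WOMEN_ADVANCED = 18           # 9% of the hypothetical 200

p_value = binomial_lower_tail(WOMEN_ADVANCED, N_ADVANCED, APPLICANT_SHARE_WOMEN)
print(f"P(18 or fewer women among 200 advancing) = {p_value:.3g}")
```

A very small p-value indicates the gap is unlikely to be chance, but it does not say why the gap exists. The test assumes equally qualified applicant pools and independent decisions, so it supports the proxy-variable investigation in question (2) rather than replacing it.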

Ethics Lab Assistant

Hiring AI Bias

You're investigating a potential gender-bias issue in a résumé-screening tool. Let's work through this systematically. Start by telling me: what data do you currently have access to, and what's your first priority — determining statistical significance, identifying proxy variables, or framing a recommendation to leadership?

Module 4 · Lesson 2

Medical AI and the Duty to Disclose

When clinical AI fails unequally, the professional silence that follows can be as harmful as the algorithm itself.

A tool used to allocate care to 200 million people underestimates illness severity in Black patients. Who had the obligation to act — and when?

In October 2019, researchers Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan published findings in Science that a widely deployed commercial health risk algorithm — built by Optum and used by hospitals and insurers to identify patients needing complex care management — was systematically underestimating illness severity in Black patients relative to equally sick white patients. The model used health-care cost as a proxy for health need. Because structural racism in the US health system means Black patients historically incur lower costs for the same level of illness — due to documented barriers to access — the algorithm interpreted lower cost as lower need. The result: Black patients were assigned lower risk scores and were less likely to be enrolled in care-management programs. The disparity was large: at the same risk-score threshold, Black patients were significantly sicker than white patients on objective clinical measures.

Optum acknowledged the findings and said it was working to address them. But the tool had been used, in various versions, for years before the academic audit surfaced the problem. Hospitals and payers who licensed the tool had made care-allocation decisions based on its output without knowing — or without disclosing — that its proxy variable embedded structural inequity.

The Proxy-Variable Problem in Clinical AI

The Optum case is a textbook example of what researchers now call "outcome proxy failure": when a model is trained to predict a measurable proxy (cost, readmission, appointment compliance) rather than the underlying health need, it inherits all the structural biases embedded in that proxy. Health-care cost is shaped by access, insurance status, geography, historical discrimination in medical treatment, and patient trust — none of which reflect need. Training on cost as a proxy for need encodes all of those structural inequities into the scoring logic.
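The following sketch illustrates the proxy failure with synthetic numbers. It does not reproduce the Optum model or the Obermeyer et al. data. Two hypothetical groups, A and B, have identical distributions of health need, but group B is assumed to incur 30% less cost at the same level of need (the `ACCESS_FACTOR` values are invented). Enrolling the top 40 patients by cost is then compared with enrolling the top 40 by need.

```python
# Hypothetical access effect: group B generates less cost at equal need.
ACCESS_FACTOR = {"A": 1.0, "B": 0.7}

# Identical need distribution (0..99) in both groups.
patients = [
    {"group": group, "need": need, "cost": need * ACCESS_FACTOR[group]}
    for group in ("A", "B")
    for need in range(100)
]


def enrol_top(rows, key, n):
    """Enrol the n patients with the highest value of `key`."""
    return sorted(rows, key=lambda p: p[key], reverse=True)[:n]


def count(rows, group):
    return sum(1 for p in rows if p["group"] == group)


def mean_need(rows, group):
    needs = [p["need"] for p in rows if p["group"] == group]
    return sum(needs) / len(needs)


by_cost = enrol_top(patients, "cost", 40)  # what a cost-trained score rewards
by_need = enrol_top(patients, "need", 40)  # what the programme is meant to target

print("Group B enrolled, ranking by cost:", count(by_cost, "B"))
print("Group B enrolled, ranking by need:", count(by_need, "B"))
print("Mean need of enrolled A patients (cost ranking):", mean_need(by_cost, "A"))
print("Mean need of enrolled B patients (cost ranking):", mean_need(by_cost, "B"))
```

Under these assumptions, ranking by cost enrols far fewer group B patients, and those who are enrolled are sicker on average than enrolled group A patients. That is the same qualitative pattern the audit reported, produced here without any group variable in the score.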

The problem is not unique to this vendor. A 2021 review in The Lancet Digital Health analysed 130 clinical AI studies and found that only 56% reported model performance by any demographic subgroup. Without subgroup analysis, disparate performance in minority populations is invisible until deployment reveals it — often after harm has already occurred at scale.

Documented Scale

Obermeyer et al. estimated the Optum algorithm was influencing care decisions for approximately 200 million people across US hospitals and insurers at the time of publication. This is not a small pilot study — it represents routine algorithmic deployment at a scale that dwarfs most clinical trials of new drugs, which are held to strict bias-reporting requirements under FDA regulation.

Professional Obligations in Clinical AI Deployment

The clinical AI context is unusual because it intersects two sets of professional duties: data-science ethics and medical ethics. Physicians and nurses who use algorithmic recommendations carry duties of non-maleficence and beneficence that do not disappear because the recommendation came from software. The question "should I act on this score?" remains a clinical judgment, and clinical judgment carries professional accountability.

Clinician disclosure obligations: The American Medical Association's 2019 policy on augmented intelligence states that physicians retain responsibility for patient care even when using AI tools, and that they should understand the limitations of tools they rely on. Relying on a tool known (or suspected) to perform differently across demographic groups, without disclosure to the patient, may constitute a violation of informed consent.

Vendor disclosure obligations: The FDA's 2021 Action Plan for AI/ML-Based Software as a Medical Device introduced the concept of a "predetermined change control plan" — vendors must pre-specify how their models will be monitored and updated post-deployment. This implicitly requires subgroup monitoring as part of post-market surveillance.

Institutional obligations: A 2021 joint statement by the Association of American Medical Colleges and the National Academy of Medicine called for hospitals deploying clinical AI to maintain standing bias-audit committees with authority to suspend tools pending investigation — analogous to the data-safety monitoring boards used in clinical trials.

Outcome Proxy Failure: A model error arising when a measurable proxy variable (e.g., healthcare cost) is used to represent an unmeasurable target (e.g., health need), and the proxy is itself shaped by structural inequity, causing the model to replicate that inequity at scale.

Subgroup Analysis: Evaluating model performance separately for demographic subgroups (race, gender, age, insurance status) to detect differential error rates that would be invisible in aggregate-performance metrics.

Post-Market Surveillance: Ongoing monitoring of a deployed AI tool's real-world performance, analogous to post-approval drug safety monitoring. Required by FDA for certain Software as a Medical Device (SaMD) categories.
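A minimal sketch of subgroup analysis for one metric, the false-negative rate. The groups "A" and "B" and all counts are hypothetical. The point is that an aggregate figure can look acceptable while hiding a gap between subgroups.

```python
def false_negative_rate(pairs):
    """pairs: list of (actually_positive, predicted_positive) booleans."""
    predictions_for_true_cases = [pred for actual, pred in pairs if actual]
    if not predictions_for_true_cases:
        raise ValueError("no true positive cases; FNR is undefined")
    missed = sum(1 for pred in predictions_for_true_cases if not pred)
    return missed / len(predictions_for_true_cases)


def fnr_by_group(rows):
    """rows: list of (group, actually_positive, predicted_positive)."""
    grouped = {}
    for group, actual, predicted in rows:
        grouped.setdefault(group, []).append((actual, predicted))
    return {group: false_negative_rate(pairs) for group, pairs in grouped.items()}


# Hypothetical evaluation set: 100 true cases and 50 non-cases per group.
rows = (
    [("A", True, True)] * 80 + [("A", True, False)] * 20 + [("A", False, False)] * 50
    + [("B", True, True)] * 70 + [("B", True, False)] * 30 + [("B", False, False)] * 50
)

overall_fnr = false_negative_rate([(actual, pred) for _, actual, pred in rows])
group_fnr = fnr_by_group(rows)
relative_gap = group_fnr["B"] / group_fnr["A"] - 1

print(f"Overall FNR: {overall_fnr:.2f}")
print(f"FNR by group: {group_fnr}")
print(f"Group B misses {relative_gap:.0%} more true cases than group A")
```

The same pattern extends to other metrics (false-positive rate, calibration) and to intersections of subgroups, subject to having enough cases in each cell for the estimate to be meaningful.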

The Disclosure Moment

In the Optum case, the ethical moment was not deployment — it was the point at which hospital administrators and vendor data scientists had sufficient data to suspect the disparity existed. The 2019 Science paper used data that vendors and hospital partners could, in principle, have analysed themselves. The professional question is: when does an obligation to look become an obligation to act on what you find?

Structural vs. Individual Responsibility

One response to documented clinical AI bias is to treat it as a vendor problem — the tool was wrong, the vendor must fix it. But the Optum case illustrates that deploying institutions bear independent responsibility. The hospitals and payers that licensed the tool made a professional choice to rely on vendor-provided risk scores for high-stakes care allocation decisions. Outsourcing the algorithm does not outsource the obligation to understand it.

The parallel to financial services is instructive: when banks used third-party credit-scoring algorithms that produced discriminatory outcomes, regulators held the banks, not just the scoring vendors, liable under the Equal Credit Opportunity Act. The principle that an institution cannot disclaim responsibility for a tool it has chosen to embed in a consequential decision process is increasingly being applied to health care through the emerging regulatory frameworks described above.

Lesson 2 Quiz

Medical AI and the Duty to Disclose — 5 questions

1. The Optum health-risk algorithm underestimated illness severity in Black patients because it used which proxy variable?

Correct. The model used cost as a proxy for health need. Because structural barriers reduce healthcare utilisation for Black patients at equal illness severity, lower cost was misread as lower need.

The key proxy variable was health-care cost — not hospitalisations, diagnoses, or patient-reported data. Cost embedded structural inequity in access and utilisation.

2. The 2021 Lancet Digital Health review found that what proportion of clinical AI studies reported performance by demographic subgroup?

Correct. Only about 56% of the 130 studies reviewed reported subgroup performance — meaning nearly half provided no demographic breakdown of model accuracy.

The review found only about 56% of studies reported any subgroup performance, leaving roughly 44% with no demographic breakdown of model accuracy.

3. Under the AMA's 2019 policy on augmented intelligence, when a physician relies on an AI tool that performs poorly across demographic groups, responsibility for patient outcomes:

Correct. The AMA's 2019 policy explicitly states that physicians retain responsibility for patient care even when using AI recommendations — clinical judgment obligations do not transfer to the software vendor.

AMA policy holds that physician responsibility for patient care is retained even when using AI tools. Responsibility does not transfer to vendors or IT departments.

4. The principle that a hospital cannot disclaim responsibility for a third-party AI tool it has embedded in high-stakes care decisions is most directly analogous to which legal precedent?

Correct. Regulators held banks liable under ECOA for discriminatory third-party credit scores — establishing that outsourcing a decision tool does not outsource the legal or ethical obligation to ensure it is fair.

The closest analogy is ECOA: banks were held liable for discriminatory outcomes from third-party scoring tools, establishing that embedding a tool in a consequential decision process carries accountability.

5. "Outcome proxy failure" in clinical AI specifically refers to:

Correct. Outcome proxy failure occurs specifically when the proxy variable used for training (like cost) embeds structural inequity, causing the model to systematically underserve disadvantaged groups.

Outcome proxy failure is specifically about using a measurable proxy — like cost — that embeds structural inequity, not about distribution shift, mislabelled data, or actionability of predictions.

Lab 2 — Clinical AI Disclosure Decision

Practical AI ethics conversation · Complete 3 exchanges to finish

Your Scenario

You are a clinical informatics director at a regional hospital network. Your data team has just completed a subgroup analysis of your sepsis-prediction AI — a tool that has been in production for 18 months. The analysis shows the model's false-negative rate (missed sepsis cases) is 31% higher for Black patients than for white patients. The vendor says this is within acceptable aggregate performance benchmarks. Your CMO is asking whether you need to disclose this to affected patients, suspend the tool, or simply flag it for the next software update cycle.

Use this assistant to think through: (1) what your disclosure obligations are under informed consent principles, (2) what the difference between "acceptable aggregate performance" and equitable performance means ethically, and (3) what you should recommend to your CMO with a clear rationale.

Ethics Lab Assistant

Clinical AI Ethics

This is a genuinely difficult professional situation — a documented disparity in a production clinical AI tool. Before I help you think through the disclosure and suspension question, tell me: does your hospital have a standing AI ethics committee or bias-monitoring protocol? And has the vendor been formally notified of your subgroup analysis results?

Module 4 · Lesson 3

Predictive Policing and the Feedback Loop

When an algorithm trains on biased enforcement data, it predicts — and reinforces — the bias itself.

If a crime-prediction tool sends more officers to already over-policed neighbourhoods, what does it mean for the data it collects to be called "objective"?

PredPol (later rebranded Geolitica) was deployed by the Los Angeles Police Department beginning in 2012, and subsequently by dozens of US police departments. The system used historical crime report data to generate 500-by-500-foot "hotspot" boxes where officers were directed to patrol. Its proponents claimed it was race-neutral, using only crime type, crime location, and time as inputs, not demographic data.

A 2021 investigation by the Los Angeles Times and the Human Rights Data Analysis Group (HRDAG), using internal LAPD data obtained via public-records requests, found that PredPol directed patrols disproportionately to low-income communities of colour — not because those areas had objectively higher underlying crime rates, but because they had historically higher rates of police contact. More patrols produced more arrests, which produced more data, which produced more hotspot predictions. The feedback loop was self-reinforcing. The HRDAG analysis also found that drug-related offences — highly sensitive to where police actually patrol — were the strongest driver of hotspot designation in several precincts. LAPD announced it was ending its contract with PredPol in April 2020, several months before the formal analysis was published.

The Self-Reinforcing Feedback Loop

The central structural problem with predictive policing is that it creates a closed loop between prediction and data collection. In most machine-learning contexts, training data is collected independently of the model's predictions: a medical AI does not cause patients to develop the conditions it predicts. Policing is fundamentally different. Deploying officers to a location based on an algorithmic prediction increases the probability of recording a crime in that location, because many offences, drug offences in particular, enter the record only when police are present to observe them.

This means the historical crime data used to train PredPol-style tools is not a neutral record of where crime occurs; it is a record of where police were previously deployed. A model trained on that data does not learn where crime is; it learns where prior policing concentrated. The model then concentrates policing further in those areas, generating confirmatory data, and the loop tightens.
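The loop can be sketched in a few lines. This is not PredPol's algorithm: the two districts, the starting counts, and the allocation rule (80% of patrols go to whichever district has the most recorded crime so far) are hypothetical assumptions chosen to make the mechanism visible. Both districts have the same underlying rate of recordable offences per patrol, so any divergence in the records comes from deployment alone.

```python
# Identical underlying rate of recordable offences per patrol in both districts.
TRUE_RATE = {"north": 0.5, "south": 0.5}

# Historical records are already skewed by past deployment (hypothetical counts).
recorded = {"north": 60.0, "south": 40.0}

PATROLS_PER_ROUND = 100
HOTSPOT_SHARE = 0.8  # assumed share of patrols sent to the predicted hotspot


def share(counts, district):
    """District's share of all recorded crime."""
    return counts[district] / sum(counts.values())


initial_share = share(recorded, "north")
share_history = [initial_share]

for _ in range(10):
    # "Prediction": the district with the most recorded crime so far.
    hotspot = max(recorded, key=recorded.get)
    for district in recorded:
        fraction = HOTSPOT_SHARE if district == hotspot else 1 - HOTSPOT_SHARE
        # More patrols mean more offences observed and recorded.
        recorded[district] += PATROLS_PER_ROUND * fraction * TRUE_RATE[district]
    share_history.append(share(recorded, "north"))

final_share = share_history[-1]
print(f"North's share of recorded crime: {initial_share:.2f} -> {final_share:.2f}")
```

North's share of recorded crime climbs toward the patrol allocation share even though nothing about the underlying rate differs between districts. A model retrained on these records would see its earlier prediction confirmed.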

Santa Cruz, California became the first US city to ban predictive policing tools in June 2020, citing this structural feedback problem alongside civil-liberties concerns. The Santa Cruz resolution explicitly used the phrase "discriminatory feedback loop" in its legislative findings — a legal document formally acknowledging the algorithmic mechanism described above.

The "Race-Neutral" Claim

PredPol's developers consistently described it as race-neutral because demographic data was not a direct input. The HRDAG analysis demonstrated why this argument fails: historical police contact data is not race-neutral data. It is the product of decades of racially disparate enforcement policy. Using it as training data does not neutralise that history — it computes with it. Excluding race as a variable does not exclude racial disparity as an outcome when the input data encodes that disparity structurally.

Professional Ethics in Law Enforcement AI

Law enforcement officers and department administrators face a specific version of the professional dilemma: using an algorithm that may constitute a discriminatory practice, in a profession where explicit racial discrimination is illegal under the Equal Protection Clause and the pattern-or-practice provisions of 42 U.S.C. § 14141.

The officer's dilemma: An individual officer directed by PredPol to patrol a specific block faces a choice about how to conduct that patrol. Discretionary enforcement in that block — whom to stop, whom to question, whom to cite — carries its own bias risks. The algorithm amplifies the stakes of that individual-level discretion rather than eliminating it.

The administrator's dilemma: Police chiefs and city officials who procure predictive policing tools bear responsibility for understanding what "race-neutral inputs" actually means given the structure of the training data. The LAPD's 2020 termination of PredPol came after sustained pressure from civil-rights organisations and city council members — not from internal data review alone.

The data scientist's dilemma: PredPol's founders were academics who published their methodology and believed it would reduce crime. The 2021 investigative findings confronted them with evidence that their system was operating differently in practice than their model assumed. The professional obligation to engage with that evidence — rather than defend the original model design — is a central test of engineering ethics.

Feedback Loop Bias: A self-reinforcing cycle in which an algorithm's predictions influence the data-collection process in ways that confirm and amplify the initial bias. Particularly acute when prediction drives deployment of observation resources.

Structural Proxy: A variable that appears race-neutral but encodes racial disparity through historical structural processes — e.g., historical arrest rates in over-policed neighbourhoods, or healthcare costs under a racially inequitable access system.

Pattern-or-Practice: A legal standard in 42 U.S.C. § 14141 by which systematic discriminatory policing practices, even without individual discriminatory intent, can trigger federal investigation and consent decree requirements.

Chicago's Strategic Subject List

Chicago's Strategic Subject List (SSL), also known as the "heat list," assigned risk scores to individuals based on criminal history and social network analysis, purporting to predict who would be involved in gun violence. A 2017 RAND Corporation evaluation found no evidence the SSL reduced gun violence. An ACLU investigation found that Black and Latino men were disproportionately listed. Chicago terminated the program in 2020. This case illustrates a second failure mode beyond the feedback loop: predictive tools that generate racially disparate outputs without demonstrable efficacy — causing harm with no corresponding benefit.

What Responsible Discontinuation Looks Like

Santa Cruz (2020), LAPD (2020), and Chicago (2020) all terminated predictive policing tools in the same year. The mechanism differed: Santa Cruz acted on principled grounds before a full local audit; LAPD acted under external pressure with a contract termination; Chicago acted after both a RAND efficacy evaluation and an ACLU investigation documented failure. Each path reached the same professional conclusion through a different accountability mechanism.

The lesson for professionals in adjacent fields (social services, child protective services, parole systems) is that the structure of feedback-loop bias is not unique to policing. Any algorithmic system deployed to allocate investigative or supervisory resources, trained on data reflecting historical over-surveillance of disadvantaged communities, risks replicating and amplifying those patterns. The obligation to ask "what does this training data actually represent?" comes before deployment, not after it.

Lesson 3 Quiz

Predictive Policing and the Feedback Loop — 5 questions

1. PredPol's developers claimed the tool was "race-neutral" because it did not use demographic data as an input. The HRDAG analysis showed this claim failed because:

Correct. The "race-neutral" claim fails because historical arrest and patrol data encodes decades of racially disparate enforcement — using that data without demographic inputs still computes racial disparity into the output.

The failure was structural, not technical. Historical enforcement data is not race-neutral data; it reflects decades of racially disparate policing, regardless of whether race appears as an explicit model variable.

2. In predictive policing, "feedback loop bias" specifically arises because:

Correct. Deploying officers to predicted hotspots increases observation density, which increases recorded crime, which feeds back into the training data — a self-reinforcing loop that confirms and amplifies the original prediction.

Feedback loop bias arises from the structural relationship between prediction and data collection — more patrols produce more recorded crime, which reinforces the hotspot prediction. It does not require falsification or deliberate vendor manipulation.

3. Santa Cruz, California's 2020 ordinance banning predictive policing tools was notable because it:

Correct. Santa Cruz was the first US city to ban predictive policing tools, and its resolution formally named the discriminatory feedback loop mechanism — an unusual degree of technical specificity in a legislative document.

Santa Cruz acted proactively on principled grounds, without a federal court order or local efficacy study. It was the first US city to ban such tools and explicitly named the feedback loop in its findings.

4. Chicago's Strategic Subject List (SSL) was terminated in 2020 partly because a RAND evaluation found:

Correct. The 2017 RAND evaluation found no evidence that the SSL reduced gun violence, so the tool was generating racially disparate outputs with no measurable benefit.

The RAND finding was specifically about efficacy: no evidence of gun-violence reduction. This is distinct from accuracy rates, security vulnerabilities, or constitutional claims.

5. The professional ethics principle illustrated across the PredPol, SSL, and Optum cases is best described as:

Correct. The recurring professional failure across these cases was deploying tools without adequately interrogating what the training data represents and whose historical disadvantage it encodes.

The lesson is not a blanket prohibition or a claim about transparency alone. It is the specific obligation to interrogate training data before deployment — asking what it represents and who it may harm.

Lab 3 — Evaluating a Predictive Tool

Practical AI ethics conversation · Complete 3 exchanges to finish

Your Scenario

You are a senior analyst at a county child protective services agency. Your director wants to implement an algorithmic risk-scoring tool to prioritise which families receive home visits after a maltreatment report. The vendor claims the tool is validated on national data and race-neutral because it does not use race as an input. Your team's preliminary review suggests the tool's training data comes largely from counties with historically high CPS contact rates in low-income neighbourhoods.

Explore with the assistant: (1) why the "race-neutral inputs" claim may not ensure equitable outcomes in this context, (2) how the feedback loop problem from predictive policing applies to child protective services, and (3) what due-diligence steps you should insist on before recommending deployment to your director.

Ethics Lab Assistant

Algorithmic Accountability

Child protective services is one of the highest-stakes domains for algorithmic decision-making — errors have direct consequences for family safety and for wrongful family separation. The feedback-loop concern you've identified is real and well-documented. Let's start: what do you know about the specific variables the vendor uses as inputs, and does the vendor offer subgroup performance data broken down by race and income level?

Module 4 · Lesson 4

Whistleblowing, Dissent, and the Internal Advocate

The hardest professional decision is often not what to recommend — it is whether to speak when the organisation has already decided.

When Google engineers refused to continue work on Project Maven in 2018, what ethical framework were they using — and does it apply beyond Big Tech?

In 2017, Google entered a contract with the US Department of Defense to help develop object-recognition algorithms for drone footage analysis under Project Maven. In early 2018, a group of Google employees learned the full scope of the project and began circulating an internal petition that eventually gathered over 4,000 signatures. The petition called on CEO Sundar Pichai to cancel the contract, arguing that "Google should not be in the weapons business." Several senior AI researchers resigned rather than continue work on the project.

Google announced in mid-2018 that it would not renew the Maven contract when it expired in 2019. In June 2018 the company published its AI Principles, which as published ruled out technologies "whose purpose contravenes widely accepted principles of international law and human rights" and weapons or other technologies "whose principal purpose or implementation is to cause or directly facilitate injury to people." The principles were a direct institutional response to the internal dissent. The employees' action can be classified as whistleblowing, organised labour action, or principled professional refusal, but its documented outcome is the same under any label: an institutional policy change driven by employees' professional objection.

The Taxonomy of Internal AI Dissent

Professional dissent in AI development takes several forms, each with distinct legal protections and organisational consequences:

Internal objection: Raising concerns through recognised channels such as a manager, an ethics review board, or the legal team. This is the lowest-risk form of dissent; at Google, the initial Maven petition was internal. It is protected under general employment law if it relates to legal compliance, but it carries no specific whistleblower protection unless it involves a regulated domain (securities, health and safety, discrimination).

Collective action: Coordinating with colleagues to escalate objections — as in the Maven petition or the subsequent Google Walkout (2018, over sexual harassment policy). This is legally protected concerted activity under the National Labor Relations Act in the US, regardless of union membership. Employers cannot lawfully retaliate against employees organising collectively over workplace policy.

Public disclosure: Going to press or publishing externally. This carries the highest professional risk and the most variable legal protection. The Whistleblower Protection Act covers federal employees who report wrongdoing such as violations of law, gross mismanagement, or gross waste of funds. Private-sector whistleblower protections are sector-specific (Dodd-Frank for securities, OSHA for safety, etc.) and often have narrow scope.

The ethical framework underlying each form is distinct. Internal objection is a duty-based argument: I am obligated not to participate in a project I believe is unethical. Collective action is a rights-based argument: employees have rights to shape the conditions and purpose of their labour. Public disclosure is a consequentialist argument: the public harm from silence exceeds the personal and institutional harm from disclosure.

Timnit Gebru — Google, 2020

In December 2020, Google AI researcher Timnit Gebru was dismissed following a dispute over a research paper she co-authored examining risks of large language models, including environmental costs and bias in training data. Gebru said she was fired after pushing back on a demand to retract the paper before publication. Google said she resigned. The dispute prompted a second wave of employee protest and further departures from the team: Margaret Mitchell, who co-led Google's Ethical AI team with Gebru, was dismissed in February 2021, and other colleagues resigned. The episode documented the specific professional risk faced by AI researchers who produce findings that conflict with their employer's commercial interests. It is not a hypothetical but a concrete career outcome.

The Structural Position of AI Ethics Roles

One lesson from both the Maven and Gebru cases is that the institutional placement of an ethics function determines how much power it carries. At Google in 2020, the AI ethics team lacked authority to block publication decisions or product deployments — it operated as an advisory function within an engineering organisation whose primary accountability was to product and revenue goals.

By contrast, the FDA's Center for Devices and Radiological Health has statutory authority to block the deployment of medical AI that fails pre-market review. The difference is not ethical seriousness — it is institutional design. Ethics functions without authority to say no are, structurally, communication exercises rather than governance mechanisms.

The EU AI Act (in force since 2024, with obligations phased in over the following years) attempts to impose this structural design requirement from outside organisations. For high-risk AI systems (employment, credit, law enforcement, migration, critical infrastructure), it requires a conformity assessment (involving a notified body for some categories), a human oversight mechanism with authority to intervene, and post-market monitoring. These governance obligations cannot be met by advisory-only ethics teams.

Concerted Activity: Collective action by employees regarding workplace conditions, including workplace policy objections, protected under the NLRA regardless of union status. Employer retaliation is unlawful.

Ethics Washing: The practice of creating visible ethics processes (principles, review boards, ethics teams) that lack authority to halt harmful products — providing reputational cover without functional governance.

EU AI Act High-Risk Category: A classification under the 2024 EU AI Act applying to AI systems used in employment, credit scoring, law enforcement, education, and critical infrastructure — requiring conformity assessments, human oversight mechanisms, and post-market monitoring as legal obligations, not voluntary best practice.

The Professional Question Across All Four Lessons

Each case in this module — Amazon's hiring AI, Optum's health risk scores, PredPol's patrol algorithm, Google Project Maven, and Timnit Gebru's dismissal — involves a professional who had knowledge of a potential harm before external regulators or the public did. In each case, the ethical moment was not the system's deployment. It was the moment a professional with relevant knowledge had to decide whether to act on it, stay silent, or actively suppress it. The question this module asks you to carry forward is simple: what is your default, and is it defensible?

Building Organisational Conditions for Ethical Dissent

Waiting for individual professionals to risk their careers is not a governance system — it is an externalisation of institutional responsibility onto individuals. Organisations that produce defensible AI outcomes typically have several structural features that reduce the individual cost of raising concerns:

Anonymous ethics reporting channels with a documented response protocol and non-retaliation policy (not just a stated commitment).

Pre-deployment review gates with documented criteria for what triggers additional ethical review — including fairness testing, subgroup analysis, and human-override requirements — before a product ships.

Cross-functional review authority, where an ethics or legal team can issue a formal hold that pauses deployment pending review, rather than offering an opinion that product managers can ignore.

Post-deployment monitoring obligations assigned to specific named roles, with defined escalation criteria and a board-level reporting line for AI risk — analogous to how financial institutions treat compliance risk.

None of these structures guarantee ethical outcomes. But they create the organisational conditions under which individual professionals can raise concerns without bearing the full personal cost of doing so — which is the prerequisite for consistent, rather than exceptional, ethical behaviour.
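A pre-deployment review gate with documented criteria can be expressed as a simple, auditable check. The following is a minimal sketch, assuming a single fairness criterion (the relative gap in subgroup error rates) and a human-override requirement. The names `GateCriteria`, `ReviewInput`, and `review_gate`, along with the 10% threshold, are hypothetical and not drawn from any organisation's actual process.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class GateCriteria:
    max_relative_gap: float       # largest tolerated relative gap in subgroup error rates
    require_human_override: bool  # whether a human-override path is mandatory

@dataclass
class ReviewInput:
    subgroup_error_rates: Dict[str, float]  # e.g. false-positive rate per subgroup
    has_human_override: bool

def review_gate(review: ReviewInput, criteria: GateCriteria) -> Tuple[str, List[str]]:
    """Return ("hold", reasons) if any documented criterion fails, else ("approve", [])."""
    reasons: List[str] = []
    rates = list(review.subgroup_error_rates.values())
    lowest, highest = min(rates), max(rates)
    if lowest == 0:
        gap_exceeded = highest > 0  # any error against a zero-error subgroup is a gap
    else:
        gap_exceeded = (highest / lowest - 1) > criteria.max_relative_gap
    if gap_exceeded:
        reasons.append("subgroup error-rate gap exceeds documented threshold")
    if criteria.require_human_override and not review.has_human_override:
        reasons.append("no human-override mechanism")
    return ("hold", reasons) if reasons else ("approve", [])

# Hypothetical criteria and test results.
criteria = GateCriteria(max_relative_gap=0.10, require_human_override=True)
submission = ReviewInput({"group_a": 0.100, "group_b": 0.119}, has_human_override=False)
decision, reasons = review_gate(submission, criteria)
print(decision, reasons)
```

The design choice that matters here is the return value: the gate issues a hold with recorded reasons rather than an opinion, which mirrors the difference between review authority and advisory input described above.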

Lesson 4 Quiz

Whistleblowing, Dissent, and the Internal Advocate — 5 questions

1. What documented outcome did the Google Project Maven employee petition most directly produce?

Correct. Google did not renew the Maven contract when it expired in 2019, and the AI Principles published in response explicitly addressed the concerns raised in the petition.

The documented outcome was non-renewal of the contract upon expiration and the subsequent publication of Google's AI Principles — not an immediate cancellation or federal investigation.

2. Under US law, employee "concerted activity" regarding workplace policy — including collective objection to AI projects — is legally protected under:

Correct. The NLRA protects concerted activity by employees, including collective policy objections, regardless of whether they are in a union. Section 7 establishes the right, and employer retaliation for exercising it is an unfair labor practice under Section 8(a)(1).

Concerted workplace activity is protected under the NLRA, not the WPA (which covers federal employees) or Dodd-Frank (securities sector). The First Amendment does not apply to private employers.

3. The term "ethics washing" in AI governance refers to:

Correct. Ethics washing is the structural pattern of creating visible ethics infrastructure — principles, review boards, responsible AI teams — that lacks the authority to stop problematic deployments.

Ethics washing specifically refers to ethics processes that provide reputational cover without functional governance authority — not to NDA misuse, research misrepresentation, or retroactive labelling.

4. The Timnit Gebru case at Google (2020) illustrates which specific professional risk for AI researchers?

Correct. Gebru's dismissal documented the specific career risk faced by researchers whose work produces findings inconvenient to their employer's commercial interests — a concrete 2020 professional outcome, not a theoretical risk.

The Gebru case documented the career consequences of producing research findings that conflict with an employer's commercial interests — not legal liability, academic misconduct, or patent issues.

5. Under the EU AI Act's requirements for high-risk AI systems, which of the following is a legal obligation — NOT merely a best-practice recommendation?

Correct. The EU AI Act imposes legal obligations for high-risk systems: conformity assessment (with a notified body for some categories), human oversight with actual intervention authority, and post-market monitoring. These are not voluntary commitments.

Publishing metrics, internal ethics policies, and voluntary codes of conduct are all voluntary. The EU AI Act's high-risk requirements include legally mandated conformity assessments, human oversight with authority to intervene, and post-market monitoring.

Lab 4 — The Internal Advocate

Practical AI ethics conversation · Complete 3 exchanges to finish

Your Scenario

You are a senior engineer at a financial services company. Your team is six weeks from launching a credit-scoring model that uses rental payment history, utility bill data, and social graph signals as "alternative credit data" for applicants without traditional credit histories. Internal testing shows the model increases approvals for underserved populations, but it also shows a 19% higher false-positive rate for applicants in majority-Black zip codes. Here a "positive" is a predicted default, so a false positive is a creditworthy applicant who is incorrectly denied credit. Your manager wants to launch on schedule. Legal has said the model passes disparate-impact testing under current regulatory thresholds. You believe the documented disparity is ethically unacceptable even if technically legal.

Work through with the assistant: (1) how you distinguish "legally compliant" from "ethically acceptable" in this context, (2) what internal mechanisms you can use to escalate this concern without going public, and (3) at what point — if internal escalation fails — public disclosure or regulatory reporting might become an obligation rather than a choice.

Ethics Lab Assistant

Professional Dissent & AI Ethics

This is a genuinely hard professional situation — you're looking at a gap between legal compliance and ethical obligation, with a real launch deadline and career stakes. Let's be precise. When you say 19% higher false-positive rate in majority-Black zip codes: is that the overall disparity, or a residual disparity after controlling for income and credit-relevant factors? And does your company have a formal AI ethics review process separate from legal sign-off?
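Before escalating, it helps to be exact about what "19% higher" means. The sketch below assumes the figure is a relative gap in false-positive rates, where a false positive is a creditworthy applicant who is denied. The scenario does not state the underlying counts, so the numbers here are hypothetical and chosen only to reproduce a 19% relative gap; the function names are likewise illustrative.

```python
def false_positive_rate(wrongly_denied, creditworthy_total):
    """Share of creditworthy applicants who were denied credit."""
    return wrongly_denied / creditworthy_total

def disparity(fpr_group, fpr_reference):
    """Return the gap as (absolute difference, relative difference)."""
    absolute = fpr_group - fpr_reference
    relative = fpr_group / fpr_reference - 1
    return absolute, relative

# Hypothetical counts, invented to illustrate the arithmetic.
fpr_reference = false_positive_rate(wrongly_denied=100, creditworthy_total=1000)
fpr_focus = false_positive_rate(wrongly_denied=119, creditworthy_total=1000)

absolute_gap, relative_gap = disparity(fpr_focus, fpr_reference)
print(f"absolute gap: {absolute_gap * 100:.1f} percentage points")
print(f"relative gap: {relative_gap * 100:.0f}% higher")
```

With these counts, a 19% relative gap corresponds to 1.9 percentage points, or 19 additional wrongful denials per 1,000 creditworthy applicants. If the reported figure were instead 19 percentage points, the harm would be far larger, which is why the escalation memo should state the definition and the base rates explicitly.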

Module 4 Test

The Professional's Dilemma — 15 questions · 80% to pass

1. Amazon's résumé-screening AI penalised candidates who included "women's" in their CVs because:

Correct. The model learned correlations from biased historical outcomes — no explicit instruction was required for female-associated signals to become negative predictors.

The bias emerged from learned correlations in historical data, not deliberate design or labelling errors.

2. New York City Local Law 144 (2023) specifically requires employers using automated employment decision tools to:

Correct. Local Law 144 mandates bias audits and public disclosure of summary results — external accountability rather than internal consent.

Local Law 144 requires bias audits with public posting of results. It does not mandate per-candidate consent, per-application human review, or federal certification.

3. "Proxy discrimination" occurs when an AI system:

Correct. Proxy discrimination uses correlated variables — zip code, college name, résumé gap — to discriminate without directly using protected attributes.

Proxy discrimination is specifically about using correlated variables as indirect routes to discriminatory outcomes — not explicit use of protected attributes or data-generation issues.

4. The Optum health-risk algorithm produced racially disparate outcomes primarily because it used health-care cost as a proxy for health need, and health-care cost is shaped by:

Correct. Structural barriers — access, insurance, historical discrimination in treatment — mean Black patients incur lower costs at equal illness severity. The model read lower cost as lower need.

The disparity arose from structural inequity in healthcare access, not individual choices, biological differences, or fraud patterns.

5. The FDA's 2021 Action Plan for AI/ML-Based Software as a Medical Device introduced the concept of a "predetermined change control plan," which requires vendors to:

Correct. The predetermined change control plan requires vendors to pre-specify post-deployment monitoring and update protocols — creating an implicit requirement for ongoing bias surveillance.

The plan requires pre-specified monitoring and update protocols post-deployment, not halting updates, patient consent, or public data release.

6. In predictive policing, feedback loop bias is structurally distinct from most machine-learning bias because:

Correct. The unique structure is that prediction → patrol deployment → increased crime recording → confirmatory future training data. The algorithm influences its own future training environment.

The structural distinctiveness is the closed loop between prediction and data collection — deploying based on a prediction changes what data is recorded in that location.

7. Santa Cruz, California's 2020 ordinance banning predictive policing was historically significant because it:

Correct. Santa Cruz was the first US city ban and unusual in naming the technical mechanism — the discriminatory feedback loop — in its legislative findings.

Santa Cruz was the first US city to ban predictive policing tools and named the feedback loop mechanism explicitly — not the first in the world, and not triggered by a court ruling or voluntary disclosure.

8. Chicago's Strategic Subject List was evaluated by RAND in 2017. The core finding was that the tool:

Correct. RAND found no evidence of gun-violence reduction, while an ACLU investigation documented racially disparate risk assignments — harm with no demonstrable benefit.

RAND's finding was specifically about efficacy: no evidence of gun-violence reduction. The tool produced racially disparate outputs without a demonstrable benefit.

9. Google's Project Maven internal petition gathered over 4,000 employee signatures. Under US labour law, this collective action was protected from employer retaliation under:

Correct. The NLRA protects concerted activity — employees acting together regarding workplace conditions — regardless of union membership. This includes collective policy objections.

The First Amendment does not apply to private employers; the WPA covers federal employees; CFAA has no such employee protections. The NLRA's concerted-activity provisions apply here.

10. The Timnit Gebru dismissal case at Google (2020) directly illustrated that AI ethics roles can carry high career risk when:

Correct. Gebru's case documented that producing inconvenient research findings — specifically about large language model risks — carried real career consequences within the organisation.

The Gebru case was specifically about the career risk of producing research findings that conflicted with her employer's commercial interests — not about publication process, conflicts, or conference talks.

11. "Ethics washing" in AI governance is best identified by which structural feature?

Correct. Ethics washing is defined by the gap between visible ethics infrastructure and actual governance authority — specifically, the inability to stop harmful products from deploying.

Ethics washing is not about absence of documentation, team composition, or outdated principles. It is specifically about having visible ethics processes that lack authority to halt harmful deployments.

12. The Lambrecht & Tucker (2020) finding about Facebook's ad-delivery algorithm demonstrated that discriminatory outcomes can emerge from:

Correct. The platform's own cost-minimisation logic — finding female users more expensive to reach in competitive auctions — produced gender-discriminatory ad delivery without any advertiser instruction.

No advertiser intent, training-data exclusion, or regulatory exemption was the cause. The discrimination came from the platform's internal auction optimisation logic.

13. The EU AI Act's high-risk category for AI systems used in employment and credit requires which of the following as a legal obligation?

Correct. The EU AI Act imposes legally mandatory conformity assessments, human oversight with actual intervention authority, and post-market monitoring for high-risk systems — not voluntary commitments or public dashboards.

The EU AI Act requires legally mandatory conformity assessments, intervention-capable human oversight, and post-market monitoring — not model cards, voluntary audits, or real-time dashboards.

14. Across the Amazon, Optum, and PredPol cases, the most consistent pattern in how harm was surfaced was:

Correct. In all three cases, the harm was identified by professionals or researchers with data access — not by pre-deployment regulatory requirements, litigation, or voluntary corporate disclosure.

In all three cases, the pattern was internal or academic investigation before regulatory action — not pre-deployment audits, class-action suits, or voluntary company disclosure.

15. The core professional principle that this module argues is common to the cases in all four lessons is:

Correct. The module's unifying argument is that in each case, a professional had knowledge before external regulators did, and the ethical obligation lay in deciding whether to act on that knowledge — not in the deployment decision alone.

The module's principle is not a blanket prohibition, an abdication of individual responsibility, or a claim about regulation. It is specifically about the professional obligation at the moment of knowledge — the choice to act or stay silent.