Module 4 · Lesson 1

Auditing the Machine

Before you can fix a biased system, someone has to prove it's broken — and that turns out to be one of the hardest jobs in tech.

Who gets to examine the algorithm, and what happens when the answer is "nobody"?

In November 2021, New York City passed Local Law 144, which required any company using an AI tool to help hire employees in New York to have that tool audited for bias before using it. The city gave businesses until January 2023 to comply, and later pushed enforcement back to July 2023. It was, at the time, the most specific AI regulation of its kind anywhere in the United States.

What happened next was instructive. Companies hired auditing firms. Those firms asked for access to the AI systems' internal workings. And many vendors — the companies that built the hiring tools — refused to share their code. They called it proprietary. A trade secret. Revealing it, they argued, would destroy their competitive advantage.

The auditors had to work from the outside: sending test résumés, measuring outcomes, inferring patterns. Like trying to diagnose a sick engine by listening to the noise it makes rather than opening the hood.

What an Audit Actually Is

An algorithmic audit is an investigation into whether an AI system produces fair outcomes — or whether it systematically advantages some groups and disadvantages others. Think of it like a health inspection for software: someone independent comes in, tests the system, and files a report.

But auditing software is nothing like inspecting a kitchen. A restaurant inspector can see the fridge, the counters, the food. An algorithm auditor often can't see the code at all. They get access to inputs and outputs — what goes in, what comes out — and they have to reason backward about what's happening in between.

This creates a fundamental problem: you can't fully audit what you can't see. And the people who build AI systems have strong financial reasons to keep those systems invisible.

Algorithmic audit: An independent investigation that tests whether an AI system treats different groups of people fairly, examining its outputs, and sometimes its code, to find patterns of bias.

Black box: An AI system whose internal logic is hidden, even from auditors. You can see what goes in and what comes out, but not how decisions are made in between.

The Three Auditing Approaches

Researchers and regulators have developed three main ways to investigate an algorithm's fairness when you can't simply open the hood:

1. Outcome testing. Send the same job application — or loan request, or rental inquiry — with different names or photos attached. If "Emily" gets called back and "Lakisha" doesn't, despite identical qualifications, the disparity is measurable evidence of bias. This is called an audit study, and researchers have used it for decades to prove housing and hiring discrimination. In 2004, economists Marianne Bertrand and Sendhil Mullainathan published a landmark study — sending out 5,000 résumés to 1,300 job ads in Boston and Chicago, changing only the name at the top. Applicants with "white-sounding" names got 50% more callbacks. The same technique is now applied to AI systems.

2. Documentation review. Ask the company to hand over records: what data was the model trained on? How was it tested? What error rates were found before deployment, and for which groups? The EU's AI Act, which passed in 2024, mandates this kind of documentation for high-risk AI systems as part of a process called a conformity assessment. The catch is that the documents are often seen only by regulators, not the public.

3. Code inspection (source audit). Actually examine the model's weights, training data, and decision logic. This is the most powerful method — and the rarest, because companies treat their AI systems as intellectual property.
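The outcome-testing approach in item 1 can be sketched in a few lines of Python. This is a minimal illustration, not any auditor's actual tooling: the `screen` function is a hypothetical stand-in for the black-box system under test (with a bias planted on purpose), and the résumé fields are invented. The auditor submits matched pairs that differ only in the name and compares how often each version advances.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Resume:
    name: str
    years_experience: int
    degree: str

def screen(resume: Resume) -> bool:
    """Hypothetical black box. A real auditor could call it but not read it."""
    score = resume.years_experience * 10
    if resume.name in {"Lakisha", "Jamal"}:  # hidden bias, planted for the demo
        score -= 15
    return score >= 50

def paired_test(base_resumes, name_a, name_b):
    """Submit each résumé twice, changing only the name, and compare outcomes."""
    advanced_a = advanced_b = 0
    for base in base_resumes:
        advanced_a += screen(replace(base, name=name_a))
        advanced_b += screen(replace(base, name=name_b))
    n = len(base_resumes)
    return advanced_a / n, advanced_b / n

# Six otherwise identical résumés with 3 to 8 years of experience.
bases = [Resume("", years, "BA") for years in range(3, 9)]
rate_emily, rate_lakisha = paired_test(bases, "Emily", "Lakisha")
print(f"Emily: {rate_emily:.0%}  Lakisha: {rate_lakisha:.0%}")
```

Because everything except the name is held constant, any gap between the two rates can only come from how the system treats the name. That is what makes the method usable without access to the code.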

The Harder Question

If a company's AI system is making decisions that affect thousands of people's lives — who gets hired, who gets a loan, who gets bail — does the public have a right to see inside it? Or does the company have the right to keep its technology secret? There is no clean answer here. Both sides have real arguments.

Who Actually Does This Work?

In 2018, a researcher named Joy Buolamwini, then a graduate student at the MIT Media Lab, published a study with Timnit Gebru showing that commercial facial analysis systems from IBM, Microsoft, and Face++ misclassified the gender of dark-skinned women at rates as high as 34%, compared to under 1% for light-skinned men. She called it the Gender Shades project. A 2019 follow-up audit found a similar gap in Amazon's Rekognition. Her method was outcome testing: she built a dataset of faces and measured what each company's system got right and wrong across demographic groups.

What made her work matter wasn't just the findings. It was that she published them. Microsoft called for federal regulation of facial recognition in 2018. In June 2020, IBM announced it would stop offering general-purpose facial recognition software, and Amazon announced a moratorium on police use of its Rekognition system.

Buolamwini wasn't a government regulator. She didn't work for any of the companies whose systems she tested. She was a researcher who decided to look carefully at systems that most people assumed were working fine, and showed that they weren't.

This is the first lesson of redesigning biased systems: someone has to actually look. Auditing is not automatic. It requires people who are willing to do the unglamorous work of testing, measuring, and documenting — and then publishing what they find even when it makes powerful companies uncomfortable.

You now understand something that most people — including many technology professionals — don't think carefully about: proving an AI system is biased requires a specific kind of investigation, and that investigation is often blocked, incomplete, or simply never done. When you read a headline saying an AI system was "tested for bias," you know to ask: tested how? By whom? With access to what?

The Limits of Any Audit

Even when audits happen, they have real limits. An audit is a snapshot — it captures how a system behaves at one moment in time, tested against one set of inputs. But AI systems change. They are retrained, updated, fine-tuned. A system that passes an audit in January may behave differently in July.

Audits are also only as good as the questions they ask. If an audit tests for racial bias but not for bias against people with disabilities, or against non-native English speakers, it might produce a clean report while genuine harms go unmeasured.
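A small sketch shows how an audit's scope shapes its result. The records below are entirely made up, and the attribute names (`race`, `language`) are illustrative assumptions. The same audit log looks clean when sliced by one attribute and shows a large gap when sliced by another.

```python
from collections import defaultdict

# Hypothetical audit log: one row per applicant, advanced = 1 if moved forward.
records = [
    {"race": "X", "language": "native", "advanced": 1},
    {"race": "X", "language": "native", "advanced": 1},
    {"race": "X", "language": "non_native", "advanced": 0},
    {"race": "X", "language": "non_native", "advanced": 0},
    {"race": "Y", "language": "native", "advanced": 1},
    {"race": "Y", "language": "native", "advanced": 0},
    {"race": "Y", "language": "non_native", "advanced": 1},
    {"race": "Y", "language": "non_native", "advanced": 0},
]

def selection_rates(rows, attribute):
    """Share of applicants who advanced, broken out by one attribute."""
    totals, advanced = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row[attribute]] += 1
        advanced[row[attribute]] += row["advanced"]
    return {group: advanced[group] / totals[group] for group in totals}

by_race = selection_rates(records, "race")
by_language = selection_rates(records, "language")
print("by race:    ", by_race)
print("by language:", by_language)
```

An audit that only asked about race would report equal rates here, while non-native speakers advance a third as often as native speakers.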

And perhaps most importantly: an audit can tell you what a system does — it can't tell you whether what it does is acceptable. That's a values question. It's a political question. It's the kind of question that requires society to decide what it wants — not just what it can measure.

Pause Point

If you're reading this in one sitting, this is a natural place to stop and think: of the three auditing methods described above, which one do you think provides the strongest evidence? And which is most likely to actually happen in practice? Those two answers might not be the same.

Lesson 1 Quiz

Auditing the Machine — 5 questions

1. In 2021, New York City's Local Law 144 required companies to do what before using AI hiring tools?

Correct. Local Law 144 was the first US law specifically requiring bias audits of AI hiring tools — though auditors often couldn't see inside the systems they were supposed to examine.

Not quite. The law required independent bias audits — but vendors often refused to share their code, forcing auditors to work from the outside.

2. A company's AI loan-approval system approves 80% of applications from one zip code and 40% from another. An auditor notices this but can't see the system's code. Which method are they using?

Exactly right. Measuring outputs across different groups — without seeing how decisions are made — is outcome testing. It's powerful but limited: it shows disparity without proving what caused it.

Look again at the scenario: the auditor can't see the code, and is measuring what happens to different groups. That's outcome testing, not a source audit or documentation review.

3. Joy Buolamwini's Gender Shades project (2018) found that some facial recognition systems had error rates for dark-skinned women as high as:

Correct. Error rates reached up to 34% when classifying the gender of dark-skinned women, versus under 1% for light-skinned men. The disparity was so stark that several major companies changed their products or policies in the years after the study was published.

The actual number was up to 34% — a dramatic gap that helped trigger policy changes at IBM, Amazon, and Microsoft.

4. Which of the following is the biggest limitation of auditing an AI system?

Right. AI systems are updated and retrained constantly. A clean audit today doesn't mean the system stays clean — and the auditor isn't watching 24/7. Ongoing monitoring is a separate, harder problem.

The deepest limitation is that audits capture a moment in time, not ongoing behavior. Systems change — sometimes dramatically — after they've been audited.

5. A company says, "Our AI hiring tool passed a bias audit — it doesn't discriminate." Based on what you learned, what's the most important follow-up question?

Exactly. "Passed a bias audit" can mean many things, from rigorous source inspection to a surface-level outcome test covering only a few demographic categories. The method, scope, and access level determine how much that claim actually tells you.

Timing matters, but the most important questions are about method: how was bias tested, which groups were included, and could auditors actually see inside the system?

Lab 1: The Auditor's Report

You've been hired to audit a hiring AI. Convince your partner what you found — and defend your method.

Your Assignment

A city government uses an AI system to screen applicants for city jobs. You've been brought in as an independent auditor. You ran an outcome test: you submitted 200 identical résumés, half with names typically associated with white applicants and half with names typically associated with Black applicants. The "white-name" résumés advanced to interview 62% of the time. The "Black-name" résumés advanced 38% of the time.
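One way to check whether a gap this size could plausibly be chance is a two-proportion z-test. The sketch below uses the counts from the assignment (100 résumés per group). The choice of test and the impact ratio are illustrative assumptions, not a method the lesson prescribes.

```python
import math

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value

# 62 of 100 "white-name" résumés advanced; 38 of 100 "Black-name" résumés did.
z, p_value = two_proportion_z(62, 100, 38, 100)
impact_ratio = (38 / 100) / (62 / 100)  # lower rate divided by higher rate

print(f"z = {z:.2f}, p = {p_value:.4f}, impact ratio = {impact_ratio:.2f}")
```

A small p-value says the gap is unlikely to be noise. It says nothing about why the gap exists, which is a separate question your report still has to address.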

Your lab partner — an AI called VERA — is going to push back on your methodology and your conclusions. You need to defend your work, explain what you found, and take a position on what should happen next.

Start by telling VERA what you think your audit proves — and what you think the city should do about it. Be specific.

VERA — Audit Review AI Peer Reviewer

I've read your summary. A 24-point gap in advancement rates is significant — I won't pretend otherwise. But before I can endorse your report, I have questions about what it actually proves. Tell me: what exactly does this disparity demonstrate? And what do you think the city should do in response?

Module 4 · Lesson 2

Redesign by Constraint

Fixing a biased AI system isn't about adding fairness on top — it means going back to the beginning and making different choices at every step.

When engineers know their system is biased, what exactly do they change — and why is it so much harder than it sounds?

In October 2018, Reuters reported that Amazon had quietly shut down an internal AI recruiting tool, one the company had spent several years and millions of dollars building. The system was designed to automatically score and rank job applicants, freeing human recruiters from the early stages of candidate review.

The problem: the AI had learned to penalize résumés that included the word "women's" — as in "women's chess club" or "women's college." It downgraded graduates of all-women's colleges. It had learned these patterns from a decade of Amazon's own hiring data — data that reflected the fact that Amazon, like most tech companies, had historically hired far more men than women.

Amazon's engineers tried to fix it. They retrained the model. They added explicit rules to neutralize the gender-related terms. But by Reuters' account, they could not guarantee that the system would not find new proxies: other words and patterns that correlated with being female. The team eventually concluded that they could not make the system fair, and the project was shut down in 2017. The public learned about it in 2018.

Why "Just Remove the Bias" Doesn't Work

The Amazon story exposes something that sounds technical but is actually deeply important: you cannot fix a biased AI by simply removing the obviously biased features. Bias often hides in the connections between things that look neutral.

Consider: a hiring AI trained on historical data will learn that "successful" candidates in the past lived in certain zip codes, went to certain schools, used certain words in their cover letters. None of those features is labeled "race" or "gender." But they correlate with race and gender because of historical patterns of housing segregation, unequal school funding, and cultural differences in writing style. The bias travels through the data like water through rock, finding every available path.

This phenomenon is called proxy discrimination: when a seemingly neutral variable acts as a stand-in for a protected characteristic. The AI isn't using race directly. It's using zip code. Which is almost the same thing — because in many American cities, zip code predicts race with high accuracy, a direct legacy of redlining policies from the mid-20th century.

Proxy discrimination: When an AI uses a variable that seems neutral (like zip code or school name) but that effectively acts as a substitute for a protected characteristic like race, creating the same discriminatory outcome through a roundabout path.
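A quick way to see whether a "neutral" variable is acting as a proxy is to ask how well it predicts the protected characteristic by itself. The sketch below uses invented data with placeholder zip labels (`ZIP_1`, `ZIP_2`) and group labels. It guesses each person's group from their zip code alone and compares that to guessing without it.

```python
from collections import Counter, defaultdict

# Hypothetical applicants as (zip_code, group) pairs; all values are made up.
rows = (
    [("ZIP_1", "A")] * 9 + [("ZIP_1", "B")] * 1
    + [("ZIP_2", "A")] * 2 + [("ZIP_2", "B")] * 8
)

def proxy_accuracy(pairs):
    """Accuracy of guessing the group as the majority group in each zip code."""
    by_zip = defaultdict(Counter)
    for zip_code, group in pairs:
        by_zip[zip_code][group] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_zip.values())
    return correct / len(pairs)

def baseline_accuracy(pairs):
    """Accuracy of always guessing the overall majority group."""
    overall = Counter(group for _, group in pairs)
    return overall.most_common(1)[0][1] / len(pairs)

print(f"guessing from zip code: {proxy_accuracy(rows):.0%}")
print(f"guessing without it:    {baseline_accuracy(rows):.0%}")
```

The wider the gap between those two numbers, the more information about the protected characteristic the model can recover from the supposedly neutral variable.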

The Five Places Bias Can Enter

Meaningful redesign requires understanding that bias can enter an AI system at five distinct points — and fixing one doesn't fix the others.

1. The goal. What is the AI trying to optimize? If a hiring AI is trained to predict "will this person be like our current successful employees," it will reproduce whatever patterns are in that existing workforce. The goal itself encodes the past.

2. The training data. Historical data reflects historical decisions, which were often discriminatory. Training on that data teaches the model that discriminatory outcomes were correct.

3. The features. Which variables are included in the model? Variables that seem neutral can carry discriminatory signal as proxies.

4. The threshold. At what score does the AI say "yes" or "no"? If you set a threshold that maximizes overall accuracy, it will often perform worse for smaller demographic groups — because the model saw fewer examples of them during training.

5. The feedback loop. Once the AI is deployed, its decisions affect the world. Those outcomes become the next round of training data. If the AI rejects more applications from Group A, Group A is underrepresented in the "successful hire" data — which teaches the next model that Group A is less qualified. The bias compounds.
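The compounding described in item 5 can be shown with a toy simulation. Everything here is an assumption made for illustration: the starting share of 40%, and especially the squaring step, which stands in for a model that over-weights whichever pattern is more common in its training data. It is not a model of any real hiring system.

```python
def next_share(share_a: float) -> float:
    """One retraining round for Group A's share of recorded successful hires.

    Assumption: squaring each share stands in for a model that amplifies
    the majority pattern it saw during training.
    """
    weight_a = share_a ** 2
    weight_b = (1 - share_a) ** 2
    return weight_a / (weight_a + weight_b)

shares = [0.40]        # Group A starts at 40% of recorded successful hires
for _ in range(4):     # four rounds of deploy, collect outcomes, retrain
    shares.append(next_share(shares[-1]))

for round_number, share in enumerate(shares):
    print(f"round {round_number}: Group A share = {share:.1%}")
```

Nothing about Group A's actual qualifications changes between rounds. The share shrinks only because each model is trained on the decisions of the one before it.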

Ethical Tension

Here is a question researchers genuinely disagree about: if you train a hiring AI only on data from employees hired after 2010 — to avoid older, more discriminatory patterns — you get a smaller, less representative dataset, which makes the model less accurate overall. You've reduced historical bias but potentially increased other errors. Which failure is worse? There is no consensus answer.

What Redesign Actually Looks Like

Beginning in 2012, the city of New Orleans used predictive policing software from the private data company Palantir Technologies to help police identify individuals likely to commit future crimes. In 2018, journalists and civil liberties organizations revealed that the system had been deployed without public knowledge, and that it systematically flagged Black residents at higher rates.

New Orleans cancelled the contract in 2018. But the cancellation itself reveals something: sometimes redesign means stopping. Not every biased system can be fixed by changing its training data or adjusting its thresholds. Some applications are simply too high-stakes to run through an imperfect model at all.

In cases where redesign is attempted, it typically involves three moves: First, redefine the goal — instead of "predict past success," ask "predict job-relevant skills." Second, curate the data — actively collect data that represents the range of people the system will affect, not just the people who succeeded under the old system. Third, add fairness constraints — explicit mathematical rules that force the model to perform comparably across demographic groups, even if that reduces peak accuracy slightly.

That last move is where things get genuinely hard. Because fairness constraints cost something. They typically reduce overall accuracy — slightly — to make outcomes more equitable. Deciding whether to accept that trade-off is not a technical decision. It's a values decision. And it should be made by the people affected, not just the engineers building the system.
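The trade-off can be made concrete with a tiny example. The sketch below is a simplification under stated assumptions: the candidates and scores are invented, the fairness constraint is a cap on the gap in selection rates between two groups (one of several possible definitions), and it is applied when choosing a cutoff rather than during training.

```python
# Hypothetical candidates: (group, model_score, actually_qualified)
candidates = [
    ("A", 0.90, 1), ("A", 0.80, 1), ("A", 0.70, 1), ("A", 0.60, 0), ("A", 0.20, 0),
    ("B", 0.65, 1), ("B", 0.50, 1), ("B", 0.45, 0), ("B", 0.30, 0), ("B", 0.10, 0),
]

def evaluate(threshold):
    """Return (accuracy, selection-rate gap between groups) at a cutoff."""
    correct = 0
    selected = {"A": 0, "B": 0}
    totals = {"A": 0, "B": 0}
    for group, score, qualified in candidates:
        chosen = score >= threshold
        correct += int(chosen == bool(qualified))
        selected[group] += int(chosen)
        totals[group] += 1
    rates = {g: selected[g] / totals[g] for g in totals}
    return correct / len(candidates), abs(rates["A"] - rates["B"])

def best_threshold(thresholds, max_gap=None):
    """Most accurate cutoff, optionally among those meeting the gap cap."""
    allowed = [t for t in thresholds if max_gap is None or evaluate(t)[1] <= max_gap]
    return max(allowed, key=lambda t: evaluate(t)[0])

grid = [0.25, 0.40, 0.47, 0.55, 0.68, 0.95]
unconstrained = best_threshold(grid)
constrained = best_threshold(grid, max_gap=0.25)

for label, t in [("unconstrained", unconstrained), ("constrained", constrained)]:
    accuracy, gap = evaluate(t)
    print(f"{label}: cutoff={t:.2f} accuracy={accuracy:.0%} gap={gap:.0%}")
```

In this toy data the constrained cutoff gives up some accuracy to halve the gap between groups. Whether that exchange is worth making is exactly the values question the code cannot answer.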

You now understand that redesigning a biased AI is not like patching a bug. It requires going back to the problem definition, the data collection strategy, the feature selection, the threshold setting, and the feedback loop — and making conscious decisions at every step about whose interests matter and what trade-offs are acceptable. Most people who use AI products never think about any of this. You do now.

Lesson 2 Quiz

Redesign by Constraint — 5 questions

1. Amazon's internal hiring AI was shut down in 2017 primarily because:

Correct. Engineers retrained the model and added explicit rules, but they could not guarantee the system would not discover new proxies for gender. They eventually concluded the problem was unfixable and shut it down entirely.

The reason was technical and ethical: even after multiple attempts to remove the bias, the team could not rule out its reappearing through new proxy variables.

2. A lending AI doesn't use race as a variable. But it uses zip code — and in this city, zip code strongly predicts race because of historical redlining. This is an example of:

Exactly. Proxy discrimination is when a neutral-seeming variable substitutes for a protected characteristic. Zip code is classic: it looks neutral but carries the legacy of racially discriminatory housing policies.

This is proxy discrimination — a variable that seems unrelated to race but effectively acts as a stand-in for it because of historical patterns of segregation.

3. Which of these is an example of bias entering through the "goal" of an AI system?

Right. The goal itself — predict similarity to past success — locks in historical patterns before the model is even trained. Changing the data won't fix this; you have to change what you're asking the AI to optimize for.

That describes goal-level bias: the problem definition instructs the AI to reproduce the past. The other options describe data bias, threshold bias, and feedback loops — all real problems, but different ones.

4. Adding "fairness constraints" to an AI model typically means:

Correct. Fairness constraints are explicit mathematical requirements that force comparable performance across demographic groups. The trade-off is that optimizing for equity often slightly reduces peak accuracy — which is why implementing them is a values decision, not just a technical one.

Fairness constraints are mathematical rules built into the model's training objective that force it to meet equity thresholds — typically at a small cost to overall accuracy.

5. New Orleans cancelled its contract with Palantir's predictive policing system in 2018. What does this case suggest about fixing biased AI?

Exactly right. Cancellation is itself a form of redesign. The lesson: the right response to a biased AI is not always to fix it. Sometimes it's to decide that the application shouldn't exist in its current form.

The deeper lesson is that some AI applications are too consequential to run through systems we can't yet make fair — and stopping is a legitimate choice, not a failure.

Lab 2: The Redesign Memo

A city wants to save its biased hiring AI. You have to decide: fix it, or scrap it?

Your Assignment

The city's biased AI hiring system is still running. City leadership doesn't want to shut it down — they spent $2 million building it. They've asked you and your AI partner MARCO to write a memo recommending either: (a) how to redesign it to be fair, or (b) why it should be cancelled. You have to take a clear position and justify it using what you know about the five places bias enters.

MARCO has already been briefed. He has opinions. He won't just agree with you.

Start by telling MARCO your position: redesign or cancel? Then tell him which of the five bias entry points you think is the hardest to fix in this case — and why.

MARCO — Policy Analysis AI Co-Author

I've been thinking about this since we got the brief. Two million dollars is real money — and "just cancel it" is a lot easier to say than to defend to the city council. On the other hand, I keep coming back to what Amazon found: sometimes you can't engineer your way out of a fundamentally biased setup. What's your read? Redesign or cancel — and where do you think the real problem lives?

Module 4 · Lesson 3

Who Gets a Seat at the Table?

The most technically correct fix for a biased AI can still fail — if the people most affected by the system had no say in how it was rebuilt.

Should the people an AI system affects be involved in designing it — even if they're not engineers?

In 2013, the Chicago Police Department deployed a "Strategic Subject List" — an algorithm that assigned every person with a criminal record a risk score from 0 to 500 predicting their likelihood of involvement in a future shooting. The list was generated automatically, updated regularly, and used by police to decide who to visit, warn, or monitor.

By 2016, the list had flagged over 400,000 Chicago residents. Civil liberties researchers who obtained the data found that being on the list didn't actually predict violence — people with high scores were no more likely to be involved in shootings than those with low scores. What the algorithm was actually measuring was past exposure to the criminal justice system, which correlated strongly with race and neighborhood.

Here's what makes this case particularly striking: nobody on the list knew they were on it. The residents of the neighborhoods most affected by the algorithm — almost exclusively Black and Latino communities on Chicago's South and West sides — had no idea the system existed. They were never consulted. They were never warned. They found out through investigative journalism.

The Participation Problem

Here is a pattern that shows up again and again in AI fairness failures: the people who design the system are not the same people who bear its consequences. Police department leadership and algorithm developers in Chicago were not living in the neighborhoods where the Strategic Subject List had its greatest effects. They did not risk being added to a government watch list. They did not face the consequences of a false positive — of being flagged as high-risk when they'd done nothing wrong.

This matters technically, not just ethically. When the people affected by a system aren't involved in designing it, the designers often don't know what questions to ask. They may not know that false positives are more damaging than false negatives in this context. They may not understand that a "risk score" that lands in someone's hands implies a certainty that the math doesn't actually support. They may not anticipate how officers will use — or misuse — the information.

This is sometimes called the participation gap: the distance between who designs an AI system and who lives with its consequences.

Participation gap: The distance between the people who design an AI system (usually technical experts from privileged backgrounds) and the people who experience its consequences, who are often from communities with less institutional power.

What Meaningful Participation Looks Like

In 2019, a group of researchers at the AI Now Institute published a report called Discriminating Systems, documenting how the lack of diversity within AI research teams contributed to AI systems that failed minority communities. Their argument: it's not just about adding diverse voices at the end of the process, as a check. It's about involving affected communities throughout — in deciding whether the system should exist, what problem it should solve, and what trade-offs are acceptable.

That's a harder thing to actually implement than it sounds. Companies and governments are not set up for participatory design. It is slower. It produces conflict. It requires translating technical concepts for non-technical participants, and incorporating feedback that may conflict with engineering constraints.

But there are real examples of it working. Researchers have developed approaches known as participatory machine learning, in which community members are brought into the training data labeling process, so that what counts as "correct" in the model reflects community values, not just the assumptions of the research team. In healthcare AI, patient advocacy groups have started demanding seats on the review boards that evaluate diagnostic algorithms before they're deployed in hospitals.

None of this is perfect. Community members can disagree among themselves. Participation can be superficial — a single meeting, a token consultation, a survey that no one reads. The difference between genuine participation and performative participation is a real and unresolved challenge.

No Clean Answer

If a community is divided — if some members want the AI system deployed and others want it cancelled — whose voice prevails? Majority? Most vulnerable? Most directly affected? There is no technical answer to this question. It's a political question, and it's one that AI developers rarely have to answer publicly.

Transparency as a Design Choice

One of the most basic forms of community participation is simply telling people an AI system is being used on them. This sounds obvious. It often doesn't happen.

In 2016, journalist Julia Angwin and the team at ProPublica revealed that a risk-assessment algorithm called COMPAS was being used in courtrooms across the country to help judges decide on sentencing and bail. Defendants were not told their score. Defense attorneys often didn't know the score existed. The algorithm's code was proprietary, and the company that built it, Northpointe, refused to disclose its methodology.

ProPublica's analysis found that COMPAS was nearly twice as likely to incorrectly flag Black defendants as future criminals compared to white defendants — while being more likely to incorrectly label white defendants as low risk when they went on to commit new offenses. The company disputed the methodology. The disagreement became one of the defining debates in AI fairness research. It's still not fully resolved.
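The error rates at the center of this debate are simple to compute once outcomes are known. The counts below are made up to show the pattern and are not ProPublica's data. The third metric, precision, is included because it captures one common reading of the dispute: a score can be about equally reliable when it says "high risk" in both groups while its mistakes still fall unevenly.

```python
def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate, precision)."""
    fpr = fp / (fp + tn)        # did not reoffend, but were flagged high risk
    fnr = fn / (fn + tp)        # did reoffend, but were labeled low risk
    precision = tp / (tp + fp)  # flagged high risk and did reoffend
    return fpr, fnr, precision

# Hypothetical counts per group (tp/fp/tn/fn), invented for illustration.
counts = {
    "group_1": dict(tp=300, fp=180, tn=220, fn=100),
    "group_2": dict(tp=150, fp=80, tn=320, fn=150),
}

results = {group: error_rates(**c) for group, c in counts.items()}
for group, (fpr, fnr, precision) in results.items():
    print(f"{group}: FPR={fpr:.0%}  FNR={fnr:.0%}  precision={precision:.0%}")
```

Here the false positive rate for one group is more than double the other's, while precision differs by only a few points. Which of those numbers counts as the measure of fairness is the disagreement, not the arithmetic.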

But the transparency failure is harder to dispute: people whose futures were affected by this score had no way to challenge it, because they didn't know it existed. One design change — notifying defendants that a score exists and explaining how it's calculated — would not fix the bias, but it would at least give people the ability to contest a decision that affects their liberty.

At an institutional level — the level of courts, hospitals, police departments, and city governments — the question of who participates in AI design is a governance question, not a technical one. You now understand why "we hired great engineers" is not the same as "we designed this system responsibly." Knowing this changes how you read every announcement about a new government AI system.

Lesson 3 Quiz

Who Gets a Seat at the Table? — 5 questions

1. Chicago's Strategic Subject List was eventually found to be problematic primarily because:

Correct. The algorithm wasn't measuring future risk — it was measuring history of contact with police, which in a city with racially unequal policing is itself a biased signal. Residents were flagged not for what they might do, but for where they'd already been touched by the system.

The core problem was that the algorithm was measuring the wrong thing: prior police contact, not genuine future risk. And because policing was already unequal, the score was systematically biased.

2. The "participation gap" in AI design refers to:

Right. The participation gap is fundamentally about power and proximity: the people building systems often have no direct experience of the harms those systems can produce, because they're not the ones subject to them.

The participation gap is about who has a voice in design. It's the gap between the designers — who are insulated from the system's harms — and the communities who live with its consequences.

3. ProPublica's 2016 investigation into COMPAS found that the algorithm:

Correct. ProPublica found the false positive rate for Black defendants was significantly higher — they were more often incorrectly labeled high-risk. The company disputed the methodology, and the resulting debate revealed that "fairness" is not a single, agreed-upon mathematical definition.

ProPublica's key finding was the differential false positive rate: Black defendants were flagged as future criminals at nearly twice the rate of white defendants when that prediction turned out to be wrong.

4. A hospital deploys a new AI diagnostic tool. Patient advocacy groups were consulted once at the start and once at the end, but had no role in the middle stages of development. This is an example of:

Exactly. Two consultations — at the very beginning and very end — means communities had no influence during the critical design decisions in the middle. That's performative participation: it creates the appearance of inclusion without the substance.

This is performative participation. The timing of input matters: if communities only get to speak before and after the design process, they have no influence on the decisions that actually shape the system.

5. Based on the COMPAS case, which of the following would be the most basic transparency improvement — one that wouldn't fix the bias but would at least allow people to contest the decision?

Right. Notification and explanation is the floor of transparency — the minimum that allows people to understand and contest a decision affecting them. It doesn't fix the bias, but it restores the possibility of challenge, which matters enormously in a legal context where the decision affects someone's liberty.

The most basic step is notification: telling people the score exists and how it works. Without that, there's nothing to contest. More intensive fixes (publishing code, regulatory review) are important but come after this fundamental transparency step.

Lab 3: The Community Review

A neighborhood is about to get a predictive policing AI. You're at the meeting. Make your case.

Your Assignment

The city is holding a community review meeting before deploying a predictive policing system in your neighborhood. An AI named SONYA is playing the role of the city's AI project lead — she supports the system and will defend it. You are a community member who has read about Chicago, COMPAS, and the participation gap.

You can take any position: support the system with conditions, oppose it entirely, or propose an alternative. SONYA will engage seriously with your arguments. She won't fold easily — but she's not programmed to "win." She's there to make you sharpen your thinking.

Start by telling SONYA your position on the proposed system — and what evidence from what you've learned leads you to that position. Be specific about what happened in Chicago or with COMPAS if it's relevant to your argument.

SONYA — City AI Project Lead Debate Partner

Thank you for coming. I know there are concerns — and I want to take them seriously. Our system is different from Chicago's. We're using it to allocate patrol resources more efficiently, not to label individual people. We've had it independently reviewed. I believe in this project, but I'm here to listen. What's your position, and what's your reasoning?

Module 4 · Lesson 4

Rules, Rights, and What Comes Next

Individual fixes to individual AI systems are not enough. At some point, the question becomes: what rules should govern all of them?

Who should write the rules for AI — and what should those rules actually say?

On March 13, 2024, the European Parliament voted 523 to 46 to approve the EU Artificial Intelligence Act — the most comprehensive legal framework for AI governance in the world. It took four years to negotiate. It runs to hundreds of pages. And it represents one answer to a question that every course on AI fairness eventually has to face: can you fix systemic bias through law?

The Act classifies AI systems by risk level. Systems used in hiring, credit scoring, law enforcement, and education are classified as "high-risk" — meaning they require mandatory bias testing, human oversight, documentation, and in some cases prior registration with EU authorities before deployment. Systems that pose an "unacceptable risk" — like social scoring systems that rate citizens based on their behavior, or AI tools that manipulate people psychologically — are banned outright.

Critics immediately pointed out: the Act applies in Europe. Most of the world's largest AI companies are American. The US has no equivalent federal law. And enforcement — making sure companies actually comply — is still being figured out as of 2025.

What Regulation Can and Can't Do

The EU AI Act is the most serious attempt yet to answer a fundamental question: if individual audits and individual redesigns aren't enough to prevent AI harm at scale, can rules accomplish what goodwill doesn't?

Regulation has genuine advantages. It creates a floor — a minimum standard that every company must meet, regardless of whether they care about fairness or not. It shifts the burden: instead of advocates having to prove bias after the fact, companies must prove fairness before deployment. It creates records — documentation that can be subpoenaed, audited, or leaked. And it signals to the market that fairness is not optional.

But regulation also has real limits. Laws are written by legislators who often don't understand the technology. Laws go out of date fast: one written in 2024 may be irrelevant to AI systems built in 2027. They apply within borders, but AI systems operate globally. And they can be written with enough loopholes that compliance looks good on paper while nothing actually changes.

The US has taken a different approach: a patchwork of sector-specific rules, executive orders (President Biden signed an executive order on AI safety in October 2023), and voluntary commitments from companies. Advocates argue this is inadequate. Companies often argue that heavy regulation would slow beneficial AI development. That tension is not resolved — it is ongoing, right now, in policy discussions around the world.

EU AI Act A 2024 European Union law that classifies AI systems by risk level and imposes mandatory testing, transparency, and oversight requirements on high-risk applications — including hiring, credit, law enforcement, and education tools.

Rights-Based vs. Risk-Based Frameworks

There are two broad philosophies underlying AI regulation, and they lead to different rules.

A risk-based framework — which is what the EU AI Act uses — categorizes systems by the severity of potential harm and applies proportional requirements. Low-risk AI (like a spam filter) needs almost no regulation. High-risk AI (like a bail-prediction system) needs heavy oversight. The logic is practical: not every AI system is equally dangerous, so rules should scale with stakes.
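One way to picture "rules that scale with stakes" is as a lookup from a system's use to a tier, and from the tier to its obligations. The sketch below is only an illustration built from the examples in this lesson. The tier names, the `TIER_BY_USE` table, and the `requirements_for` function are hypothetical and are not taken from the Act's legal text.

```python
from enum import Enum


class RiskTier(Enum):
    # Obligations paraphrased from this lesson, not from the legal text.
    UNACCEPTABLE = "banned outright"
    HIGH = "mandatory bias testing, human oversight, documentation"
    LOW = "almost no regulation"


# Hypothetical mapping that uses only the examples named in this lesson.
TIER_BY_USE = {
    "social scoring": RiskTier.UNACCEPTABLE,
    "psychological manipulation": RiskTier.UNACCEPTABLE,
    "hiring": RiskTier.HIGH,
    "credit scoring": RiskTier.HIGH,
    "law enforcement": RiskTier.HIGH,
    "education": RiskTier.HIGH,
    "spam filter": RiskTier.LOW,
}


def requirements_for(use_case):
    """Return the tier and obligations for a use case, if the sketch covers it."""
    tier = TIER_BY_USE.get(use_case)
    if tier is None:
        return "unclassified in this sketch"
    return f"{tier.name}: {tier.value}"


for use in ["spam filter", "hiring", "social scoring"]:
    print(use, "->", requirements_for(use))
```

Notice what the lookup never asks: whether a particular person was treated unfairly. It only asks which category the system falls into.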

A rights-based framework starts from a different premise: that every person has fundamental rights (to not be discriminated against, to know when an automated decision affects them, to have that decision explained and challenged) regardless of which risk category the system falls into. In this view, the question isn't "how bad could this go?" but "what do people deserve?"

In practice, most real-world regulatory proposals mix both. But the underlying philosophy determines which way the rules lean when there's conflict. A risk-based approach might allow a mildly biased hiring AI to operate if the harm seems limited. A rights-based approach would ask: does every person have the right not to face discrimination? If yes, a mildly biased system still violates that right, and the size of the harm doesn't change the principle.

Genuine Tension

Here is a question that major democracies are actively debating: should AI systems that affect consequential decisions — about jobs, loans, healthcare, bail — require explicit consent from the people they affect? Or would requiring consent make these systems too cumbersome to use? And if you believe consent is required, does that mean people can opt out of algorithmic decisions in favor of human ones — even if human decisions are also biased?

What You Can Actually Do

This course has traced AI bias from its origins in skewed training data and misaligned goals, through its detection (audits and outcome testing) and its causes (the five entry points), to its governance (who gets a voice, what rules apply). Module 4 is about fixing it. So what can actually be done?

At a technical level: redesign begins with the problem statement. Change what you're optimizing for. Curate your data. Add fairness constraints and accept the trade-offs they require. Monitor continuously — audits are not one-time events. Build in transparency so that people affected by the system can understand and contest its decisions.
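To make "monitor continuously" concrete, here is a minimal sketch of one possible check: compare each group's selection rate with the highest group's rate, and repeat the check on every new batch of decisions. The function names, the sample data, and the 0.8 cutoff are assumptions made for illustration. The cutoff borrows the "four-fifths" rule of thumb from US employment guidance and is not a standard set by this course.

```python
from collections import defaultdict


def selection_rates(decisions):
    """decisions: iterable of (group, selected) pairs. Returns {group: rate}."""
    totals = defaultdict(int)
    chosen = defaultdict(int)
    for group, selected in decisions:
        totals[group] += 1
        if selected:
            chosen[group] += 1
    return {g: chosen[g] / totals[g] for g in totals}


def impact_ratios(rates):
    """Each group's selection rate divided by the highest group's rate."""
    best = max(rates.values())
    if best == 0:
        # Nobody was selected, so there is no disparity to measure.
        return {g: 1.0 for g in rates}
    return {g: r / best for g, r in rates.items()}


def flag_groups(decisions, threshold=0.8):
    """Groups whose impact ratio falls below the (assumed) threshold."""
    ratios = impact_ratios(selection_rates(decisions))
    return sorted(g for g, ratio in ratios.items() if ratio < threshold)


# Hypothetical batches: the same tool, checked before and after retraining.
january = ([("A", True)] * 50 + [("A", False)] * 50
           + [("B", True)] * 45 + [("B", False)] * 55)
july = ([("A", True)] * 50 + [("A", False)] * 50
        + [("B", True)] * 30 + [("B", False)] * 70)

for month, batch in [("January", january), ("July", july)]:
    print(month, "flagged groups:", flag_groups(batch))
```

In this made-up data the January batch raises no flag, while the July batch flags group B. That is the reason a single passed audit is not enough. A selection-rate ratio is also only one of several possible fairness measures, and choosing which one to monitor is itself a values decision.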

At an organizational level: close the participation gap. Bring affected communities into the design process, not just at the beginning and end, but throughout. Build diverse teams, not because diversity is a PR goal, but because teams without lived experience of bias systematically miss failure modes that teams with that experience catch. Create oversight boards with real authority, not advisory panels that can be ignored.

At a policy level: push for laws that require transparency, mandate audits, and create enforceable rights. Know what framework — risk-based or rights-based — underlies the rules being proposed, and know what that choice means for real people.

And at a personal level: know that this is not a finished problem. Every AI system currently deployed in a consequential domain — hiring, healthcare, criminal justice, credit — is operating under uncertainty about whether it's fair. The researchers, advocates, and policymakers working on these questions need people who understand what they're talking about. That's not a small thing. Most people don't.

You've now completed the arc from identifying bias to fixing it. You understand auditing methods, redesign constraints, participation gaps, and regulatory frameworks at a level that most adults — including many people who work in technology — don't have. When a company says their AI is "ethically reviewed," or a politician says we need "AI regulation," or a researcher says a system is "unfair," you know what questions to ask and what's actually at stake. That knowledge is not trivial. Use it.

The Final Question

Here is the question this course has been building toward, and the one that doesn't have an answer yet: given everything we know about how AI bias works, how hard it is to detect, how hard it is to fix, and how limited our regulatory tools are, should AI systems keep making consequential decisions (about who gets hired, who gets bail, who gets a loan) before anyone has demonstrated that they meet a minimum standard of fairness? Who decides what "minimum standard" means? And who enforces it?

Lesson 4 Quiz

Rules, Rights, and What Comes Next — 5 questions

1. The EU AI Act, passed in March 2024, classifies AI systems like hiring tools and bail-prediction algorithms as:

Correct. Hiring, credit, education, and law enforcement AI are classified as "high-risk" under the Act — meaning they face the most stringent requirements before deployment. Only a small category of the most extreme systems (like social scoring) is banned outright.

Hiring and law enforcement AI are classified as "high-risk": not banned, but subject to mandatory testing, oversight, and documentation requirements. Outright bans are reserved for a narrow category of especially dangerous applications.

2. A risk-based regulatory framework and a rights-based framework differ primarily in:

Exactly right. The philosophical difference is the starting point. Risk-based frameworks scale requirements to potential harm. Rights-based frameworks say some protections apply regardless of how limited the harm appears — because the right itself matters, not just the severity of its violation.

The key difference is philosophical: risk-based asks about harm severity; rights-based asks about what protections people deserve regardless of harm scale. That difference determines what happens when a system is only mildly biased.

3. President Biden signed an executive order on AI safety in October 2023. Compared to the EU AI Act, this approach is best described as:

Correct. The US has no equivalent of the EU AI Act. Instead, it has a mix of executive orders, sector-specific rules (from agencies like the FTC and CFPB), and voluntary commitments from companies — a patchwork that critics argue lacks the binding force of comprehensive legislation.

The US approach is fragmented — executive orders, voluntary commitments, and sector-by-sector rules — rather than the comprehensive, legally binding framework the EU adopted. Whether that's better or worse is one of the live policy debates happening right now.

4. Which of the following is NOT listed as part of meaningful AI redesign at an organizational level?

Correct. Keeping methodology proprietary is actually the opposite of good organizational practice — it prevents external auditing and community understanding. The other three options are all genuine components of responsible AI organizational design.

Proprietary secrecy is a barrier to accountability, not a component of responsible design. The organizational-level fixes are about participation, diversity, and genuine oversight — all of which require openness, not secrecy.

5. A city argues: "Our new AI sentencing tool is only slightly biased — the disparity is small, so the harm is limited." A rights-based thinker would most likely respond:

Exactly. Rights-based thinking doesn't weigh the size of the harm against the benefit — it starts from the premise that discrimination violates a right, and rights violations don't get smaller because the disparity is modest. This is why the choice of framework (risk vs. rights) has real consequences for which systems get approved.

A rights-based thinker would say that "slight bias" still means some defendants were discriminated against — and discrimination violates a right regardless of scale. The risk-based response (weighing harm against benefit) is the other option in this debate.

Lab 4: Write the Rule

If you could write one rule that all AI systems affecting consequential decisions must follow, what would it say?

Your Assignment

You've been asked to draft a single rule — one sentence or one short paragraph — that any AI system used in hiring, bail, lending, or healthcare must follow. It could be about transparency, auditing, participation, fairness constraints, or something else entirely.

Your partner is LEX — a policy analysis AI who has read every major AI regulation proposal from the US, EU, UK, and Canada. LEX will push on your rule: what does it actually mean? Who enforces it? What happens when it conflicts with other values? LEX won't tell you your rule is wrong — but it will make you defend every word of it.

Write your rule. Then explain why you chose this particular requirement over all the others you've studied. What does it protect that nothing else does?

LEX — Policy Analysis AI Legislative Advisor

I've reviewed the EU AI Act, the US executive order, New York City's Local Law 144, and the UK's voluntary AI safety commitments. Every one of them makes a different bet about what the most important requirement is. Some bet on transparency. Some on auditing. Some on human oversight. Some on participation. I'm ready to hear your proposal. What's the rule — and why this one above all the others?

Module 4 Test

Fix It: Redesign the Game — 15 questions · Pass at 80%

1. New York City's Local Law 144 (2021) was significant primarily because it was:

Correct.

Local Law 144 required bias audits — not bans, not source disclosure, and not federal in scope.

2. When auditors of an AI hiring tool send identical résumés with different names to measure outcome differences, they are using:

Correct. Outcome testing measures results without requiring access to internal logic.

This is outcome testing — the same approach Bertrand and Mullainathan used in 2004 and that researchers now apply to AI systems.

3. Joy Buolamwini's Gender Shades project revealed that some facial recognition systems had gender-classification error rates for dark-skinned women as high as 34%. What happened within a year of her publishing the findings?

Correct. Published research triggered significant industry responses — illustrating the power of rigorous, public documentation of bias.

Buolamwini's published research prompted major changes at IBM, Amazon, and Microsoft — demonstrating that independent audit research can drive real-world consequences.

4. Amazon's AI hiring tool penalized résumés containing the word "women's" because:

Correct. No one programmed this rule. The model learned it from biased historical patterns — illustrating how training data transmits past discrimination into future systems.

The bias was learned automatically from historical hiring records that reflected past discrimination — not deliberately programmed and not caused by sabotage.

5. "Proxy discrimination" occurs when:

Correct. Proxy discrimination is when a variable that looks neutral carries discriminatory signal — typically because of historical correlations rooted in past injustice.

Proxy discrimination is when a variable appears neutral but effectively substitutes for a protected characteristic like race — typically because of historical patterns like residential segregation.

6. Which of the five bias entry points does this describe: "The AI is retrained monthly; each training cycle uses decisions from the previous month — so early errors compound over time"?

Correct. When an AI's decisions become the next round's training data, early biases self-amplify — the system learns that its own biased outputs were "correct."

This is a feedback loop: the system's outputs become inputs to its own future training, compounding early errors with each cycle.

7. Adding fairness constraints to an AI model typically involves accepting:

Correct. Fairness constraints improve equity across groups but usually come at a small cost to peak overall accuracy. This trade-off is a values decision, not a technical one.

Fairness constraints force comparable performance across groups — but achieving that equity typically means slightly lower peak accuracy overall. Deciding whether to accept that trade-off is a values question.

8. Chicago's Strategic Subject List flagged over 400,000 residents as high-risk. Researchers found that high scores were correlated with:

Correct. The algorithm measured criminal justice history — not future risk. Because policing was already racially unequal, the score effectively measured race, producing a self-fulfilling and unjust pattern.

The scores didn't actually predict violence. They measured prior police contact — which in Chicago correlated strongly with race due to existing inequities in policing.

9. The "participation gap" in AI design most directly refers to:

Correct. The participation gap is about power and proximity: designers are insulated from the harms they create; affected communities have no voice in the decisions that affect them.

The participation gap is the distance between designers (who don't face the system's consequences) and affected communities (who do) — creating systematic blind spots in what problems get anticipated and addressed.

10. ProPublica's 2016 investigation found that the COMPAS risk-scoring algorithm had a higher false positive rate for Black defendants. The company (Northpointe) responded by:

Correct. Northpointe disputed the methodology, pointing out that ProPublica's definition of fairness wasn't the only valid one. The resulting debate revealed that "fairness" has multiple mathematical definitions that can conflict with each other.

Northpointe disputed the methodology — which led to one of the most important debates in AI fairness research about what "fairness" actually means mathematically and which definition should prevail.

11. A meaningful transparency improvement in the COMPAS case — one that wouldn't fix the bias but would restore some ability to contest decisions — would be:

Correct. Notification and explanation is the floor of transparency — the minimum that enables challenge. Without knowing the score exists, a defendant has nothing to contest.

Basic transparency means telling people a score was used and how it works. Without that, the ability to contest — which is fundamental in a legal context affecting liberty — is impossible.

12. Under the EU AI Act, which of the following applications is categorized as presenting "unacceptable risk" and is banned outright?

Correct. Social scoring — rating citizens on their overall behavior, as is done in some other countries — is categorized as unacceptable risk under the EU AI Act and is banned in the EU. Hiring and credit AI are "high-risk" but not banned.

The EU AI Act bans social scoring systems. Hiring tools and credit algorithms are classified as high-risk (requiring oversight) but not prohibited. The distinction matters enormously for which systems continue to operate.

13. The key philosophical difference between a risk-based and a rights-based AI regulatory framework is:

Correct. This philosophical difference determines what happens to a system that is only slightly biased: a risk-based approach might allow it; a rights-based approach says even slight discrimination violates a right that scale doesn't diminish.

The core difference is philosophical: risk-based asks "how bad could this get?" while rights-based asks "what do people deserve, regardless of harm magnitude?" This shapes which systems get approved.

14. An auditor tests a hospital's diagnostic AI in January and it passes all fairness checks. By July, the model has been retrained on new patient data. The January audit:

Correct. An audit is a snapshot of one moment in time. Retrained systems can behave very differently — meaning ongoing monitoring is necessary, not just a one-time audit.

Audits capture behavior at one moment. After retraining, the model may have learned new patterns — including new biases. A passed audit does not guarantee ongoing fairness.

15. A city argues: "Our AI bail recommendation system is only slightly biased against one demographic — the error rate difference is small, so the harm is limited." Which framework would most directly challenge this reasoning?

Exactly right. A rights-based thinker says: the right not to be discriminated against doesn't shrink because the disparity is modest. This is precisely the case where the two frameworks diverge — and it has real consequences for which systems continue operating.

Rights-based frameworks say discrimination is a violation regardless of scale. Risk-based frameworks might allow a slightly biased system if the overall harm appears limited. This is one of the most important practical differences between the two approaches.