In the fall of 2018, reporters at Reuters broke a story that had been quietly buried inside Amazon for a year. The company had built an AI tool to sort job applicants — a résumé-screening system meant to identify the best engineers from thousands of applications. It was supposed to save time and find talent faster than any human recruiter could.
There was one problem. The AI had been trained on ten years of historical résumés — mostly submitted by men, because the tech industry had mostly hired men. The AI learned the pattern: successful applicants look like this. And "this" meant male. The system started automatically downgrading any résumé that included the word "women's" — as in women's chess club, women's engineering society. It penalized applicants who had attended all-women's colleges.
Amazon scrapped the tool. But the story raises a question that didn't go away with that one program: when AI learns from the past, does it lock in everything the past got wrong?
The engineers who built that system didn't tell it to discriminate against women. Nobody typed in a rule that said "lower the score for female applicants." The bias emerged on its own — from the data. The AI looked at ten years of people who got hired, found patterns in what their résumés looked like, and built those patterns into its scoring. Since most past hires were men, maleness became a hidden signal for success.
This is what researchers call historical bias: when an AI trained on past data reproduces the prejudices that existed in that past, even when nobody intended it to. The data was real. The hiring records were accurate. But "accurate records of the past" and "fair guide to the future" are two completely different things.
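A minimal sketch can make this mechanism concrete. The résumés, words, and scoring rule below are invented for illustration (they are not Amazon's data or method), and the helper names `learn_word_scores` and `score_resume` are hypothetical. The toy "model" does nothing except measure how often each word appeared on résumés that were hired in the past.

```python
from collections import defaultdict

# Hypothetical toy history: (words on the résumé, was the applicant hired?).
# Invented for illustration only; no rule about gender appears anywhere.
HISTORY = [
    ({"python", "chess", "club"}, True),
    ({"java", "robotics"}, True),
    ({"python", "robotics"}, True),
    ({"java", "chess", "club"}, True),
    ({"python", "women's", "chess", "club"}, False),
    ({"java", "women's", "robotics"}, False),
    ({"python", "debate"}, False),
    ({"java", "women's", "debate"}, True),
]

def learn_word_scores(history):
    """For each word, return the fraction of past résumés containing it that were hired."""
    seen = defaultdict(int)
    hired = defaultdict(int)
    for words, was_hired in history:
        for word in words:
            seen[word] += 1
            if was_hired:
                hired[word] += 1
    return {word: hired[word] / seen[word] for word in seen}

def score_resume(words, word_scores, default=0.5):
    """Score a résumé as the average learned hire rate of its words."""
    rates = [word_scores.get(word, default) for word in words]
    return sum(rates) / len(rates)

word_scores = learn_word_scores(HISTORY)
plain = score_resume({"python", "chess", "club"}, word_scores)
flagged = score_resume({"python", "women's", "chess", "club"}, word_scores)

print(f"score without women's: {plain:.2f}")
print(f"score with women's:    {flagged:.2f}")
```

The two résumés differ by a single word, yet the second one scores lower. Nobody wrote that penalty; it falls out of counting who was hired before.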
Think about what that means practically. If a hospital system trains an AI on old patient records from an era when Black patients were less often referred for expensive treatments, the AI might learn to recommend fewer treatments for Black patients — not because of malice, but because that was the historical pattern. If a bank trains a loan-approval AI on decades of lending decisions from a time when women were often denied credit, the AI might recreate that denial pattern.
The source of the unfairness isn't always hatred. Sometimes it's arithmetic.
Here's a second angle on the same problem. In 2015, Google Photos launched a feature that automatically tagged photos with labels such as "beach," "dog," and "birthday party." It was impressive technology. It was also embarrassing: the system labeled photos of Black people as "gorillas." Google apologized and promised a fix. But years later, journalists checking that fix found that Google had simply blocked the label "gorilla," along with "chimp," "chimpanzee," and "monkey," from all photo results. The company hadn't fixed the underlying problem. It had censored it.
Why did the system make that error in the first place? Almost certainly because the training data — the millions of photos it learned from — overrepresented lighter-skinned faces. The AI had simply seen far more of them, so it was better calibrated for them. Darker skin tones were treated as edge cases, exceptions, things to be approximated from what it knew.
This is called representation bias: when the training data doesn't include enough examples from certain groups, the AI performs worse for those groups. It's not a bug in the code. It's a gap in the mirror.
Joy Buolamwini, a researcher at MIT, documented this problem rigorously in 2018. She tested commercial facial analysis systems from major companies (Microsoft, IBM, and Face++) on the task of classifying a person's gender from a photo, and found that error rates differed dramatically depending on the subject's gender and skin tone. For lighter-skinned men, the systems were more than 99% accurate. For darker-skinned women, error rates climbed as high as 35%. In the worst case, that means more than one in three darker-skinned women were misclassified. Her paper, co-authored with Timnit Gebru, was called "Gender Shades." It changed the industry's conversation, and it started because Buolamwini noticed the systems worked worse on her own face.
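The core move in an audit like this is to report error rates per subgroup instead of one overall number. The sketch below assumes a hypothetical set of audit records whose counts were chosen to echo the figures quoted above; they are not the study's data, and the function names are illustrative.

```python
from collections import defaultdict

# Hypothetical audit records: (subgroup, true label, predicted label).
# Counts are invented to echo the percentages in the text, not taken from the paper.
RECORDS = (
    [("lighter-skinned men", "m", "m")] * 99
    + [("lighter-skinned men", "m", "f")] * 1
    + [("darker-skinned women", "f", "f")] * 65
    + [("darker-skinned women", "f", "m")] * 35
)

def error_rate(records):
    """Fraction of records where the prediction does not match the true label."""
    wrong = sum(1 for _, truth, predicted in records if truth != predicted)
    return wrong / len(records)

def error_rate_by_group(records):
    """Compute the error rate separately for each subgroup."""
    groups = defaultdict(list)
    for record in records:
        groups[record[0]].append(record)
    return {group: error_rate(rows) for group, rows in groups.items()}

overall = error_rate(RECORDS)
by_group = error_rate_by_group(RECORDS)

print(f"overall error rate: {overall:.0%}")
for group, rate in by_group.items():
    print(f"{group}: {rate:.0%}")
```

A single headline number averages the two groups together and hides the disparity. Splitting the same records by subgroup is what exposes it.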
Amazon's AI learned from real historical data. The past discrimination was real. Does that mean the AI was "wrong" — or just honest about what the world had been? If we fix the AI to treat women equally, are we correcting a bias or overriding accurate historical data? Who gets to decide which past patterns deserve to be continued?
You might think this is a problem for engineers to solve, not for you. But AI systems that show these biases aren't just sitting in corporate offices. They're being used in courts to predict whether someone will commit another crime. They're used in schools to detect "suspicious behavior" in hallways. They're used in hospitals to prioritize who gets treatment. In 2019, a study in the journal Science found that a widely used health algorithm, applied to approximately 200 million people in the U.S., was systematically underestimating the health needs of Black patients compared to white patients with the same actual health problems. The reason: it used historical health care costs as a proxy for health need, and less money had historically been spent on the care of Black patients.
This is the part that should genuinely disturb you: these systems often look objective. Numbers look neutral. An algorithm sounds scientific. But a biased input plus a calculation still produces a biased output. The math doesn't launder the unfairness. It just hides it.
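To see how a neutral-looking calculation can carry unfairness forward, consider a toy version of the proxy problem: ranking patients for extra care by past spending instead of by actual need. Everything below is a sketch under invented assumptions. The patients, the dollar figures, and the use of a chronic-condition count as a stand-in for "need" are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Patient:
    name: str
    group: str
    chronic_conditions: int  # hypothetical stand-in for actual health need
    past_cost: float         # historical spending, used as the proxy

# Invented data: within each level of need, less was spent on group 2.
PATIENTS = [
    Patient("A", "group 1", 4, 12000.0),
    Patient("B", "group 2", 4, 5000.0),
    Patient("C", "group 1", 2, 6000.0),
    Patient("D", "group 2", 2, 3500.0),
]

def top_k(patients, key, k):
    """Return the names of the k patients ranked highest by the given key."""
    ranked = sorted(patients, key=key, reverse=True)
    return [patient.name for patient in ranked[:k]]

# Two slots in an extra-care program, filled two different ways.
by_cost = top_k(PATIENTS, key=lambda p: p.past_cost, k=2)
by_need = top_k(PATIENTS, key=lambda p: p.chronic_conditions, k=2)

print("selected by past cost:", by_cost)
print("selected by need:     ", by_need)
```

Patients A and B are equally sick, but ranking by cost gives B's slot to C, who has fewer conditions. The sorting step is perfectly correct arithmetic; the unfairness comes entirely from what was chosen as the thing to sort by.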
When someone says "the AI decided" or "the algorithm flagged it," most people treat that as the end of the conversation — as if a machine can't be wrong or unfair. You now know that the decision is only as fair as the data that trained it. That changes how you should read every story about AI making consequential decisions about people's lives.
The word "bias" here doesn't mean the AI has feelings or prejudices. It means the AI has patterns baked in from data — and those patterns sometimes hurt real people from specific groups, consistently, without anyone noticing because the system "looks objective."
The communities most affected by biased AI systems tend to be the same communities that were already disadvantaged before AI existed. The technology doesn't create that inequality from scratch. But it can automate it, scale it up, and make it invisible — which is arguably worse.
One more thing worth understanding: the scale at which AI operates makes bias more dangerous than individual human bias. A single biased hiring manager affects maybe a few hundred decisions a year. An AI system embedded in recruiting software used by thousands of companies affects millions of applications simultaneously — all in the same biased direction. A biased judge affects dozens of cases. A biased recidivism-prediction algorithm (one that predicts whether someone will commit another crime) affects every courtroom in the state that uses it.
This is what makes the representation problem so urgent. When something is wrong in the data, it's wrong at industrial scale, applied to millions of people, automatically, around the clock. Human biases are inconsistent — people have good days and bad days, they're influenced by context, they sometimes catch themselves. An AI's biases are completely consistent. It makes the same error in the exact same way every single time, millions of times, until someone discovers the problem and fixes it.
That discovery usually happens because someone notices it happening to them — or to their community.
A school district wants to use an AI system to predict which students are "at risk" of dropping out, based on historical records from the past 15 years. They've asked you to audit it before deployment. Your lab partner — an AI investigator — will help you dig into the data and decisions, but won't hand you conclusions. You have to build the argument yourself.
Consider: What questions would you ask about the training data? What communities might be misrepresented? What harms could a biased prediction cause for a student who gets incorrectly flagged?
In 2019, a former YouTube engineer named Guillaume Chaslot went public with something he had seen from the inside. Chaslot had worked on YouTube's recommendation algorithm — the system that decides which video plays next. He told The Guardian and other outlets that the algorithm had one overriding goal: maximize watch time. Keep people on the platform as long as possible. Every recommendation was a bet: which video will this person click on and then keep watching?
What Chaslot and later researchers found was a consistent pattern: the algorithm kept recommending more extreme content. A person who watched a mainstream political video would be recommended a more partisan one. A person who watched a moderate fitness video would gradually be led toward videos about extreme diets. A teenager watching flat-earth curiosity content would be recommended conspiracy content about vaccines, then government cover-ups, then darker material still. Researchers called it "radicalization by recommendation."
The algorithm wasn't trying to radicalize anyone. It was optimizing for watch time. But outrage, fear, and extreme claims turned out to produce more engagement than calm, measured information. The machine learned: escalation keeps people watching. So it escalated.
A recommendation algorithm is a system that looks at what you've watched, what people like you have watched, and what tends to keep people watching, and then picks the next thing to show you. It sounds helpful — and sometimes it is. But the metric it optimizes for isn't "what's true" or "what's good for you" or even "what you'd actually enjoy if you thought about it." The metric is usually engagement: clicks, time spent, shares, replies.
The problem is that extreme content is often more engaging than moderate content. A headline that says "scientists find mild correlation" gets fewer clicks than "study proves your phone is killing you." A video that calmly explains a political issue gets fewer watch-minutes than a video that declares the other side is evil. The algorithm doesn't know or care which is more accurate — it just knows which one you kept watching.
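A stripped-down sketch shows why this happens. It assumes a hypothetical ranker that sorts candidate videos by predicted watch time; the titles, numbers, and the `accuracy_rating` field are invented, and real recommendation systems are far more complex than this.

```python
from dataclasses import dataclass

@dataclass
class Video:
    title: str
    predicted_watch_minutes: float  # hypothetical engagement prediction
    accuracy_rating: float          # hypothetical 0-1 score; the ranker never reads it

# Invented candidates for illustration.
CANDIDATES = [
    Video("Scientists find mild correlation", 2.1, 0.9),
    Video("Study proves your phone is killing you", 6.4, 0.2),
    Video("A calm explainer on the policy debate", 3.0, 0.8),
    Video("The other side is evil", 7.2, 0.1),
]

def recommend(candidates, k=2):
    """Pick the k videos with the highest predicted watch time."""
    ranked = sorted(candidates, key=lambda v: v.predicted_watch_minutes, reverse=True)
    return ranked[:k]

picks = recommend(CANDIDATES)
for video in picks:
    print(video.title)
```

The accuracy score sits right there in the data, but the objective never looks at it. Changing what gets recommended means changing the sort key or adding constraints, not hoping the ranker will notice on its own.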
Researchers from the University of California, Berkeley, and other institutions documented what they called a "rabbit hole effect" in YouTube data around 2018–2019: when someone started watching videos on politically charged topics, the recommendation system consistently pushed toward more extreme content — regardless of whether the user started on the left or the right of the political spectrum. The escalation pattern was symmetric.
Here's what this creates over time: online communities organized around escalating outrage. Not communities that formed because people sought each other out and decided they had something in common. Communities that were assembled by an algorithm choosing, thousands of times a day, to show each person slightly more intense versions of what already made them click.
This affects who "gets heard" in an important way. If you're part of a group that expresses things calmly and moderately, the algorithm has less incentive to amplify your voice. If you're part of a group that expresses things with anger, fear, or extreme claims, you get more reach. The algorithm doesn't pick voices based on accuracy, wisdom, or even popularity — it picks them based on who generates more engagement. This means the most extreme voices within any community often end up being the most visible ones online, even if they're a small minority.
Internal research at Facebook, made public in 2021 by whistleblower Frances Haugen, showed that the company's own researchers had found that its engagement algorithm was amplifying divisive content and that users who followed political content were being pushed toward "more and more extreme content." One internal slide, reported by the Wall Street Journal, summarized the problem starkly: "Our algorithms exploit the human brain's attraction to divisiveness."
YouTube and Facebook's algorithms weren't designed to spread misinformation or radicalize users. They were designed to keep people watching — which is what their businesses depend on. Does a company have a responsibility to limit its reach when pursuing its business goal causes social harm? Who should decide what "harm" means in this context — the company, the government, or users themselves?
The flip side of amplification is suppression. When algorithms preferentially surface some voices, others get buried. Researchers studying social media platforms have found multiple patterns here. Civil rights organizations documented in 2020 that Instagram's algorithm appeared to systematically deprioritize content tagged with Black Lives Matter hashtags at certain moments. LGBTQ+ creators on TikTok documented in 2019 and 2020 that content referencing their identities was being flagged or hidden, in what TikTok eventually acknowledged was an overly broad "safety" filtering system.
This creates a layered problem: the voices the algorithm amplifies are often the loudest and most extreme; and the voices it suppresses are often from communities that are already marginalized. The result is that the "public conversation" you see online isn't a neutral reflection of what people think — it's a distorted picture shaped by what an engagement-maximizing system decided to show you.
Most people experience their social media feed as "what's happening" or "what people think." You now know it's an edited version — edited not by a journalist with professional standards, but by an algorithm that rewards provocation over accuracy. The communities you see most clearly online are partly the communities the algorithm wanted you to see.
Understanding this doesn't mean everything you see online is false. It means the selection process is invisible and driven by incentives that have nothing to do with your actual interests or with truth. Knowing the algorithm exists — and what it optimizes for — gives you one tool for questioning why you're seeing what you're seeing.
At a policy level, this question is being debated seriously: the EU's Digital Services Act (passed 2022, enforced 2023) requires large platforms to assess and report on "systemic risks" from their recommendation systems — one of the first laws anywhere to treat recommendation algorithms as a matter of public concern rather than pure private business. Whether that law actually changes things remains to be seen.
You've been asked to redesign a social media recommendation algorithm. The current one optimizes purely for watch time — and it has the problems described in the lesson. Your task: propose a different optimization goal or a set of constraints. Your lab partner will challenge your design and ask you to defend it.
There's no perfect answer here. Every design choice involves trade-offs — between engagement and safety, between freedom of speech and harm reduction, between what users say they want and what actually benefits them.
In 2021, a paper published by researchers at the University of Washington and other institutions analyzed which languages were well-represented in the training data of large AI language models. They found that the models had been trained on data drawn overwhelmingly from the internet — and the internet was not a fair sample of humanity.
English made up an estimated 46% of content on the internet at that time, despite being the native language of roughly 5% of the world's population. Languages like Yoruba (spoken by 45 million people in West Africa), Igbo, Swahili, and dozens of others had almost no representation. When researchers tested GPT-3's ability to understand and generate these languages, performance was dramatically worse — sometimes nonsensical.
A separate study published in 2023 by researchers at Masakhane, an African AI research organization, found that AI translation tools produced translations into African languages that were often grammatically wrong, culturally inappropriate, or simply confused. For the nearly one billion people who speak languages native to Africa, the most powerful AI language tools in the world worked poorly or not at all. Not because of any deliberate exclusion. Because the data that shaped these systems was gathered from where data was most easily gathered.
Large language models — the kind of AI that powers chatbots and writing tools — learn by processing enormous amounts of text. The more text they see in a language, from a culture, about a topic, the better they understand it. The less text they see, the worse they perform.
The problem is where that text comes from. Most of it comes from the internet. And the internet is not a neutral sample of human experience. Access to the internet is unequal. Who creates content on the internet is unequal. English-speaking, Western, college-educated voices have historically produced a vastly disproportionate share of the text that ends up in training datasets.
This creates a compounding inequality. Communities with less internet access produce less training data. Less training data means the AI works worse for those communities. An AI that works worse for those communities is less useful to them and less likely to be adopted. Less adoption means less feedback, fewer improvements — the gap grows wider over time.
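The loop described above can be sketched as a tiny simulation. The model is deliberately crude and every number in it is an assumption: new training text is taken to arrive in proportion to how much each community already uses the tool, and usage is taken to be proportional to how much data the tool already has for that community.

```python
def simulate_gap(rounds, data_a=1000.0, data_b=10.0, new_data_per_round=100.0):
    """Toy feedback loop between two communities' shares of training data.

    Assumption (hypothetical): each round, new data is split between the
    communities in proportion to the data each one already has.
    """
    history = []
    for _ in range(rounds):
        total = data_a + data_b
        share_a = data_a / total
        share_b = data_b / total
        data_a += new_data_per_round * share_a
        data_b += new_data_per_round * share_b
        history.append((data_a, data_b))
    return history

history = simulate_gap(10)
for round_number, (a, b) in enumerate(history, start=1):
    print(f"round {round_number}: gap = {a - b:.0f}, smaller share = {b / (a + b):.1%}")
```

In this toy model the smaller community's share of the data never improves, and the absolute gap between the two widens every round. Closing it takes a deliberate change to where new data comes from; the loop will not correct itself.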
A student in Lagos trying to use an AI writing assistant in Yoruba isn't getting the same tool as a student in London using it in English. They're not even getting something roughly equivalent. They might be getting something actively unreliable.
This isn't just about language. It's about what knowledge gets preserved and what gets lost.
Consider indigenous knowledge systems. Many communities around the world have developed sophisticated understandings of medicine, ecology, agriculture, and navigation over hundreds of generations — knowledge passed down orally, in community practices, and in languages that have no large written corpus. None of this is in the training data. An AI asked about a plant's medicinal properties might give you information from Western scientific journals and miss entirely what local healers in that plant's native region have known for centuries.
Or consider dialects and informal language. The research on AI and African American Vernacular English (AAVE) — a legitimate, grammatically complex dialect spoken by millions of Americans — has documented that AI writing tools often flag AAVE as incorrect English, that speech recognition systems perform worse on AAVE than on standard American English, and that AI tools trained mostly on formal written text can treat vernacular language as a mistake rather than a valid form of expression.
Should AI companies be required to represent all human languages equally, even if collecting that data is expensive and some communities have explicitly said they don't want their language scraped for commercial AI training? What happens when the goal of "better AI for everyone" conflicts with a community's right to decide what happens to their cultural knowledge?
In 2023, the Māori people of New Zealand published a formal position statement on AI and their language, te reo Māori — expressing concern that AI companies might train models on te reo without the community's consent, producing systems that misrepresent their culture and strip the language from its cultural context. They were raising a question that applies globally: who has the right to decide whether a community's knowledge and language gets turned into AI training data?
Even if you live in an English-speaking country, this isn't someone else's problem. First, because the world you live in includes billions of people whose knowledge and language AI is getting wrong. Second, because even within English, the same imbalances appear at smaller scale.
Studies have found that AI writing tools trained on predominantly formal, educated English text perform differently on text from different socioeconomic backgrounds, different regions, and different writing traditions. An AI trained mostly on Wikipedia and academic papers will handle those styles better than street-level language, personal narrative, or culturally specific references from communities underrepresented in the training corpus.
When an AI seems "smart" or "accurate" on a topic, that intelligence is partly just a reflection of how much data existed about that topic. An AI that sounds confident about one culture's history and vague about another's isn't more knowledgeable — it's just better supplied with data. That difference has nothing to do with which culture's knowledge matters more.
This is important for how you use AI tools in your own life. If you're asking an AI to help you understand a topic that has been heavily documented in English — a mainstream scientific question, a major historical event covered in Western sources — you're likely to get something reliable. If you're asking about a community's history that has been predominantly oral, a non-Western cultural practice, or a perspective from the global South, you should treat the AI's answer with much more skepticism, and look for human sources from within that community.
The tool is not equally calibrated for all questions. Knowing which questions it handles well — and which it probably handles poorly — is a skill that makes you a much more careful thinker.
Your task is to think through what an AI language model probably knows well versus poorly — not by testing it directly, but by reasoning about what's in its training data. Your lab partner will challenge your reasoning and push you to be specific about which communities and knowledge systems are likely underrepresented — and what real consequences that has.
Consider: oral traditions, indigenous practices, minority languages, working-class experiences, historical voices who never had access to writing or publishing. Where are the gaps, and who pays for them?
In January 2020, Robert Williams, a 42-year-old Black man living in a suburb of Detroit, was arrested in front of his wife and daughters. Police had used a facial recognition system to match him to surveillance footage from a shoplifting incident. The system said it was him. The match was wrong.
Williams spent 30 hours in police custody before investigators, reviewing the evidence, acknowledged the identification was incorrect and released him. His was the first documented case in the United States of a wrongful arrest based on facial recognition AI. The ACLU took his case. He sued the Detroit Police Department.
Williams later wrote: "It's bad enough that you're being accused of something you didn't do. It's even worse knowing the accusation came from a machine that had already been shown to be least accurate for people who look like me." He was referring to exactly the research Joy Buolamwini had published two years earlier — the research that showed facial recognition made the most errors on darker-skinned faces. The problem had been documented. The technology was still deployed. The arrest still happened.
Williams's story makes the stakes concrete. This wasn't a theoretical problem or a research paper. A man was arrested by a machine known to be unreliable for people with his skin tone, and nobody in the system stopped it. Not because nobody knew about the bias — Joy Buolamwini's research was published in 2018 and covered widely. But knowing about a problem in AI isn't the same as fixing it or being protected from it.
This gap — between documented problems and actual change — is where the work of accountability happens. And accountability in AI happens at several levels at once: individual, organizational, and institutional/legal.
Williams's case eventually contributed to real policy change. In 2021, Detroit amended its policy on facial recognition, requiring that AI identifications be treated as investigative leads only — not as sufficient basis for arrest. Several U.S. cities, including San Francisco, Boston, and Portland, banned government use of facial recognition entirely. These changes happened because Williams was willing to speak publicly, because the ACLU litigated, and because journalists and researchers kept the story alive. None of it happened automatically.
There's a familiar frustration with "what you can do" sections: they usually end with "recycle more" or "talk to your friends" — advice that sounds like action but doesn't change systems. This isn't that. The actions that have actually moved AI accountability forward involve specific, trackable behaviors.
Name what you see. When you notice AI producing outputs that seem biased — a search result that stereotypes a group, a recommendation feed that never shows certain voices, a writing tool that flags vernacular as incorrect — it's worth naming it specifically, not just noticing it privately. Researchers at Google, academic institutions, and advocacy organizations have all been alerted to specific AI failures by regular users who described what they saw in enough detail to investigate.
Ask for transparency. When an institution tells you "the algorithm decided" — in school, in a hiring context, in a police interaction — you are entitled to ask what algorithm, trained on what data, with what documented error rates. This is increasingly a legal right in the European Union under the AI Act (2024) and the GDPR, and in several U.S. states. Asking the question — even if you don't get an answer — creates a record that the question was raised.
Support the researchers. Organizations like the Algorithmic Justice League (founded by Joy Buolamwini), the Distributed AI Research Institute (DAIR, founded by Timnit Gebru), and Masakhane do the work of documenting AI bias and advocating for affected communities. Following their work, sharing it, and — eventually — contributing to it or supporting it politically are concrete actions.
Most people encounter AI bias as something abstract — a problem "out there" in tech companies. You now see that it shows up in specific decisions about specific people: who gets hired, who gets arrested, whose health needs get assessed, whose voice gets heard. That specificity is what makes accountability possible. You can't hold a system accountable for being "generally biased" — you can hold it accountable for a specific identifiable error that harmed a specific person.
This is where you see what accountability looks like at scale: in policy decisions that are being made right now.
In March 2024, the European Parliament approved the European Union's AI Act, the world's first comprehensive regulatory framework for AI; it entered into force that August. It categorizes AI systems by risk level. High-risk uses, including AI in criminal justice, employment, education, and critical infrastructure, face strict transparency and auditing requirements. Real-time biometric surveillance (like facial recognition in public) is largely prohibited. Companies deploying high-risk AI must document their training data, demonstrate their systems don't discriminate, and register in a public database.
In the United States, the approach has been more fragmented. The Biden administration issued an executive order on AI in October 2023 covering safety, security, and equity concerns. The FTC has brought enforcement actions against companies for deceptive AI claims. Several states — California, Illinois, Colorado — have passed laws requiring impact assessments for high-risk AI in employment and housing. But there is no single federal AI law equivalent to the EU's.
This matters because the regulatory landscape shapes what companies are required to disclose, what harms are legally redressable, and what communities have standing to demand accountability. A community harmed by a biased AI in a country with no AI regulations has significantly fewer options than the same community in a jurisdiction with transparency requirements.
The EU AI Act bans most real-time facial recognition in public. Proponents say this protects people from documented wrongful arrest and surveillance. Critics say it makes it harder to find missing children, track terrorists, and solve serious crimes. How should democratic societies weigh an individual's right not to be wrongly identified by a biased system against the public interest in solving crimes? And who should make that trade-off — courts, legislatures, tech companies, or citizens?
The communities doing the most concrete work on this — Masakhane for African languages, the Algorithmic Justice League for facial recognition, Data for Black Lives for health equity — operate at the intersection of research and advocacy. They publish findings, pressure companies, and testify to legislatures. Some of the most important AI accountability work of the last decade has been done by researchers who were themselves affected by the systems they studied.
That matters for you specifically: the person most likely to notice a gap in AI performance is the person for whom the gap exists. Expertise in this field doesn't require a computer science degree. It requires the ability to see the problem, describe it precisely, and connect it to the larger pattern. That's something you can do right now, with what you've learned in this module.
A city council is deciding whether to allow the local police department to use a facial recognition AI system for identifying suspects from CCTV footage. You've been asked to present the accountability case — either for strict regulation with mandatory transparency, or for an outright ban. Your lab partner will play a skeptical council member who needs specific, concrete arguments, not general claims about bias.
You'll need to draw on what you've learned across this whole module: training data bias, representation gaps, the Williams case, what actual policy changes have looked like, and what rights people have under existing law. Be specific. Vague arguments don't win council votes.