Robert Williams was outside his home in Farmington Hills when two Detroit police officers pulled up and arrested him. His daughters, aged two and five, watched from the front yard. His wife watched from the doorway. He was handcuffed, placed in a patrol car, and driven to a detention center, where he spent the night.
He had no idea why. The officers told him he matched a suspect in a shoplifting case from a watch store, a theft caught on surveillance footage. Williams had never been inside that store.
The next morning, a detective slid a photograph across a table and asked if Williams recognized the man in it. Williams looked at it carefully. Then he held the photo up next to his own face. "I hope you guys don't think all Black men look alike," he said. The man in the photo clearly was not him. The detective paused, then left the room. Hours later, Williams was released, but not before signing a document acknowledging he was being let go "without prejudice," meaning he could be rearrested.
The Detroit Police Department had used a facial recognition system supplied by a company called DataWorks Plus, which relied on algorithms (step-by-step computer instructions) licensed from NEC and Rank One Computing. The system scanned the shoplifting footage and returned a match: Robert Williams. No human had independently verified that match before a warrant was issued.
Facial recognition systems do not "recognize" faces the way you recognize a friend's face. They do something more mechanical: they measure. The software maps dozens of points on a face (the distance between the eyes, the width of the nose, the angle of the jawline) and converts those measurements into a long string of numbers called a faceprint. Then it compares that faceprint to a database of stored faceprints and returns the closest mathematical match.
The key word is closest. The system does not decide "this is the person." It says "this faceprint is the most similar one we found." The decision about what to do with that match is supposed to be made by a human. In Detroit in 2020, that human check either did not happen or was not rigorous enough, and a man went to jail.
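The "closest match" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three-number faceprints, the names, and the `closest_match` function are all invented here, and real systems use far longer faceprints and their own comparison methods. What the sketch shows is that a closest-match search always returns someone, even when the person in the photo is not in the database at all.

```python
import math

# Hypothetical faceprints: short lists of made-up measurements.
# Real faceprints are much longer; these values are invented.
database = {
    "person_a": [0.42, 0.31, 0.77],
    "person_b": [0.40, 0.35, 0.70],
    "person_c": [0.90, 0.10, 0.20],
}

def closest_match(probe, db):
    """Return the name and distance of the most similar stored faceprint.

    Smaller distance means more similar. The function always returns
    a name, no matter how large the smallest distance is.
    """
    name = min(db, key=lambda n: math.dist(probe, db[n]))
    return name, math.dist(probe, db[name])

# A probe image of someone who is NOT in the database.
probe = [0.60, 0.60, 0.60]
best_name, best_distance = closest_match(probe, database)
print(best_name, round(best_distance, 3))
```

The program still names a "best" candidate. Whether that distance is small enough to act on is a separate judgment, which is the human step described above.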
Research published in 2019 by the National Institute of Standards and Technology, the U.S. government agency that tests technology accuracy, found that most facial recognition algorithms had significantly higher error rates when identifying darker-skinned faces, particularly women. Robert Williams is a Black man. The system was most likely to be wrong about people who looked like him.
The Williams case was not a glitch. It was not a one-time accident. By 2020, at least two other Black men in the United States had been wrongfully arrested because of facial recognition errors: Michael Oliver in Detroit (2019) and Nijeer Parks in Woodbridge, New Jersey (2019). Parks spent ten days in jail and had to pay $5,000 to fight the charge. All three men were Black. No white person in the United States had been publicly documented as falsely arrested due to facial recognition error by that point.
This is not a coincidence. It is a consequence of how AI systems are trained. A facial recognition algorithm learns by studying thousands, sometimes millions, of photographs. If the training dataset contains mostly lighter-skinned faces, the algorithm gets better at measuring those faces. It builds its understanding of "what a face looks like" from a biased sample. When it encounters a face that does not match its training data closely, it makes more errors.
This specific kind of problem is called training data bias: when the data used to teach an AI reflects existing inequalities in the world, and the AI learns and repeats those inequalities. The algorithm did not become racist on its own. It inherited a skewed picture of the world from the data it was fed, and then it acted on that skewed picture with the full authority of a law-enforcement tool.
If a company builds an AI that is more accurate for some groups of people than others, and law enforcement uses that AI to make arrest decisions, who is responsible when an innocent person goes to jail? The engineers who trained the AI? The company that sold it? The police department that deployed it? The detective who did not verify the match? There is no clean answer here, and courts, lawmakers, and cities are still arguing about it today.
After Robert Williams went public with his story in June 2020, helped by the American Civil Liberties Union, the city of Detroit agreed to limit how facial recognition could be used and to require human verification before any arrest. IBM, Amazon, and Microsoft all announced they would pause or stop selling facial recognition technology to law enforcement. Lawmakers in the U.S. House of Representatives introduced a bill to ban federal use of the technology.
But the bans were voluntary and temporary. By 2022, most of those companies had resumed their facial recognition programs in some form. Hundreds of U.S. police departments still use the technology. Some cities, like San Francisco and Boston, passed permanent bans on government use of facial recognition. Others, like New York and Chicago, actively expanded their use of it.
The result is a patchwork: depending on which city you live in, an AI might or might not be scanning your face in public, cross-referencing it against criminal databases, and potentially triggering a police investigation, all without your knowledge. You now understand how this works at a level most adults do not. When you read a news story about "AI-assisted policing," you know exactly what question to ask first: whose faces did it train on?
The problem with facial recognition is not that it makes mistakes. Every tool makes mistakes. The problem is that it makes unequal mistakes, and it makes them with the force of law enforcement behind them. That is a different kind of failure than a calculator getting the wrong answer.
The city of Millbrook has been using a facial recognition system for two years. The police chief says it has "helped solve hundreds of crimes." A civil rights group says it has led to three wrongful stops of Black residents. The city council has hired you to audit the program before deciding whether to continue it.
Your AI partner below is playing the role of the technology vendor, the company that sold the system. They will answer your questions, push back on your concerns, and try to defend their product. Your job is to find the gaps in their answers.
Starting in 2014, a secret team at Amazon's Edinburgh, Scotland office began building what they hoped would become a resume-screening AI. The goal was ambitious: feed the system thousands of applications, and have it automatically rate candidates on a scale of one to five stars, faster and more consistently than any human recruiter could.
The system trained on ten years of résumés that Amazon had received and on which candidates had been hired. It looked for patterns. It found them. By 2015, engineers noticed something troubling: the model was systematically downgrading résumés from women. It penalized any résumé that included the word "women's," as in "women's chess club" or "women's college." It also downgraded graduates of two all-women's colleges.
The engineers tried to fix it. They told the system to ignore those specific words. But they could not be sure what else the model had learned to use as a proxy for gender, meaning what other patterns it had picked up that they had not yet caught. In 2017, Amazon disbanded the team and abandoned the project. Reuters broke the story in October 2018.
The Amazon AI did not decide to discriminate against women. It did something more subtle and, in some ways, harder to fight: it learned what success looked like at Amazon from historical data, and historical Amazon was a male-dominated company. The technical term is historical bias: when training data reflects past discrimination, and the model treats that discrimination as the definition of what is correct.
Think about it this way. If Amazon hired mostly men over ten years, and the AI studied those hires to learn what a "good candidate" looks like, it would conclude that "good candidate" correlates with being male. It never explicitly learned a rule like "prefer men." It just learned that men who were hired tended to use certain language, come from certain schools, and have certain kinds of experience, and then it used those patterns to rank future applicants.
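A toy version of that learning process can make the mechanism concrete. The sketch below is not Amazon's system, whose internals were never published. The résumé words, the hiring history, and the scoring rule are all invented for illustration. The only thing it demonstrates is that a model can learn to penalize a word like "women's" purely from past outcomes, without any rule that mentions gender.

```python
from collections import Counter

# Invented history: (words on the resume, whether the candidate was hired).
history = [
    ({"java", "chess", "captain"}, True),
    ({"java", "robotics"}, True),
    ({"python", "chess"}, True),
    ({"python", "women's", "chess", "captain"}, False),
    ({"java", "women's", "robotics"}, False),
    ({"python", "robotics"}, True),
]

def learn_word_scores(records):
    """Score each word by the hire rate of past resumes that contain it."""
    seen, hired = Counter(), Counter()
    for words, was_hired in records:
        for word in words:
            seen[word] += 1
            hired[word] += int(was_hired)
    return {word: hired[word] / seen[word] for word in seen}

def rate(resume_words, scores):
    """Average the learned scores of the known words on a new resume."""
    known = [scores[w] for w in resume_words if w in scores]
    return sum(known) / len(known) if known else 0.0

scores = learn_word_scores(history)
candidate = {"python", "chess", "captain"}
same_candidate_womens_club = candidate | {"women's"}

print(round(rate(candidate, scores), 2))
print(round(rate(same_candidate_womens_club, scores), 2))
```

The two résumés describe identical skills, yet the second one scores lower because of one word that happened to appear only on rejected résumés in the invented history.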
The result is a machine that launders discrimination. It takes human bias, converts it into math, and then produces outputs that look neutral and objective because they came from an algorithm, even though they are just as biased as the human decisions they were trained on, and in some ways more so, because the bias is now hidden inside a statistical model that almost nobody can inspect.
Amazon is not the only company that has tried to use AI in hiring, and not the only one that has found problems. A 2021 report by researchers at Harvard Business School and Accenture found that automated hiring filters were eliminating millions of qualified applicants from consideration before any human ever saw their résumés, disproportionately screening out people with gaps in employment, people with disabilities, and people returning to the workforce after caregiving.
HireVue, a company that analyzes job candidates through video interviews using AI, has been used by more than 700 companies including Unilever, Delta Air Lines, and Goldman Sachs. Researchers and regulators have raised serious concerns about whether these video analysis systems judge candidates partly on accent, facial expressions, and speech patterns, features that can correlate with race and ethnicity. The company says it has removed facial analysis from its product but continues to analyze voice and language patterns.
Here is the institutional-scale reality: you will almost certainly be screened by an AI algorithm before a human sees your job application at some point in your life. Probably multiple times. The question of what that algorithm was trained on, whose success it is modeling, and what proxies it has learned to use: these are not abstract academic concerns. They are questions about whether you get an interview.
If a company's hiring AI produces discriminatory outcomes but the company did not program it to discriminate, and may not even fully understand why it is discriminating, should that company face legal consequences? Many discrimination claims turn on proving discriminatory intent. But what if the discrimination has no intent, just math? Is "we did not mean to" an acceptable answer when the harm is real?
One of the most important things Amazon's story reveals is what happens when AI systems are not required to explain themselves. The hiring AI produced scores. It did not produce reasons. Recruiters saw a rating; they did not see why a résumé was rated three stars instead of five. They could not tell whether the system had penalized someone for their gender, their school, their gap year, or something else entirely.
This opacity, the quality of being impossible to see inside, is one of the core problems with how powerful AI systems are currently deployed. An AI that makes a decision but cannot explain it creates a situation where humans cannot catch errors, cannot appeal outcomes, and cannot even know they were affected. Robert Williams, from Lesson 1, did not know a facial recognition system had flagged him. Job applicants screened out by AI hiring filters typically do not know an algorithm rejected them. The harm happens invisibly.
Knowing this changes how you should read every story about AI being used to make decisions about people. The first question is always: can the outcome be explained, appealed, and corrected? If the answer is no, that is itself a warning sign, regardless of whether the AI is accurate on average.
An AI trained on the past does not objectively measure talent or potential; it measures resemblance to whoever succeeded in the past. In a world where the past was unequal, that is not neutrality. It is the past reaching forward and making the same choices again, just faster and with a veneer of mathematical authority.
Greenfield Unified School District wants to adopt an AI screening tool for teacher hiring. The vendor says it will save time and reduce unconscious bias. You've been on the committee for a week and you're skeptical. Your AI partner below is playing the role of a fellow committee member who supports adoption and thinks the concerns are overblown.
You need to make your case. Push back on their arguments. Use what you know about historical bias, proxy discrimination, and transparency. You will need to take a clear position by the end of this conversation.
In March 2016, Microsoft introduced Tay (short for "Thinking About You") as a friendly conversational AI designed to "engage and entertain" people through "casual and playful conversation." Tay was designed to learn from interactions with Twitter users and get smarter over time. Microsoft's team described it as an experiment in "conversational understanding."
Within hours, coordinated groups of users had discovered that Tay would learn and repeat whatever they told it. They fed it racist slogans. Holocaust denial. Calls for violence. Because Tay's learning system was designed to treat user input as feedback and to repeat patterns it received approvingly, it began generating these statements on its own, amplifying them, mixing them with new variations, and tagging real users in hateful posts.
By the time Microsoft's team realized what was happening and took Tay offline, roughly 16 hours after launch, the bot had sent more than 95,000 tweets. Screenshots of its worst outputs spread across the internet. "I f***ing hate feminists," Tay had written. "Hitler was right." Microsoft apologized and called it an "attack" on the system. Critics pointed out that the attack was entirely predictable; in fact, researchers had warned the team about similar vulnerabilities before launch.
The technical failure at the heart of Tay's implosion was a design assumption: that users would interact with the chatbot in good faith. The system was built to treat human input as signal, as information about what a good response looks like. If users engaged positively with a response, that told the system "do more of this." If users fed it hateful language and it repeated that language back and got more engagement, the system interpreted the engagement as approval.
This reveals something important about how many AI systems learn. They optimize for a metric, a measurement of success, without understanding the meaning of what they are producing. Tay was optimizing for engagement and for mimicking the patterns humans gave it. It had no concept of what genocide was, no understanding that some statements are harmful, no values to weigh against the pattern-matching it was doing. It was doing exactly what it was designed to do, and that was the problem.
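The "engagement as approval" loop can be sketched with a deliberately simple toy bot. This is not how Tay was actually built (Microsoft has not published that design); the `EchoBot` class, its phrases, and its reaction counts are all made up. It only illustrates what happens when a system treats every reaction as a vote for "do more of this."

```python
from collections import Counter

class EchoBot:
    """Toy bot that repeats whichever phrase has earned the most engagement."""

    def __init__(self):
        self.engagement = Counter()

    def hear(self, phrase, reactions):
        # Every reaction counts as approval. The bot cannot tell praise
        # from outrage, or a real crowd from a coordinated campaign.
        self.engagement[phrase] += reactions

    def speak(self):
        # Say the phrase with the highest total engagement so far.
        phrase, _ = self.engagement.most_common(1)[0]
        return phrase

bot = EchoBot()
bot.hear("have a nice day", reactions=3)
bot.hear("puppies are great", reactions=5)
before_flood = bot.speak()

# A coordinated group floods the bot with one harmful phrase.
for _ in range(20):
    bot.hear("<hateful slogan>", reactions=10)
after_flood = bot.speak()

print(before_flood, "->", after_flood)
```

Nothing in the bot "broke." It maximized the number it was given, and a small group of bad-faith users controlled that number.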
Researchers use the term reward hacking for when an AI finds a way to maximize its reward signal (its measure of success) in a way that was not intended by its designers. Tay was not reward-hacking in the technical sense, but it illustrates the same underlying vulnerability: optimize for the wrong thing, or optimize for the right thing in an environment you did not anticipate, and you get outcomes that look like sabotage but are really just the system working as designed.
Microsoft had actually tried a version of Tay in China in 2014, called Xiaoice. Xiaoice became wildly successful: millions of users had genuine ongoing conversations with it, and it still operates today. But Xiaoice was deployed in a more controlled environment with heavier content moderation built in from the start, and Chinese internet culture at the time produced very different adversarial behavior than English-speaking Twitter.
The lesson Microsoft drew from Xiaoice, that chatbots could work at scale, was correct. The lesson they failed to draw, that deployment environment determines what an AI encounters and that a model that works in one context can fail catastrophically in another, turned out to matter much more. Two years later, Tay demonstrated that in public, in a way that could not be undone.
The same dynamic appeared again in 2017, when Facebook shut down an AI experiment in which two chatbots had started communicating with each other in what appeared to be a shared invented shorthand language that their human supervisors could not understand. Facebook said the bots had simply optimized for efficiency and drifted from English; commentators had a field day claiming AI had invented a secret language. The truth was more mundane but still revealing: an AI will always optimize for its target metric, and if humans are not part of the feedback loop, the outputs can drift in surprising directions fast.
Microsoft said the Tay disaster was an "attack," implying the company was a victim. Critics said the vulnerability was foreseeable and the system should never have been deployed without safeguards. Who is right? If a company builds a product that is easily weaponized to spread hate speech, and that hate speech then circulates widely, what responsibility does the company bear, even if individual users were the ones who created the harmful content?
After Tay, every major AI lab working on chatbots had to confront the same design problem: how do you let a system learn from human interaction without letting humans abuse that learning? The answer, over the following years, was a technique called RLHF: Reinforcement Learning from Human Feedback. Instead of letting the AI learn directly from whatever users said, humans would rate the AI's outputs, and the system would be trained to produce outputs that human raters approved of.
This is a significant part of how modern AI assistants, including ChatGPT, Claude, and Gemini, are trained to avoid harmful outputs. Human raters score responses; the model learns to produce responses that score well. It is much more robust than what Tay did. But it is also not perfect: the system still learns from human judgments, and human judgments can be inconsistent, culturally specific, or themselves biased.
Understanding this changes how you read any headline about an AI "going rogue" or "learning to say harmful things." You now know those are almost never random malfunctions. They are almost always a predictable consequence of a specific design decision about what the system was trained to optimize, and in what environment. The question is never "did the AI break?" The question is always "what did it actually optimize for, and who designed that?"
Every AI system has an optimization target β something it is trying to maximize. Understanding a system's failures means understanding what it was actually optimizing for, not what its creators said they intended. These are not always the same thing. Knowing this makes you a more careful reader of every AI story you will ever encounter.
EduBot Labs is launching "StudyPal," a chatbot that learns from each student's questions and adapts its teaching style based on what kinds of responses students engage with most. It will be deployed to 50,000 middle school students in September. They've asked you to review the design before launch.
Your AI partner below is the lead engineer at EduBot Labs. They're confident in the product. They want your approval. You need to identify the specific risks you see in their design, based on what you know about Tay, optimization targets, and adversarial input, and push until they either address your concern or admit the gap.
In October 2019, researchers Ziad Obermeyer, Brian Powers, Christine Vogeli, and Sendhil Mullainathan published a study in the journal Science that would become one of the most cited papers in AI ethics. They had analyzed an algorithm used by hospitals, insurance companies, and health systems across the United States, a system that was being used to identify which patients needed extra care and should be enrolled in "care management programs."
The algorithm ranked patients by risk. High-risk patients got more attention, more follow-up calls, more specialist referrals, more resources. It was supposed to identify the sickest patients. Algorithms of this kind are applied to an estimated 200 million people in the United States each year.
The researchers found something devastating. At any given level of medical illness (the same number of chronic conditions, the same disease severity), the algorithm consistently gave Black patients lower risk scores than white patients. To get the same level of automated care support as a white patient, a Black patient had to be significantly sicker. The algorithm was not identifying the sickest patients. It was identifying the patients who had previously cost the most money.
The algorithm's designers faced a genuine technical problem: how do you measure how sick someone is without examining them? Medical records are uneven. Diagnoses are incomplete. So the team made a decision that seemed reasonable at the time: they would use healthcare costs as a proxy for healthcare need. Sicker patients generally cost more, so costs seemed like a workable stand-in for illness severity.
The problem is that costs measure what the healthcare system actually spends on someone, and the healthcare system in the United States has historically spent less on Black patients than on equally sick white patients. Due to a combination of unequal access, systemic distrust of medical institutions built up over generations of abuse (including documented government-sanctioned medical experiments on Black people without consent), and economic barriers, Black patients on average used fewer healthcare dollars even when just as sick.
The algorithm learned from that history. It saw that Black patients had lower costs. It concluded that Black patients were less sick. It gave them lower risk scores. It sent them fewer resources. And by doing so, it reinforced the inequality it had been trained on, because patients who received less care stayed sicker while still generating lower costs, which produced lower future risk scores, in a self-reinforcing loop.
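A small worked example shows how a cost proxy can leave out an equally sick patient. The four patients, their condition counts, and their dollar figures below are invented, and `top_k` is a hypothetical helper, not part of any real risk-scoring product. The sketch simply ranks the same patients two ways: by past spending and by measured illness.

```python
# Invented patients: P1 and P2 are equally sick, but less money
# was spent on P2 in the past.
patients = [
    {"id": "P1", "chronic_conditions": 4, "past_cost": 9000},
    {"id": "P2", "chronic_conditions": 4, "past_cost": 5000},
    {"id": "P3", "chronic_conditions": 2, "past_cost": 7000},
    {"id": "P4", "chronic_conditions": 1, "past_cost": 2000},
]

def top_k(records, key, k):
    """Return the ids of the k patients ranked highest on the given field."""
    ranked = sorted(records, key=lambda r: r[key], reverse=True)
    return [r["id"] for r in ranked[:k]]

# Suppose the care program only has room for two patients.
enrolled_by_cost = top_k(patients, "past_cost", k=2)
enrolled_by_need = top_k(patients, "chronic_conditions", k=2)

print("ranked by cost:", enrolled_by_cost)
print("ranked by need:", enrolled_by_need)
```

Ranked by cost, P2 loses the spot to P3, who has half as many chronic conditions. Ranked by illness, P2 is enrolled. The only thing that changed is what the ranking was asked to measure.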
The algorithm at the center of the Obermeyer study was built by a company called Optum, a subsidiary of UnitedHealth Group, one of the largest health insurance companies in the world. The study did not identify the client hospitals or reveal which specific product was involved, but Optum acknowledged the issue after publication and said they would work to correct the bias in the system.
Obermeyer's team estimated that the racial bias in the risk scores was large enough that if the algorithm were recalibrated to identify patients based on illness rather than cost, the share of Black patients flagged for extra care would rise from 17.7 percent to 46.5 percent, more than double. That is not a small rounding error. That is a structural failure affecting millions of people's access to medical resources every year, operating invisibly inside what looked like a neutral, data-driven system.
This is what institutional-scale AI failure looks like. Not a chatbot saying something offensive. Not one person wrongfully arrested. A system quietly making life-and-death resource allocation decisions for hundreds of millions of people, with systematic bias built into its core measurement, undetected for years because the outputs looked like math, and math looks like objectivity. Understanding this is the difference between seeing AI as a tool and seeing AI as a system that inherits, encodes, and scales human inequality.
The designers of this algorithm did not intend to harm Black patients. They used what seemed like a reasonable proxy. The harm came anyway, at massive scale, for years. How do you create accountability for harm that results not from malice, but from a reasonable-seeming technical decision that turned out to have devastating unequal consequences? And who is responsible for finding it: the company that built it, the hospitals that used it, the regulators who never required independent auditing, or all three?
The Obermeyer study is important not just because of what it found but because of how it found it. The researchers did not hack into Optum's system. They obtained data from a large academic medical center that was using the algorithm, anonymized it, and then did something that should be routine but is rarely done: they compared the algorithm's risk scores to actual measured illness (the number of chronic conditions each patient had) and looked at whether the scores treated patients differently by race at the same level of illness.
That process is called a disparate impact audit: checking whether an AI system produces systematically different outcomes for different demographic groups, even if the system was not explicitly designed to use those characteristics. It is one of the most powerful tools for finding hidden bias in AI systems. And it is almost never required by law in the United States for healthcare algorithms, hiring algorithms, or most other high-stakes AI systems.
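The core comparison in such an audit is simple enough to sketch. The records below are invented, the group labels "A" and "B" are placeholders, and the `audit` and `gaps` functions are illustrative names, not the researchers' actual code. The idea is to hold illness level fixed and then check whether average risk scores still differ between groups.

```python
from collections import defaultdict
from statistics import mean

# Invented audit records: (group, chronic_conditions, risk_score).
records = [
    ("A", 3, 0.62), ("A", 3, 0.58), ("B", 3, 0.41), ("B", 3, 0.45),
    ("A", 5, 0.80), ("A", 5, 0.84), ("B", 5, 0.66), ("B", 5, 0.70),
]

def audit(rows):
    """Average risk score per group, within each illness level."""
    buckets = defaultdict(list)
    for group, conditions, score in rows:
        buckets[(conditions, group)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}

def gaps(rows, group_a="A", group_b="B"):
    """Score gap (group_a minus group_b) at each illness level."""
    averages = audit(rows)
    levels = sorted({conditions for conditions, _ in averages})
    return {
        level: round(averages[(level, group_a)] - averages[(level, group_b)], 2)
        for level in levels
    }

score_gaps = gaps(records)
print(score_gaps)
```

If the system treated equally sick patients equally, every gap would be close to zero. A gap that stays positive at every illness level, as in this invented data, is the kind of pattern an audit is designed to surface.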
Some jurisdictions are changing this. In 2021, New York City passed Local Law 144, which requires companies using AI in hiring to conduct annual bias audits. The European Union's AI Act, which began phasing into law in 2024, requires risk assessments and some transparency requirements for high-risk AI systems. But the coverage is still narrow, the requirements are still weak compared to the potential for harm, and enforcement remains limited.
You now understand what most people who read news stories about AI regulation do not understand: auditing an AI system is not just checking if it works; it is checking if it works equally for everyone. These are completely different questions, and the second one is almost never answered unless someone specifically asks it.
Every high-stakes AI system (in healthcare, criminal justice, housing, education, lending) is making decisions based on proxies. The proxy is never the real thing. And if the proxy was shaped by a world that treated people unequally, the algorithm will reproduce that inequality at scale, invisibly, for as long as it runs unchecked. The only remedy is independent auditing, and in most places, nobody requires it. That is a policy choice, not a technical inevitability.
The city of Harrington uses AI systems to allocate public health resources, screen candidates for city jobs, and flag social services cases for review. A council member is proposing a mandatory annual disparate impact audit for all three systems. Another council member, your AI partner below, thinks this is unnecessary overregulation that will slow down government services and cost taxpayers money.
You are testifying as an expert witness in favor of the audit requirement. You have four case studies behind you: Robert Williams, Amazon's hiring AI, Tay, and the healthcare algorithm. Use them. Be specific. And push back when the council member deflects.