Module 5 · Lesson 1

The Accountability Gap

When AI causes harm, the chain of responsibility stretches across engineers, companies, regulators — and sometimes snaps entirely.

When a machine makes a consequential decision, who answers for the outcome?

In the Netherlands, a fraud-detection algorithm called SyRI quietly scored hundreds of thousands of low-income residents on their likelihood of welfare fraud. The system was built by the Ministry of Social Affairs, fed data from seventeen government databases, and operated with almost no public disclosure. When a coalition of Dutch civil-society groups challenged the system in court, the ruling, handed down in 2020, held that SyRI violated the right to private life. What the ruling could not do was name a single individual who would be held personally accountable for the years of harm inflicted on wrongly flagged families. The engineers said they followed specifications. The ministry said the algorithm was a tool, not a decision. The vendors said they delivered what was procured. The accountability gap yawned open, and nobody fell in.

This pattern is not an accident. It is a structural feature of how AI systems are designed, procured, and deployed. Understanding it is the first step toward closing it.

Defining Accountability in AI

Accountability means that a specific actor can be identified, questioned, and sanctioned when something goes wrong. It has three requirements working together: answerability (the actor must explain their decisions), enforceability (a sanction must be possible), and traceability (the decision path must be visible enough to assign blame or credit). In traditional governance — a city council passes a bad ordinance, voters can recall the councillors — these three are reasonably satisfied. AI disrupts all three simultaneously.

Answerability fails when the system is a black box that even its creators cannot fully explain. Enforceability fails when liability is diffused across a procurement chain. Traceability fails when decisions are produced by statistical models that have no explicit rule-based logic a human authored. The SyRI case collapsed all three: no individual authored the decision to flag any particular family; no contract clause assigned liability for erroneous flags; and the algorithm's internal logic was never published.

Accountability Gap: The structural absence of a clearly identified, legally enforceable, personally answerable party when an AI system causes harm, despite multiple human actors having contributed to its development and deployment.

Diffused Responsibility: The spreading of culpability across so many actors (developers, vendors, procurers, deployers, regulators) that no single actor perceives themselves as bearing primary responsibility.
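One way to read the three requirements is as a conjunction: accountability holds only if answerability, enforceability, and traceability all hold at once. The sketch below is a minimal illustration of that reading, not a formal model; the class name `AccountabilityCheck` and its fields are hypothetical, and the SyRI values simply restate the lesson's claim that all three requirements failed.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AccountabilityCheck:
    """Hypothetical record of which requirements hold for one system."""
    answerable: bool    # can a named actor be made to explain the decision?
    enforceable: bool   # is a sanction actually possible?
    traceable: bool     # can the decision path be followed?

    def missing(self) -> list[str]:
        """Return the names of the requirements that are not met."""
        return [name for name, ok in asdict(self).items() if not ok]

    def holds(self) -> bool:
        """Accountability holds only when no requirement is missing."""
        return not self.missing()


# The lesson's reading of SyRI: all three requirements failed.
syri = AccountabilityCheck(answerable=False, enforceable=False, traceable=False)
print(syri.holds())    # False
print(syri.missing())  # ['answerable', 'enforceable', 'traceable']
```

Listing the missing requirements, rather than returning a single yes or no, mirrors the lesson's point that each failure has a different cause and therefore a different remedy.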

The Principal-Agent Problem in AI

Economists describe a principal-agent problem when someone (the principal) delegates action to another (the agent) whose interests and information differ from theirs. Classical examples include shareholders and executives, or patients and doctors. AI introduces a layered version: the deployer is the principal of the AI vendor, who is the principal of sub-contractors, who are principals of open-source maintainers. Each layer inserts a gap between instruction and outcome.

In 2016, Microsoft released Tay, a Twitter chatbot. Within sixteen hours, coordinated users had trained it to produce racist and misogynistic content. Microsoft pulled Tay and issued a statement expressing regret — but the statement was careful to frame the harm as user behavior, not design failure. In reality, the design decision to allow real-time learning from uncurated public input was made by human engineers at Microsoft. The principal-agent chain obscured who made that choice and why it was not challenged before launch.

The principal-agent framing clarifies the accountability problem: responsibility does not automatically flow back to principals just because they own the system. It must be deliberately engineered in.

Real Case: Uber's Fatal Crash, 2018

In Tempe, Arizona, an Uber self-driving test vehicle struck and killed pedestrian Elaine Herzberg. The backup safety driver, Rafaela Vasquez, was watching a television programme on her phone. Uber had disabled the car's emergency braking. The NTSB found fault with Uber's safety culture, the operator's inattention, and regulatory gaps in Arizona. In 2020, Uber's self-driving unit was sold to Aurora. No Uber executive was criminally charged. Vasquez was charged in 2020 — the human at the bottom of the chain who was least empowered to change the system.

Layers of Accountability

Scholars and regulators increasingly distinguish at least four layers where accountability must be assigned simultaneously rather than sequentially.

Design accountability sits with engineers and product managers who choose what the system optimises for, what data it trains on, and what failure modes are acceptable. Deployment accountability sits with the organisation that integrates the AI into a consequential decision process. Oversight accountability sits with regulators and auditors who set and enforce the rules of the game. Operational accountability sits with the human who is present when the system acts — a safety driver, a loan officer who rubber-stamps a model's output, a moderator who relies entirely on automated flags.

The Herzberg case illustrates the danger of assigning only the last layer. Prosecuting Vasquez while Uber's leadership faced no consequence sends a perverse signal: if you are the operator closest to the harm, you bear all the risk, regardless of how systematically the layers above you failed.

Core Principle

Accountability in AI is not a natural consequence of building systems. It must be designed in, legally structured, and culturally enforced across every layer of the development-to-deployment chain — or it will not exist at all.

Lesson 1 Quiz

The Accountability Gap

Five questions — select the best answer for each.

1. The Dutch SyRI system was struck down in 2020 primarily because it violated what right?

Correct. The court found SyRI violated the right to private life — but it could not name any individual personally accountable for the years of harm, illustrating the accountability gap in full.

Not quite. The Dutch court's ruling focused on privacy — specifically the right to private life under European human rights law — not electoral or employment rights.

2. Which three components must all be present for accountability to function in an AI context?

Correct. Answerability (explain decisions), enforceability (sanctions are possible), and traceability (the decision path can be followed) are the three structural requirements identified in the lesson.

Not quite. The lesson identifies answerability, enforceability, and traceability as the three structural requirements that accountability depends on — all three must hold simultaneously.

3. In the 2018 Uber fatality, who was ultimately charged with a criminal offence?

Correct. Vasquez — the operator least empowered to change the system's design — was charged, while no Uber executive faced criminal charges, illustrating the perverse incentive of assigning accountability only to the last layer.

Not quite. Rafaela Vasquez, the safety driver, was charged. No Uber executives faced criminal charges — a result the lesson uses to illustrate how blame flows to those closest to the harm, not those highest in the design chain.

4. Microsoft's Tay chatbot produced harmful content within sixteen hours. Which design decision most directly enabled this?

Correct. The decision to allow Tay to update its model from raw, uncurated Twitter interactions — made by human engineers — was the proximate design failure, even though Microsoft's statement framed the harm as user behavior.

Not quite. The core design decision the lesson highlights was enabling real-time learning from uncurated public input, which meant adversarial users could systematically train harmful outputs into the model.

5. "Diffused responsibility" in AI refers to which phenomenon?

Correct. Diffused responsibility describes the structural outcome when developers, vendors, procurers, deployers, and regulators each point to another layer, leaving no one perceiving primary culpability.

Not quite. Diffused responsibility is a governance concept: when culpability spreads across a long chain of actors, no single actor perceives themselves as primarily accountable for an outcome — even when harm is clear.

Lesson 1 Lab

Mapping the Accountability Chain

Use the AI tutor to analyse real accountability gaps and identify who should bear responsibility.

Your Task

You are an analyst advising a government committee investigating AI accountability failures. Use the chat below to explore specific cases, map accountability chains, and argue for accountability reforms. The AI tutor will challenge your reasoning and push you to be precise.

Start here: Pick one of the cases from the lesson (SyRI, Tay, or Uber/Herzberg) and walk through who you would hold accountable at each layer — design, deployment, oversight, and operational — and why. Be specific about the evidence.

Accountability Analyst Lab

Welcome to the Accountability Analyst Lab. I'm your AI tutor for this session. We're examining real cases where AI systems caused harm and accountability was diffused or absent. Choose a case — SyRI, Tay, or the 2018 Uber fatality — and let's build a detailed accountability map together. Which case interests you most, and what's your initial instinct about who bears the heaviest responsibility?

Module 5 · Lesson 2

Corporate Accountability

Tech companies that build and profit from AI systems face a fundamental question: does corporate responsibility require more than a press release?

When a product causes mass harm, can corporate culture itself be held accountable — or only individual employees?

The Facebook Oversight Board was announced in 2019, funded by a $130 million grant from Facebook itself, and designed by a company that had just been fined $5 billion by the FTC — the largest penalty in commission history — for privacy violations. The board would review individual content decisions but had no power over algorithmic design, no ability to modify the recommendation engine, and no mandate to address the structural features that the company's own internal research — later leaked as the Facebook Files in 2021 — showed were amplifying outrage and dividing communities worldwide. Frances Haugen, a former product manager, testified before Congress that Facebook's leadership chose profit over safety repeatedly, when the company's own data showed the harm. The question was not whether harm occurred. The question was: what does corporate accountability actually require — and who enforces it?

What Corporate Accountability Means

Corporate accountability in AI differs from accountability for traditional products in two important ways. First, software is updateable — a company can change a harmful system at any moment, which means ongoing harm after a problem is identified represents a deliberate ongoing choice, not a manufacturing defect frozen in a batch of physical goods. Second, AI systems generate behavioural data at scale, meaning companies often know, statistically, exactly what harm their systems are producing before regulators or the public do. This asymmetry of information is central to the accountability problem.

The FTC's $5 billion fine against Facebook in 2019, while historically large, amounted to less than one month of the company's revenue. Facebook's stock price rose after the fine was announced, because investors perceived the settlement as closing legal uncertainty cheaply. This is a classic accountability failure at the corporate level: the sanction did not change the cost-benefit calculation that drove the harmful behaviour in the first place.
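The scale comparison can be checked with simple arithmetic. The sketch below expresses the fine in months of revenue; the $5 billion figure comes from the lesson, while the annual revenue figure (about $70.7 billion, Facebook's reported 2019 revenue) is an assumption brought in for illustration, and the function name is hypothetical.

```python
FINE_USD = 5.0e9             # FTC penalty, as stated in the lesson
ANNUAL_REVENUE_USD = 70.7e9  # assumption: reported 2019 revenue, about $70.7 billion


def fine_in_months_of_revenue(fine: float, annual_revenue: float) -> float:
    """Express a fine as a number of months of average revenue."""
    monthly_revenue = annual_revenue / 12
    return fine / monthly_revenue


months = fine_in_months_of_revenue(FINE_USD, ANNUAL_REVENUE_USD)
print(f"{months:.2f} months of revenue")  # 0.85 months of revenue
```

Under that assumption the fine comes to a little under one month of revenue. Using a different year's revenue shifts the figure, so the result is best read as an order-of-magnitude check rather than a precise claim.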

Corporate Accountability: The obligation of a company, as a legal and social entity, to answer for the consequences of its products and practices, distinct from the personal accountability of individual employees.

Information Asymmetry: The structural advantage companies hold when they possess internal data about a system's harms before regulators, courts, or affected communities can access the same evidence.

The Amazon Rekognition Case

In 2018, the ACLU tested Amazon's facial recognition product Rekognition by running official photos of all 535 members of Congress through the system. It returned 28 false matches to criminal mugshots, disproportionately matching members of colour. Amazon's official response dismissed the test as methodologically flawed (the ACLU had used a confidence threshold Amazon said was too low). Amazon continued selling Rekognition to law enforcement agencies for two more years. In June 2020, following the murder of George Floyd and the subsequent nationwide protests, Amazon announced a one-year moratorium on police use of Rekognition. In 2021, the moratorium was extended indefinitely.

The sequence matters for accountability analysis. The company had evidence of racially disparate error rates in 2018. It sold the product to law enforcement for two more years, then voluntarily stopped — not because a regulator compelled it, but because reputational risk had become commercially significant. Corporate accountability in this case was triggered by market pressure, not legal obligation. That is a fragile foundation for a system affecting people's liberty.
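The dispute over the confidence threshold is easier to follow with a small example. A face-matching system reports a similarity score for each candidate pair, and the operator chooses the cut-off above which a pair counts as a match. The sketch below uses invented scores and illustrative thresholds; none of the numbers are the settings or results from the ACLU test, and `count_matches` is a hypothetical helper, not part of any real API.

```python
def count_matches(scores: list[float], threshold: float) -> int:
    """Count how many similarity scores meet or exceed the threshold."""
    return sum(1 for score in scores if score >= threshold)


# Hypothetical best similarity score for each probe photo against a mugshot
# database. Assume none of these people is actually in the database, so every
# reported match is a false match.
scores = [0.62, 0.81, 0.85, 0.79, 0.93, 0.995, 0.88, 0.70]

print(count_matches(scores, threshold=0.80))  # 5 false matches
print(count_matches(scores, threshold=0.99))  # 1 false match
```

Raising the threshold reduces false matches but does not eliminate them, and who chooses the threshold in practice (the vendor's default or the deploying agency) is itself an accountability question.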

Real Case: Boeing 737 MAX and the Software Analogy

Though not an AI case, Boeing's concealment of flaws in MCAS, the flight control software implicated in two crashes in 2018–2019 that killed 346 people, offers a structural parallel. Boeing engineers internally flagged MCAS concerns before certification. Leadership prioritised schedule. The FAA delegated certification to Boeing itself under a self-certification regime. A congressional investigation that reported in 2020 found Boeing had a "culture of concealment." One former Boeing employee, a chief technical pilot, was criminally charged and later acquitted; the company paid $2.5 billion in a deferred prosecution agreement, avoiding criminal conviction while acknowledging wrongdoing. The CEO resigned but was not prosecuted.

Board-Level and Structural Mechanisms

Corporate governance scholars argue that genuine accountability requires structural embedding, not voluntary programmes. Specifically, this means board-level AI risk committees with independent members and real authority over product decisions; mandatory disclosure of internal safety research to regulators before product launch; and personal liability for executives when internal evidence of harm was suppressed or ignored.

The EU AI Act (adopted 2024) moves in this direction by requiring high-risk AI systems to maintain technical documentation, undergo conformity assessments, and register in a public database — and assigning compliance obligations to providers (those who put the system on the market) rather than only to deployers. This attempts to lodge accountability at the design stage. Critics note that enforcement relies on national market surveillance authorities whose capacity varies enormously across member states — replicating, at the regulatory level, the same diffusion problem the Act tries to solve.

Key Insight

Corporate accountability for AI is not merely a matter of fines after the fact. It requires information transparency before harm scales, structural governance mechanisms that have real authority, and sanctions calibrated to actually change cost-benefit calculations rather than be absorbed as a line item.

Lesson 2 Quiz

Corporate Accountability

Five questions — select the best answer for each.

1. Why did Facebook's stock price rise after the FTC announced a $5 billion fine in 2019?

Correct. Investors read the settlement as resolving legal uncertainty at a cost far below what many had feared, illustrating that when sanctions don't change cost-benefit calculations, they fail as accountability mechanisms.

Not quite. The rise reflected investor relief that the fine was smaller than feared and resolved uncertainty cheaply — a key illustration of why fines that don't change corporate behaviour represent accountability failures.

2. What did the 2018 ACLU test of Amazon Rekognition most clearly reveal?

Correct. The test produced 28 false matches, disproportionately matching members of colour — evidence of racially disparate error rates that Amazon had access to while continuing to sell the product to law enforcement.

Not quite. The test returned 28 false matches with a racial disparity in who was misidentified. Amazon continued selling to law enforcement for two more years — a key example of accountability deferred until reputational risk became commercially significant.

3. What ultimately triggered Amazon's moratorium on police use of Rekognition in 2020?

Correct. Amazon acted when reputational risk became commercially significant — not because a regulator compelled it. The lesson uses this to illustrate that market-pressure accountability is fragile and insufficient for systems affecting personal liberty.

Not quite. The moratorium followed nationwide protests after George Floyd's murder, when reputational risk to Amazon became commercially significant. No regulator ordered the pause — which is precisely the accountability problem the lesson highlights.

4. The EU AI Act assigns primary compliance obligations to which party?

Correct. The EU AI Act deliberately targets providers (those who bring systems to market) in an attempt to lodge accountability at the design stage rather than only after harm reaches deployers or users.

Not quite. The EU AI Act focuses compliance obligations on providers — those who put systems on the market — specifically to push accountability upstream to the design stage. Deployers have secondary obligations.

5. Why does the lesson argue that software-based AI harm is structurally different from a manufacturing defect in a physical product?

Correct. The updateability of software is morally significant: once a company knows about a harm and can fix it, continued deployment represents an active choice to accept that harm, not a frozen manufacturing defect.

Not quite. The key difference is updateability. A software company can change a harmful system at any moment, so continued harm after identification is a deliberate ongoing choice — not an unavoidable defect like in a physical recall scenario.

Lesson 2 Lab

Corporate Accountability Pressure Test

Stress-test corporate accountability arguments using real company cases.

Your Task

You are advising a Senate committee on what structural corporate accountability requirements should be mandated for companies deploying high-risk AI systems. Use the chat to build and defend your recommendations, responding to counterarguments the tutor raises.

Start here: Propose one concrete structural mechanism — board committee, mandatory disclosure, executive liability, or another — and explain why it would close an accountability gap that current law does not. Use at least one of the cases from the lesson (Facebook, Amazon Rekognition, or Boeing) to support your argument.

Corporate Governance Lab

Welcome to the Corporate Accountability lab. You're advising a Senate committee on mandatory AI governance requirements for large tech companies. I'll play devil's advocate — questioning whether each mechanism you propose actually works, whether it's enforceable, and whether it might have unintended consequences. Start with your first proposal and the case you're using to justify it.

Module 5 · Lesson 3

Government Accountability

When states deploy AI against their own citizens, the accountability problem inverts — the body with the greatest power to hold others accountable must answer for itself.

Who holds governments accountable when AI-driven state power goes wrong?

On January 9, 2020, Robert Williams, a Black man living in suburban Detroit, was arrested in front of his wife and daughters. The Detroit Police Department had used facial recognition technology, supplied by DataWorks Plus and processed by Michigan State Police, to match a blurry surveillance image of a shoplifter to Williams's driver's license photo. The match was wrong. Williams spent about thirty hours in custody; a detective, reviewing the evidence, conceded that the identification was incorrect. He was the first documented American to be wrongfully arrested based solely on a facial recognition match.

Williams later told reporters: "I wasn't hiding. I wasn't a fugitive. They came to my home." The City of Detroit and Michigan State Police faced no criminal sanction. The ACLU filed a civil complaint on Williams's behalf. A settlement reached in 2024 required Detroit police to change their policies, but did not establish personal liability for any detective or official who authorised the arrest. The accountability mechanism was a policy reform, not a consequence.

State AI Use and the Accountability Inversion

Private companies face external accountability from regulators, courts, and markets. Governments face a different structure: they are both the user of AI and the primary enforcement authority. When a state deploys a harmful AI system, the body with greatest power to sanction wrongdoing is the same body that committed it. This creates what scholars call the accountability inversion — the principal that should enforce accountability is the party that should face accountability.

This is not a new problem — it predates AI in contexts like police brutality or prosecutorial misconduct — but AI amplifies it in two ways. First, algorithmic tools allow anonymous delegation: no individual detective decides to surveil a neighbourhood; the system flags it, and humans comply. The human is depersonalised in the decision chain. Second, algorithmic tools provide legitimising cover: "the computer said so" becomes a shield against scrutiny that "my informant said so" would not provide, because most observers assume computers are neutral.

Accountability Inversion: The structural situation in which the body best positioned to enforce accountability (the state) is also the party whose conduct needs to be held accountable, requiring external checks specifically designed for this paradox.

Legitimising Cover: The phenomenon where algorithmic outputs are assumed to be objective or neutral, shielding human decision-makers from scrutiny they would face if the same decision had an explicit human author.

The UK Horizon Scandal and Automated Prosecution

Between 1999 and 2015, the UK Post Office prosecuted over 700 subpostmasters for fraud, theft, and false accounting. The prosecutions were based almost entirely on data from Horizon, an IT system supplied by Fujitsu, which showed cash shortfalls that prosecutors argued the subpostmasters had stolen. In reality, Horizon contained bugs that generated false shortfall figures. The Post Office knew about the bugs; Fujitsu engineers documented them; internal communications showed senior executives were aware of unreliability. Prosecutions continued for sixteen years.

By 2024, over 900 convictions had been overturned or were under review, making this the largest miscarriage of justice in British legal history. A public inquiry, chaired by Sir Wyn Williams, concluded in 2024 that the Post Office had been "an institution that had lost its moral compass." Four Post Office executives were referred to prosecutors. The government announced legislation to automatically overturn convictions en masse, an extraordinary parliamentary intervention in the judicial process, required because the normal appeals mechanism could not handle the scale.

The Horizon case illustrates that government accountability failures at scale require not just individual prosecutions but systemic remedies — structural changes to how automated evidence is treated in prosecutions, who bears the burden of proof when a system's reliability is challenged, and what oversight applies to algorithmic prosecution tools.

Real Case: Xinjiang Mass Surveillance

China's Integrated Joint Operations Platform (IJOP) in Xinjiang uses AI-driven surveillance — facial recognition, gait analysis, phone data, social graphs — to flag Uyghur Muslims for detention. By 2019, over one million people had been detained in what Chinese authorities called "vocational education centres." The accountability question here is global: who holds a sovereign government accountable when the AI it deploys against its own citizens is built with components sourced from international markets — including US firms that supplied chips and software? Congressional investigations found that American companies including Intel and Nvidia had supplied technology used in Xinjiang surveillance infrastructure, raising cross-border accountability questions that no single jurisdiction can resolve.

Mechanisms for Holding States Accountable

Scholars identify four mechanisms that can check state AI use. Judicial review: courts can strike down unlawful AI deployments, as happened with SyRI, and can require disclosure of algorithmic tools used in criminal cases. Legislative oversight: parliaments can mandate registers of government AI systems, require impact assessments, and restrict categories of use. Civil society litigation: NGOs can bring strategic cases that establish rights-based limits, as the ACLU did with Williams. International human rights law: treaty bodies can issue non-binding findings that create reputational and diplomatic pressure, even absent enforcement.

Each mechanism has limits. Courts are slow and reactive. Legislators often lack technical expertise. Litigation requires individual victims willing to come forward. International pressure is ineffective against states that do not value the reputational cost. No single mechanism is sufficient — robust government accountability requires all four, and they rarely function simultaneously at the scale a systemic problem demands.

Core Principle

Government accountability for AI is structurally more difficult than corporate accountability precisely because states possess both the power to harm and the institutional authority to investigate that harm. Meaningful accountability requires external mechanisms — courts, legislatures, civil society, and international law — operating independently and in parallel.

Lesson 3 Quiz

Government Accountability

Five questions — select the best answer for each.

1. Robert Williams was wrongfully arrested in Detroit in 2020 because of which specific failure?

Correct. Williams was the first documented American wrongfully arrested solely on a facial recognition match — a blurry surveillance image incorrectly matched to his driver's license photo by DataWorks Plus software processed through Michigan State Police.

Not quite. Williams was arrested based on a facial recognition match that was wrong — a blurry store surveillance image was incorrectly matched to his driver's license by DataWorks Plus technology, with no corroborating evidence.

2. The "accountability inversion" in government AI use refers to what structural problem?

Correct. The accountability inversion is the paradox that the state — which is both the deployer of harmful AI and the primary enforcement authority — must be checked by external mechanisms precisely because it cannot effectively police itself.

Not quite. The accountability inversion describes the paradox where the party that should enforce accountability (the state) is the same party whose conduct requires scrutiny — making self-regulation structurally inadequate.

3. The UK Horizon scandal resulted in what historic outcome by 2024?

Correct. Over 900 convictions were overturned or under review, a scale so extraordinary that parliament passed legislation to quash them en masse, bypassing the normal appeals system, which could not handle the volume.

Not quite. Over 900 wrongful convictions were overturned or under review, making Horizon the largest miscarriage of justice in British legal history. Parliament passed special legislation to handle the scale because normal appeals couldn't cope.

4. What makes algorithmic evidence provide "legitimising cover" for human decision-makers in criminal justice contexts?

Correct. The false perception of computer neutrality gives algorithmic decisions a credibility that human judgment calls don't receive — shielding the human who acted on the algorithm from accountability they would face if the decision were explicitly theirs.

Not quite. The concept of legitimising cover describes how algorithmic outputs are assumed objective, giving decision-makers shelter from accountability because "the computer said so" carries an unwarranted aura of neutrality and certainty.

5. Which of the following best describes why no single external accountability mechanism is sufficient to check state AI misuse?

Correct. Each mechanism — judicial, legislative, civil society, international — compensates for the others' specific weaknesses. No single mechanism is sufficient because each has structural limits in speed, expertise, reach, or enforcement power.

Not quite. The lesson argues that courts, legislation, civil society litigation, and international law each have specific structural limitations — slowness, expertise gaps, needing individual plaintiffs, and lacking hard enforcement. All four must operate in parallel.

Lesson 3 Lab

State AI Accountability Simulation

Analyse government AI deployment cases and argue for accountability reforms.

Your Task

You are a human rights lawyer preparing to argue before a parliamentary select committee why the government's use of facial recognition in public spaces requires independent oversight. Use the chat to develop your arguments, respond to government counterarguments, and propose specific oversight mechanisms.

Start here: The government argues that facial recognition in public spaces is simply a modern version of a detective studying CCTV footage — no different in kind, only in scale and speed, and therefore no new oversight is required. Respond to this argument and explain what specific oversight mechanism you would propose.

State Accountability Lab

Welcome to the State AI Accountability lab. I'll play the role of the government's legal counsel defending current police use of live facial recognition in public spaces. You're the human rights lawyer arguing for independent oversight. The government's opening position: facial recognition is simply scaled-up human pattern recognition — it doesn't require new oversight frameworks any more than binoculars do. Your turn to respond. Be specific: which cases, which rights, which proposed mechanisms?

Module 5 · Lesson 4

Individual and Collective Accountability

Responsibility doesn't disappear because an organisation is involved. Every harmful AI system was built by specific people who made specific choices — and collective problems can still require individual answers.

When AI causes harm, what distinguishes personal moral responsibility from systemic failure — and can it ever be both?

When Frances Haugen joined Facebook in 2019, she had already worked at Google and Pinterest. She specifically requested assignment to civic integrity — the team working on election misinformation. In 2021, before leaving, she copied tens of thousands of internal documents and handed them to the Wall Street Journal and then to Congress. The Facebook Files she leaked showed the company's own research teams had repeatedly concluded that Instagram harmed teenage girls' body image, that the algorithm was amplifying outrage and extremism, and that executives including Mark Zuckerberg had been briefed on findings and had not acted to change the products.

Haugen's testimony before the Senate Commerce Committee in October 2021 crystallised a question that ethics scholars had debated for years: at what point does an employee's knowledge of harm create a personal moral obligation to act — even if acting means violating employment contracts, legal agreements, and professional norms?

Individual Responsibility Within Systems

Large organisations diffuse responsibility through hierarchy, specialisation, and incremental contribution. An engineer who writes a recommendation algorithm component does not decide what the product optimises for. A data labeller who tags emotional content does not decide how that data is used in training. A product manager who launches a feature does not control whether it is eventually misused. This fragmentation is not morally neutral — it is a design feature that has the effect, if not always the intention, of making individual responsibility feel impossible to assign.

Moral philosophers distinguish between causal responsibility (your actions caused the outcome), capacity responsibility (you had the power to prevent it), and role responsibility (your position created an obligation). An executive who is briefed on internal harm data and chooses not to act may bear all three simultaneously. A junior engineer who flags a concern through proper channels and is overruled bears causal responsibility but arguably not full moral culpability, depending on whether they had further options available and what the cost of taking them would have been.

Causal Responsibility: Responsibility grounded in the fact that your specific actions or omissions were part of the causal chain that produced a harmful outcome.

Capacity Responsibility: Responsibility grounded in having had the power, resources, or information to prevent a harm, regardless of whether you were directly in the causal chain.

Role Responsibility: Responsibility grounded in the obligations that attach to a specific professional position (a safety officer, a board member, a chief medical officer), regardless of personal culpability for individual decisions.

The Google Engineers and Project Maven

In 2018, Google held a contract with the US Department of Defense for Project Maven, which used Google's AI tools to analyse drone footage for the military. When the contract became known internally, over 4,000 Google employees signed an open letter to CEO Sundar Pichai calling for withdrawal from it. About a dozen employees resigned in protest. Within months, Google announced it would not renew the contract and subsequently published AI Principles that excluded weapons applications.

The episode illustrated three things about individual accountability within organisations. First, collective action — individuals acting in concert — can produce accountability outcomes that isolated individual action cannot. Second, resignation as protest is a real but high-cost mechanism that is not equally available to employees in different economic circumstances. Third, the outcome depended critically on organisational culture and market context: Google was a talent-competitive employer in an era of high employee leverage, so internal dissent had unusual power. In other contexts — a government contractor with few alternative employers, a company without a public brand sensitive to employee values — the same individual actions would have produced no accountability outcome.

Real Case: OpenAI's Board Firing — November 2023

In November 2023, OpenAI's board abruptly fired CEO Sam Altman, citing concerns about his candour with the board. Within 96 hours, nearly all of OpenAI's roughly 770 employees signed a letter threatening to quit unless the board resigned and Altman was reinstated. He was reinstated within five days, and most of the directors who had voted to remove him left the board. The episode illustrated both the power and the limits of collective employee action as an accountability mechanism: employees successfully defended their preferred leadership, but the board, the body that is supposed to review a CEO's fitness, was overridden by mass employee pressure motivated partly by equity value. Who was accountable to whom became deeply unclear.

Professional Ethics and Whistleblower Protections

Professional codes of ethics in engineering, medicine, and law create explicit individual obligations that exist independently of employer instruction. The ACM Code of Ethics, which applies to computing professionals, states that members must "avoid harm" and "be honest and trustworthy," and that when following employer instructions conflicts with avoiding harm, the professional obligation to avoid harm takes precedence. This is a form of role responsibility that cuts against the "just following orders" defence.

In practice, acting on an individual obligation by disclosing harm is often realistic only for those protected by whistleblower statutes. In the US, the Dodd-Frank Act's whistleblower provisions apply to securities violations; the False Claims Act covers fraud against the government. Neither specifically covers AI harms to the public. Frances Haugen was able to provide documents to Congress with legal protection because Facebook was a public company and the documents were relevant to investor disclosure obligations, which is a narrow and contingent pathway. A whistleblower at a private AI company with no securities obligations would have far weaker protections.

Closing the individual accountability gap therefore also requires expanding whistleblower protections specifically for AI safety disclosures, a reform put forward in multiple congressional proposals as of 2024 but not yet enacted.

Synthesis

Individual and collective accountability are not alternatives to systemic reform — they are complements. Systemic reforms create the structures; individual accountability ensures that specific people bear consequences when those structures are violated. Both are necessary. Neither alone is sufficient.

Lesson 4 Quiz

Individual and Collective Accountability

Five questions — select the best answer for each.

1. Frances Haugen's testimony was legally protected because Facebook was a public company. What does this tell us about whistleblower protections for AI harms more broadly?

Correct. Haugen's pathway depended on Facebook being a public company with securities disclosure obligations. A whistleblower at a private AI company would have far weaker — possibly no applicable — statutory protection for equivalent disclosures.

Not quite. The lesson highlights this as a gap: Haugen's protection was tied to securities law applicable to public companies. Employees at private AI firms lack comparable protection, which is why specific AI safety whistleblower statutes have been proposed.

2. In the Project Maven episode at Google, what factor most explains why employee collective action succeeded in changing company policy?

Correct. The lesson emphasises this was context-dependent: Google's brand and talent competition gave employees unusual leverage. The same collective action at a government contractor with few alternative employers might have produced no outcome.

Not quite. The lesson stresses the importance of organisational context: Google depended heavily on top engineering talent and had a public brand sensitive to employee values. This made internal dissent commercially powerful in a way it would not be in other contexts.

3. An executive who is briefed on internal harm data and chooses not to act bears which type(s) of responsibility, according to the distinctions in this lesson?

Correct. The lesson explicitly states an executive briefed on harm data who does not act "may bear all three simultaneously" — their inaction is causally relevant, they had the capacity to prevent harm, and their role creates explicit obligations.

Not quite. The lesson says an executive briefed on harm data who does not act may bear all three: causal (inaction was part of the causal chain), capacity (they had the power to prevent harm), and role (their position created explicit obligations).

4. The ACM Code of Ethics states that when conflicts arise between employer instructions and avoiding harm, which takes precedence for computing professionals?

Correct. The ACM Code explicitly prioritises avoiding harm over employer instruction — a form of role responsibility that cuts against "just following orders" as a defence for harmful AI development decisions.

Not quite. The ACM Code of Ethics is explicit: when employer instructions conflict with the obligation to avoid harm, avoiding harm takes precedence. This creates a professional accountability obligation independent of what employers instruct.

5. The November 2023 OpenAI board crisis illustrated which specific accountability tension?

Correct. The episode showed collective employee action overriding the governance mechanism (the board) designed to hold leadership accountable — while employee motivations mixed genuine concern with equity value, muddying whether accountability had been served or subverted.

Not quite. The lesson highlights the confusion: employees successfully defended their preferred CEO, but this overrode the board — the body specifically designed to review CEO fitness. With motivations mixing principle and equity value, it became unclear whether accountability had been exercised or undermined.

Lesson 4 Lab

Personal Accountability Scenarios

Work through real-world dilemmas about individual responsibility in AI development.

Your Task

You are a mid-level machine learning engineer at a company that deploys an AI hiring tool. Internal testing you have conducted shows the tool systematically disadvantages applicants from certain demographic groups. Your manager has seen the results and told you to focus on other priorities. Use the chat to work through your ethical obligations, legal options, and practical decisions.

Start here: Using the three types of responsibility from the lesson — causal, capacity, and role — analyse your own accountability in this situation. Then identify what you believe you are morally obligated to do next, and what the obstacles to doing it are.

Personal Accountability Lab

Welcome to the Personal Accountability Lab. You're a mid-level ML engineer who has documented that your company's AI hiring tool discriminates against applicants from certain demographic groups. Your manager has seen your findings and deprioritised them. I'll work through this with you, asking challenging questions about your moral obligations, the practical limits of individual action, and the difference between what's ethical and what's legally required. Start with your accountability analysis using causal, capacity, and role responsibility — then tell me what you think you're obligated to do.

Module 5

Module Test: Accountability

15 questions — you need 80% or above to pass. Select the best answer for each.

1. What is the "accountability gap" in AI systems?

Correct. The accountability gap is the structural situation where multiple human actors contributed to a harmful AI outcome but no single party is clearly identified, answerable, and legally sanctionable.

Not quite. The accountability gap refers specifically to the structural absence of a clearly identified, legally enforceable, personally answerable party — even when many actors contributed to a harmful system.

2. The three components required for accountability — answerability, enforceability, and traceability — were all absent in which case?

Correct. SyRI was a black box (no answerability), liability was diffused across procurement chains (no enforceability), and the algorithm's internal logic was never published (no traceability).

Not quite. The SyRI case is the lesson's primary example of all three components failing: no one could explain the decisions (answerability), no contract assigned liability (enforceability), and the algorithm's logic was never published (traceability).

3. In the 2018 Uber fatality, the NTSB identified multiple causes of the crash. Which design decision made by Uber is specifically highlighted as a key accountability failure?

Correct. Uber had disabled the car's emergency braking — a deliberate design decision made by humans, not the safety driver — making the prosecution of the driver particularly illustrative of how accountability flows to the lowest layer.

Not quite. The lesson specifically identifies Uber's decision to disable emergency braking as a key design accountability failure — made by engineers and product leaders, not by the safety driver who was ultimately charged.

4. Information asymmetry in corporate AI accountability refers to what specific advantage?

Correct. AI systems generate behavioural data at scale, meaning companies often know statistically what harm their systems produce long before regulators or the public have access to the same evidence — a structural accountability advantage.

Not quite. Information asymmetry here refers specifically to companies' ability to observe harms through their own data before those harms become visible to anyone outside — giving them a window to act (or not act) without external scrutiny.

5. The Facebook Oversight Board, established in 2019, was criticised primarily because of which limitation?

Correct. The board's mandate excluded the system-level decisions — recommendation algorithms, optimisation objectives — that Facebook's own internal research identified as the drivers of harm. It could adjudicate individual posts, not architectural choices.

Not quite. The board's core limitation was scope: it reviewed individual content decisions while having no mandate to examine the algorithmic design choices that Facebook's own leaked research showed were causing systemic harm.

6. Robert Williams's wrongful arrest in Detroit in 2020 ultimately resulted in what accountability outcome?

Correct. The ACLU civil complaint resulted in a policy reform settlement — no individual faced personal legal consequence, illustrating that the accountability mechanism was institutional rather than personal.

Not quite. The outcome was a policy reform settlement through ACLU civil litigation — no individual detective or official faced personal legal consequence, which the lesson uses to show the limits of policy-level accountability without personal sanctions.

7. The UK Horizon Post Office scandal is relevant to AI accountability because it demonstrates which principle?

Correct. Over 900 wrongful convictions resulted from systematic reliance on buggy automated evidence — on a scale that required extraordinary parliamentary intervention because normal appeals processes could not handle the volume.

Not quite. The Horizon case demonstrates the scale problem: systemic miscarriages of justice from automated evidence errors can accumulate over decades and overwhelm normal accountability mechanisms, requiring extraordinary systemic remedies.

8. Which of the following best describes "legitimising cover" in the context of government AI use?

Correct. "The computer said so" functions as a shield in ways "my informant said so" does not — because most observers assume algorithmic outputs are neutral, when in fact they encode the decisions and biases of human designers.

Not quite. Legitimising cover describes how the assumed objectivity of algorithmic systems gives human decision-makers shelter from accountability that explicitly human decisions would not receive. It is a cultural and epistemic phenomenon, not a legal one.

9. The four external mechanisms for checking state AI accountability — judicial review, legislative oversight, civil society litigation, and international human rights law — are described as insufficient individually because:

Correct. Courts are slow, legislators lack expertise, litigation needs individual plaintiffs, and international pressure lacks hard enforcement — each limitation is specific, and addressing it requires the other mechanisms operating simultaneously.

Not quite. The lesson analyses each mechanism's specific limits: courts are slow; legislators often lack technical expertise; litigation requires willing victims; international pressure lacks enforcement against non-cooperative states. No single mechanism covers all these gaps.

10. An engineer who flags a concern through proper internal channels and is overruled bears which type of responsibility, according to the lesson's framework?

Correct. The lesson states the engineer bears causal responsibility (part of the causal chain) but that moral culpability depends on whether further options existed and what cost taking them would impose — a nuanced position that acknowledges structural constraints on individual action.

Not quite. The lesson identifies causal responsibility (the engineer's work is in the causal chain) while acknowledging that moral culpability is diminished when proper channels were used and overruled — depending on whether further options were available and at what personal cost.

11. What distinguishes "capacity responsibility" from "causal responsibility" in the lesson's framework?

Correct. The distinction matters because an executive who did not author any harmful line of code may still bear capacity responsibility — they had power, resources, and information to prevent harm and chose not to use them.

Not quite. Causal responsibility is about being in the causal chain (your actions produced the harm); capacity responsibility is about having had the power to prevent it regardless of whether you were in the chain. They are distinct and can both apply to the same actor.

12. The EU AI Act's attempt to solve the accountability problem by assigning compliance obligations to "providers" faces which specific criticism?

Correct. Uneven enforcement capacity across EU member states means a company might face rigorous oversight in one country and minimal oversight in another — replicating, at the regulatory level, the same diffusion problem the Act is designed to solve.

Not quite. The lesson's specific criticism of the EU AI Act is that enforcement depends on national authorities whose capacity varies enormously — meaning the accountability the Act creates on paper may diffuse in practice across member states.

13. The principal-agent problem in AI accountability is best illustrated by which of the following?

Correct. The lesson uses the principal-agent framework to explain how layered procurement — deployer → vendor → subcontractor → open-source maintainer — means responsibility does not automatically flow back to principals, and must be deliberately engineered in.

Not quite. The principal-agent problem the lesson discusses is about layered organisational structures: each layer of vendor/subcontractor relationships inserts a gap between instruction and outcome, and responsibility does not automatically flow back without deliberate design.

14. Google's decision not to renew the Project Maven contract demonstrates which lesson about collective accountability?

Correct. The same collective action — signing a letter, resigning — would likely not change outcomes at a government contractor with few alternative employers or a company whose brand is not sensitive to employee values. Context determines effectiveness.

Not quite. The lesson explicitly warns against over-generalising the Project Maven outcome: it worked at Google because of specific market conditions (talent competition, brand sensitivity) that do not apply universally. Context is determinative.

15. Which statement best captures the module's overall argument about individual versus systemic accountability in AI?

Correct. The module's synthesis position is explicitly that both are necessary and neither sufficient alone — systemic structures define obligations, and individual consequences ensure those structures are not merely nominal.

Not quite. The module's conclusion is deliberately complementarist: systemic reforms create the rules and structures; individual accountability ensures specific people bear consequences when those structures fail. Relying on either one alone leaves the other ineffective.