At 9:58 p.m. on March 18, 2018, on a darkened stretch of North Mill Avenue in Tempe, Arizona, a Volvo XC90 operated by Uber's Advanced Technologies Group struck and killed Elaine Herzberg as she walked her bicycle across the road. The vehicle's LIDAR detected her 5.6 seconds before impact. Its software classified her, in sequence, as an unknown object, then a vehicle, then a bicycle, never stabilizing on "pedestrian." An automatic emergency braking system had been disabled by Uber engineers to reduce what they called "erratic behavior." A human safety operator was inside the vehicle, eyes on a phone. The system did not alert her in time. Nobody pressed the brakes before impact.
The National Transportation Safety Board's 2019 report found that the system had no contingency for objects outside its anticipated categories. The "autonomous" vehicle could act — but it could not reason about what it did not expect. That gap, between the ability to act and the capacity to reason responsibly, is where this module begins.
Autonomy in machines exists on a continuum. At the lowest end sits simple automation: a thermostat that turns heat on when temperature drops below a threshold. It executes a fixed rule with no perception of context. At the highest end sits a hypothetical moral agent: a system that perceives its environment, evaluates options against internalized ethical principles, acts, and reflects on the consequences of its action.
Most AI systems deployed today occupy the middle ranges. The Society of Automotive Engineers (SAE) formalized this for vehicles with its J3016 standard, defining six levels of driving automation from Level 0 (no automation) to Level 5 (full automation under all conditions). The Uber ATG vehicle was nominally operating at SAE Level 3, where the system handles driving tasks but the human must be ready to intervene. The tragedy in Tempe revealed that Level 3 creates a dangerous ambiguity: the machine acts with authority, the human retains responsibility, but the handoff between them is unreliable.
Philosophers distinguish between behavioral autonomy (the system acts without moment-to-moment human direction) and moral autonomy (the system bears genuine responsibility for its choices). Contemporary AI systems achieve increasing degrees of behavioral autonomy while possessing zero moral autonomy in any philosophically defensible sense. They cannot be punished, feel remorse, or update their values through lived experience.
A system can be behaviorally autonomous — acting without human oversight in the moment — while still being morally heteronomous, meaning its values and constraints were entirely set by designers. The system's "choices" are always traceable back to human decisions made earlier in its design process.
Intentional Stance (Dennett, 1987): Daniel Dennett argued that we can usefully treat any system — thermostat, chess engine, or person — as if it has beliefs, desires, and intentions, as a predictive shortcut. This does not mean the system actually has those mental states. When we say a chess engine "wants to protect its queen," we are using the intentional stance to predict behavior, not asserting inner experience.
Degrees of Moral Patiency vs. Agency: Luciano Floridi and J.W. Sanders proposed in their 2004 paper "On the Morality of Artificial Agents" that moral agency need not be binary. They suggested AI systems can be evaluated on three criteria: interactivity (does the system respond to its environment?), autonomy (does it act without direct human control?), and adaptability (does it modify its behavior based on experience?). By these criteria, many current AI systems qualify as low-degree moral agents — not because they have inner lives, but because their behaviors have moral consequences without continuous human direction.
Responsibility Gap (Matthias, 2004): Andreas Matthias identified what he called the "responsibility gap" — the fact that as machine learning systems become more autonomous and less predictable, there may be harms for which no human individual is clearly responsible. The programmer could not have foreseen the specific decision. The operator did not make it. The manufacturer disclaimed it contractually. The gap is not a technical failure. It is a structural feature of how autonomous systems distribute — and sometimes dissolve — accountability.
On August 1, 2012, Knight Capital Group's trading algorithm executed 4 million trades in 45 minutes after engineers accidentally deployed old test code. The system lost $440 million, more than Knight's net income for the prior year. No single human made any of those 4 million decisions. The responsibility gap was financial, immediate, and total. Knight needed an emergency rescue from outside investors within days and was acquired by a competitor within a year.
Autonomous systems are making or substantially influencing decisions in at least six domains where those decisions carry serious moral weight: lethal military targeting, criminal sentencing, medical diagnosis, child welfare risk scoring, autonomous vehicle navigation, and financial credit allocation. In each domain, the speed and scale at which these systems operate make meaningful human review of individual decisions practically impossible. We are building the infrastructure of distributed moral action faster than we are building the philosophical and legal frameworks to govern it.
The lessons in this module trace four dimensions of that challenge: what autonomy means (this lesson), how autonomous systems fail and who bears responsibility (Lesson 2), how we design meaningful human oversight (Lesson 3), and how emerging international frameworks attempt to regulate systems that no single jurisdiction fully controls (Lesson 4).
In this lab you will engage with an AI tutor to analyze real autonomous systems and classify their position on the autonomy spectrum. For each system, consider both behavioral autonomy (how much the system acts without human direction) and moral autonomy (whether it can bear responsibility).
Explore at least three different systems — such as a spam filter, a credit scoring algorithm, a self-driving car, or a lethal autonomous weapon system — and discuss where each sits on the spectrum and why the distinction matters ethically.
In 2016, ProPublica published an analysis of the COMPAS recidivism risk-scoring algorithm used in Broward County, Florida to inform bail, sentencing, and parole decisions. Examining more than 7,000 individuals arrested in 2013 and 2014, ProPublica found that Black defendants who did not reoffend were nearly twice as likely as white defendants to have been falsely flagged as high-risk, and that white defendants who did go on to reoffend were more likely to have been incorrectly flagged as low-risk.
Northpointe, the company that built COMPAS, responded that the algorithm was equally accurate across racial groups as measured by a different metric: calibration. Both claims were mathematically correct. But satisfying one fairness criterion while violating another is not a technical accident — it is a reflection of a structural impossibility. When base rates of arrest differ across groups, you cannot simultaneously achieve equal false positive rates, equal false negative rates, and equal calibration. The algorithm did not create this tension. American criminal justice history did. The algorithm encoded it and applied it at scale.
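The tension between the two claims can be made concrete with a small calculation. The sketch below uses invented confusion-matrix counts for two hypothetical groups (they are not the Broward County figures) and treats positive predictive value at a single threshold as a simplified stand-in for calibration. The class and group names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one group (hypothetical numbers)."""
    tp: int  # flagged high-risk, did reoffend
    fp: int  # flagged high-risk, did not reoffend
    fn: int  # flagged low-risk, did reoffend
    tn: int  # flagged low-risk, did not reoffend

    @property
    def base_rate(self) -> float:
        return (self.tp + self.fn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def ppv(self) -> float:
        # Share of high-risk flags that were correct.
        return self.tp / (self.tp + self.fp)

    @property
    def fpr(self) -> float:
        # Share of non-reoffenders who were wrongly flagged.
        return self.fp / (self.fp + self.tn)

    @property
    def fnr(self) -> float:
        # Share of reoffenders who were wrongly cleared.
        return self.fn / (self.fn + self.tp)


# Invented counts: the groups differ only in how common the outcome is.
GROUPS = {
    "group_a": Confusion(tp=300, fp=200, fn=200, tn=300),
    "group_b": Confusion(tp=180, fp=120, fn=120, tn=580),
}

for name, c in GROUPS.items():
    print(f"{name}: base_rate={c.base_rate:.2f} ppv={c.ppv:.2f} "
          f"fnr={c.fnr:.2f} fpr={c.fpr:.3f}")
```

In this invented data a high-risk flag is equally trustworthy in both groups, and reoffenders are missed at the same rate, yet non-reoffenders in the higher-base-rate group are wrongly flagged more than twice as often. Both "the tool is equally accurate" and "the tool is biased" describe the same table.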
Autonomous system failures cluster into four broad categories, each with distinct accountability implications:
1. Specification Failures: The system optimizes for the wrong objective. Amazon's internal AI recruiting tool, whose abandonment Reuters reported in 2018, was trained on a decade of résumés from a male-dominated industry. It learned to penalize résumés containing the word "women's" (as in "women's chess club") and to downgrade graduates of two all-women's colleges. No engineer specified "discriminate by gender." The objective function simply reflected the historical data, and the historical data reflected historical hiring bias. Garbage in, injustice out.
2. Distribution Shift Failures: The system encounters conditions outside its training distribution. The Uber ATG vehicle had not been trained on pedestrians who do not cross at crosswalks or whose classification oscillates between categories. Healthcare AI trained on data from large academic medical centers may fail when deployed in rural community hospitals with different patient demographics and documentation practices.
3. Adversarial Failures: Deliberate manipulation by external actors. In 2020, researchers at McAfee demonstrated that adding a small strip of tape to a speed limit sign could cause a Tesla Model S to misread a 35 mph sign as 85 mph. The physical world can be modified to fool AI perception systems in ways that human observers barely notice.
4. Emergent Failures from System Interaction: Individual systems behave as designed, but their interaction produces unanticipated and harmful outcomes. The 2010 "Flash Crash," in which the Dow Jones Industrial Average fell nearly 1,000 points in minutes before partially recovering, was caused by automated trading systems reacting to each other's outputs in a feedback loop that no individual system's designers had modeled.
Alexandra Chouldechova formally proved that when prevalence of an outcome differs across groups, a classifier cannot simultaneously achieve: (1) equal false positive rates, (2) equal false negative rates, and (3) equal calibration. Any choice among these fairness criteria is a moral and political choice, not a technical one. Embedding COMPAS in criminal justice without acknowledging this is not technical neutrality — it is moral abdication disguised as objectivity.
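The result follows from an identity that ties the false positive rate to prevalence p, positive predictive value (PPV), and false negative rate (FNR): FPR = p / (1 - p) × (1 - PPV) / PPV × (1 - FNR). The sketch below evaluates that identity for a few prevalence values, with PPV used as a single-threshold proxy for calibration. The function name and the input values are illustrative assumptions, not figures from any real system.

```python
def implied_fpr(prevalence: float, ppv: float, fnr: float) -> float:
    """False positive rate forced by prevalence, PPV, and FNR.

    Uses FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR), which holds for any
    binary classifier's confusion matrix.
    """
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 < ppv <= 1.0:
        raise ValueError("ppv must be in (0, 1]")
    if not 0.0 <= fnr <= 1.0:
        raise ValueError("fnr must be in [0, 1]")
    odds = prevalence / (1.0 - prevalence)
    return odds * (1.0 - ppv) / ppv * (1.0 - fnr)


# Hold PPV and FNR equal across groups and vary only the prevalence.
PPV, FNR = 0.7, 0.3  # hypothetical values
for p in (0.2, 0.4, 0.6):
    print(f"prevalence={p:.1f} -> implied FPR={implied_fpr(p, PPV, FNR):.3f}")
```

Once PPV and FNR are held equal, prevalence alone determines the false positive rate, so groups with different prevalence must have different false positive rates. The only exception is a perfect predictor (PPV of 1), for which the false positive rate is zero everywhere.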
When an autonomous system causes harm, accountability is distributed across a chain of actors, each of whom made decisions that contributed to the outcome:
Data collectors determined which data to gather, how to label it, and whose experience would be represented. Algorithm designers chose the objective function, the model architecture, and the training methodology. Deployers decided the context of use, the population affected, and the degree of human oversight retained. Regulators established — or failed to establish — requirements for testing, transparency, and accountability. Users placed trust in outputs without necessarily understanding their limitations.
The legal scholar Frank Pasquale has noted that in many of these chains, no single actor made a decision they would recognize as "I am choosing to harm this person." Each made a locally reasonable choice. The harm is an emergent property of the chain, not the intent of any individual link.
The Maneuvering Characteristics Augmentation System (MCAS) on the Boeing 737 MAX caused two fatal crashes — Lion Air Flight 610 (October 2018, 189 dead) and Ethiopian Airlines Flight 302 (March 2019, 157 dead) — by repeatedly pushing the nose down based on faulty angle-of-attack sensor data. The system could override pilot inputs. Pilots had not been informed it existed. A 2020 House Transportation Committee report found that Boeing and the FAA had traded independence for efficiency, with the FAA delegating safety certification to Boeing itself. The responsibility chain ran from sensor design through software architecture through certification delegation through pilot training decisions. Every actor had authorized their specific decision. No one had owned the whole.
Several frameworks have emerged to address distributed accountability. The EU AI Act (2024) introduces risk-tiered obligations: high-risk AI systems, including those used in criminal justice, employment, and critical infrastructure, must maintain human oversight, be explainable to affected parties, and undergo conformity assessments before deployment. The Act divides obligations between providers, who develop a system or place it on the market, and deployers, who put it to use; a deployer carries its own oversight duties regardless of whether it built the system.
The IEEE's Ethically Aligned Design framework recommends that autonomous systems be designed with traceability: the ability to reconstruct which decision pathways led to a specific output, and which humans authorized those pathways at design time. Without traceability, post-hoc accountability is impossible — you can identify that harm occurred without being able to identify where in the design chain it originated.
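One way to picture traceability is as a record that links each step in a decision pathway to the person who approved it at design time. The sketch below is a minimal illustration under that assumption; the class names, field names, and example values are hypothetical and are not drawn from the IEEE framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Authorization:
    """Design-time sign-off for one component (hypothetical schema)."""
    component: str
    approved_by: str
    approved_on: str  # ISO date string


@dataclass(frozen=True)
class DecisionRecord:
    """One system output plus the components that produced it."""
    decision_id: str
    output: str
    pathway: tuple[str, ...]


def trace(record: DecisionRecord,
          authorizations: list[Authorization]
          ) -> list[tuple[str, Optional[Authorization]]]:
    """Pair each pathway step with its sign-off, or None if missing."""
    by_component = {a.component: a for a in authorizations}
    return [(step, by_component.get(step)) for step in record.pathway]


def untraceable_steps(record: DecisionRecord,
                      authorizations: list[Authorization]) -> list[str]:
    """Steps that contributed to the output but have no recorded approver."""
    return [step for step, auth in trace(record, authorizations) if auth is None]


# Hypothetical example: one step in the pathway was never signed off.
signoffs = [
    Authorization("feature_extraction_v2", "data lead", "2023-01-10"),
    Authorization("risk_model_v5", "modeling lead", "2023-02-01"),
]
record = DecisionRecord(
    decision_id="case-001",
    output="high_risk",
    pathway=("feature_extraction_v2", "risk_model_v5", "threshold_rule_v1"),
)
print(untraceable_steps(record, signoffs))
```

A non-empty result marks the point where accountability can no longer be reconstructed: the step shaped the output, but the record does not show who authorized it.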
Work with the AI tutor to dissect at least two real autonomous system failures. For each case, identify: (1) what type of failure it represents, (2) which actors in the design chain made consequential decisions, and (3) which fairness criteria were violated and which were preserved.
Consider cases like COMPAS, the Boeing 737 MAX MCAS, Amazon's recruiting tool, or the 2010 Flash Crash. You may also introduce a case you are familiar with.
The USS Vincennes was equipped with the Aegis Combat System, one of the most sophisticated autonomous tracking and weapons systems of its era. On July 3, 1988, Aegis tracked Iran Air Flight 655, a civilian Airbus A300, as it ascended on a scheduled commercial route. The system correctly identified the aircraft's Mode III transponder — the civilian code — but operators in the combat information center, operating under stress during a simultaneous naval engagement, misread or misreported its altitude data. They believed the aircraft was descending toward the ship in an attack profile.
Captain Will Rogers III ordered the aircraft shot down. All 290 passengers and crew died. The human was technically "in the loop": no automated system fired on its own, and he gave the verbal order. But the information environment that surrounded him had been shaped by Aegis's classification outputs, by the stress of simultaneous combat, and by an organizational culture that prioritized rapid action over deliberation. He was in the loop. He was not in control.
Human-in-the-loop (HITL) requirements are the most common policy response to concerns about autonomous system accountability. They appear in the EU AI Act's high-risk provisions, in U.S. Department of Defense Directive 3000.09 on autonomous weapons, and in dozens of corporate AI governance frameworks. The logic is intuitive: if a human must approve consequential decisions, humans retain moral and legal responsibility.
The problem is that "human in the loop" can describe radically different realities. Researchers Daniele Amerini and others have documented what is sometimes called automation bias: the tendency of human operators to accept algorithmic recommendations without independent analysis, particularly under time pressure, under cognitive load, or when disagreeing with the algorithm requires visible, justifiable dissent. When a parole board member can see that COMPAS scored a defendant 8/10 for recidivism risk, studies suggest they are substantially more likely to deny parole than if they had not seen the score, regardless of other case information. The human is in the loop. The algorithm is running the loop.
A related phenomenon is the speed-accuracy tradeoff: many autonomous systems operate faster than humans can deliberate. A high-frequency trading algorithm executes in microseconds. A drone swarm target identification system may present and close a targeting window in seconds. A cybersecurity intrusion detection system flags thousands of anomalies per hour. In each case, a human can technically approve or override, but the operational tempo makes genuine deliberation practically impossible.
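The tempo problem reduces to simple arithmetic: divide the reviewers' available time by the volume of items they must approve. The sketch below does that for an invented workload; the function names, the alert volume, and the staffing figure are illustrative assumptions.

```python
def seconds_per_item(items_per_hour: float, reviewers: int) -> float:
    """Average review time available per item if every item gets a human look."""
    if items_per_hour <= 0 or reviewers <= 0:
        raise ValueError("items_per_hour and reviewers must be positive")
    return 3600.0 * reviewers / items_per_hour


def deliberation_is_feasible(items_per_hour: float, reviewers: int,
                             seconds_needed: float) -> bool:
    """True if the time budget per item covers the time a real review takes."""
    return seconds_per_item(items_per_hour, reviewers) >= seconds_needed


# Hypothetical workload: 3,000 flagged anomalies per hour, two analysts.
budget = seconds_per_item(items_per_hour=3000, reviewers=2)
print(f"{budget:.1f} seconds available per alert")
print("One-minute review feasible:",
      deliberation_is_feasible(3000, 2, seconds_needed=60))
```

Under these assumed numbers each analyst has a couple of seconds per alert, which is enough to click "approve" but not enough to form an independent judgment.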
A 2010 review by Parasuraman and Manzey in Human Factors found that human operators commit significantly more errors when an automated aid is present and wrong than when no automated aid is present at all, because operators stop independently verifying decisions the machine has already made. The presence of automation can degrade human oversight quality even while technically satisfying oversight requirements.
The philosopher Heather Roff and Richard Moyes of the advocacy organization Article 36 have proposed four conditions that must be met for human control over an autonomous system to be genuinely meaningful rather than nominal:
1. Understanding: The operator must comprehend what the system is doing and why — at a level of detail sufficient to evaluate whether the action is appropriate. This excludes black-box systems where outputs are unexplainable to any human in the decision chain.
2. Ability to Intervene: The operator must have a genuine, operable mechanism to stop or modify the system's action before it causes harm. A review process that requires more time than the action window allows is not meaningful intervention capacity.
3. Authority: The operator must have organizational and legal authority to override the system's recommendation without personal career or legal risk for doing so. If overriding an algorithm requires justification that is institutionally costly, the override mechanism is formally available but practically inaccessible.
4. Accountability: The operator must genuinely bear responsibility for the outcome — not liability that can be transferred to the algorithm's developer or to the ambiguity of the decision chain. Accountability that is diffuse across a chain is accountability that belongs to no one.
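The four conditions can be applied as a checklist to any proposed oversight mechanism. The sketch below is one hedged way to encode them: the field names, the use of a time comparison for condition 2, and the example values are assumptions made for illustration, not part of the original proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OversightMechanism:
    """Facts about a proposed human-oversight arrangement (hypothetical schema)."""
    name: str
    operator_understands_output: bool    # condition 1: understanding
    review_seconds_needed: float         # condition 2: time a real review takes
    action_window_seconds: float         # condition 2: time before the action lands
    override_is_penalty_free: bool       # condition 3: authority
    operator_bears_responsibility: bool  # condition 4: accountability


def failed_conditions(m: OversightMechanism) -> list[str]:
    """Return the names of the conditions this mechanism does not satisfy."""
    failures = []
    if not m.operator_understands_output:
        failures.append("understanding")
    # A review that takes longer than the action window is not intervention.
    if m.review_seconds_needed > m.action_window_seconds:
        failures.append("ability to intervene")
    if not m.override_is_penalty_free:
        failures.append("authority")
    if not m.operator_bears_responsibility:
        failures.append("accountability")
    return failures


def is_meaningful(m: OversightMechanism) -> bool:
    """Control is meaningful only if all four conditions hold."""
    return not failed_conditions(m)


# Invented example: overrides are possible on paper but institutionally costly.
example = OversightMechanism(
    name="hypothetical risk-score review",
    operator_understands_output=True,
    review_seconds_needed=120,
    action_window_seconds=600,
    override_is_penalty_free=False,
    operator_bears_responsibility=True,
)
print(failed_conditions(example), is_meaningful(example))
```

Treating the conditions as jointly necessary reflects the text's framing: a mechanism that fails any one of them is nominal oversight, however well it does on the other three.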
Durham Police in England deployed the Harm Assessment Risk Tool (HART) to predict whether suspects should be placed in custody or released. The algorithm generated predictions officers could see on their screens. A 2018 independent review by Marion Oswald found that officers rarely overrode HART predictions, even when they possessed additional contextual information the algorithm could not access. Officers reported feeling that overriding the algorithm required formal justification they were reluctant to provide. The review concluded that HART satisfied nominal human-in-the-loop requirements while functionally operating as if decisions were fully automated.
Several design interventions have been proposed to make human oversight substantive rather than nominal. Forcing functions require operators to input independent assessments before seeing algorithmic outputs, preventing anchoring to the algorithm's recommendation. Red-teaming and adversarial testing probe whether operators can detect and override errors, and train them to maintain healthy skepticism of system outputs. Explanation requirements mandate that systems provide not just a prediction but the key factors driving it, at a level human operators can evaluate.
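A forcing function of the kind described here can be sketched as an interface that withholds the algorithm's score until the operator has committed to one of their own. The class and method names below are hypothetical, and the design is only one possible way to prevent anchoring.

```python
from typing import Optional


class AssessmentRequired(Exception):
    """Raised when the algorithmic score is requested too early."""


class BlindFirstReview:
    """Hides the algorithm's score until the operator records an independent one."""

    def __init__(self, case_id: str, algorithm_score: int) -> None:
        self.case_id = case_id
        self._algorithm_score = algorithm_score
        self.operator_score: Optional[int] = None

    def record_operator_score(self, score: int) -> None:
        # The operator commits once; later edits would reintroduce anchoring.
        if self.operator_score is not None:
            raise ValueError("operator score already recorded")
        self.operator_score = score

    def reveal_algorithm_score(self) -> int:
        if self.operator_score is None:
            raise AssessmentRequired("record an independent assessment first")
        return self._algorithm_score

    def disagreement(self) -> int:
        """Gap between the two scores, available only after both exist."""
        return abs(self.reveal_algorithm_score() - self.operator_score)


# Hypothetical usage on a 1-10 risk scale.
review = BlindFirstReview(case_id="case-042", algorithm_score=8)
try:
    review.reveal_algorithm_score()
except AssessmentRequired as err:
    print("blocked:", err)
review.record_operator_score(4)
print("algorithm:", review.reveal_algorithm_score(),
      "disagreement:", review.disagreement())
```

Logging the disagreement gives reviewers and auditors a record of cases where human and machine judgments diverged, which is the evidence needed to tell independent review from rubber-stamping.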
The European Commission's High-Level Expert Group on AI recommended in 2019 that oversight mechanisms be designed with attention to operationally realistic conditions — meaning oversight requirements should be tested under the actual time pressures, cognitive loads, and organizational cultures in which they will be applied, not in idealized laboratory conditions.
Work with the AI tutor to evaluate proposed human oversight mechanisms for real autonomous systems. For each mechanism, apply the four conditions of meaningful human control: understanding, ability to intervene, authority, and accountability.
Consider situations like: a radiologist reviewing an AI cancer diagnosis tool, a judge who has seen a COMPAS risk score, an air traffic controller monitoring an autonomous collision avoidance system, or a loan officer with an algorithmic credit decision on screen.
Clearview AI scraped billions of facial images from public websites — Instagram, Facebook, news archives, LinkedIn — without consent and built a facial recognition database marketed primarily to law enforcement. By 2020 it had contracts with over 600 U.S. law enforcement agencies. The system was an autonomous classifier operating across jurisdictions: images of people from Canada, Australia, the UK, and the EU were in its training data; its deployment decisions were made in the United States.
What followed was a fragmented global enforcement action. The UK's Information Commissioner's Office fined Clearview £7.5 million in 2022 and ordered deletion of UK residents' data. Italy's Garante fined it €20 million. Australia's Privacy Commissioner found it in breach of the Privacy Act 1988. The Canadian privacy authorities concluded it violated PIPEDA and demanded it cease operations in Canada. Clearview refused to pay most fines, arguing it had no legal presence in those jurisdictions. As of 2024, the company continued to operate in the United States, where no equivalent federal privacy law existed. The same technology, the same data, the same harms, but a different legal outcome in each of the five jurisdictions described here. This is the regulatory gap writ large.
International responses to autonomous AI systems have operated at three levels, each with distinct scope and enforceability:
National and Regional Legislation: The EU AI Act (formally adopted June 2024) is the most comprehensive binding framework. It takes a risk-tiered approach: prohibited practices (social scoring by public authorities, real-time biometric surveillance in public spaces with narrow exceptions), high-risk applications (education, employment, criminal justice, and critical infrastructure, all subject to conformity assessments and human oversight requirements), and lower-risk applications (chatbots) requiring transparency disclosures. The Act has extraterritorial reach by design: systems deployed in the EU market must comply regardless of where their developers are headquartered, a deliberate mirror of GDPR's enforcement model. China enacted its Interim Measures for the Management of Generative Artificial Intelligence Services in 2023, requiring content labeling, algorithmic transparency registrations, and security assessments for generative AI systems. Its emphasis differs from the EU's rights-based framework, but it has similar extraterritorial ambitions for systems affecting Chinese users.
International Guidelines (Non-Binding): The OECD Principles on AI (2019, endorsed by G20) establish five principles: inclusive growth and sustainable development; human-centered values and fairness; transparency and explainability; robustness, security, and safety; and accountability. These are influential but unenforceable. The UNESCO Recommendation on the Ethics of AI (2021), adopted by 193 member states, addresses AI's impact on human rights, the environment, and epistemology, but carries no compliance mechanism. Recommendations travel faster than legislation; enforcement follows neither.
Sectoral and Platform-Level Rules: In practice, much autonomous system governance happens at the platform and sector level — not through legislation but through terms of service, API policies, and self-regulatory frameworks. OpenAI's usage policies, Google's prohibited use cases for Gemini, and Apple's App Store policies each constrain AI deployment for hundreds of millions of users in ways that no international treaty currently does. The limitation is that these frameworks are set by the platforms themselves, can be changed unilaterally, and reflect commercial interests alongside ethical ones.
Legal scholar Anu Bradford coined the term "Brussels Effect" to describe how EU regulations effectively become global standards — because multinationals find it more efficient to apply the world's strictest standard everywhere than to maintain separate compliance tracks. GDPR drove global privacy policy updates far beyond EU borders. The EU AI Act is likely to produce a similar effect on AI governance, particularly for high-risk AI systems sold to enterprise clients who face their own EU market exposure.
The sharpest test of international autonomous AI governance is lethal autonomous weapons systems (LAWS): weapons that can select and engage targets without human authorization for individual strikes. States have discussed LAWS under the Convention on Certain Conventional Weapons (CCW) framework since 2014, first in informal expert meetings and, since 2017, in a formal Group of Governmental Experts. As of 2024, no binding international agreement exists. The United States, Russia, China, Israel, and South Korea, all states with substantial autonomous weapons development programs, have collectively blocked binding restrictions while endorsing non-binding "responsible use" guidelines.
The International Committee of the Red Cross (ICRC) recommended in 2023 that states adopt new legally binding rules specifically prohibiting autonomous weapons that (a) cannot be used in compliance with International Humanitarian Law, (b) apply force against persons without human judgment over individual targeting decisions, or (c) have unpredictable effects. The ICRC argument is that IHL requirements — distinction between combatants and civilians, proportionality, precaution — require value judgments that cannot be encoded in current AI systems.
Several documented deployments have already occurred without this governance in place. The Turkish-made Kargu-2 autonomous drone was reportedly used in the Libya conflict in 2020. The Israeli Harop, an autonomous loitering munition, has been sold to multiple states. The U.S. Navy's autonomous patrol vessels operate in contested waters. Each deployment tests the claim that existing IHL is sufficient to govern systems the law was never written to address.
The U.S. Department of Defense's governing policy on autonomous weapons requires "appropriate levels of human judgment over the use of force." The vagueness of "appropriate levels" is deliberate: it allows flexibility across future weapon systems while satisfying nominal accountability requirements. Critics including Human Rights Watch and the Campaign to Stop Killer Robots have argued that "appropriate human judgment" without a requirement for individual targeting authorization is insufficient to satisfy IHL obligations. The DoD maintains that the policy is adequate. No international body has authority to adjudicate the disagreement.
Scholars across political science, law, and ethics broadly agree that effective governance of autonomous systems across jurisdictions requires at least four elements that current frameworks lack:
1. Common definitional standards: What counts as "autonomous," what constitutes "meaningful human control," and what risks qualify as "high-risk" must be defined consistently enough that regulatory arbitrage (moving development to lower-standard jurisdictions) does not hollow out stronger frameworks.
2. Mutual recognition or harmonization: The EU AI Act, China's regulations, and the U.S. Executive Order on AI (October 2023) each create distinct compliance obligations; multinationals face compliance stacks rather than a coherent framework.
3. Enforcement mechanisms with extraterritorial reach: The Clearview AI case illustrates that fines are ineffective against entities with no physical presence in the fining jurisdiction.
4. Inclusive representation: The states most affected by autonomous weapons, predictive policing systems, and credit scoring algorithms are often least represented in the bodies that set governance standards for them.
Work with the AI tutor to examine real cases where autonomous systems have crossed jurisdictional boundaries and existing governance frameworks have failed to address the harms. Analyze what governance mechanisms would need to exist to address each gap.
Consider: the Clearview AI multi-jurisdiction enforcement failure, the LAWS governance vacuum at the UN, or the challenge of applying EU AI Act standards to U.S.-developed models. What would a workable international framework need to include?