Module 1 · AI and Ethics — Advanced | AESOP AI Academy Module 4
Lesson 1

What Are Ethics?

Metaethics, normative ethics, and applied ethics in AI contexts.

Philosopher Peter Singer's 1972 paper "Famine, Affluence, and Morality" argued that if we can prevent something bad from happening without sacrificing anything of comparable moral importance, we are morally obligated to do so. Singer's consequentialism has been enormously influential and enormously contested. Deontologists argue that duties to refrain from harm are stronger than duties to actively help; virtue ethicists argue that Singer ignores the importance of character and community. For AI ethics, the question isn't just which framework is right. It is also which frameworks are most useful for reasoning about specific AI situations, and whether different situations call for different frameworks.

Three Levels of Ethical Analysis
  • Metaethics: Are moral facts objective? What is the nature of moral claims? Can AI have moral status? Are values culturally relative or universal?
  • Normative ethics: What is the right thing to do? The frameworks — consequentialism, deontology, virtue ethics, contractualism — are normative theories.
  • Applied ethics: How do normative frameworks apply to specific situations? AI ethics is primarily applied ethics.
Contractualism and AI

T.M. Scanlon's contractualism holds that an action is wrong if the principles permitting it could not be justified to everyone it affects. This framework has particular relevance for AI ethics because it asks designers to consider whether their design choices could be justified to every affected party:

  • Could you justify this algorithmic decision to the person it's being made about?
  • Could you justify this data collection to every person whose data is collected?
  • Could you justify this optimization target to every stakeholder affected by the optimization?
The Justification Test

A useful design heuristic: for each consequential design choice, ask "could I justify this to the person most harmed by it?" If not, that's a signal worth taking seriously.

Quiz 1

What Are Ethics?

5 questions — free, untracked, retake anytime.

What distinguishes metaethics from normative ethics?

✓ Correct: Metaethics: what are moral facts? Normative ethics: what should we do? Applied ethics: how do normative frameworks apply to specific situations? AI ethics is primarily applied.
❌ Incorrect: Metaethics: the nature of moral claims (are moral facts objective?). Normative ethics: what is right to do. Applied ethics: how frameworks apply to specific situations. AI ethics is primarily applied.

What makes Scanlon's contractualism particularly relevant to AI ethics?

✓ Correct: Contractualism's justification test: could this choice be justified to everyone affected? This directly challenges AI designers to consider whether their choices could be justified to every person their system affects.
❌ Incorrect: Contractualism asks whether design choices could be justified to all affected parties, a powerful test for AI systems that make decisions about people who had no say in the design.

What is the 'justification test' as a design heuristic for AI?

✓ Correct: The justification test: for each consequential design choice, could you justify it to the person most harmed? This applies contractualist reasoning as a practical design check.
❌ Incorrect: Justification test: for each consequential design choice, could you justify it to the person most harmed by it? If not, that's a signal worth taking seriously.

What distinguishes Singer's consequentialism from deontological responses to it?

✓ Correct: Singer: if we can prevent bad outcomes without comparable sacrifice, we must. Deontologists: duties to refrain from harm are stronger than positive duties to help, and Singer's view creates unlimited obligations.
❌ Incorrect: Singer: obligation to prevent bad outcomes if costs aren't comparable. Deontologists: negative duties (don't harm) are stronger than positive duties (actively help), and unlimited positive obligations are problematic.

Why is 'which framework is most useful' a better question than 'which framework is right' for AI ethics practice?

✓ Correct: Different frameworks illuminate different ethical dimensions. Some situations are primarily about outcomes (consequentialism), others about rights (deontology), others about character (virtue ethics). The useful question is which framework best illuminates what matters in this specific situation.
❌ Incorrect: Different ethical frameworks illuminate different aspects of situations. The practically useful question isn't which is universally right, but which framework best captures what matters in this specific AI ethics case.
Lab 1

Ethical Framework Synthesis

Develop a multi-framework ethical analysis process for AI decisions.

  1. The AI opens: if different ethical frameworks illuminate different aspects of AI situations, how would you design an ethical analysis process for AI development decisions that draws on multiple frameworks?
  2. Apply your multi-framework approach to a specific AI design decision (of your choice).
  3. Address: is there a meta-framework for deciding which framework to prioritize in different situations?
Consider: contractualism's justification test, consequentialist impact assessment, deontological rights constraints, and how to resolve conflicts between frameworks.
Lesson 2

Harm and Benefit — Who Decides?

Harm ontology, stakeholder theory, and the political philosophy of impact assessment.

When Clearview AI scraped billions of photographs from social media and built a facial recognition database sold to law enforcement, the company argued it was simply indexing publicly available information, an activity it claimed was protected by the First Amendment. Privacy advocates argued it had created an unprecedented surveillance capability by aggregating individually innocuous public photos into a comprehensive identification system. Several European data protection authorities found it in violation of the GDPR and ordered it to stop processing residents' data. The US legal status remained contested. The harm wasn't in any individual photo; it was in the aggregation. Harm from AI systems often emerges from combination and scale rather than from any individual data point or decision.

Emergent Harm and Aggregation

A distinctive feature of AI-related harm is its emergent character — harm arising from combination and scale that wasn't present in any individual component:

  • Aggregation harm: Combining individually harmless data points produces harmful surveillance capability
  • Amplification harm: AI scales human decision-making in ways that amplify errors and biases to affect millions simultaneously
  • Systemic harm: AI-driven changes to social systems (labor markets, information environments) that harm without any individual AI decision being identifiable as the cause
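The aggregation point can be made concrete with a toy re-identification count. The sketch below is illustrative only: the records, field names, and the `matches` helper are invented for this example and do not describe any real dataset or system. It shows that each attribute on its own matches several people, while the combination singles out one.

```python
# Toy records; every value here is invented for illustration.
people = [
    {"city": "Leeds", "employer": "Acme", "gym": "Northside"},
    {"city": "Leeds", "employer": "Acme", "gym": "Riverside"},
    {"city": "Leeds", "employer": "Birch", "gym": "Northside"},
    {"city": "York", "employer": "Acme", "gym": "Northside"},
    {"city": "York", "employer": "Birch", "gym": "Riverside"},
]


def matches(records, **known):
    """Count records consistent with everything an observer knows."""
    return sum(all(r[k] == v for k, v in known.items()) for r in records)


# Each attribute alone leaves several candidates.
alone = {
    "city": matches(people, city="Leeds"),
    "employer": matches(people, employer="Acme"),
    "gym": matches(people, gym="Northside"),
}

# Combining the same attributes narrows the candidates sharply.
combined = matches(people, city="Leeds", employer="Acme", gym="Northside")

print("matches per single attribute:", alone)
print("matches for the combination:", combined)
```

In this toy data no single field identifies anyone, yet three fields together do, which is the pattern the aggregation-harm bullet describes.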
Stakeholder Theory and Impact Assessment

Who counts as a stakeholder in AI impact assessment?

  • Direct users of the system
  • People whose data trains the system (often not the users)
  • People who are the subjects of AI decisions (often not the users)
  • Third parties affected by AI outputs
  • Future generations affected by AI-driven systemic change
The Scope Problem

Standard impact assessments focus on foreseeable direct harms. AI's emergent, aggregative, and systemic harms require expanding scope — temporally, geographically, and in terms of affected populations — beyond what standard assessment practices capture.

Quiz 2

Harm and Benefit — Who Decides?

What is 'aggregation harm' in the context of AI?

✓ Correct: Aggregation harm: combining individually innocuous public photos into a comprehensive identification database. The harm wasn't in any individual photo; it emerged from combination and scale.
❌ Incorrect: Aggregation harm: combining individually harmless data points produces harmful capabilities. Clearview AI: each photo was publicly available; the harm emerged from aggregating billions into a surveillance system.

What makes systemic AI harm difficult to attribute to specific decisions?

✓ Correct: Systemic harm: AI-driven changes to social systems cause harm without any individual decision being the identifiable cause. No single AI decision caused the fragmentation of the information environment; it emerged from aggregate system behavior.
❌ Incorrect: Systemic harm emerges from large-scale social change without any individual AI decision being the identifiable cause. This makes attribution, and therefore accountability, very difficult.

Who counts as a stakeholder in AI impact assessment beyond direct users?

✓ Correct: Full stakeholder scope: data subjects whose data trained the system, people subject to AI decisions, third parties affected by outputs, and future generations affected by AI-driven systemic change.
❌ Incorrect: Full stakeholder scope: data subjects (training data contributors), people subject to AI decisions, third parties affected by outputs, and future generations affected by systemic AI-driven changes.

What is the 'scope problem' in AI impact assessment?

✓ Correct: Scope problem: standard impact assessment focuses on foreseeable direct harms. AI's emergent and systemic harms require temporal, geographic, and stakeholder expansion that current practices don't capture.
❌ Incorrect: Scope problem: standard impact assessment captures foreseeable direct harms. AI's emergent, aggregative, and systemic harms require expanded temporal scope, geographic scope, and stakeholder populations.

What was the core legal and ethical dispute about Clearview AI?

✓ Correct: Clearview AI dispute: each photo was public and presumably legal. The harm emerged from aggregation, which created an unprecedented surveillance capability. Was aggregation of public data protected, or did it create new harm?
❌ Incorrect: The Clearview AI dispute: each photo was individually public. The harm emerged from aggregating billions into a comprehensive surveillance system. The dispute was whether aggregation itself caused harm beyond what any individual photo posed.
Lab 2

Impact Assessment Framework

Design a comprehensive AI impact assessment framework.

  1. The AI opens with the Clearview AI case: harm emerged from aggregation, not from any individual photo. How do you design an impact assessment process that captures emergent, aggregative, and systemic harms that standard assessment practices miss?
  2. Develop a stakeholder identification methodology that goes beyond direct users.
  3. Address: what institutional mechanisms would ensure impact assessments are genuinely independent rather than self-serving?
Consider: temporal scope (long-horizon systemic harms), stakeholder expansion, aggregation risk, independence mechanisms, and what 'foreseeable harm' should mean for AI.
Lesson 3

Fairness and AI

The philosophy of fairness, anti-discrimination law, and structural equality in AI contexts.

Political philosopher John Rawls proposed the "veil of ignorance" thought experiment: imagine designing the rules of society without knowing your place in it — your class, race, gender, abilities. Behind this veil, Rawls argued, rational self-interest would produce fair principles, because you might end up in the worst-off position. Applied to AI: if you didn't know whether you'd be a beneficiary or subject of an AI system, what design constraints would you demand? This thought experiment reveals something important: fairness in AI design often requires actively designing against the interests of those with most power to shape the design.

Rawlsian Fairness and AI

Rawls's difference principle: inequalities are acceptable only if they benefit the least-advantaged members of society. Applied to AI: AI systems that produce unequal outcomes are justifiable only if those outcomes benefit the worst-off group. This is a much stronger fairness requirement than most current AI systems meet:

  • It requires designers to actively consider worst-case outcomes for disadvantaged groups
  • It bars systems that benefit the majority while harming minorities
  • It requires ongoing monitoring of distributional outcomes, not just initial design
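One simplified way to read the difference principle as a design rule is as a maximin choice: among candidate designs, prefer the one whose worst-off group does best. The sketch below assumes hypothetical design names and made-up per-group benefit scores; how to measure benefit, and how to define groups, are contested questions that the code does not settle.

```python
# Hypothetical per-group benefit scores (higher is better); numbers are invented.
candidates = {
    "design_a": {"group_1": 0.90, "group_2": 0.40},
    "design_b": {"group_1": 0.75, "group_2": 0.60},
    "design_c": {"group_1": 0.65, "group_2": 0.65},
}


def worst_off(scores):
    """Benefit received by the least-advantaged group under one design."""
    return min(scores.values())


def average(scores):
    """Mean benefit across groups, the usual aggregate target."""
    return sum(scores.values()) / len(scores)


# Maximin: pick the design that is best for its worst-off group.
maximin_choice = max(candidates, key=lambda name: worst_off(candidates[name]))

# Contrast: pick the design with the best average outcome.
average_choice = max(candidates, key=lambda name: average(candidates[name]))

print("maximin choice:", maximin_choice)
print("average choice:", average_choice)
```

With these invented numbers the two rules disagree, which illustrates why a system tuned for aggregate performance can fail a Rawlsian test.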
Anti-Discrimination Law and AI

Legal anti-discrimination frameworks add to philosophical requirements:

  • Disparate treatment: Intentional discrimination based on protected characteristics — generally illegal
  • Disparate impact: Facially neutral policies with disproportionate adverse effects on protected groups — illegal unless justified by business necessity
  • The AI challenge: Disparate impact doctrine applies to AI systems, but proving causation and identifying the specific decision point causing disparity is technically complex
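A common first screen for disparate impact is to compare selection rates across groups. The sketch below uses the four-fifths ratio, a rule of thumb from US employment selection guidelines, as an assumed threshold. It is a screening heuristic rather than a legal test, and the counts and function names are invented for illustration.

```python
def selection_rate(selected, applicants):
    """Share of applicants in a group who received the favorable outcome."""
    return selected / applicants


def impact_ratio(rate_group, rate_reference):
    """Ratio of a group's selection rate to the reference group's rate."""
    return rate_group / rate_reference


# Invented counts for two applicant groups.
rate_a = selection_rate(selected=60, applicants=100)  # reference group
rate_b = selection_rate(selected=30, applicants=100)

ratio = impact_ratio(rate_b, rate_a)

# Assumed threshold: ratios below 0.8 are flagged for closer review.
flagged = ratio < 0.8

print("impact ratio:", round(ratio, 2))
print("flagged for review:", flagged)
```

A flag here only signals that the outcome gap deserves investigation; identifying which decision point in the system produces the gap is the technically hard part the list above describes.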
The Structural Problem

Anti-discrimination law addresses individual decisions. AI bias operates systemically and distributionally — across thousands of decisions simultaneously. Legal frameworks designed for individual acts of discrimination don't fit this pattern well.

Quiz 3

Fairness and AI

How does Rawls's veil of ignorance thought experiment apply to AI design?

✓ Correct: Veil of ignorance applied to AI: design the system without knowing which side of it you'll be on. This generates much stronger fairness constraints: you'd demand protection from worst-case outcomes.
❌ Incorrect: Veil of ignorance: design without knowing whether you'll be beneficiary or subject. This generates demands for protection from worst-case outcomes, constraints that current AI systems rarely meet.

What is Rawls's difference principle, and how does it apply to AI?

✓ Correct: Difference principle: inequalities are justified only if they benefit the least-advantaged. For AI: unequal outcomes are acceptable only if worst-off groups benefit. This is a much stronger requirement than most current AI systems meet.
❌ Incorrect: Rawls's difference principle: inequalities are acceptable only if they benefit the least-advantaged. For AI: unequal outcomes are justifiable only if they benefit the worst-off group, a very demanding standard.

What is 'disparate impact' as a legal anti-discrimination doctrine, and why is it relevant to AI?

✓ Correct: Disparate impact: facially neutral policies with disproportionate adverse effects on protected groups. AI systems can produce this without intentional discrimination, and disparate impact doctrine applies.
❌ Incorrect: Disparate impact: neutral-seeming policies with disproportionate adverse effects on protected groups. AI can produce disparate impact through bias in training data or design choices, without requiring intentional discrimination.

Why don't anti-discrimination legal frameworks fit AI bias well?

✓ Correct: Legal frameworks target individual acts of discrimination. AI bias is systemic and distributional, operating across thousands of decisions simultaneously. The individual-act framework doesn't fit the distributional pattern.
❌ Incorrect: Anti-discrimination law targets individual decision acts. AI bias is systemic, distributed across thousands of decisions. The individual-act legal framework doesn't fit the distributional nature of AI discrimination.

Why does the veil of ignorance reveal that fairness in AI design often requires 'designing against designer interests'?

✓ Correct: Designers typically have more power and resources than those most harmed by AI systems. Without the veil, they naturally design for their own situations. Fairness requires actively overriding that natural tendency.
❌ Incorrect: Designers typically have more power and capital than those most harmed. Their natural interests don't align with protecting worst-case outcomes for disadvantaged groups. Fairness requires actively designing against those natural interests.
Lab 3

Rawlsian AI Design

Apply Rawlsian fairness principles to AI system design.

  1. The AI opens: behind the veil of ignorance, not knowing whether you'd be the beneficiary or subject of an AI hiring system, what design constraints would you demand? How do those constraints compare to typical current practice?
  2. Apply the difference principle: what AI systems could you justify to the least-advantaged group they affect?
  3. Address: is Rawlsian fairness too demanding for commercial AI, or is the commercial framing itself part of the problem?
Consider: what the veil generates, how the difference principle changes design requirements, and whether existing anti-discrimination law is sufficient or inadequate.
Lesson 4

Transparency and Honesty

Explainability, interpretability, and the ethics of AI communication.

When the EU's GDPR took effect in 2018, it was widely described as creating a "right to explanation" for automated decisions. Legal scholars debated what this right actually required. Did it require post-hoc explanations (explaining a decision after it was made)? Ante-hoc interpretability (using models that are inherently interpretable)? Both? Neither, if explanations could mislead? Researchers found that some post-hoc explanation tools (like LIME and SHAP) could be made to produce any desired explanation regardless of the actual model behavior, raising questions about whether 'explanation' requirements created a false sense of transparency rather than genuine accountability.

Explainability vs. Interpretability
  • Interpretability: The model itself is inherently understandable — its decision process can be inspected. Linear models and decision trees are interpretable.
  • Explainability: A separate tool explains what a complex (potentially non-interpretable) model did. Post-hoc explanations approximate model behavior but are not the model itself.
  • The fidelity problem: Post-hoc explanations may not accurately represent the model's actual reasoning — they provide an approximation that may satisfy explanation requirements without providing genuine transparency.
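Fidelity can be estimated by checking how often a surrogate explanation agrees with the model it claims to explain. The sketch below is a simplified illustration: `black_box` and `surrogate` are hypothetical stand-in functions written for this example, not LIME or SHAP, and agreement on a handful of inputs is only a rough proxy for fidelity.

```python
def black_box(x):
    """Stand-in for an opaque model: approves mid-range incomes with low debt."""
    income, debt = x
    return 1 if (30 <= income <= 80 and debt < 20) else 0


def surrogate(x):
    """Stand-in for a simple explanation: 'approved if income is at least 30'."""
    income, _debt = x
    return 1 if income >= 30 else 0


def fidelity(model, explanation, inputs):
    """Fraction of inputs on which the explanation reproduces the model's output."""
    agree = sum(model(x) == explanation(x) for x in inputs)
    return agree / len(inputs)


# Invented (income, debt) pairs used to probe agreement.
inputs = [(20, 5), (40, 5), (40, 30), (60, 10), (90, 5), (90, 40)]

score = fidelity(black_box, surrogate, inputs)
print("fidelity:", score)
```

The surrogate here sounds plausible but matches the model on only half of the probe inputs, so a person relying on it to contest a decision would often be arguing against the wrong rule.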
The Ethics of AI Communication

Honesty for AI systems involves more than just the truth of individual outputs:

  • Sincere vs. performative assertion: Is the AI asserting something as true, or performing a communication act (writing a persuasive essay, playing a role)?
  • Calibration: Does the AI accurately represent its own uncertainty?
  • Misleading implicature: Can an AI deceive while saying only technically true things?
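Of these, calibration can be checked numerically: among answers given with a stated confidence, roughly that share should turn out correct. The sketch below assumes a hypothetical log of (stated confidence, was correct) pairs with invented values, and the `calibration_gaps` helper is named for this example only.

```python
from collections import defaultdict

# Hypothetical log of (stated confidence, answer was correct); values are invented.
log = [
    (0.9, True), (0.9, True), (0.9, False), (0.9, False),
    (0.6, True), (0.6, True), (0.6, True), (0.6, False), (0.6, False),
]


def calibration_gaps(entries):
    """Map each stated confidence to (observed accuracy - stated confidence)."""
    buckets = defaultdict(list)
    for confidence, correct in entries:
        buckets[confidence].append(correct)
    return {c: sum(v) / len(v) - c for c, v in buckets.items()}


gaps = calibration_gaps(log)

# A negative gap means the system claimed more confidence than it earned.
for confidence, gap in sorted(gaps.items()):
    print("stated", confidence, "gap", round(gap, 2))
```

In this invented log the 0.6 answers are well calibrated while the 0.9 answers are right only half the time, which is the kind of misrepresented uncertainty the calibration bullet asks about.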
The Explanation Problem

If explanation tools can be made to produce any desired explanation regardless of model behavior, explanation requirements create compliance theater rather than genuine transparency. What matters is accountability — whether the explanation actually enables challenge and correction — not just whether an explanation was provided.

Quiz 4

Transparency and Honesty

What is the 'fidelity problem' with post-hoc explanation tools?

✓ Correct: Fidelity problem: post-hoc explanation tools (LIME, SHAP) approximate model behavior but don't represent the actual model reasoning, and can be made to produce any desired explanation regardless of model behavior.
❌ Incorrect: Fidelity problem: post-hoc explanation tools approximate model behavior but don't represent actual reasoning, and can be made to produce any desired explanation. This creates compliance theater, not genuine transparency.

What distinguishes interpretability from explainability in AI systems?

✓ Correct: Interpretability: the model is inherently understandable. Explainability: a separate tool explains a complex model after the fact. Interpretable models provide genuine transparency; post-hoc explanations provide approximations of potentially varying fidelity.
❌ Incorrect: Interpretability: model itself is understandable (linear models, decision trees). Explainability: a separate tool explains a complex model post-hoc. Interpretability is genuinely transparent; explainability is an approximation.

Why might GDPR's right to explanation create compliance theater rather than genuine accountability?

✓ Correct: Compliance theater: if explanation tools produce any desired explanation regardless of model behavior, satisfying explanation requirements doesn't require genuine transparency; it just requires producing a plausible-sounding explanation.
❌ Incorrect: Compliance theater: if post-hoc explanation tools can produce any desired explanation regardless of actual model behavior, explanation requirements are satisfied without genuine transparency, with only compliance documentation.

What is the difference between 'sincere assertion' and 'performative assertion' for AI honesty?

✓ Correct: Sincere: asserting as true. Performative: communication act where both parties know it doesn't express the speaker's actual views (writing an essay arguing a position, role-playing). Different honesty norms apply to each.
❌ Incorrect: Sincere assertion: asserting as genuinely true. Performative: a communication act (essay-writing, role-play) where both parties understand it doesn't express the speaker's actual beliefs. Honesty norms differ for each.

What makes genuine AI transparency about accountability rather than just explanation?

✓ Correct: Accountability-enabling transparency: the explanation must actually enable challenge and correction. An explanation that satisfies compliance but doesn't help affected parties understand or contest decisions is compliance theater.
❌ Incorrect: Genuine transparency enables accountability: the ability to challenge and correct AI decisions. An explanation that satisfies compliance requirements without enabling meaningful challenge is compliance theater.
Lab 4

Transparency and Accountability

Analyze the fidelity problem and design genuine transparency requirements.

  1. The AI opens: if post-hoc explanation tools can be made to produce any desired explanation regardless of model behavior, what does 'the right to explanation' actually require? Is interpretability mandatory for high-stakes decisions?
  2. Develop requirements for genuine transparency that enable accountability rather than just compliance.
  3. Address: should high-stakes AI systems be required to use interpretable models, even at performance cost?
Consider: the fidelity problem, what explanation would enable a citizen to challenge a benefits denial, and the performance-interpretability tradeoff in different risk contexts.
Lesson 5

Autonomy and Consent

Philosophical theories of autonomy, manipulation, and AI's threats to self-determination.

Philosopher Harry Frankfurt distinguished first-order desires (what you want) from second-order desires (what you want to want). Genuine autonomy, for Frankfurt, requires acting in accordance with your second-order desires — being the author of your motivational structure. Persuasive AI systems exploit this gap: they satisfy first-order desires (you want to check, you want validation) while working against second-order preferences (you want to want to spend time differently, you want to make independent judgments). This isn't just a practical problem of screen time — it's a philosophical attack on the preconditions of autonomous agency.

Theories of Autonomy and AI
  • Hierarchical autonomy (Frankfurt): Genuine autonomy requires alignment between first-order desires and higher-order preferences about those desires. AI systems that exploit first-order impulses against second-order preferences undermine autonomy.
  • Substantive autonomy: Autonomy requires not just freedom from coercion but positive conditions — information, cognitive capacity, alternatives. AI systems that limit alternatives, degrade cognitive capacity, or provide distorted information undermine substantive autonomy.
  • Relational autonomy: Autonomy is produced through relationships and social structures. AI systems that reshape social relationships and information environments affect the social conditions for autonomy.
The Epistemic Autonomy Problem

AI systems that shape what information users encounter, how they think about issues, and what views feel natural have significant epistemic autonomy implications. If billions of people interact with systems that subtly shape their beliefs in commercially or politically motivated directions, individual epistemic autonomy — the capacity to form beliefs through one's own reasoning — is threatened at scale.

Scale Matters

An individual trying to influence your beliefs is a normal social interaction. A system influencing the beliefs of billions simultaneously, in commercially motivated directions, is a threat to the epistemic conditions for democratic self-governance.

Quiz 5

Autonomy and Consent

How does Frankfurt's hierarchical autonomy theory explain why persuasive AI is ethically problematic?

✓ Correct: Frankfurt: genuine autonomy requires alignment between first-order desires and higher-order preferences. Persuasive AI satisfies first-order impulses while working against second-order preferences, undermining authentic self-authorship.
❌ Incorrect: Frankfurt's hierarchical autonomy: genuine autonomy requires acting in line with what you want to want, not just what you want. Persuasive AI exploits first-order impulses against second-order preferences, undermining authentic autonomy.

What is 'substantive autonomy' and how do AI systems threaten it?

✓ Correct: Substantive autonomy: more than freedom from coercion; it requires positive conditions (information, cognitive capacity, genuine alternatives). AI systems limiting alternatives, distorting information, or degrading cognitive capacity undermine these conditions.
❌ Incorrect: Substantive autonomy: positive conditions for genuine choice, namely accurate information, cognitive capacity, and real alternatives. AI systems that limit alternatives, distort information, or degrade capacity undermine the conditions for substantive autonomy.

Why is the 'epistemic autonomy' problem specific to AI systems at scale?

✓ Correct: Epistemic autonomy at scale: a system shaping billions of people's beliefs simultaneously in motivated directions is qualitatively different from normal social influence. It threatens the epistemic conditions for democratic self-governance.
❌ Incorrect: Epistemic autonomy threat: individual social influence is normal. Systems shaping billions of people's beliefs simultaneously in commercially or politically motivated directions threaten the epistemic foundations of individual reasoning and democratic self-governance.

How does 'relational autonomy' theory extend the critique of AI systems?

✓ Correct: Relational autonomy: autonomy is socially produced through relationships and information environments. AI systems that reshape social relationships and information environments affect the social conditions in which autonomy is possible.
❌ Incorrect: Relational autonomy: autonomy is produced through social relationships and structures. AI systems that reshape those relationships and information environments affect the social preconditions that make autonomy possible.

What distinguishes manipulation from legitimate influence in terms of autonomy?

✓ Correct: The autonomy distinction: legitimate influence works through rational agency (reasons, evidence). Manipulation bypasses rational agency (exploiting psychological mechanisms). The first respects autonomy; the second undermines it.
❌ Incorrect: Legitimate influence: works through rational agency (reasons, evidence, information people can evaluate). Manipulation: bypasses rational agency by exploiting psychological mechanisms to produce behavior the person wouldn't endorse on reflection.
Lab 5

Autonomy and AI Design Ethics

Apply autonomy theories to derive AI design ethics constraints.

  1. The AI opens: Frankfurt's hierarchical autonomy theory suggests that persuasive AI systems undermine genuine self-authorship by exploiting first-order impulses against second-order preferences. What design constraints would this theory impose on AI products?
  2. Develop design requirements derived from substantive and relational autonomy theories.
  3. Address: at what scale does commercially motivated influence over beliefs become a threat to democratic epistemic preconditions — and how should this be governed?
Consider: design constraints from each autonomy theory, what genuine respect for user autonomy would require, and the governance implications of epistemic autonomy threats at scale.
Lesson 6

Responsibility and Accountability

Moral responsibility theory, the problem of many hands, and new liability frameworks.

Philosopher Dennis Thompson identified what he called "the problem of many hands" in organizational ethics: in complex organizations, when something goes wrong, it is hard to hold any single individual responsible, because each contributed a small, locally rational piece of a harmful aggregate outcome. Modern AI systems are a paradigm case: thousands of engineers make individually reasonable design choices; the resulting system causes harm that no individual engineer intended or could have prevented alone. This doesn't mean no one is responsible. It means responsibility must be conceived differently from the individual attribution that traditional ethical and legal frameworks assume.

The Problem of Many Hands in AI

Dennis Thompson's 'problem of many hands' applies acutely to AI:

  • Training data collection, labeling, model architecture, training, fine-tuning, safety evaluation, deployment — each involving different people with different responsibilities
  • Each individual's contribution may be locally defensible; the aggregate may be harmful
  • Traditional moral responsibility requires: causal contribution, knowledge, and ability to have done otherwise. AI harm often involves attenuated causation, limited knowledge, and constrained alternatives
Structural vs. Individual Responsibility

Thompson and others have argued for structural responsibility — responsibility for the institutions, incentive structures, and organizational designs that produce harmful outcomes — alongside (not instead of) individual responsibility:

  • Organizations can be morally responsible even when no individual is fully responsible
  • Structural design choices that predictably produce harm create prospective responsibility for designers
  • Regulatory and liability frameworks should target structures, not just individuals
Prospective Responsibility

Rather than asking only 'who caused this?' (retrospective), prospective responsibility asks 'who is responsible for preventing this?' Organizations that design incentive structures that predictably produce harm are prospectively responsible — regardless of whether any individual is fully causally responsible.

Quiz 6

Responsibility and Accountability

What is 'the problem of many hands' in organizational ethics?

✓ Correct: Problem of many hands: each engineer made locally defensible choices; the aggregate system caused harm no individual intended. Traditional individual-attribution responsibility doesn't fit this pattern.
❌ Incorrect: Problem of many hands: harmful outcomes emerge from many individually defensible contributions in complex organizations. No individual is fully responsible, but the aggregate is harmful. Individual-attribution frameworks don't fit.

What three elements does traditional moral responsibility require, and why are they difficult to satisfy in AI harms?

✓ Correct: Traditional moral responsibility: causal contribution (attenuated in AI), knowledge (each contributor has limited view), ability to have done otherwise (constrained by system and organizational structure). All three are difficult in distributed AI harm.
❌ Incorrect: Traditional moral responsibility requires: causal contribution (attenuated in distributed development), knowledge of harm potential (limited for individual contributors), and ability to have done otherwise (often constrained). All three are difficult to satisfy in AI harm.

What is 'structural responsibility' as Thompson conceives it?

✓ Correct. Structural responsibility: organizations can be morally responsible for designing incentive structures and institutions that predictably produce harm, even when no individual is fully causally responsible.
❌ Incorrect. Structural responsibility: responsibility for institutions, incentive structures, and organizational designs that predictably produce harmful outcomes. Organizations can be responsible even when no individual is fully so.

What distinguishes prospective from retrospective responsibility in AI contexts?

✓ Correct. Retrospective: who caused this? Prospective: who is responsible for preventing this? Prospective responsibility targets organizations that design incentive structures predictably producing harm, regardless of individual causation.
❌ Incorrect. Retrospective: who caused this harm? Prospective: who is responsible for preventing this? Organizations designing conditions that predictably produce harm bear prospective responsibility, regardless of whether any individual is fully causally responsible.

What does the problem of many hands imply for AI governance design?

✓ Correct. Many-hands implication for governance: target structures (incentive systems, organizational design, institutional accountability) alongside individuals, because AI harm typically emerges from institutional aggregates, not individual bad actors.
❌ Incorrect. The many-hands problem implies governance should target institutional structures and incentive systems, not just individual wrongdoers, because AI harm characteristically emerges from aggregate institutional design.
Lab 6

Structural Responsibility Design

Develop a structural responsibility framework for AI harm.

  1. The AI opens: Thompson's problem of many hands describes AI development precisely — thousands of individually defensible contributions produce harmful aggregates. What governance framework would assign structural responsibility appropriately?
  2. Develop a prospective responsibility framework — who is responsible for preventing AI harms, not just attributing them after the fact?
  3. Address: what institutional design would make structural responsibility real rather than nominal?
Consider: mandatory safety structures, organizational liability, regulatory requirements, and the difference between assigning blame after harm and preventing harm through structural accountability.
Lesson 7

Values Alignment — Whose Values?

The political philosophy of AI values, universal vs. cultural ethics, and the legitimacy of alignment choices.

When Anthropic trains models with Constitutional AI, it uses a written constitution of ethical principles. When the Chinese technology company ByteDance deploys TikTok, its algorithm reflects different content norms. When the EU regulates AI under the AI Act, it encodes European political values. When US AI systems are trained, they reflect demographic and cultural skews in their training data and RLHF feedback. Every AI system reflects specific values; the question is whether those values are universal, culturally contingent, explicitly chosen, or accidentally encoded. AI systems deployed globally operate across cultures with genuinely different values, and neither ignoring those differences nor relativizing all values is ethically satisfactory.

The Values Pluralism Problem

The challenge of values alignment in a pluralist world:

  • Universal values claim: Some values (prohibitions on torture, protection of children) are universal and should constrain all AI systems regardless of cultural context
  • Cultural relativism: Values are culturally constituted; imposing one culture's values through global AI systems is a form of cultural imperialism
  • Democratic legitimacy: The values encoded in AI systems should be chosen through legitimate democratic processes by those affected — not decided unilaterally by AI developers
  • Power-asymmetry concern: 'Universal values' claims are often made by powerful groups in ways that coincidentally align with their interests
Who Has Legitimate Authority to Specify AI Values?
  • AI developers (most actual influence; least democratic legitimacy)
  • National governments (democratic legitimacy within jurisdiction; limited global reach)
  • International bodies (limited authority; slow; contested legitimacy)
  • Affected communities (most legitimate in principle; least institutional power)
The Legitimacy Gap

The entities with the most actual power to specify AI values (large AI developers) have the least democratic legitimacy to do so. The entities with the most democratic legitimacy (affected communities, democratic governments) have the least power over AI systems deployed globally by foreign corporations.

Quiz 7

Values Alignment — Whose Values?

5 questions — free, untracked, retake anytime.

What is the 'values pluralism problem' for globally deployed AI systems?

✓ Correct. Values pluralism: AI systems encode specific values but deploy globally across cultures with genuinely different values. Neither cultural imposition nor relativism is satisfactory, but the middle ground is contested.
❌ Incorrect. Values pluralism: AI systems encode specific values; they're deployed across cultures with genuinely different values. Neither imposing one culture's values globally nor relativizing all values resolves the tension.

What is the power-asymmetry concern about 'universal values' claims in AI?

✓ Correct. Power-asymmetry: 'universal values' claims are often made by powerful actors whose interests align with the claimed universal values. This doesn't mean universal values don't exist, but it does require scrutiny of who is claiming universality and why.
❌ Incorrect. Power-asymmetry concern: universal values claims are often made by powerful groups whose interests coincidentally align with those values. Scrutinizing who claims universality and whose interests it serves is warranted.

What is the 'legitimacy gap' in AI values specification?

✓ Correct. Legitimacy gap: actual power (AI developers) ≠ democratic legitimacy (affected communities, governments). AI developers specify values globally; affected communities have little say; governments' reach is limited to their jurisdictions.
❌ Incorrect. Legitimacy gap: AI developers have the most actual power to specify values and the least democratic legitimacy. Affected communities have the most legitimate claim and the least institutional power over globally deployed AI systems.

What distinguishes cultural relativism from the values pluralism position in AI ethics?

✓ Correct. Cultural relativism: all values are culturally contingent. Values pluralism: genuine differences deserve respect, but this doesn't preclude universal constraints on the worst harms. The middle position is contested.
❌ Incorrect. Cultural relativism: all values are culturally contingent, with no universal standards. Values pluralism: genuine value differences deserve respect, but some constraints (against torture, child harm) may still be universal.

Why is democratic legitimacy for AI values important beyond procedural fairness?

✓ Correct. Democratic legitimacy matters substantively: widely deployed AI systems shape what billions of people can say, see, and do. Power of this scope over people's lives requires democratic accountability, not just procedural compliance.
❌ Incorrect. Democratic legitimacy is substantively important: widely deployed AI systems shape what billions of people can say, see, and do. This scale of influence over human life requires democratic accountability commensurate with that power.
Lab 7

Values Legitimacy Analysis

Analyze who has legitimate authority to specify AI values.

Analyze the legitimacy problem in AI values specification.

  1. The AI opens with the legitimacy gap: AI developers have most power to specify values; affected communities have least. How would you design an institutional framework that gives affected communities genuine input into the values encoded in widely deployed AI systems?
  2. Address the values pluralism problem: are there genuinely universal values that should constrain all AI systems globally, and if so, what makes them universal rather than culturally imposed?
  3. Develop your position on: what democratic legitimacy actually requires for AI values specification at global scale.
Consider: the power-asymmetry in universality claims, what institutional mechanisms would give affected communities real input, and how to distinguish universal constraints from cultural imposition.
Lesson 8

The Ethics of Building AI

Professional ethics for AI builders, collective responsibility, and the moral obligations of those with technical power.

In 2018, Google employees organized and petitioned against Project Maven, the company's contract with the US Department of Defense to apply AI to the analysis of drone surveillance footage. Thousands signed. The company chose not to renew the contract. In 2023, employees at multiple AI companies signed open letters about safety concerns. That same year, Anthropic published the constitution it uses to train Claude, attempting to make explicit the values and priorities embedded in the model. These episodes illustrate a developing field: the professional ethics of AI developers. What obligations do people who build AI systems have to their employers, to users, to society, and to the future?

Professional Ethics for AI Builders

Professional ethics frameworks for AI are emerging:

  • Obligations to users: Not to deceive, manipulate, or cause harm to people who use the systems you build
  • Obligations to third parties: Systems affect people who aren't users — workers displaced, people subject to AI decisions, communities affected by AI-driven systemic change
  • Obligations to society: Technical power carries civic responsibility — understanding the potential effects of what you build and speaking up when those effects are harmful
  • Whistleblowing obligations: When an organization is building something harmful, individuals face choices about internal escalation, external disclosure, and resignation
The Complicity Problem

AI builders face complicity questions that most professional fields don't:

  • Can you build a system you believe is harmful if the organization requires it?
  • Can your work be "neutral" when it enables harmful applications?
  • What obligations follow from being among the few people with technical capability to build these systems?
The Power-Responsibility Principle

Technical capability to build AI systems is rare and comes with corresponding moral weight. People who can build these systems are not just workers following instructions — they are agents making choices with significant societal consequences. With that capability comes responsibility that can't be fully outsourced to employers or regulators.

Quiz 8

The Ethics of Building AI

5 questions — free, untracked, retake anytime.

What did the Google Project Maven episode illustrate about AI professional ethics?

✓ Correct. Maven illustrated that AI builders have collective ethical agency: the ability to refuse or organize against applications they believe are harmful. It also showed that these are ethical choices, not just technical implementation.
❌ Incorrect. Maven illustrated that AI builders have collective ethical agency: the capacity to organize against applications they find harmful. It also showed that questions of what to build and for whom are ethical, not just technical.

Why can't technical work in AI be 'ethically neutral'?

✓ Correct. Technical work in AI is not ethically neutral: choices about what to optimize, what data to use, and what capabilities to build encode values and determine what applications are possible. The technical and the ethical are inseparable.
❌ Incorrect. Technical work isn't neutral: choices about what to optimize, what data to use, and what to build encode values and enable specific applications. In consequential AI systems, technical choices are ethical choices.

What is the 'power-responsibility principle' for AI builders?

✓ Correct. Power-responsibility: rare technical capability to build consequential AI systems carries moral weight. AI builders aren't just workers; they're agents making significant societal choices. This responsibility can't be fully outsourced.
❌ Incorrect. Power-responsibility principle: rare technical capability to build consequential AI carries moral weight that can't be fully outsourced to employers or regulators. Builders are agents making significant societal choices.

What are AI builders' obligations to third parties beyond their direct users?

✓ Correct. Third-party obligations: AI systems affect workers displaced, people subject to AI decisions, and communities experiencing systemic change. Builders have ethical obligations to consider these effects, not just effects on users and employers.
❌ Incorrect. AI builders have obligations to third parties, not just to direct users and employers: workers displaced by their systems, people subject to AI decisions, and communities experiencing systemic AI-driven change.

What does the complicity problem ask AI builders to consider?

✓ Correct. Complicity problem: does building harmful systems implicate you morally, even when your employer requires it and even when you didn't make the deployment decision? Does being technically capable obligate you to refuse certain work?
❌ Incorrect. Complicity problem: does building harmful systems when required by an organization make builders morally responsible, even if they didn't decide to deploy? Does technical capability obligate refusal of certain work?
Lab 8

Synthesis: The Ethics of Building

Develop your personal AI builder ethics framework.

Synthesize the module and develop your AI builder ethics framework.

  1. The AI opens: AI builders have rare technical capability that carries moral weight. What personal ethical framework would you apply to decisions about what to build, what to refuse, and when to speak out?
  2. Drawing on the full module — ethical frameworks, harm definition, fairness, transparency, autonomy, responsibility, and values legitimacy — develop your synthesis position on the most important ethical requirements for AI development.
  3. Address: what do you owe to people who will be affected by AI systems you might build — and what would you refuse to build?
This is a synthesis lab. Draw on any and all of the module to build a position that is genuinely your own.

Module 4 Test

8 questions covering all lessons. Free, untracked, retake anytime.

Scanlon's contractualism asks AI designers to:

✓ Correct. Contractualism: could this design choice be justified to everyone affected? Applying this to AI design generates much stronger ethical constraints than standard impact assessment.
❌ Incorrect. Contractualism: for each design choice, ask whether it could be justified to every person it affects, especially the person most harmed. This is a powerful ethical design constraint.

Aggregation harm in AI means:

✓ Correct. Aggregation harm: combining individually innocuous data produces harmful capabilities not present in any individual piece. Clearview AI is the paradigm case.
❌ Incorrect. Aggregation harm: combining individually harmless data points produces harmful capabilities. The harm emerges from combination and scale, not from any individual data point.

The veil of ignorance applied to AI design would generate:

✓ Correct. Veil of ignorance: design without knowing whether you're the beneficiary or the worst-off subject. This generates demands for protection of worst-case outcomes, which are much stronger fairness constraints than typical design practice.
❌ Incorrect. Veil of ignorance: design without knowing your position. This generates demands for protecting worst-case outcomes, constraints that typical AI design practice rarely meets.

The fidelity problem with post-hoc explanation tools means:

✓ Correct. Fidelity problem: post-hoc tools can produce any desired explanation regardless of actual model behavior. This creates compliance theater: satisfying explanation requirements without genuine transparency.
❌ Incorrect. Fidelity problem: post-hoc explanation tools can produce any desired explanation regardless of model behavior, creating compliance theater rather than genuine accountability-enabling transparency.

Frankfurt's hierarchical autonomy theory implies that persuasive AI is ethically problematic because:

✓ Correct. Frankfurt: genuine autonomy requires acting in line with second-order preferences (what you want to want). Persuasive AI exploits first-order impulses against those higher-order preferences, undermining authentic self-authorship.
❌ Incorrect. Frankfurt: genuine autonomy requires alignment between first-order desires and second-order preferences. Persuasive AI exploits first-order impulses against second-order preferences, undermining genuine self-authorship.

The 'problem of many hands' implies that AI governance should:

✓ Correct. The many-hands problem implies governance should target structures and incentive systems, because AI harm emerges from institutional aggregates, not individual bad actors.
❌ Incorrect. Problem of many hands: AI harm emerges from institutional aggregates, not individual bad actors. Governance should target structures and incentive systems that predictably produce harm.

The 'legitimacy gap' in AI values means:

✓ Correct. Legitimacy gap: AI developers have the most power and the least democratic legitimacy. Affected communities have the most legitimate claim and the least institutional power. This mismatch is the central values governance problem.
❌ Incorrect. Legitimacy gap: actual power (AI developers) ≠ democratic legitimacy (affected communities). This mismatch, in which those with the most power have the least legitimacy, is the central problem in AI values governance.

The power-responsibility principle for AI builders states:

✓ Correct. Power-responsibility: rare capability carries commensurate moral weight. AI builders aren't just executing instructions; they're agents making significant societal choices. This responsibility can't be fully outsourced.
❌ Incorrect. Power-responsibility: rare technical capability to build consequential AI carries moral weight that can't be fully outsourced. Builders are agents making significant societal choices, not just workers following instructions.