When U.S. and Soviet negotiators concluded START I in July 1991, verification rested on a deceptively simple premise: missiles are physical objects. Inspectors could count SS-18 silos from satellites, walk into Votkinsk to watch rail-mobile launchers, weigh warheads on certified scales. The treaty's 700-page annex of definitions worked because thermonuclear warheads cannot be e-mailed. Thirty years later, the code that controls an autonomous targeting system can cross a border in milliseconds.
Most successful arms control treaties have depended on what practitioners call national technical means, primarily reconnaissance satellites, supplemented by on-site inspections. Both methods assume the object being counted has a stable physical signature: a missile's heat bloom, a warhead's distinctive re-entry vehicle shape, a submarine's acoustic profile.
AI systems shatter that assumption. A trained neural network is, at its core, a large array of floating-point numbers. It occupies no unique geography. One model can be copied infinitely at near-zero marginal cost. A state that agrees to limit "autonomous lethal targeting systems" could comply on paper while distributing identical weights across civilian cloud infrastructure the moment inspectors depart.
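A minimal sketch of why a model has no stable signature to count, assuming a toy four-number "model" and a hypothetical `fingerprint` helper (neither comes from any treaty proposal). It shows that a copy is bit-for-bit identical to the original, while a negligible change to one parameter yields a different fingerprint, so a registry keyed on file hashes could confirm exact copies but would miss a trivially altered one.

```python
import hashlib
import struct

def fingerprint(weights):
    """SHA-256 over the weights packed as little-endian float32 values."""
    packed = struct.pack(f"<{len(weights)}f", *weights)
    return hashlib.sha256(packed).hexdigest()

# Toy stand-in for a trained model; real systems hold billions of such numbers.
weights = [0.25, -1.5, 3.0, 0.125]

copy = list(weights)        # a copy carries exactly the same numbers
tweaked = list(weights)
tweaked[0] = 0.2500001      # a negligible change to a single parameter

print(fingerprint(copy) == fingerprint(weights))     # exact copy: same fingerprint
print(fingerprint(tweaked) == fingerprint(weights))  # altered copy: different fingerprint
```

The design point is limited: hashing identifies a specific file, not a capability, which is one reason counting rules built for physical objects transfer poorly.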
The 2019 U.S. National Intelligence Strategy explicitly named adversarial AI as an emerging threat but offered no verification mechanism. The OECD AI Principles (May 2019), endorsed by 42 countries, established norms around transparency and human oversight — but contained zero enforcement provisions. The gap between aspiration and verification is the central challenge of AI arms control.
The same computer vision model that guides an autonomous drone can sort medical images for cancer detection. The same reinforcement-learning algorithm that trains a cyber-intrusion agent trains a robotic surgery system. Whereas highly enriched uranium can be physically distinguished from reactor fuel, no detectable physical difference separates a "weapons" AI from a "civilian" AI.
Three historical regimes offer partial lessons. The Nuclear Non-Proliferation Treaty (1968) succeeded partly because fissile material production requires massive, visible infrastructure (enrichment cascades, reactors) that satellites can detect. The Chemical Weapons Convention (1993) relies on declared facilities and short-notice inspections; it has largely worked for states parties because industrial-scale chemical weapons production leaves physical evidence. The Biological Weapons Convention (1972) is the cautionary tale: it banned an entire class of weapons but has no verification protocol at all, and multiple states violated it massively, most notably the Soviet Union through its Biopreparat program, disclosed after 1991.
AI is closer to bioweapons than to nuclear weapons in its verification difficulty. Both involve dual-use knowledge; both can be developed in small, concealed facilities; both lack a single detectable physical signature. The BWC failure is directly instructive.
Nuclear arms control counts objects. Chemical weapons control monitors processes. Biological weapons control attempts — and largely fails — to monitor knowledge. AI arms control must confront the same knowledge-control problem, but at digital speed and global scale.
A second fundamental challenge is temporal. Cold War arms control assumed that decision-making occurred on human timescales — minutes at minimum, typically hours or days. The 1963 Hotline Agreement between Washington and Moscow assumed humans would be in the loop during crises.
AI systems can execute kill chains in milliseconds. In August 2020, the final round of the U.S. Defense Advanced Research Projects Agency's AlphaDogfight Trials ended with an AI agent defeating an experienced F-16 pilot 5-0 in simulated combat. The winning agent reacted approximately eight times faster than a human. A conflict involving autonomous systems operating at machine speed could escalate from first contact to existential exchange before any human has an opportunity to intervene, let alone invoke crisis communication protocols established for human decision-makers.
This compresses the window arms control must protect. Treaties designed around human reaction times may be functionally irrelevant in a conflict conducted by autonomous systems.
You are a policy analyst advising a fictional international working group on AI arms control. Your task is to propose at least one novel verification mechanism for a hypothetical treaty limiting autonomous lethal AI systems — and to pressure-test it against the dual-use and speed-of-action challenges covered in Lesson 1.
Delegates to the Convention on Certain Conventional Weapons Group of Governmental Experts on Lethal Autonomous Weapons Systems filed into the Palais des Nations for what campaigners had hoped would be a breakthrough session. The Campaign to Stop Killer Robots, a coalition of 270 NGOs, had been lobbying for a legally binding instrument since 2012. After eleven years of meetings, the GGE produced another non-binding set of guiding principles. Russia and the United States blocked a mandate to negotiate a treaty. The session ended with a press release.
The Convention on Certain Conventional Weapons (CCW), adopted in 1980, restricts or prohibits weapons deemed excessively injurious or indiscriminate. Its Protocol IV banned blinding laser weapons in 1995, a rare preemptive prohibition; the most recent CCW protocol, Protocol V on explosive remnants of war, dates from 2003. CCW states began discussing lethal autonomous weapons systems (LAWS) in informal expert meetings in 2014, and a formal Group of Governmental Experts (GGE) has met periodically since 2017. LAWS are defined roughly as systems that select and engage targets without human intervention.
The GGE has produced a set of eleven guiding principles, agreed in 2019, that affirm international humanitarian law applies to LAWS and that human responsibility must be maintained. What the GGE has not produced: a definition of LAWS that all states accept, a prohibition on any category of autonomous weapon, or any verification mechanism.
The fundamental impasse is structural. The CCW operates by consensus — any state party can block progress. Russia has argued that autonomous systems are simply automated weapons governed by existing IHL. The United States has resisted a ban it fears would constrain advantageous U.S. systems while being easily evaded by adversaries. China has publicly called for a ban on autonomous systems that "kill" independently but resisted any definition that would apply to its own programs.
No multilateral forum has agreed on what "autonomous" means in "lethal autonomous weapons system." Does a Phalanx close-in weapon system — which automatically engages incoming missiles — qualify? What about a loitering munition with a 30-minute time-on-station limit? The inability to agree on scope has paralyzed every normative effort.
In November 2023, the United Kingdom convened the AI Safety Summit at Bletchley Park, the first major multilateral gathering focused specifically on frontier AI risks. Twenty-eight countries, including the United States and China, together with the European Union, signed the Bletchley Declaration, acknowledging that frontier AI poses potential catastrophic risk and committing to share information on safety.
China's signature was diplomatically significant — Beijing had been absent or obstructive at previous technology governance forums. However, the Declaration is explicitly non-binding. It contains no commitments to limit development, share model weights, or submit to inspection. A follow-on summit was held in Seoul in May 2024, producing the Seoul Statement, which added language on government-industry cooperation but again stopped short of binding obligations.
The Bletchley process represents the first attempt to bring major AI-developing states together around shared safety concerns — but critics note it focuses on "frontier AI" risks (primarily large language models and general-purpose AI) rather than specifically military applications or autonomous weapons.
Separately from multilateral forums, the United States in February 2023 issued a Political Declaration on Responsible Military Use of Artificial Intelligence and Autonomy. By late 2023, 52 states had endorsed it. The Declaration commits signatories to develop AI consistent with international humanitarian law, maintain human judgment in nuclear command and control, and take steps to minimize unintended engagements.
The Declaration is not legally binding. It has no verification mechanism. Notable non-signatories include Russia and China. Its practical impact remains unclear, but it represents the most explicit multilateral statement of principle on military AI to date, and it explicitly addresses the nuclear nexus — a critical area given concerns about AI integration into early-warning systems.
| Framework | Year | Binding? | Verification? | Key States Absent |
|---|---|---|---|---|
| CCW GGE Guiding Principles | 2019 | No | No | — |
| OECD AI Principles | 2019 | No | No | Russia, China |
| U.S. Political Declaration | 2023 | No | No | Russia, China |
| Bletchley Declaration | 2023 | No | No | Russia |
| Seoul Statement | 2024 | No | No | — |
Every existing multilateral AI or LAWS framework is non-binding and contains no verification mechanism. This is not an oversight — it reflects the genuine difficulty of verifying compliance and the unwillingness of major military powers to accept binding constraints on capabilities they are actively developing.
You are a researcher preparing a policy brief on why AI arms control efforts have stalled. The AI advisor will play devil's advocate — defending existing frameworks as "better than nothing." Your task is to diagnose the specific structural failures that have prevented binding agreements, using the CCW GGE and Bletchley processes as primary case studies.
When the Biden administration issued Executive Order 14110 on AI safety in October 2023, buried in its 111 pages was a requirement that cloud providers report when foreign customers rent computing clusters above a threshold level. The logic was precise: training frontier AI models requires massive compute. You cannot train GPT-4-scale systems on a laptop. If you control the compute, you control access to the capability — without needing to inspect the model itself.
The most technically sophisticated arms control proposal currently circulating in policy circles is compute governance — the idea that since training advanced AI requires specialized semiconductor hardware (primarily Nvidia H100-class GPUs and TPUs), controlling the manufacture, sale, and operation of that hardware is more tractable than controlling software.
The logic has three steps. First, frontier model training is compute-constrained: current leading models require clusters of thousands of specialized chips running for months. Second, these chips are manufactured by a small number of firms (primarily TSMC in Taiwan) using equipment from an even smaller number of suppliers (ASML in the Netherlands dominates extreme ultraviolet lithography). Third, the U.S. export control regime, expanded in October 2022 and October 2023, already restricts export of advanced AI chips to China — demonstrating that compute governance has partial precedent.
Researchers at the Centre for the Governance of AI and Oxford's Future of Humanity Institute have proposed embedding on-chip monitoring hardware in advanced AI accelerators — essentially a trusted reporting module that logs compute usage and reports to an international registry without revealing the content of computations. This would allow verification of whether a state is training models above a threshold size, analogous to IAEA safeguards on nuclear material.
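One way to picture a reporting module that logs usage without exposing the computation itself is a hash-chained log, sketched below. This is an illustration under stated assumptions, not the researchers' design: the record fields (`chip_id`, `hours`, `flop`) and the functions `append_record` and `verify` are hypothetical, and the sketch omits the secure hardware and cryptographic signing a real module would need.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash anchoring the start of the chain
FIELDS = ("chip_id", "hours", "flop", "prev")

def _digest(body):
    """Hash a record body using a stable JSON serialization."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_record(log, chip_id, hours, flop):
    """Append a usage record whose hash also covers the previous record's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"chip_id": chip_id, "hours": hours, "flop": flop, "prev": prev_hash}
    log.append({**body, "hash": _digest(body)})

def verify(log):
    """Return True only if every record is intact and correctly linked."""
    prev_hash = GENESIS
    for rec in log:
        body = {k: rec[k] for k in FIELDS}
        if rec["prev"] != prev_hash or rec["hash"] != _digest(body):
            return False
        prev_hash = rec["hash"]
    return True

# Hypothetical usage: three reporting periods for one accelerator.
log = []
append_record(log, "chip-001", 24, 1e21)
append_record(log, "chip-001", 24, 2e21)
append_record(log, "chip-001", 12, 5e20)
print(verify(log))       # untouched log

log[1]["flop"] = 1.0     # understate one period after the fact
print(verify(log))       # the edited record no longer matches its hash
```

The log records only aggregate quantities, so a registry could check totals against a threshold without seeing model weights or training data. Whether such a module could be made tamper-resistant in hardware is a separate, open question.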
Compute governance can potentially monitor training — the most resource-intensive phase. But deploying an already-trained model (inference) requires far less compute. A state could train a system covertly before controls are implemented, then run it indefinitely on modest hardware. The October 2022 U.S. chip export restrictions came years after China had already acquired significant compute stockpiles.
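The size of the gap between training and inference can be roughed out with a common rule of thumb for dense transformer models: training costs about 6 floating-point operations per parameter per training token, and generating one token costs about 2 per parameter. The sketch below applies that approximation. The model size, token count, and the `REPORTING_THRESHOLD_FLOP` value are illustrative assumptions, not figures from the text above.

```python
# Rule-of-thumb estimates for dense transformers (an approximation, not a measurement).
REPORTING_THRESHOLD_FLOP = 1e26  # illustrative threshold, assumed for this sketch

def training_flop(params, tokens):
    """Approximate total training compute: about 6 FLOP per parameter per token."""
    return 6 * params * tokens

def inference_flop_per_token(params):
    """Approximate compute to generate one token: about 2 FLOP per parameter."""
    return 2 * params

params = 1e12   # hypothetical model with one trillion parameters
tokens = 2e13   # hypothetical training set of twenty trillion tokens

train = training_flop(params, tokens)
per_token = inference_flop_per_token(params)

print(f"training compute:     {train:.2e} FLOP")
print(f"inference, per token: {per_token:.2e} FLOP")
print(f"above threshold:      {train > REPORTING_THRESHOLD_FLOP}")
print(f"tokens of inference per training run: {train / per_token:.2e}")
```

Under these assumptions a single training run costs as much as generating tens of trillions of tokens, which is why a threshold that catches training says little about a model that is already trained and deployed.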
A second approach, advocated by legal scholars and humanitarian organizations, focuses not on capabilities but on specific prohibited behaviors. Rather than banning "autonomous AI" (difficult to define), a treaty could prohibit specific applications: autonomous targeting of humans without human confirmation, AI systems in nuclear launch chains, or autonomous cyber attacks on critical civilian infrastructure.
This approach draws on the Chemical Weapons Convention model — rather than banning all chemistry, the CWC bans specific chemicals and their weaponized use. Behavioral red lines are easier to define than capability thresholds and easier to attribute when violated.
The International Committee of the Red Cross has advocated specifically for a rule requiring human control over the decision to use force against persons — effectively banning fully autonomous lethal targeting. This framing aligns with international humanitarian law's requirement of distinction (combatants from civilians) and proportionality, which critics argue autonomous systems cannot reliably perform.
The concept of meaningful human control (MHC) has emerged as a possible normative anchor. Coined by the NGO Article 36, promoted by the Campaign to Stop Killer Robots, and elaborated by scholars including Heather Roff and Richard Moyes, MHC requires that a human operator understands the context of an attack, can intervene to prevent it, and retains moral and legal responsibility for the outcome.
MHC is intentionally vague enough to command broad support while precise enough to exclude "fire-and-forget" fully autonomous systems. States including the Netherlands and Austria have endorsed MHC language in CCW negotiations. The United States has resisted it, arguing that "appropriate human judgment" — its preferred formulation — is more operationally realistic and avoids prohibiting beneficial automation such as missile defense.
The practical gap between MHC and U.S. policy is smaller than it appears: U.S. Department of Defense Directive 3000.09, revised in January 2023, requires senior official approval for any autonomous weapons system that falls outside defined parameters — but explicitly permits autonomous functions within those parameters, which critics argue is functionally indistinguishable from machine-speed autonomous engagement.
A third category, more modest in ambition, focuses on confidence-building measures (CBMs) — steps short of binding limits that reduce the risk of accidental conflict. These include: notification when deploying autonomous systems in proximity to adversary forces; shared incident reporting channels for AI-related military accidents; agreements to maintain human control over nuclear command and control regardless of AI integration elsewhere; and agreed technical standards for autonomous system behavior in contested environments.
CBMs have a strong historical precedent. The 1972 U.S.-Soviet Incidents at Sea Agreement reduced dangerous naval encounters without limiting either fleet. The 1971 Accident Measures Agreement established procedures for notifying the other side of nuclear incidents. Analogous AI-focused CBMs could reduce escalation risk even without constraining development.
AI arms control proposals range from maximally ambitious (binding ban on all autonomous lethal systems, verified by compute monitoring) to modestly practical (notification agreements and shared incident channels). The more ambitious the proposal, the more verification it requires — and the less likely major powers are to accept it. The pragmatic question is where on this spectrum meaningful risk reduction can actually be achieved.
You are advising a state proposing a multilateral compute governance regime at an international AI safety forum. Your task is to design the key elements of such a regime — thresholds, monitoring mechanisms, and enforcement — while addressing the inference problem and the objection that compute controls are technologically nationalist rather than genuinely safety-oriented.
On the sidelines of the APEC summit in November 2023, Presidents Biden and Xi met for four hours at the Filoli Estate. Among the outcomes: an agreement to resume military-to-military communications suspended after Nancy Pelosi's Taiwan visit, and a commitment to convene government-to-government talks on AI risk. It was the first explicit bilateral commitment by the world's two leading AI powers to discuss the technology's risks. The talks were not arms control. They were not binding. But they were a beginning.
Following the Biden-Xi commitment, the first formal U.S.-China AI government-to-government talks were held in Geneva in May 2024. The State Department described them as "substantive and candid." No joint statement was issued. No agreements were announced. But the meeting represented the first structured bilateral exchange on AI safety between the two governments — analogous to the earliest Soviet-American strategic stability talks before any treaties existed.
The structural context is deeply challenging. U.S.-China relations are characterized by strategic competition, mutual distrust over technology transfer, and active military competition in the Taiwan Strait and South China Sea. The October 2022 U.S. chip export controls were explicitly designed to degrade China's AI capabilities — making it difficult to simultaneously ask China to participate in cooperative AI governance arrangements.
China's domestic AI governance framework — including its 2023 Generative AI Regulations and the Cyberspace Administration of China's algorithm governance rules — is sophisticated but focuses primarily on domestic content control and social stability rather than international security. Beijing's stated position favors "multilateral" rather than "U.S.-led" AI governance, making it resistant to frameworks where Washington sets the terms.
The history of U.S.-Soviet arms control offers an instructive parallel. The first strategic stability talks began in 1969; the first treaty (SALT I) was signed in 1972; the first treaty with real reductions (INF) came in 1987. The process took eighteen years from initial dialogue to actual reductions, and it proceeded through multiple crises, betrayed agreements, and periods of complete breakdown.
Several specific nuclear precedents are relevant. The 1963 Limited Test Ban Treaty was achievable before comprehensive verification was possible because it banned only tests in the atmosphere, in outer space, and under water, which each side could detect with its own radiation sensors and other remote monitoring. It left out underground tests, which could not then be reliably verified. Analogously, some AI behaviors might be verifiable (training above a compute threshold, deploying AI in nuclear C2) before general AI capability limits are tractable.
The 1972 Anti-Ballistic Missile Treaty succeeded partly because both sides calculated that unconstrained ABM deployment would be mutually destabilizing — a shared strategic interest in stability that transcended political tensions. A similar logic may apply to AI: both the U.S. and China have reason to fear accidental escalation from autonomous systems misidentifying threats. That shared interest in avoiding inadvertent war is the foundation any AI arms control must build on.
U.S.-Soviet arms control succeeded partly because both sides developed what scholars call a "verification culture" — shared technical understanding, agreed data exchanges, and mutual confidence that violations would be detected. U.S.-China AI talks are starting from a baseline of near-zero mutual transparency on military AI programs. Building verification culture takes time and usually requires small initial agreements that build confidence before ambitious ones become possible.
Most scholars working in this space converge on a realistic near-term agenda that prioritizes achievable risk reduction over transformative arms control. The key elements:
1. Nuclear AI firewall. A bilateral or multilateral agreement that AI systems will not be integrated into nuclear launch authorization chains — and that early-warning systems will maintain human confirmation requirements. This is arguably the most urgent risk because AI false positives in early-warning could trigger inadvertent nuclear launch. The U.S. Political Declaration already endorsed this principle; getting Russia and China to agree bilaterally would be the next step.
2. Incident reporting channel. A dedicated channel for rapid communication when AI-related military incidents occur — comparable to the 1963 Hotline but specifically scoped to autonomous system incidents. This addresses the speed-of-action problem by ensuring human decision-makers can communicate even when systems are operating at machine speed.
3. Shared definitions working group. An ongoing technical forum to develop shared definitions of autonomous systems, agreed taxonomies of autonomy levels, and common vocabulary for negotiations — addressing the definitional paralysis that has stalled CCW talks for a decade.
4. Compute transparency pilot. A voluntary transparency measure in which major AI-developing states report compute usage above a threshold to a neutral registry — not mandatory limits, but data collection that could eventually support verification.
Effective AI arms control — if it comes — will likely follow the nuclear model: decades of iterative dialogue, small initial agreements, and treaty architecture built on verified confidence. The window for early agreements may be narrowing as autonomous systems are deployed and strategic interests become more entrenched. The urgency is real; so is the complexity. The gap between them defines the central challenge of AI governance for the next generation of policymakers.
You are a U.S. State Department official in a backchannel dialogue with a Chinese counterpart. Your task is to draft language for a bilateral joint statement committing both states to maintain human confirmation requirements in nuclear early-warning systems — without allowing either side to verify the other's internal systems or concede any strategic position.