In the autumn of 2011, virologist Ron Fouchier at Erasmus Medical Center in Rotterdam submitted a paper to Science describing how his team had engineered H5N1 avian influenza to transmit between ferrets via respiratory droplets. A parallel team led by Yoshihiro Kawaoka at the University of Wisconsin submitted similar findings to Nature. Both papers were immediately flagged by the U.S. National Science Advisory Board for Biosecurity, not for fraud but for being too dangerous to publish in full. The research was legitimate science aimed at understanding pandemic risk. The methods were also a potential blueprint for a catastrophic biological weapon.
The debate that followed, over whether the journals should redact the methods sections, was the most visible public confrontation with dual-use research of concern in the modern era. Both papers were eventually published, largely intact.
The concept predates modern science. The same metallurgy that forged plows forged swords. The Haber-Bosch process, developed by Fritz Haber and Carl Bosch in the early twentieth century, enabled nitrogen fixation that feeds roughly half of humanity today, and also enabled industrial-scale production of explosives and chemical weapons. Haber personally supervised Germany's first large-scale chlorine gas attacks at Ypres in April 1915.
In contemporary policy, dual-use research of concern (DURC) refers specifically to research that, while conducted for legitimate purposes, could be misused to threaten public health, safety, security, or other significant values. The U.S. government formalized this definition in its 2012 DURC policy, requiring institutional review for certain life-sciences research categories.
The challenge is not that bad actors do bad research. It is that good actors doing good research produce knowledge that bad actors can exploit. This asymmetry is the core of the dual-use problem.
Australian researchers Ron Jackson and Ian Ramshaw, working on a mouse contraceptive, inadvertently created a hyper-lethal mousepox virus by inserting the IL-4 gene. The modified virus killed mice that had been vaccinated against the normal strain. The implications for smallpox (a pathogen with a known genome and extinct in the wild) were immediately alarming. The researchers published their findings in Journal of Virology, later stating they had not considered the dual-use implications before submission.
Prior to AI, dual-use knowledge posed a diffusion problem: dangerous methods existed in academic papers and specialist communities, but accessing, understanding, and operationalizing them required significant technical expertise and often expensive laboratory infrastructure. The barrier was not the knowledge itself but the human capital required to act on it.
Large language models and AI-assisted design tools compress this barrier. In 2023, a study commissioned by the Johns Hopkins Center for Health Security found that an AI chatbot could provide "meaningful uplift" to individuals seeking to synthesize dangerous pathogens, not by providing information unavailable in literature, but by providing synthesis, guidance, and troubleshooting in a conversational format that dramatically reduced the expertise threshold. The study was partially redacted before publication.
Similar dynamics apply to cyber intrusion tooling, disinformation generation, materials science for improvised weapons, and autonomous system design. In each case, AI does not necessarily introduce new knowledge; it democratizes access to existing knowledge in ways that legacy export controls and classification regimes were not designed to handle.
Not all dual-use risk is equivalent. Policymakers and researchers have developed rough taxonomies. At one end: basic science with theoretical misuse potential (e.g., published protein folding data). At the other: specific synthesis routes for select agents with no plausible civilian application in the form described.
The 2017 Nunn-Lugar Cooperative Threat Reduction program extension explicitly recognized AI-enabled biology as a new threat vector, funding detection research at DOE national laboratories. DARPA's Safe Genes program, launched the same year, developed safeguards for gene-editing technologies that are inherently dual-use.
For AI specifically, the spectrum runs from general-purpose language models that can answer chemistry questions, through specialized models trained on restricted literature, to purpose-built AI systems for molecular design, such as those used in legitimate pharmaceutical discovery and, if the objective function is reversed, potentially in bioweapons design.
DeepMind's AlphaFold 2, released publicly in 2021, solved the protein structure prediction problem that had stumped biology for fifty years. The database now contains over 200 million predicted structures. The biosecurity community immediately noted that the same tool enabling vaccine development could assist in designing novel protein-based toxins or engineering pathogen proteins for enhanced virulence or immune evasion. No restrictions were placed on public access.
Pre-AI dual-use governance relied on several interlocking mechanisms: export controls (the Wassenaar Arrangement, Export Administration Regulations) restricting transfer of controlled technologies; select agent regulations limiting who may work with dangerous pathogens; pre-publication review by institutional biosafety committees; and classification of the most sensitive government-funded research.
Each mechanism assumed a relatively slow diffusion of technical knowledge through a credentialed professional community. AI disrupts that assumption at multiple points simultaneously: by making existing published literature more actionable, by enabling non-credentialed actors to navigate technical domains, and by potentially generating novel dangerous knowledge rather than merely retrieving existing knowledge.
Explore the concept of dual-use research of concern with a focus on how AI changes the traditional uplift calculus. Consider the Fouchier case, AlphaFold, and the expertise-threshold problem.
In March 2022, researchers at Collaborations Pharmaceuticals published a paper in Nature Machine Intelligence describing an experiment they had conducted as a thought exercise for a biosecurity conference. Using their AI drug-discovery model, MegaSyn, they inverted the objective function: rather than optimizing for low toxicity, they tasked the model with identifying molecules with high toxicity potential. Within six hours, the model had generated approximately 40,000 candidate molecules. Many scored higher on predicted toxicity than known chemical warfare agents, including VX nerve agent. The team declined to publish the full output. The paper's conclusion was stark: the same AI infrastructure used to discover life-saving drugs could be trivially repurposed for offensive chemistry.
Modern AI drug discovery platforms (including Schrödinger's physics-based modeling suite, Recursion Pharmaceuticals' phenomics platform, and academic tools like RoseTTAFold) operate on a core paradigm: given a target protein and a desired interaction profile, generate candidate molecules likely to bind and produce a desired effect. The AI learns from vast databases of known molecule-effect relationships.
The dual-use problem emerges from a simple observation: the model does not know or care whether the desired effect is therapeutic or lethal. "Inhibit this receptor" and "maximally disrupt this receptor" are both valid query framings. The molecular design process is the same. The objective function is what differs.
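The point that only the objective differs can be shown with a deliberately abstract sketch. Nothing here models chemistry: the `Candidate` records, their `benefit` and `hazard` scores, and the `hazard_weight` parameter are all hypothetical placeholders, chosen only to show that the selection code is unchanged when the sign of one term flips.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    benefit: float  # hypothetical predicted desired-effect score
    hazard: float   # hypothetical predicted hazard score


def select_best(candidates, hazard_weight):
    """Return the candidate maximizing benefit + hazard_weight * hazard.

    The search procedure is identical for any weight; only the sign of
    hazard_weight decides whether hazard is penalized or rewarded.
    """
    return max(candidates, key=lambda c: c.benefit + hazard_weight * c.hazard)


# Made-up scores for illustration only.
POOL = [
    Candidate("A", benefit=0.9, hazard=0.1),
    Candidate("B", benefit=0.7, hazard=0.8),
    Candidate("C", benefit=0.4, hazard=0.2),
]

penalized_pick = select_best(POOL, hazard_weight=-1.0)  # hazard counts against
inverted_pick = select_best(POOL, hazard_weight=+1.0)   # hazard counts in favor
```

With these placeholder numbers the two calls return different candidates even though `select_best` itself never changes, which is why governance attention tends to fall on who sets the objective rather than on the optimizer.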
Beyond small molecules, AI is transforming protein engineering. Tools like ProteinMPNN and RFdiffusion (both from the Baker Lab at University of Washington) can design novel proteins with specified structural and functional properties. In 2023, the Baker Lab demonstrated de novo protein binders for influenza hemagglutinin, a major advance for antivirals. The identical approach could be used to design proteins that enhance pathogen binding to human receptors.
Gain-of-function (GOF) research deliberately enhances pathogen characteristics (transmissibility, virulence, immune evasion) to study pandemic risk and develop countermeasures. The NIH imposed a funding moratorium on certain GOF research in 2014, lifted in 2017 with new oversight requirements under the P3CO (Potential Pandemic Pathogen Care and Oversight) framework. AI accelerates GOF by enabling computational prediction of which mutations would achieve desired functional changes before any laboratory work is done, expanding what can be explored at near-zero cost before expensive wet-lab confirmation.
Even a perfectly designed biological agent requires physical synthesis. For pathogens, this means synthesizing the nucleic acid sequences that encode the agent. The commercial DNA synthesis industry has grown dramatically, with companies like Twist Bioscience, IDT, and Genscript capable of producing long DNA sequences on demand.
In 2022, the Nuclear Threat Initiative (NTI) published an assessment finding that biosecurity screening practices varied widely across synthesis providers and that the International Gene Synthesis Consortium's voluntary screening protocol covered only a fraction of global capacity. AI compounds this risk by generating optimized sequences that might evade signature-based screening: sequences with the same functional effect as a dangerous pathogen gene but different enough in sequence to avoid detection flags.
The Biden administration's Executive Order 14110 (October 2023) addressed this directly, requiring federal agencies to develop minimum standards for nucleic acid synthesis screening, the first federal policy intervention specifically targeting AI-enabled bioweapons risk in the synthesis supply chain.
The biosecurity community has not been passive. The Johns Hopkins Center for Health Security, NTI Bio, and SecureBio (a nonprofit dedicated to AI biosecurity) have each published frameworks for evaluating AI biosecurity risk. SecureBio's Biological Hazard Assessment for AI (BHA-AI) framework, developed with former intelligence community officials, evaluates AI systems across five dimensions: knowledge provision, task completion, access facilitation, quality uplift, and speed uplift.
DARPA's Biological Technologies Office has funded work on metagenomic monitoring: environmental surveillance systems capable of detecting engineered pathogens in real time. The logic: if AI makes creation easier, detection infrastructure must become faster and broader.
Leading AI laboratories have begun implementing domain-specific safeguards. Anthropic, OpenAI, and Google DeepMind have each published biosecurity policies restricting detailed synthesis guidance for dangerous pathogens. Whether these policies are effective under adversarial prompting remains an active area of red-team research.
Traditional screening compares submitted sequences against known dangerous sequences using BLAST (Basic Local Alignment Search Tool). AI-designed sequences can be functionally equivalent to dangerous agents while being sufficiently different in sequence to evade BLAST matches. In 2023, MIT researchers demonstrated this in a controlled study, designing functional analogs to toxin-encoding genes that passed standard screening. Their paper argued for AI-based screening to counter AI-based evasion.
Engage with the specific dual-use risks from AI tools in biology: drug discovery platforms, protein design, and the synthesis supply chain. Apply the concept of objective-function inversion and screening evasion.
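To make "signature-based" concrete, here is a minimal sketch of similarity screening against a watchlist. It is not BLAST, which uses seeded local alignment; it swaps in a much simpler k-mer containment score. The watchlist entry is an arbitrary placeholder string with no biological meaning, and `k` and `threshold` are hypothetical values chosen for the example.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(query, reference, k=8):
    """Fraction of the reference's k-mers that also occur in the query."""
    ref = kmers(reference, k)
    if not ref:
        return 0.0
    return len(ref & kmers(query, k)) / len(ref)


def screen(order, watchlist, threshold=0.5, k=8):
    """Return the names of watchlist entries the order resembles."""
    return [name for name, ref in watchlist.items()
            if containment(order, ref, k) >= threshold]


# Arbitrary placeholder data, not a real sequence of concern.
WATCHLIST = {"placeholder_entry": "ACGTTGCAAGGCTTAACCGGATCGATTACG"}

embedded_order = "TTTT" + WATCHLIST["placeholder_entry"] + "GGGG"
unrelated_order = "GATTACA" * 4

flagged = screen(embedded_order, WATCHLIST)
not_flagged = screen(unrelated_order, WATCHLIST)
```

A score like this only measures shared substrings, so it says nothing about what a sequence does. That gap between string similarity and function is the limitation the paragraph above describes, and it is the argument for function-aware screening.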
In July 2023, cybersecurity researchers at SlashNext documented WormGPT, a large language model fine-tuned on malware data and made available on criminal forums for a monthly subscription. Unlike frontier models with safety training, WormGPT had no content restrictions and would assist in writing malicious code, crafting business email compromise attacks, and advising on exploitation techniques. Shortly after, a second tool, FraudGPT, appeared on Telegram offering similar capabilities. Both were advertised as enabling "everything you feared ChatGPT could do." These were not sophisticated nation-state tools; they were commoditized, subscription-based, and available to low-skill actors.
The cybersecurity dual-use problem is the oldest in the digital domain. Penetration testing tools are identical in function to attack tools. Vulnerability research produces knowledge that can be used for patching or for exploitation. AI extends this tension across several dimensions:
Code generation: In 2023, researchers at CyberArk demonstrated that ChatGPT could assist in developing polymorphic malware, code that continuously rewrites itself to evade signature detection. The researchers noted that producing such code previously required advanced reverse-engineering expertise. The model compressed that requirement significantly.
Social engineering: AI dramatically reduces the cost and improves the quality of phishing content. A 2023 study by IBM Security found AI-generated phishing emails achieved click rates comparable to human-written ones while costing 96% less to produce. Spear-phishing, historically expensive because it required per-target research and writing, becomes near-free.
Vulnerability discovery: AI-assisted fuzzing and code analysis tools can discover software vulnerabilities faster than human analysts. DARPA's Cyber Grand Challenge (2016) demonstrated autonomous systems finding and patching vulnerabilities in real time. The same capability can find vulnerabilities to exploit rather than patch.
Stuxnet (discovered 2010, attributed to the U.S. and Israel) targeted Iranian nuclear centrifuges using four zero-day exploits, unprecedented in a single weapon. It demonstrated that cyber capabilities could produce kinetic physical effects on industrial infrastructure. AI-assisted vulnerability discovery and exploit development makes the technical sophistication required to build Stuxnet-equivalent weapons increasingly accessible, though the intelligence and targeting requirements remain high.
Autonomous weapons systems (military platforms capable of selecting and engaging targets without human intervention) represent a fundamentally different dual-use structure than biology or cyber. Here, the civil-military dual-use is not about the same tool being used for different purposes but about commercial AI capabilities being integrated into weapons platforms.
The commercial drone industry illustrates this concretely. DJI drones, designed for photography and inspection, have been extensively modified by non-state actors in conflicts including Ukraine and the Islamic State's operations in Iraq and Syria. In Ukraine (2022–present), both sides have used commercial FPV (first-person view) racing drones fitted with explosives as precision munitions. The AI-assisted stabilization, obstacle avoidance, and target-tracking capabilities developed for consumer use translate directly into weapons capabilities.
Computer vision systems trained for autonomous vehicle navigation can be repurposed for target identification. Reinforcement learning systems trained in simulation can control autonomous platforms. The DOD's Project Maven, launched in 2017, explicitly used commercial computer vision AI (initially Google TensorFlow) to analyze drone surveillance footage. It prompted a Google employee petition and Google's eventual withdrawal from the contract, illustrating the governance tension that arises when commercial AI firms encounter weapons applications.
AI applications in materials science represent a less-discussed but significant dual-use concern. GNoME (Graph Networks for Materials Exploration), released by Google DeepMind in November 2023, predicted the structures of 2.2 million new stable materials, 380,000 of which DeepMind assessed as immediately synthesizable. The research community celebrated this as a breakthrough for battery technology, superconductors, and other clean-energy applications.
The nuclear nonproliferation community noted a different implication: GNoME and similar tools could assist in identifying novel materials useful for weapons applications, including nuclear weapon components, radiation shielding, and advanced conventional explosives. The Nuclear Threat Initiative published an analysis in early 2024 noting that AI materials discovery tools were not subject to the same DURC review requirements as life-sciences research, representing a governance gap.
Export controls under the Export Administration Regulations (EAR) and the International Traffic in Arms Regulations (ITAR) address specific controlled materials but were not designed to regulate AI models that discover new materials. The question of whether an AI model that could identify weapons-relevant materials constitutes a controlled technology is legally unresolved.
The Wassenaar Arrangement on Export Controls for Conventional Arms and Dual-Use Goods and Technologies covers specific software and hardware categories. In 2019, participating states discussed but failed to reach consensus on including certain AI capabilities (particularly intrusion software and autonomous control technologies) in the control lists. The arrangement's consensus-based amendment process has not kept pace with AI development cycles, leaving significant gaps.
Each domain has its own governance architecture, of varying effectiveness:
Cybersecurity: The Vulnerabilities Equities Process (VEP) governs U.S. government decisions about whether to disclose discovered vulnerabilities to vendors or retain them for offensive use. AI-discovered vulnerabilities in principle fall within the VEP, but the process was not designed for the discovery volumes AI enables.
Autonomous weapons: DOD Directive 3000.09 (updated 2023) requires human judgment for lethal force decisions, but "human judgment" is interpreted broadly and the directive has no treaty-level international counterpart. The Campaign to Stop Killer Robots has sought a binding international instrument without success at the UN Group of Governmental Experts level.
Materials: The 2022 CHIPS and Science Act included provisions for AI research security but did not specifically address AI-enabled materials discovery.
Explore AI dual-use risks in cybersecurity, autonomous weapons, and materials science. Consider governance gaps: the Wassenaar Arrangement lag, DOD Directive 3000.09 limitations, and the absence of materials DURC review.
When the United Kingdom announced the AI Safety Institute (AISI) at the November 2023 Bletchley Park AI Safety Summit, its mandate explicitly included evaluating dual-use capabilities in frontier AI models, particularly biosecurity and cybersecurity risks. The UK became the first government to establish a permanent, funded body specifically tasked with pre-deployment evaluation of AI systems for dangerous dual-use outputs. The U.S. AI Safety Institute, announced days later under NIST, obtained similar mandate language. Both institutions immediately faced a structural problem: they had no legal authority to require companies to submit models for evaluation before deployment.
The life-sciences community's most mature dual-use governance instrument is the pre-publication review: institutional biosafety committee evaluation before sensitive research goes public. The model has documented limitations: it depends on researcher compliance, many countries lack comparable institutions, and the review process was not designed for the speed of AI-generated outputs.
The Responsible Disclosure norm in cybersecurity offers a parallel framework: researchers who discover vulnerabilities notify vendors privately before public disclosure, giving time for patches. AI laboratories have adapted this concept through staged deployment (releasing to trusted testers before general release) and capability thresholds (withholding or restricting specific capabilities identified as high-risk). GPT-4's system card, published in March 2023, is the most detailed public example: OpenAI described testing for CBRN (chemical, biological, radiological, nuclear) uplift and implementing mitigations before release.
In July 2023, seven leading AI companies (Anthropic, Google, Meta, Microsoft, Amazon, OpenAI, and Inflection) made voluntary commitments to the White House, including sharing safety information with governments and each other; investing in cybersecurity and insider threat safeguards; and conducting research on CBRN risk from AI. These commitments were voluntary, unverified, and unenforceable. They represented a political signal and a baseline, not a binding governance framework. Critics noted the absence of timelines, metrics, or enforcement mechanisms.
The Bureau of Industry and Security (BIS) at the Commerce Department has been the most active U.S. regulatory actor on AI dual-use governance. Its October 2022 chip export controls, which restricted the sale of advanced AI chips (NVIDIA A100, H100) and chip-manufacturing equipment to China, were explicitly framed as dual-use controls, targeting AI capabilities that could enable advanced weapons development and mass surveillance.
In October 2023, BIS tightened these controls significantly with a new rule that closed workarounds China had exploited, added performance thresholds to capture future chip generations, and extended controls to additional countries. The controls represent perhaps the most consequential dual-use AI governance action to date, not because they prevent China from developing AI, but because they significantly slow access to leading-edge training infrastructure.
The model for computing controls differs from traditional dual-use governance in an important way: it targets the means of production rather than the knowledge itself. You cannot embargo the understanding of transformer architectures, but you can restrict access to the chips needed to train competitive models at scale.
One of the most contested governance questions is whether AI safety research itself constitutes an information hazard. When researchers demonstrate that an AI model can be prompted to provide dangerous bioweapon synthesis guidance, publishing a detailed description of the jailbreak enables adversaries to replicate the approach. When they demonstrate synthesis-screening evasion, they publish a technique for evading screening.
The community has not reached consensus on this. A 2023 Center for Security and Emerging Technology (CSET) analysis identified three camps: those who favor full disclosure on the grounds that adversaries already know these techniques; those who favor full suppression; and those who advocate for coordinated disclosure, meaning sharing findings with AI developers, government, and affected industries before or instead of public publication.
The Biological Weapons Convention (BWC) review conferences have twice (2011, 2016) discussed but failed to adopt measures specifically addressing dual-use life-sciences research. The BWC has no verification mechanism and no standing scientific advisory body. AI's intersection with the BWC's scope has been raised in Expert Meetings but has not produced binding guidance.
Compute governance (controlling access to the hardware needed to train and run frontier AI systems) has emerged as a potentially tractable governance lever. The argument: unlike knowledge (which spreads freely), compute is physical, trackable, and manufactured by a small number of firms in a small number of locations. The current semiconductor supply chain runs through TSMC in Taiwan, with ASML in the Netherlands holding monopoly supply of EUV lithography machines essential for advanced chip production.
Proposals under active policy discussion include: Know Your Customer (KYC) requirements for large cloud computing providers, requiring verification of end-user identity and stated purpose for large training runs; on-chip governance mechanisms that could enable remote attestation of how chips are being used; and international monitoring of large compute clusters analogous to nuclear facility inspection regimes. None of these are currently implemented at scale.
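A KYC rule for "large training runs" needs some way to decide what counts as large. The sketch below shows one way a provider might estimate that. It relies on two assumptions that are not in the text above: the common rule of thumb that training a dense model costs roughly 6 × parameters × tokens operations, and a default cutoff of 10^26 operations borrowed from the reporting threshold in Executive Order 14110. The function names are hypothetical.

```python
# Assumption: illustrative default taken from the EO 14110 reporting threshold.
REPORTING_THRESHOLD_OPS = 1e26


def estimate_training_ops(n_parameters, n_tokens):
    """Approximate total training operations with the 6 * N * D rule of thumb."""
    return 6.0 * n_parameters * n_tokens


def requires_kyc_review(n_parameters, n_tokens,
                        threshold=REPORTING_THRESHOLD_OPS):
    """Return True when the estimated run size meets or exceeds the threshold."""
    return estimate_training_ops(n_parameters, n_tokens) >= threshold


# Hypothetical run sizes for illustration.
mid_size_run = requires_kyc_review(n_parameters=70e9, n_tokens=2e12)
very_large_run = requires_kyc_review(n_parameters=1e12, n_tokens=20e12)
```

The first hypothetical run comes out near 8.4 × 10^23 operations, well under the cutoff, while the second is about 1.2 × 10^26 and crosses it. A real scheme would also have to handle runs split across providers or accounts, which a per-request estimate like this cannot detect.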
The Institute for AI Policy and Strategy (IAPS) and Georgetown CSET have both published technical analyses of compute governance feasibility, generally concluding that the window for implementing effective compute-based controls is narrowing as chip designs proliferate and domestic manufacturing capacity grows in China.
Meta's release of the LLaMA model weights (February 2023, July 2023 with LLaMA 2) crystallized an unresolved governance debate. Once model weights are public, no subsequent safety measure, content filter, or access restriction applies: anyone with sufficient compute can fine-tune the model to remove safety training. Defenders of open release argue that closed models concentrate power dangerously and that open models enable safety research. Critics argue that releasing weights of capable models is an irreversible action whose dual-use risk cannot be contained. Both arguments have merit, and no international governance norm yet addresses this question.
Engage with the governance frameworks and gaps covered in Lesson 4. Evaluate mechanisms like the UK/US AISIs, BIS chip controls, voluntary commitments, and compute governance proposals against the real dual-use risks identified in earlier lessons.