Module 3 · Lesson 1

Voluntary Commitments and Their Limits

From the White House pledges of 2023 to the Frontier Model Forum: what industry promises actually deliver, and where they fall short.

When tech companies govern themselves, who watches the watchers?

On July 21, 2023, seven major AI companies (Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI) filed into the Roosevelt Room at the White House and emerged with a set of voluntary commitments. They pledged to share safety information, invest in cybersecurity, and watermark AI-generated content. The Biden administration called it a historic first step. Critics called it a press release. Both were partly right.

Five days later, four of the same companies (Anthropic, Google, Microsoft, and OpenAI) jointly announced the Frontier Model Forum, a membership body that would advance AI safety research and define best practices. The question hanging in the air was familiar from every other regulated industry. Voluntary commitments are easy to make in a room full of cameras, but what actually gets measured, and who enforces it?

The Structure of Voluntary Commitments

Industry self-regulation in AI follows patterns seen in previous technology waves. Companies offer voluntary commitments — formal public pledges about responsible behavior — to signal trustworthiness, pre-empt legislation, and establish norms that favor incumbents with the resources to implement them. The July 2023 White House pledges covered eight areas: internal red-teaming before deployment, sharing safety information across companies, investing in cybersecurity, implementing watermarking of AI-generated content, reporting vulnerabilities, supporting research on AI risk, prioritizing research on societal risks, and developing technical mechanisms to identify AI-generated content.

The commitments were notable for what they did not include: no independent audits, no enforcement body, no withdrawal consequences, and no specific metrics. A company could claim compliance by taking any marginal action in each area. OpenAI, for instance, had already been running red-teaming internally; the pledge required no new behavior on its part.

Key Pattern

Voluntary AI commitments consistently lack three elements that make regulatory frameworks effective: independent verification, quantitative benchmarks, and consequences for non-compliance. Without these, commitments function primarily as public relations instruments.
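
As a rough illustration of the pattern above, the three missing elements can be treated as a simple checklist. The sketch below is hypothetical: the `Commitment` class, its field names, and the example values are assumptions made for illustration, not an established scoring method. The example entry reflects this lesson's own reading of the July 2023 pledges.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Commitment:
    """A voluntary commitment described by the three elements in the Key Pattern."""
    name: str
    independent_verification: bool
    quantitative_benchmarks: bool
    consequences_for_noncompliance: bool

    def missing_elements(self) -> List[str]:
        """Return the labels of the elements this commitment lacks."""
        checks = {
            "independent verification": self.independent_verification,
            "quantitative benchmarks": self.quantitative_benchmarks,
            "consequences for non-compliance": self.consequences_for_noncompliance,
        }
        return [label for label, present in checks.items() if not present]


# Illustrative entry based on this lesson's description of the pledges.
white_house_2023 = Commitment(
    name="White House voluntary commitments (July 2023)",
    independent_verification=False,
    quantitative_benchmarks=False,
    consequences_for_noncompliance=False,
)

print(white_house_2023.missing_elements())
```

A checklist like this only records whether an element exists on paper. Judging whether verification is truly independent or a benchmark is meaningful still takes the kind of qualitative analysis the rest of this lesson describes.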

The Frontier Model Forum

Announced on July 26, 2023, the Frontier Model Forum (FMF) was formed by Anthropic, Google, Microsoft, and OpenAI. Its stated goals included advancing AI safety research, identifying best practices for responsible deployment, sharing knowledge with policymakers, and supporting efforts to address AI risks. In October 2023, the FMF and philanthropic partners announced an AI Safety Fund with more than $10 million in initial funding.

The FMF structure reflects the tension at the heart of industry self-regulation: the companies funding the forum are also the companies whose conduct it is meant to oversee. Board membership is controlled by paying members. Decisions require consensus among major players who are simultaneously fierce commercial competitors. By early 2024, the FMF had published working papers on red-teaming methodologies and model evaluations, but had not produced binding safety standards or sanctioned any member for unsafe practices.

Critics including AI researcher Timnit Gebru and organizations like the Algorithmic Justice League noted that the FMF focused almost exclusively on catastrophic and existential AI risks — the concerns most relevant to large foundation model developers — while sidelining near-term harms like algorithmic discrimination, labor displacement, and surveillance, which affect marginalized communities most acutely.

Voluntary Commitment: A public pledge by a company or industry group to adopt certain practices, not backed by legal obligation or third-party enforcement. Effectiveness depends entirely on reputational incentives and cultural pressure.

Frontier Model Forum: An industry body founded in July 2023 by Anthropic, Google, Microsoft, and OpenAI to promote AI safety research and develop best practices for frontier AI systems.

Self-Regulatory Organization (SRO): A non-governmental body that exercises regulatory authority over an industry or profession, often with delegated powers or informal authority. Examples include FINRA in finance and the Internet Watch Foundation online.

Precursors: When Tech Self-Regulation Has and Hasn't Worked

Tech industry self-regulation has a mixed record. The Children's Online Privacy Protection Act (COPPA) of 1998 was partly a response to industry failure to voluntarily protect children's data, demonstrating that voluntary measures eventually yield to legislation when harms mount. By contrast, payment card security (PCI-DSS) shows a case where industry self-regulation produced technically specific, consistently enforced standards — though only after data breaches created powerful liability incentives.

The Global Network Initiative (GNI), founded in 2008 after Google and Yahoo faced criticism for cooperating with censorship in China, provides perhaps the most instructive model for AI. GNI requires member companies to undergo independent assessments of their human rights practices every two years. Membership nonetheless remains small, audits are limited in scope, and GNI has no power to expel or sanction members beyond reputational consequences.

The pattern across sectors is consistent: voluntary commitments proliferate when regulation threatens, establish norms that favor incumbents, and tend to address symptoms rather than structural causes of harm. The more specific and independently verified the commitment, the more costly it becomes to maintain and the less likely companies are to adopt it.

Historical Note

In 1930, the Hollywood film industry created the Hays Code as a voluntary self-censorship regime to avoid federal regulation. It persisted for 38 years. The MPAA ratings system that replaced it in 1968 also began as voluntary self-regulation and remains so today. Both illustrate how voluntary regimes can achieve longevity and social legitimacy — but also how they can entrench industry values at the expense of public ones.

Lesson 1 Quiz

Voluntary Commitments and Their Limits · 5 questions

1. How many AI companies signed voluntary commitments at the White House in July 2023?

Correct. Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI — seven companies — signed the White House voluntary commitments on July 21, 2023.

Not quite. Seven companies signed: Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI on July 21, 2023.

2. Which of the following was NOT included in the July 2023 White House AI voluntary commitments?

Correct. Independent audits with binding findings were notably absent. The commitments lacked enforcement mechanisms, quantitative benchmarks, and consequences for non-compliance.

That was actually included. The commitments lacked independent audits with binding findings — a key gap critics identified immediately.

3. Which four companies founded the Frontier Model Forum in July 2023?

Correct. Anthropic, Google, Microsoft, and OpenAI founded the Frontier Model Forum on July 26, 2023, just five days after the White House commitments.

The Frontier Model Forum was founded by Anthropic, Google, Microsoft, and OpenAI on July 26, 2023.

4. Critics like Timnit Gebru argued the Frontier Model Forum's focus was too narrow because it primarily addressed:

Correct. Gebru and the Algorithmic Justice League argued the FMF's focus on catastrophic/existential risks neglected near-term harms like discrimination, labor displacement, and surveillance disproportionately affecting marginalized communities.

The critique centered on the FMF's focus on catastrophic/existential AI risks while neglecting near-term harms like algorithmic discrimination that affect marginalized communities more immediately.

5. The Global Network Initiative (GNI), founded in 2008, is most relevant to AI self-regulation because it:

Correct. GNI's biennial independent assessments represent the stronger end of voluntary commitment verification — yet membership remains small and assessments limited in scope, illustrating both the potential and the persistent limits of self-regulation.

GNI is instructive because it mandates independent assessments every two years — stronger than most voluntary frameworks — yet still has small membership and limited audit scope, showing both potential and limits of self-regulation.

Lab 1: Evaluating Voluntary Commitments

Analyze what makes AI voluntary commitments credible — or not

Your Task

You'll analyze real voluntary AI commitments using a framework that distinguishes credible pledges from performative ones. Consider the 2023 White House commitments, the Frontier Model Forum, and similar industry initiatives.

Discuss with the AI assistant what distinguishes a credible voluntary commitment from a public-relations exercise, and how you would design a stronger self-regulatory framework for frontier AI companies.

Start by asking: "What are the minimum requirements for a voluntary AI commitment to be considered credible?" — or jump straight to critiquing the White House 2023 pledges.

AI Governance Lab Assistant

Lab 1

Welcome to Lab 1. We're examining voluntary commitments in AI governance — specifically what separates credible pledges from performative ones. The July 2023 White House commitments are a perfect case study. What aspect would you like to explore first: the structure of the pledges, their enforcement gaps, or how you'd design something more robust?

Module 3 · Lesson 2

Internal Ethics Boards and Red-Teaming

Google's ATEAC dissolution, OpenAI's safety departures, and Meta's Responsible AI team disbanding: what happens when internal governance structures conflict with business imperatives.

Can a company's ethics board meaningfully constrain the company that funds it?

Google's Advanced Technology External Advisory Council (ATEAC) lasted little more than a week. Announced on March 26, 2019 as an external ethics board for Google AI, it collapsed after employees circulated petitions objecting to the inclusion of Heritage Foundation president Kay Coles James, whose organization had opposed LGBTQ rights. A second member resigned over drone warfare concerns. Google quietly announced on April 4 that the council was dissolved, with a statement acknowledging it had become "untenable."

The episode crystallized a structural problem that plagues internal AI governance everywhere: the company controls membership, agenda, funding, and the ability to disband the board. An ethics body that cannot survive its first controversy over membership has no credible authority over anything else.

The Anatomy of Internal AI Ethics Boards

Internal AI ethics boards proliferated between 2017 and 2022. Microsoft, IBM, Salesforce, SAP, and dozens of smaller companies created dedicated ethics teams or advisory councils. Their mandates typically included reviewing products for bias and fairness, advising on responsible deployment, and producing public principles documents. By 2022, the wave had crested — and in many cases reversed.

In November 2023, Meta disbanded its Responsible AI team, reassigning most members to generative AI product work. The team had been responsible for Meta's fairness toolkits and its Fundamental AI Research ethics work. The dissolution coincided with Meta's pivot toward aggressive generative AI product development. Meta did not publicly explain the decision.

At OpenAI, the departure of safety-oriented researchers became a recurring story. Ilya Sutskever, co-founder and chief scientist, departed in May 2024 after playing a role in the November 2023 board crisis that briefly ousted CEO Sam Altman. Jan Leike, who co-led OpenAI's Superalignment team (tasked with solving alignment for superintelligent systems), resigned the same month with an unusually blunt public statement: "Safety culture and processes have taken a back seat to shiny products." The Superalignment team was effectively dissolved less than a year after its founding in July 2023.

Structural Problem

Internal ethics boards face a fundamental principal-agent problem: they are funded by and accountable to the organization whose conduct they evaluate. This creates structural pressure to align findings with business needs, avoid blocking high-revenue products, and self-censor to preserve access and influence.

Red-Teaming: From Military Concept to AI Practice

Red-teaming — deliberately adversarial testing by an internal or external team — has become the primary technical safety process companies invoke. The term originated in Cold War military strategy, where "red teams" simulated Soviet attacks on US defenses. In AI, red-teaming means systematic adversarial prompting to elicit harmful outputs, test safety mitigations, and identify failure modes before deployment.

OpenAI's GPT-4 technical report (March 2023) described an extensive red-teaming process involving over 50 experts in domains including biosecurity, cybersecurity, and disinformation. The report documented specific risk categories tested and mitigations applied. This level of transparency was notable — and prompted immediate questions about what was not disclosed.

The AI Safety Institute (AISI) in the UK, established under the Bletchley Declaration in November 2023, has worked to formalize red-teaming as a pre-deployment evaluation standard. AISI conducted evaluations of several frontier models before their public release, finding in its first published evaluation of Claude 3 Opus (April 2024) that the model showed no uplift capability for creating chemical or biological weapons — but acknowledged the methodology was still evolving.

Critics of company-conducted red-teaming note that the teams report to company leadership, test only what leadership decides to test, and have their findings filtered before public release. Independent red-teaming — as practiced by AISI, the US AI Safety Institute (USAISI), and academic researchers — addresses this but faces access challenges: companies control what model versions researchers can test.

Red-Teaming: Systematic adversarial testing designed to probe an AI system for harmful outputs, safety failures, and policy violations. Can be internal (company-run) or external (independent researchers or government agencies).

AI Safety Institute (AISI): A UK government body established in November 2023 at the Bletchley AI Safety Summit to evaluate frontier AI models before and after deployment. The first of its kind; the US established a parallel USAISI in 2024.

Superalignment: OpenAI's initiative, announced July 2023, to solve the technical problem of aligning superintelligent AI systems. The team was allocated 20% of OpenAI's compute and aimed at a 4-year research program. Its co-head Jan Leike resigned publicly in May 2024 citing safety culture failures.

What Effective Internal Governance Requires

Research on corporate governance suggests internal ethics mechanisms work best when they have: independent reporting lines (to boards rather than executives), veto power or meaningful delay authority over product launches, protected employment for ethics personnel, external validation of findings, and public accountability through disclosure of recommendations and outcomes.

Anthropic's Constitutional AI approach and its published Acceptable Use Policy represent an attempt to embed safety in the technical training process rather than relying solely on post-hoc review — but Anthropic is a private company with no obligation to disclose whether its internal safety recommendations have ever delayed or modified a product launch. Google DeepMind's published safety policies and regular model cards represent stronger disclosure practices than most, yet the merger of Google Brain and DeepMind in 2023 raised concerns about whether safety-focused research culture would survive commercial pressures.

Case Study: OpenAI's November 2023 Board Crisis

On November 17, 2023, OpenAI's board, which under the company's unusual structure had a nonprofit governance mandate to ensure AI benefited humanity, fired CEO Sam Altman, citing concerns about his candor. Within five days, Microsoft offered Altman a new role, nearly the entire OpenAI staff threatened to resign and follow him, and the board reversed course and reinstated Altman. Most of the board members who had voted to fire him left the board. The episode demonstrated that even a structurally unusual governance mechanism designed to prioritize safety over commercial interests could be rapidly overwhelmed by financial and employment pressure.

Lesson 2 Quiz

Internal Ethics Boards and Red-Teaming · 5 questions

1. How long did Google's Advanced Technology External Advisory Council (ATEAC) survive before being dissolved?

Correct. ATEAC was announced March 26, 2019 and dissolved April 4, 2019, just over a week later, after employee petitions objected to a member's views and another member resigned.

ATEAC lasted just over a week, from its announcement on March 26 to dissolution on April 4, 2019.

2. Jan Leike's public resignation from OpenAI in May 2024 stated that at OpenAI:

Correct. Leike's resignation statement — "Safety culture and processes have taken a back seat to shiny products" — was an unusually direct public indictment of OpenAI's organizational priorities from a departing senior safety researcher.

Leike's exact words were: "Safety culture and processes have taken a back seat to shiny products." It was an unusually public and specific critique from a departing safety team co-lead.

3. The UK AI Safety Institute (AISI) was established as a result of which event in November 2023?

Correct. AISI was established under the Bletchley Declaration, signed at the AI Safety Summit hosted by the UK government at Bletchley Park in November 2023.

AISI was established under the Bletchley Declaration from the UK-hosted AI Safety Summit at Bletchley Park in November 2023.

4. Which company disbanded its Responsible AI team in November 2023, reassigning most members to generative AI product work?

Correct. Meta disbanded its Responsible AI team in November 2023, coinciding with the company's aggressive pivot toward generative AI product development.

Meta disbanded its Responsible AI team in November 2023, reassigning members to generative AI product development at a time of aggressive expansion.

5. The structural problem with company-conducted red-teaming is primarily that:

Correct. Company-controlled red-teaming has limited independence: the team's scope, methodology, and findings all flow through organizational hierarchy, creating incentives to avoid findings that would delay or kill revenue-generating products.

The core issue is independence: company red-teams report to leadership, test what leadership approves, and findings are filtered — creating structural pressure away from findings that impede commercial products.

Lab 2: Designing Credible Internal Governance

What would a genuinely independent internal AI ethics board look like?

Your Task

Given the failures of Google's ATEAC, the dissolution of Meta's Responsible AI team, and the OpenAI board crisis, you're tasked with designing a more credible internal AI governance structure for a hypothetical large AI company.

Discuss with the AI assistant what structural features would make an internal AI ethics board or safety team genuinely effective rather than performative — and what trade-offs companies face in implementing them.

Consider: independent reporting lines, veto authority, employment protections, public disclosure requirements, and lessons from financial sector governance (like audit committees). What's the minimum viable structure for credibility?

AI Governance Lab Assistant

Lab 2

Welcome to Lab 2. We've seen internal AI ethics boards fail repeatedly: Google's ATEAC lasted little more than a week, Meta's Responsible AI team was disbanded when it conflicted with product priorities, and OpenAI's board was effectively neutered by financial pressure. Let's design something better. What do you think is the most critical structural failure in these cases, and how would you fix it?

Module 3 · Lesson 3

Industry Codes and Standards Bodies

IEEE, ISO/IEC, NIST AI RMF, and the PAI — how technical standards emerge, who writes them, and whose values they encode.

When engineers write the rules for AI, do democratic values survive the translation?

In the marble corridors of ISO's Geneva offices, representatives from 167 national standards bodies spent three years negotiating ISO/IEC 42001 — the world's first international standard for AI management systems. Published in December 2023, it specifies how organizations should plan, implement, and improve their AI governance. It does not tell them what to do about any specific AI capability.

The standard emerged from a working group heavily populated by representatives from large technology companies. IBM, Microsoft, and Google each had multiple delegates. Civil society organizations, affected communities, and academic researchers were nearly absent from the drafting process. This is not unusual — it is how standards are made. The question for AI governance is whether technical standards written by incumbent technology companies can adequately protect interests those companies have no financial incentive to prioritize.

The NIST AI Risk Management Framework

Released in January 2023, the NIST AI Risk Management Framework (AI RMF 1.0) is the United States' primary voluntary technical guidance for AI governance. Developed through extensive public consultation, it organizes AI risk management around four functions: Govern, Map, Measure, and Manage — with detailed practices for each.

The AI RMF is notable for its comprehensiveness and its direct acknowledgment of sociotechnical risks including bias, privacy, and human rights. It builds on prior NIST frameworks in cybersecurity and privacy. The framework is voluntary and non-prescriptive by design: it tells organizations to think carefully about AI risks but does not specify what risk levels are acceptable or what mitigations are required.

By 2024, NIST had developed supplementary profiles including the Generative AI Profile (NIST AI 600-1), which addressed risks specific to large language models including confabulation, data privacy, and information integrity. Federal agencies were directed by executive order to use the AI RMF for managing AI risks in government applications.

The limitation of the NIST approach is characteristic of voluntary frameworks: adoption is uneven. Companies already committed to responsible AI adopt the framework and find it useful. Companies racing to ship products treat it as a documentation exercise. The framework cannot distinguish between these uses, and NIST has no enforcement authority.

NIST AI RMF Core Functions

Govern: Organizational policies, culture, and accountability structures for AI risk.

Map: Categorizing AI contexts, intended uses, and potential harms.

Measure: Analyzing and assessing AI risks quantitatively and qualitatively.

Manage: Prioritizing, responding to, and monitoring AI risks throughout the system lifecycle.
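
One way to picture how the four functions divide the work is a simple risk register in which every activity is filed under exactly one function. The sketch below is an illustration only: `RmfFunction`, `RiskEntry`, `group_by_function`, and the sample entries are hypothetical names and data, not part of the NIST framework or any NIST tooling.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class RmfFunction(Enum):
    """The four core functions named in the AI RMF."""
    GOVERN = "Govern"
    MAP = "Map"
    MEASURE = "Measure"
    MANAGE = "Manage"


@dataclass(frozen=True)
class RiskEntry:
    """One risk-management activity, filed under a single function."""
    function: RmfFunction
    activity: str


def group_by_function(entries):
    """Group activity descriptions by function, keeping empty functions visible."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry.function].append(entry.activity)
    # Iterate over the enum so functions with no entries still appear.
    return {fn: grouped.get(fn, []) for fn in RmfFunction}


# Hypothetical register for an imaginary customer-support chatbot.
register = [
    RiskEntry(RmfFunction.GOVERN, "Assign an accountable owner for the chatbot"),
    RiskEntry(RmfFunction.MAP, "List intended uses and foreseeable misuse"),
    RiskEntry(RmfFunction.MEASURE, "Track confabulation rate on a test set"),
]

grouped = group_by_function(register)
gaps = [fn.value for fn, activities in grouped.items() if not activities]
print(gaps)  # functions with no recorded activity
```

In this made-up register nothing is filed under Manage, so the gap is easy to spot. The register can show that an activity is missing, but not whether the recorded activities are adequate, which mirrors the framework's non-prescriptive design.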

IEEE Ethics Guidelines and the P7000 Series

The IEEE's Ethically Aligned Design initiative, launched in 2016, produced its first full document in 2019 — a 290-page framework for embedding ethical principles into autonomous and intelligent systems. The document drew on contributions from hundreds of experts globally and addressed topics ranging from algorithmic bias to autonomous weapons.

More concretely, the IEEE Standards Association launched the P7000 series — a collection of specific standards including P7001 (Transparency of Autonomous Systems), P7002 (Data Privacy Process), P7003 (Algorithmic Bias Considerations), and P7004 (Standard for Child and Student Data Governance). These process standards define how to think through specific risks, not what outcomes to achieve.

The adoption of IEEE P7000 standards by industry has been limited. Unlike ISO 9001 (quality management) or ISO 27001 (information security), which became near-universal requirements for enterprise contracting, the P7000 series has not achieved equivalent market pressure for adoption. No major corporate purchasing contract or government procurement requirement mandated compliance as of 2024.

Partnership on AI (PAI)

Founded in 2016 by Amazon, Apple, DeepMind, Facebook, Google, IBM, and Microsoft — later joined by academic and civil society members — the Partnership on AI (PAI) was established to study and formulate best practices for AI systems. Unlike the Frontier Model Forum, PAI includes civil society and academic members, making it structurally more representative.

PAI's published outputs include the Responsible Practices for Synthetic Media framework (2023), which provides guidance on deepfake labeling and content authentication. The framework was cited by several signatories as the basis for their voluntary watermarking commitments in the White House 2023 pledges. PAI also produced research on worker wellbeing in AI-impacted industries and fairness in algorithmic decision-making.

The structural tension in PAI is between its corporate funders — who control most of the operating budget — and its civil society members, who bring perspectives on affected communities. Corporate members have generally resisted positions that would require binding commitments or regulatory action. Several civil society organizations have publicly expressed frustration with the pace and depth of PAI's outputs, noting that consensus requirements among members with conflicting interests consistently produce the weakest possible recommendations.

ISO/IEC 42001: Published December 2023, the first international standard for AI management systems. Specifies organizational requirements for responsible AI governance as a certifiable management system standard.

NIST AI RMF: The US National Institute of Standards and Technology's AI Risk Management Framework (January 2023). A voluntary, non-prescriptive framework organizing AI risk management into four functions: Govern, Map, Measure, Manage.

Partnership on AI (PAI): A multi-stakeholder organization founded 2016 by major tech companies plus civil society and academic partners, producing research and voluntary best practices for responsible AI. Notable for its inclusion of non-corporate voices.

Critical Perspective

Standards bodies consistently face the same structural problem: those with the most expertise in a technology (its developers) have the most influence in writing its standards, while those most affected by the technology (communities bearing its risks) have the least. The more technically complex the standard, the greater this expertise gap becomes. AI governance standards are among the most technically complex in history.

Lesson 3 Quiz

Industry Codes and Standards Bodies · 5 questions

1. ISO/IEC 42001, the first international AI management systems standard, was published in:

Correct. ISO/IEC 42001 was published in December 2023 after roughly three years of development in ISO working groups heavily populated by major technology company representatives.

ISO/IEC 42001 was published in December 2023, after approximately three years of negotiation in ISO's working groups.

2. The NIST AI Risk Management Framework organizes AI risk management around four core functions. Which of the following is NOT one of them?

Correct. The four NIST AI RMF functions are Govern, Map, Measure, and Manage. "Enforce" is not among them — reflecting the framework's explicitly voluntary, non-prescriptive nature.

The four functions are Govern, Map, Measure, and Manage. "Enforce" is absent — which is actually a design feature of the voluntary framework, not an oversight.

3. The Partnership on AI (PAI) is structurally distinct from the Frontier Model Forum primarily because PAI:

Correct. PAI's multi-stakeholder structure — incorporating civil society and academic voices alongside corporate founders — distinguishes it from industry-only bodies like the FMF, even if civil society members report frustration with consensus dynamics.

PAI's distinguishing feature is its multi-stakeholder structure including civil society and academic members, even though those members have less funding power than corporate ones.

4. The critical limitation of the NIST AI RMF identified in this lesson is that:

Correct. The NIST AI RMF's voluntary nature means adoption is self-selected: companies already committed to responsible AI find it useful, while others treat it as a documentation exercise. NIST has no authority to distinguish between genuine and performative adoption.

The core limitation is that voluntary adoption is uneven — responsible companies find it useful while others treat it as paperwork — and NIST has no enforcement authority to distinguish genuine from performative compliance.

5. Which IEEE standard specifically addresses algorithmic bias considerations?

Correct. IEEE P7003 covers Algorithmic Bias Considerations. P7001 is Transparency, P7002 is Data Privacy Process, and P7004 is Child and Student Data Governance.

IEEE P7003 specifically addresses Algorithmic Bias Considerations. P7001 is Transparency, P7002 is Data Privacy, and P7004 is Child/Student Data Governance.

Lab 3: Standards Body Analysis

Evaluate who writes AI standards and whose values they reflect

Your Task

Standards bodies like ISO, IEEE, and NIST shape AI governance through technical norms that can be as influential as legislation. But the composition of drafting groups, consensus requirements, and adoption dynamics all affect whose interests these standards serve.

Explore with the AI assistant how to evaluate whether an AI standard genuinely protects public interests vs. encodes incumbent industry preferences — and what reforms to standards processes would produce better outcomes.

Start by examining ISO/IEC 42001 or the NIST AI RMF. Ask: "Who had disproportionate influence in writing this standard, and how can you tell?" Then consider what a more inclusive standards process would require.

AI Governance Lab Assistant

Lab 3

Welcome to Lab 3. AI technical standards like ISO/IEC 42001 and the IEEE P7000 series are shaping how companies think about governance — but they're written largely by the same companies they govern. Let's dig into this. Which standard would you like to analyze first, and what's your initial read on whether its drafting process was representative?

Module 3 · Lesson 4

The Co-Regulation Frontier

From the EU AI Act's "regulatory sandbox" to the UK's pro-innovation approach — hybrid models that blend industry flexibility with public accountability.

Is co-regulation a principled compromise — or a lobbied-for halfway house that gives companies flexibility while providing regulators with political cover?

As EU negotiators worked through the final trilogues on the AI Act, a late-stage lobbying push sought to carve out foundation models from the most stringent requirements. The companies involved — primarily European voices for US foundation model developers — argued that regulating general-purpose AI at the model level would stifle innovation and impose compliance costs that favored large incumbents.

The final text created a tiered system: general-purpose AI models with high systemic impact face stricter obligations including model evaluations, transparency, and adversarial testing. The threshold, however, was set at 10^25 floating-point operations (FLOPs) of cumulative training compute, a level only a handful of the largest models crossed. Smaller but still powerful models faced lighter obligations. Critics noted that this threshold would be technically obsolete within years as compute efficiency improved.
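
To make the threshold concrete, total training compute can be roughly estimated from model size and training data. The sketch below assumes the common rule of thumb that training a dense transformer costs about 6 floating-point operations per parameter per training token. That approximation is not part of the AI Act, and the parameter and token counts are hypothetical round numbers rather than figures for any real model.

```python
# Threshold named in the text: 10^25 floating-point operations of training compute.
SYSTEMIC_RISK_THRESHOLD_FLOP = 1e25

# Assumed rule of thumb (not from the AI Act): about 6 operations
# per parameter per training token for a dense transformer.
FLOP_PER_PARAM_PER_TOKEN = 6.0


def estimate_training_flop(parameters: float, training_tokens: float) -> float:
    """Estimate total training compute as 6 * parameters * tokens."""
    return FLOP_PER_PARAM_PER_TOKEN * parameters * training_tokens


def exceeds_threshold(parameters: float, training_tokens: float) -> bool:
    """Return True if the estimated training compute is above 10^25 operations."""
    return estimate_training_flop(parameters, training_tokens) > SYSTEMIC_RISK_THRESHOLD_FLOP


# Hypothetical configurations chosen only to land on either side of the line.
hypothetical_models = {
    "small (7e9 params, 2e12 tokens)": (7e9, 2e12),
    "large (1e12 params, 1e13 tokens)": (1e12, 1e13),
}

for label, (params, tokens) in hypothetical_models.items():
    flop = estimate_training_flop(params, tokens)
    print(f"{label}: {flop:.1e} FLOP, above threshold: {exceeds_threshold(params, tokens)}")
```

Under these assumptions the smaller configuration lands around 8.4 x 10^22 operations, more than two orders of magnitude below the threshold, while the larger one lands around 6 x 10^25, above it. The comparison also illustrates the critics' point: efficiency gains change what a model can do at a given compute budget, while the threshold itself stays fixed.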

What Co-Regulation Means in Practice

Co-regulation refers to regulatory frameworks where governments set overarching goals and accountability requirements, while delegating detailed rule-setting and some enforcement to industry bodies or technical standards organizations. It is distinct from pure self-regulation (industry governs itself entirely) and command-and-control regulation (government specifies all requirements).

Co-regulation has precedents in financial services (banks set internal risk models subject to regulatory approval), telecommunications (spectrum allocation with industry technical standards), and internet content moderation (platforms set policies within legal frameworks like NetzDG in Germany or the DSA in the EU). Each sector shows both the potential and the risks: co-regulation can leverage industry expertise and adapt quickly to technology change, but regulatory capture — where the regulated industry shapes regulation in its own interest — is a persistent risk.

For AI, co-regulation models take several forms. The EU AI Act's regulatory sandboxes allow companies to test high-risk AI systems under relaxed rules in exchange for data sharing with regulators. The UK's pro-innovation regulatory framework, outlined in its March 2023 AI regulation white paper, explicitly assigns responsibility to existing sector regulators (FCA for financial AI, CMA for competition, ICO for data) rather than creating a new AI-specific body, while the government-funded Frontier AI Taskforce, which became the AI Safety Institute in November 2023, provides technical expertise on frontier models.

EU AI Act: Key Co-Regulation Features

The EU AI Act (formally adopted June 2024) mandates that general-purpose AI model providers with high systemic impact conduct adversarial testing, share results with the European AI Office, and maintain model cards. For implementation details, however, the Act defers to codes of practice developed by industry groups working with the European AI Office — a classic co-regulatory structure.

The EU AI Office and Codes of Practice

The European AI Office, established within the European Commission in February 2024, is the primary supervisory body for general-purpose AI models under the AI Act. It oversees compliance, coordinates enforcement across member states, and — crucially — manages the development of codes of practice that will fill in the technical details of the Act's requirements.

The first AI Code of Practice drafting process, launched in late 2024, involved over 1,000 stakeholders including AI developers, civil society organizations, and member state representatives. The process was significantly more inclusive than ISO standards drafting — but also significantly more complex and slower. The codes of practice must be finalized before GPAI rules fully apply, creating a window during which enforcement depends on voluntary compliance.

A specific tension emerged around copyright and training data: the AI Act requires GPAI model providers to publish "sufficiently detailed summaries" of training data used. Several major providers argued this would require disclosing commercially sensitive information. The codes of practice process became a venue to negotiate how much transparency was actually required — illustrating how co-regulation often involves ongoing negotiation about the actual content of rules, not just their implementation.

The UK's Distributed Regulatory Approach

The UK explicitly chose not to pass comprehensive AI legislation in 2023–2024, instead issuing a white paper directing existing sector regulators to apply their existing frameworks to AI. The Competition and Markets Authority (CMA) launched a foundation models review in 2023, examining whether frontier AI created anticompetitive market structures. The Information Commissioner's Office (ICO) issued guidance on generative AI and data protection. The Medicines and Healthcare products Regulatory Agency (MHRA) addressed AI in medical devices.

This distributed approach leverages sector-specific expertise and avoids creating a large new regulatory bureaucracy. Its weakness is coordination: an AI system used in healthcare, credit scoring, and employment decisions may be regulated by three different bodies with different standards, creating inconsistency and compliance complexity. The government's AI Safety Institute, launched in November 2023 and renamed the AI Security Institute in February 2025, focused on frontier model evaluation rather than horizontal governance, leaving the coordination gap largely unfilled as of 2024.

The UK's approach also created uncertainty for industry: companies operating across EU and UK markets faced different regulatory requirements, and the UK's "pro-innovation" framing raised questions about whether safety would be adequately weighted when it conflicted with competitiveness goals.

Co-Regulation A regulatory model where government sets overarching goals and accountability requirements while delegating detailed rule-making to industry bodies or technical standards organizations. Aims to combine government accountability with industry expertise.

Regulatory Sandbox A supervised space where companies can test innovative products or services under relaxed regulatory requirements in exchange for data sharing and regulatory oversight. Used in the EU AI Act for high-risk AI development.

Regulatory Capture The process by which regulatory agencies or co-regulatory bodies come to advance the interests of the industries they regulate rather than the public interest. A systemic risk in any regulatory model where industry has structural advantages in resources and information.

Comparative Assessment

Across the three models examined in this lesson (EU comprehensive legislation with co-regulatory implementation details, UK distributed sector regulation with voluntary guidelines, and US voluntary frameworks with some federal sector requirements), the EU model provides the strongest legal accountability, while the US and UK models offer more industry flexibility. No model has yet produced independent verification of meaningful safety improvements for frontier AI systems. The empirical record remains limited because these regulatory models are themselves so new.

Lesson 4 Quiz

The Co-Regulation Frontier · 5 questions

1. Under the EU AI Act, the threshold for a general-purpose AI model to face the strictest systemic impact obligations was set at:

Correct. The EU AI Act set the systemic impact threshold at 10^25 floating-point operations (FLOPs) of training compute, a level only the largest frontier models exceeded as of 2024. Critics noted this threshold would become less meaningful as compute efficiency improved.

The threshold was 10^25 floating-point operations (FLOPs) of training compute. Critics noted this compute-based threshold would become technically obsolete as models achieved more capability per FLOP through improved architectures.

2. The European AI Office was established in:

Correct. The European AI Office was established within the European Commission in February 2024, in advance of the AI Act's formal adoption in June 2024, to begin preparatory work on codes of practice and enforcement frameworks.

The European AI Office was established in February 2024, within the European Commission, to begin work on codes of practice before the AI Act's formal June 2024 adoption.

3. Co-regulation differs from pure self-regulation in that co-regulation:

Correct. Co-regulation's defining feature is the hybrid structure: government establishes the framework and accountability requirements, while industry bodies or technical standards organizations fill in the detailed rules — combining government legitimacy with industry expertise.

Co-regulation is defined by its hybrid structure: government sets overarching goals and accountability requirements, while industry bodies provide detailed rule-setting and some enforcement within that government-established framework.

4. The UK's 2023 approach to AI regulation was primarily characterized by:

Correct. The UK's March 2023 AI regulation white paper directed existing sector regulators — FCA, CMA, ICO, MHRA — to apply their existing frameworks to AI, explicitly avoiding new AI-specific legislation in favor of a "pro-innovation" distributed approach.

The UK's approach directed existing sector regulators (FCA, CMA, ICO, MHRA) to apply their frameworks to AI, deliberately avoiding new legislation — a "pro-innovation" distributed model contrasting with the EU's comprehensive approach.

5. Regulatory capture in the context of AI co-regulation refers to:

Correct. Regulatory capture — where the regulated industry shapes regulation in its own interest — is a systemic risk in co-regulatory models where industry holds structural advantages in technical expertise, resources, and access to regulators. It is a well-documented phenomenon across financial services, telecommunications, and other complex technical industries.

Regulatory capture is the process by which regulatory agencies advance industry interests rather than the public's — a systemic risk whenever the regulated industry has structural advantages in technical knowledge, resources, and regulatory access, as AI companies currently do.

Lab 4: Designing a Co-Regulatory Framework

Build a hybrid governance model for a specific AI application domain

Your Task

You've seen how the EU AI Act uses co-regulatory codes of practice, how the UK distributes AI oversight across sector regulators, and how voluntary frameworks fill the space where law hasn't yet reached. Now design your own co-regulatory model.

Choose a specific AI application domain — hiring algorithms, medical diagnosis AI, autonomous vehicles, content moderation systems, or credit scoring — and work with the AI assistant to build a co-regulatory framework that balances innovation with meaningful public accountability.

A strong co-regulatory design will specify: who sets the rules (and with what legitimacy), who verifies compliance (and how they're funded), what happens when a company fails to comply, and how the framework adapts as the technology evolves. Pick your domain and begin.
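If it helps to keep track of your draft, the four questions above can be captured as a simple checklist. The sketch below is only one hypothetical way to organize your notes; the class name, field names, and example entries are illustrative assumptions, not part of any regulatory framework.

```python
from dataclasses import dataclass, fields


@dataclass
class CoRegulatoryDesign:
    """Hypothetical checklist mirroring the four questions in the lab task."""
    domain: str
    rule_setter: str = ""              # who sets the rules, and with what legitimacy
    verifier: str = ""                 # who verifies compliance, and how they are funded
    non_compliance_response: str = ""  # what happens when a company fails to comply
    adaptation_mechanism: str = ""     # how the framework adapts as the technology evolves

    def missing_elements(self) -> list[str]:
        """Return the names of design elements that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative partial draft for one of the suggested domains.
draft = CoRegulatoryDesign(
    domain="hiring algorithms",
    rule_setter="legislature sets goals; multi-stakeholder body drafts the code",
    verifier="independent auditor funded by a levy rather than by audited firms",
)
print(draft.missing_elements())
```

Here the draft still lacks a non-compliance response and an adaptation mechanism, which are the two elements voluntary commitments most often leave out.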

AI Governance Lab Assistant

Lab 4

Welcome to Lab 4. We're designing co-regulatory frameworks — hybrid models where government sets accountability requirements and industry fills in technical details. The challenge is preventing the "detailed rule-setting" from becoming a vehicle for regulatory capture. Which AI application domain would you like to design a framework for? I'd recommend picking one where you can be specific about the harms, the stakeholders, and the technical verification challenges.

Module 3 Test

Industry Self-Regulation · 15 questions · Pass at 80%

1. The July 2023 White House voluntary AI commitments were signed by how many companies, and which was NOT among them?

Correct. Seven companies signed: Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI. Apple was notably absent.

Seven companies signed — Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI. Apple did not participate.

2. The Frontier Model Forum's $10 million AI Safety Fund was established to:

Correct. The $10 million AI Safety Fund was established to support AI safety research as part of the FMF's stated mission of advancing safety for frontier models.

The AI Safety Fund was established to support AI safety research — part of the FMF's stated mission alongside developing best practices and sharing knowledge with policymakers.

3. Google's Advanced Technology External Advisory Council (ATEAC) was dissolved primarily because:

Correct. ATEAC collapsed after employee petitions objected to Heritage Foundation president Kay Coles James's inclusion, a second member resigned over drone warfare concerns, and Google declared it "untenable", all within 8 days of announcement.

ATEAC was dissolved after employee petitions over a member's views and another member's resignation made it untenable, within 8 days of its announcement.

4. OpenAI's Superalignment team was co-led by Jan Leike. When Leike resigned publicly in May 2024, his specific critique was:

Correct. Leike's exact resignation statement included: "Safety culture and processes have taken a back seat to shiny products" — an unusually direct public critique of organizational priorities from a departing safety team lead.

Leike's public statement was: "Safety culture and processes have taken a back seat to shiny products." It was remarkable for its specificity and the seniority of the person making it.

5. The principal-agent problem in internal AI ethics boards refers to:

Correct. The principal-agent problem in internal ethics boards is structural: the organization whose conduct is being evaluated (the principal) controls the ethics board (the agent) through funding and governance, creating incentives to avoid findings that threaten revenue.

The principal-agent problem is that ethics boards are funded and governed by the organization they evaluate — creating structural incentives to align findings with business needs and avoid blocking revenue-generating products.

6. The Global Network Initiative (GNI), founded in 2008, was established primarily in response to:

Correct. GNI was founded in 2008 after Google and Yahoo faced significant criticism for cooperating with Chinese government censorship demands, creating reputational and political pressure that motivated the voluntary human rights framework.

GNI was founded in 2008 after Google and Yahoo cooperated with Chinese censorship — creating the reputational pressure that motivated the voluntary human rights assessment framework.

7. ISO/IEC 42001 is best described as:

Correct. ISO/IEC 42001 is a management system standard — like ISO 9001 for quality or ISO 27001 for security — that organizations can be certified against. It specifies governance processes, not prohibited applications or technical benchmarks.

ISO/IEC 42001 is a certifiable management system standard, comparable to ISO 9001 (quality) or ISO 27001 (security), specifying organizational processes for AI governance rather than prohibitions or benchmarks.

8. The NIST AI RMF's Generative AI Profile (NIST AI 600-1) specifically addressed risks including:

Correct. NIST AI 600-1, the Generative AI Profile, addressed risks specific to large language models including confabulation (hallucination), data privacy violations, and information integrity — including disinformation risks.

NIST AI 600-1 addressed LLM-specific risks: confabulation, data privacy, and information integrity. These reflect the distinctive failure modes of generative AI systems compared to earlier narrow AI.

9. In the November 2023 OpenAI board crisis, why was Sam Altman reinstated within 96 hours of being fired?

Correct. Microsoft's job offer to Altman and the near-total employee threat to resign demonstrated that financial and employment leverage can overwhelm a board's governance authority, even within a structurally unusual, safety-oriented governance arrangement.

Microsoft's offer to Altman and nearly the entire staff threatening to follow him created overwhelming financial and employment pressure that forced the board to reverse course and reinstate Altman.

10. Partnership on AI (PAI) was founded in 2016 by which group of companies?

Correct. PAI's founding corporate members were Amazon, Apple, DeepMind, Facebook, Google, IBM, and Microsoft — a notably broad group that predated the current generation of generative AI companies.

PAI was founded by Amazon, Apple, DeepMind, Facebook, Google, IBM, and Microsoft — a 2016 coalition that predated the generative AI era and the current focus on frontier model governance.

11. The EU AI Act's regulatory sandbox provision allows companies to:

Correct. EU AI Act sandboxes allow supervised testing of high-risk AI systems under relaxed rules in exchange for data sharing with regulators — a co-regulatory mechanism designed to enable innovation while maintaining oversight.

Sandboxes allow supervised testing under relaxed rules in exchange for data sharing with regulators — giving companies flexibility to innovate while giving regulators real-world evidence for refining requirements.

12. The UK's distributed AI regulatory approach is primarily criticized for:

Correct. The UK's distributed approach means an AI system used in healthcare, credit scoring, and employment could fall under several different regulators, such as the MHRA, FCA, and ICO, with potentially inconsistent standards and no horizontal coordinator.

The coordination gap is the primary weakness: different sector regulators with different standards for the same underlying AI system, with no horizontal coordinator to ensure consistency.

13. Which of the following best describes the difference between the NIST AI RMF and the EU AI Act?

Correct. The NIST AI RMF is voluntary guidance with no enforcement authority, while the EU AI Act is binding legislation with designated supervisory authorities (the European AI Office for general-purpose AI models, national authorities for other systems) and financial penalties for non-compliance.

The fundamental difference is legal force: NIST RMF is voluntary guidance while the EU AI Act is binding law with enforcement authority and financial penalties — representing fundamentally different governance philosophies.

14. The meta-critique of all voluntary AI commitments identified across this module is that they consistently lack:

Correct. The consistent critique across the White House pledges, FMF commitments, and industry ethics initiatives is the absence of the three elements that make regulatory frameworks effective: independent verification, quantitative benchmarks, and consequences for non-compliance.

The three consistently absent elements across voluntary AI commitments are: independent verification (who checks), quantitative benchmarks (what counts as compliant), and consequences for non-compliance (what happens if you fail).

15. Regulatory capture in AI co-regulation is best prevented by:

Correct. Preventing regulatory capture requires addressing the structural advantages that make it possible: independent funding (removes direct industry financial control), diverse stakeholder representation (breaks monopoly on technical expertise), transparent decision-making (enables public scrutiny), and public accountability (creates reputational consequences).

Preventing capture requires structural interventions: independent funding, diverse representation including affected communities, transparent decision-making, and public accountability for outcomes — each addressing a specific mechanism through which capture occurs.