On March 24, 2023, OpenAI disclosed that a bug in redis-py, the open-source Redis client library it used, had on March 20 allowed some ChatGPT users to briefly see another active user's chat titles and, in some cases, the first message of a new conversation. The same bug exposed partial payment information for approximately 1.2% of ChatGPT Plus subscribers who were active during a nine-hour window. The vulnerability was not in OpenAI's own code: it was in a third-party client library for the caching layer OpenAI had integrated. Businesses that had embedded ChatGPT into customer workflows suddenly faced questions about data exposure they had no direct visibility into and no contractual warning of.
The incident illustrated a principle that governs every modern AI deployment: your risk surface is not bounded by your own walls. It extends through every vendor, library, and foundation model provider in your stack.
Most enterprise AI products are not purpose-built from scratch. They are assembled stacks: a foundation model (GPT-4, Claude, Gemini, Llama) sits at the base; a middleware layer handles orchestration and retrieval; vector databases store embeddings; monitoring tools observe outputs. Each layer is typically provided by a different vendor, often under different contractual terms, different security certifications, and different data-handling regimes.
When your organisation licenses an AI application from a SaaS vendor, you are in practice entering relationships with every subprocessor that vendor uses — yet your procurement process likely evaluated only the top-level vendor. The subprocessor chain is frequently three to five companies deep before you reach the actual model inference infrastructure.
This creates what researchers at the Brookings Institution described in 2023 as "n-tier opacity": the business deploying AI cannot observe the risk practices of the vendors that actually perform inference on its data. Standard SaaS due diligence — SOC 2 reports, penetration test summaries, data processing agreements — typically covers only the immediate vendor, not the chain.
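The chain described above can be made concrete with a small graph walk. The following is a minimal sketch, not a procurement tool: every vendor name and the `SUBPROCESSORS` mapping are hypothetical, and it assumes the top-level vendor has disclosed its subprocessors in a form you can record. It shows how many tiers separate you from each party that handles your data, and which of those parties your due diligence never reached.

```python
from collections import deque

# Hypothetical disclosure data: each vendor maps to the subprocessors it names.
SUBPROCESSORS = {
    "HelpdeskAI": ["OrchestrateCo", "VectorStoreInc"],
    "OrchestrateCo": ["ModelGateway"],
    "ModelGateway": ["FoundationModelX"],
    "VectorStoreInc": ["CloudHostY"],
    "FoundationModelX": ["CloudHostY"],
    "CloudHostY": [],
}


def chain_depths(root, graph):
    """Breadth-first walk returning {vendor: number of tiers below root}."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        vendor = queue.popleft()
        for sub in graph.get(vendor, []):
            if sub not in depths:  # keep the shortest path to each vendor
                depths[sub] = depths[vendor] + 1
                queue.append(sub)
    return depths


depths = chain_depths("HelpdeskAI", SUBPROCESSORS)

# Assumption: procurement reviewed only the vendor that signed the contract.
assessed = {"HelpdeskAI"}
unassessed = sorted(v for v in depths if v not in assessed)

for vendor in unassessed:
    print(f"tier {depths[vendor]}: {vendor} (no due diligence on file)")
```

In this hypothetical stack the foundation model sits three tiers below the vendor that procurement actually assessed, and five parties in the chain were never reviewed.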
As of 2024, the majority of commercial AI applications globally run on inference infrastructure controlled by three providers: OpenAI, Google (Vertex AI / Gemini), and Anthropic, with Microsoft Azure OpenAI Service acting as a fourth major distribution channel. Amazon Bedrock aggregates several of these behind AWS infrastructure. This means a service disruption, policy change, or security incident at any one of these providers propagates simultaneously across thousands of downstream enterprise applications.
This is not hypothetical. In November 2023, OpenAI experienced a governance crisis — the board's abrupt firing and reinstatement of CEO Sam Altman over five days — that created acute uncertainty for enterprises whose AI roadmaps depended entirely on GPT-4 access. Companies that had not built provider-agnostic architectures had no credible fallback. The episode prompted Goldman Sachs, JPMorgan, and several European banks to formally document "AI provider concentration risk" in internal risk registers for the first time.
The practical implication: dependency on a single foundation model provider is a business continuity risk, not merely a technology preference. Organisations should maintain documented fallback procedures and, where critical workloads are involved, test failover to an alternative provider at least annually.
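One way to keep a fallback procedure testable is to route every model call through a thin provider-agnostic wrapper. The sketch below is an illustration under stated assumptions: `primary` and `secondary` are hypothetical stand-ins for real vendor clients, and a real failover would also have to account for differences in prompt behaviour, output quality, and contractual data terms between providers.

```python
class AllProvidersFailed(RuntimeError):
    """Raised when no configured provider returned a response."""


def complete_with_fallback(prompt, providers):
    """Try each (name, callable) pair in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Hypothetical stand-ins for real vendor SDK calls.
def primary(prompt):
    raise TimeoutError("primary provider unavailable")


def secondary(prompt):
    return f"[secondary] {prompt}"


used, reply = complete_with_fallback(
    "Summarise this claim",
    [("primary", primary), ("secondary", secondary)],
)
print(used, "->", reply)
```

An annual failover test can then be as simple as forcing the primary to fail in a staging environment and checking that critical workflows still complete on the secondary.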
In June 2023, OpenAI deprecated the original GPT-3.5 and Codex model endpoints used by thousands of production applications, giving 90 days' notice. Applications that had hardcoded model names — common in rapid prototypes that became production systems — broke at deprecation. Organisations should treat AI model versions as infrastructure with end-of-life dates and conduct periodic "model continuity" audits of all AI integrations.
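A "model continuity" audit can start from nothing more than an inventory of integrations and their announced retirement dates. The following is a minimal sketch with hypothetical application names, model identifiers, and dates; the 90-day threshold is an assumption chosen to mirror the notice period mentioned above.

```python
from datetime import date

# Hypothetical inventory: one row per production integration.
# "retires" is the vendor's announced end-of-life date, or None if not announced.
INTEGRATIONS = [
    {"app": "claims-triage", "model": "vendor-model-v1", "retires": date(2025, 1, 15)},
    {"app": "doc-search", "model": "vendor-model-v3", "retires": None},
    {"app": "email-drafts", "model": "vendor-model-v2", "retires": date(2025, 9, 1)},
]


def retiring_within(integrations, today, days=90):
    """Return apps whose model retires within `days`, or has already retired."""
    flagged = []
    for item in integrations:
        retires = item["retires"]
        if retires is not None and (retires - today).days <= days:
            flagged.append(item["app"])
    return flagged


at_risk = retiring_within(INTEGRATIONS, today=date(2024, 12, 1))
print("Needs migration plan:", at_risk)
```

Keeping model identifiers in an inventory like this, rather than hardcoded across codebases, is what makes the audit possible in the first place.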
Every AI vendor relationship is actually a portfolio of vendor relationships you cannot see. Your vendor due diligence process must explicitly demand full subprocessor disclosure, review each subprocessor's data handling terms, and map your concentration exposure across foundation model providers before approving production deployment.
In this lab you will practise the skills of AI vendor dependency analysis: identifying subprocessor chains, mapping concentration risk across foundation model providers, and formulating due diligence questions for vendor contracts.
Describe a real or hypothetical AI deployment at your organisation and the assistant will help you map its dependency structure, identify concentration risks, and draft the right questions for your vendor procurement process.
In March 2023, Samsung's semiconductor division discovered that engineers had on at least three separate occasions pasted proprietary chip design schematics, internal meeting notes, and source code into ChatGPT to assist with debugging and documentation. The incidents were only identified after Samsung issued a survey to staff following OpenAI's disclosure of the Redis vulnerability — at which point engineers self-reported. Samsung immediately banned ChatGPT and other external generative AI tools company-wide. The data submitted had potentially been used in OpenAI's model training pipeline, and Samsung had no contractual basis to demand its deletion or verify its handling.
Samsung was not breached in the traditional sense. Its own employees made rational, productivity-maximising decisions — and in doing so inadvertently transferred some of the company's most sensitive intellectual property to a third-party AI training corpus, with no retrieval mechanism and no notice to legal or compliance.
When an employee or system sends data to an external AI service, that data typically follows one of four pathways — and which pathway applies depends entirely on contract terms that most users never read:
1. Inference-only, no retention: The data is processed to generate a response and immediately discarded. Major enterprise API providers (OpenAI API, Azure OpenAI, Anthropic API, Google Vertex AI) offer this regime, in some cases only on request or for eligible customers, so it must be explicitly verified in contract terms. It is typically not available on free consumer tiers.
2. Retained for safety review: The data is stored for a defined period (often 30 days) for abuse detection and safety monitoring by the vendor. This is standard even in enterprise API agreements. The data is not used for training but is accessible to vendor trust-and-safety staff.
3. Retained and used for model improvement: The data is used to fine-tune or retrain future model versions. This was OpenAI's default for ChatGPT free and Plus users at the time of the Samsung incident; an in-product control to opt out was introduced in April 2023. Many smaller AI vendors still default to this regime. Samsung's incident fell into this category.
4. Shared with subprocessors for operational purposes: The data is passed to the vendor's own subprocessors — cloud infrastructure providers, monitoring services, human labelling contractors — for operational purposes. These subprocessors may have their own retention and data rights terms.
In March 2023, Italy's data protection authority (Garante) temporarily banned ChatGPT in Italy on GDPR grounds, citing the lack of a legal basis for processing personal data of EU residents and the absence of age verification. OpenAI was given 20 days to respond. The ban was lifted in late April 2023 after OpenAI added privacy controls, opt-out mechanisms, and a transparency notice — but the episode demonstrated that cross-border AI data flows create regulatory exposure for enterprises even when the vendor, not the enterprise, controls the processing.
Foundation model inference infrastructure is concentrated in the United States (primarily AWS us-east-1, us-west-2, and Azure eastus regions). This means that even when an enterprise is domiciled in the EU, UK, or Australia, data submitted to US-based AI vendors is typically processed in the US, triggering cross-border transfer obligations under GDPR, the UK GDPR, or Australia's Privacy Act.
The standard legal mechanism is a Data Processing Agreement (DPA) incorporating Standard Contractual Clauses (SCCs for EU transfers). OpenAI, Google, Microsoft, and Anthropic all offer enterprise DPAs. However, in practice, the majority of enterprise deployments of AI tools are made by individual teams or functions that bypass central procurement — meaning no DPA exists, no transfer mechanism is in place, and the organisation may be in ongoing breach of data protection law for transfers it does not even know are occurring.
A 2023 survey by the law firm Fieldfisher found that 67% of European enterprises that had deployed AI tools at the departmental level had not completed the required Transfer Impact Assessments for US-based model providers. This is a systemic compliance gap, not an outlier problem.
Beyond data protection law, AI data flows create a distinct intellectual property risk. When proprietary code, trade secrets, client data, or confidential strategy documents are submitted to an AI vendor whose terms permit use for training, that information may surface, transformed but potentially recognisable, in responses to future users. This is not merely theoretical: GitHub Copilot has been documented reproducing verbatim code segments from public repositories included in its training data.
The IP contamination risk runs in both directions. Code or content generated by AI tools may itself incorporate training data from copyrighted sources, potentially creating infringement exposure for the organisation that deploys the output. The ongoing litigation between The New York Times and OpenAI (filed December 2023) turns precisely on this mechanism.
Before any AI tool touches business data, your organisation must determine: (1) which of the four data fate categories applies under the vendor's current contract terms, (2) whether a signed DPA with appropriate transfer mechanisms is in place, (3) whether the data type is permissible under those terms, and (4) whether IP generated by or submitted to the tool creates downstream ownership or liability risk. Answering these questions after deployment is dramatically more expensive than before.
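The four determinations above can be captured as a simple pre-deployment gate. This is a sketch under stated assumptions rather than a compliance tool: the `ToolAssessment` fields, the example tool, and the rule that training use is always blocking are illustrative choices, and your own policy may treat some categories differently.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Set


class DataFate(Enum):
    INFERENCE_ONLY = 1
    SAFETY_RETENTION = 2
    TRAINING_USE = 3
    SUBPROCESSOR_SHARING = 4


@dataclass
class ToolAssessment:
    tool: str
    data_fate: Optional[DataFate]  # None means the contract has not been read yet
    dpa_signed: bool
    transfer_mechanism: bool
    data_types: Set[str]           # what the workflow actually sends
    permitted_types: Set[str]      # what the contract terms allow
    ip_review_done: bool


def blocking_issues(a):
    """Return the open questions that should stop a production rollout."""
    issues = []
    if a.data_fate is None:
        issues.append("data fate category not determined")
    elif a.data_fate is DataFate.TRAINING_USE:
        issues.append("vendor terms allow training on submitted data")
    if not (a.dpa_signed and a.transfer_mechanism):
        issues.append("DPA or transfer mechanism missing")
    disallowed = a.data_types - a.permitted_types
    if disallowed:
        issues.append("data types not permitted: " + ", ".join(sorted(disallowed)))
    if not a.ip_review_done:
        issues.append("IP ownership and liability not reviewed")
    return issues


# Hypothetical departmental tool.
assessment = ToolAssessment(
    tool="meeting-summariser",
    data_fate=DataFate.TRAINING_USE,
    dpa_signed=True,
    transfer_mechanism=False,
    data_types={"meeting notes", "client names"},
    permitted_types={"meeting notes"},
    ip_review_done=True,
)
issues = blocking_issues(assessment)
for issue in issues:
    print("BLOCKED:", issue)
```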
In this lab you will work through the practical challenge of understanding what data your organisation sends to AI vendors, how it is handled, and whether appropriate legal mechanisms are in place. You will practise categorising AI data flows and identifying compliance gaps before they become regulatory incidents.
Describe a specific AI tool or workflow at your organisation — what data it processes and which vendor it uses — and the assistant will help you identify which of the four data fate categories applies, what transfer obligations are triggered, and what due diligence steps are outstanding.
In 2021, researchers led by a team at Tsinghua University published "Hidden Killer", a training-data poisoning attack capable of embedding targeted backdoors in NLP models that activated only on a specific trigger (in that work, a particular syntactic structure rather than a fixed phrase), while leaving the model's performance on standard benchmarks completely normal. The attack required the adversary to poison only 0.2% of the training dataset, roughly one in five hundred examples, to achieve near-perfect backdoor trigger reliability.
The paper's most commercially significant finding was that the poisoning was undetectable by standard model evaluation. An organisation that downloaded a poisoned open-source model from Hugging Face, evaluated it on standard test sets, and deployed it in production would have no indication the backdoor existed until an adversary sent the trigger phrase. At that point, the model would output whatever the attacker had specified — incorrect medical advice, fraudulent financial guidance, or manipulated code.
The rapid proliferation of open-source foundation models — Meta's Llama series, Mistral, Falcon, Bloom, and hundreds of fine-tuned derivatives — has created a new and largely unmanaged risk surface for enterprise AI. As of 2024, Hugging Face's model hub hosts over 700,000 publicly available models, the vast majority of which have undergone no independent security evaluation.
Organisations that choose open-source models to reduce vendor dependency or cost often pull models from Hugging Face or similar repositories without verifying the provenance of the training data, the identity of the model uploader, or whether the model has been subjected to any form of red-teaming or security audit. This is the AI equivalent of downloading executable software from an anonymous forum and running it in your production environment.
The risk is not merely theoretical. In 2023, security researchers at Protect AI discovered that models uploaded to Hugging Face could execute arbitrary code during the deserialization of model files — a vulnerability in the pickle file format used by PyTorch. Malicious model files had been uploaded to the platform. Protect AI's scanning tool identified over 2,800 potentially malicious models on the platform before coordinated disclosure prompted Hugging Face to add security scanning. Organisations that had downloaded and deployed those models had remote code execution vulnerabilities in their AI infrastructure.
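To see why deserialization is dangerous, it helps to inspect a pickle stream without ever loading it. The sketch below uses the standard library's `pickletools` to list opcodes that can import or call code during unpickling. It is a deliberately naive illustration: the `Suspicious` class is a harmless hypothetical payload, and the opcode list is an assumption about what is worth flagging. Legitimate PyTorch checkpoints also use some of these opcodes to rebuild tensors, so production scanners rely on allowlists of permitted imports rather than a blanket opcode check.

```python
import pickle
import pickletools

# Opcodes that can import a callable or invoke one while unpickling.
RISKY_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}


def risky_opcodes(data):
    """Return the risky opcode names found in a pickle stream, without loading it."""
    return {op.name for op, _arg, _pos in pickletools.genops(data) if op.name in RISKY_OPCODES}


class Suspicious:
    """Hypothetical payload: unpickling this object would call print()."""

    def __reduce__(self):
        return (print, ("code ran during unpickling",))


benign = pickle.dumps({"weights": [0.1, 0.2, 0.3]})
payload = pickle.dumps(Suspicious())  # serialising is safe; only loading would run it

print("benign :", sorted(risky_opcodes(benign)))
print("payload:", sorted(risky_opcodes(payload)))
```

Where the tooling allows it, weight formats that cannot carry executable code (such as safetensors), or PyTorch's `torch.load(..., weights_only=True)` option, reduce this exposure at the point of loading.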
In late March 2024, a sophisticated multi-year supply chain attack on the XZ Utils open-source compression library was discovered by a Microsoft engineer, Andres Freund, largely by accident: he noticed slightly elevated SSH login times. The attacker had spent roughly two years building credibility as an open-source contributor before inserting a backdoor. AI and ML frameworks depend heavily on the same open-source ecosystem of libraries. The XZ Utils case showed that well-resourced actors, widely suspected to be state-backed, are investing in long-horizon supply chain compromise of open-source software, and the AI model ecosystem is not immune.
As enterprises deploy AI agents that can browse the web, read emails, execute code, and interact with external APIs, a new class of supply chain attack has emerged: indirect prompt injection. In this attack, malicious instructions are embedded in content that the AI system reads — a webpage, a document, an email — and those instructions hijack the AI's subsequent actions.
In 2023, security researchers demonstrated that an AI assistant with access to a user's email could be instructed, via text hidden in a received email, to forward all future emails to an attacker's address. The user sees nothing. The AI simply executes the instruction it found in the third-party content. This attack vector is entirely in the supply chain of content the AI system processes — not in the AI system itself.
The practical implication for organisations deploying AI agents with tool-use capabilities (Microsoft Copilot for M365, Google Workspace AI, Salesforce Einstein Agents) is that every external data source the agent reads becomes a potential attack surface. Trust boundaries must be architected into agent workflows before deployment, not patched in afterward.
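One common shape for such a trust boundary is a policy check that sits between the model and its tools. The following is a minimal sketch, not a complete defence: the tool names, the `AgentContext` class, and the rule itself (sensitive actions need human approval once any untrusted content has entered the context) are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical set of tools whose misuse would cause real harm.
SENSITIVE_TOOLS = {"send_email", "create_forwarding_rule", "execute_code"}


@dataclass
class AgentContext:
    """Tracks which untrusted sources the agent has read in this session."""

    untrusted_sources: list = field(default_factory=list)

    def ingest(self, source, trusted):
        if not trusted:
            self.untrusted_sources.append(source)


def authorise(tool, ctx, human_approved=False):
    """Allow a tool call unless it is sensitive and the context is tainted."""
    if tool in SENSITIVE_TOOLS and ctx.untrusted_sources and not human_approved:
        return False
    return True


ctx = AgentContext()
before = authorise("create_forwarding_rule", ctx)   # nothing untrusted read yet

ctx.ingest("inbound email from unknown sender", trusted=False)
after = authorise("create_forwarding_rule", ctx)     # now requires a human
approved = authorise("create_forwarding_rule", ctx, human_approved=True)

print(before, after, approved)
```

The design choice here is that the check does not try to detect malicious text at all. It assumes any external content might contain instructions and limits what the agent can do afterwards without a person in the loop.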
Responsible AI procurement, whether of open-source models or fine-tuned vendor models, should include formal model integrity assessment. Current best-practice elements include:
Provenance verification: Can the vendor document the origin of training data, the training compute environment, and the chain of custody of model weights? NIST's voluntary AI Risk Management Framework (AI RMF 1.0, January 2023) explicitly covers risks arising from third-party software, data, and other supply chain issues, which makes provenance a natural procurement question.
Backdoor red-teaming: Has the model been subjected to adversarial testing specifically designed to elicit hidden behaviours under trigger conditions? This is distinct from standard performance benchmarking and requires specialist capability.
Dependency scanning: Are the software libraries used to serve the model (inference frameworks, serialization formats) free of known vulnerabilities? The Protect AI findings confirm this is not a theoretical concern for production deployments.
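The chain-of-custody element of provenance verification has a simple technical floor: confirming that the weights you deploy are byte-for-byte the weights the publisher released. The sketch below assumes the publisher provides a SHA-256 digest through a channel you trust; the file contents and the pinned digest are generated inside the example purely for demonstration. A matching hash proves the file was not altered in transit, not that the model itself is free of backdoors.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large weight files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned(path, pinned_hex):
    """Compare the file's digest with the digest recorded at approval time."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.lower())


# Demonstration with a stand-in file; real use would point at downloaded weights.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"demo model weights")
    weights_path = tmp.name

pinned = hashlib.sha256(b"demo model weights").hexdigest()
ok = matches_pinned(weights_path, pinned)
tampered = matches_pinned(weights_path, "0" * 64)
os.remove(weights_path)

print("matches pinned digest:", ok)
print("matches wrong digest :", tampered)
```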
The integrity of an AI model cannot be assumed from benchmark performance alone. Organisations deploying open-source or vendor-fine-tuned models must treat model provenance and integrity verification as procurement requirements equivalent to penetration testing in traditional software procurement — and must architect explicit trust boundaries into any AI agent that processes external content.
In this lab you will practise identifying model integrity risks in open-source AI procurement and designing appropriate trust boundaries for AI agent deployments. You will work through real evaluation criteria and think through how indirect prompt injection attacks could affect specific agentic AI workflows at your organisation.
Describe an open-source model you are considering, or an AI agent workflow your organisation uses or is planning, and the assistant will guide you through the integrity assessment questions and trust boundary design considerations.
In 2023, UK insurer Aviva disclosed in its annual report that it had established a formal AI Vendor Risk Committee following an internal review that found 23 AI tools deployed across business units, only four of which had gone through standard vendor risk assessment. The remaining 19 had been procured by individual teams, often under "innovation" budgets that bypassed central IT and legal review. Two of the 19 had no signed data processing agreements. One was processing claims data through a US-based AI vendor with no GDPR transfer mechanism in place.
Aviva's public disclosure was notable because it was voluntary — the company chose to acknowledge the gap rather than wait for a regulatory finding. The remediation programme cost significantly more than prevention would have: renegotiating contracts retrospectively, conducting Transfer Impact Assessments under time pressure, and managing the reputational dimension of disclosing to regulators that claims data had been processed without adequate legal basis.
Standard SaaS vendor contracts were not designed for AI. They typically address data processing, security standards, and service levels — but do not address model-specific risks. An AI vendor contract should now include provisions covering:
Model version stability: Commitment to advance notice (minimum 90 days is emerging as standard) before model version deprecation or significant capability changes. Changes to model behaviour can invalidate downstream applications even without a security incident.
Training data usage rights: Explicit prohibition on using submitted data for model training or improvement, with audit rights to verify compliance. The prohibition should extend to all subprocessors.
Subprocessor disclosure and approval: Full list of all AI-specific subprocessors (not just generic IT subprocessors) and contractual right to object to additions. The list should include the specific foundation model provider(s) used.
Model incident notification: Obligation to notify the customer within a defined window (72 hours is the GDPR deadline for a controller to report a personal data breach to its supervisory authority, and is a common benchmark) of any model behaviour incident, training data incident, or infrastructure compromise that could affect model outputs or data handling.
Right to audit AI governance: Right to receive third-party AI audit reports (analogous to SOC 2 for security) on at least an annual basis. The EU AI Act creates a formal audit requirement for high-risk AI systems; contracts with vendors deploying in the EU should reference this framework.
The EU AI Act (entered into force August 2024) creates explicit supply chain obligations for AI. Providers of high-risk AI systems must maintain technical documentation including training data characteristics, system architecture, and validation results, and must complete a conformity assessment before placing a system on the market. Deployers of third-party high-risk AI systems must use them in accordance with the provider's instructions, assign human oversight, and monitor their operation. For the first time, the law treats AI supply chain governance as a legal requirement rather than a best practice, and penalties for breaching these obligations can reach 3% of global annual turnover (up to 7% for prohibited practices).
Traditional incident response plans address security breaches and system outages. AI supply chain incidents create a distinct category that most IR plans do not address: model behaviour incidents, where the AI system continues to function technically but is producing systematically incorrect, biased, or manipulated outputs due to a supply chain compromise or model change.
A model behaviour incident may be invisible to standard monitoring: the system is "up," API calls are completing, and error rates are normal. Detection requires output monitoring, meaning continuous statistical comparison of model outputs against baseline distributions and flagging of unexpected shifts in output characteristics. This is a capability most organisations deploying AI tools have not yet built.
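Comparing outputs against a baseline distribution does not require heavy tooling to prototype. The sketch below uses the population stability index (PSI), one of several possible drift measures and not one prescribed by this module, on a hypothetical claims-triage model that emits one of three labels. The sample data and the 0.25 alert threshold are assumptions; 0.25 is a widely used rule of thumb, not a standard.

```python
import math
from collections import Counter


def psi(baseline, current, floor=1e-6):
    """Population stability index between two samples of categorical outputs."""
    b_counts, c_counts = Counter(baseline), Counter(current)
    total = 0.0
    for category in set(baseline) | set(current):
        # Floor the proportions so a category missing from one sample
        # does not cause a division by zero or log of zero.
        b = max(b_counts[category] / len(baseline), floor)
        c = max(c_counts[category] / len(current), floor)
        total += (c - b) * math.log(c / b)
    return total


# Hypothetical output labels from a claims-triage model.
baseline = ["approve"] * 70 + ["refer"] * 25 + ["decline"] * 5
this_week = ["approve"] * 40 + ["refer"] * 30 + ["decline"] * 30

ALERT_THRESHOLD = 0.25  # assumed rule of thumb; tune to your own workload
score = psi(baseline, this_week)
print(f"PSI = {score:.3f}", "ALERT" if score > ALERT_THRESHOLD else "ok")
```

Here the API would report no errors at all, yet the share of declined claims has risen sixfold. That is exactly the kind of shift that only output monitoring would surface.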
In 2023, researchers at Stanford and UC Berkeley reported, in a paper titled "How Is ChatGPT's Behavior Changing over Time?", that GPT-4's measured performance on several coding and mathematics tasks declined between its March and June 2023 versions. OpenAI did not disclose the cause. The practical implication is that AI systems in production can silently degrade, and organisations without output monitoring will not detect the degradation until business impact surfaces through other channels.
The Aviva case illustrates a near-universal problem: shadow AI — AI tools deployed without central visibility — is the rule, not the exception. A 2023 survey by Salesforce found that 55% of employees were using AI tools that had not been approved by their employer's IT function. This shadow deployment creates a risk register that the organisation cannot see, manage, or respond to.
Organisations that have successfully addressed this have typically combined three measures: (1) a mandatory AI tool registry requiring any team deploying an AI tool to submit a brief standardised declaration covering the tool, vendor, data types processed, and contract status; (2) network monitoring to detect traffic to known AI service endpoints, creating visibility into unapproved deployments; and (3) a fast-track approval process that takes no more than five business days for low-risk AI tools, removing the incentive to bypass procurement by making compliant procurement genuinely fast.
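The first two measures reinforce each other: the registry tells you what has been declared, and network monitoring tells you what is actually in use. The following is a minimal sketch of that reconciliation. The `Declaration` fields follow the declaration contents listed above, while the domain names, the `KNOWN_AI_DOMAINS` list, and the observed hosts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """One registry entry: tool, vendor, data types processed, contract status."""

    tool: str
    vendor_domain: str
    data_types: tuple
    dpa_signed: bool


# Hypothetical list of AI service endpoints maintained by the security team.
KNOWN_AI_DOMAINS = {
    "api.example-llm.com",
    "chat.example-ai.net",
    "inference.example-ml.io",
}


def undeclared_ai_domains(observed_hosts, registry):
    """Return AI endpoints seen on the network that no team has declared."""
    declared = {entry.vendor_domain for entry in registry}
    seen_ai = {host for host in observed_hosts if host in KNOWN_AI_DOMAINS}
    return sorted(seen_ai - declared)


registry = [
    Declaration("support-chatbot", "api.example-llm.com", ("customer queries",), True),
]

# Hypothetical hostnames extracted from proxy or DNS logs.
observed = ["api.example-llm.com", "chat.example-ai.net", "intranet.example.org"]

shadow = undeclared_ai_domains(observed, registry)
print("Undeclared AI endpoints:", shadow)
```

Each undeclared endpoint becomes a prompt to invite the team into the fast-track approval process rather than a trigger for enforcement, which keeps the incentive to declare intact.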
Governing AI vendor risk requires extending your vendor risk management framework in three directions simultaneously: contracts that address model-specific risks, incident response plans that cover model behaviour incidents (not just security breaches), and registry processes that create visibility into shadow AI deployments before they become compliance events. The cost of remediation consistently exceeds the cost of prevention by a factor of three to ten — as Aviva's experience demonstrates.
In this lab you will practise the governance design skills required to manage AI vendor risk at an organisational level: drafting specific contract provisions, designing a practical shadow AI registry process, and building AI-specific incident response procedures that address model behaviour incidents — not just security breaches.
Describe your organisation's current AI vendor governance situation — what you have in place and what gaps you are aware of — and the assistant will help you prioritise and draft specific governance artefacts appropriate to your context.