Module 5 · Lesson 1

AI-Hallucinated Packages and Dependency Confusion

When AI code generators invent package names that don't exist, and attackers rush to claim them.
How does a package name that never existed become a critical attack surface?

In 2023, security researchers at Vulcan Cyber published findings demonstrating that large language models including ChatGPT and Copilot routinely suggested non-existent npm, PyPI, and RubyGems packages with convincing-sounding names. They coined the attack pattern "AI package hallucination." In one documented test, GPT-3.5 recommended the package huggingface-cli for a task where no such package existed under that exact name on PyPI, which meant an attacker could trivially register it.

A separate 2023 study by researchers at the University of Texas at San Antonio found that in tests across multiple AI models, up to 30% of AI-suggested packages did not exist in the target registry. The research was published under the title "Hallucinating AI Hackathon" and presented at DEF CON 31.

Why AI Models Hallucinate Package Names

Large language models learn to write code by training on massive corpora of open-source repositories, documentation, Stack Overflow answers, and GitHub issues. Within this corpus, references to packages appear with enormous variation: version-pinned requirements files, informal blog posts, deprecated names, forks, and packages that existed briefly before removal. The model learns statistical patterns (for example, that certain task descriptions tend to appear alongside certain package names), but it has no live connection to any package registry.

When asked to complete a task for which it has seen many partial, contradictory, or sparse examples, the model generates a plausible-sounding package name by interpolating across its training data. The result is a package name that feels right (it follows naming conventions and matches the ecosystem's style) but does not correspond to any published package.

This is not a rare edge case. Because AI-generated code is now being adopted at scale without verification, the gap between "AI suggested it" and "it actually exists and is safe" has become one of the most quietly dangerous failure modes in modern software development.

Documented Case: The "huggingface-cli" Pattern

When Vulcan Cyber researchers asked ChatGPT to write code using Hugging Face's API, the model suggested installing packages like huggingface-cli and several other variations that did not exist on PyPI. These names were plausible enough that a developer running pip install without verification would simply see an install failure or, if an attacker had pre-registered the name, would silently install malicious code.

Dependency Confusion: The Compound Threat

Dependency confusion is a distinct but related attack first documented by security researcher Alex Birsan in February 2021. Birsan demonstrated that by registering public package names that matched internal private package names used by companies including Apple, Microsoft, and Netflix, he could cause their build systems to automatically pull his (benign, proof-of-concept) packages instead of the intended private ones. He collected bug bounty payouts exceeding $130,000 from over 35 companies before publishing his findings.

The dependency confusion attack exploits a resolver priority flaw: when the same name exists in both a public and a private registry, many package manager configurations pull the public copy, typically because it advertises a higher version number. An AI model generating code that references private internal package names (learned from leaked configuration files, open-source forks, or developer blog posts in its training data) creates precisely the conditions Birsan exploited, at scale.
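The resolver behavior described above can be illustrated with a toy model. This is only a sketch, assuming a resolver that merges candidates from every configured index and keeps the highest version; the index names, package names, and version numbers are hypothetical, and real resolvers have many more rules.

```python
from __future__ import annotations

# Toy model: every configured index contributes candidates, and the highest
# version wins regardless of which index it came from. All data is hypothetical.


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.4.0' into (1, 4, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def naive_resolve(name: str, indexes: dict[str, dict[str, list[str]]]) -> tuple[str, str]:
    """Return (index_name, version) for the highest version across all indexes."""
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found in any index")
    _, index_name, version = max(candidates)
    return index_name, version


def scoped_resolve(name, indexes, private_names, private_index="private"):
    """Mitigation sketch: known-internal names are only looked up privately."""
    if name in private_names:
        return naive_resolve(name, {private_index: indexes[private_index]})
    return naive_resolve(name, indexes)


indexes = {
    "private": {"internal-datatools": ["1.4.0"]},
    "public": {"internal-datatools": ["99.0.0"], "requests": ["2.31.0"]},
}
print(naive_resolve("internal-datatools", indexes))
print(scoped_resolve("internal-datatools", indexes, {"internal-datatools"}))
```

The first call selects the attacker's public 99.0.0 release; the second stays on the private index because the name is declared internal. Restricting internal names to a single trusted index is one possible mitigation, shown here only to make the contrast visible.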

When combined, AI hallucination and dependency confusion create a compound threat: the AI suggests a non-existent or internal package name, the developer installs it without checking, and an attacker who has registered that name on the public registry delivers a malicious payload.

Package Hallucination: An AI model generating a reference to a package name that does not exist in the target package registry, derived from interpolation across training data rather than live registry lookup.
Dependency Confusion: An attack where a malicious public package with the same name as an internal private package is automatically pulled by a package manager that prioritizes public registries.
Typosquatting: Registering a package name that closely resembles a popular legitimate package (e.g., requesst vs requests) to capture developers who mistype or blindly follow AI suggestions.
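A typosquatting check can be approximated with a string-similarity comparison against names you already trust. The sketch below uses Python's standard difflib; the KNOWN_PACKAGES list and the 0.85 cutoff are illustrative assumptions, not recommended values.

```python
import difflib
import re

# Hypothetical list of trusted names; a real audit would use a curated source.
KNOWN_PACKAGES = ["requests", "numpy", "pandas", "boto3", "fastapi", "sqlalchemy"]


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of '-', '_', '.' into a single '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def near_misses(name, known=KNOWN_PACKAGES, cutoff=0.85):
    """Return trusted names that the candidate closely resembles but does not equal."""
    candidate = normalize(name)
    known_normalized = [normalize(item) for item in known]
    if candidate in known_normalized:
        return []  # exact match with a trusted name, nothing to flag
    return difflib.get_close_matches(candidate, known_normalized, n=3, cutoff=cutoff)


print(near_misses("requesst"))
print(near_misses("requests"))
```

A non-empty result means the name is a near miss of something trusted and deserves a closer look; an empty result says nothing about whether the package is safe.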
Auditing AI-Generated Dependency Recommendations

The audit workflow for AI-generated package references must include explicit existence verification before any installation. This sounds obvious but is routinely skipped: developers working at AI-accelerated pace often treat the AI's output as pre-validated. It is not.

Step 1: Extract all package references. Parse requirements.txt, package.json, go.mod, Cargo.toml, or equivalent files generated by the AI for every named dependency. Do not rely on the AI to have listed only real packages.

Step 2: Verify existence in the canonical registry. For every name, perform a direct registry query: pip index versions [package], npm view [package], or equivalent. A 404 or "package not found" response is a critical finding requiring immediate escalation.

Step 3: Verify the publisher and history. A package that exists but was registered within the last 30 days with zero prior versions and a single anonymous maintainer is a red flag, especially if its name closely matches an AI-suggested one.

Step 4: Check for scope/namespace confusion. AI models frequently confuse scoped npm packages (e.g., @company/utils) with unscoped variants. Verify the exact registry path matches what was intended.
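Steps 1 and 2 can be sketched for a Python requirements file as follows. The example assumes PyPI's public JSON endpoint (https://pypi.org/pypi/NAME/json) as the existence check and uses a deliberately simplified line parser; the function names are hypothetical, and the lookup is injectable so the logic can be exercised without network access.

```python
import re
import urllib.error
import urllib.request

# Simplified: captures the leading project name and ignores extras, markers, URLs.
NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def extract_names(requirements_text: str) -> list:
    """Step 1: pull every named dependency out of requirements-style text."""
    names = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = NAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def exists_on_pypi(name: str, timeout: float = 10.0) -> bool:
    """Step 2: True if PyPI serves metadata for the name, False on a 404."""
    url = f"https://pypi.org/pypi/{name}/json"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other errors are inconclusive, so surface them


def audit(requirements_text: str, exists=exists_on_pypi) -> dict:
    """Map each extracted name to whether the registry lookup found it."""
    return {name: exists(name) for name in extract_names(requirements_text)}


# Offline demonstration with a stand-in lookup instead of a live query.
sample = "requests>=2.0\npandas-streamutils==0.1  # suggested by the AI\n"
print(audit(sample, exists=lambda name: name == "requests"))
```

Any name that maps to False is a critical finding under Step 2. A name that maps to True has only passed the existence check; Steps 3 and 4 still apply.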

Key Audit Principle

Treat every package name in AI-generated code as unverified until explicitly confirmed against a live registry query. The AI has no knowledge of which packages currently exist, have been renamed, removed for malware, or were never published.

Lesson 1 Quiz

AI-Hallucinated Packages and Dependency Confusion
1. According to documented research, approximately what percentage of AI-suggested packages were found to not exist in target registries in University of Texas at San Antonio testing?
Correct. The UTSA research presented at DEF CON 31 found up to 30% of AI-suggested packages did not exist in the target registry.
Incorrect. The documented finding from the UTSA study presented at DEF CON 31 was up to 30% of AI-suggested packages were non-existent.
2. What was the name of the researcher who first publicly documented and exploited dependency confusion in 2021?
Correct. Alex Birsan published the dependency confusion research in February 2021 and received over $130,000 in bug bounties from more than 35 companies.
Incorrect. Alex Birsan is the researcher who documented and demonstrated the dependency confusion attack in February 2021, earning over $130,000 in bug bounties.
3. Why does a package manager typically pull a malicious public package instead of the intended private one in a dependency confusion attack?
Correct. The core vulnerability Birsan exploited was that package managers preferentially resolve from public registries when a name collision exists.
Incorrect. The dependency confusion attack works because package managers typically prioritize public registries over private ones when the same package name exists in both locations.
4. When auditing AI-generated dependency files, a package registered within the last 30 days with a single anonymous maintainer should be treated as:
Correct. Newly registered packages with minimal history and anonymous maintainers, especially those matching AI-suggested names, are classic indicators of potential supply chain attacks.
Incorrect. A newly registered package matching an AI-suggested name, with a single anonymous maintainer, is a significant red flag and should not be approved without thorough investigation.

Lab 1: Auditing AI-Suggested Package References

Practice identifying hallucinated and suspicious dependency names in AI-generated code.

Lab Scenario

You have received a Python requirements file generated by an AI assistant for a new internal data pipeline project. Your task is to audit the dependency list for hallucinated, suspicious, or dependency-confusion-risk package names before any installation proceeds.

Work with the AI security tutor below. Describe your audit methodology, ask about specific package name patterns, and explore what signals indicate a high-risk dependency suggestion.

Suggested opening: "I have a requirements.txt with packages including 'pandas-streamutils', 'aws-secretsmanager-helper', and 'internal-datatools'. Walk me through how I should audit each of these for hallucination or supply chain risk."
Supply Chain Audit Tutor
Lab 1
Ready to help you audit AI-generated dependency files. Share the package names you want to evaluate, and I'll walk you through the risk signals for each one: existence verification, publisher history, naming pattern analysis, and dependency confusion indicators. What does your requirements list look like?
Module 5 · Lesson 2

Software Bill of Materials and Transitive Dependencies

AI-generated code creates dependency trees of unknown depth. What you can't see can still compromise you.
How do you audit what the AI didn't tell you was there?

The December 2021 Log4Shell vulnerability (CVE-2021-44228) illustrated with catastrophic clarity why transitive dependencies matter. Log4j was not a package most developers knowingly chose; it was pulled in as a dependency of a dependency of a dependency. Thousands of organizations discovered they were running vulnerable Log4j versions only after exploitation attempts began. The U.S. CISA described the vulnerability as "one of the most serious" ever discovered, with estimated remediation costs in the billions.

AI code generators do not track or disclose the transitive dependency trees of packages they recommend. When an AI suggests using Spring Boot or a cloud SDK, the hundreds of transitive dependencies pulled along with that choice are invisible to both the AI and the developer unless explicitly inventoried.

What Is a Software Bill of Materials (SBOM)?

An SBOM is a formal, machine-readable inventory of every software component in an application, including direct dependencies, transitive dependencies, and their versions and licensing information. The concept gained regulatory force in the United States with Executive Order 14028 (May 2021), which mandated SBOM generation for software sold to federal agencies. NIST's guidelines under the EO referenced SPDX and CycloneDX as the two primary SBOM standards.

For AI-generated code specifically, the SBOM serves a function beyond compliance: it makes visible the full dependency graph that the AI implicitly created by recommending certain packages. A developer who accepts AI-suggested code may end up with 200 transitive packages they never consciously chose and cannot name without tooling.

SPDX (Software Package Data Exchange) is maintained by the Linux Foundation and is the ISO/IEC 5962:2021 standard. CycloneDX is maintained by OWASP and is optimized for security use cases, including vulnerability correlation. Both are accepted under EO 14028.
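To make "machine-readable inventory" concrete, here is a minimal sketch that emits a CycloneDX-shaped JSON document from a name-to-version mapping. It fills in only a handful of top-level fields (bomFormat, specVersion, version, components), is not validated against the CycloneDX schema, and is no substitute for a real generator; the helper names are hypothetical.

```python
import importlib.metadata
import json


def to_purl(name: str, version: str) -> str:
    """Build a package URL for a PyPI component (lowercased, '_' becomes '-')."""
    return f"pkg:pypi/{name.lower().replace('_', '-')}@{version}"


def build_sbom(components: dict) -> dict:
    """Return a minimal CycloneDX-style document for {name: version} pairs."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.5",
        "version": 1,
        "components": [
            {
                "type": "library",
                "name": name,
                "version": version,
                "purl": to_purl(name, version),
            }
            for name, version in sorted(components.items())
        ],
    }


def installed_components() -> dict:
    """Collect {name: version} for every distribution in the current environment."""
    found = {}
    for dist in importlib.metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            found[name] = dist.version
    return found


print(json.dumps(build_sbom({"requests": "2.31.0"}), indent=2))
```

Calling build_sbom(installed_components()) inventories whatever is installed in the active environment, transitive packages included, which is usually a much longer list than the requirements file suggests.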

EO 14028 and SBOM Mandates

Executive Order 14028 "Improving the Nation's Cybersecurity" (May 12, 2021) required that software providers selling to the federal government provide SBOMs. NIST published guidance in NIST SP 800-218 (Secure Software Development Framework) and NIST SP 800-161r1 (C-SCRM) addressing supply chain risk management practices that directly apply to AI-generated code workflows.

Transitive Dependency Risks in AI-Generated Projects

When AI tools generate project scaffolding or suggest frameworks, they select packages optimized for the task described, not for minimal dependency footprint or supply chain hygiene. The result is typically a project that inherits a large transitive dependency tree from day one, often including packages with known CVEs that have not been patched in the version range the AI specified.

A documented example: researchers at Endor Labs published a 2023 study titled "State of Dependency Management" finding that 95% of vulnerable open-source package versions arise from transitive dependencies, not direct ones. Of the top 10 critical vulnerabilities affecting open-source projects, 9 were in transitive dependencies. AI code generators do not audit for this; they recommend based on training data that may reflect version requirements predating known vulnerabilities.

The practical implication: when auditing AI-generated code, the first-layer requirements file is only the starting point. The complete dependency graph, including all transitive dependencies, must be resolved and scanned before the code is considered safe to use.
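The idea of resolving the complete graph can be shown with a small breadth-first walk. This is a sketch over a hand-written, illustrative dependency map (not the actual resolved tree of any real release); in practice the graph would come from a resolver or lockfile.

```python
from collections import deque


def transitive_closure(direct, graph):
    """Return {package: depth} for everything reachable; depth 1 means direct."""
    depth = {}
    queue = deque((name, 1) for name in direct)
    while queue:
        name, level = queue.popleft()
        if name in depth:
            continue  # already reached at an equal or shallower depth
        depth[name] = level
        for child in graph.get(name, []):
            queue.append((child, level + 1))
    return depth


# Illustrative graph only; edges are assumptions made for the example.
graph = {
    "fastapi": ["starlette", "pydantic"],
    "starlette": ["anyio"],
    "anyio": ["idna", "sniffio"],
    "pydantic": ["typing-extensions"],
}
depths = transitive_closure(["fastapi"], graph)
transitive_only = sorted(name for name, level in depths.items() if level > 1)
print(len(depths), "packages,", len(transitive_only), "of them transitive")
```

One declared dependency expands to seven packages in this toy graph, six of which never appear in the requirements file. Those six are the ones a first-layer review misses.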

Transitive Dependency: A package that is not directly imported by your code, but is required by one of your direct dependencies, or by a dependency of a dependency, recursively.
SBOM: Software Bill of Materials. A complete, machine-readable inventory of all software components in an application, including versions and licensing, covering both direct and transitive dependencies.
Dependency Pinning: Specifying exact version numbers rather than version ranges in dependency files, to prevent automatic upgrades to versions that may introduce new vulnerabilities or breaking changes.
Generating and Using SBOMs for AI-Generated Code

Tooling for SBOM generation: Syft (Anchore) generates SBOMs in both SPDX and CycloneDX format for containers and filesystems. cdxgen (OWASP) generates CycloneDX SBOMs from package manifests. pip-audit scans Python environments against the Python Packaging Advisory Database and can optionally query the OSV vulnerability database instead. npm audit and Dependabot cover Node.js transitive trees.

The audit workflow: (1) accept AI-generated dependency manifest; (2) resolve the full transitive dependency tree without installing to production, using pip-compile, npm install --package-lock-only, or equivalent; (3) generate an SBOM from the resolved tree; (4) scan the SBOM against NVD, OSV, and GitHub Advisory Database; (5) review any flagged CVEs against severity threshold before proceeding.
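Step (4) of this workflow can be sketched against the OSV query endpoint (POST https://api.osv.dev/v1/query with a package name, ecosystem, and version). The function names below are hypothetical, error handling is omitted, and the lookup is injectable so the scanning logic can be tried offline with a stand-in.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def osv_payload(name: str, version: str, ecosystem: str = "PyPI") -> dict:
    """Build the request body for a single package-version query."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def query_osv(name: str, version: str, ecosystem: str = "PyPI", timeout: float = 10.0) -> list:
    """Return the list of vulnerability records OSV reports for this version."""
    body = json.dumps(osv_payload(name, version, ecosystem)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response).get("vulns", [])


def scan(pins: dict, lookup=query_osv) -> dict:
    """Map 'name==version' to advisory IDs for every pinned package with findings."""
    findings = {}
    for name, version in pins.items():
        vulns = lookup(name, version)
        if vulns:
            findings[f"{name}=={version}"] = [v.get("id", "?") for v in vulns]
    return findings


# Offline demonstration with a fake lookup; the advisory ID is made up.
fake = lambda name, version: [{"id": "EXAMPLE-0001"}] if name == "oldlib" else []
print(scan({"oldlib": "1.0.0", "newlib": "2.0.0"}, lookup=fake))
```

The pins passed to scan should be the fully resolved tree from step (2), not just the direct dependencies, otherwise the transitive packages are never checked.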

A critical audit point specific to AI-generated code: AI models frequently suggest unpinned version ranges (e.g., requests>=2.0 rather than requests==2.28.2). Unpinned ranges allow package managers to silently upgrade to newer versions that may introduce new vulnerabilities or, in cases of compromised maintainer accounts, injected malicious code. All AI-generated version specifications should be pinned before deployment.
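A pinning check is easy to automate. The sketch below flags any requirements line that is not an exact == pin; the regular expression is a simplification of the full requirement grammar (it treats wildcards such as ==1.* and compound specifiers as unpinned), and the function name is hypothetical.

```python
import re

# Matches: name, optional [extras], '==' (or '==='), one exact version with no
# wildcard, and an optional environment marker after ';'.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]*\])?\s*===?\s*[^\s;,*]+\s*(;.*)?$"
)


def unpinned(requirements_text: str) -> list:
    """Return requirement lines that do not pin a single exact version."""
    flagged = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0]       # drop comments
        line = line.split(" --", 1)[0]    # drop per-line options such as --hash
        line = line.strip().rstrip("\\").strip()
        if not line or line.startswith("-"):
            continue                      # blank lines and global options
        if not PINNED.match(line):
            flagged.append(line)
    return flagged


sample = """\
requests>=2.0
requests==2.28.2
flask
numpy==1.*
"""
print(unpinned(sample))
```

Only the exact pin passes; the range, the bare name, and the wildcard are all reported.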

Audit Checklist

1. Generate a full SBOM before any production deployment of AI-generated code.
2. Scan all transitive dependencies, not just direct ones.
3. Pin all version numbers and reject unpinned ranges.
4. Track SBOM against new CVE feeds on an ongoing basis, not just at initial generation.

Lesson 2 Quiz

Software Bill of Materials and Transitive Dependencies
1. According to the 2023 Endor Labs "State of Dependency Management" study, what percentage of vulnerable open-source package versions arise from transitive (not direct) dependencies?
Correct. The Endor Labs 2023 research found 95% of vulnerable versions come from transitive, not direct, dependencies.
Incorrect. The Endor Labs 2023 "State of Dependency Management" study found that 95% of vulnerable open-source package versions arose from transitive dependencies.
2. Which U.S. executive order mandated SBOM generation for software sold to federal agencies?
Correct. Executive Order 14028, "Improving the Nation's Cybersecurity" (May 2021), established SBOM requirements for federal software suppliers.
Incorrect. Executive Order 14028, signed in May 2021, is the one that mandated SBOM generation for software sold to U.S. federal agencies.
3. Why is dependency pinning (specifying exact versions) especially important for AI-generated dependency files?
Correct. AI models tend to suggest flexible version ranges rather than exact pinned versions, which can allow silent upgrades to versions containing new vulnerabilities or injected malicious code.
Incorrect. The primary concern is that AI-suggested unpinned ranges allow silent upgrades to newer versions that may introduce vulnerabilities or, in compromised-maintainer scenarios, malicious code.
4. The Log4Shell vulnerability (CVE-2021-44228) is frequently cited in supply chain security discussions because:
Correct. Log4Shell highlighted how transitive dependencies create invisible risk: thousands of organizations were running vulnerable Log4j without knowing it was present in their stack.
Incorrect. Log4Shell's key supply chain lesson was that most affected organizations didn't know they had Log4j because it entered their systems as a transitive dependency of other packages they consciously chose.

Lab 2: SBOM Generation and Transitive Dependency Analysis

Practice the SBOM workflow and transitive dependency audit process for AI-generated projects.

Lab Scenario

An AI assistant has scaffolded a new Python web API project using FastAPI, SQLAlchemy, and boto3 for AWS S3 access. Your security review must include a complete SBOM analysis covering all transitive dependencies before the project can proceed to staging.

Use the AI tutor below to work through the SBOM generation process, understand which tools to use, and discuss how to interpret vulnerability scan results from the resolved dependency tree.

Suggested opening: "Walk me through generating an SBOM for a Python project using FastAPI and boto3. What tooling should I use and what should I be looking for in the transitive dependency tree?"
SBOM & Dependency Audit Tutor
Lab 2
Ready to walk you through SBOM generation and transitive dependency analysis for AI-scaffolded Python projects. Tell me about your project's direct dependencies, or ask about tooling like Syft, cdxgen, pip-audit, or pip-compile, and we'll work through what a thorough supply chain audit looks like from manifest to scan results.
Module 5 · Lesson 3

Compromised Maintainer Accounts and Malicious Package Updates

The package was legitimate when the AI trained on it. It may not be legitimate now.
How do you audit against threats that didn't exist when the AI learned what it knows?

In October 2021, the npm package ua-parser-js, downloaded over 7 million times per week and used by Facebook, Microsoft, and Amazon, was compromised when its maintainer's npm account was taken over. The attacker published versions 0.7.29, 0.8.0, and 1.0.0 containing a cryptocurrency miner and a password stealer. The compromise was live for several hours before detection. Any CI/CD pipeline using unpinned version ranges pulled the malicious code automatically.

In March 2022, the npm package node-ipc (used by the popular Vue CLI) was deliberately sabotaged by its maintainer, Brandon Nozaki Miller, who pushed versions that overwrote files with a heart emoji on computers with Russian or Belarusian IP addresses, as a protest against the invasion of Ukraine. This event, catalogued as CVE-2022-23812, raised fundamental questions about maintainer trustworthiness independent of account compromise.

In March 2024, the XZ Utils backdoor (CVE-2024-3094) was discovered: a multi-year social engineering attack in which a threat actor known as "Jia Tan" systematically built trust with the xz maintainer over two years before inserting a sophisticated backdoor into versions 5.6.0 and 5.6.1. Microsoft engineer Andres Freund discovered the backdoor accidentally while investigating unusual SSH performance.

Why AI-Generated Code Is Especially Vulnerable

AI models are trained on a fixed corpus with a training cutoff date. When a model recommends a package, it is recommending the package as it existed, and as it was reviewed by the community, up to that cutoff. It has no knowledge of compromises that occurred afterward. The model may confidently recommend ua-parser-js in its training-data-known-good state, while the package being installed is a post-compromise version containing malware.

This temporal gap is a structural vulnerability in AI-assisted development. The AI cannot say "this package was compromised six months ago." It can only say "this package was widely used and highly rated in the data I was trained on."

When AI-generated code enters a codebase with unpinned version ranges, every subsequent dependency resolution is a fresh trust decision that the AI cannot inform. The combination of AI-recommended packages, unpinned versions, and automated dependency updates creates a pipeline where malicious updates can enter production without any human reviewing what changed.

The XZ Utils Attack Pattern

The XZ Utils backdoor (CVE-2024-3094) demonstrated a sophisticated supply chain attack requiring years of social engineering: the attacker contributed legitimate, useful code for 2+ years before the malicious commit. AI models trained on pre-compromise repository data would recommend xz with no indication of the risk. This attack was only discovered by accident; no automated tool detected it prior to Andres Freund's investigation in March 2024.

Detecting and Responding to Compromised Dependencies

Monitoring for compromise: Subscribe to security advisory feeds for every package in your SBOM. The GitHub Advisory Database, OSV (Open Source Vulnerabilities database), and Sonatype's vulnerability data all track known-compromised packages. Tools like Dependabot, Snyk, and Socket.dev can automatically flag newly reported compromises against your dependency tree.

Socket.dev specifically targets the problem of newly suspicious packages: it analyzes behavioral signals in package updates rather than waiting for CVE publication, flagging packages that suddenly add network access, file system access, or obfuscated code that wasn't present in prior versions.

Version pinning as a mitigation: Pinning to exact versions prevents automatic adoption of compromised updates, but requires a deliberate upgrade process and ongoing monitoring of pinned versions against advisories. A pinned version is safer against surprise compromise but must be actively maintained.

Integrity verification: Most modern package registries publish cryptographic hashes for packages. npm uses lockfile integrity fields (SHA-512), PyPI provides package hashes, and tools like pip-compile --generate-hashes create requirements files with integrity checks. AI-generated code rarely includes these verification steps; auditors must add them.
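The two hash formats mentioned here can be reproduced with the standard library. This sketch assumes the pip convention of sha256:HEX and the npm lockfile convention of sha512-BASE64; the function names are hypothetical, and in a real audit the bytes would come from the downloaded artifact.

```python
import base64
import hashlib
import hmac


def pip_style_hash(data: bytes) -> str:
    """Format used in pip requirements files: 'sha256:' plus a hex digest."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


def npm_style_integrity(data: bytes) -> str:
    """Format used in npm lockfile integrity fields: 'sha512-' plus base64."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches(data: bytes, expected: str) -> bool:
    """Check artifact bytes against an expected value in either format."""
    if expected.startswith("sha512-"):
        actual = npm_style_integrity(data)
    elif expected.startswith("sha256:"):
        actual = pip_style_hash(data)
    else:
        raise ValueError(f"unsupported integrity format: {expected!r}")
    return hmac.compare_digest(actual, expected)


artifact = b"pretend these are the bytes of a downloaded wheel"
recorded = pip_style_hash(artifact)          # what a lockfile would store
print(matches(artifact, recorded))           # same bytes
print(matches(artifact + b"!", recorded))    # tampered bytes
```

The point of recording the hash at lock time is that a later download with different bytes fails the comparison even when the version number is unchanged.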

Account Takeover (ATO): An attacker gains unauthorized access to a legitimate package maintainer's registry account, typically through credential stuffing or phishing, and publishes malicious package versions under the legitimate identity.
Protestware: Software deliberately sabotaged by its own maintainer for political or ideological reasons, as documented in the node-ipc CVE-2022-23812 incident.
Subresource Integrity (SRI): A browser security mechanism that verifies fetched resources match a known cryptographic hash; the same concept is extended to package verification in the supply chain context.
Auditing AI Code for Compromise-Resilient Patterns

When reviewing AI-generated project configurations, check specifically for: (1) Whether dependency files use exact version pinning or ranges. (2) Whether lockfiles are generated and committed (package-lock.json, poetry.lock, Pipfile.lock, yarn.lock). (3) Whether any automated update tools (Dependabot, Renovate) are configured with appropriate review gates rather than auto-merge. (4) Whether package integrity hashes are included in lock files and verified during installation.

AI models frequently generate code that skips lockfile generation, uses broad version ranges, and does not include hash verification, because the training data they learned from frequently exhibited these patterns. These are not AI errors in the conventional sense; they are accurate reproductions of common but insecure practices from the training corpus.

Temporal Blind Spot

Every AI-generated package recommendation reflects the state of that package at training time. Your audit must account for the gap between the model's training cutoff and today; this gap may span months or years of supply chain events the model has no knowledge of.

Lesson 3 Quiz

Compromised Maintainer Accounts and Malicious Package Updates
1. The ua-parser-js npm package compromise in October 2021 inserted what type of malicious code?
Correct. The compromised ua-parser-js versions contained both a cryptocurrency miner and a password-stealing component.
Incorrect. The ua-parser-js compromise inserted a cryptocurrency miner and a password stealer into versions 0.7.29, 0.8.0, and 1.0.0.
2. The XZ Utils backdoor (CVE-2024-3094) is significant from a supply chain perspective primarily because:
Correct. The XZ backdoor required approximately two years of the attacker ("Jia Tan") building trust through legitimate contributions before the malicious commit, and was discovered accidentally, not by automated tools.
Incorrect. The XZ Utils backdoor was notable for its multi-year social engineering approach: the attacker contributed legitimate code for roughly two years to build trust before introducing the backdoor, and it was discovered accidentally, not by any scanning tool.
3. The term "protestware" refers to:
Correct. The node-ipc incident (CVE-2022-23812) is a documented example of protestware: a maintainer intentionally introducing destructive behavior for ideological reasons.
Incorrect. Protestware refers to software deliberately sabotaged by its own maintainer, as in the node-ipc case (CVE-2022-23812) where the maintainer introduced file-destroying code as a political protest.
4. Why does an AI model's training cutoff create a specific vulnerability when it recommends packages?
Correct. The training cutoff means the AI's "knowledge" of a package reflects its state at that point; post-cutoff compromises, CVEs, and maintainer account takeovers are invisible to the model.
Incorrect. The training cutoff vulnerability means the AI model recommends packages as they were known at training time; it has no awareness of supply chain compromises, account takeovers, or CVEs discovered after that date.

Lab 3: Evaluating Compromised Package Risk in AI-Generated Code

Practice assessing package integrity controls and compromise detection workflows.

Lab Scenario

Your team has received an AI-generated Node.js project that uses unpinned dependencies in package.json, has no lockfile committed, and has Dependabot configured with auto-merge enabled. You have been asked to assess the compromise exposure and recommend specific mitigations.

Work with the tutor to develop an assessment of the configuration's risk and a prioritized remediation plan addressing version pinning, lockfile management, integrity verification, and monitoring.

Suggested opening: "I have a Node.js project with unpinned dependencies and Dependabot set to auto-merge. Walk me through the supply chain compromise risks and what I should fix first."
Package Integrity & Compromise Tutor
Lab 3
Ready to help you assess compromise exposure in AI-generated dependency configurations. Share your setup (package.json structure, lockfile status, and update automation configuration) and we'll work through the risks and prioritized mitigations, including version pinning, lockfile integrity, hash verification, and monitoring with tools like Socket.dev and the GitHub Advisory Database.
Module 5 · Lesson 4

CI/CD Pipeline Integrity and Build-Time Injection Risks

AI-generated pipeline configurations introduce attack surface at every build step, and most developers never audit them.
What happens when the code that builds your code is also untrusted?

The SolarWinds Orion compromise, revealed in December 2020, remains the definitive documented case of build pipeline injection at scale. Attackers attributed to Cozy Bear / APT29 injected malicious code into SolarWinds' build system, not the source code repository, such that the malware was compiled into official, digitally-signed SolarWinds binaries. Approximately 18,000 organizations installed the compromised updates, including nine U.S. federal agencies. The attack was active for nine months before discovery.

On a smaller but more directly relevant scale: in 2023, researchers at Palo Alto Unit 42 documented GitHub Actions supply chain attacks in which compromised third-party Actions (the reusable workflow components that AI models routinely suggest in CI/CD configurations) were used to exfiltrate secrets from build environments. AI coding assistants frequently suggest GitHub Actions by name and version, and those recommendations may reflect outdated, now-compromised Action versions.

How AI-Generated Pipeline Configs Create Exposure

When developers ask AI assistants to generate CI/CD configurations (GitHub Actions workflows, GitLab CI YAML, Jenkinsfiles, CircleCI configs), the AI produces configurations based on patterns from its training data. These configurations typically include references to specific third-party Actions, Docker base images, and build scripts. Each of these references is a potential injection point if the referenced resource has been compromised or replaced since the AI's training cutoff.

GitHub Actions are particularly high-risk because they are referenced in the form owner/repo@ref, where the ref may be a tag, a branch, or a commit SHA. AI models frequently suggest Actions pinned to branch references (uses: actions/checkout@main) rather than immutable commit SHAs (uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683). Branch references are mutable: the owner of the Action can push new code to the branch at any time, changing what runs in your pipeline without any change to your workflow file.
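The difference between a mutable ref and a commit SHA is mechanical enough to check with a script. The sketch below scans workflow text line by line for uses: entries and reports any whose ref is not a 40-character hexadecimal SHA. It is a regex pass, not a YAML parser, and some-org/some-action is a hypothetical name.

```python
import re

USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^'\"\s#]+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list:
    """Return every third-party 'uses:' reference not pinned to a commit SHA."""
    findings = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container refs are out of scope here
        _, _, version = ref.partition("@")
        if not SHA_RE.match(version):
            findings.append(ref)
    return findings


workflow = """\
steps:
  - uses: actions/checkout@main
  - uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
  - name: hypothetical third-party step
    uses: some-org/some-action@v1
  - uses: ./local-action
"""
print(unpinned_actions(workflow))
```

Both the branch reference and the tag reference are reported; the SHA-pinned line and the local action are not.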

The tj-actions/changed-files compromise in March 2025 (CVE-2025-30066) demonstrated this concretely: attackers compromised the tj-actions/changed-files GitHub Action and modified it to print repository secrets to workflow logs. Any workflow using this Action with a non-SHA pin was automatically running malicious code. AI tools suggesting this Action by name without SHA pinning would have recommended an attack vector.

tj-actions/changed-files Compromise (March 2025)

CVE-2025-30066 documented the compromise of the widely-used tj-actions/changed-files GitHub Action. Attackers modified the Action to exfiltrate CI/CD secrets by printing them to workflow logs. Organizations with workflows pinned to tag or branch references (the default AI recommendation pattern) were exposed. SHA-pinned workflows were unaffected. This incident affected thousands of repositories.

Secure CI/CD Patterns for AI-Assisted Development

SHA pinning for GitHub Actions: Every third-party Action reference must use an immutable commit SHA rather than a tag or branch. This is the single highest-impact control for GitHub Actions supply chain risk. Tools like Ratchet (by Google) and pin-github-action automate the conversion of tag-based references to SHA pins. The StepSecurity Harden-Runner Action can also monitor and restrict Actions' runtime behavior.

Docker base image pinning: AI-generated Dockerfiles frequently use FROM python:3.11 or similar mutable tags. These should be replaced with digest-pinned references: FROM python:3.11@sha256:abc123…. The Docker Hub tag python:3.11 can be updated to point to a new image at any time; the digest-pinned reference is immutable.
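A digest check for Dockerfiles follows the same idea. This sketch reports FROM lines whose image lacks an @sha256: digest, while skipping scratch and references to earlier build stages; it is a line-based approximation, and the 64-zero digest in the example is a placeholder, not a real image digest.

```python
import re

FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
ALIAS_RE = re.compile(r"\s+AS\s+(\S+)\s*$", re.IGNORECASE)


def mutable_base_images(dockerfile_text: str) -> list:
    """Return base images referenced by tag (or no tag) instead of by digest."""
    stages = set()
    findings = []
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        is_stage = image in stages            # 'FROM build' reuses an earlier stage
        if image.lower() != "scratch" and not is_stage and "@sha256:" not in image:
            findings.append(image)
        alias = ALIAS_RE.search(line)
        if alias:
            stages.add(alias.group(1))
    return findings


placeholder_digest = "0" * 64  # placeholder only, not a real digest
dockerfile = f"""\
FROM python:3.11 AS build
RUN pip install --no-cache-dir example
FROM build AS test
FROM python:3.11-slim@sha256:{placeholder_digest}
"""
print(mutable_base_images(dockerfile))
```

Only the first FROM line is reported: the second refers to a stage defined inside the same file, and the third carries a digest.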

Secrets management: AI-generated pipeline configurations frequently place secrets directly in environment variables with patterns like AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET }} without additional controls. Auditors should verify that secrets are scoped to the minimum necessary steps, that OIDC token-based authentication is used where possible (eliminating long-lived credentials), and that secrets are not printed to logs.

Pipeline permissions: AI-generated GitHub Actions workflows frequently use permissions: write-all or omit permissions entirely (which defaults to write in older configurations). Each workflow should use the minimum necessary permissions. The permissions key should be explicitly declared at the workflow level with only required access granted.
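
A permissions check can be sketched in the same line-based style. The hypothetical `audit_permissions` function below reports any `permissions: write-all` declaration and notes when no workflow-level `permissions` key exists, in which case the token scope depends on repository or organization defaults. It treats a `permissions:` key at column zero as the workflow-level declaration, which is an assumption about how the file is indented.

```python
import re

# Captures indentation and any inline value after `permissions:`.
PERMISSIONS_RE = re.compile(r"^(\s*)permissions:\s*(\S.*)?$")


def audit_permissions(workflow_text: str) -> list[str]:
    """Report write-all grants and a missing workflow-level permissions block."""
    findings = []
    has_top_level = False
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = PERMISSIONS_RE.match(line)
        if not match:
            continue
        indent, value = match.groups()
        if indent == "":
            has_top_level = True
        # Strip a trailing comment before comparing the inline value.
        if value and value.split("#")[0].strip() == "write-all":
            findings.append(f"line {lineno}: permissions: write-all grants every scope")
    if not has_top_level:
        findings.append("no workflow-level permissions block; defaults apply")
    return findings


# Hypothetical workflows for illustration.
RISKY = """\
name: ci
on: [push]
jobs:
  build:
    permissions: write-all
    runs-on: ubuntu-latest
"""

HARDENED = """\
name: ci
on: [push]
permissions:
  contents: read
jobs:
  build:
    runs-on: ubuntu-latest
"""

for finding in audit_permissions(RISKY):
    print(finding)
```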

SHA Pinning: Referencing a GitHub Action by its immutable commit SHA rather than a mutable tag or branch name, ensuring the exact code version used cannot be changed without updating your workflow file.
Build Pipeline Injection: An attack where malicious code is introduced into the build or CI/CD pipeline rather than the source code itself, producing compromised artifacts even from clean source.
OIDC Token Authentication: Using short-lived, automatically rotated OpenID Connect tokens to authenticate CI/CD pipelines to cloud providers, eliminating the need for long-lived static credentials in pipeline configurations.

Auditing AI-Generated Pipeline Configurations

A systematic audit of AI-generated CI/CD configuration should cover:
(1) All third-party Action references: flag any that use branch, tag, or latest references instead of commit SHAs.
(2) All Docker base image references: flag mutable tags without digest pinning.
(3) Workflow permissions: flag write-all, missing permission declarations, or overly broad access.
(4) Secret handling: flag hardcoded values, overly broad secret scopes, or patterns that could cause secrets to be logged.
(5) Script injection vectors: flag workflow steps that interpolate user-controlled input (PR titles, branch names, commit messages) directly into shell commands.

Script injection is a particularly insidious AI-generated pattern. When AI generates workflow steps that use GitHub Actions expression syntax inside run: blocks, such as run: echo "${{ github.event.pull_request.title }}", the result is a command injection vulnerability: the expression is substituted into the script text before the shell runs, so a pull request title containing shell metacharacters is executed as code. The value of github.event.pull_request.title is attacker-controlled input. AI models replicate this pattern widely because it appears frequently in documentation examples.
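
The pattern can be detected mechanically, at least in simple cases. The sketch below flags `${{ github.event.* }}` and `${{ github.head_ref }}` expressions that appear inside `run:` steps, including multi-line blocks. The function name `find_script_injection` is hypothetical, and treating every `github.event.*` value as untrusted is a deliberately conservative assumption. The sample also shows a commonly recommended safer form, where the value is passed through an environment variable and the shell only sees `"$PR_TITLE"`.

```python
import re

# A run step, with or without a leading list dash; group 1 is its indentation.
RUN_RE = re.compile(r"^(\s*(?:-\s+)?)run:(.*)$")
# Conservative assumption: every github.event.* value and github.head_ref is untrusted.
UNTRUSTED_RE = re.compile(r"\$\{\{\s*(github\.(?:event\.[\w.]+|head_ref))\s*\}\}")


def find_script_injection(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line_number, expression) for untrusted expressions inside run steps."""
    findings = []
    run_indent = None  # indentation of the current run: key, if inside one
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = RUN_RE.match(line)
        if match:
            run_indent = len(match.group(1))
            body = match.group(2)
        elif (
            run_indent is not None
            and line.strip()
            and len(line) - len(line.lstrip()) > run_indent
        ):
            body = line  # continuation line of a multi-line run block
        else:
            if line.strip():
                run_indent = None  # a less-indented line ends the run block
            continue
        for expression in UNTRUSTED_RE.findall(body):
            findings.append((lineno, expression))
    return findings


# Hypothetical steps: the first is injectable, the second passes the title via env.
SAMPLE = """\
steps:
  - name: Unsafe
    run: echo "${{ github.event.pull_request.title }}"
  - name: Safer
    env:
      PR_TITLE: ${{ github.event.pull_request.title }}
    run: |
      echo "$PR_TITLE"
"""

for lineno, expression in find_script_injection(SAMPLE):
    print(f"line {lineno}: {expression} is interpolated into a shell command")
```

Only the first step is reported, because in the second step the expression appears under `env:` and reaches the shell as data rather than as script text.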

Critical Audit Rule

Any AI-generated CI/CD configuration should be treated as potentially containing mutable references, overly broad permissions, and script injection patterns until explicitly audited. The AI generates configurations that work, not configurations that are secure.

Lesson 4 Quiz

CI/CD Pipeline Integrity and Build-Time Injection Risks
1. In the SolarWinds Orion compromise, malicious code was injected at which stage of the software supply chain?
Correct. The SolarWinds attack targeted the build system itself, not the source code, so the resulting binaries were legitimately compiled and digitally signed, making detection extremely difficult.
Incorrect. The SolarWinds Orion compromise injected malicious code into the build system, not the source repository, resulting in officially compiled and signed binaries containing malware.
2. Why is referencing a GitHub Action as uses: actions/checkout@v3 considered a supply chain risk compared to using a full commit SHA?
Correct. Tag references are mutable: any update the Action owner pushes under that tag is automatically used in your workflow, making it a live attack surface. Commit SHA references are immutable.
Incorrect. The supply chain risk of tag references is mutability: the Action owner (or an attacker who compromises their account) can change what code the tag points to at any time, instantly affecting all workflows using that tag reference.
3. A GitHub Actions workflow step containing run: echo "${{ github.event.pull_request.title }}" is vulnerable to which type of attack?
Correct. PR titles are attacker-controlled: anyone opening a pull request controls the title. Interpolating them directly into shell commands creates a command injection vulnerability.
Incorrect. This is a script injection vulnerability. The pull request title is attacker-controlled input (anyone can open a PR), and interpolating it directly into a shell run step allows an attacker to inject arbitrary shell commands via the title string.
4. What is the primary security benefit of using OIDC token-based authentication for CI/CD pipelines connecting to cloud providers?
Correct. OIDC-based authentication means no long-lived AWS/GCP/Azure credentials need to be stored in CI/CD secrets; tokens are generated per run and expire automatically.
Incorrect. The primary benefit of OIDC token authentication for CI/CD is eliminating long-lived static credentials. Instead of storing persistent AWS keys or similar credentials as secrets, each pipeline run receives a short-lived token that expires automatically.

Lab 4: Auditing AI-Generated CI/CD Pipeline Configurations

Practice identifying mutable references, insecure permissions, and injection risks in AI-generated workflow files.

Lab Scenario

An AI assistant has generated a GitHub Actions workflow for a Node.js application. The workflow uses several third-party Actions by tag, runs with default permissions, echoes PR metadata in run steps, and stores AWS credentials as static secrets. Your task is to conduct a security audit and produce a remediation plan.

Use the tutor below to work through each risk category: SHA pinning, permissions hardening, script injection prevention, and credential management. Develop specific remediated versions of problematic configurations.

Suggested opening: "I have a GitHub Actions workflow that uses 'actions/checkout@v4', 'tj-actions/changed-files@v35', echoes PR titles in run steps, and stores AWS_SECRET_ACCESS_KEY as a secret. Walk me through auditing each of these issues."
CI/CD Pipeline Security Tutor
Ready to audit your AI-generated GitHub Actions workflow. Share the configuration patterns you want to review (Action references, permissions settings, run-step patterns, or credential handling) and I'll walk you through the specific risks and remediated alternatives for each one. What does your workflow look like?

Module 5 Test

Dependency and Supply Chain Risks: 15 questions, 80% to pass
1. The term "AI package hallucination" refers to:
Correct. AI package hallucination specifically refers to the generation of non-existent package names, coined by Vulcan Cyber in 2023.
Incorrect. AI package hallucination refers to the AI generating references to package names that don't exist in any registry, not mischaracterization of existing packages.
2. Alex Birsan's 2021 dependency confusion research demonstrated that affected companies included:
Correct. Birsan's research affected over 35 companies including Apple, Microsoft, and Netflix, earning him $130,000+ in bug bounties.
Incorrect. Alex Birsan's dependency confusion research affected over 35 companies including Apple, Microsoft, and Netflix, earning him more than $130,000 in bug bounties.
3. Which package manager behavior is the root cause of dependency confusion attacks?
Correct. The resolver priority flaw (public over private registry preference) is the core mechanism that dependency confusion exploits.
Incorrect. Dependency confusion exploits the fact that package managers typically prefer public registries over private ones when a name collision exists between them.
4. An SBOM (Software Bill of Materials) should include:
Correct. A complete SBOM covers the entire dependency graph including transitive dependencies, which is where 95% of vulnerable versions are found, according to Endor Labs research.
Incorrect. An SBOM must include all components, both direct and transitive, with versions and licensing. Excluding transitive dependencies would miss the majority of vulnerability exposure.
5. Executive Order 14028's SBOM mandate applies to:
Correct. EO 14028 mandated SBOM requirements specifically for software sold to federal agencies, though the standards established have broader industry influence.
Incorrect. EO 14028's direct mandate covers software sold to U.S. federal government agencies, though its influence on industry practices is broader.
6. Why are AI-generated version specifications like requests>=2.0 a security risk compared to requests==2.28.2?
Correct. Version ranges allow the package manager to resolve to newer versions automatically, including versions that post-date your security review and may have introduced new vulnerabilities or compromised code.
Incorrect. Version range specifications are a security risk because they allow automatic resolution to newer versions that may contain new CVEs, or in the case of compromised maintainer accounts, injected malicious code that bypasses your initial security review.
7. The ua-parser-js npm package was compromised through which mechanism?
Correct. The ua-parser-js compromise in October 2021 resulted from an attacker taking over the legitimate maintainer's npm account and publishing malicious versions.
Incorrect. The ua-parser-js compromise involved an attacker taking over the legitimate maintainer's npm account and publishing new malicious versions (0.7.29, 0.8.0, 1.0.0) containing a crypto miner and password stealer.
8. The XZ Utils backdoor (CVE-2024-3094) was ultimately discovered:
Correct. The XZ backdoor was discovered accidentally by Andres Freund, who noticed SSH login latency anomalies and traced them to the compromised xz-utils library.
Incorrect. The XZ Utils backdoor was discovered accidentally by Microsoft engineer Andres Freund, who was investigating unusual SSH performance degradation, not by any security tooling or formal audit process.
9. Socket.dev differs from traditional vulnerability scanners in that it:
Correct. Socket.dev focuses on behavioral change detection (flagging packages that suddenly add network access or obfuscated code) rather than relying solely on CVE database matches.
Incorrect. Socket.dev's differentiator is behavioral signal analysis: it detects when packages suddenly add new capabilities like network access or file system writes, rather than waiting for CVEs to be published after exploitation.
10. The SolarWinds Orion compromise was active for approximately how long before discovery?
Correct. The SolarWinds supply chain compromise was active for approximately nine months before it was discovered in December 2020.
Incorrect. The SolarWinds Orion compromise, revealed in December 2020, had been active for approximately nine months before discovery.
11. When an AI generates a GitHub Actions workflow with uses: some-action/tool@v2, the security concern is:
Correct. Tag references are mutable: the action owner (or an attacker who compromises their account) can update the tag to point to new code at any time, making it a live attack surface.
Incorrect. The security concern with tag references is mutability: the action owner can push new code under an existing tag at any time. Only commit SHA references are immutable and safe from this risk.
12. The tj-actions/changed-files compromise (CVE-2025-30066) resulted in:
Correct. The compromised tj-actions/changed-files Action was modified to print CI/CD secrets (environment variables) to workflow logs, potentially enabling their exfiltration.
Incorrect. The tj-actions/changed-files compromise resulted in secrets being printed to workflow logs: the Action was modified to dump environment variables, including any secrets available to the workflow.
13. The two SBOM standards accepted under EO 14028 for federal software supply chain compliance are:
Correct. SPDX (maintained by the Linux Foundation, ISO/IEC 5962:2021) and CycloneDX (maintained by OWASP) are the two primary SBOM standards referenced in EO 14028 guidance.
Incorrect. The two SBOM standards referenced under EO 14028 are SPDX (Software Package Data Exchange, maintained by the Linux Foundation) and CycloneDX (maintained by OWASP).
14. A GitHub Actions workflow step containing run: git commit -m "${{ github.event.issue.title }}" is vulnerable because:
Correct. GitHub issue titles are attacker-controlled: anyone can open an issue. Interpolating them into shell commands creates a classic command injection vulnerability where an attacker can inject arbitrary shell commands via a crafted issue title.
Incorrect. This is a script injection vulnerability. GitHub issue titles are attacker-controlled input (anyone can create an issue with any title), and interpolating them directly into shell commands allows shell metacharacters to break the intended command and execute arbitrary code.
15. When generating a complete dependency audit for AI-generated code, which step must come BEFORE running vulnerability scanners?
Correct. The full transitive dependency tree must be resolved first: scanning only direct dependencies misses the 95% of vulnerable versions that reside in transitive dependencies, per Endor Labs research.
Incorrect. Before scanning, the complete transitive dependency tree must be resolved (using pip-compile, npm ci, or equivalent); scanning only the primary manifest would miss the transitive dependencies where the majority of vulnerable versions reside.