Security Auditing for AI-Generated Code · Introduction

The Code Is Already in Production

AI writes code at machine speed. Humans inherit the vulnerabilities at human speed.

When electric power first reached factories in the 1880s, mill owners simply bolted electric motors onto machinery designed for steam. The layouts, the workflows, the safety assumptions — all inherited from a prior era. Electrocution rates climbed. It took roughly four decades before engineers realized the technology demanded entirely new mental models, not faster adoption of old ones. By 1920, industrial accident rates finally began falling, not because electricity became safer, but because the people working with it learned to think differently about risk.

The same pattern is unfolding now with AI-assisted code generation. GitHub Copilot launched as a technical preview in June 2021. By 2023, Stanford researchers (Perry et al.) had run a controlled study with 47 participants and found that those who used an AI coding assistant were significantly more likely to introduce security vulnerabilities than those who wrote code by hand, and were also more confident their code was safe. The technology is moving faster than the safety culture that should accompany it. Organizations are shipping AI-generated code into payment systems, authentication flows, and medical devices with the same breezy optimism that mill owners had when they first threw the switch.

This course is a practical audit methodology for that reality. It will not teach you to distrust AI tools — they produce real value. It will teach you to read AI-generated code the way an experienced structural engineer reads a building plan: looking specifically for the failure modes that the designer's optimism tends to obscure. Four lessons, four labs, one module test. By the end you will have a working checklist, pattern recognition for the most common AI security mistakes, and the habit of asking the right questions before any AI-written function reaches a production branch.

Security Auditing for AI-Generated Code · Lesson 1

Why AI Gets Security Wrong

Training on the past, blind to the present — the structural reasons AI code generation fails at security.
If an AI model has read every public GitHub repository, why does it still write SQL injection vulnerabilities?

In August 2023, a security researcher named Joseph Thacker published an analysis of code produced by GitHub Copilot across 25 common programming tasks. He found that in roughly 40% of cases, Copilot introduced at least one security issue — hardcoded credentials, missing input validation, insecure cryptographic defaults. The code was syntactically correct, passed obvious tests, and looked professional to a developer unfamiliar with the specific attack surface. The AI did not produce garbage. It produced plausible, functional, quietly dangerous code.

This is the defining characteristic of AI-generated security failures: they do not look like failures. They look like the code that a reasonably competent junior developer would write before their first serious code review. Understanding why requires understanding what AI models actually learn, and what they structurally cannot learn.

1.1 — What the Model Actually Learned

Large language models trained on code, such as those behind Copilot, CodeWhisperer, Cursor, and Codeium, learned from public repositories. Public repositories contain a great deal of insecure code. A 2022 study by researchers at New York University, published in the proceedings of the IEEE Symposium on Security and Privacy, examined 1,689 programs generated by Copilot across 89 scenarios and found that approximately 40% contained CWE-listed vulnerabilities. The training signal for security is weak relative to the training signal for functionality: there are vastly more repositories where code works than repositories where code is both functional and explicitly secure.

The model learns the statistical average of what code looks like, not the security principles behind correct code. When the average public repository uses MD5 for password hashing (because it was common until roughly 2012), the model learns MD5 as a reasonable password-hashing choice. When the average tutorial skips parameterized queries in favor of string concatenation for simplicity, the model learns string concatenation as a plausible approach.

This is not a bug in the implementation — it is a fundamental property of the training objective. The model is rewarded for predicting plausible next tokens, not for predicting secure next tokens.

Research Note

The NYU / IEEE S&P 2022 paper "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions" (Pearce et al.) is the landmark study establishing this baseline. It examined scenarios specifically designed to elicit security-sensitive code, such as file I/O, network calls, cryptography, and SQL, and found vulnerabilities in roughly 40% of the generated programs, with rates varying by weakness category.

1.2 — The Knowledge Cutoff Problem

Every AI model has a training cutoff date. Security does not. The CVE database grows by roughly 25,000 new entries per year. A model trained through mid-2023 has no knowledge of vulnerabilities disclosed in late 2023 or 2024 — which means it may confidently recommend a library, API, or pattern that was publicly broken after its cutoff.

The Log4Shell vulnerability (CVE-2021-44228) was disclosed in December 2021. It affected Log4j versions that had been in wide use for years. Models trained before December 2021 would have recommended exactly the vulnerable version as standard practice — because it was standard practice. Models trained slightly after would have absorbed the fix. But the model itself cannot tell you which side of that cutoff a given piece of advice falls on.

This creates a structural asymmetry: the AI is confident regardless of whether its knowledge is current. It does not hedge when recommending a dependency. It does not flag that a cryptographic primitive may have been deprecated since it was trained. The confidence the model expresses about security is not calibrated to how current its knowledge actually is.

1.3 — Context Collapse

A human security engineer reviewing a function considers the entire system: where does user input enter? What trust boundaries exist? What is the data classification of this field? AI code generation operates almost entirely on local context — the immediately visible code, the function signature, the docstring. It does not know that the string being passed to that function comes from an unauthenticated HTTP POST. It does not know that the database being queried contains PII subject to GDPR. It cannot see the threat model because no one has shown it the threat model.

This produces a specific failure mode: code that is secure in isolation but insecure in deployment. A generated password-reset function might handle the reset flow correctly but fail to rate-limit requests — because the rate-limiting logic lives two layers up in the middleware, invisible to the model. A generated file-upload handler might validate MIME types but miss that the upload directory is web-accessible — again, architectural context the model cannot infer from a function stub.

The Core Mental Model

AI code generation optimizes for local correctness: the function does what the docstring says. Security is fundamentally about system-level correctness: the function behaves safely within a threat environment the model cannot see. This gap is not fixable by prompting more carefully. It requires human review at the architectural level.

1.4 — The Confidence Problem

Perhaps the most dangerous property of AI-generated code is that it does not look uncertain. A junior developer who is unsure about the correct way to implement HMAC signature verification will often leave a comment, search Stack Overflow visibly, or ask a colleague. AI models produce fluent, confident code regardless of whether the underlying pattern is correct.

This was demonstrated in the Stanford study published in 2023 (Perry et al., "Do Users Write More Insecure Code with AI Assistants?"): participants using AI assistants were more likely to rate their own code as secure compared to control participants who wrote code without AI help — even though the AI-assisted code contained more vulnerabilities. The tool increased both the rate of errors and the developer's confidence that errors were absent. This combination — increased vulnerability rate, increased false confidence — is precisely the condition that allows vulnerabilities to pass through review undetected.

CWE
Common Weakness Enumeration — a community catalog of software security weaknesses maintained by MITRE. AI models frequently produce code exhibiting CWE-89 (SQL Injection), CWE-798 (Hardcoded Credentials), CWE-327 (Broken Cryptography), and CWE-22 (Path Traversal).
Training Cutoff
The date after which new information was not incorporated into a model's weights. The model's security knowledge is frozen at this date while the threat landscape continues to evolve.
Context Collapse
The failure mode where AI generates code that is locally plausible but architecturally unsafe because the model cannot see the broader system, trust boundaries, or threat model.

Lesson 1 Quiz

Why AI Gets Security Wrong — check your understanding before the lab.
1. The NYU / IEEE S&P 2022 study (Pearce et al.) found that approximately what percentage of Copilot-generated code samples contained CWE-listed vulnerabilities?
Correct. The Pearce et al. study found roughly 40% of generated programs contained at least one vulnerability — enough to establish that insecure output is not a rare edge case but a systematic pattern.
Not quite. The figure was approximately 40%, which is significant enough to treat AI security failures as a routine audit concern rather than an occasional outlier.
2. Which of the following best explains why AI models recommend deprecated cryptographic primitives like MD5 for password hashing?
Correct. The model learns statistical patterns from its training corpus. When large volumes of older code used MD5, the model learned it as a plausible choice — not because it is secure, but because it is common in the data.
Not correct. The root cause is the model's training objective: predict plausible next tokens based on statistical patterns in the training data. Older code used MD5 extensively, so the model learned it as a plausible recommendation.
3. What does "context collapse" mean in the context of AI security failures?
Correct. Context collapse describes the gap between local function-level correctness and system-level security — the model cannot see trust boundaries, data classification, or architectural threat surfaces.
Not quite. Context collapse in this lesson refers to the model's inability to reason about the broader architectural and threat context a function lives in, producing code that is locally plausible but systemically unsafe.
4. According to the Stanford 2023 study (Perry et al.), how did AI assistance affect developers' confidence in their code security?
Correct. This is the dangerous combination: AI assistance simultaneously increased vulnerability rate and increased false confidence. The result is that vulnerabilities are both more common and less likely to be caught.
Incorrect. The study found the opposite: AI-assisted developers were more confident their code was secure despite producing more vulnerabilities — a compounding problem for any audit process.
5. Why is the training cutoff date a security-specific concern, distinct from other AI limitations?
Correct. The CVE database grows by ~25,000 entries per year. A model cannot know what was disclosed after its training cutoff, but it also does not flag that uncertainty — it recommends with equal confidence regardless.
Not quite. The key issue is the combination of a frozen knowledge state and confident, unhedged recommendations — the model cannot know what new vulnerabilities have been disclosed since training, but it does not signal that limitation.

Lab 1 — Interrogating AI Confidence

Practice identifying the security reasoning gap between what AI says and what AI knows.

Lab Objective

In this lab you will interact with an AI security tutor to explore the three core failure modes from Lesson 1: training data bias, knowledge cutoff blindness, and context collapse. The AI will push back if your reasoning is imprecise — treat that as part of the learning, not a malfunction.

Complete at least three substantive exchanges to mark the lab complete. Quality of reasoning matters more than volume of messages.

Start here: Ask the tutor to explain, in concrete terms, why an AI model trained on public GitHub code in 2022 might still recommend bcrypt over argon2id for password hashing, even though Argon2 won the Password Hashing Competition in 2015 and its argon2id variant is now broadly preferred. Then challenge its answer.
Security Auditing for AI-Generated Code · Lesson 2

The Vulnerability Taxonomy: What AI Gets Wrong, Specifically

A field guide to the vulnerability classes that appear most frequently in AI-generated code — with documented evidence for each.
Which specific vulnerability categories should you look for first when auditing AI-generated code, and why do they cluster the way they do?

The Pearce et al. study did not find that AI models produce vulnerabilities randomly across all possible categories. It found that failures cluster — certain weakness types appear far more often than others. This is not coincidental. The clustering maps directly onto the structural explanations from Lesson 1: vulnerabilities appear where training data is historically inconsistent, where context is essential to safe implementation, and where the secure approach requires knowledge that postdates the training corpus. Knowing which clusters to look for first is the foundation of efficient auditing.

2.1 — SQL Injection and Injection Broadly (CWE-89, CWE-77)

SQL injection remains the most commonly reproduced vulnerability in AI-generated code. The reason is straightforward: tutorials, documentation examples, and early Stack Overflow answers overwhelmingly used string concatenation to build queries. Parameterized queries became the clear standard only gradually, and large swaths of the training corpus predate that consensus.

In the Pearce et al. scenarios specifically targeting SQL, Copilot produced injection-vulnerable code in a majority of cases. The model does not produce injection vulnerabilities because it is unaware of parameterized queries. It produces them because string concatenation is statistically more common in simple tutorial-style examples, which dominate the training distribution for common database operations.
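To make the contrast concrete, here is a minimal sketch using Python's standard `sqlite3` module. The `users` table, the function names, and the in-memory database are illustrative assumptions, not taken from the study; the point is only to show the concatenated and the parameterized form side by side.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Vulnerable (CWE-89): user input is spliced directly into the SQL text.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list[tuple]:
    # Parameterized: the value is bound separately and never parsed as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical two-row table used only for this demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn
```

With the input `nobody' OR '1'='1`, the concatenated version returns every row in the table, while the parameterized version returns nothing because no user has that literal name.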

Command injection (CWE-77) follows the same pattern. Subprocess calls, shell execution, and system() invocations frequently appear in AI output without the input sanitization or argument array patterns that prevent shell interpretation of user-controlled data.
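The argument-array pattern can be sketched with the standard `subprocess` module. The helper name `echo_argument` is hypothetical, and the child process is simply the current Python interpreter echoing its argument back, chosen so the example runs anywhere; a real call site would invoke whatever program the application needs.

```python
import subprocess
import sys


def echo_argument(user_value: str) -> str:
    """Run a child process that receives user_value as exactly one argument."""
    # An argument list with the default shell=False means no shell ever
    # parses user_value, so ';', '&&' or '$(...)' carry no special meaning.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.rstrip("\r\n")
```

Calling `echo_argument("report.txt; echo INJECTED")` hands the whole string to the child as a single argument. The same string interpolated into a command line and run with `shell=True` would be split by the shell at the semicolon and the second command would execute.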

Audit Trigger

Any AI-generated function that touches a database or executes a system command should be treated as injection-suspect until parameterized queries or argument arrays are confirmed. String concatenation in these contexts is an automatic flag regardless of where the input appears to come from.

2.2 — Cryptographic Failures (CWE-327, CWE-328, CWE-330)

Cryptographic recommendations from AI models are systematically outdated. Common findings include: MD5 or SHA-1 used for password hashing (both unsuitable for this purpose since the mid-2000s); ECB mode used for block cipher encryption, which produces deterministic ciphertext and leaks data patterns; security-sensitive values such as tokens and keys generated with non-cryptographic PRNGs like Python's random module or JavaScript's Math.random(); and hardcoded or static initialization vectors for AES.

The Amazon CodeWhisperer security benchmark, published by AWS in 2023, found that cryptographic failures were among the top three most frequently generated vulnerability categories. This tracks with the training data explanation: cryptography tutorials have historically prioritized demonstrating that encryption "works" over demonstrating that it is secure. A tutorial showing AES encryption will typically use a simple, readable key — often a hardcoded string — because the point of the tutorial is the mechanics of the call, not operational security.
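As a point of comparison for audits, the following sketch shows what the safer defaults can look like using only the Python standard library. PBKDF2 is used here because it ships with `hashlib`; it is a stand-in, and a team may well prefer Argon2id or bcrypt through a third-party package. The iteration count, salt length, and function names are assumptions for illustration.

```python
import hashlib
import hmac
import secrets

# Assumed work factor; tune to your own hardware and policy.
PBKDF2_ITERATIONS = 600_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) using a per-password random salt."""
    salt = secrets.token_bytes(16)  # CSPRNG, not the random module
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


def new_reset_token() -> str:
    # URL-safe token drawn from the operating system's CSPRNG.
    return secrets.token_urlsafe(32)
```

When reviewing generated code, the things to look for are the same three this sketch contains: a slow, salted key-derivation function rather than a bare hash, a constant-time comparison, and `secrets` rather than `random` for anything an attacker must not guess.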

2.3 — Hardcoded Credentials (CWE-798)

AI models reproduce hardcoded credentials with surprising regularity — API keys, database passwords, JWT secrets embedded directly in source code. This happens because demonstration code on GitHub, in blog posts, and in documentation frequently uses placeholder credentials that look like real credentials. The model learns that code-near-credential is a valid syntactic pattern.

GitGuardian's 2023 State of Secrets Sprawl report found over 10 million hardcoded secrets committed to public GitHub repositories in a single year. That corpus is exactly what AI models trained on. When you ask a model to generate a function that connects to an external API, it has seen thousands of examples where the API key appears in the same file as the connection function — and it will reproduce that pattern.
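One common alternative to an embedded key is to read it from the process environment and fail loudly when it is absent. The sketch below assumes environment variables are how your deployment delivers secrets; the helper `load_secret`, the exception class, and the variable name in the usage note are hypothetical, and a secrets manager would be an equally valid source.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_secret(name: str) -> str:
    # Read at call time so the value never appears in source control.
    value = os.environ.get(name)
    if not value:
        # Fail closed: do not fall back to a default or placeholder key.
        raise MissingSecretError(f"environment variable {name} is not set")
    return value
```

A connection function would then call something like `load_secret("EXAMPLE_API_KEY")` instead of carrying a string literal. During an audit, any fallback default in this position deserves the same scrutiny as a hardcoded key.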

2.4 — Path Traversal and File Handling (CWE-22)

Path traversal vulnerabilities allow attackers to read or write files outside an intended directory by manipulating path inputs with sequences like ../../etc/passwd. AI-generated file-handling code frequently omits the path normalization and directory restriction checks that prevent this — because the model is completing a pattern (open a file given a name) without the architectural context that tells it the name comes from untrusted input.

This is context collapse in its clearest form. A function stub that says def read_report(filename) gives the model no signal that filename could be attacker-controlled. The model generates a clean, readable file-open call — and omits the os.path.realpath() check and the containment assertion that would make it safe.
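A sketch of the missing check, assuming reports live under a single base directory. The path in `REPORT_DIR` and the helper `resolve_report_path` are hypothetical; `read_report` is the stub named above with the containment assertion added.

```python
import os

REPORT_DIR = "/srv/app/reports"  # hypothetical location


def resolve_report_path(filename: str, base_dir: str = REPORT_DIR) -> str:
    """Return the real path of filename, refusing anything outside base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses '..' segments and resolves symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, filename))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the report directory")
    return candidate


def read_report(filename: str, base_dir: str = REPORT_DIR) -> str:
    with open(resolve_report_path(filename, base_dir), encoding="utf-8") as handle:
        return handle.read()
```

Both `../../etc/passwd` and an absolute path such as `/etc/passwd` are rejected, because after normalization neither shares the base directory as its common prefix. This does not address races between the check and the open, which may matter if untrusted users can create symlinks inside the directory.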

2.5 — Missing Authentication and Authorization Checks (CWE-306, CWE-862)

When asked to generate an API endpoint or a data-access function, AI models often produce the data access logic cleanly but omit authentication verification entirely. The reason is again context: authorization belongs to middleware layers that the model cannot see. The model generates what was asked for — a function that retrieves user data — and does not add the authentication check because the middleware is an external dependency invisible in the prompt.

This was identified as a specific concern in a 2023 analysis by researchers at the University of Quebec (Hajipour et al.), who found that AI-generated REST API endpoints frequently lacked access control validation in scenarios where the prompt did not explicitly request it.

Priority Audit Order

When reviewing AI-generated code, check in this order: (1) injection in database and system calls, (2) cryptographic algorithm and configuration choices, (3) credential and secret handling, (4) file path validation, (5) presence of authentication and authorization checks on data-access functions. This order reflects documented frequency, not theoretical importance.

CWE-89
SQL Injection — user-controlled data reaches a database query without parameterization. The most consistently reproduced AI vulnerability in documented studies.
CWE-327
Use of Broken or Risky Cryptographic Algorithm — includes MD5/SHA-1 for hashing, ECB mode, and other deprecated primitives commonly recommended by AI models trained on older code.
CWE-798
Use of Hardcoded Credentials — API keys, passwords, or secrets embedded directly in source code, a pattern learned from the vast quantity of demonstration code in the training corpus.

Lesson 2 Quiz

The Vulnerability Taxonomy — test your recall of the specific failure categories.
6. Why does SQL injection appear so frequently in AI-generated code, according to the training data explanation?
Correct. The model learned from data where string concatenation was the dominant pattern — not because parameterized queries don't exist in the training data, but because tutorial-style code overwhelmingly used the unsafe pattern.
Incorrect. The explanation is statistical: string concatenation was the dominant pattern in tutorial and early reference code that makes up much of the training corpus. The model reproduces the statistical average, which skews toward the older unsafe pattern.
7. GitGuardian's 2023 State of Secrets Sprawl report is relevant to understanding AI hardcoded credentials because:
Correct. The public repository corpus that AI models train on contains millions of real and placeholder credentials — the model learns that credential-near-code is syntactically and structurally normal.
Not quite. The relevance is that the public training corpus contains millions of examples of credentials in source code — real, placeholder, and demonstration — teaching the model that this is a normal structural pattern.
8. A generated function def read_report(filename) opens a file without path validation. This is best described as an example of:
Correct. Path traversal (CWE-22) is the vulnerability class, and the mechanism is context collapse: the function stub gives no signal about where the filename originates, so the model generates functional but unsafe file-open logic.
Incorrect. This is a path traversal scenario (CWE-22), and the underlying cause is context collapse — the model has no visibility into whether the filename parameter comes from a trusted or untrusted source.
9. When auditing AI-generated code, what is the recommended first check according to Lesson 2's priority order?
Correct. Injection in database and system calls is the highest-frequency finding in documented studies and should be the first check in any AI code audit.
Not quite. The recommended first check is injection in database and system calls — it is the highest-frequency finding across documented AI code security studies.
10. ECB mode is a specific cryptographic failure AI models produce. What makes ECB mode dangerous for encryption?
Correct. ECB's deterministic output means that identical plaintext produces identical ciphertext — the classic demonstration is encrypting a bitmap image with ECB, where the shape of the original image remains clearly visible in the encrypted output.
Incorrect. The danger of ECB mode is that it is deterministic: identical plaintext always produces identical ciphertext under the same key. This leaks pattern information regardless of key strength.

Lab 2 — Vulnerability Pattern Recognition

Practice spotting the specific weakness categories in realistic AI-generated code snippets.

Lab Objective

The tutor will present you with short code snippets typical of AI-generated output. Your task is to identify the vulnerability class (using CWE numbers where possible), explain the specific failure, and suggest the secure alternative. The tutor will probe your reasoning and correct imprecise identifications.

Aim for at least three code review exchanges. Push for specificity — "it's insecure" is not an acceptable answer in a real audit.

To begin: Ask the tutor for the first code snippet to review. Tell it which language you prefer (Python, JavaScript, or Java) and it will tailor the examples accordingly.
Security Auditing for AI-Generated Code · Lesson 3

Audit Methodology: Building a Repeatable Review Process

Moving from ad-hoc inspection to systematic audit — the protocols that catch what intuition misses.
How do you construct an audit process that is rigorous enough to catch AI-specific failure modes without becoming so burdensome that developers route around it?

In 2022, the cryptocurrency exchange Coinbase published a retrospective on their internal code review practices following a series of near-miss security incidents. The pattern they identified was consistent: reviewers caught problems they were specifically looking for and missed problems they were not primed to consider. Code review based on general vigilance is not sufficient. Structured checklists tied to known failure modes consistently outperform expert intuition in controlled studies — a finding that applies doubly to AI-generated code, where the failure modes are specific and documented.

3.1 — The Triage Phase: Before You Read Code

The first phase of an AI code audit happens before examining a single line. It is about establishing context that the AI model did not have when generating the code. Four questions must be answered:

1. What is the trust level of the inputs? — Does any input to this code come from a user, an external API, a file, or a network connection? If yes, every function that touches that input is injection- and traversal-suspect.

2. What is the data classification? — Does this code handle credentials, PII, payment data, or health information? If yes, cryptographic handling and access control are mandatory checks, not optional ones.

3. What is the deployment environment? — Is this code running in a container with network access? In a serverless function? In a browser? The threat surface varies significantly by environment.

4. What AI tool generated this, and what was the prompt? — If available, the original prompt reveals what context the model had. A prompt that does not mention authentication will produce code without authentication — predictably.

Process Note

Preserve AI prompts alongside generated code in version control. The prompt is architectural context. A future auditor reviewing the code six months later cannot know what the AI was told — which is precisely the information needed to reason about what the AI could not have known.

3.2 — Static Analysis Integration

Static analysis tools such as Semgrep, Bandit (Python), ESLint security plugins (JavaScript), SpotBugs (Java), and CodeQL should run on all AI-generated code before human review. Not instead of human review. Before it. The tools catch the mechanical failures: string concatenation in SQL queries, calls to deprecated hash functions, subprocess calls that pass shell=True. This frees human reviewers to focus on the architectural and contextual failures that tools cannot detect.

GitHub's 2023 Octoverse report noted that repositories using AI code generation had adoption rates of automated security scanning roughly equivalent to those without — suggesting that organizations are not consistently adding automated review when they add AI generation. This is the gap. The two capabilities should be treated as a required pair, not independent options.

Semgrep, specifically, allows teams to write custom rules targeting AI-specific failure patterns. A rule that flags any direct variable interpolation into a database query string can catch Copilot's most common SQL failure within seconds of commit — before a human reviewer sees the code.

3.3 — The Dependency Audit

AI models frequently recommend specific library versions, and those recommendations reflect the training corpus rather than current security advisories. A model asked to generate Python cryptographic code may recommend pycrypto, a library that has had no release since 2013 and is unmaintained; pycryptodome is the maintained replacement. A model generating Node.js JWT code may recommend jsonwebtoken versions that predate the security fixes shipped in version 9.0.0 (released December 2022).

Every AI-generated dependency should be checked against current CVE databases, and the specific version pinned (if any) should be compared against the current stable release. Tools like npm audit, pip-audit, OWASP Dependency-Check, and GitHub's Dependabot automate much of this — but they must be invoked, and AI-assisted development workflows often skip this step under time pressure.
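The shape of that check can be sketched in a few lines. This is not a substitute for pip-audit or Dependabot: the policy tables below are hand-written illustrations, `examplelib` and its version floor are invented, and the parser only understands plain `name==X.Y.Z` pins with numeric parts.

```python
# Illustrative policy data, NOT a real advisory feed.
BANNED = {"pycrypto": "unmaintained; use pycryptodome"}
MINIMUM_VERSIONS = {"examplelib": (2, 4, 0)}  # hypothetical package and floor


def parse_version(text: str) -> tuple[int, ...]:
    # Handles simple dotted numeric versions only, e.g. "2.4.0".
    return tuple(int(part) for part in text.split("."))


def check_requirements(lines: list[str]) -> list[str]:
    """Return one finding string per problematic requirement line."""
    findings = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        name, sep, version = line.partition("==")
        name, version = name.strip().lower(), version.strip()
        if name in BANNED:
            findings.append(f"{name}: {BANNED[name]}")
        elif not sep:
            findings.append(f"{name}: version not pinned")
        elif name in MINIMUM_VERSIONS and parse_version(version) < MINIMUM_VERSIONS[name]:
            findings.append(f"{name}: {version} is below the required minimum")
    return findings
```

The value of even a toy version is that it makes the policy explicit: a banned list, a pinning requirement, and version floors are three separate questions, and generated code can fail any of them independently.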

3.4 — Architectural Context Review

The failures that static analysis cannot catch are the ones requiring architectural reasoning: missing rate limiting, absent authentication middleware, incorrect trust boundary placement, missing logging of security events. These require a human reviewer who knows the system to walk through the generated code asking a specific question at each function boundary: what happens if the input to this function is adversarial?

This is not a general code review. It is a threat-model walkthrough applied to code that was generated without a threat model. It should be performed by someone who has read the system's threat model document — or, if no such document exists, the architectural review should begin by producing one.

The Missing Middleware Problem

When an AI generates an endpoint function and omits authentication, the correct fix is not to add authentication to the function — it is to confirm that authentication is handled at the correct layer (middleware, API gateway, or decorator) and that the function is correctly registered behind it. Patching the symptom in the generated code without understanding the architectural intent creates a false sense of resolution.
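One way to make "registered behind it" checkable is to audit the route table rather than individual functions. The sketch below is framework-agnostic and entirely hypothetical: `ROUTES`, `route`, and `requires_auth` are toy stand-ins for whatever registry and authentication decorator your framework provides, and the marker here only labels a handler; it does not enforce anything.

```python
from typing import Callable

ROUTES: dict[str, Callable] = {}  # hypothetical in-process route table


def requires_auth(handler: Callable) -> Callable:
    # Marker only: real enforcement belongs to the framework or gateway.
    handler.auth_required = True
    return handler


def route(path: str) -> Callable[[Callable], Callable]:
    def register(handler: Callable) -> Callable:
        ROUTES[path] = handler
        return handler
    return register


def unprotected_routes(public_paths: frozenset[str] = frozenset()) -> list[str]:
    """List registered paths that are neither marked protected nor declared public."""
    return sorted(
        path
        for path, handler in ROUTES.items()
        if path not in public_paths and not getattr(handler, "auth_required", False)
    )


@route("/health")
def health() -> str:
    return "ok"


@route("/users/me")
@requires_auth
def current_user() -> str:
    return "profile"


@route("/users/export")  # generated endpoint with no auth marker
def export_users() -> str:
    return "csv"
```

Running `unprotected_routes(frozenset({"/health"}))` reports `/users/export`, which is the cue for a reviewer to find out which layer was supposed to protect it. Forcing public endpoints onto an explicit allowlist means a newly generated handler is flagged by default.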

3.5 — Documentation and Signoff

An audit without documentation is an opinion. For AI-generated code in production systems, audit records should capture: which tool generated the code, the date of generation (to establish what model version and training cutoff applies), which static analysis tools ran and what they found, which human reviewer conducted the architectural review and what they checked, and the specific disposition of any flagged findings. This record is both a quality gate and a liability document — if a vulnerability is discovered post-deployment, the audit record demonstrates due diligence and establishes what was and was not visible at review time.
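The fields listed above map naturally onto a small structured record. The following is one possible shape, not a standard: the class names, field names, and disposition vocabulary are assumptions, and the optional `prompt` field reflects the earlier advice to keep prompts with the code.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Finding:
    identifier: str   # for example a CWE number
    description: str
    disposition: str  # for example "fixed", "accepted", or "open"


@dataclass
class AuditRecord:
    generating_tool: str
    generated_on: str  # ISO 8601 date of generation
    static_analysis_tools: list[str]
    reviewer: str
    checks_performed: list[str]
    prompt: str = ""   # original prompt, if it was preserved
    findings: list[Finding] = field(default_factory=list)

    def open_findings(self) -> list[Finding]:
        # Anything still marked "open" should block signoff.
        return [f for f in self.findings if f.disposition == "open"]

    def to_json(self) -> str:
        # Stable key order keeps diffs readable in version control.
        return json.dumps(asdict(self), indent=2, sort_keys=True)
```

Serializing to JSON and committing the record next to the code keeps the audit trail in the same history as the change it describes.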

Semgrep
An open-source static analysis tool that supports custom rule authoring. Particularly well-suited to writing AI-specific security patterns because rules can be targeted at the precise failure modes AI models are documented to produce.
Threat Model
A structured analysis of a system's assets, threats, and mitigations. AI code generation produces code with no threat model awareness — the human audit process must supply what the AI structurally cannot.
Training Cutoff Dependency Risk
The specific risk that a library or library version recommended by an AI model was secure at training time but has since received critical CVEs — and the model cannot flag this because the disclosure postdates its knowledge.

Lesson 3 Quiz

Audit Methodology — confirm your understanding of the review process.
11. Why should AI prompts be preserved alongside generated code in version control?
Correct. The prompt is architectural context. It tells a future auditor what the model was and was not told, which directly informs what the model could not have reasoned about — including security constraints that were never mentioned.
Not quite. The value of prompt preservation is that it documents the AI's context at generation time — a future auditor can then reason about what constraints, trust levels, or security requirements were absent from the prompt and therefore absent from the output.
12. What is the correct relationship between static analysis tools and human code review for AI-generated code?
Correct. Static analysis and human review are complementary, not competitive. Tools catch the patterned, mechanical failures efficiently; humans catch the architectural failures that require system context.
Not correct. The two are complementary and sequenced: static analysis first (mechanical failure detection), then human review focused on the architectural and contextual failures that tools cannot reason about.
13. An AI generates a REST endpoint with correct data retrieval logic but no authentication check. What is the correct remediation approach?
Correct. Patching the symptom in the function itself may create a false resolution. The correct approach is to verify the architectural layer and ensure the endpoint is correctly protected — then document what was found and how it was addressed.
Incorrect. Adding authentication directly to the generated function without understanding the architectural intent may duplicate or conflict with existing middleware. The correct approach is to verify the correct architectural layer handles authentication and that the endpoint is registered behind it.

Lab 3 — Building Your Audit Checklist

Construct and stress-test a repeatable audit checklist for AI-generated code in a specific deployment context.

Lab Objective

You will work with the tutor to build a practical audit checklist for a specific scenario. The tutor will play devil's advocate — pushing back on checklist items that are too vague, challenging the ordering of checks, and probing whether each item is actually measurable during a code review.

Complete at least three exchanges building, refining, and defending your checklist. The goal is a list you could hand to a colleague and have them apply without additional explanation.

Choose a scenario: (A) a Python Django REST API handling user authentication and payment data, (B) a Node.js microservice processing webhook events from external vendors, or (C) a Java Spring Boot service reading and writing patient health records. Tell the tutor your choice and propose your first three checklist items.
Security Auditing for AI-Generated Code · Lesson 4

Prompt Engineering for Safer Code — and Its Limits

Better prompts reduce AI security failures. They do not eliminate them. Here is the honest accounting.
How much of the security problem can be addressed at the prompt level, and where does prompt engineering reach its structural ceiling?

In February 2024, researchers at Snyk published a comparative study examining how prompt specificity affected the security of GitHub Copilot outputs. They tested three prompt styles across 20 security-sensitive scenarios: bare functional prompts ("write a login function"), standard prompts with language and framework specification, and security-enriched prompts that specified trust levels, named relevant CWE categories, and asked explicitly for parameterized queries or bcrypt. The security-enriched prompts reduced the vulnerability rate by roughly 30–40% relative to bare prompts. That is a real and meaningful improvement. It is also not enough on its own to replace review.
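To make the size of that improvement concrete, here is a short arithmetic sketch. The 40% baseline is a hypothetical figure chosen only for illustration; it is not taken from the study described above. The 30–40% reduction is treated as relative to the baseline.

```python
def residual_rate(baseline: float, relative_reduction: float) -> float:
    """Vulnerability rate that remains after a relative reduction is applied."""
    if not 0.0 <= relative_reduction <= 1.0:
        raise ValueError("relative_reduction must be between 0 and 1")
    return baseline * (1.0 - relative_reduction)


# Hypothetical baseline: 40 of every 100 bare-prompt outputs are vulnerable.
baseline = 0.40
best_case = residual_rate(baseline, 0.40)   # upper end of the reported range
worst_case = residual_rate(baseline, 0.30)  # lower end of the reported range
print(f"{best_case:.0%} to {worst_case:.0%} of outputs would still be vulnerable")
```

Under that assumed baseline, roughly 24 to 28 of every 100 outputs would still contain a vulnerability, which is why the reduction does not remove the need for review.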

4.1 — Prompt Patterns That Reduce Security Failures

Research and practitioner experience converge on several prompt patterns that consistently produce safer AI-generated code:

Specify the trust level of inputs explicitly. "Write a function that accepts a filename from an HTTP request parameter" gives the model far more security-relevant context than "write a function that reads a file." The untrusted-input signal triggers pattern associations with validation that a bare functional prompt does not.

Name the security constraint directly. "Use parameterized queries, not string concatenation" produces parameterized queries more reliably than prompts that simply ask for database interaction. "Use bcrypt with a work factor of at least 12" produces bcrypt more reliably than "hash the password securely." The model responds to explicit naming of constraints better than implicit expectations of security.

Ask for threat reasoning alongside code. Prompts that request "explain what security assumptions this code makes" or "identify what inputs would be dangerous" produce more security-aware output — and the explanation itself becomes a review artifact that surfaces what the model did and did not consider.

Specify the negative space. "Do not use MD5 or SHA-1 for hashing," "do not use shell=True in subprocess calls," "do not hardcode credentials" — explicit prohibitions reduce the incidence of the named failure modes because the model's context now includes a token sequence that associates the task with the prohibition.
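The constraint "use parameterized queries, not string concatenation" from the patterns above can be shown side by side. This is a minimal sketch using Python's standard-library `sqlite3` module; the `users` table, the function names, and the injection payload are illustrative assumptions.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Anti-pattern: the input becomes part of the SQL text itself.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str):
    # Parameterized: the value is bound separately from the SQL text.
    query = "SELECT id, username FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
print(len(find_user_unsafe(conn, payload)))  # the payload rewrites the WHERE clause
print(len(find_user_safe(conn, payload)))    # the payload is treated as a plain string
```

With the concatenated query the payload matches every row in the table. With the bound parameter it matches none, because no user has that literal name.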

Practical Template

A security-enriched prompt template: "Write a [language] function that [task]. Input comes from [trust level]. Requirements: [explicit security constraints]. Do not use: [prohibited patterns]. After the code, list what security assumptions it makes and what inputs would break it."
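One way to keep the template consistent across a team is to fill it programmatically. The sketch below is an assumption about how that might look; the function name `build_security_prompt` and its parameters are hypothetical, and the wording follows the template above.

```python
def build_security_prompt(
    language: str,
    task: str,
    trust_level: str,
    constraints: list[str],
    prohibited: list[str],
) -> str:
    """Fill the security-enriched template; refuse to emit a prompt with gaps."""
    if not constraints:
        raise ValueError("at least one explicit security constraint is required")
    if not prohibited:
        raise ValueError("at least one prohibited pattern is required")
    return (
        f"Write a {language} function that {task}. "
        f"Input comes from {trust_level}. "
        f"Requirements: {'; '.join(constraints)}. "
        f"Do not use: {'; '.join(prohibited)}. "
        "After the code, list what security assumptions it makes "
        "and what inputs would break it."
    )


prompt = build_security_prompt(
    language="Python",
    task="looks up a user by username",
    trust_level="an unauthenticated HTTP request parameter",
    constraints=["use parameterized queries"],
    prohibited=["string concatenation in SQL", "hardcoded credentials"],
)
print(prompt)
```

Raising an error on an empty constraint or prohibition list is a design choice: it prevents the helper from quietly degrading into a bare functional prompt.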

4.2 — The Ceiling: What Prompting Cannot Fix

Prompt engineering has a structural ceiling determined by the same factors that cause AI security failures in the first place. Three failure modes persist regardless of prompt quality:

Architectural context that is absent from the prompt cannot be inferred by the model. If the broader system's trust boundaries, data classification, and threat model are not described in the prompt (and they rarely are, in full), the model cannot reason about them. No prompt phrasing compensates for missing architectural context.

Post-training vulnerabilities remain invisible. A prompt that says "use the current secure version of [library]" cannot cause the model to recommend a version released after its training cutoff. The model's knowledge is frozen. The prompt cannot thaw it.

Statistical pressure is not fully overridden by instruction. Research from the Snyk study and others consistently shows that security-enriched prompts reduce but do not eliminate failure rates. The model is completing a probability distribution over tokens. Instructions shift the distribution but do not redraw it. A prompt that says "do not use string concatenation for SQL" will produce parameterized queries most of the time, but not all of the time, particularly in complex or edge-case scenarios where string concatenation dominates the training distribution.
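Because a prohibition in the prompt only shifts probabilities, the output still has to be checked for the patterns that were prohibited. The following is a minimal sketch of such a check using Python's standard-library `ast` module. The function name and the two rules it covers are assumptions for illustration, and it is far narrower than a dedicated static analysis tool such as Semgrep.

```python
import ast

BANNED_HASHES = {"md5", "sha1"}


def find_prohibited_patterns(source: str) -> list[str]:
    """Return one message per prohibited call found in the given source."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Flag any call that passes the literal keyword argument shell=True.
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                problems.append(f"line {node.lineno}: call uses shell=True")
        # Flag direct calls of the form hashlib.md5(...) or hashlib.sha1(...).
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr in BANNED_HASHES
            and isinstance(func.value, ast.Name)
            and func.value.id == "hashlib"
        ):
            problems.append(f"line {node.lineno}: hashlib.{func.attr} is prohibited")
    return problems


generated = (
    "import hashlib, subprocess\n"
    "digest = hashlib.md5(b'pw').hexdigest()\n"
    "subprocess.run('ls ' + path, shell=True)\n"
)
for message in sorted(find_prohibited_patterns(generated)):
    print(message)
```

The sketch only parses the code and never runs it. It will miss aliased imports such as `from hashlib import md5` and indirect forms such as `hashlib.new("md5")`, which is one reason the human review step remains necessary.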

4.3 — The Sociotechnical Layer

The most durable security improvements in AI-assisted development come not from individual prompt optimization but from organizational processes: security-enriched prompt templates embedded in IDE configurations, pre-commit hooks that run static analysis on all AI-flagged files, mandatory architectural review for AI-generated code touching security-sensitive surfaces, and explicit training for developers on the vulnerability taxonomy from Lesson 2.

Microsoft's DevDiv team documented in their 2023 internal research (published at MSR) that teams with explicit AI code review policies had significantly lower escape rates for security vulnerabilities than teams that relied on developers' individual judgment about when AI review was needed. The finding was not that AI code was uniquely dangerous — it was that policy consistency was the determining variable, not individual skill.

4.4 — Integrating This Into Development Workflow

A practical integration of the full module's methodology into a development workflow looks like this: developers use security-enriched prompt templates as a baseline for all security-sensitive generation tasks. Pre-commit hooks run Semgrep with AI-specific rules on every commit that contains AI-flagged files. Pull requests containing AI-generated code require architectural review sign-off from a designated reviewer. AI prompts are committed to a /ai-context/ directory alongside generated files. Dependency recommendations from AI are automatically piped through pip-audit or npm audit before any merge. The audit record (tool results, reviewer sign-off, finding disposition) is attached to the PR as a required artifact.
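One step of this workflow, confirming that every AI-generated file has its prompt committed, is simple to automate. The sketch below assumes a hypothetical naming convention in which the prompt for `src/auth.py` lives at `ai-context/src/auth.py.prompt.txt`. The function names are illustrative, and how a team identifies which files are AI-generated is left outside the sketch.

```python
from pathlib import Path


def prompt_path_for(repo_root: Path, generated_file: str) -> Path:
    # Assumed convention: src/auth.py -> ai-context/src/auth.py.prompt.txt
    return repo_root / "ai-context" / (generated_file + ".prompt.txt")


def missing_prompts(repo_root: Path, ai_generated_files: list[str]) -> list[str]:
    """Return the AI-generated files that have no committed prompt record."""
    return [
        name
        for name in ai_generated_files
        if not prompt_path_for(repo_root, name).is_file()
    ]
```

A pre-commit hook or CI job could call `missing_prompts` with the list of AI-flagged files in the change and fail when the result is non-empty.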

This is not onerous if the tools are configured once and the process is embedded in the team's normal workflow. Whatever overhead remains should be weighed against the cost of the security failures it prevents, which for code handling payment data, authentication, or health records is very high.

Module Summary

AI code generation produces real security vulnerabilities at a documented rate, for structural reasons: training data bias, knowledge cutoff blindness, and context collapse. The failure modes cluster around injection, cryptography, hardcoded credentials, path traversal, and missing authorization. Systematic audit — triage, static analysis, dependency review, architectural walkthrough, documentation — catches what intuition misses. Better prompting helps but has a ceiling. Organizational process consistency is the determining variable for escape rate. The audit is not optional — it is the compensating control for a technology that optimizes for functionality, not security.

Security-Enriched Prompt
A prompt that explicitly specifies input trust level, names security constraints, prohibits dangerous patterns, and requests security reasoning alongside generated code. Reduces but does not eliminate AI vulnerability rates.
Prompt Ceiling
The structural limit of prompt engineering as a security control — determined by absent architectural context, post-cutoff knowledge gaps, and residual statistical pressure toward insecure patterns in the training distribution.
AI-Context Directory
A version-controlled location (e.g., /ai-context/) where AI prompts are committed alongside generated files, providing future auditors with the model's generation context.

Lesson 4 Quiz

Prompt Engineering for Safer Code — and its limits.
14. According to the Snyk 2024 study, security-enriched prompts reduced vulnerability rates compared to bare functional prompts by approximately:
Correct. The 30–40% reduction is significant and worth pursuing — but it also means 60–70% of the original vulnerability rate persists, which is why prompt engineering complements but does not replace systematic audit.
Incorrect. The Snyk study found approximately 30–40% reduction — meaningful, but leaving the majority of the vulnerability rate intact. This is why prompt engineering is one layer of defense, not a complete solution.
15. Which of the following is a prompt pattern documented to reduce AI security failures?
Correct. Specifying input trust level, naming constraints explicitly, and requesting threat reasoning are all documented to shift the model's output toward safer patterns — they supply context the model would otherwise lack.
Incorrect. The effective prompt patterns supply security-relevant context: input trust level, explicit security constraints, prohibited patterns, and a request for threat reasoning. Efficiency, brevity, and test generation do not meaningfully improve security output.

Lab 4 — Prompt Engineering Under Pressure

Write security-enriched prompts for high-risk scenarios, then analyze their outputs for residual vulnerabilities.

Lab Objective

In this lab you will write security-enriched prompts for three security-sensitive scenarios. The tutor will then simulate what a model is likely to output and help you identify what residual vulnerabilities or assumptions remain, demonstrating the prompt ceiling in practice.

Complete at least three prompt-output-analysis cycles. The goal is to internalize where prompting helps and where it structurally cannot compensate for missing context or outdated knowledge.

Scenario 1: Write a security-enriched prompt for a Python function that accepts a username and password from a web form and authenticates the user against a PostgreSQL database. Use the template from Lesson 4 as a starting point. After writing it, the tutor will analyze it for gaps.

Module 1 — Test

15 questions across all four lessons. Score 80% or higher to pass.
1. The Pearce et al. (NYU / IEEE S&P 2022) study found approximately what percentage of Copilot-generated programs contained CWE-listed vulnerabilities?
Correct — ~40% is the landmark finding from Pearce et al.
Incorrect — the figure was approximately 40%.
2. AI models produce insecure cryptographic recommendations primarily because:
Correct — training data bias is the root cause for cryptographic failures.
Incorrect — training data statistical bias explains cryptographic recommendation failures.
3. The Stanford 2023 Perry et al. study found that developers using AI assistants were:
Correct — increased confidence alongside increased vulnerability rate is the dangerous combination the study documented.
Incorrect — the study found increased confidence and increased vulnerability rate together.
4. Context collapse in AI code generation refers to:
Correct — local correctness without system-level safety awareness is the definition of context collapse.
Incorrect — context collapse means locally plausible but architecturally unsafe code due to absent system context.
5. According to Lesson 2, which vulnerability class should be checked FIRST when auditing AI-generated code?
Correct — injection is the highest-frequency documented finding and the recommended first check.
Incorrect — injection in database and system calls is the first priority per documented frequency.
6. GitGuardian's 2023 report of 10+ million hardcoded secrets on public GitHub is relevant to AI security because:
Correct — the public training corpus containing millions of exposed secrets teaches the model that this structural pattern is normal.
Incorrect — the relevance is that the model's training data contains millions of examples of credentials in source code, normalizing that pattern.
7. ECB mode is dangerous for encryption because:
Correct — ECB's deterministic mapping is its fundamental security failure.
Incorrect — ECB is dangerous because of its deterministic output: identical plaintext always produces identical ciphertext.
8. What is the recommended purpose of running static analysis BEFORE human review of AI-generated code?
Correct — the two types of review are complementary and sequenced for efficiency.
Incorrect — static analysis precedes human review to handle mechanical failures, freeing human reviewers for architectural reasoning.
9. Why should AI prompts be stored in version control alongside generated code?
Correct — the prompt is architectural context; without it, future auditors cannot reason about what was missing from the AI's inputs.
Incorrect — the value is that prompts document the AI's generation context, enabling future auditors to identify what security constraints were absent from the model's inputs.
10. An AI recommends using a specific version of a library that had a critical CVE published three months after the model's training cutoff. This is an example of:
Correct — training cutoff dependency risk is precisely this: the model cannot know what CVEs were published after its training concluded.
Incorrect — this is training cutoff dependency risk. The model's library knowledge is frozen at its cutoff date regardless of prompt specificity.
11. The Snyk 2024 study found that security-enriched prompts reduced AI vulnerability rates by approximately:
Correct — 30–40% is the documented improvement, meaningful but leaving the majority of the original rate intact.
Incorrect — the Snyk figure is approximately 30–40% reduction, significant but not sufficient alone.
12. Which of the following prompt patterns is documented to reduce AI security failures?
Correct — explicit prohibition, input trust specification, and threat reasoning request are all documented effective patterns.
Incorrect — the effective pattern combines explicit prohibition, input trust level, and a request for threat reasoning.
13. According to Microsoft DevDiv's 2023 MSR research, the determining variable in AI code security escape rates was:
Correct — policy consistency outweighed individual skill or tool choice as the variable controlling security escape rate.
Incorrect — the determining variable was policy consistency, not tool choice, seniority, or language.
14. The "prompt ceiling" describes:
Correct — three structural factors set the ceiling: missing context, frozen knowledge, and residual statistical pressure.
Incorrect — the prompt ceiling is structural: absent context, post-cutoff knowledge gaps, and statistical pressure toward training distribution patterns that prompting cannot fully override.
15. Which of the following is the most complete description of a correct response when an AI-generated REST endpoint lacks authentication?
Correct — architectural verification, correct-layer confirmation, and documented disposition are the three required elements of proper remediation.
Incorrect — the complete response addresses the correct architectural layer, verifies registration behind it, and documents the finding — not just patching the function or re-generating.