When electric power first reached factories in the 1880s, mill owners simply bolted electric motors onto machinery designed for steam. The layouts, the workflows, the safety assumptions — all inherited from a prior era. Electrocution rates climbed. It took roughly four decades before engineers realized the technology demanded entirely new mental models, not faster adoption of old ones. By 1920, industrial accident rates finally began falling, not because electricity became safer, but because the people working with it learned to think differently about risk.
The same pattern is unfolding now with AI-assisted code generation. GitHub Copilot launched as a technical preview in June 2021. By 2023, a Stanford user study with 47 participants had found that those who used an AI coding assistant were significantly more likely to introduce security vulnerabilities than those who wrote code by hand, and were also more confident that their code was safe. The technology is moving faster than the safety culture that should accompany it. Organizations are shipping AI-generated code into payment systems, authentication flows, and medical devices with the same breezy optimism that mill owners had when they first threw the switch.
This course is a practical audit methodology for that reality. It will not teach you to distrust AI tools — they produce real value. It will teach you to read AI-generated code the way an experienced structural engineer reads a building plan: looking specifically for the failure modes that the designer's optimism tends to obscure. Four lessons, four labs, one module test. By the end you will have a working checklist, pattern recognition for the most common AI security mistakes, and the habit of asking the right questions before any AI-written function reaches a production branch.
In August 2023, a security researcher named Joseph Thacker published an analysis of code produced by GitHub Copilot across 25 common programming tasks. He found that in roughly 40% of cases, Copilot introduced at least one security issue — hardcoded credentials, missing input validation, insecure cryptographic defaults. The code was syntactically correct, passed obvious tests, and looked professional to a developer unfamiliar with the specific attack surface. The AI did not produce garbage. It produced plausible, functional, quietly dangerous code.
This is the defining characteristic of AI-generated security failures: they do not look like failures. They look like the code that a reasonably competent junior developer would write before their first serious code review. Understanding why requires understanding what AI models actually learn, and what they structurally cannot learn.
The large language models behind coding assistants such as Copilot, CodeWhisperer, Cursor, and Codeium learned from public repositories. Public repositories contain a great deal of insecure code. A 2022 study by researchers at New York University, published in the proceedings of the IEEE Symposium on Security and Privacy, examined 1,689 programs generated by Copilot and found that approximately 40% contained CWE-listed vulnerabilities. The training signal for security is weak relative to the training signal for functionality: there are vastly more repositories where code works than repositories where code is both functional and explicitly secure.
The model learns the statistical average of what code looks like, not the security principles behind correct code. When the average public repository uses MD5 for password hashing (because it was common until roughly 2012), the model learns MD5 as a reasonable password-hashing choice. When the average tutorial skips parameterized queries in favor of string concatenation for simplicity, the model learns string concatenation as a plausible approach.
This is not a bug in the implementation — it is a fundamental property of the training objective. The model is rewarded for predicting plausible next tokens, not for predicting secure next tokens.
The NYU / IEEE S&P 2022 paper "Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions" (Pearce et al.) is the landmark study establishing this baseline. It examined scenarios specifically designed to elicit security-sensitive code (file I/O, network calls, cryptography, SQL) and found vulnerabilities in roughly 40% of the generated programs, with rates that varied by weakness category.
Every AI model has a training cutoff date. Security does not. The CVE database grows by roughly 25,000 new entries per year. A model trained through mid-2023 has no knowledge of vulnerabilities disclosed in late 2023 or 2024 — which means it may confidently recommend a library, API, or pattern that was publicly broken after its cutoff.
The Log4Shell vulnerability (CVE-2021-44228) was disclosed in December 2021. It affected Log4j versions that had been in wide use for years. Models trained before December 2021 would have recommended exactly the vulnerable version as standard practice — because it was standard practice. Models trained slightly after would have absorbed the fix. But the model itself cannot tell you which side of that cutoff a given piece of advice falls on.
This creates a structural asymmetry: the AI is confident regardless of whether its knowledge is current. It does not hedge when recommending a dependency. It does not flag that a cryptographic primitive may have been deprecated since it was trained. The confidence the model expresses about security is not calibrated to what it can actually know.
A human security engineer reviewing a function considers the entire system: where does user input enter? What trust boundaries exist? What is the data classification of this field? AI code generation operates almost entirely on local context — the immediately visible code, the function signature, the docstring. It does not know that the string being passed to that function comes from an unauthenticated HTTP POST. It does not know that the database being queried contains PII subject to GDPR. It cannot see the threat model because no one has shown it the threat model.
This produces a specific failure mode: code that is secure in isolation but insecure in deployment. A generated password-reset function might handle the reset flow correctly but fail to rate-limit requests — because the rate-limiting logic lives two layers up in the middleware, invisible to the model. A generated file-upload handler might validate MIME types but miss that the upload directory is web-accessible — again, architectural context the model cannot infer from a function stub.
AI code generation optimizes for local correctness: the function does what the docstring says. Security is fundamentally about system-level correctness: the function behaves safely within a threat environment the model cannot see. Careful prompting can narrow this gap but cannot close it. Closing it requires human review at the architectural level.
Perhaps the most dangerous property of AI-generated code is that it does not look uncertain. A junior developer who is unsure about the correct way to implement HMAC signature verification will often leave a comment, search Stack Overflow visibly, or ask a colleague. AI models produce fluent, confident code regardless of whether the underlying pattern is correct.
This was demonstrated in the Stanford study published in 2023 (Perry et al., "Do Users Write More Insecure Code with AI Assistants?"): participants using AI assistants were more likely to rate their own code as secure compared to control participants who wrote code without AI help — even though the AI-assisted code contained more vulnerabilities. The tool increased both the rate of errors and the developer's confidence that errors were absent. This combination — increased vulnerability rate, increased false confidence — is precisely the condition that allows vulnerabilities to pass through review undetected.
In this lab you will interact with an AI security tutor to explore the three core failure modes from Lesson 1: training data bias, knowledge cutoff blindness, and context collapse. The AI will push back if your reasoning is imprecise — treat that as part of the learning, not a malfunction.
Complete at least three substantive exchanges to mark the lab complete. Quality of reasoning matters more than volume of messages.
The Pearce et al. study did not find that AI models produce vulnerabilities randomly across all possible categories. It found that failures cluster — certain weakness types appear far more often than others. This is not coincidental. The clustering maps directly onto the structural explanations from Lesson 1: vulnerabilities appear where training data is historically inconsistent, where context is essential to safe implementation, and where the secure approach requires knowledge that postdates the training corpus. Knowing which clusters to look for first is the foundation of efficient auditing.
SQL injection remains the most commonly reproduced vulnerability in AI-generated code. The reason is straightforward: tutorials, documentation examples, and early Stack Overflow answers overwhelmingly used string concatenation to build queries. Parameterized queries became the clear standard only gradually, and large swaths of the training corpus predate that consensus.
In the Pearce et al. scenarios specifically targeting SQL, Copilot produced injection-vulnerable code in a majority of cases. The model does not produce injection vulnerabilities because it is unaware of parameterized queries. It produces them because string concatenation is statistically more common in simple tutorial-style examples, which dominate the training distribution for common database operations.
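The difference between the two styles is easiest to see side by side. The following is a minimal sketch using Python's standard `sqlite3` module; the `users` table and the function names are illustrative assumptions, not taken from the study.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # Hypothetical in-memory table used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def find_user_unsafe(conn: sqlite3.Connection, username: str) -> list:
    # Vulnerable (CWE-89): the input becomes part of the SQL text itself.
    query = "SELECT id, username FROM users WHERE username = '" + username + "'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, username: str) -> list:
    # Parameterized: the driver binds the value, so it is never parsed as SQL.
    return conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    ).fetchall()
```

Passing the input `nobody' OR '1'='1` to the unsafe version returns every row in the table, while the parameterized version treats the same text as an ordinary (non-matching) username.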
Command injection (CWE-77) follows the same pattern. Subprocess calls, shell execution, and system() invocations frequently appear in AI output without the input sanitization or argument array patterns that prevent shell interpretation of user-controlled data.
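The argument array pattern mentioned here can be sketched as follows. The child process is a stand-in (a one-line Python script that echoes its first argument), chosen only so the example runs anywhere; the function name is hypothetical.

```python
import subprocess
import sys


def run_with_argument_list(user_text: str) -> str:
    """Pass user-controlled text as a single argv element, with no shell involved."""
    # Stand-in child program: prints argv[1] exactly as it received it.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout.rstrip("\r\n")


# For contrast, the pattern to flag in review (shown as a comment, not executed):
#   subprocess.run("some_tool " + user_text, shell=True)
# With shell=True, characters such as ';' or '|' in user_text are interpreted
# by the shell instead of being passed through as data.
```

Because no shell parses the command line, an input such as `report.txt; echo INJECTED` arrives in the child process as one literal string.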
Any AI-generated function that touches a database or executes a system command should be treated as injection-suspect until parameterized queries or argument arrays are confirmed. String concatenation in these contexts is an automatic flag regardless of where the input appears to come from.
Cryptographic recommendations from AI models are systematically outdated. Common findings include: MD5 or SHA-1 used for password hashing (both unsuitable for this purpose since the mid-2000s), ECB mode used for block cipher encryption (which produces deterministic ciphertext and leaks data patterns), security-sensitive values generated with non-cryptographic PRNGs such as Python's random module or JavaScript's Math.random(), and hardcoded or static initialization vectors for AES.
The Amazon CodeWhisperer security benchmark, published by AWS in 2023, found that cryptographic failures were among the top three most frequently generated vulnerability categories. This tracks with the training data explanation: cryptography tutorials have historically prioritized demonstrating that encryption "works" over demonstrating that it is secure. A tutorial showing AES encryption will typically use a simple, readable key — often a hardcoded string — because the point of the tutorial is the mechanics of the call, not operational security.
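As a point of comparison for review, here is a minimal sketch of the safer counterparts using only the Python standard library. It uses PBKDF2 as a stand-in for a dedicated password-hashing library; the iteration count and function names are assumptions for illustration and should be tuned to your own environment.

```python
import hashlib
import hmac
import secrets

# Assumption: an illustrative work factor, not a recommendation for every system.
PBKDF2_ITERATIONS = 600_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    # A fresh random salt per password, from a cryptographic source.
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    # Constant-time comparison avoids leaking how many bytes matched.
    return hmac.compare_digest(candidate, expected)


def new_session_token() -> str:
    # secrets, not random: the random module is not suitable for tokens.
    return secrets.token_urlsafe(32)
```

In an audit, the things to look for are the inverse of each line above: a bare `hashlib.md5(password)`, a fixed salt or IV, an `==` comparison of digests, or a token built from `random`.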
AI models reproduce hardcoded credentials with surprising regularity — API keys, database passwords, JWT secrets embedded directly in source code. This happens because demonstration code on GitHub, in blog posts, and in documentation frequently uses placeholder credentials that look like real credentials. The model learns that code-near-credential is a valid syntactic pattern.
GitGuardian's 2023 State of Secrets Sprawl report found over 10 million hardcoded secrets committed to public GitHub repositories in a single year. That corpus is exactly what AI models trained on. When you ask a model to generate a function that connects to an external API, it has seen thousands of examples where the API key appears in the same file as the connection function — and it will reproduce that pattern.
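One common alternative to the same-file pattern is to read the secret from the process environment and fail loudly when it is absent. The sketch below assumes a hypothetical variable name, `PAYMENTS_API_KEY`, and a bearer-token header; neither comes from any specific API.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def load_api_key(env_var: str = "PAYMENTS_API_KEY") -> str:
    # PAYMENTS_API_KEY is a hypothetical name used only for illustration.
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Fail closed instead of falling back to a default baked into source.
        raise MissingSecretError(f"environment variable {env_var} is not set")
    return value


def build_auth_header(api_key: str) -> dict[str, str]:
    return {"Authorization": f"Bearer {api_key}"}
```

The review question is whether any string literal in the generated file could be mistaken for, or replaced by, a real credential. A fallback default in `os.environ.get` is the same problem in a different place.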
Path traversal vulnerabilities allow attackers to read or write files outside an intended directory by manipulating path inputs with sequences like ../../etc/passwd. AI-generated file-handling code frequently omits the path normalization and directory restriction checks that prevent this — because the model is completing a pattern (open a file given a name) without the architectural context that tells it the name comes from untrusted input.
This is context collapse in its clearest form. A function stub that says def read_report(filename) gives the model no signal that filename could be attacker-controlled. The model generates a clean, readable file-open call — and omits the os.path.realpath() check and the containment assertion that would make it safe.
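A sketch of the missing check, assuming reports live under a single base directory (the `/srv/app/reports` path is hypothetical):

```python
import os

REPORT_DIR = "/srv/app/reports"  # hypothetical location, for illustration only


def resolve_report_path(filename: str, base_dir: str = REPORT_DIR) -> str:
    # Resolve symlinks and '..' segments on both sides before comparing.
    base = os.path.realpath(base_dir)
    candidate = os.path.realpath(os.path.join(base, filename))
    # Containment assertion: the resolved path must stay under the base.
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("path escapes the report directory")
    return candidate


def read_report(filename: str, base_dir: str = REPORT_DIR) -> str:
    with open(resolve_report_path(filename, base_dir), encoding="utf-8") as handle:
        return handle.read()
```

Comparing with `os.path.commonpath` instead of a plain string prefix avoids accepting a sibling directory such as `/srv/app/reports_old`. Absolute inputs like `/etc/passwd` are rejected as well, because `os.path.join` discards the base when the second argument is absolute.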
When asked to generate an API endpoint or a data-access function, AI models often produce the data access logic cleanly but omit authentication and authorization checks entirely. The reason is again context: access control usually lives in middleware layers that the model cannot see. The model generates what was asked for, a function that retrieves user data, and adds no access check because nothing in the prompt shows where that check belongs.
This was identified as a specific concern in a 2023 analysis by researchers at the University of Quebec (Hajipour et al.), who found that AI-generated REST API endpoints frequently lacked access control validation in scenarios where the prompt did not explicitly request it.
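To make the gap concrete, the sketch below separates the part an assistant typically generates (the data access) from the part it tends to leave out (the access check). The `Request` class and the `require_owner` decorator are hypothetical and framework-agnostic; real frameworks provide their own request objects and authorization hooks.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional


@dataclass
class Request:
    # Hypothetical minimal request: who is calling, and whose data they want.
    user_id: Optional[int]
    target_user_id: int


class Forbidden(Exception):
    """Raised when the caller may not access the requested resource."""


def require_owner(handler: Callable[[Request], dict]) -> Callable[[Request], dict]:
    @wraps(handler)
    def wrapper(request: Request) -> dict:
        if request.user_id is None:
            raise Forbidden("authentication required")
        if request.user_id != request.target_user_id:
            raise Forbidden("not allowed to read another user's data")
        return handler(request)

    return wrapper


@require_owner
def get_profile(request: Request) -> dict:
    # Typical generated body: pure data access, no access control of its own.
    return {"user_id": request.target_user_id, "email": "user@example.com"}
```

During review, the question is whether something playing the role of `require_owner` exists anywhere on the path to the function, not whether the function body itself contains a check.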
When reviewing AI-generated code, check in this order: (1) injection in database and system calls, (2) cryptographic algorithm and configuration choices, (3) credential and secret handling, (4) file path validation, (5) presence of authentication and authorization checks on data-access functions. This order reflects documented frequency, not theoretical importance.
The tutor will present you with short code snippets typical of AI-generated output. Your task is to identify the vulnerability class (using CWE numbers where possible), explain the specific failure, and suggest the secure alternative. The tutor will probe your reasoning and correct imprecise identifications.
Aim for at least three code review exchanges. Push for specificity — "it's insecure" is not an acceptable answer in a real audit.
In 2022, the cryptocurrency exchange Coinbase published a retrospective on their internal code review practices following a series of near-miss security incidents. The pattern they identified was consistent: reviewers caught problems they were specifically looking for and missed problems they were not primed to consider. Code review based on general vigilance is not sufficient. Structured checklists tied to known failure modes consistently outperform expert intuition in controlled studies — a finding that applies doubly to AI-generated code, where the failure modes are specific and documented.
The first phase of an AI code audit happens before examining a single line. It is about establishing context that the AI model did not have when generating the code. Four questions must be answered:
1. What is the trust level of the inputs? — Does any input to this code come from a user, an external API, a file, or a network connection? If yes, every function that touches that input is injection- and traversal-suspect.
2. What is the data classification? — Does this code handle credentials, PII, payment data, or health information? If yes, cryptographic handling and access control are mandatory checks, not optional ones.
3. What is the deployment environment? — Is this code running in a container with network access? In a serverless function? In a browser? The threat surface varies significantly by environment.
4. What AI tool generated this, and what was the prompt? — If available, the original prompt reveals what context the model had. A prompt that does not mention authentication will produce code without authentication — predictably.
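The four triage questions can be recorded as structured answers so that the resulting review scope is explicit. This is only one possible encoding; the class, field names, and check labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageAnswers:
    untrusted_input: bool    # question 1: does any input cross a trust boundary?
    sensitive_data: bool     # question 2: credentials, PII, payment, or health data?
    environment: str         # question 3: e.g. "container", "serverless", "browser"
    prompt_available: bool   # question 4: is the original prompt on record?


def required_checks(answers: TriageAnswers) -> list[str]:
    checks: list[str] = []
    if answers.untrusted_input:
        checks += ["injection review", "path traversal review"]
    if answers.sensitive_data:
        checks += ["cryptography review", "access control review"]
    if not answers.prompt_available:
        checks.append("reconstruct generation context from author")
    checks.append(f"environment threat surface: {answers.environment}")
    return checks
```

For example, `required_checks(TriageAnswers(True, False, "serverless", True))` yields the injection and traversal reviews plus the environment item, and nothing else.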
Preserve AI prompts alongside generated code in version control. The prompt is architectural context. A future auditor reviewing the code six months later cannot know what the AI was told — which is precisely the information needed to reason about what the AI could not have known.
Static analysis tools such as Semgrep, Bandit (Python), ESLint security plugins (JavaScript), SpotBugs (Java), and CodeQL should run on all AI-generated code before human review. Not instead of human review. Before it. The tools catch the mechanical failures: string concatenation in SQL queries, calls to deprecated hash functions, subprocess calls with shell=True. This frees human reviewers to focus on the architectural and contextual failures that tools cannot detect.
GitHub's 2023 Octoverse report noted that repositories using AI code generation had adoption rates of automated security scanning roughly equivalent to those without — suggesting that organizations are not consistently adding automated review when they add AI generation. This is the gap. The two capabilities should be treated as a required pair, not independent options.
Semgrep, specifically, allows teams to write custom rules targeting AI-specific failure patterns. A rule that flags any direct variable interpolation into a database query string can catch Copilot's most common SQL failure within seconds of commit — before a human reviewer sees the code.
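The idea behind such a rule can be illustrated without Semgrep itself. The following sketch uses Python's standard `ast` module to flag `execute` calls whose query argument is built dynamically. It is not Semgrep rule syntax and is far less thorough than a real rule: for instance, it does not follow a query that was assembled in a variable on an earlier line.

```python
import ast


def find_interpolated_queries(source: str) -> list[int]:
    """Return line numbers of .execute()/.executemany() calls whose first
    argument is an f-string, a '+' or '%' expression, or a .format() call."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not (isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)):
            continue
        if node.func.attr not in {"execute", "executemany"} or not node.args:
            continue
        query = node.args[0]
        is_fstring = isinstance(query, ast.JoinedStr)
        is_concat_or_percent = isinstance(query, ast.BinOp) and isinstance(
            query.op, (ast.Add, ast.Mod)
        )
        is_format_call = (
            isinstance(query, ast.Call)
            and isinstance(query.func, ast.Attribute)
            and query.func.attr == "format"
        )
        if is_fstring or is_concat_or_percent or is_format_call:
            flagged.append(node.lineno)
    return sorted(flagged)
```

A check like this is deliberately noisy: it flags the construction pattern regardless of where the interpolated value comes from, which matches the "automatic flag" stance taken toward string-built queries earlier in the module.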
AI models frequently recommend specific library versions, and those recommendations reflect the training corpus rather than current security advisories. A model asked to generate Python cryptographic code may recommend pycrypto — a library that was formally deprecated and abandoned in 2017 in favor of pycryptodome. A model generating Node.js JWT code may recommend jsonwebtoken versions that predate the header injection fixes applied in version 9.0.0 (released December 2022).
Every AI-generated dependency should be checked against current CVE databases, and the specific version pinned (if any) should be compared against the current stable release. Tools like npm audit, pip-audit, OWASP Dependency-Check, and GitHub's Dependabot automate much of this — but they must be invoked, and AI-assisted development workflows often skip this step under time pressure.
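Dedicated tools should do the real work here, but the shape of the check is simple enough to sketch. The example below handles only bare names and `name==version` lines from a requirements file, and its table of abandoned packages contains just the one example named in this lesson; it is an illustration of the review step, not a substitute for pip-audit or an advisory database.

```python
from typing import Optional

# Only the pycrypto entry comes from the lesson text; a real audit would
# consult a maintained advisory source instead of a hand-written table.
ABANDONED_PACKAGES = {"pycrypto": "pycryptodome"}


def parse_requirement(line: str) -> Optional[tuple[str, Optional[str]]]:
    """Split 'name==version' into (name, version); version is None if unpinned."""
    text = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not text:
        return None
    name, _, version = text.partition("==")
    return name.strip().lower(), (version.strip() or None)


def review_requirements(lines: list[str]) -> list[str]:
    findings = []
    for line in lines:
        parsed = parse_requirement(line)
        if parsed is None:
            continue
        name, version = parsed
        if name in ABANDONED_PACKAGES:
            findings.append(f"{name}: abandoned, consider {ABANDONED_PACKAGES[name]}")
        if version is None:
            findings.append(f"{name}: no pinned version, cannot be compared to advisories")
    return findings
```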
The failures that static analysis cannot catch are the ones requiring architectural reasoning: missing rate limiting, absent authentication middleware, incorrect trust boundary placement, missing logging of security events. These require a human reviewer who knows the system to walk through the generated code asking a specific question at each function boundary: what happens if the input to this function is adversarial?
This is not a general code review. It is a threat-model walkthrough applied to code that was generated without a threat model. It should be performed by someone who has read the system's threat model document — or, if no such document exists, the architectural review should begin by producing one.
When an AI generates an endpoint function and omits authentication, the correct fix is not to add authentication to the function — it is to confirm that authentication is handled at the correct layer (middleware, API gateway, or decorator) and that the function is correctly registered behind it. Patching the symptom in the generated code without understanding the architectural intent creates a false sense of resolution.
An audit without documentation is an opinion. For AI-generated code in production systems, audit records should capture: which tool generated the code, the date of generation (to establish what model version and training cutoff applies), which static analysis tools ran and what they found, which human reviewer conducted the architectural review and what they checked, and the specific disposition of any flagged findings. This record is both a quality gate and a liability document — if a vulnerability is discovered post-deployment, the audit record demonstrates due diligence and establishes what was and was not visible at review time.
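One way to keep these records consistent is to give them a fixed shape. The sketch below mirrors the fields listed above as a dataclass that serializes to JSON; the class and field names are assumptions, and a team would adapt them to its own tracking system.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class Finding:
    description: str
    disposition: str  # e.g. "fixed", "accepted risk", "false positive"


@dataclass
class AuditRecord:
    generating_tool: str
    generated_on: date
    static_analysis_tools: list[str]
    reviewer: str
    checks_performed: list[str]
    findings: list[Finding] = field(default_factory=list)

    def is_complete(self) -> bool:
        # Complete means: tools ran, a named reviewer checked something,
        # and every finding has a recorded disposition.
        has_review = bool(
            self.reviewer and self.static_analysis_tools and self.checks_performed
        )
        return has_review and all(f.disposition for f in self.findings)

    def to_json(self) -> str:
        payload = asdict(self)
        payload["generated_on"] = self.generated_on.isoformat()
        return json.dumps(payload, indent=2, sort_keys=True)
```

A record with an undispositioned finding reports `is_complete()` as false, which makes it usable as a merge gate as well as a paper trail.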
You will work with the tutor to build a practical audit checklist for a specific scenario. The tutor will play devil's advocate — pushing back on checklist items that are too vague, challenging the ordering of checks, and probing whether each item is actually measurable during a code review.
Complete at least three exchanges building, refining, and defending your checklist. The goal is a list you could hand to a colleague and have them apply without additional explanation.
In February 2024, researchers at Snyk published a comparative study examining how prompt specificity affected the security of GitHub Copilot outputs. They tested three prompt styles across 20 security-sensitive scenarios: bare functional prompts ("write a login function"), standard prompts with language and framework specification, and security-enriched prompts that specified trust levels, named relevant CWE categories, and asked explicitly for parameterized queries or bcrypt. The security-enriched prompts reduced vulnerability rate by roughly 30–40% compared to bare prompts. That is a real and meaningful improvement. It is also not sufficient on its own to replace review.
Research and practitioner experience converge on several prompt patterns that consistently produce safer AI-generated code:
Specify the trust level of inputs explicitly. "Write a function that accepts a filename from an HTTP request parameter" gives the model far more security-relevant context than "write a function that reads a file." The untrusted-input signal triggers pattern associations with validation that a bare functional prompt does not.
Name the security constraint directly. "Use parameterized queries, not string concatenation" produces parameterized queries more reliably than prompts that simply ask for database interaction. "Use bcrypt with a work factor of at least 12" produces bcrypt more reliably than "hash the password securely." The model responds to explicit naming of constraints better than implicit expectations of security.
Ask for threat reasoning alongside code. Prompts that request "explain what security assumptions this code makes" or "identify what inputs would be dangerous" produce more security-aware output — and the explanation itself becomes a review artifact that surfaces what the model did and did not consider.
Specify the negative space. "Do not use MD5 or SHA-1 for hashing," "do not use shell=True in subprocess calls," "do not hardcode credentials" — explicit prohibitions reduce the incidence of the named failure modes because the model's context now includes a token sequence that associates the task with the prohibition.
A security-enriched prompt template: "Write a [language] function that [task]. Input comes from [trust level]. Requirements: [explicit security constraints]. Do not use: [prohibited patterns]. After the code, list what security assumptions it makes and what inputs would break it."
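If the template is going to be embedded in tooling, it can be expressed as a small function that refuses to produce a prompt with empty constraint slots. This is a sketch: the function name and parameters are illustrative, and the wording of the output follows the template above.

```python
def build_security_prompt(
    language: str,
    task: str,
    trust_level: str,
    requirements: list[str],
    prohibited: list[str],
) -> str:
    # An enriched prompt with nothing in these slots is just a bare prompt.
    if not requirements or not prohibited:
        raise ValueError("name at least one requirement and one prohibited pattern")
    return (
        f"Write a {language} function that {task}. "
        f"Input comes from {trust_level}. "
        f"Requirements: {'; '.join(requirements)}. "
        f"Do not use: {'; '.join(prohibited)}. "
        "After the code, list what security assumptions it makes "
        "and what inputs would break it."
    )
```

For example, calling it with the task "looks up a user by name", the trust level "an unauthenticated HTTP POST body", the requirement "use parameterized queries", and the prohibition "string concatenation in SQL" yields a single prompt string in the template's order.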
Prompt engineering has a structural ceiling determined by the same factors that cause AI security failures in the first place. Three failure modes persist regardless of prompt quality:
Architectural context that is absent from the prompt cannot be supplied by the prompt. If the broader system's trust boundaries, data classification, and threat model are not described in the prompt — and they rarely are, in full — the model cannot reason about them. No prompt phrasing compensates for missing architectural context.
Post-training vulnerabilities remain invisible. A prompt that says "use the current secure version of [library]" cannot cause the model to recommend a version released after its training cutoff. The model's knowledge is frozen. The prompt cannot thaw it.
Statistical pressure is not fully overridden by instruction. Research from the Snyk study and others consistently shows that security-enriched prompts reduce but do not eliminate failure rates. The model is completing a probability distribution over tokens. Instructions shift the distribution but do not redraw it. A prompt that says "do not use string concatenation for SQL" will produce parameterized queries most of the time — but not all of the time, particularly for complex or edge-case scenarios where the training distribution for string concatenation is very strong.
The most durable security improvements from AI code generation come not from individual prompt optimization but from organizational processes: security-enriched prompt templates embedded in IDE configurations, pre-commit hooks that run static analysis on all AI-flagged files, mandatory architectural review for AI-generated code touching security-sensitive surfaces, and explicit training for developers on the vulnerability taxonomy from Lesson 2.
Microsoft's DevDiv team documented in their 2023 internal research (published at MSR) that teams with explicit AI code review policies had significantly lower escape rates for security vulnerabilities than teams that relied on developers' individual judgment about when AI review was needed. The finding was not that AI code was uniquely dangerous — it was that policy consistency was the determining variable, not individual skill.
A practical integration of the full module's methodology into a development workflow looks like this: developers use security-enriched prompt templates as a baseline for all security-sensitive generation tasks. Pre-commit hooks run Semgrep with AI-specific rules on every commit that contains AI-flagged files. Pull requests from AI-generated code require architectural review sign-off from a designated reviewer. AI prompts are committed to a /ai-context/ directory alongside generated files. Dependency recommendations from AI are automatically piped through pip-audit or npm audit before any merge. The audit record — tool results, reviewer sign-off, finding disposition — is attached to the PR as a required artifact.
This is not onerous if the tools are configured once and the process is embedded in the team's normal workflow. Whatever overhead remains should be weighed against the cost of the failures it prevents, which for code handling payment data, authentication, or health records is very high.
AI code generation produces real security vulnerabilities at a documented rate, for structural reasons: training data bias, knowledge cutoff blindness, and context collapse. The failure modes cluster around injection, cryptography, hardcoded credentials, path traversal, and missing authorization. Systematic audit — triage, static analysis, dependency review, architectural walkthrough, documentation — catches what intuition misses. Better prompting helps but has a ceiling. Organizational process consistency is the determining variable for escape rate. The audit is not optional — it is the compensating control for a technology that optimizes for functionality, not security.
In this lab you will write security-enriched prompts for three security-sensitive scenarios, then the tutor will simulate what a model is likely to output and help you identify what residual vulnerabilities or assumptions remain — demonstrating the prompt ceiling in practice.
Complete at least three prompt-output-analysis cycles. The goal is to internalize where prompting helps and where it structurally cannot compensate for missing context or outdated knowledge.