Module 4 · Lesson 1

Commit Hygiene: The Atomic Unit of Trust

Every commit is a signed statement about what changed and why. Claude can generate them — but you are accountable for every word.

What separates a commit history that guides future engineers from one that gets them fired?

On August 1, 2012, Knight Capital Group deployed new trading software to production. A technician failed to update one of eight servers with the latest code. The old server still carried dormant code called Power Peg, an algorithm decommissioned years earlier, which a repurposed flag reactivated. Within 45 minutes, Knight had executed 4 million trades it never intended, losing $440 million. The firm needed an emergency rescue investment within days and was acquired by a competitor the following year.

Post-mortems revealed the root cause extended beyond deployment: the commit history for that flag was incoherent. Developers had bundled unrelated changes together. The decommissioning of Power Peg was buried inside a commit titled "misc cleanup" alongside twelve other modifications. No one tracing the flag's lifecycle could quickly reconstruct what had changed, when, or why it had been retained on certain servers.

Why Atomic Commits Matter

An atomic commit encapsulates exactly one logical change. It passes tests in isolation. It can be reverted cleanly without pulling unrelated work along with it. The discipline sounds simple; the practice is rare. Engineers routinely bundle refactoring, bug fixes, and feature additions into a single commit because it is faster in the moment — and because no one is watching.

Claude Code can create commits on your behalf, either when you ask it to or as part of agentic work. The convenience is real. The risk is that Claude, optimizing for task completion, will bundle changes just as human engineers do under deadline pressure. Your job as the human operator is to review the proposed commit scope before it lands.

The key discipline: one logical change, one commit. If a commit message requires the word "and" to describe what changed, it probably contains two commits.

Production Pattern

Run git diff --staged before every Claude-generated commit. Scan for files that surprise you — config changes alongside logic changes, test deletions alongside feature additions. These are signals that Claude bundled work that should be separated.
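The scan described above can be partly mechanized. The following is a minimal sketch, assuming you feed it the text printed by git diff --staged --name-status (one tab-separated status and path per line). The category rules, the CONFIG_SUFFIXES set, and the function names are illustrative assumptions, not part of git or Claude Code.

```python
from pathlib import PurePosixPath

# Hypothetical rule: these suffixes count as configuration. Adjust per repository.
CONFIG_SUFFIXES = {".yml", ".yaml", ".toml", ".ini", ".json"}


def parse_name_status(output: str) -> list[tuple[str, str]]:
    """Parse `git diff --staged --name-status` output into (status, path) pairs."""
    entries = []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Renames and copies list two paths; the last field is the new path.
        # The status may carry a score (e.g. R100), so keep only its first letter.
        entries.append((fields[0][0], fields[-1]))
    return entries


def categorize(path: str) -> str:
    """Assign a path to a coarse category: test, config, or source."""
    p = PurePosixPath(path)
    if "tests" in p.parts or p.name.startswith("test_"):
        return "test"
    if p.suffix in CONFIG_SUFFIXES:
        return "config"
    return "source"


def scope_warnings(output: str) -> list[str]:
    """Return warnings for staged sets that mix categories in suspicious ways."""
    entries = parse_name_status(output)
    categories = {categorize(path) for _, path in entries}
    warnings = []
    if {"config", "source"} <= categories:
        warnings.append("config and source changes staged together")
    deleted_tests = [p for s, p in entries if s == "D" and categorize(p) == "test"]
    if deleted_tests and "source" in categories:
        warnings.append(
            "test deletions staged alongside source changes: " + ", ".join(deleted_tests)
        )
    return warnings
```

A warning here is a prompt to look closer, not proof of a bad commit; some config changes legitimately belong with the logic they enable.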

Writing Commit Messages That Survive Archaeology

The Conventional Commits specification, adopted by Angular, Vue, and hundreds of open-source projects, establishes a minimal grammar: type(scope): subject. Types include feat, fix, docs, style, refactor, test, chore. The scope names the subsystem. The subject uses the imperative mood — "add rate limiting" not "added rate limiting."

Claude follows Conventional Commits when prompted explicitly. Without direction, it tends toward verbose past-tense summaries that read like changelogs rather than commit messages. The distinction matters: a commit message should explain the intent of the change; a changelog documents the result.

A strong commit message has three parts. The subject line (50 characters or fewer) states what changed. A blank line separates it from the body. The body (wrapped at 72 characters) explains why the change was necessary, what alternatives were considered, and any non-obvious consequences. Claude can generate all three — but you must prompt for all three.
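The structural rules above (type(scope): subject header, 50-character subject, blank separator line, body wrapped at 72) can be checked mechanically. This is a minimal sketch, assuming the type list named in this lesson; the function name check_message is hypothetical. It does not attempt to verify imperative mood or whether the body actually explains why, which still need a human read.

```python
import re

# Types as listed in this lesson; projects may allow others.
TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore")
HEADER = re.compile(r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?: (?P<subject>.+)$")


def check_message(message: str) -> list[str]:
    """Return a list of problems; an empty list means the structure passes."""
    lines = message.rstrip("\n").split("\n")
    problems = []
    match = HEADER.match(lines[0])
    if match is None:
        problems.append("subject line is not in type(scope): subject form")
    elif match.group("type") not in TYPES:
        problems.append(f"unknown type: {match.group('type')}")
    if len(lines[0]) > 50:
        problems.append("subject line exceeds 50 characters")
    if len(lines) < 3:
        problems.append("missing body")
    else:
        if lines[1] != "":
            problems.append("second line must be blank")
        # Body starts on line 3 of the message.
        for number, text in enumerate(lines[2:], start=3):
            if len(text) > 72:
                problems.append(f"line {number} exceeds 72 characters")
    return problems
```

A check like this fits naturally in a commit-msg hook, where it can reject a message before the commit is recorded.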

Atomic Commit: A commit that encapsulates exactly one logical change, passes tests in isolation, and can be reverted without side effects on unrelated work.

Conventional Commits: A specification for commit message structure: type(scope): subject. Enables automated changelog generation and semantic versioning.

Imperative Mood: The command form in commit subjects: "fix null check" not "fixed null check." Reads as an instruction to the codebase, not a past-tense diary entry.

Signing and Attribution in AI-Assisted Workflows

When Claude writes code that you commit under your name, the authorship question is real but settled by current practice: you are the author. Git sign-off (git commit -s) appends a Signed-off-by trailer indicating the committer accepts the Developer Certificate of Origin. Many regulated projects require it. If Claude generated the diff, you are still the signatory — and still responsible for its correctness.

Some teams document AI assistance at the end of the commit message using a trailer in the form Co-authored-by: Name <email>, for example Co-Authored-By: Claude <noreply@anthropic.com>. This is optional, increasingly common in open-source, and professionally prudent: it creates an honest audit trail without obscuring accountability. GitHub recognizes these trailers and lists the co-author on the commit, which makes the assistance visible to reviewers.
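Trailers live in a final block of the commit message, separated from the body by a blank line, one Key: value pair per line. The sketch below shows that layout only; the helper name with_trailers and the example identities are hypothetical, and in practice git commit -s writes the Signed-off-by line for you.

```python
def with_trailers(message: str, trailers: list[tuple[str, str]]) -> str:
    """Append a trailer block to a commit message, separated by one blank line."""
    body = message.rstrip("\n")
    block = "\n".join(f"{key}: {value}" for key, value in trailers)
    return f"{body}\n\n{block}\n"


# Example: the human signs off; the AI assistance is disclosed as a co-author.
example = with_trailers(
    "fix(auth): reject expired refresh tokens\n\nThe expiry check ran too late.",
    [
        ("Signed-off-by", "Dana Example <dana@example.com>"),
        ("Co-authored-by", "Claude <noreply@anthropic.com>"),
    ],
)
```

The sign-off comes from the person accepting responsibility; the co-author line only records assistance.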

The Discipline Principle

Commit hygiene is not bureaucracy. It is the foundation of every subsequent workflow in this module — PR reviews, rollback procedures, and production incident post-mortems all depend on a commit history you can read and trust. Knight Capital's disaster was partly a commit-history disaster. Your commits are signed statements of professional judgment.

Quiz · Lesson 1

Commit Hygiene and Atomic Change Discipline

1. What is the defining characteristic of an atomic commit?

Correct. Atomicity is about logical scope — one change, reversible in isolation — not line count or tooling compliance.

Not quite. Atomic commits are defined by logical scope, not size metrics or authorship rules.

2. In the Knight Capital 2012 incident, what commit-history failure contributed to the disaster?

Correct. Bundled, poorly labeled commits made the flag's lifecycle impossible to trace quickly — a real-world cost of commit-hygiene negligence.

The key failure was bundling an important change inside an unrelated, vaguely-titled commit — making historical reconstruction impossible under pressure.

3. According to Conventional Commits, which message format is correct?

Correct. Conventional Commits requires lowercase type, optional scope in parentheses, colon-space separator, and imperative mood subject.

Conventional Commits specifies: lowercase type, optional scope in parentheses, colon-space, then imperative mood (present tense) subject.

4. When Claude auto-generates commits during agentic work, what is the operator's primary review responsibility?

Correct. The human operator must verify that Claude hasn't bundled unrelated changes — running git diff --staged is the practical check.

The key responsibility is verifying the logical scope of the commit before it lands, not rewriting style or disabling automation entirely.

5. A commit message reads: "fix auth bug and refactor token validation module." What is wrong with it?

Correct. "And" in a commit message is a red flag for a non-atomic commit. The bug fix and refactor should be separate commits.

While other issues exist, the core problem is atomicity: "and" signals two separate changes that should be two separate commits.

Lab · Lesson 1

Practice: Commit Message Craft with Claude

Scenario: Triage a Messy Commit History

You're reviewing a Claude-assisted sprint. The AI produced six commits, but several bundle unrelated changes. Use this lab to practice diagnosing non-atomic commits, rewriting messages per Conventional Commits, and deciding when to split a commit into two.

Start by describing a commit message that mixes concerns — for example: "update user model and fix login bug and add tests." Ask the assistant to help you untangle it into atomic commits with proper Conventional Commits formatting.

Commit Hygiene Lab

AI Assistant

Welcome to the commit hygiene lab. I'm here to help you practice writing atomic commits and Conventional Commits messages. Describe a messy commit you want to untangle — paste a fake commit message, a list of file changes, or a scenario, and we'll work through splitting and rewriting it together.

Module 4 · Lesson 2

Pull Request Architecture: Reviews That Actually Work

A PR is not a formality. It is the last human checkpoint before code meets production. Claude can accelerate every stage — and corrupt every stage if misused.

How do you design a PR process that catches real problems rather than performing the ritual of review?

On April 7, 2014, the OpenSSL project publicly disclosed CVE-2014-0160, the Heartbleed vulnerability, which Neel Mehta of Google Security had reported. The bug had lived in production for two years, introduced in code written by Robin Seggelmann and committed on December 31, 2011. The commit added the TLS heartbeat extension. A bounds check was missing. An attacker could read up to 64 KB of server memory per request, exposing private keys, passwords, and session tokens.

The code was reviewed. The commit was accepted. The review missed the missing bounds check: the diff was large, the context was complex, and the reviewer, Dr. Stephen Henson, later acknowledged he had not checked carefully enough. The lesson applies to every code review system: a review that processes too much at once catches nothing reliably.

The Structure of a Reviewable PR

Heartbleed's commit touched 579 lines across multiple files. Research by SmartBear (published in their 2011 Code Review Best Practices report) found that reviewers who examine more than 400 lines of diff in a single session show dramatically declining defect detection — their brains simply saturate. The implication for Claude-assisted work is sharp: Claude can produce 2,000-line diffs in the time it takes you to write a prompt. Unless you constrain scope at the task level, your PRs will routinely exceed reviewable size.

A reviewable PR has a single stated purpose, a diff under 400 lines of meaningful change (excluding auto-generated files), a description that states what changed, why it was necessary, and what to pay attention to, and a test plan that a reviewer can execute independently. The description is not optional. Claude can draft it — and should be asked to — but you must verify it accurately describes the actual diff.

The Size Rule

If your PR exceeds 400 meaningful lines of diff, split it. If splitting it breaks functionality, that is evidence the feature was not decomposed into atomic pieces before implementation began. Fix the decomposition, not the review process.
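The size rule is easy to enforce before a PR is opened. This is a minimal sketch, assuming the input is the text printed by git diff --numstat (added count, deleted count, and path, separated by tabs, with "-" for binary files). The GENERATED patterns and the function name are hypothetical; decide for your own repository what counts as auto-generated.

```python
import fnmatch

# Hypothetical patterns for auto-generated files that should not count.
GENERATED = ("package-lock.json", "*.lock", "*.min.js", "*_pb2.py")


def meaningful_lines(numstat: str, limit: int = 400) -> tuple[int, bool]:
    """Sum added and deleted lines, skipping generated and binary files.

    Returns (total, over_limit).
    """
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        added, deleted, path = parts
        name = path.rsplit("/", 1)[-1]
        if any(fnmatch.fnmatch(name, pattern) for pattern in GENERATED):
            continue
        if added == "-" or deleted == "-":
            # Binary files report "-" instead of line counts.
            continue
        total += int(added) + int(deleted)
    return total, total > limit
```

Counting added plus deleted lines is one reasonable reading of "meaningful lines of diff"; a team could equally count only additions, as long as the rule is applied consistently.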

Claude as a Pre-Review Pass

One of the highest-leverage uses of Claude in a PR workflow is as a pre-review — a structured pass that happens before the PR reaches human reviewers. You feed Claude the diff and a description, and ask it to identify: security boundary violations, missing error handling, logical inconsistencies between the description and the code, and test coverage gaps.

This is not a substitute for human review. It is a filter that ensures reviewers spend their cognitive bandwidth on judgment calls rather than catching that a null check was omitted. Teams at companies including Shopify and Stripe have publicly described using LLM pre-review passes to reduce review round-trips by catching mechanical defects before human reviewers engage.

The prompt structure matters. Vague prompts ("review this code") produce generic feedback. Specific prompts ("review this diff for security boundary violations, specifically anything that reads user-controlled input without validation before using it in a database query") produce actionable findings.
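One way to keep pre-review prompts specific is to build them from an explicit list of defect categories rather than typing them free-form each time. The sketch below is illustrative; the function name and the prompt wording are assumptions, not a documented Claude interface.

```python
def build_prereview_prompt(diff: str, description: str, focus_areas: list[str]) -> str:
    """Assemble a focused pre-review prompt from a diff and its PR description."""
    if not focus_areas:
        # A prompt with no named categories degrades into "review this code".
        raise ValueError("name at least one defect category to review for")
    focus = "\n".join(f"- {area}" for area in focus_areas)
    return (
        "Review the diff below for these defect categories only:\n"
        f"{focus}\n\n"
        "Also flag any place where the description and the code disagree.\n"
        "For each finding, cite the file and the changed line.\n\n"
        f"PR description:\n{description}\n\n"
        f"Diff:\n{diff}\n"
    )
```

Refusing to build a prompt without focus areas makes the vague case impossible by construction.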

Pre-Review Pass: A Claude-assisted scan of a PR diff before human review, targeting specific defect categories to preserve reviewer bandwidth for judgment-intensive decisions.

PR Description: A structured narrative in a pull request explaining what changed, why the change was necessary, and what reviewers should focus on. Claude can draft it; the author must verify accuracy.

Review Saturation: The cognitive phenomenon where reviewers examining large diffs show steeply declining defect detection rates. Documented by SmartBear research to onset above ~400 lines.

Structuring the Review Request

When you ask Claude to write a PR description, include: the task it was solving, the approach it took, any alternatives it considered, and any parts of the diff it is uncertain about. That last point is critical — Claude performing agentic work will sometimes make judgment calls it is not fully confident about. A good PR description surfaces those explicitly so reviewers can prioritize them.

The WHATWHY template used by teams at Google structures PR descriptions into three mandatory sections: What (one-sentence summary of the change), Why (the problem or requirement driving it), and How (the notable implementation decision, not a line-by-line walkthrough). Claude can populate all three quickly when given the task context — but without that context, it will hallucinate plausible-sounding motivations that may not match your actual intent.
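The What / Why / How structure, plus the explicit list of uncertain areas, can be captured in a small data type so that a description with a missing section is rejected instead of being filled with guesses. This is a sketch under stated assumptions: the class name, field names, and the "Please check" heading are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PRDescription:
    what: str  # one-sentence summary of the change
    why: str   # the problem or requirement driving it
    how: str   # the notable implementation decision
    uncertainties: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the description, refusing to emit empty mandatory sections."""
        for name in ("what", "why", "how"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} section is empty; supply the task context")
        sections = [f"What\n{self.what}", f"Why\n{self.why}", f"How\n{self.how}"]
        if self.uncertainties:
            items = "\n".join(f"- {item}" for item in self.uncertainties)
            sections.append(f"Please check\n{items}")
        return "\n\n".join(sections) + "\n"
```

The empty-section check is the point: an absent Why should stop the process, not invite a plausible-sounding invention.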

The Heartbleed Lesson

Heartbleed was not a failure of intelligence. Dr. Henson was a highly competent cryptographer. It was a failure of review architecture — too much change, too little structure, no focused attention on security boundaries. Claude-assisted development produces more code faster. That is exactly when review discipline must increase, not decrease.

Quiz · Lesson 2

Pull Request Architecture and Review Quality

1. The Heartbleed vulnerability survived code review primarily because of what structural failure?

Correct. Dr. Henson acknowledged insufficient checking — a consequence of review saturation on a large, complex diff.

Henson was highly qualified. The failure was structural: too much to review reliably in a single pass.

2. SmartBear's Code Review Best Practices research identified what approximate line-count threshold for declining defect detection?

Correct. Above roughly 400 meaningful lines, reviewer defect detection rates drop sharply due to cognitive saturation.

SmartBear's research documented the saturation threshold at approximately 400 lines of meaningful diff.

3. What is the primary purpose of a Claude pre-review pass on a PR diff?

Correct. The pre-review pass is a filter, not a replacement. It handles mechanical defects so humans can focus on architectural and security judgment calls.

Claude pre-review is a filter that catches mechanical issues — it does not replace human judgment on architectural and security decisions.

4. When asking Claude to draft a PR description, what context is most critical to provide?

Correct. Without task context, Claude will hallucinate plausible-sounding motivations. Uncertainties in the diff should surface explicitly for reviewers.

The most critical context is the task being solved, the approach, alternatives, and especially areas of uncertainty — not process metadata.

5. If splitting a large PR breaks functionality, what does that indicate according to the lesson?

Correct. If a PR can't be split without breaking things, the underlying task decomposition was flawed. That is the problem to fix.

When splitting breaks functionality, the root cause is inadequate task decomposition at the planning stage — not a review process problem.

Lab · Lesson 2

Practice: PR Description and Pre-Review Prompting

Scenario: Write a PR Description and Run a Pre-Review Pass

You've completed a feature: a new rate-limiting middleware for an Express.js API. The middleware reads from Redis, tracks request counts per IP per minute, and returns 429 with a Retry-After header when limits are exceeded. You need to write a PR description and then run a pre-review pass targeting security boundaries.

Ask the assistant to help you write a PR description using the WHATWHY template (What / Why / How), then ask it to identify the specific security boundary questions a reviewer should focus on for this type of rate-limiting implementation.

PR Architecture Lab

AI Assistant

Welcome to the PR architecture lab. I can help you draft PR descriptions using structured templates and conduct focused pre-review passes targeting specific defect categories. Describe your change or paste a diff snippet, and let's build a review-ready PR together.

Module 4 · Lesson 3

Branching Strategies and Merge Discipline

How you structure branches determines what Claude can safely automate and what it must always hand back to a human. The topology of your repository is a statement of your risk tolerance.

When Claude is executing autonomously across multiple branches, what does the branching strategy protect — and what does it expose?

On March 22, 2016, Azer Koçulu unpublished 273 npm packages in a dispute with Kik Interactive over a package name. One of those packages — left-pad, an eleven-line string-padding utility — was a transitive dependency of Babel, React, and thousands of other projects. Within minutes, builds broke across the internet. npm, Inc. took the unprecedented step of republishing the package without the author's consent.

The incident is usually told as a story about dependency management. It is equally a story about what happens when an unreviewed, unpredicted removal cascades through a system with no gatekeeping at the integration point. Every project that depended on left-pad had merged that dependency without a policy governing who could approve external dependency additions. The branch protection rule that would have required human review of dependency changes simply did not exist.

The Three Major Branching Strategies

Git Flow, formalized by Vincent Driessen in 2010, uses long-lived develop and master branches with feature, release, and hotfix branches. It provides strong isolation and a clear release cadence but generates substantial merge overhead. Teams practicing continuous delivery often find Git Flow creates ceremony without safety.

GitHub Flow, described by Scott Chacon in 2011, is simpler: one main branch, short-lived feature branches, deploy from main after merge. It works well for teams releasing continuously. The risk is that main must always be deployable — every merge is a potential deployment. Branch protection rules become load-bearing.

Trunk-Based Development (TBD), practiced at Google and documented in the DORA research, pushes even further: all developers commit to a single trunk, and feature flags control visibility. It maximizes integration frequency and minimizes merge conflicts but demands rigorous automated testing and feature flagging discipline.

For Claude-assisted workflows, the critical question is: which branches can Claude commit to autonomously, and which require human approval before merge? The answer should be encoded in branch protection rules, not left to convention.

Branch Protection Rule

Configure main (or your integration branch) to require at least one human reviewer, passing CI, and — if your team adds AI disclosure — a check that verifies AI-assisted PRs carry the appropriate trailer. GitHub, GitLab, and Bitbucket all support these constraints natively. Encode your policy; do not rely on habit.

Claude and Branch Autonomy

When Claude operates in agentic mode — using claude --dangerously-skip-permissions or in a CI context where it has write access — it can create branches, commit, and push without interruption. This is powerful. It is also the exact scenario where branch topology determines blast radius.

A safe Claude-autonomous setup gives it write access to a dedicated claude/ branch namespace (e.g., claude/fix-auth-token-leak) and read access to main. It can never push directly to main. Merging from any claude/* branch to main requires human review and passing CI. This architecture lets Claude work at speed while guaranteeing a human checkpoint before production impact.
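The policy in the paragraph above reduces to two small predicates. The sketch below models the decision logic only; it is not the configuration format of GitHub, GitLab, or Bitbucket, and the constant and function names are hypothetical. On a real host you would express the same rules through its branch protection settings.

```python
# Hypothetical policy constants for illustration.
PROTECTED_REFS = {"refs/heads/main"}
AGENT_PREFIX = "refs/heads/claude/"


def agent_may_push(ref: str) -> bool:
    """The agent may push only to non-empty branch names under its namespace."""
    return (
        ref.startswith(AGENT_PREFIX)
        and len(ref) > len(AGENT_PREFIX)
        and ref not in PROTECTED_REFS
    )


def may_merge_to_main(human_approvals: int, ci_passed: bool) -> bool:
    """Merging into main needs at least one human approval and passing CI."""
    return human_approvals >= 1 and ci_passed
```

Keeping the push rule and the merge rule separate mirrors the architecture: autonomy in the branch, a human gate at integration.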

Teams at Vercel and Linear have described similar architectures — namespaced bot branches, required human merge approval — in engineering blog posts discussing their AI-assisted development workflows. The pattern is consistent: autonomy in the branch, human gate at integration.

Trunk-Based Development: A branching strategy where all developers integrate to a single trunk frequently, using feature flags to control production visibility. Practiced at Google and endorsed by DORA research.

Branch Protection Rule: A repository setting that enforces conditions before a branch can be merged: required reviewers, CI pass, status checks. The primary mechanism for encoding human-in-the-loop policy.

Claude Branch Namespace: A dedicated branch prefix (e.g., claude/*) granting Claude write access for autonomous work while restricting direct pushes to integration branches. Human review required at merge.

Merge Strategies: Merge Commit, Squash, and Rebase

GitHub, GitLab, and Bitbucket offer three merge strategies. Merge commit preserves all branch commits and adds a merge commit — maximum history fidelity, noisier log. Squash and merge collapses all branch commits into one — cleaner main log, but branch history is lost. Rebase and merge replays branch commits linearly — clean history without a merge commit, but rewrites commit SHAs.

For Claude-assisted PRs, squash-and-merge is often the right default. Claude's autonomous work may produce exploratory commits — "trying approach A," "reverting, approach B" — that are useful during development but pollute main's history. Squashing gives main one clean, well-described commit per feature. The PR itself retains the exploratory history for reference.

Configure the merge strategy at the repository level so it applies consistently. Leaving the choice to individual PR authors — especially when those authors include automated agents — guarantees inconsistency.

The left-pad Lesson Extended

left-pad cascaded because no branch protection rule required human review of dependency changes. Claude operating autonomously on dependency updates — running npm install, updating package.json, pushing to main — is the same failure pattern. The fix is identical: require human review at the integration gate, regardless of who (or what) authored the change.

Quiz · Lesson 3

Branching Strategies and Merge Discipline

1. What branching strategy does DORA research endorse as correlated with high software delivery performance?

Correct. DORA's State of DevOps research consistently shows Trunk-Based Development correlates with elite delivery performance.

DORA research documents Trunk-Based Development — frequent integration to a single trunk — as the pattern associated with high delivery performance.

2. What is the recommended safe architecture for Claude's branch access in agentic mode?

Correct. Namespaced branches give Claude operational autonomy while the human-review gate at merge protects the integration branch.

The safe pattern is namespaced claude/* branches with write access, and required human review before anything merges to main.

3. Why is squash-and-merge often the preferred strategy for Claude-assisted PRs merging to main?

Correct. Autonomous agents produce exploratory commit noise. Squash-and-merge keeps main clean while preserving the branch history for reference.

Squash is preferred because Claude's working commits are noisy — squashing collapses them into one meaningful main-branch commit without losing the detail in the PR.

4. The left-pad incident in 2016 illustrates what gap in repository governance?

Correct. The absence of a human-review gate for dependency changes meant no one caught the single-point-of-failure risk before it materialized.

The governance gap was the absence of a branch protection rule requiring human approval for dependency changes — the exact policy that would have caught the risk.

5. In Git Flow, what is the purpose of the "hotfix" branch type?

Correct. Hotfix branches in Git Flow fork from the release tag, fix the issue, and merge back to both master and develop to keep both in sync.

In Git Flow, hotfix branches are short-lived, created from the production tag, and merged back to both master/main and develop after the fix.

Lab · Lesson 3

Practice: Designing Branch Policies for Claude Autonomy

Scenario: Architect a Branch Protection Policy

Your team is adopting Claude Code for agentic work. You need to configure a branching policy that gives Claude operational autonomy on feature work while guaranteeing human review before anything reaches main. You're using GitHub and must decide: branch naming conventions, protection rules, required status checks, and merge strategy.

Ask the assistant to help you design a complete branch protection configuration for a team of 5 engineers using Claude for autonomous feature development. Discuss the trade-offs between GitHub Flow and Trunk-Based Development for your scenario, and ask for the specific GitHub branch protection settings you should enable.

Branch Strategy Lab

AI Assistant

Welcome to the branch strategy lab. I can help you design branch protection policies that balance Claude autonomy with human oversight. Tell me about your team's release cadence, CI maturity, and risk tolerance — or jump straight into designing a configuration and we'll refine from there.

Module 4 · Lesson 4

Production Discipline: Rollbacks, Post-Mortems, and the Human Gate

Speed is worthless without the infrastructure to reverse it. Production discipline is not a brake on AI-assisted development — it is what makes high velocity survivable.

What does it mean to maintain genuine human control when Claude is deploying code at the pace of automation?

On January 31, 2017, GitLab experienced a self-inflicted production database outage. An engineer, Yorick Peterse, was manually re-syncing a lagging database replica and accidentally ran rm -rf on the PostgreSQL data directory of the primary server instead of the secondary. About 300 GB of data was deleted. Six hours of data was permanently lost.

GitLab's public post-mortem — published in full, with a live Google Document shared during the incident — remains one of the most transparent in the industry. It revealed that of five backup mechanisms, none were functioning correctly: NFS snapshots had been disabled, S3 backups had been failing silently for months, and the database replication was delayed. The root cause was not the rm -rf command. It was the absence of verified, tested rollback procedures.

Rollback as a First-Class Design Requirement

In a Claude-assisted deployment pipeline, the speed of code production increases dramatically. A developer who previously shipped one feature per sprint can ship several per day. This acceleration is only safe if rollback is equally fast. If deploying takes 30 seconds but rolling back takes 45 minutes of manual steps, you have created an asymmetric risk: high velocity forward, slow recovery backward.

Rollback discipline has two components: deployment rollback (reverting the running artifact to a previous version) and database rollback (reversing schema or data migrations). Deployment rollback is mature — blue/green deployments, canary releases, and feature flags all support it. Database rollback is harder and frequently neglected.

When Claude writes a database migration, it must also write the matching down migration: the explicit reversal of every schema change. This is not automatic. Claude will write an up migration readily; it will write a down migration if asked explicitly. Your task specification must require it. In Flyway, this is the undo migration. In Liquibase, it is the rollback block. In Rails Active Record, it is the down method. In Alembic (Python), it is the downgrade function. The discipline is the same across tools: every migration ships with its own funeral.

Migration Rule

When prompting Claude to write a database migration, always include: "Write both the up migration and the down migration. The down migration must exactly reverse the up migration. Include a comment explaining what the down migration cannot recover if data has already been modified."
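The up/down pairing can be illustrated without any migration framework. The sketch below uses Python's built-in sqlite3 module and an in-memory database; the audit_log table and the helper table_names are hypothetical. Real tools such as Alembic or Rails wrap the same idea in their own conventions.

```python
import sqlite3


def up(conn: sqlite3.Connection) -> None:
    """Apply the change: create the audit_log table."""
    conn.execute(
        "CREATE TABLE audit_log ("
        "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, action TEXT NOT NULL)"
    )


def down(conn: sqlite3.Connection) -> None:
    """Reverse the change: drop the audit_log table.

    Cannot recover: any rows written to audit_log after `up` ran are
    destroyed when the table is dropped.
    """
    conn.execute("DROP TABLE audit_log")


def table_names(conn: sqlite3.Connection) -> set[str]:
    """List the tables currently present in the database."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    return {name for (name,) in rows}


# Round trip: the schema after up + down must equal the schema before.
conn = sqlite3.connect(":memory:")
before = table_names(conn)
up(conn)
assert "audit_log" in table_names(conn)
down(conn)
assert table_names(conn) == before
```

The round-trip assertion at the end is the part worth copying into CI: it proves the down migration was actually run at least once before production needed it.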

Post-Mortems in the Age of AI-Assisted Development

GitLab's 2017 post-mortem is a model of blameless analysis. The blameless post-mortem, popularized by Google's SRE practices and formalized in John Allspaw's writing at Etsy, proceeds from the assumption that engineers acted reasonably given the information and tools available. The goal is system improvement, not attribution of fault.

When Claude-generated code causes a production incident, the post-mortem must examine the human decision points: What was the commit review process? Did the PR description accurately describe the change? Did the pre-review pass happen? Were tests written and run? Was the rollback procedure tested before deployment? The AI did not fail alone — it failed at a specific point where the human oversight process had a gap.

The five-why technique, applied to AI-assisted incidents, reliably surfaces the same categories of root cause: insufficient commit scope review, inadequate PR description verification, missing down migrations, untested rollback paths, and over-broad permissions given to Claude's agentic context. These are not Claude's failures. They are process failures that Claude's speed made visible faster than human development pace would have.

Down Migration: A database migration script that exactly reverses the schema changes made by the corresponding up migration, enabling clean rollback of database state.

Blameless Post-Mortem: An incident analysis methodology assuming engineers acted reasonably given available information. Focuses on system and process improvement rather than individual fault attribution. Popularized by Google SRE and Etsy.

Blue/Green Deployment: A release strategy maintaining two identical production environments (blue and green). Traffic switches between them atomically, enabling instant rollback by redirecting to the previous environment.

The Human Gate: Non-Negotiable Checkpoints

Claude's agentic capabilities — the ability to write code, run tests, create PRs, and push to remote — can chain into a nearly continuous deployment pipeline. This is powerful and dangerous in equal measure. The question is not whether to use it, but where to insert irreducible human checkpoints.

There are three checkpoints that should never be automated away regardless of how confident your CI pipeline is: (1) merge to main — a human must read the PR description and confirm the diff matches it; (2) production deployment authorization — a human must explicitly approve the deployment, not just watch it happen; (3) post-incident analysis — a human must conduct the post-mortem, not summarize Claude's log analysis and call it done.
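The second checkpoint, explicit deployment authorization, can be modeled as a check that an approval exists, comes from a human, and names the exact commit being deployed. This is a minimal sketch of that rule; the Approval type and its fields are hypothetical and do not correspond to any particular CI system's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    approver: str
    is_human: bool
    commit_sha: str  # the exact commit the approver reviewed


def authorize_deploy(commit_sha: str, approvals: list[Approval]) -> bool:
    """Allow deployment only if a human approved this specific commit."""
    return any(a.is_human and a.commit_sha == commit_sha for a in approvals)
```

Binding the approval to a commit SHA means a late push to the branch invalidates it, so the human has approved what actually ships rather than an earlier version.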

These three checkpoints define what "human in the loop" actually means in a production engineering context. Everything else can be accelerated. These three cannot. GitLab's 2017 incident happened precisely because the humans in the loop assumed their backup systems were working — they were watching indicators without verifying ground truth. Human checkpoints are only effective when the human is genuinely engaging, not rubber-stamping automation.

The Production Discipline Principle

GitLab lost 300 GB of production data because backup procedures existed on paper but had never been verified in practice. The lesson for Claude-assisted development: every procedure you rely on — rollback, down migration, branch protection, post-mortem template — must be tested before you need it. The time to discover your rollback doesn't work is not during a production incident at 2 AM. Test your recovery paths on Monday morning.
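A first, cheap layer of verification is to check that the most recent backup artifact exists, is not suspiciously small, and is not stale. The sketch below assumes backups land as files on a path you can stat; the function name and thresholds are hypothetical. It does not prove the backup restores, so it complements a scheduled restore test rather than replacing one.

```python
import os
import time
from typing import Optional


def backup_problems(
    path: str,
    max_age_seconds: float,
    min_bytes: int = 1,
    now: Optional[float] = None,
) -> list[str]:
    """Return problems found with a backup file; an empty list means none."""
    if not os.path.isfile(path):
        return ["backup file is missing"]
    info = os.stat(path)
    current = time.time() if now is None else now
    problems = []
    if info.st_size < min_bytes:
        problems.append("backup file is smaller than expected")
    if current - info.st_mtime > max_age_seconds:
        problems.append("backup file is stale")
    return problems
```

Run on a schedule with alerting, a check like this turns a silently failing backup job into a visible one.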

Quiz · Lesson 4

Production Discipline, Rollbacks, and Human Oversight

1. In the 2017 GitLab database deletion incident, what was the actual root cause beyond the accidental rm -rf command?

Correct. GitLab's own post-mortem identified that backup systems existed in theory but had never been verified — a systemic process failure, not a single human error.

The real root cause was unverified backup procedures — five systems that appeared to exist but were all non-functional. Human error triggered the incident; process failure caused the loss.

2. When prompting Claude to write a database migration, what additional instruction is essential for production discipline?

Correct. Claude writes up migrations readily; down migrations require explicit instruction. Every migration must ship with its own reversal procedure.

The critical instruction is to require both up and down migrations explicitly. Claude will not automatically write the reversal — you must ask for it.

3. What is the core assumption of a blameless post-mortem?

Correct. Blameless post-mortems, as codified by Google SRE and John Allspaw's Etsy work, assume reasonable human action and focus on systemic improvement.

The blameless post-mortem assumes engineers acted reasonably given available information and focuses on finding system-level improvements rather than assigning blame.

4. Which three checkpoints should never be automated away in a Claude-assisted deployment pipeline?

Correct. These three checkpoints — merge approval, deployment authorization, and post-mortem conduct — define genuine human-in-the-loop control and must not be delegated to automation.

The three irreducible human checkpoints are: merge approval, production deployment authorization, and post-incident analysis. Everything else can be accelerated; these cannot.

5. When applying the five-why technique to AI-assisted production incidents, what categories of root cause are most commonly surfaced?

Correct. These are process failures: human oversight gaps that Claude's speed exposes sooner than a traditional development pace would.

Five-why analysis of AI-assisted incidents consistently reveals process failures: commit review gaps, PR description inaccuracies, missing down migrations, untested rollbacks, and excessive agent permissions.

Lab · Lesson 4

Practice: Production Discipline and Post-Mortem Analysis

Scenario: Conduct a Production Incident Post-Mortem

Your team deployed a Claude-generated feature to production. A database migration added a NOT NULL column to the users table without a default value. The deployment failed mid-migration on 30% of production servers, leaving the database in an inconsistent state. The down migration was never written. You're now conducting a post-mortem.

Ask the assistant to help you conduct a blameless five-why analysis of this incident. Identify the process gaps, write the corrective actions, and draft the missing down migration for the scenario. Then ask what branch protection rule or PR template change would prevent this class of incident.
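Before starting the analysis, it can help to reproduce the failure class on a scratch database. The sketch below uses Python's built-in sqlite3 module as a stand-in; the scenario does not name a database engine, and engines differ in when they reject this kind of change (SQLite refuses it outright, others only once the table holds rows). The column name signup_source and the helper try_add_column are hypothetical.

```python
import sqlite3


def try_add_column(conn, column_ddl):
    """Attempt ALTER TABLE users ADD COLUMN ...; return None on success, else the error text."""
    try:
        conn.execute(f"ALTER TABLE users ADD COLUMN {column_ddl}")
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


# Scratch database with one existing row, standing in for production data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")

# NOT NULL with no default: there is no value to give the existing row.
unsafe_error = try_add_column(conn, "signup_source TEXT NOT NULL")

# NOT NULL with a default: the existing row is backfilled with the default.
safe_error = try_add_column(conn, "signup_source TEXT NOT NULL DEFAULT 'unknown'")
backfilled = conn.execute("SELECT signup_source FROM users").fetchall()

print("unsafe:", unsafe_error)
print("safe:", safe_error, backfilled)
```

Running a migration against a populated scratch copy like this is the kind of check a PR template could require before a schema change is approved.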

Production Discipline Lab

AI Assistant

Welcome to the production discipline lab. I can help you conduct blameless post-mortems, write five-why analyses, draft down migrations, and design process improvements to prevent incidents in Claude-assisted workflows. Describe the incident or start with the five-why analysis of the scenario above.

Module Test

PR Workflows, Commits, and Production Discipline · 15 Questions · Pass at 80%

1. Which statement best defines an atomic commit?

Correct. Atomicity is about logical scope — one change, independently reversible.

Atomic commits are defined by logical scope, not authorship, tooling, or signing.

2. The Conventional Commits specification uses what structure for commit subject lines?

Correct. type(scope): subject — lowercase type, optional scope, colon-space, imperative mood.

Conventional Commits specifies: lowercase type, optional scope in parentheses, colon-space separator, imperative mood subject.

3. Knight Capital's $440 million trading loss in 2012 is partly attributable to what commit-history failure?

Correct. The "misc cleanup" commit obscured the Power Peg decommissioning — a canonical example of non-atomic commit harm in production.

The commit-history failure was bundling the Power Peg decommissioning inside a vaguely titled, multi-purpose commit — making reconstruction impossible under time pressure.

4. What git command should you run to verify a Claude-generated commit's scope before it lands?

Correct. git diff --staged shows exactly what is staged for the next commit — the definitive check before Claude's commit lands.

git diff --staged shows the staged diff — exactly what will be committed — and is the correct pre-commit verification command.

5. Heartbleed (CVE-2014-0160) passed code review and went undetected for two years primarily because of what?

Correct. Review saturation on a large, complex diff is the structural failure — Dr. Henson was competent; the review architecture was flawed.

The reviewer was highly qualified. The failure was structural: a large, complex diff exceeding reliable defect-detection capacity — the classic review saturation problem.

6. What is the SmartBear-documented threshold above which reviewer defect detection rates decline sharply?

Correct. SmartBear's 2011 Code Review Best Practices research identified approximately 400 lines as the saturation threshold.

SmartBear's research documented that reviewer defect detection drops significantly above approximately 400 meaningful lines of diff.

7. When Claude is operating in agentic mode with write access, what branching architecture minimizes blast radius?

Correct. The claude/* namespace gives autonomy in the branch and guarantees a human gate at the integration point.

The safe pattern is namespace-scoped write access (claude/*) with branch protection rules requiring human review before anything merges to main.

8. Why is squash-and-merge typically preferred over a merge commit for Claude-assisted PRs landing on main?

Correct. Agentic work generates noisy exploratory commits. Squash-and-merge gives main a clean, single commit while the PR retains the full history.

Squash is preferred because autonomous agents produce working/exploratory commits that are useful in the branch but would pollute main's history if preserved as-is.

9. The left-pad npm incident in 2016 illustrates what principle relevant to branch protection?

Correct. The absence of a branch protection rule requiring review of dependency changes meant the single-point-of-failure risk went undetected.

The left-pad lesson for branch governance: no human-review gate on dependency changes meant an unreviewed, cascading risk reached every dependent project simultaneously.

10. In a database migration written by Claude, what is the operator's critical responsibility beyond reviewing the up migration?

Correct. Down migrations must be explicitly required; do not count on Claude to generate them unprompted. Every migration must ship with its own reversal.

The critical omission to guard against: Claude writes up migrations readily but skips down migrations unless explicitly instructed. Every migration must ship with its own reversal procedure.

11. Which git merge strategy replays branch commits linearly without adding a merge commit, but rewrites commit SHAs?

Correct. Rebase and merge replays commits linearly — clean history, no merge commit, but SHA rewriting is the tradeoff.

Rebase and merge is the strategy that replays commits linearly without a merge commit — clean history at the cost of SHA rewriting.

12. What are the three required sections of the WHATWHY PR description template?

Correct. WHATWHY: What (summary), Why (motivation), How (key implementation decision — not a line-by-line walkthrough).

The WHATWHY template uses: What (one-sentence change summary), Why (the problem driving it), and How (the key implementation decision).

13. Which three checkpoints must remain as genuine human decisions in a Claude-assisted deployment pipeline?

Correct. These three checkpoints define genuine human-in-the-loop control — they cannot be safely delegated to automation regardless of CI confidence.

The three irreducible human checkpoints are: merge approval, deployment authorization, and post-mortem analysis. Everything else can be accelerated; these cannot.

14. What makes a human checkpoint genuinely effective rather than performative rubber-stamping?

Correct. GitLab's incident demonstrates that watching indicators is not the same as verifying ground truth. Human checkpoints require active verification, not passive observation.

GitLab's incident is the warning: humans watched backup indicators without verifying actual function. Effective checkpoints require genuine verification of ground truth, not observation of status lights.

15. When five-why analysis is applied to a production incident caused by Claude-generated code, what category of root cause does it most reliably surface?

Correct. Five-why analysis consistently reveals human oversight process gaps: failures that Claude's speed exposed sooner than a slower development pace would have.

The five-why technique reliably surfaces process gaps in human oversight: incomplete reviews, unverified descriptions, missing reversals, and over-broad permissions — not model-level failures.