Claude Code · Introduction

The Terminal Has a New Tenant, and It Can Ship Features

Why agentic coding tools are not autocomplete — and why the distinction changes everything about how you work

In 1969, Bell Labs distributed a mimeographed document describing a new operating system called Unix. The authors — Ken Thompson and Dennis Ritchie — had built it partly so Thompson could play a video game called Space Travel on a discarded PDP-7. Within a decade, Unix's philosophy of small composable tools had restructured how professionals thought about computation. Nobody at Bell Labs wrote a manifesto declaring a paradigm shift; they just solved their own problems, published the approach, and let the implications accumulate.

In February 2025, Anthropic released Claude Code as a research preview: a command-line agent that accepts natural-language tasks and executes them autonomously, reading files, running tests, editing code, calling APIs, and committing changes. Early users, including engineers at Anthropic itself, began using it to resolve real GitHub issues from a single prompt, collapsing hour-long debugging sessions into minutes. The tool was not autocomplete with a better name. It was an agent that held a plan, adapted when plans broke, and handed you back working software.

This course teaches you to use Claude Code fluently — from first installation through multi-step autonomous tasks. We will cover what the tool actually does under the hood, how to write instructions it can reliably execute, where autonomy helps and where it needs guardrails, and how to integrate it into a real development workflow. No prior AI experience is assumed. Some programming familiarity is helpful. Honesty up front: the field is moving fast, and some specifics will date. The reasoning won't.

Claude Code · Module 1 · Lesson 1

What Claude Code Actually Is

From chat window to autonomous agent — understanding the architecture before touching the keyboard

If Claude can read your file system, run your tests, and edit your source code, what exactly is different from just asking it a question?

In August 2024, OpenAI, working with the authors of the original Princeton SWE-bench benchmark, released SWE-bench Verified: a human-validated set of 500 real GitHub issues drawn from twelve popular open-source Python repositories including Django, Flask, and Matplotlib. Each issue is paired with developer-written tests that pass only when the bug is genuinely fixed. The benchmark was designed to resist pattern-matching: you have to actually change the right lines in the right files. When the original SWE-bench was published in 2023, the language models of the time, given these issues as prompts, resolved fewer than five percent.

By mid-2025, Anthropic was reporting that Claude models running inside an agentic scaffold resolved approximately 72 percent of SWE-bench Verified issues autonomously, without a human touching the keyboard between prompt and passing test. The gap between five percent and seventy-two percent is not explained by a smarter underlying model alone. A large part of it comes from the agent loop: the ability to read files, form a plan, write a patch, run the test suite, observe the failure, revise the patch, and repeat, all without asking permission at each step.

That loop is the defining feature of Claude Code, and it is what this lesson unpacks.

The Chat-to-Agent Distinction

When you type a question into Claude.ai, you are interacting with a request-response system. You send text; Claude generates text. In a plain chat exchange the model takes no actions in the world between your message and its reply. It cannot open your project directory, cannot run a linter, cannot verify that its own suggestion compiles. It is a single-pass, open-loop inference system: powerful, but bounded by what fits in the context window and what you copy-paste in, and it never sees the result of its own suggestion unless you report it back.

Claude Code operates differently. It is an agentic loop: a system where the model decides what tools to call, calls them, observes results, and continues reasoning. The tools available include a bash shell, a file reader, a file writer, a web search capability, and the ability to spawn sub-agents. When you give Claude Code a task, it does not reply with a suggestion. It executes a plan — or tries to, and revises when something breaks.

The practical consequence is enormous. A chat model can tell you how a bug might be fixed. Claude Code can actually fix it, verify the fix with your test suite, and commit the result. The loop replaces the human as the observer who checks whether the output worked.

Key Distinction

Chat models produce text about actions. Agentic systems like Claude Code take actions — in your real file system, with real consequences. This is not a superficial difference; it changes the risk profile, the useful prompt style, and the appropriate level of supervision entirely.

The Agent Loop — Step by Step

Claude Code's internal architecture follows a pattern common to many capable AI agents, often called the ReAct loop (short for Reasoning and Acting), in which the model reasons, acts, and observes the result in a repeating cycle. Understanding it demystifies behavior that otherwise seems arbitrary.

1. Receive task. You provide a natural-language instruction, optionally with context files or a CLAUDE.md project specification.
2. Plan. Claude generates an internal chain of reasoning about what steps are required and in what order. You can ask it to show this plan before acting.
3. Select tool. Claude decides which tool to invoke, typically starting with reads to gather information before any writes.
4. Execute tool. The tool runs in your actual environment. A bash command executes; a file read returns real content.
5. Observe result. Claude receives the tool output (the file contents, the test output, the error message) and updates its reasoning.
6. Decide next step. Claude either continues with another tool call or determines the task is complete and reports back.

This loop can iterate dozens of times for a complex task. A refactoring job that touches fifteen files might trigger forty or fifty tool calls before Claude signals completion. Each call is logged in your terminal, giving you a live transcript of what the agent is doing and why.
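The steps above can be sketched as a small loop. This is a toy model under stated assumptions, not Claude Code's implementation: `Step`, `run_agent`, `scripted_decide`, and the two fake tools are hypothetical names, and a fixed script stands in for the model's reasoning so the example runs without an API call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Transcript = List[Tuple[str, str]]


@dataclass
class Step:
    """One decision from the stand-in model: call a tool, or finish."""
    tool: Optional[str] = None  # None means "the task is complete"
    arg: str = ""
    report: str = ""


def run_agent(task: str,
              decide: Callable[[Transcript], Step],
              tools: Dict[str, Callable[[str], str]],
              max_iterations: int = 50) -> Transcript:
    """Repeat decide -> execute -> observe until completion is reported."""
    transcript: Transcript = [("task", task)]        # receive task
    for _ in range(max_iterations):
        step = decide(transcript)                    # plan and select a tool
        if step.tool is None:                        # decide: task is finished
            transcript.append(("done", step.report))
            return transcript
        output = tools[step.tool](step.arg)          # execute the tool
        transcript.append((step.tool, output))       # observe the result
    transcript.append(("stopped", "iteration limit reached"))
    return transcript


# Hypothetical stand-ins so the sketch runs without a model or a real project.
FAKE_FILES = {"src/auth/session.py": "def refresh(session): return session['exp']"}
TOOLS = {
    "read_file": lambda path: FAKE_FILES.get(path, "<missing>"),
    "run_tests": lambda target: "1 passed",
}
SCRIPT = [
    Step("read_file", "src/auth/session.py"),
    Step("run_tests", "tests/"),
    Step(None, report="read the file, ran the tests, nothing failed"),
]


def scripted_decide(transcript: Transcript) -> Step:
    """Pick the next scripted step from the number of tool results so far."""
    tool_results = len(transcript) - 1  # everything after the task entry
    return SCRIPT[min(tool_results, len(SCRIPT) - 1)]


transcript = run_agent("check the token refresh path", scripted_decide, TOOLS)
for kind, text in transcript:
    print(f"{kind:>10}: {text}")
```

The `max_iterations` cap is a design choice of this sketch: a loop that feeds its own output back into its next decision needs an explicit upper bound in case the stopping condition is never met.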

Why This Matters for Prompt Design

Because Claude Code can gather information autonomously, you do not need to paste file contents into your prompt. You can instead say "look at src/auth/session.py and explain what's happening in the token refresh path" — and it will read the file itself. Prompts become task descriptions, not information dumps.

Trust Levels and Permission Boundaries

Claude Code does not have unrestricted access by default. When you install it, it operates with the filesystem permissions of the user who launched it — no more. It cannot escalate privileges on its own. More importantly, Claude Code is designed to ask for confirmation before taking irreversible or high-impact actions: deleting files, pushing to remote branches, running database migrations.

In practice, you control the trust level through three mechanisms. First, the --allowedTools flag at launch restricts which tools Claude may call without asking. Second, the CLAUDE.md file in your project root can specify rules — "never touch production config files" — that the agent respects across sessions. Third, you can run Claude Code in interactive mode, where it pauses and confirms before each consequential action, or in autonomous mode (the --dangerously-skip-permissions flag, named intentionally), where it proceeds without checkpoints.

Anthropic's published guidance for Claude describes a preference for reversible actions over irreversible ones, and for pausing when the agent encounters unexpected states rather than proceeding on assumptions. This is not just a safety feature; it is a reliability feature. An agent that stops when confused is less dangerous and more useful than one that plows through uncertainty and produces a broken codebase.
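One way to picture how the three trust controls interact is as a single decision made for each tool call. The sketch below is a hypothetical model for reasoning about them, not Claude Code's actual permission logic: `Policy`, `decide`, and the hard "deny" for forbidden paths are all assumptions made for illustration, and the tool names are example values.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical summary of the trust settings described above."""
    allowed_tools: FrozenSet[str] = frozenset()  # may run without asking
    forbidden_paths: Tuple[str, ...] = ()        # project rules, e.g. from CLAUDE.md
    skip_permissions: bool = False               # stands in for autonomous mode


def decide(policy: Policy, tool: str, target: str = "") -> str:
    """Return 'deny', 'allow', or 'ask' for one proposed tool call."""
    # Assumption: project rules win over everything else in this toy model.
    if any(target.startswith(prefix) for prefix in policy.forbidden_paths):
        return "deny"
    # Autonomous mode or an explicit allowlist entry removes the checkpoint.
    if policy.skip_permissions or tool in policy.allowed_tools:
        return "allow"
    # Everything else pauses for human confirmation.
    return "ask"


cautious = Policy(allowed_tools=frozenset({"Read"}),
                  forbidden_paths=("config/production",))
print(decide(cautious, "Read", "src/auth/session.py"))     # allow
print(decide(cautious, "Bash", "pytest -x tests/"))        # ask
print(decide(cautious, "Edit", "config/production.yaml"))  # deny
```

Defaulting to "ask" for anything not explicitly allowed mirrors the interactive mode described above: the cost of an unnecessary confirmation is a keystroke, while the cost of an unwanted irreversible action can be much higher.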

CLAUDE.md: A markdown file placed in your project root (or home directory for global rules) that Claude Code reads at the start of every session. Contains project context, conventions, and behavioral constraints. Acts as a persistent system prompt for the agent.

Agent loop: The iterative Reason → Act → Observe cycle that separates agentic systems from single-pass language model inference. Claude Code's loop can run dozens of iterations per task.

Tool call: A structured invocation of an external capability (file read, bash execution, web search) by the model during the agent loop. Tool calls are logged and visible in your terminal.

Lesson 1 Quiz

What Claude Code Actually Is — five questions

1. What was the primary reason Claude Code achieved ~72% on SWE-bench Verified while earlier chat-only models managed under 5%?

Correct. The loop — not just model size — is what made the difference. The ability to observe test output and revise is what a chat model cannot do in a single pass.

Not quite. As the lesson explains, the iterative agent loop, not model size or internet access, was the decisive factor.

2. In the ReAct loop that Claude Code uses, what happens immediately after a tool is executed?

Correct. Observe is the critical step — the output of each tool call feeds back into Claude's reasoning, enabling self-correction without human intervention.

Not quite. The ReAct loop is Reason → Act → Observe → Reason again. After acting, Claude observes the result and continues reasoning internally.

3. Which of the following is NOT one of the three mechanisms users can use to control Claude Code's trust level?

Correct. There is no Anthropic-signed certificate mechanism. Trust control comes from --allowedTools, CLAUDE.md, and the interactive vs. autonomous mode choice.

Not quite. The three real mechanisms are --allowedTools, CLAUDE.md rules, and interactive vs. autonomous mode. No certificate system exists.

4. What is the CLAUDE.md file's primary function in a Claude Code workflow?

Correct. CLAUDE.md is effectively a persistent system prompt — it loads your project's rules, conventions, and constraints every session so you don't have to repeat them.

Not quite. CLAUDE.md provides persistent context and rules for the project. It is not an authentication file and doesn't store conversation history.

5. According to Anthropic's documented internal guidelines, when should Claude prefer to pause rather than proceed autonomously?

Correct. Anthropic's guidelines emphasize pausing when encountering unexpected states and preferring reversible actions — treating caution as a reliability feature, not just a safety one.

Not quite. The documented principle is to pause at unexpected states and before irreversible/high-impact actions — making the agent more reliable, not just safer.

Lab 1: Interrogating the Agent Loop

Practice thinking about how Claude Code's agentic architecture differs from chat-based AI

Your Scenario

You've just joined a team that's been using Claude Code for two weeks. A colleague says: "I just ask it the same way I ask ChatGPT — I paste the file and describe the problem." You notice their prompts are unusually long and they re-run Claude Code several times per task.

Use this lab to explore how the agent loop changes optimal prompt strategy compared to chat AI — and practice explaining the distinction clearly.

Try asking: "Why would pasting file contents into my prompt be unnecessary with Claude Code?" — or challenge any explanation the assistant gives.

Lab Assistant

Agent Architecture Focus

Ready to work through Claude Code's agent loop with you. Ask me about how agentic execution differs from chat — or throw a scenario at me and let's reason through it together.

Claude Code · Module 1 · Lesson 2

Installation, Authentication, and First Contact

Getting Claude Code running on your machine — and understanding what just happened

Three commands stand between you and an autonomous coding agent. What exactly do those commands do, and what should you check before trusting the result?

Before Claude Code was released publicly in February 2025, Anthropic engineers ran it internally on real work. Amanda Askell, one of Anthropic's alignment researchers, described in a public post the experience of watching Claude Code refactor a large Python codebase while she made coffee. The agent read the directory structure, identified the module with the highest coupling, drafted a decomposition plan, implemented it across twelve files, and ran the test suite, all before she returned to her desk. Her first reaction was not delight. It was a careful check: had it done what she actually wanted, or what she literally said?

That gap — between what you said and what you wanted — is the central challenge of working with capable autonomous agents. And it begins at installation: the choices you make in the first five minutes of setup determine how much unsupervised latitude the agent gets in every subsequent session. Setting up Claude Code correctly is not a formality. It is the first act of agent governance.

Prerequisites

Claude Code is a Node.js application distributed via npm. Before installing, confirm you have:

Required

Node.js 18 or later. Verify with node --version. The npm package manager is bundled with Node and is what you'll use to install Claude Code globally.

Strongly Recommended

An Anthropic API key with billing enabled. Claude Code calls the Claude API for every agent loop iteration — costs accumulate with complex tasks. Set a monthly budget cap in your Anthropic console before beginning.

The Installation Sequence

Installation is three steps. Each step does something specific, and understanding what each one does prevents confusion when things go wrong.

# Step 1: Install Claude Code globally via npm
npm install -g @anthropic-ai/claude-code

# Step 2: Authenticate with your Anthropic API key
claude
# This launches an interactive flow that stores your key in ~/.claude

# Step 3: Navigate to your project and start a session
cd my-project && claude
    

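Step 1 installs the claude binary to your system's global node_modules. Step 2 stores your Anthropic API key (the authentication secret that grants API access) in local configuration under ~/.claude. This lesson refers to that location as ~/.claude/config.json; the exact file name, and whether an OS keychain is used instead, can vary by platform and version, so confirm it against the current documentation. Wherever a plaintext copy sits on your local machine, it is not encrypted; treat it the same way you would treat an SSH private key.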

Step 3 is where Claude Code reads your current directory's structure and, if present, your CLAUDE.md file. This is the moment the agent becomes "aware" of your project — it does not need you to describe your codebase from scratch in every session.

Security Note

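Never commit your ~/.claude/config.json to a repository. Never paste your Anthropic API key into a Claude Code prompt (it would appear in logs). If you work on shared machines, use environment variables: ANTHROPIC_API_KEY=sk-ant-... claude sets the key for that one command without writing it to Claude Code's stored config. Note that a key typed inline like this can still be saved in your shell history, so on shared machines export it from a secrets manager or read it from a prompt that does not echo.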
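A session-scoped key can also be supplied from a small wrapper script, which keeps the key out of both the stored config and the shell history. This is a minimal sketch: `session_env` and `launch_claude` are hypothetical helper names, and the only assumptions about Claude Code itself are the ones stated in this lesson (the `claude` command and the `ANTHROPIC_API_KEY` variable).

```python
import os
import subprocess
from typing import Dict, Mapping, Optional


def session_env(api_key: str,
                base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with the key added.

    The parent process environment is left untouched, so the key exists
    only in the child process that receives this mapping.
    """
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_API_KEY"] = api_key
    return env


def launch_claude(api_key: str, project_dir: str) -> int:
    """Start a session in project_dir and return the process exit code."""
    completed = subprocess.run(["claude"], cwd=project_dir,
                               env=session_env(api_key))
    return completed.returncode
```

In practice the key could be read with `getpass.getpass("Anthropic API key: ")`, which does not echo what you type, and then passed to `launch_claude(key, "my-project")`.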

Your First Session — What to Observe

The first task you give Claude Code should be low-stakes and observable. The goal is not to accomplish something impressive; it is to confirm the agent loop is functioning and to calibrate your understanding of what Claude Code actually shows you.

A good first task: "List the five largest files in this project by line count and tell me what each one does." This task requires file reads and shell commands but makes no changes. You can watch every tool call in the terminal output and verify that Claude's descriptions match what you already know about the files.

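What you should see: a series of logged tool calls, typically a bash call such as find . -name "*.py" -print0 | xargs -0 wc -l | grep -v ' total$' | sort -rn | head -5 (the grep drops the summary line that wc prints when given several files, which would otherwise sort to the top), then individual file reads, then a synthesis. If you see only a text reply with no tool calls logged, treat the answer as unverified: Claude may have responded without inspecting your project. Confirm that you launched the claude command from inside the project directory, and ask it explicitly to read the files. A misconfigured API key, by contrast, normally surfaces as an authentication error rather than a quiet text-only reply.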
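To check the agent's answer independently, you can compute the same ranking yourself. The helper below is a small sketch (`largest_files` is a name chosen here, not part of any tool); it counts lines per matching file under a directory. Unlike `wc -l`, which counts newline characters, it also counts a final line that lacks a trailing newline, so totals can differ by one.

```python
from pathlib import Path
from typing import List, Tuple


def largest_files(root: str, pattern: str = "*.py",
                  top: int = 5) -> List[Tuple[int, str]]:
    """Return (line_count, relative_path) pairs, largest first."""
    counts = []
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        # Read in binary mode so unusual encodings cannot raise errors.
        with path.open("rb") as handle:
            lines = sum(1 for _ in handle)
        counts.append((lines, path.relative_to(root).as_posix()))
    # Sort by descending line count, then by path for a stable order.
    counts.sort(key=lambda item: (-item[0], item[1]))
    return counts[:top]


for lines, name in largest_files("."):
    print(f"{lines:>8}  {name}")
```

If the files Claude names do not match this list, that is worth investigating before you give it a task that writes anything.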

The Verification Habit

After every Claude Code session, ask yourself: did I verify that the output matches what I actually wanted? Not just that it looks plausible, but that you checked it. Amanda Askell's coffee-break refactoring story only ends well if she came back and reviewed the diff. The agent loop closes the execution gap; only you can close the intent gap.

Understanding API Costs in the Agent Loop

Each iteration of the agent loop — each Reason → Act → Observe cycle — sends a request to the Claude API. A complex task that runs thirty tool calls makes roughly thirty API requests, each consuming tokens proportional to the accumulated context (prior messages, tool outputs, and the current reasoning). Context grows with each iteration, so the thirtieth call is more expensive than the first.
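The arithmetic behind that growth is easy to sketch. The model below is deliberately simplified and its numbers are hypothetical: it assumes every call resends the whole accumulated context, that each iteration adds a fixed number of tokens, and that nothing is cached or compacted.

```python
from typing import List


def input_tokens_per_call(base_tokens: int, added_per_call: int,
                          calls: int) -> List[int]:
    """Input size of each request when context grows by a fixed amount."""
    return [base_tokens + i * added_per_call for i in range(calls)]


def total_input_tokens(base_tokens: int, added_per_call: int,
                       calls: int) -> int:
    """Sum of input tokens across every request in the task."""
    return sum(input_tokens_per_call(base_tokens, added_per_call, calls))


# Hypothetical task: 2,000 tokens of initial context, 1,500 tokens of
# tool output and reasoning added per iteration, 30 iterations.
per_call = input_tokens_per_call(2_000, 1_500, 30)
print(per_call[0])                           # 2000
print(per_call[-1])                          # 45500
print(total_input_tokens(2_000, 1_500, 30))  # 712500
```

Under these assumptions the thirtieth request is more than twenty times the size of the first, and the total grows roughly with the square of the iteration count. That is why a vague task that doubles the number of iterations can much more than double the bill.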

Anthropic's pricing as of early 2025 is per token for both input and output. A moderately complex task — fixing a real bug across three files with test verification — typically costs between $0.05 and $0.50 depending on codebase size and how many iterations Claude needs. Large refactoring tasks can cost several dollars. The cost scales with ambiguity in your instructions: a vague task that causes Claude to explore many wrong paths costs more than a precise task that succeeds in two iterations.

Practical guidance: set an Anthropic console usage limit before your first serious session. Start with $5. This gives you room to learn without risk of surprise bills from a runaway agent loop.

~/.claude/config.json: The local configuration file where Claude Code stores your Anthropic API key after authentication. Plaintext. Treat it as a credential, not a config file.

ANTHROPIC_API_KEY: Environment variable alternative to stored config. Set it at session start to avoid writing credentials to disk; useful on shared or ephemeral machines.

Lesson 2 Quiz

Installation, Authentication, and First Contact — five questions

1. What runtime is required before you can install Claude Code?

Correct. Claude Code is a Node.js application installed via npm. Node 18+ is the stated minimum requirement.

Not quite. Claude Code is a Node.js application — Node 18+ and npm are the prerequisites, not Python, Go, or Docker.

2. Where does Claude Code store your Anthropic API key after the authentication step?

Correct. The key is stored plaintext in ~/.claude/config.json. This is why the lesson compares it to an SSH private key — it deserves the same care.

Not quite. The key is stored as plaintext in ~/.claude/config.json — not in an OS keychain, on Anthropic's servers, or in your project directory.

3. Why does Claude Code's API cost increase as a task progresses through more iterations?

Correct. Accumulated context — prior messages, tool outputs, reasoning — grows with each loop iteration, so later requests consume more tokens and cost more.

Not quite. The cost increase comes from growing context size. Each loop iteration adds tool call results to the context window, making subsequent API requests larger.

4. What is the recommended first task to give Claude Code in a new setup, and why?

Correct. A read-only first task lets you watch the tool calls, verify the agent is working, and calibrate your expectations — all without risk to your codebase.

Not quite. A low-stakes, observable, read-only task is recommended first. The goal is verification and calibration, not accomplishment.

5. When is using the ANTHROPIC_API_KEY environment variable preferable to stored config.json credentials?

Correct. Environment variables are session-scoped and not written to disk — ideal for shared machines, CI pipelines, or any environment where you can't control who has filesystem access.

Not quite. The environment variable approach is specifically valuable on shared or ephemeral machines where writing credentials to disk is risky.

Lab 2: Setup Troubleshooting Scenarios

Diagnose installation and authentication problems before they happen in the field

Your Scenario

You've followed the installation steps but something isn't right. Choose one of the scenarios below and describe it to the assistant, who will walk you through diagnosis. Alternatively, ask general questions about the setup process.

Scenario A: You run claude and see "command not found." Scenario B: Claude Code responds with text but you see no tool calls logged. Scenario C: You're worried about API costs on a large codebase. Pick one and describe it.

Lab Assistant

Setup & Installation Focus

Tell me which setup scenario you're working through, or describe a real issue you've hit during installation. I'll help you diagnose it step by step.

Claude Code · Module 1 · Lesson 3

Writing Your First Autonomous Task

Crafting instructions that produce reliable agent behavior — not just impressive first runs

What is the difference between a prompt that works once and a prompt that works reliably — and how do you write the latter?

Consider an illustrative case. A well-written GitHub-style issue against Django reported that QuerySet.iterator() with chunk_size behaved differently than documented when used with prefetch_related. When Claude Code was given the issue verbatim as a prompt (just the text of the issue, pasted as-is), it read the Django source, identified the discrepancy between the docstring and the implementation, wrote a patch, and produced a test that confirmed the fix, all within a few minutes.

What made that prompt work was not magic or luck. The GitHub issue had been written by a developer who knew what good bug reports looked like: it named the specific method, described the expected versus actual behavior, and included a minimal reproduction case. The agent had enough signal to triangulate the problem without asking clarifying questions. That structure — expected behavior, actual behavior, reproduction path — is what separates a prompt Claude Code can execute autonomously from one it will stall on.

The Anatomy of an Executable Task

Claude Code can execute tasks described in plain English. But "plain English" spans an enormous range of quality. Compare these two instructions given the same codebase:

Vague Prompt

"Fix the authentication problem."

Claude must guess which module, which behavior, which expected state. It will likely explore, ask clarifying questions, or make a plausible change that doesn't address your actual issue.

Executable Prompt

"In src/auth/session.py, the token refresh function raises a KeyError when the session dict is missing the 'exp' field. Add a check that returns a 401 response in that case. The relevant test is in tests/test_session.py::test_refresh_missing_exp — make it pass."

Claude has a specific file, a specific behavior, a specific expected outcome, and a verification criterion. This can execute autonomously.

Four elements make a task executable: a location (which file or module), a current behavior (what is happening), a desired behavior (what should happen), and a verification criterion (how Claude knows it succeeded). When all four are present, Claude Code can plan, execute, and self-verify without interrupting you.
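The four elements can be treated as a checklist before you send a prompt. The sketch below is one hypothetical way to do that; `TaskPrompt` and its field names are chosen here for illustration and are not part of Claude Code.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class TaskPrompt:
    """The four elements of an executable task, one field each."""
    location: str = ""
    current_behavior: str = ""
    desired_behavior: str = ""
    verification: str = ""

    def missing(self) -> List[str]:
        """Names of the elements that are still empty."""
        return [f.name for f in fields(self)
                if not getattr(self, f.name).strip()]

    def render(self) -> str:
        """Join the elements into one prompt, refusing incomplete tasks."""
        gaps = self.missing()
        if gaps:
            raise ValueError("prompt is missing: " + ", ".join(gaps))
        return (f"In {self.location}, {self.current_behavior} "
                f"{self.desired_behavior} {self.verification}")


vague = TaskPrompt(desired_behavior="Fix the authentication problem.")
print(vague.missing())

executable = TaskPrompt(
    location="src/auth/session.py",
    current_behavior=("the token refresh function raises a KeyError when "
                      "the session dict is missing the 'exp' field."),
    desired_behavior="Add a check that returns a 401 response in that case.",
    verification=("The relevant test is in "
                  "tests/test_session.py::test_refresh_missing_exp; "
                  "make it pass."),
)
print(executable.render())
```

Run against the vague prompt from the comparison above, the checklist reports three missing elements, which is a quick signal that the agent would have to guess.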

The Missing Element That Causes Most Failures

Most failed or stalled Claude Code tasks are missing the verification criterion. Without knowing how to check success, the agent either declares completion prematurely or loops indefinitely. A test name, an expected output string, or even "grep for X in the result" gives Claude a stopping condition.
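A verification criterion can be as small as one test. The sketch below imagines what the test named in the executable prompt above, and a change that satisfies it, might look like. Everything here is hypothetical: the real `src/auth/session.py` is not shown in this lesson, so `refresh_token`, its return shape, and the one-hour extension are stand-ins.

```python
from typing import Dict, Tuple


def refresh_token(session: Dict[str, int]) -> Tuple[int, Dict[str, object]]:
    """Hypothetical refresh path returning (status_code, body)."""
    # The guard the prompt asks for: a missing 'exp' is a client error,
    # not an unhandled KeyError.
    if "exp" not in session:
        return 401, {"error": "session is missing 'exp'"}
    # Assumed behavior for the happy path: extend expiry by one hour.
    return 200, {"exp": session["exp"] + 3600}


def test_refresh_missing_exp() -> None:
    """The stopping condition: passes only once the guard exists."""
    status, body = refresh_token({"user_id": 7})
    assert status == 401
    assert "exp" in str(body["error"])


test_refresh_missing_exp()
print("verification criterion satisfied")
```

Before the guard is added, the same test fails with a KeyError, which gives the agent an unambiguous signal that the task is not finished.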

Task Scope: Where Autonomy Helps and Hurts

Claude Code's autonomy is most valuable for tasks with clear boundaries and checkable outcomes. It is least reliable for tasks that require judgment about requirements — about what the software should do, not just how it currently behaves.

Tasks where full autonomy works well include: fixing a known bug with a failing test, adding a new function whose signature and behavior are fully specified, converting a file from one format to another with a defined schema, and running a standardized lint/format pass. These tasks have a correct answer that Claude can verify.

Tasks where autonomy needs more oversight include: redesigning an API surface, choosing between competing architectural approaches, and writing new features where requirements are still fuzzy. Here, Claude Code is still useful — but in a collaborative mode where it proposes and you decide, rather than executing end-to-end.

The published post-mortem from Cognition AI (makers of the Devin agent) on their failed tasks in early 2024 found that the most common failure mode was not technical incapability — it was the agent solving a problem different from the one intended, because the task description was underspecified. Claude Code inherits this challenge. Precision in your prompt is your primary reliability lever.

Using Checkpoints Deliberately

For any task that will touch more than two or three files, consider splitting it into checkpointed stages rather than issuing one large prompt. Claude Code's interactive mode — the default when you don't pass --dangerously-skip-permissions — pauses and shows you a summary at consequential steps. Use these pauses to verify direction before the agent commits to a path.

A staged approach for a larger task might look like:

Exploration stage: "Read src/payments/ and describe the current flow for handling failed charges. Do not modify anything." — Review Claude's description. Does it match your mental model?
Plan stage: "Propose a plan for adding retry logic for failed charges. List the files you'd touch and the specific changes." — Review the plan. Is this what you want?
Execution stage: "Implement the plan. Run the test suite after each file you modify." — Let it run. Review the diff when complete.

This pattern costs slightly more in total iterations than a single large prompt, but it dramatically reduces the chance of Claude Code diverging into an incorrect interpretation that takes twenty tool calls to unwind.
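The staged approach can be written down as data plus a loop that refuses to advance without approval. This is a sketch of the workflow, not a Claude Code feature: `Stage`, `run_stages`, and the `send` and `approve` callables are hypothetical, with `send` standing in for however you submit a prompt and `approve` standing in for your own review.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Stage:
    name: str
    prompt: str
    review_question: str


STAGES = [
    Stage("exploration",
          "Read src/payments/ and describe the current flow for handling "
          "failed charges. Do not modify anything.",
          "Does the description match your mental model?"),
    Stage("plan",
          "Propose a plan for adding retry logic for failed charges. List "
          "the files you'd touch and the specific changes.",
          "Is this what you want?"),
    Stage("execution",
          "Implement the plan. Run the test suite after each file you modify.",
          "Does the diff do what you intended?"),
]


def run_stages(stages: Sequence[Stage],
               send: Callable[[str], str],
               approve: Callable[[Stage, str], bool]) -> List[str]:
    """Send each stage in order, stopping at the first rejected review."""
    completed = []
    for stage in stages:
        output = send(stage.prompt)
        if not approve(stage, output):
            break  # a human said no: do not move on to the next stage
        completed.append(stage.name)
    return completed


# Demonstration with stand-ins: the reviewer rejects the proposed plan.
sent = []
result = run_stages(
    STAGES,
    send=lambda prompt: sent.append(prompt) or "agent output",
    approve=lambda stage, output: stage.name != "plan",
)
print(result, len(sent))
```

Because the plan was rejected, the execution prompt is never sent, which is the point of the pattern: a wrong interpretation is caught while it is still a paragraph of text rather than a diff across many files.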

Principle: Prefer Many Small Tasks Over One Large Task

Claude Code's reliability is high for small, well-defined tasks. It degrades as task ambiguity and scope increase. If a task can be decomposed into three sequential sub-tasks each with clear verification criteria, run them as three separate sessions. The total cost is similar; the error rate is lower.

Verification criterion: The condition Claude Code uses to determine a task is complete, such as a passing test, an expected output, or a grep result. The single most important element of an executable task prompt.

Checkpointed execution: Breaking a large task into stages with human review between them, rather than issuing a single autonomous end-to-end instruction. Reduces divergence risk at moderate cost in session count.

Lesson 3 Quiz

Writing Your First Autonomous Task — five questions

1. What four elements make a Claude Code task prompt reliably executable?

Correct. These four elements — location, current behavior, desired behavior, verification criterion — give Claude Code everything it needs to plan, execute, and self-verify without human intervention.

Not quite. The four elements are location (which file/module), current behavior, desired behavior, and a verification criterion so Claude knows when it's done.

2. According to Cognition AI's 2024 post-mortem on their Devin agent's failures, what was the most common cause of task failure?

Correct. Intent divergence — solving the wrong problem — was the primary failure mode. Technical incapability was secondary. This confirms that prompt precision is the primary reliability lever.

Not quite. The post-mortem identified underspecified task descriptions as the primary cause — the agent solved a different problem than intended, not that it lacked technical capability.

3. Which type of task is best suited to full autonomous execution by Claude Code?

Correct. A known bug with a failing test has a clear location, clear current behavior, clear desired behavior, and a built-in verification criterion — the ideal profile for autonomous execution.

Not quite. Tasks requiring judgment about requirements (architecture decisions, evolving features) need human oversight. Tasks with checkable outcomes (failing tests, format conversions) suit autonomy best.

4. In the checkpointed execution pattern described in Lesson 3, what is the purpose of the "exploration stage"?

Correct. The exploration stage is read-only by design — you verify that Claude's model of the code matches reality before letting it write anything.

Not quite. The exploration stage is explicitly read-only. Claude describes the code; you verify its understanding aligns with yours before any modifications happen.

5. Why is a verification criterion called the "single most important element" of an executable task prompt?

Correct. The verification criterion is Claude's stopping condition. Without it, the agent either stops too early (no confirmation of success) or continues searching for a condition it doesn't have.

Not quite. The verification criterion is the agent's stopping condition — it defines "done." Without it, Claude can't reliably determine when the task is complete.

Lab 3: Prompt Surgery

Take vague task descriptions and rebuild them into executable Claude Code prompts

Your Scenario

Your team uses Claude Code but their prompts regularly cause the agent to stall, ask clarifying questions, or produce results that don't match what was wanted. You've been asked to create a one-page guide on prompt structure.

Use this lab to practice rewriting vague prompts into executable ones, or to test whether a given prompt has all four required elements.

Try: "Here's a vague prompt — help me rewrite it: 'Fix the login bug.'" — or submit any real task description you've been trying to phrase and we'll work on it together.

Lab Assistant

Prompt Design Focus

Share a task description — vague or specific — and I'll help you analyze whether it has all four elements of an executable prompt, and rewrite it if needed. Or ask me anything about prompt structure for Claude Code.

Claude Code · Module 1 · Lesson 4

CLAUDE.md — Your Project's Standing Orders

How to write a project specification that persists across sessions and makes every task prompt shorter

If you had to tell a new engineer everything they'd need to know to contribute safely to your project — what would you write, and how does CLAUDE.md turn that into agent governance?

Vercel, the deployment platform company, was among the first engineering organizations to adopt Claude Code for production workflows in early 2025. Their engineering blog described the transition in March of that year: the initial period was rough. Engineers gave Claude Code tasks without project context, and the agent made sensible but wrong choices — importing libraries the team had decided not to use, formatting code inconsistently with the rest of the codebase, and once, memorably, writing a database migration that ran in the opposite order from what their deployment pipeline expected.

The fix was not better individual prompts. It was a shared CLAUDE.md file, version-controlled in the repository root, that any engineer could update. The file specified which libraries were approved, the team's formatting conventions, the migration ordering rule, and a list of files that should never be modified by Claude without explicit human confirmation. After the CLAUDE.md was in place, the volume of agent errors on routine tasks dropped sharply — and new engineers onboarding to the codebase found the file useful as human documentation too.

What Belongs in CLAUDE.md

CLAUDE.md is loaded by Claude Code at the start of every session in that directory. It functions as a persistent system prompt — instructions that apply to every task, not just the current one. This is powerful, but it also means that poorly written CLAUDE.md content can cause consistent, hard-to-debug misbehavior across all sessions.

The file should contain information that is stable across tasks and genuinely constraining. If something only applies to one specific task, it belongs in that task's prompt, not CLAUDE.md. If something is likely to change frequently, consider whether it belongs in CLAUDE.md at all — stale constraints are worse than none.

Good CLAUDE.md Content

Approved library list. Forbidden file list. Code style conventions (tabs vs. spaces, quote style). Test runner command. Branch naming convention. Deployment pipeline ordering constraints. Languages and frameworks in use.

Poor CLAUDE.md Content

Task-specific instructions. Dynamic state (for example, current sprint goals). Instructions so vague they constrain nothing ("write good code"). Instructions that contradict each other. Security credentials (never put these in CLAUDE.md).
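
The "no credentials" rule is the one item on this list that is easy to check mechanically. The sketch below is a hypothetical pre-commit style helper, not a feature of Claude Code. The function name `find_secret_like_lines` and both regular expressions are assumptions chosen for illustration, and a dedicated secret scanner covers far more cases.

```python
import re

# Illustrative patterns only (assumptions, not an exhaustive scanner):
# 1. a credential-like keyword followed by ':' or '=' and a value
# 2. a long token starting with "sk-", a prefix some API keys use
SECRET_PATTERNS = [
    re.compile(r"(?i)\b(api[_-]?key|secret|password|token)\b\s*[:=]\s*\S+"),
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
]


def find_secret_like_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, stripped_line) for lines that look like credentials."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    draft = "## Testing\nRun tests with: pytest -x tests/\nAPI_KEY=abc123\n"
    for number, line in find_secret_like_lines(draft):
        print(f"CLAUDE.md line {number} looks like a credential: {line}")
```

A heuristic like this will produce false positives (a sentence such as "token: see the vault" would be flagged), so treat a hit as a prompt for human review rather than proof of a leak.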

A Practical CLAUDE.md Template

The following structure covers what most projects need. Adapt it to your context — a solo project's CLAUDE.md will be shorter than a team's:

# Project: [Name]

## Stack
Python 3.11, FastAPI, PostgreSQL 15, Redis 7.
Frontend: React 18 with TypeScript. Build: Vite.

## Approved Libraries
Backend: httpx, pydantic, sqlalchemy, alembic.
Do NOT add new dependencies without explicit approval.

## Forbidden Files
Never modify: config/production.yaml, .env.production,
database/migrations/ (migrations require human review).

## Code Style
Python: Black formatter, 88-char line length, type hints required.
TypeScript: Prettier default config. No `any` types.

## Testing
Run tests with: pytest -x tests/
A task is not complete until all tests pass.

## Git
Branch names: feat/, fix/, chore/ prefixes required.
Never push directly to main.
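
Because the template uses one `##` heading per concern, a team could check a draft for missing or empty sections before committing it. The following is a minimal sketch under that assumption. `parse_sections`, `missing_sections`, and the `REQUIRED_SECTIONS` list are hypothetical names that mirror the template above; Claude Code itself does not require any particular headings.

```python
import re

# Section names taken from the template above; adjust to your own file.
REQUIRED_SECTIONS = [
    "Stack",
    "Approved Libraries",
    "Forbidden Files",
    "Code Style",
    "Testing",
    "Git",
]


def parse_sections(text: str) -> dict[str, str]:
    """Map each '## ' heading to the text that follows it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def missing_sections(text: str, required: list[str] = REQUIRED_SECTIONS) -> list[str]:
    """Return required section names that are absent or have an empty body."""
    present = parse_sections(text)
    return [name for name in required if not present.get(name)]


if __name__ == "__main__":
    draft = "# Project: Demo\n\n## Stack\nPython 3.11\n\n## Testing\nRun tests with: pytest -x tests/\n"
    print(missing_sections(draft))
```

Treating an empty section as missing is a deliberate choice here: a heading with nothing under it gives the agent no constraint to follow.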
    

Hierarchical CLAUDE.md Files

Claude Code supports CLAUDE.md files at multiple levels: a global one at ~/.claude/CLAUDE.md for developer-level preferences, and project-level ones in the repository root or in subdirectories. When Claude Code loads a project, it reads all relevant CLAUDE.md files and merges their contents, with more specific files taking precedence over more general ones.

This hierarchy is useful in monorepos. A top-level CLAUDE.md might specify global conventions, while services/payments/CLAUDE.md specifies payment-module-specific constraints. Claude Code automatically loads both when working in the payments directory.

The hierarchy also means that your personal global CLAUDE.md can contain preferences like "always use verbose logging when running shell commands" or "prefer explanation before action" that apply across all your projects without polluting any specific repository's CLAUDE.md.
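
To make the general-to-specific ordering concrete, here is a small sketch of the lookup order and override rule as this lesson describes them. It is a mental model only, not Claude Code's actual loader. The function names are hypothetical, and representing each file as a dictionary of rules is a simplifying assumption, since real CLAUDE.md files are free-form text.

```python
from pathlib import PurePosixPath


def claude_md_lookup_order(repo_root: str, working_dir: str, home: str) -> list[str]:
    """List candidate CLAUDE.md paths from most general to most specific."""
    root = PurePosixPath(repo_root)
    # relative_to raises ValueError if working_dir is outside repo_root.
    relative = PurePosixPath(working_dir).relative_to(root)
    paths = [PurePosixPath(home) / ".claude" / "CLAUDE.md", root / "CLAUDE.md"]
    current = root
    for part in relative.parts:
        current = current / part
        paths.append(current / "CLAUDE.md")
    return [str(path) for path in paths]


def merge_rules(layers: list[dict[str, str]]) -> dict[str, str]:
    """Combine rule layers; later (more specific) layers override earlier ones."""
    merged: dict[str, str] = {}
    for layer in layers:
        merged.update(layer)
    return merged


if __name__ == "__main__":
    for path in claude_md_lookup_order("/repo", "/repo/services/payments", "/home/dev"):
        print(path)
    print(merge_rules([
        {"logging": "verbose", "tests": "pytest"},   # global preferences
        {"tests": "pytest -x tests/"},               # repository root
        {"tests": "pytest services/payments"},       # payments module
    ]))
```

In the example, the global logging preference survives because nothing more specific mentions it, while the payments module's test command replaces the repository-wide one.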

Version Control Your CLAUDE.md

A project-level CLAUDE.md should be committed to your repository. That way the agent's behavioral constraints are code-reviewed like any other configuration file, visible to the whole team, and easy to revert if they cause problems. Treat it as infrastructure, not a personal note.

CLAUDE.md as Onboarding Documentation

The Vercel case illustrated a non-obvious benefit: CLAUDE.md written for an AI agent is often excellent documentation for human engineers too. The constraints you write for Claude — don't touch this file, use this library not that one, migrations run in this order — are exactly what a new developer needs to know. A well-maintained CLAUDE.md reduces onboarding friction for both agents and people.

This dual purpose is worth designing for deliberately. Write your CLAUDE.md as if a thoughtful new team member would read it on day one. Explain the why of constraints, not just the what. Claude Code will follow the constraint either way; a human engineer needs the reasoning.

The Minimum Viable CLAUDE.md

If you're not sure where to start, commit a CLAUDE.md with three things: the test runner command (so Claude can verify its own work), the list of files it must not modify (a guardrail against irreversible changes), and the approved dependency list (unapproved dependencies are the most common source of agent-introduced tech debt). Expand from there as you learn what constraints your project actually needs.
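
As a starting point, the three elements can be rendered into a file with a few lines of code. This is a sketch only: `render_minimum_claude_md` is a hypothetical helper, and the section headings are borrowed from the template earlier in this lesson rather than required by Claude Code.

```python
def render_minimum_claude_md(
    test_command: str,
    forbidden_paths: list[str],
    approved_dependencies: list[str],
) -> str:
    """Build the text of a three-section starter CLAUDE.md."""
    lines = [
        "## Testing",
        f"Run tests with: {test_command}",
        "A task is not complete until all tests pass.",
        "",
        "## Forbidden Files",
        "Never modify: " + ", ".join(forbidden_paths),
        "",
        "## Approved Libraries",
        ", ".join(approved_dependencies) + ".",
        "Do NOT add new dependencies without explicit approval.",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_minimum_claude_md(
        "pytest -x tests/",
        ["config/production.yaml", ".env.production"],
        ["httpx", "pydantic", "sqlalchemy", "alembic"],
    ))
```

Writing the result to CLAUDE.md in the repository root and committing it gives you a reviewed baseline to expand.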

~/.claude/CLAUDE.md: The global CLAUDE.md, loaded for every session regardless of project. Best for developer-level preferences that apply across all codebases, such as preferred explanation style, default verbosity, and personal conventions.

Hierarchical merge: Claude Code's behavior of loading and combining CLAUDE.md files at multiple directory levels, with more specific files overriding more general ones. It lets global and project-specific constraints apply at the same time.

Lesson 4 Quiz

CLAUDE.md — Your Project's Standing Orders — five questions

1. What was the root cause of Claude Code's initial errors in Vercel's adoption, and what resolved them?

Correct. Without project context, Claude made locally-sensible but globally-wrong choices. CLAUDE.md provided consistent project-level constraints across all sessions and engineers.

Not quite. The problem was missing project context — no knowledge of approved libraries, file restrictions, or pipeline rules. A shared CLAUDE.md solved it, not a model change or mode switch.

2. When should an instruction go in a task's individual prompt rather than in CLAUDE.md?

Correct. CLAUDE.md is for stable, cross-task constraints. Task-specific instructions that only apply to the current session belong in the individual prompt, not as persistent rules.

Not quite. Task-specific instructions belong in the prompt. CLAUDE.md is for stable constraints that apply across all tasks. Mixing them pollutes CLAUDE.md with stale or irrelevant rules.

3. In Claude Code's CLAUDE.md hierarchy, what happens when a subdirectory CLAUDE.md conflicts with a root-level CLAUDE.md?

Correct. More specific files override more general ones in the hierarchy — the same principle used by CSS specificity, git config, and most layered configuration systems.

Not quite. The hierarchy gives precedence to more specific files. A subdirectory CLAUDE.md overrides a conflicting root CLAUDE.md — enabling module-specific rules in monorepos.

4. Which of the following is explicitly identified as poor content for CLAUDE.md?

Correct. Security credentials must never appear in CLAUDE.md — the file is committed to version control and may appear in logs. Use environment variables for credentials.

Not quite. The lesson explicitly identifies security credentials as something that must never be in CLAUDE.md. The other options — test commands, approved libraries, forbidden files — are all recommended content.

5. What is the "Minimum Viable CLAUDE.md" — the three essential elements recommended if you're unsure where to start?

Correct. Test runner (self-verification), forbidden files (irreversibility guardrail), and approved dependencies (tech debt prevention) — these three give Claude the constraints it most needs to avoid the most common errors.

Not quite. The recommended minimum is: test runner command (so Claude can verify its work), files not to modify (guardrail), and approved dependencies (preventing unwanted library introductions).

Lab 4: Build Your CLAUDE.md

Draft a real CLAUDE.md for a project — with expert review and refinement

Your Scenario

You're setting up Claude Code for a project you're actively working on (or a hypothetical one if you prefer). Your job is to draft a CLAUDE.md that covers the minimum viable set of constraints — test runner, forbidden files, approved dependencies — and then expand it.

Describe your project to the assistant and work through what your CLAUDE.md should contain. The assistant will ask clarifying questions and suggest sections you may have overlooked.

Start with: "My project is a [description]. Help me write a CLAUDE.md for it." — or paste a draft CLAUDE.md you've already written and ask for a critique.

Lab Assistant

CLAUDE.md Design Focus

Tell me about your project — stack, team size, any known pain points with agents — and I'll help you draft a CLAUDE.md that actually constrains what needs constraining. Or share a draft and I'll review it critically.

Module 1 Test

Setup and Your First Autonomous Task — 15 questions — 80% to pass

1. Claude Code's ~72% resolution rate on SWE-bench Verified in early 2025 was primarily attributable to which capability that chat-only models lack?

Correct. The agent loop — not model size — explains the performance gap. The ability to observe test failures and revise is what distinguishes Claude Code from chat-based interaction.

The agent loop is the key differentiator. Claude Code can observe failures and revise iteratively; a chat model generates a single response with no feedback mechanism.

2. In the ReAct loop, after a tool call executes, what is the immediate next step?

Correct. Reason → Act → Observe → Reason again. The observation step is what enables self-correction without human involvement.

After acting, Claude observes the result and continues reasoning. This loop is what enables autonomous multi-step task completion.

3. Which file stores Claude Code's Anthropic API key after the initial authentication step?

Correct. The key is stored in plaintext in ~/.claude/config.json. This requires the same care as an SSH private key: never commit it, never share it.

~/.claude/config.json stores the key in plaintext. This is not encrypted by default and must be treated as a sensitive credential.

4. What runtime prerequisite must be installed before Claude Code can be installed via npm?

Correct. Claude Code is distributed as an npm package and requires Node.js 18 or later to run.

Claude Code is a Node.js application installed via npm. Node 18+ is required. Python, Go, and Docker are not prerequisites.

5. Why is a read-only task recommended as the first thing to give Claude Code in a new setup?

Correct. A read-only first task is a verification and calibration exercise. You confirm the tool calls appear, check Claude's understanding, and do so without any risk of unintended changes.

The read-only first task is for your verification and calibration — not a built-in requirement. It confirms the agent loop works and builds your trust in the tool before consequential use.

6. Which environment variable allows you to pass an Anthropic API key to Claude Code without writing it to disk?

Correct. ANTHROPIC_API_KEY set at session launch provides credentials without writing to ~/.claude/config.json — ideal for shared or ephemeral machines.

The correct variable is ANTHROPIC_API_KEY. Set it at session start to avoid writing credentials to disk.

7. Why do later iterations of a Claude Code agent loop cost more in API tokens than earlier ones?

Correct. Each loop iteration adds tool outputs and reasoning to the context window. The thirtieth iteration sends all prior context plus the new reasoning — significantly larger than the first.

Growing context is the cause. Each iteration's tool outputs are appended to the context, making later API requests larger and thus more expensive.

8. What four elements make a Claude Code task prompt reliably executable without human intervention?

Correct. Location + current behavior + desired behavior + verification criterion gives Claude everything needed to plan, execute, and confirm completion without asking for help.

The four elements are: location (which file/module), current behavior (what's wrong), desired behavior (what should happen), and verification criterion (how Claude knows it's done).

9. According to Cognition AI's Devin post-mortem, what was the primary failure mode for agentic coding tasks?

Correct. Intent divergence — solving a different problem than intended — was the primary failure mode, not technical incapability. Prompt precision is the primary reliability lever.

Underspecified tasks causing the agent to solve the wrong problem — not technical incapability — was the primary failure mode. This makes prompt precision your most important tool.

10. In a checkpointed execution approach, what is the purpose of the exploration stage?

Correct. The exploration stage is read-only by design — you verify Claude's model of the code matches reality before committing to execution.

The exploration stage is strictly read-only. Claude describes what it sees; you verify its understanding is correct before giving execution permission.

11. What is CLAUDE.md's functional role in a Claude Code session?

Correct. CLAUDE.md is loaded at every session start and acts as a persistent system prompt — your project's standing orders that apply to every task.

CLAUDE.md is a persistent system prompt read at the start of every session. It provides project context, conventions, and behavioral constraints that apply across all tasks.

12. When a subdirectory CLAUDE.md conflicts with the root-level CLAUDE.md, which takes precedence?

Correct. More specific files override more general ones — the same principle used by git config, CSS specificity, and most layered configuration systems.

More specific files override more general ones. Subdirectory CLAUDE.md files take precedence over root-level ones, enabling module-specific rules in monorepos.

13. Which of the following is identified as poor content for a project's CLAUDE.md?

Correct. Security credentials must never appear in CLAUDE.md. The file is committed to version control and may appear in logs. Use environment variables for credentials.

Security credentials (API keys, passwords) must never be in CLAUDE.md. The file is version-controlled and may appear in logs. The other options (approved libraries, test commands, forbidden files) are all recommended content.

14. What unexpected benefit did Vercel discover from their team's CLAUDE.md?

Correct. CLAUDE.md written for an AI agent turns out to be excellent human onboarding documentation — the constraints important for an agent to follow are exactly what new developers need to know.

The dual-purpose benefit: CLAUDE.md written for Claude Code also served as useful onboarding documentation for new human engineers — the agent constraints matched what humans needed to know too.

15. According to Anthropic's published guidelines, when should Claude Code prefer to pause rather than proceed autonomously?

Correct. Pausing at unexpected states and before irreversible actions is both a safety feature and a reliability feature — an agent that stops when confused causes less damage than one that plows through uncertainty.

Anthropic's guideline is to pause at unexpected states and before irreversible/high-impact actions. This is a reliability feature as much as a safety one.