When Anthropic publicly released tool use for Claude in the spring of 2024, the documentation described it as giving the model "the ability to interact with external services and systems." What changed wasn't Claude's knowledge β it was Claude's reach. A language model that previously could only reason about the world could now query a live database, run a calculation, or send an HTTP request.
The pattern had precedent: OpenAI had shipped function calling for GPT-4 in June 2023. But Anthropic's implementation, released as the tools parameter in the Messages API, offered a structured JSON schema approach that developers quickly adopted for complex multi-step agent workflows.
Claude's training gives it a static snapshot of knowledge up to a cutoff date. It cannot browse the web, check the current time, query your database, or execute code in a sandboxed environment β unless you explicitly provide those capabilities as tools.
Tool use is the mechanism by which you extend Claude's effective capabilities beyond inference. Instead of answering "what is the current stock price of NVDA?" from memory (which would be wrong), Claude can call a get_stock_price function you define, receive the real-time result, and incorporate it into a coherent response.
Claude does not execute your tools. It requests that a tool be called by outputting a structured tool_use content block. Your code receives that request, runs the actual function, and sends the result back. Claude is the planner; your application is the executor.
Tool use is not a single API call β it is a dialogue. Understanding the turn structure is essential before writing a single line of code.
Every tool is described using a JSON Schema subset. The three required fields are name, description, and input_schema. Claude reads the description to decide when to call the tool and reads the schema to know what parameters to provide.
The description field is the most important part of a tool definition. Claude uses it to decide whether to call the tool at all. Be specific about when the tool should and should not be used. Anthropic's own documentation notes: "Claude will use descriptions to determine when and how to use tools, so good descriptions can significantly improve tool use performance."
The two-turn pattern (Claude requests β you execute β Claude finalises) is not an implementation detail to work around. It is a deliberate safety boundary. Claude cannot directly execute code or call APIs β it can only request that your application do so. This means you retain full control over what actually runs. You can validate inputs, rate-limit calls, and log every tool invocation before it happens.
This separation becomes especially important when building autonomous agents. Claude might chain multiple tool calls across many turns, but at every step, your orchestration layer is the gatekeeper. Anthropic explicitly designed it this way to maintain human oversight in agentic workflows.
In this lab you'll work with an AI coach to practice writing and critiquing tool definitions. Describe tools you want to build, discuss schema design decisions, and get feedback on your descriptions.
Have at least 3 exchanges to complete this lab.
When Anthropic released the claude-3-opus-20240229 model in March 2024, it arrived with tool use baked in. Early adopters on the Anthropic Discord documented patterns for the multi-turn loop almost immediately. The core challenge wasn't the API itself β the structure is clean β but grasping why you need to append both the assistant's tool_use turn and your tool_result turn before the final call. Skip either and Claude lacks the context to synthesise a coherent answer.
The official Python SDK handles authentication, retry logic, and response parsing. Install it once and import the client.
This is the step most developers get wrong on first attempt. You must append both the assistant's response (containing the tool_use block) and a new user message with the tool_result. Both are required for the conversation to make sense to Claude.
Sending only the tool_result user message without first appending the assistant's tool_use turn causes an API error: "messages: roles must alternate between 'user' and 'assistant'." Both turns are mandatory.
Every tool_use block has a unique id like toolu_01XxyZβ¦. Your tool_result must reference the exact same ID in its tool_use_id field. If Claude requested multiple tools in a single response, you must return a separate tool_result block for each, all within the same user message.
Claude can request multiple tools in a single response (e.g., looking up weather for three cities simultaneously). In this case, response.content will contain multiple tool_use blocks. You execute all of them and return all results in one user message with multiple tool_result content blocks β one per tool_use_id.
If your function fails, you should still return a tool_result β but with "is_error": true and a descriptive error message as content. This tells Claude the tool call failed and lets it decide whether to retry, use a different approach, or inform the user gracefully. Silently omitting a tool result breaks the conversation.
Practice diagnosing and fixing tool call issues. Describe a broken implementation, trace through what went wrong, or ask the coach to walk you through building a complete tool call loop from scratch.
Complete at least 3 exchanges to finish this lab.
When developers at companies like Sourcegraph and Notion built Claude-powered coding and productivity assistants in 2024, they discovered a friction point: Claude sometimes chose to answer from its training data when it should have called a live search tool, and other times tried to call tools when a simple conversational reply was appropriate. Anthropic's tool_choice parameter, documented in the v2 API reference, gives you precise control over this decision.
The tool_choice parameter is passed alongside tools in your API request. It accepts an object with a type field that can be one of three values:
| type | Behaviour | When to Use |
|---|---|---|
| "auto" | Claude decides whether to call a tool or respond directly. Default when tools are provided. | General-purpose assistants where tool use is optional context enrichment. |
| "any" | Claude must call at least one tool from the provided list. | Structured data extraction pipelines; cases where you always need a tool invocation. |
| "tool" | Claude must call the specific named tool. | Guaranteed schema extraction; forcing a particular function regardless of query. |
To prevent Claude from using tools even when they're defined (e.g., for a sub-prompt that should only reason), omit tools from that specific request, or pass an empty array. There is no explicit "none" tool_choice type β just don't include tools at all.
By default, Claude may request multiple tools in a single response (parallel tool use). This improves latency for independent operations but can complicate your orchestration logic. To force sequential, single-tool calls, set disable_parallel_tool_use: true inside the tool_choice object.
One of the most powerful uses of tool_choice: {type: "tool"} is forcing Claude to extract structured data from unstructured text. Define a tool whose schema matches your desired output format, force Claude to call it, and read the JSON inputs as your extracted data β you never even need to execute the tool.
When using tools purely for structured extraction, you don't need to complete the multi-turn loop. Stop after reading response.content[0].input. There's no external function to execute β the "tool" is just a schema that shapes Claude's output into parseable JSON.
When you force a specific tool, Claude immediately produces a tool_use block β even if the user prompt has nothing to do with that tool. The model will infer reasonable values or fill required fields with null if there's truly no relevant information. This behaviour is predictable and useful for pipeline stages where you always need a structured response.
Practice designing forced tool_choice extraction pipelines. Describe data you want to extract from unstructured text, and the coach will help you design the tool schema and API configuration to reliably get structured JSON output.
Complete at least 3 exchanges to finish this lab.
In Anthropic's published model spec, the company explicitly addresses agentic AI: "In agentic contexts, Claude must apply particularly careful judgment about when to proceed versus when to pause and verify with the operator or user, since mistakes may be difficult to reverse, and could have downstream consequences within the same pipeline." This guidance shaped how developers architected Claude-powered agents throughout 2024 β not as autonomous systems that run until done, but as collaborative loops with human checkpoints.
An agentic loop is a while-loop that continues until Claude returns stop_reason: "end_turn" (or hits a maximum step count). At each iteration, you check whether Claude wants to use tools, execute them, and feed results back. This is how complex tasks β "research this topic and write a report" β get broken into steps.
Without a step limit, a poorly designed agent can loop indefinitely, accumulating API costs and latency. Set a conservative limit (10β25 steps for most tasks) and surface a "max steps reached" condition to the user. This is not a failure mode to hide β it is important feedback that your tool definitions or prompts may need refinement.
Anthropic's safety guidance recommends that agentic Claude implementations request only the permissions they need, avoid storing sensitive data beyond immediate use, prefer reversible over irreversible actions, and err on the side of doing less and confirming when uncertain about scope.
In practical terms: don't give your agent a "delete_file" tool unless deletion is truly required. If deletion is required, confirm with the user before executing it. Prefer tools that read over tools that write, and tools that write over tools that delete.
When Claude's tools include reading external content β web pages, documents, emails β that content can contain adversarial instructions attempting to hijack the agent. This is known as prompt injection. Anthropic's documentation explicitly warns about it in agentic contexts.
Mitigations include: using separate system prompt instructions that reinforce Claude's actual goals, validating tool outputs before feeding them back to Claude, being skeptical of tool results that try to redefine Claude's task or permissions, and never including raw untrusted content directly in your system prompt.
In October 2024 Anthropic released Claude's computer use capability in public beta, extending the tool pattern to controlling a desktop environment. The same principles apply at larger scale: Claude requests actions (move_mouse, click, type), your code executes them, results come back as screenshots. The architecture is identical β only the stakes are higher, making the minimal footprint and human-checkpoint principles even more critical.
When using streaming (stream=True), tool use events arrive as a sequence of delta events: content_block_start, content_block_delta (with input_json_delta for building up the JSON), and content_block_stop. You must buffer the JSON deltas and parse the complete input after content_block_stop. Attempting to parse partial JSON deltas mid-stream will fail.
Work with the coach to design or critique an agentic tool loop. Describe the agent you want to build, and get guidance on tool selection, step limits, human checkpoint placement, and prompt injection defences.
Complete at least 3 exchanges to finish this lab.