🎯 Advanced · Lesson 1 of 4

Sandbox Architecture

How AI systems isolate code execution — containers, VMs, and the engineering decisions that keep millions of users safe.

In March 2023, OpenAI launched Code Interpreter (later renamed Advanced Data Analysis) in ChatGPT. The feature let users upload CSV files and run Python directly in the chat. Within days, security researchers including Johann Rehberger documented that the sandbox was stricter than it appeared: outbound network calls were blocked, the filesystem was ephemeral, and each session spun up in an isolated container that was discarded after the conversation ended. The boundary was not accidental — it was a deliberate architectural decision made before a single user touched the feature.

OpenAI's engineering blog noted the team spent months on isolation guarantees before launch, because a single misconfigured sandbox allowing network egress could have enabled data exfiltration at massive scale. The architecture question — what a sandbox permits versus what it prevents — turned out to be the entire product's safety story.

What a Sandbox Actually Is

A sandbox is an isolated execution environment where code runs with deliberately constrained access to host resources. The word comes from the idea of a physical sandbox: children play freely inside it, but sand stays inside the box. For AI code execution, the "sand" is the generated code, and the "box" is an environment engineered to prevent that code from affecting anything outside it.

Modern sandboxes used for AI code execution stack several layers of isolation. At the outermost layer, container runtimes like Docker use Linux kernel namespaces to give each execution its own view of the process tree, network interfaces, and filesystem. Inside that, seccomp (secure computing mode) filters restrict which system calls the process is allowed to invoke — blocking dangerous calls like ptrace, mount, and raw socket creation even if the code tries them. Some implementations add a hypervisor layer beneath the container, running the container inside a lightweight VM (using tools like Firecracker, AWS's open-source microVM technology) so that a container escape still lands inside a VM rather than on the host.

Google's Colab, which powers Jupyter notebooks for tens of millions of users, uses a similar principle: each runtime is a separate VM allocated per user session, preventing cross-user interference. When your Colab session times out, the VM is destroyed — not paused, not saved, destroyed — ensuring no state persists between sessions unless you explicitly write to Google Drive.

Architecture Layers

A production AI code sandbox typically stacks: hypervisor (Firecracker/KVM) → container runtime (runc/gVisor) → seccomp filter → user-level process. Each layer independently enforces constraints, so a bypass at one layer hits the next.
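As a rough illustration of how several of these layers can be requested from a single container launch, the sketch below assembles a `docker run` command line in Python. The flags are standard Docker CLI options, but the image name, the seccomp profile path, the limit values, and the `runsc` runtime name (the conventional name for gVisor when it is registered with Docker) are assumptions chosen for the example, not a description of any particular vendor's setup.

```python
from typing import List, Optional


def build_docker_argv(
    image: str,
    command: List[str],
    seccomp_profile: str = "seccomp-profile.json",  # hypothetical profile path
    runtime: Optional[str] = None,  # e.g. "runsc" if gVisor is installed
) -> List[str]:
    """Assemble a `docker run` command line that layers several controls."""
    argv = [
        "docker", "run", "--rm",
        "--network", "none",              # no external network interfaces
        "--read-only",                    # root filesystem mounted read-only
        "--tmpfs", "/tmp",                # writable scratch space, gone on exit
        "--cap-drop", "ALL",              # drop all Linux capabilities
        "--security-opt", "no-new-privileges",
        "--security-opt", f"seccomp={seccomp_profile}",  # syscall filter
        "--pids-limit", "64",             # cgroup cap on process count
        "--memory", "512m",               # cgroup memory ceiling
        "--cpus", "1",                    # cgroup CPU quota
    ]
    if runtime is not None:
        argv += ["--runtime", runtime]    # swap runc for another OCI runtime
    return argv + [image] + command
```

The function only builds the argument list; passing it to `subprocess.run` would launch the container. The hypervisor layer is not visible here because it sits beneath the container runtime rather than in its flags.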

The gVisor Approach: Kernel Interception

Google open-sourced gVisor in 2018 as an alternative containment strategy. Rather than relying solely on Linux kernel namespaces, gVisor interposes a user-space kernel — called the Sentry — between the application and the host kernel. System calls from the sandboxed process are intercepted and handled by the Sentry, which reimplements a large subset of the Linux kernel API in Go. Only a small set of calls ever reach the actual host kernel.

The security argument rests on attack surface: a sandboxed process that attempts a kernel exploit hits the Sentry's Go implementation first, not the host kernel, and the Sentry itself reaches the host through only a narrow set of system calls. The cost is performance: gVisor adds overhead on system-call-intensive workloads, sometimes 2–3x for I/O-heavy operations. Google uses gVisor in Google Cloud Run and App Engine, accepting the performance trade-off in exchange for stronger multi-tenant isolation.
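The interception idea can be pictured with a deliberately tiny toy model. This is not how gVisor is implemented; the class name `ToySentry`, the handled calls, and the `HOST_ALLOWED` set are all hypothetical. It only shows the shape of the argument: most calls are answered in user space, and only a narrow set is ever forwarded to the host.

```python
class ToySentry:
    """Toy model of a user-space kernel. Illustrative only, not gVisor."""

    # Hypothetical, tiny set of operations the toy ever forwards to the host.
    HOST_ALLOWED = {"read", "write"}

    def __init__(self):
        self.files = {}        # in-memory filesystem owned by the toy sentry
        self.host_calls = []   # record of what actually reached the "host"

    def _host(self, name):
        if name not in self.HOST_ALLOWED:
            raise PermissionError(f"host call {name!r} not permitted")
        self.host_calls.append(name)

    def syscall(self, name, *args):
        """Entry point for every call made by the sandboxed application."""
        if name == "getpid":
            return 1                       # answered entirely in user space
        if name == "open":
            path = args[0]
            self.files.setdefault(path, b"")
            return path                    # the path doubles as a handle here
        if name == "write":
            path, data = args
            self.files[path] = self.files.get(path, b"") + data
            self._host("write")            # backing I/O uses a narrow host call
            return len(data)
        if name in {"ptrace", "mount"}:
            raise PermissionError(f"{name} is not supported in this sandbox")
        raise NotImplementedError(name)
```

In this toy, a `mount` attempt never reaches `_host` at all, which mirrors the claim that an exploit attempt meets the user-space layer before the host kernel.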

Firecracker, developed by Amazon for AWS Lambda and Fargate, takes a different approach: instead of intercepting syscalls in user space, it runs each workload inside a microVM backed by KVM hardware virtualization. Firecracker's documentation cites boot times of as little as 125 milliseconds, fast enough to be practical for serverless workloads. When Lambda executes your function, it is running inside a Firecracker microVM, which is a virtual machine with its own guest kernel, not just a container. This is why Lambda functions from different customers do not share memory even when they run on the same physical host.

Design Trade-off

gVisor intercepts syscalls in user space (strong isolation, higher overhead). Firecracker uses hardware virtualization (near-native performance, full VM boundary). ChatGPT's Code Interpreter uses a variant of the container + seccomp approach, optimized for Python data analysis workloads specifically.

Ephemeral Filesystems and State Management

One of the most consequential sandbox design decisions is filesystem ephemerality. ChatGPT's Code Interpreter gives each session a writable /tmp directory, but that directory vanishes when the conversation context resets. There is no persistent home directory. Files the agent creates — charts, processed CSVs, intermediate model outputs — exist only for the session's duration unless the user explicitly downloads them.

This is a feature, not a limitation. An ephemeral filesystem means the sandbox starts clean every time, eliminating the risk of data from one user's session leaking into another's. It also prevents the accumulation of state that could be exploited across requests. The engineering tradeoff is that agents cannot build up persistent local knowledge between sessions without an external storage tool — which is precisely why capable AI agents integrate object storage (S3, GCS), databases, or dedicated memory tools as explicit external resources rather than relying on local disk.

Ephemeral: safe, no cross-session bleed, but requires external persistence tools
Persistent: enables multi-session state, but demands strict per-user isolation and garbage collection
Hybrid (e.g., Replit): persistent workspace per user, ephemeral execution environment per run
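A minimal sketch of the ephemeral pattern, using only the Python standard library: the session works in a scratch directory that is deleted on exit, and anything that must survive is handed to an external store explicitly. The `persist` callback stands in for an object store or database client and is an assumption of this example.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_workspace(prefix="session-"):
    """Yield a scratch directory that is deleted when the session ends."""
    with tempfile.TemporaryDirectory(prefix=prefix) as tmp:
        yield Path(tmp)


def run_session(persist):
    """Write an artifact locally, then hand it to external storage explicitly."""
    with ephemeral_workspace() as work:
        artifact = work / "chart.csv"
        artifact.write_bytes(b"x,y\n1,2\n")
        # The only state that outlives the session is what we persist here.
        persist(artifact.name, artifact.read_bytes())
        return work  # returned only so callers can confirm it no longer exists
```

Calling `run_session` with a function that uploads to S3 or GCS would give the hybrid behavior described above: persistent data per user, with a fresh execution directory per run.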

🎯 Advanced · Quiz 1

Quiz: Sandbox Architecture

3 questions — free, untracked, retake anytime.

1. What was the primary security concern OpenAI addressed before launching Code Interpreter in March 2023?

✓ Correct. Researchers confirmed the sandbox blocked outbound network egress, a deliberate design decision to prevent mass data exfiltration if the AI generated malicious code.

Not quite. The lesson described how network egress blocking was the central safety decision — a misconfigured sandbox allowing network calls could have enabled exfiltration at scale.

2. What does gVisor's "Sentry" component do?

✓ Correct. The Sentry is a user-space kernel written in Go that reimplements Linux syscall handling, drastically reducing the attack surface exposed to the host kernel.

Not quite. The Sentry intercepts and reimplements syscall handling in user space — if sandboxed code exploits a vulnerability, it hits Go code, not the host kernel.

3. Why is an ephemeral filesystem considered a security feature rather than a limitation?

✓ Correct. Ephemerality eliminates cross-session state bleed. The sandbox starts fresh, so no prior user's data or exploitable state can persist into a new session.

Not quite. The core security benefit is that a clean-slate start prevents cross-session data leakage — no accumulated state means nothing to exploit or leak between users.

🎯 Advanced · Lab 1

Lab: Sandbox Architecture

Interrogate an AI about the engineering trade-offs between gVisor and Firecracker for a hypothetical multi-tenant code execution platform.

Your Mission

You are evaluating sandbox architectures for a platform that will execute AI-generated Python code on behalf of enterprise clients. Your AI advisor will help you think through the engineering decisions.

Ask the AI to compare gVisor and Firecracker for your use case — focus on the security vs. performance trade-off.
Ask what happens at each isolation layer if a container escape is attempted.
Ask how you should handle filesystem state across multi-step agentic tasks.

Suggested opener: "We're building a platform to run AI-generated Python for enterprise clients. Compare gVisor and Firecracker as our sandbox foundation — what does each buy us and what does each cost us?"

🎯 Advanced · Lesson 2 of 4

Security Boundaries

What sandboxes block, what they allow, and the documented incidents that revealed where those lines were drawn incorrectly.

In August 2023, security researcher Johann Rehberger published a proof-of-concept demonstrating prompt injection through a malicious PDF processed by ChatGPT's Code Interpreter. The injected instructions caused the model to exfiltrate data via a URL embedded in a rendered image — a channel the sandbox's network blocks did not cover because the outbound request was constructed as an image source in the chat UI, not a direct network call from the Python process. OpenAI patched the vector within weeks. The incident illustrated a fundamental principle: security boundaries in AI systems must account for all channels through which data can leave, not just the most obvious ones.

The Attack Surface of a Code Sandbox

A code sandbox's attack surface is the set of all interfaces through which an attacker could cause unintended effects. For AI code execution environments, this surface is more complex than for traditional sandboxes because the code being executed is generated by a language model, which itself can be manipulated through the input data the code processes.

Rehberger's 2023 demonstration exposed what security engineers call an indirect prompt injection: malicious instructions embedded in data the AI processes (in that case, a PDF's metadata) that redirect the model's behavior. The sandbox correctly prevented the Python process from making direct outbound HTTP calls. But the model, influenced by injected instructions, constructed a Markdown image tag with a crafted URL. The browser rendering the chat interface then made the GET request — outside the sandbox entirely.

This category of attack — using the AI's own output rendering as an exfiltration channel — prompted OpenAI, Anthropic, and Google to implement additional output filtering, automatic URL sanitization in rendered content, and limits on what domains could be referenced in generated content. The fixes were not sandbox changes; they were model output policy changes.

Key Insight

The sandbox secures what the code process can do. It does not automatically secure what the model's output can cause when rendered. These are different threat models requiring different mitigations.
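One way to picture an output-side mitigation is a filter that runs on the model's text before the chat UI renders it. The sketch below removes inline Markdown images unless they point at an allowlisted HTTPS host. The allowlist contents and the function name are hypothetical, and this is not a description of how any vendor implemented its fix; it also ignores reference-style images and raw HTML, which a real filter would need to cover.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist of hosts the chat UI may load images from.
ALLOWED_IMAGE_HOSTS = {"cdn.example.com"}

# Matches inline Markdown images: ![alt](url) or ![alt](url "title").
MD_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")


def sanitize_markdown_images(text, allowed_hosts=ALLOWED_IMAGE_HOSTS):
    """Replace images pointing at non-allowlisted hosts with their alt text."""
    def _replace(match):
        alt, url = match.group(1), match.group(2)
        parsed = urlparse(url)
        if parsed.scheme == "https" and parsed.hostname in allowed_hosts:
            return match.group(0)          # keep trusted images untouched
        return f"[image removed: {alt}]"   # drop the URL and its query string
    return MD_IMAGE.sub(_replace, text)
```

Because the URL is dropped before rendering, the browser never issues the GET request that would carry data in its query string.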

Network Policy: What Gets Through and Why

Most production AI code sandboxes implement one of three network policies: full block (no outbound connections), allowlist (only specific approved endpoints), or full access (unrestricted, used only in explicitly networked agent modes). ChatGPT's Advanced Data Analysis uses full block. Replit's Ghostwriter AI uses allowlist-based policy tied to the user's project configuration. Devin, Cognition AI's autonomous software engineer agent (released in 2024), operates with a browser and network access deliberately enabled — because the task of writing and deploying software inherently requires fetching packages, reading documentation, and running tests against live endpoints.

The choice between these policies is a function of the task, not a universal security stance. A data analysis sandbox needs no network access because all necessary data should already be uploaded. A software development agent needs network access because package managers, APIs, and deployment targets are inherently networked. The risk profiles differ by orders of magnitude: a networked agent can call external APIs, exfiltrate data, and interact with production systems.

Cognition published a transparency document in 2024 describing Devin's network access model. It runs inside a virtual machine with full internet access but with session recording, action logging, and explicit human approval gates before any deployment action. The security model is monitoring and approval rather than prevention — a fundamentally different philosophy from a locked-down data analysis sandbox.
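The monitoring-and-approval philosophy can be sketched as a small wrapper that logs every action and asks a human before sensitive ones. The class name, the `approver` callback, and the choice of `deploy` as the only sensitive action are assumptions for illustration and do not describe Devin's actual implementation.

```python
import time
from typing import Any, Callable, Dict, List


class ApprovalGate:
    """Log every action; require a human decision for sensitive ones."""

    def __init__(self, approver: Callable[[str, Dict[str, Any]], bool],
                 sensitive=frozenset({"deploy"})):
        self.approver = approver      # hypothetical hook that asks a human
        self.sensitive = sensitive
        self.log: List[Dict[str, Any]] = []

    def run(self, action: str, params: Dict[str, Any], fn: Callable[[], Any]):
        """Record the action, check approval if needed, then execute it."""
        needs_approval = action in self.sensitive
        approved = self.approver(action, params) if needs_approval else True
        self.log.append({"ts": time.time(), "action": action,
                         "params": params, "approved": approved})
        if not approved:
            raise PermissionError(f"{action} was not approved")
        return fn()
```

Nothing here prevents a routine action from running; the control is that every action leaves a log entry and that the sensitive ones stop until a person says yes.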

Full block: Maximum security, minimum capability — appropriate for data analysis, document processing
Allowlist: Controlled capability — appropriate for package fetching from known registries, approved API calls
Full access + monitoring: Maximum capability, requires human oversight — appropriate for autonomous development agents
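The three policies above can be expressed as a small decision function. This is a sketch of the policy logic only; real enforcement happens in the network layer (firewall rules, proxies, or an absent network interface), and the type names and example hosts are assumptions for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import FrozenSet


class NetworkMode(Enum):
    FULL_BLOCK = "full_block"
    ALLOWLIST = "allowlist"
    FULL_ACCESS = "full_access"


@dataclass(frozen=True)
class NetworkPolicy:
    mode: NetworkMode
    allowed_hosts: FrozenSet[str] = field(default_factory=frozenset)

    def permits(self, host: str) -> bool:
        """Return True if an outbound connection to `host` is allowed."""
        if self.mode is NetworkMode.FULL_BLOCK:
            return False
        if self.mode is NetworkMode.ALLOWLIST:
            return host.lower() in self.allowed_hosts
        return True  # FULL_ACCESS: rely on logging and approval instead
```

A data analysis sandbox would use `NetworkPolicy(NetworkMode.FULL_BLOCK)`, while a package-fetching agent might pass an allowlist containing only its registry hosts.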

Resource Limits as Security Controls

Resource limits are often treated as performance management tools, but they are also security controls. CPU time limits prevent denial-of-service via infinite loops or computationally expensive operations designed to consume shared resources. Memory caps prevent a single session from exhausting host RAM in a multi-tenant environment. Process count limits prevent fork bombs — code that spawns processes exponentially until the host is overwhelmed.

Linux cgroups (control groups) implement these limits at the kernel level, and container runtimes expose them as configuration parameters. AWS Lambda enforces a hard 15-minute execution limit, a 10 GB memory cap per function, and a default quota of 1,000 concurrent executions per account per Region. Google Cloud Run enforces similar limits per container instance. ChatGPT's Code Interpreter enforces a per-cell execution timeout (observed at approximately 120 seconds) that prevents runaway computations from blocking the session indefinitely.

A less obvious resource limit is disk I/O throttling. Without it, a sandboxed process could write continuously to disk, consuming storage or causing I/O starvation that degrades performance for other tenants on the same host. Production platforms typically implement both IOPS limits (operations per second) and throughput limits (bytes per second) via the cgroup blkio controller (the io controller in cgroup v2).

Defense in Depth

Resource limits (CPU, memory, process count, disk I/O, network bandwidth) serve double duty as both performance controls and denial-of-service mitigations. A sandbox without resource limits is not truly secure even if its network policy is strict.
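For a feel of how such ceilings behave from the inside, the sketch below runs untrusted Python in a child process under POSIX rlimits using the standard `resource` module. Note the substitution: rlimits are a per-process mechanism, not cgroups, and production platforms would normally rely on cgroups through the container runtime. The default values are arbitrary assumptions, and the code is Unix-only.

```python
import resource
import subprocess
import sys


def run_limited(code, cpu_seconds=5, memory_bytes=1024 ** 3, wall_seconds=10):
    """Run Python source in a child process under rlimits (Unix only)."""
    def apply_limits():
        # Runs in the child after fork and before exec.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (memory_bytes, memory_bytes))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=wall_seconds,  # wall-clock guard in addition to CPU time
    )
```

A process-count ceiling could be added the same way with `resource.RLIMIT_NPROC`, although that limit is counted per user rather than per sandbox, which is one reason cgroup `pids` limits are the usual choice for fork-bomb protection.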

🎯 Advanced · Quiz 2

Quiz: Security Boundaries

3 questions — free, untracked, retake anytime.

1. In Rehberger's 2023 proof-of-concept, how did data leave ChatGPT's sandbox despite network blocks on the Python process?

✓ Correct. The browser rendering the chat UI made the GET request, entirely outside the sandbox's process-level network controls. The fix required model output policy changes, not sandbox changes.

Not quite. The exfiltration happened via a Markdown image tag in the model's output. The browser fetched it — the sandbox network policy never saw the request.

2. Why does Devin (Cognition AI's agent) use full network access rather than a full-block policy?

✓ Correct. Devin's task domain, autonomous software engineering, requires genuine network access. The security model shifts to monitoring, action logging, and human approval gates rather than prevention.

Not quite. Network access is intentional: you cannot write and deploy software without fetching packages and hitting live endpoints. Devin compensates with session recording and approval gates.

3. Which Linux kernel mechanism implements CPU, memory, and I/O resource limits at the container level?

✓ Correct. Linux cgroups implement resource limits that container runtimes expose as configuration. They prevent any single container from consuming disproportionate host resources, serving as both performance and security controls.

Not quite. cgroups (control groups) enforce resource limits — CPU, memory, I/O. seccomp filters system calls. Namespaces provide isolation views. These are complementary, not interchangeable.

🎯 Advanced · Lab 2

Lab: Security Boundaries

Work through the threat model for an AI agent that processes untrusted documents — where are the real security boundaries?

Your Mission

Your team is deploying an AI agent that accepts PDF uploads from untrusted sources, extracts data, and runs Python analysis. Think through the security boundaries with your AI advisor.

Ask the AI to identify all channels through which data could leave your sandbox unexpectedly — not just direct network calls.
Ask how indirect prompt injection through document metadata differs from direct prompt injection.
Ask what monitoring you should implement given that you cannot prevent all channels.

Suggested opener: "We accept PDFs from untrusted sources, run AI analysis on them, and display results. Walk me through every channel data could leave our sandbox — not just the obvious network calls."

🎯 Advanced · Lesson 3 of 4

Capabilities & Limits

What sandboxed AI code runners can genuinely do — and the engineering boundaries that define where their power ends.

In October 2024, Anthropic released Claude's computer use capability in public beta. The system allows Claude to control a virtual desktop (moving a mouse, clicking, typing) within a sandboxed environment. Anthropic's documentation explicitly warned users not to give Claude access to sensitive data or accounts during the beta, not because the sandbox could be broken, but because the model itself might take unintended actions. The capability worked; the constraint was on what you gave that capability access to. The lesson was stark: sandbox security and model capability scope are two different problems. A perfectly secure sandbox containing a fully capable agent with access to production systems is still dangerous.

What Sandboxed Code Runners Can Actually Do

Within their permitted boundaries, modern AI code execution sandboxes are genuinely powerful. ChatGPT's Advanced Data Analysis runs a full CPython interpreter with a large pre-installed library set including NumPy, pandas, matplotlib, scikit-learn, PIL, and dozens of others. It can perform complex numerical computation, train small machine learning models, process images, parse documents, generate visualizations, and execute multi-step data pipelines — all within a single session.

The computational resources available are non-trivial. Observed benchmarks suggest the Code Interpreter environment provides approximately 2 CPU cores and 4–8 GB of RAM per session. This is sufficient to train a scikit-learn gradient boosting model on datasets with millions of rows, run FAISS vector similarity search, or perform FFT analysis on large time series. Tasks that would once require a dedicated data engineering environment can now be accomplished conversationally.

E2B (a startup that provides sandboxed code execution as an API, used by companies building on top of models from Anthropic, OpenAI, and others) publishes its sandbox specifications publicly. Their standard Python sandbox provides 2 vCPUs, 512 MB RAM, and a 5-gigabyte ephemeral disk, with sessions lasting up to 24 hours. This is a different capability profile from ChatGPT's — more persistent, more storage, but less RAM — reflecting their target use case of long-running agentic tasks.

Capability Envelope

The real limit is not what the sandbox permits computationally — it's what data and external systems the sandbox has been given access to. A capable sandbox with no external data access is powerful but bounded. The same sandbox with database credentials and API keys is a fundamentally different risk surface.

Hard Limits: What Sandboxes Cannot Do

The hard limits of sandboxed execution fall into several categories. First, there are computational limits enforced by cgroups: you cannot exceed allocated CPU or memory, and attempts to do so result in CPU throttling or OOM (out of memory) kills. Second, there are network limits: in full-block configurations, outbound socket operations fail, typically with a connection refused, network unreachable, or permission denied error, or by timing out. In many cases the code cannot distinguish a policy block from a server being down.

Third, there are filesystem limits. Code running in ChatGPT's Code Interpreter cannot access the files of other users, cannot write to system directories, and cannot execute binaries that are not already present in the environment. Attempts to pip install packages that require network access will fail silently or with an explicit error if network is blocked. This means the available library set is fixed at environment provisioning time — a significant constraint for specialized domains.

Fourth, there are model-layer limits that are separate from sandbox limits. Even if the sandbox technically permits an operation, the model may refuse to generate code that performs it. Anthropic's Claude will refuse to write functional malware even if the sandbox would permit executing it. This is a model policy constraint, not a sandbox constraint — an important distinction because model policies can be updated independently of sandbox architecture.
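The network limit described above is easiest to appreciate from the sandboxed code's point of view. The sketch below attempts a TCP connection and reduces the resulting exception to a coarse label. The function names and labels are assumptions for illustration, and which error a given sandbox produces depends on how its block is implemented.

```python
import errno
import socket


def classify_connect_error(exc):
    """Map an exception from a connect attempt to a coarse label."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, socket.gaierror):
        return "dns-failure"
    if isinstance(exc, OSError):
        if exc.errno == errno.ECONNREFUSED:
            return "refused"
        if exc.errno in (errno.EPERM, errno.EACCES):
            return "permission-denied"
        if exc.errno in (errno.ENETUNREACH, errno.EHOSTUNREACH):
            return "unreachable"
    return "other"


def probe(host, port, timeout=2.0):
    """Attempt a TCP connection and report what the sandboxed code observes."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except OSError as exc:
        return classify_connect_error(exc)
```

A label such as `refused` or `timeout` says nothing about the cause: a closed port on a healthy server and a firewall rule in the sandbox can look identical from inside.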

Computational: cgroup-enforced CPU/memory ceilings, execution timeouts
Network: socket-level blocking in full-block mode, allowlist enforcement
Filesystem: read-only system paths, no cross-user access, fixed library set
Model policy: refusal to generate certain code regardless of sandbox permissions
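Because the library set is fixed when the environment is provisioned, generated code can check what is available up front instead of discovering a gap halfway through a pipeline. A minimal sketch using the standard library follows; the function name is an assumption, and it handles top-level module names only.

```python
import importlib.util


def missing_packages(required):
    """Return the top-level module names that cannot be imported here."""
    return [name for name in required
            if importlib.util.find_spec(name) is None]
```

For example, `missing_packages(["numpy", "pandas", "faiss"])` returns whichever of those names the current environment lacks, which lets an agent choose a fallback approach rather than attempt a `pip install` that a blocked network will reject.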

The GPU Question

One notable absence in most AI code sandboxes is GPU access. ChatGPT's Code Interpreter runs on CPU only. This is not a security decision — GPUs can be virtualized and sandboxed effectively using NVIDIA's vGPU technology or AMD's equivalent. It is an economics decision: GPU instances cost 10–100x more than CPU instances, and providing GPU access to every user session would be prohibitively expensive at scale.

The practical consequence is that sandboxed AI code execution is suitable for data analysis, statistical modeling, and inference with pre-trained models loaded in CPU mode — but not for training deep learning models. A user trying to fine-tune a transformer model in ChatGPT's sandbox will hit computation time limits before meaningful training occurs. Google Colab addresses this by offering GPU runtimes as a premium feature, with session limits (90 minutes to 12 hours depending on tier) enforced to manage GPU allocation.

For AI agents in production that need GPU inference, the standard pattern is to call an external inference API (OpenAI, Anthropic, Replicate, Together AI) from within the sandbox, rather than running GPU workloads locally. The sandbox becomes an orchestration layer, and the GPU compute happens outside it — with all the security implications that external API calls entail.

Practical Pattern

Sandboxed code + external inference API = the dominant production pattern for AI agents needing ML capabilities. The sandbox handles data processing and orchestration; external APIs handle GPU-dependent inference. This separates the security boundary problem from the compute resource problem.
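The orchestration pattern can be sketched with the inference call injected as a plain function, so the example does not assume any particular vendor's client library. The names `summarize`, `analyze`, and `infer`, and the row format, are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List


def summarize(rows: List[Dict[str, float]]) -> Dict[str, float]:
    """CPU-only preprocessing that stays inside the sandbox."""
    values = [row["value"] for row in rows]
    return {"count": len(values), "mean": mean(values), "max": max(values)}


def analyze(rows: List[Dict[str, float]], infer: Callable[[str], str]) -> str:
    """Send only the aggregated summary across the sandbox boundary."""
    stats = summarize(rows)
    prompt = (f"Describe a series with count={stats['count']}, "
              f"mean={stats['mean']:.2f}, max={stats['max']:.2f}.")
    return infer(prompt)  # `infer` wraps whichever external API is in use
```

Passing aggregates rather than raw rows is a design choice worth noting: the external call is the one place where data leaves the sandbox, so what goes into the prompt defines what the outside service can see.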

🎯 Advanced · Quiz 3

Quiz: Capabilities & Limits

3 questions — free, untracked, retake anytime.

1. Anthropic's Claude computer use beta (October 2024) warned users not to give Claude access to sensitive accounts. What did this reveal about sandbox security?

✓ Correct. The sandbox was technically sound. The risk was giving a capable agent access to production systems within that sandbox, a reminder that security perimeter and access scope are distinct concerns.

Not quite. The sandbox worked fine. The warning addressed what was inside the sandbox — if a capable agent has access to your production systems, sandbox integrity alone doesn't protect those systems.

2. Why don't most AI code sandboxes provide GPU access, despite GPU virtualization being technically feasible?

✓ Correct. The absence of GPU in most sandboxes is an economics decision, not a security one. GPU compute at scale per-session is prohibitively expensive, which is why external inference APIs have become the standard pattern.

Not quite. GPU virtualization (NVIDIA vGPU) is a solved problem. The issue is cost — providing GPU per session at scale would be economically unsustainable without premium pricing like Google Colab's paid tiers.

3. A model refuses to write functional malware code even though the sandbox would technically permit executing it. This is an example of:

✓ Correct. Model policy constraints (what the model will generate) are distinct from sandbox constraints (what generated code can execute). They can be updated independently: model policies change with fine-tuning; sandbox constraints change with infrastructure updates.

Not quite. This is a model-layer refusal — the model won't generate the code, regardless of what the sandbox permits. Sandbox architecture and model policy are two separate, independently maintained security layers.

🎯 Advanced · Lab 3

Lab: Capabilities & Limits

Design the capability envelope for a production AI data analysis agent — what do you enable, what do you restrict, and why?

Your Mission

Your organization wants to deploy an AI agent that analyzes financial data from internal databases. You need to define what the agent's sandbox can and cannot do.

Ask the AI to help you define the minimum necessary capabilities for a financial data analysis agent.
Ask what access to external inference APIs would look like from within the sandbox — what are the risks?
Ask how you would design the execution timeout and memory limits for financial modeling workloads.

Suggested opener: "I'm designing a sandboxed AI agent for financial data analysis. Help me define the minimum necessary capabilities — what should I enable and what should stay locked down?"

Building AI Agents III — Tools · Module 3 · Lesson 4

Lesson 4: Integration Patterns

Advanced concepts, real-world applications, and practical implications

Core Concepts

This lesson explores integration patterns, examining the key principles, real-world applications, and implications for practitioners working in this domain.

Understanding this topic requires both theoretical grounding and practical awareness of how these concepts manifest in deployed systems. The frameworks covered in earlier lessons provide the foundation; this lesson connects them to implementation reality.

Practical Applications

The transition from theory to practice reveals challenges that pure conceptual frameworks don't capture. Real-world deployment introduces constraints, trade-offs, and edge cases that demand nuanced judgment rather than rigid rule-following.

Effective practitioners in this space develop the ability to reason across multiple frameworks simultaneously, recognizing when different perspectives apply and how to resolve conflicts between competing priorities.

Looking Forward

As this field continues to evolve, the principles covered in this module will remain foundational even as specific technologies and implementations change. The ability to think critically about these topics — rather than simply memorizing current best practices — is what separates effective practitioners from those who merely follow checklists.

Lesson 4 Quiz

Lesson 4: Integration Patterns

What is the primary focus of Lesson 4: Integration Patterns?

✓ Correct. This lesson bridges theory and practice, focusing on real-world implementation.

Review the lesson — the focus is on connecting frameworks to practical reality.

Why does real-world deployment introduce challenges that pure theory doesn't capture?

✓ Correct. Real deployment requires judgment, not just framework application.

Practice doesn't invalidate theory — it reveals complexities that require nuanced application of theoretical principles.

What separates effective practitioners from those who merely follow checklists?

✓ Correct. Critical thinking and adaptability matter more than memorized procedures.

The key differentiator is critical thinking ability, not experience or resources alone.

🎯 Advanced · Lesson 4 Lab

Lab: Apply What You've Learned

Synthesize concepts from Lesson 4: Integration Patterns through guided AI conversation

Your Task

Use the AI below to explore the concepts from Lesson 4 in depth. Ask questions, challenge assumptions, and work through practical scenarios related to integration patterns.

Try: "How would the concepts from this lesson apply to a real-world scenario in this field?"

Module 3 Test

Code Execution Environments · 15 Questions · 70% to Pass

1. What is the core objective of Code Execution Environments?

2. How should practitioners approach applying concepts from this module?

3. Which best describes the relationship between theory and practice in Building AI Agents III — Tools?

4. What distinguishes expert practitioners from novices in this field?

5. How does Code Execution Environments build on previous modules?

6. What role do constraints play in practical implementation?

7. When applying frameworks from this module, what is most important?

8. How should practitioners handle conflicting perspectives in this field?

9. What makes the concepts in Code Execution Environments relevant beyond their immediate context?

10. How should practitioners continue developing expertise after completing this module?

11. What is the relationship between understanding Building AI Agents III — Tools concepts and making decisions?

12. How do the lessons from this module apply to novel situations?

13. What is the value of understanding multiple perspectives on Building AI Agents III — Tools?

14. How should practitioners evaluate new information or developments in this field?

15. What is the ultimate goal of learning Code Execution Environments?