Claude Managed Agents: The Complete Deep Dive & Step-by-Step Setup Guide

  • Writer: Layak Singh
  • Apr 13
  • 11 min read

If you've spent any time building AI agents, you already know the dirty secret: the agent logic is the easy part. The infrastructure is what kills you.



Setting up sandboxed environments. Writing agent loops. Managing state across sessions. Handling tool execution. Recovering from failures. Coordinating multiple agents. Before you write a single line of business logic, you've already burned weeks — sometimes months — on scaffolding that has nothing to do with what your agent actually does.

On April 8, 2026, Anthropic launched Claude Managed Agents in public beta, and it fundamentally changes the equation. Instead of handing developers another API to wrap their own loop around, Anthropic is now handing over the entire runtime. The sandboxing, the agent loop, the tool execution, the state management, the session persistence — all of it, managed for you.

I've spent the past few days going deep into the documentation, the API, and the early production deployments. This is what I found.

What Claude Managed Agents Actually Is

At its core, Managed Agents is a suite of composable APIs for building and deploying cloud-hosted agents at scale. You define an agent's behavior — its model, system prompt, tools, and permissions — and Anthropic runs it on their infrastructure in a secure, sandboxed container.

The agent can read files, execute shell commands, browse the web, write and run code, and connect to external services via MCP (Model Context Protocol) servers. Session continuity is built in. If the agent is working on a two-hour task and your connection drops, the work keeps going. You reconnect and pick up where you left off.

The pricing model is straightforward: standard Claude API token rates for all model inference, plus $0.08 per session-hour for the managed runtime. No fixed monthly fee. Costs scale with usage.
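
To build intuition for how costs scale, here's a quick back-of-the-envelope estimator. The $0.08 session-hour rate is the figure above; the per-token rates are placeholders I've assumed for illustration, so substitute the current Claude API prices for your model:

```python
# Rough session cost estimator. The runtime fee comes from the pricing
# model above; the token rates are illustrative placeholders, not
# official prices. Substitute current Claude API rates for your model.
RUNTIME_FEE_PER_HOUR = 0.08   # managed runtime, $ per session-hour
INPUT_PER_MTOK = 3.00         # assumed $ per million input tokens
OUTPUT_PER_MTOK = 15.00       # assumed $ per million output tokens

def estimate_session_cost(hours: float, input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one managed session: runtime + inference."""
    runtime = hours * RUNTIME_FEE_PER_HOUR
    inference = (input_tokens / 1e6) * INPUT_PER_MTOK \
              + (output_tokens / 1e6) * OUTPUT_PER_MTOK
    return round(runtime + inference, 4)

# A 30-minute session consuming 200k input / 50k output tokens:
print(estimate_session_cost(0.5, 200_000, 50_000))  # → 1.39
```

Notice that inference dominates: the runtime fee for that half-hour is four cents, while tokens account for the rest.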

Here's the critical mental model shift: you are no longer building infrastructure. You are configuring an agent and letting it work.

The Four Core Concepts

Before we touch any code, you need to understand the four building blocks everything else in Managed Agents rests on:

1. Agent: The agent is a reusable, versioned configuration. It defines which model to use (Claude Sonnet 4.6, Claude Opus 4.6, etc.), the system prompt that shapes its behavior, which tools it can access, any MCP servers it should connect to, and any specialized skills it should load. You create an agent once and reference it by ID across as many sessions as you want.

2. Environment: The environment is the container template where the agent does its work. Think of it as a pre-configured Linux machine. You specify which packages should be pre-installed (Python, Node.js, Go, etc.), what network access rules apply, and what files should be mounted. Environments are reusable across sessions.

3. Session: A session is a running agent instance. It references an agent configuration and an environment, and it's where the actual work happens. Sessions are stateful — the file system persists, the conversation history is maintained, and event history is stored server-side. You can have multiple sessions running concurrently.

4. Events: Events are the communication layer between your application and the agent. You send user messages as events. Claude streams back responses, tool calls, and results via server-sent events (SSE). You can also send events mid-execution to steer the agent in a different direction or interrupt it entirely.

Step-by-Step Setup Guide: Your First Managed Agent

Let's build a working agent from scratch. By the end of this section, you'll have a Claude agent that can write code, execute commands, and search the web — all running in Anthropic's cloud.

Prerequisites

  • An Anthropic API key (get one at console.anthropic.com)

  • Python 3.8+ or Node.js 18+ installed locally

  • The Anthropic SDK installed

Step 1: Install the SDK and Set Your API Key

Python:

bash

pip install anthropic

TypeScript:

bash

npm install @anthropic-ai/sdk

Set your API key as an environment variable:

bash

export ANTHROPIC_API_KEY="your-api-key-here"

All Managed Agents API requests require the managed-agents-2026-04-01 beta header. The SDK sets this automatically, so you don't need to worry about it in code.

Step 2: Create an Agent

The agent configuration is where you define what your agent is. The system prompt matters enormously here — it shapes how the agent approaches every task it's given.

Python:

python

from anthropic import Anthropic

client = Anthropic()

agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant. Write clean, well-documented code. Always include error handling and add comments explaining your reasoning.",
    tools=[{"type": "agent_toolset_20260401"}]
)

print(f"Agent ID: {agent.id}")
print(f"Agent Version: {agent.version}")

cURL:

bash

agent=$(
  curl -sS --fail-with-body https://api.anthropic.com/v1/agents \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: managed-agents-2026-04-01" \
    -H "content-type: application/json" \
    -d '{
      "name": "Coding Assistant",
      "model": "claude-sonnet-4-6",
      "system": "You are a helpful coding assistant. Write clean, well-documented code.",
      "tools": [{"type": "agent_toolset_20260401"}]
    }'
)
AGENT_ID=$(jq -er '.id' <<< "$agent")
echo "Agent ID: $AGENT_ID"

The agent_toolset_20260401 tool type is the key — it enables the full set of pre-built agent tools: bash execution, file operations (read, write, edit, glob, grep), web search, web fetch, and more. Save the returned agent.id. You'll reference it in every session.

Pro tip: The system prompt isn't just a greeting. It shapes every decision the agent makes. "Write clean, well-documented code" produces different output than "Write minimal, production-ready code with comprehensive error handling." Be intentional.

Step 3: Create an Environment

The environment is the sandbox where your agent works. It's a cloud container that you configure with the packages, network access, and files your agent needs.

Python:

python

environment = client.beta.environments.create(
    name="dev-environment",
    config={
        "type": "cloud",
        "networking": {"type": "unrestricted"}
    }
)

print(f"Environment ID: {environment.id}")

cURL:

bash

environment=$(
  curl -sS --fail-with-body https://api.anthropic.com/v1/environments \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "anthropic-beta: managed-agents-2026-04-01" \
    -H "content-type: application/json" \
    -d '{
      "name": "dev-environment",
      "config": {
        "type": "cloud",
        "networking": {"type": "unrestricted"}
      }
    }'
)
ENVIRONMENT_ID=$(jq -er '.id' <<< "$environment")
echo "Environment ID: $ENVIRONMENT_ID"

The "networking": {"type": "unrestricted"} setting gives the agent full internet access. For production deployments where you need tighter security, you can restrict network access to specific domains or disable it entirely.

Step 4: Start a Session

Now bring the agent and environment together by creating a session:

Python:

python

session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="My first agent session"
)

print(f"Session ID: {session.id}")

Step 5: Send Events and Stream Responses

This is where the magic happens. Send a task to your agent and watch it work:

Python:

python

# Send a user message and stream the response
with client.beta.sessions.events.stream(
    session_id=session.id,
    event={
        "type": "user",
        "content": "Create a Python script that fetches the top 5 trending repositories on GitHub today, then write unit tests for it."
    }
) as stream:
    for event in stream:
        if event.type == "assistant":
            print(event.content)
        elif event.type == "tool_use":
            print(f"[Tool: {event.name}]")
        elif event.type == "tool_result":
            print(f"[Result: {event.content[:200]}...]")

The agent will autonomously determine which tools to use, execute them, observe the results, and iterate until the task is complete. It might write the script, run it to test it, find a bug, fix it, write tests, run the tests, and fix any failures — all without you intervening.

Step 6: Steer or Interrupt Mid-Execution

One of the most powerful features is the ability to redirect the agent while it's working:

python

# Send a follow-up instruction while the agent is still running
client.beta.sessions.events.create(
    session_id=session.id,
    event={
        "type": "user",
        "content": "Actually, also add rate limiting to handle the GitHub API limits."
    }
)

The agent receives this mid-execution and adjusts its approach accordingly.

The Complete Python Script

Here's the full working script you can copy, paste, and run:

python

from anthropic import Anthropic

client = Anthropic()

# 1. Create an agent
agent = client.beta.agents.create(
    name="Coding Assistant",
    model="claude-sonnet-4-6",
    system="You are a helpful coding assistant. Write clean, well-documented code.",
    tools=[{"type": "agent_toolset_20260401"}],
)
print(f"Agent: {agent.id}")

# 2. Create an environment
environment = client.beta.environments.create(
    name="quickstart-env",
    config={"type": "cloud", "networking": {"type": "unrestricted"}},
)
print(f"Environment: {environment.id}")

# 3. Create a session
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
    title="Quickstart session",
)
print(f"Session: {session.id}")

# 4. Send a task and stream the response
with client.beta.sessions.events.stream(
    session_id=session.id,
    event={
        "type": "user",
        "content": "Write a Python function that calculates compound interest, then test it.",
    },
) as stream:
    for event in stream:
        if hasattr(event, "content"):
            print(event.content)

What Makes This Different From the Messages API

If you've been using the standard Claude Messages API, you might wonder what Managed Agents adds. The distinction is fundamental:

With the Messages API, you control the loop. You send a message, get a response, handle tool calls yourself, manage conversation state, and decide when to continue. You build the entire orchestration layer.

With Managed Agents, Claude controls the loop. You define the task, the tools, and the guardrails. Claude decides when to call tools, how to manage context, how to recover from errors, and when the task is complete. The built-in harness handles prompt caching, context compaction, and performance optimizations automatically.

In Anthropic's internal testing on structured file generation tasks, Managed Agents improved task success rates by up to 10 percentage points compared to a standard prompting loop, with the largest gains on the most difficult problems.

Advanced Features Worth Knowing

Updating Agents Without Breaking Sessions

Agents are versioned. Every time you update an agent's configuration, the version number increments. Existing sessions continue using the version they were created with. New sessions get the latest version. This means you can iterate on your agent's behavior without disrupting work in progress.

python

updated_agent = client.beta.agents.update(
    agent_id=agent.id,
    version=agent.version,
    system="You are a senior Python developer. Always follow PEP 8 and write comprehensive tests."
)
print(f"New version: {updated_agent.version}")

Archiving Agents

When you're done with an agent configuration, you can archive it. This makes it read-only — existing sessions keep running, but no new sessions can reference it.

Session Tracing and Debugging

The Claude Console includes built-in session tracing. You can inspect every tool call, every decision the agent made, and every failure mode. This is invaluable for debugging complex workflows and understanding why an agent took a particular path.

MCP Server Integration

Managed Agents supports MCP (Model Context Protocol) servers, which let you connect your agent to external tools and data sources. This means your agent can interact with your CRM, query your database, post to Slack, or access any system that exposes an MCP interface.
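
As a sketch, wiring an MCP server into an agent might look like the snippet below. The mcp_servers field borrows the shape of Anthropic's MCP connector for the Messages API; whether Managed Agents uses an identical schema is an assumption on my part, and the URL is a placeholder:

```python
# Placeholder MCP server entry. The URL is fictional, and the field shape
# borrows the Messages API MCP-connector convention; the Managed Agents
# schema may differ, so check the docs before relying on this.
crm_server = {
    "type": "url",
    "url": "https://mcp.example.com/sse",  # placeholder endpoint
    "name": "internal-crm",
}

# With a configured client, attach it at agent-creation time:
# agent = client.beta.agents.create(
#     name="CRM Assistant",
#     model="claude-sonnet-4-6",
#     system="You help the sales team query and update CRM records.",
#     tools=[{"type": "agent_toolset_20260401"}],
#     mcp_servers=[crm_server],  # assumed parameter name
# )
```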

Multi-Agent Coordination (Research Preview)

This is the feature that has enterprise teams most excited. In research preview (requires separate access request), you can have agents spin up and direct other agents to parallelize complex work. One agent acts as a coordinator, breaking down a large task and assigning sub-tasks to specialist agents that work concurrently.

Think of it as a team lead distributing work: one agent handles the API layer, another handles the database schema, a third writes tests, and a fourth handles documentation — all working in parallel.

Outcome-Based Evaluation (Research Preview)

Also in research preview, you can define success criteria for your agent's work. Claude then self-evaluates and iterates until it meets those criteria, rather than simply completing the prompt and stopping. This creates a feedback loop where the agent actively improves its own output.

Who's Already Using It in Production

The initial user base includes several high-profile companies that have already integrated Managed Agents into their products:

Notion lets teams delegate work to Claude directly inside their workspace through Notion Custom Agents, currently in private alpha.

Sentry built a system that goes from root cause analysis to writing the fix and opening a pull request. Their Senior Director of Engineering noted that Managed Agents allowed them to build the integration in weeks and removed the operational overhead of maintaining agent infrastructure.

Asana used Managed Agents to accelerate development of their AI Teammates feature, shipping advanced capabilities faster while focusing on the user experience.

Rakuten deploys specialist agents across engineering, product, sales, marketing, and finance — generating apps, proposal decks, and spreadsheets in sandboxed environments.

The Gotchas You Should Know

Session containers are ephemeral. Files written during a session live in that session's container. When the session ends, they're gone unless you extract them first via the API.
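
Here's a sketch of that extraction step. The exact endpoint isn't named in the docs I've seen, so sessions.files.read below is an invented placeholder; treat everything except the local-filename helper as pseudocode:

```python
import os

def local_name(container_path: str) -> str:
    """Map a path inside the session container to a local filename."""
    return os.path.basename(container_path)

# Files can reportedly be pulled out "via the API", but the endpoint name
# is not documented here; `sessions.files.read` is an invented placeholder.
# content = client.beta.sessions.files.read(
#     session_id=session.id,
#     path="/workspace/report.py",
# )
# with open(local_name("/workspace/report.py"), "w") as f:
#     f.write(content)  # persist locally before the container is torn down

print(local_name("/workspace/report.py"))  # → report.py
```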

Token costs can surprise you. The $0.08/hr runtime fee is small, but token costs for multi-tool-call sessions add up quickly. A session that runs for 10 minutes with 15 tool calls might cost $0.50–$2.00 depending on the model and output length. Monitor usage closely during development.

Claude-only lock-in. There's no way to run GPT-5, Gemini, or any other model inside the harness. If you build production workflows on this infrastructure, switching providers requires rebuilding the orchestration layer.

Research preview features require separate access. Multi-agent coordination and self-evaluation — arguably the two most compelling features — are not yet generally available. You need to request access separately.

Rate limits apply. Create endpoints are limited to 60 requests per minute. Read endpoints are limited to 600 requests per minute. Organization-level spend limits and tier-based rate limits also apply.

When Should You Use This vs. Building Your Own

Use Managed Agents when:

  • You want to go from prototype to production in days, not months

  • Your agents need long-running execution (minutes or hours)

  • You don't want to build and maintain sandbox infrastructure

  • You need session persistence and stateful file systems

  • You want built-in observability and tracing

Build your own agent loop when:

  • You need fine-grained control over every model call

  • You're running multiple different model providers

  • Your latency requirements demand sub-second response times

  • You need to run agents on your own infrastructure for compliance reasons

  • You're already invested in an orchestration framework like CrewAI or LangChain

The Bigger Picture

Managed Agents isn't just a product launch. It's a strategic move that positions Anthropic as an execution platform, not just a model provider. The commercial logic is clear: selling model access generates revenue, but a managed platform creates switching costs. Once a company's agents run on Anthropic's infrastructure, the workflows and operational setup become embedded in how the business runs.

This is a familiar pattern in enterprise technology. Cloud providers spent a decade absorbing functions like database management, deployment pipelines, and monitoring that had been handled by separate vendors. Anthropic is doing the same thing to the agent infrastructure layer.

For developers, the practical takeaway is simple: if you're building agents that run on Claude, the fastest path to production is now through Managed Agents. The infrastructure work that used to take months is handled for you. The question is no longer "can we build this?" but "what should the agent actually do?"

And that's a much better question to be answering.

What are you building with Claude Managed Agents? I'd love to hear about your use cases and experiences in the comments.

© 2024-25 by Layak Singh. 
