Summary
Add first-class sandbox execution support for Copilot SDK sessions, so SDK consumers can run agent file edits, shell commands, MCP servers, and custom tools inside an isolated, resumable workspace rather than relying on the host process filesystem/network boundary.
Current state
From the public docs and examples, the SDK currently exposes several useful building blocks:
- workingDirectory / session workspace configuration
- availableTools / excludedTools
- onPermissionRequest
- onPreToolUse and onPostToolUse hooks
- custom tools that can replace built-ins
- a virtual filesystem sample that disables built-ins and backs file tools with an in-memory store
Those are helpful, but they do not appear to provide a first-class sandbox lifecycle or provider abstraction. As far as I can tell, there is no SDK-level API for:
- creating or attaching a sandbox execution environment per session/run
- selecting a sandbox provider, such as local process, Docker/container, hosted sandbox, or custom provider
- mounting repositories/directories/object stores into the sandbox workspace
- enforcing filesystem/network/process isolation below the tool-permission layer
- snapshotting/restoring the sandbox workspace across sessions
- running MCP servers inside the same restricted boundary
- exposing sandbox artifacts/ports back to the host application
If this already exists and I missed it, a docs pointer would be great.
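To make the gap more concrete, here is a rough sketch of the kind of lifecycle surface I have in mind. Every name in it (`SandboxProvider`, `SandboxHandle`, `snapshot`, `restore`, and so on) is hypothetical and does not exist in the SDK as far as I know. The in-memory implementation is only there to exercise the interface; it provides no isolation at all.

```typescript
// Hypothetical shapes: none of these names exist in the Copilot SDK today.
type SnapshotId = string;

interface ExecResult {
  exitCode: number;
  stdout: string;
}

interface SandboxHandle {
  writeFile(path: string, contents: string): Promise<void>;
  readFile(path: string): Promise<string>;
  exec(command: string, args: string[]): Promise<ExecResult>;
  snapshot(): Promise<SnapshotId>;
  restore(id: SnapshotId): Promise<void>;
  dispose(): Promise<void>;
}

interface SandboxProvider {
  readonly name: string;
  create(): Promise<SandboxHandle>;
}

// In-memory stand-in used only to exercise the interface. No real isolation.
class InMemorySandbox implements SandboxHandle {
  private files = new Map<string, string>();
  private snapshots = new Map<SnapshotId, Map<string, string>>();
  private nextSnapshot = 1;

  async writeFile(path: string, contents: string): Promise<void> {
    this.files.set(path, contents);
  }

  async readFile(path: string): Promise<string> {
    const contents = this.files.get(path);
    if (contents === undefined) throw new Error(`no such file: ${path}`);
    return contents;
  }

  async exec(command: string, args: string[]): Promise<ExecResult> {
    // Only a toy "cat" is supported; a real provider would spawn a process
    // inside the sandbox boundary.
    const [path] = args;
    if (command === "cat" && args.length === 1 && path !== undefined) {
      return { exitCode: 0, stdout: await this.readFile(path) };
    }
    return { exitCode: 127, stdout: "" };
  }

  async snapshot(): Promise<SnapshotId> {
    const id = `snap-${this.nextSnapshot++}`;
    this.snapshots.set(id, new Map(this.files)); // copy, so later writes do not leak in
    return id;
  }

  async restore(id: SnapshotId): Promise<void> {
    const saved = this.snapshots.get(id);
    if (saved === undefined) throw new Error(`unknown snapshot: ${id}`);
    this.files = new Map(saved);
  }

  async dispose(): Promise<void> {
    this.files.clear();
    this.snapshots.clear();
  }
}

const inMemoryProvider: SandboxProvider = {
  name: "in-memory",
  create: async () => new InMemorySandbox(),
};

// Write a file, checkpoint, overwrite it, then roll back to the checkpoint.
async function demo(): Promise<[string, string]> {
  const sandbox = await inMemoryProvider.create();
  await sandbox.writeFile("/workspace/repo/README.md", "v1");
  const checkpoint = await sandbox.snapshot();
  await sandbox.writeFile("/workspace/repo/README.md", "v2");
  const afterEdit = (await sandbox.exec("cat", ["/workspace/repo/README.md"])).stdout;
  await sandbox.restore(checkpoint);
  const afterRestore = await sandbox.readFile("/workspace/repo/README.md");
  await sandbox.dispose();
  return [afterEdit, afterRestore]; // ["v2", "v1"]
}
```

A Docker or hosted provider would implement the same two interfaces, which is what would let the session code stay provider-agnostic.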
Motivation
Many agentic SDK use cases need more than tool allow/deny decisions. They need a concrete execution boundary for untrusted or model-directed work:
- coding agents that edit files and run tests without touching the host directly
- multi-tenant SaaS apps embedding Copilot agent workflows
- eval harnesses that need reproducible, isolated workspaces
- agents that install packages or run scripts
- MCP servers that should be constrained to specific filesystem/network scopes
- workflows that produce artifacts the host app can inspect after the run
The README says the SDK can enable first-party tools that perform filesystem operations, Git operations, and web requests. Permission hooks are useful for policy, but they are not a replacement for OS/container-level isolation.
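A small illustration of why a policy hook alone is not a boundary. This is a deliberately naive sketch: the request shape and the `denied` decision are assumptions of mine, not the SDK's actual `onPermissionRequest` payload (only `{ kind: "approved" }` is taken from the example further down). The handler only sees the command string, so anything resolved at run time slips through.

```typescript
// Hypothetical request/decision shapes; the real SDK types may differ.
interface ShellPermissionRequest {
  kind: "shell";
  command: string;
}

type Decision = { kind: "approved" } | { kind: "denied"; reason: string };

const workspaceRoot = "/workspace/repo";

// Naive policy: deny any command that mentions an absolute path outside the workspace.
function decide(req: ShellPermissionRequest): Decision {
  const tokens = req.command.split(/\s+/);
  const escapes = tokens.some(
    (t) =>
      t.startsWith("/") &&
      t !== workspaceRoot &&
      !t.startsWith(workspaceRoot + "/"),
  );
  return escapes
    ? { kind: "denied", reason: "absolute path outside workspace" }
    : { kind: "approved" };
}

// Caught: the path is spelled out literally.
const literal = decide({ kind: "shell", command: "cat /etc/passwd" }); // denied

// Missed: a relative path that climbs out of the workspace.
const relative = decide({ kind: "shell", command: "cat ../../etc/passwd" }); // approved

// Missed: the path only exists after shell expansion.
const expanded = decide({ kind: "shell", command: 'sh -c "cat $HOME/.ssh/id_rsa"' }); // approved
```

A smarter parser would catch these two cases, but not every script, package install hook, or MCP server the agent might launch. An OS- or container-level boundary constrains what the process can actually reach, regardless of how the command was phrased.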
Prior art / comparison
The OpenAI Agents SDK has a first-class sandbox concept where the agent runtime can connect to local/Docker/hosted sandbox providers and the sandbox owns files, commands, ports, mounts, and persisted state.
Anthropic also has sandbox-runtime (srt), a lightweight sandboxing tool designed for agent/MCP/server process isolation using OS sandbox primitives and network/filesystem restrictions.
It would be valuable for Copilot SDK to expose an equivalent integration surface, while still preserving the existing Copilot SDK tool and hook model.
Proposed API direction
One possible shape:
const session = await client.createSession({
  model: "gpt-4.1",
  sandbox: {
    provider: "docker", // or "local", "hosted", custom provider adapter
    image: "ghcr.io/example/copilot-agent-sandbox:latest",
    workspace: {
      mounts: [
        { source: "/path/to/repo", target: "/workspace/repo", mode: "rw" },
      ],
      persist: true,
    },
    network: {
      allow: ["github.com", "registry.npmjs.org"],
      default: "deny",
    },
    resources: {
      cpu: 2,
      memoryMb: 4096,
      timeoutSeconds: 1800,
    },
  },
  onPermissionRequest: async (req) => ({ kind: "approved" }),
});
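For the `network` block, one possible interpretation is sketched below. The `NetworkPolicy` type and `isHostAllowed` helper are hypothetical, and the rule that an allowed entry also covers its subdomains is my assumption; exact-match-only would be an equally reasonable choice and should be spelled out in the final design. In a real provider this check would have to be enforced by the sandbox itself (proxy, firewall rules, or similar) rather than in host-side code.

```typescript
// Hypothetical mirror of the `network` block in the proposed config.
interface NetworkPolicy {
  allow: string[];
  default: "allow" | "deny";
}

// Assumed semantics: an entry matches the host itself and any of its subdomains.
function isHostAllowed(policy: NetworkPolicy, host: string): boolean {
  // Normalize case and strip a trailing dot from fully qualified names.
  const normalized = host.toLowerCase().replace(/\.$/, "");
  const matched = policy.allow.some((entry) => {
    const allowed = entry.toLowerCase();
    return normalized === allowed || normalized.endsWith(`.${allowed}`);
  });
  return matched || policy.default === "allow";
}

const policy: NetworkPolicy = {
  allow: ["github.com", "registry.npmjs.org"],
  default: "deny",
};

isHostAllowed(policy, "registry.npmjs.org"); // true: exact match
isHostAllowed(policy, "api.github.com");     // true: subdomain of an allowed entry
isHostAllowed(policy, "evilgithub.com");     // false: suffix match requires a dot boundary
isHostAllowed(policy, "example.com");        // false: falls through to default "deny"
```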