Production Deployment

Guide for deploying py-code-mode in production environments.

Architecture

Production deployments typically combine:

  • RedisStorage - Shared workflow library across instances
  • ContainerExecutor - Isolated code execution
  • Pre-configured dependencies - Locked down environment
  • Monitoring and observability - Health checks and logging

import os
from py_code_mode import Session, RedisStorage
from py_code_mode.execution import ContainerExecutor, ContainerConfig

# Shared workflow library
storage = RedisStorage(url=os.getenv("REDIS_URL"), prefix="production")

# Isolated execution with authentication and pre-configured deps
config = ContainerConfig(
    timeout=60.0,
    allow_runtime_deps=False,  # Lock down package installation
    auth_token=os.getenv("CONTAINER_AUTH_TOKEN"),  # Required for production
    deps=["pandas>=2.0", "numpy", "requests"],  # Pre-configured dependencies
)
executor = ContainerExecutor(config)

async with Session(storage=storage, executor=executor, sync_deps_on_start=True) as session:
    result = await session.run(agent_code)

Security Best Practices

1. Enable API Authentication

The container HTTP API requires authentication by default. Never deploy without authentication.

# Load token from environment/secret store
token = os.getenv("CONTAINER_AUTH_TOKEN")
# Or: token = azure_keyvault.get_secret("container-auth-token")
# Or: token = hashicorp_vault.read("secret/container-auth")["token"]

config = ContainerConfig(
    auth_token=token,  # Required - server refuses to start without it
)

Fail-closed design: If you forget to configure auth, the container refuses to start. This prevents accidental unauthenticated deployments.
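You can make the same fail-closed behavior explicit in your own startup path, so a misconfigured deployment dies before any container is created. A minimal sketch (the load_auth_token helper is hypothetical, not part of the library):

```python
import os

def load_auth_token() -> str:
    """Fail fast at process startup if the container auth token is missing."""
    token = os.getenv("CONTAINER_AUTH_TOKEN")
    if not token:
        raise RuntimeError(
            "CONTAINER_AUTH_TOKEN is not set; refusing to start. "
            "Never deploy the container API without authentication."
        )
    return token
```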

2. Lock Down Dependencies

Prevent agents from installing arbitrary packages:

config = ContainerConfig(
    allow_runtime_deps=False,  # Block runtime installation
    deps=["pandas>=2.0", "requests>=2.28.0"],  # Pre-configure allowed packages
)
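If the dependency list itself comes from upstream configuration, filtering it against an allowlist before building the ContainerConfig adds a second layer. An illustrative sketch, assuming exact-name matching (ALLOWED_DEPS and filter_deps are hypothetical helpers, not library API):

```python
# Hypothetical allowlist applied before constructing ContainerConfig
ALLOWED_DEPS = {"pandas", "numpy", "requests"}

def filter_deps(requested: list[str]) -> list[str]:
    allowed = []
    for req in requested:
        # Strip version specifiers ("pandas>=2.0" -> "pandas") before comparing
        name = req.split(">=")[0].split("==")[0].split("<")[0].strip()
        if name in ALLOWED_DEPS:
            allowed.append(req)
    return allowed
```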

3. Use Container Isolation

Run untrusted agent code in containers:

executor = ContainerExecutor(ContainerConfig(
    timeout=60.0,
    auth_token=os.getenv("CONTAINER_AUTH_TOKEN"),
    network_disabled=False,  # Set True to disable network
    memory_limit="512m",
    cpu_quota=None
))

4. Validate Input

Never trust agent code without validation:

# Bad: Direct execution
result = await session.run(user_provided_code)

# Better: Validation layer
if is_safe(user_provided_code):
    result = await session.run(user_provided_code)
else:
    raise SecurityError("Unsafe code detected")
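What is_safe looks like is application-specific. One naive possibility is an AST-based import check, sketched below; static checks like this are easy to bypass, so treat them as defense in depth alongside container isolation, never a replacement for it:

```python
import ast

BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil"}

def is_safe(code: str) -> bool:
    """Naive static check: reject code that imports blocked modules."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False  # unparseable code is rejected outright
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] in BLOCKED_MODULES
                   for alias in node.names):
                return False
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in BLOCKED_MODULES:
                return False
    return True
```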

5. Isolate Storage by Tenant

Use a stable environment prefix plus workspace_id for multi-tenant deployments:

def get_storage(tenant_id: str, redis_url: str) -> RedisStorage:
    return RedisStorage(
        url=redis_url,
        prefix="production",
        workspace_id=tenant_id,
    )

If workspace_id is omitted, the system uses the legacy default namespace. That is one shared unscoped namespace, so multi-tenant deployments should set workspace_id explicitly.
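One way to picture the resulting isolation is as a composed key namespace: prefix scopes the environment, workspace_id scopes the tenant. The actual Redis key layout is library-defined; this helper is purely illustrative:

```python
def redis_key(prefix: str, workspace_id: str, kind: str, name: str) -> str:
    # Each tenant's entries live under a distinct namespace, so one tenant
    # can never read or overwrite another tenant's workflows.
    return f"{prefix}:{workspace_id}:{kind}:{name}"
```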


Scalability Patterns

Horizontal Scaling

Multiple agent instances share the workflow library via Redis:

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Instance 1 │────▶│    Redis    │◀────│  Instance 2 │
└─────────────┘     │ (Workflows) │     └─────────────┘
                    └─────────────┘
                           ▲
                           │
                    ┌─────────────┐
                    │  Instance 3 │
                    └─────────────┘

All instances benefit when any instance creates a workflow.

Load Balancing

# Each instance runs the same code
async def handle_request(agent_code: str, tenant_id: str):
    storage = get_storage(tenant_id, os.environ["REDIS_URL"])
    executor = ContainerExecutor(config)

    async with Session(storage=storage, executor=executor) as session:
        return await session.run(agent_code)

A load balancer distributes requests across the instances.

Remote Session Servers

For remote ContainerExecutor(remote_url=...) deployments:

  • the client provides workspace_id through the storage backend
  • the session server creates an execution session_id
  • workflow/artifact isolation is enforced by the server's workspace-scoped storage bundle

Configure the session server with server-owned storage roots:

  • storage_base_path for file-backed storage
  • storage_prefix for Redis-backed storage

The host storage configuration and the remote session server must refer to the same logical backing store. In practice, Redis-backed storage is the recommended production topology for remote deployments because both sides can share one namespace directly.


Container Image Management

Building Images

# Build base image
docker build -t py-code-mode:base -f docker/Dockerfile.base .

# Build with additional tools
docker build -t py-code-mode:tools -f docker/Dockerfile.tools .

Updating Images

When you update py-code-mode library code:

# Rebuild images with new code
docker build -t py-code-mode:base -f docker/Dockerfile.base .

# Restart containers to use new image
# (Kubernetes will do this automatically on rollout)

Multi-Stage Builds

Use multi-stage builds to keep images small:

# Dockerfile.base
FROM python:3.11-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --user -r requirements.txt

FROM python:3.11-slim
COPY --from=builder /root/.local /root/.local
COPY src/ /app/src/
ENV PATH=/root/.local/bin:$PATH
CMD ["python", "-m", "py_code_mode.container.server"]

Monitoring and Observability

Health Checks

from fastapi import FastAPI
from fastapi.responses import JSONResponse
from py_code_mode import Session, RedisStorage

app = FastAPI()

@app.get("/health")
async def health():
    try:
        # Check Redis connectivity (redis_client, storage, and executor are
        # module-level objects configured at startup)
        redis_client.ping()

        # Check the executor can start and run code
        async with Session(storage=storage, executor=executor) as session:
            await session.run("print('health check')")

        return {"status": "healthy"}
    except Exception as e:
        # Return 503 so load balancers take the instance out of rotation
        return JSONResponse(
            status_code=503,
            content={"status": "unhealthy", "error": str(e)},
        )

Logging

import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

async def run_agent(code: str):
    logger.info("Starting agent execution", extra={"code_length": len(code)})

    try:
        result = await session.run(code)
        logger.info("Execution succeeded", extra={"result_type": type(result.value).__name__})
        return result
    except Exception as e:
        logger.error("Execution failed", extra={"error": str(e)}, exc_info=True)
        raise

Metrics

Track key metrics:

  • Execution time per request
  • Success/failure rates
  • Workflow creation rate
  • Redis memory usage
  • Container startup time
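These can start as a minimal in-process tracker before you wire up Prometheus or StatsD. A sketch (the Metrics class below is hypothetical, not library API):

```python
import time
from collections import defaultdict
from contextlib import contextmanager

class Metrics:
    """Minimal in-process metrics; replace with Prometheus/StatsD in production."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.timings = defaultdict(list)

    def incr(self, name: str) -> None:
        self.counters[name] += 1

    @contextmanager
    def timer(self, name: str):
        # Record wall-clock duration of the wrapped block
        start = time.perf_counter()
        try:
            yield
        finally:
            self.timings[name].append(time.perf_counter() - start)

metrics = Metrics()
with metrics.timer("execution_seconds"):
    pass  # run the agent code here
metrics.incr("executions_succeeded")
```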

Example Deployment: Azure Container Apps

See examples/azure-container-apps/ for a complete production deployment example including:

  • Docker image configuration
  • Azure Container Apps deployment
  • Redis integration
  • Environment configuration
  • Scaling policies

Checklist

Before going to production:

  • Container API authentication configured (auth_token set from secret store)
  • Dependencies pre-configured and locked (allow_runtime_deps=False)
  • Using ContainerExecutor for isolation
  • Redis configured with persistence and backups
  • Health checks implemented
  • Logging and metrics in place
  • Multi-instance testing completed
  • Resource limits set (memory, CPU, timeout)
  • Secrets management configured (API keys, credentials, container auth token)
  • Disaster recovery plan documented
  • Monitoring and alerting configured