Run the CodeGraff coding agent from Python — streaming events, multi-turn sessions, BYOK auth, and cloud sandboxes. Like the TypeScript SDK, the agent runs in-process via native bindings (PyO3, built with maturin); only the cloud Sandbox calls leave the process.
- Package: `codegraff` (PyPI)
- Source: `sdk/python/`
- Python: >= 3.9
- License: MIT
The Python SDK shares the same agent core and the same dhi-backed validation contract as the TypeScript SDK, but its API is idiomatic Python: a plain constructor, keyword arguments, and synchronous generators (`for ev in …`, not `async for`).
```bash
pip install codegraff
```

Platform support (0.1.0). Prebuilt wheels are currently published only for macOS Apple Silicon (arm64) on Python 3.12 / 3.13 / 3.14t. There is no sdist, so on any other target `pip install codegraff` fails with "no matching distribution". That includes Linux, Windows, Intel macOS, and Python 3.9–3.11 (even on Apple Silicon). The full wheel matrix will be published from CI; until it lands, build from source (below).
| Target | Status |
|---|---|
| macOS arm64 · Python 3.12 / 3.13 / 3.14t | ✅ on PyPI |
| Linux · Windows · macOS x86_64 | ⏳ via CI wheel matrix |
| Python 3.9 / 3.10 / 3.11 | ⏳ via CI wheel matrix |
The crate depends on sibling workspace crates (../../crates/*) that can't be
vendored into an sdist, so pip install can't build it on an unsupported
target. Build from the full repository with maturin (needs a Rust toolchain):
```bash
git clone https://github.com/justrach/codegraff
cd codegraff/sdk/python
pip install maturin && maturin develop --release
```

The native module is `codegraff._native`; you import everything from the
top-level `codegraff` package.
```python
from codegraff import Graff

graff = Graff()  # reads ~/.forge/forge.toml, like the graff CLI

for ev in graff.chat("explain monads in 3 sentences"):
    if ev.type == "TaskMessage" and ev.data.get("content", {}).get("kind") == "Markdown":
        print(ev.data["content"]["text"], end="", flush=True)
```

Every event is an `AgentEvent` dataclass: `ev.type` is the event name and
`ev.data` is a dict with the rest of the payload.
Graff(...) is a synchronous constructor. Pass provider + api_key to
register a bring-your-own-key credential and pin the session model in one call;
omit them to fall back to ~/.forge/forge.toml.
```python
import os
from codegraff import Graff

graff = Graff(
    provider="codegraff",  # snake_case: "openai" | "anthropic" | "open_router" | "xai" | ...
    api_key=os.environ["CODEGRAFF_API_KEY"],
    model="deepseek-v4-pro",
    max_tokens=8192,
)
```

| Argument | Type | Notes |
|---|---|---|
| `cwd` | `str` | Workspace root. Defaults to `os.getcwd()`. |
| `provider` | `str` | Provider id (snake_case). Required if `api_key` is supplied. |
| `api_key` | `str` | BYOK credential; persisted via the same auth store the CLI uses. |
| `model` | `str` | Pin the session model. |
| `max_tokens` | `int` | Override the default max-tokens cap. |
Under the hood the constructor sets FORGE_SESSION__PROVIDER_ID,
FORGE_SESSION__MODEL_ID, and FORGE_MAX_TOKENS for the process, then (when a
key is given) calls upsert_credential. Passing api_key without provider
raises ValueError.
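As a rough illustration of that rule, the sketch below mirrors the documented mapping in plain Python. It is not the SDK's implementation: `resolve_session_env` is a hypothetical helper, and the assumption that a variable is only set when its argument is supplied is mine. Only the three environment variable names and the `ValueError` rule come from the text above.

```python
from typing import Optional


def resolve_session_env(
    provider: Optional[str] = None,
    api_key: Optional[str] = None,
    model: Optional[str] = None,
    max_tokens: Optional[int] = None,
) -> dict:
    """Hypothetical helper: map constructor arguments to the documented env vars."""
    # Documented rule: a key without a provider is rejected.
    if api_key is not None and provider is None:
        raise ValueError("api_key requires provider")

    env = {}
    # Assumption: a variable is only written when its argument was supplied.
    if provider is not None:
        env["FORGE_SESSION__PROVIDER_ID"] = provider
    if model is not None:
        env["FORGE_SESSION__MODEL_ID"] = model
    if max_tokens is not None:
        env["FORGE_MAX_TOKENS"] = str(max_tokens)
    return env
```

Because these settings are applied to the whole process, two `Graff` instances with different providers in one process may interfere with each other; that is an inference from the description above, not something the SDK documents.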
chat() is a blocking generator that yields AgentEvents as they arrive.
It releases the GIL while waiting on the agent, so it won't starve an async
server loop (see the turboAPI example).
```python
conversation_id = None
for ev in graff.chat("write a haiku about rust"):
    if ev.type == "ConversationStarted":
        conversation_id = ev.data["conversationId"]
    elif ev.type == "TaskMessage":
        content = ev.data.get("content") or {}
        if content.get("kind") == "Markdown":
            print(content.get("text", ""), end="", flush=True)
    elif ev.type == "ToolCallStart":
        print("\n→", ev.data["tool_call"]["name"])
    elif ev.type == "TaskComplete":
        print("\n[done]")
```

Signature: `chat(prompt: str, *, conversation_id: str | None = None, model: str | None = None)`.

Note: per-call `systemPrompt` / `customRules` / `temperature` overrides are currently TypeScript-only. The Python `chat()` takes `prompt`, `conversation_id`, and `model`.
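If you want to consume the blocking generator from `asyncio` code with `async for`, one option is to drive it from a worker thread. The sketch below is a generic pattern under that assumption, not part of the SDK: `iterate_in_thread` is a hypothetical helper that works with any blocking iterable, and whether you need it at all depends on your server framework.

```python
import asyncio
import threading
from typing import AsyncIterator, Callable, Iterable, TypeVar

T = TypeVar("T")
_DONE = object()  # sentinel marking the end of the stream


async def iterate_in_thread(make_iterable: Callable[[], Iterable[T]]) -> AsyncIterator[T]:
    """Run a blocking iterable in a worker thread and yield its items asynchronously."""
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()

    def worker() -> None:
        try:
            for item in make_iterable():
                # Hand each item to the event loop thread in order.
                loop.call_soon_threadsafe(queue.put_nowait, item)
        except Exception as exc:
            # Forward errors so the consumer sees them.
            loop.call_soon_threadsafe(queue.put_nowait, exc)
        finally:
            loop.call_soon_threadsafe(queue.put_nowait, _DONE)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = await queue.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item
```

With the SDK this would be used as `async for ev in iterate_in_thread(lambda: graff.chat("hi")): ...`. Note that the worker keeps draining the generator even if the consumer stops early; a production version would need a cancellation signal.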
```python
from dataclasses import dataclass


@dataclass
class AgentEvent:
    type: str
    data: dict
```

The `type` values and `data` shape mirror the TS SDK's `AgentEvent` union:

| `type` | Key fields in `data` |
|---|---|
| `ConversationStarted` | `conversationId` (synthesised by the SDK as the first event) |
| `TaskMessage` | `content`: `{kind: "Markdown", text, partial}` / `{kind: "ToolInput", …}` / `{kind: "ToolOutput", text}` |
| `TaskReasoning` | `content` (str) |
| `ToolCallStart` | `tool_call`: `{name, arguments, call_id?}` |
| `ToolCallEnd` | `result`: `{name, output}` |
| `RetryAttempt` | `cause`, `duration_ms` |
| `Interrupt` | `reason`: `{kind, limit}` |
| `TaskComplete` | — |
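To show how these event shapes can be folded into a single result, here is a minimal sketch. It uses a local stand-in `AgentEvent` with the documented two fields so that it runs without the SDK; `TurnSummary` and `summarise` are hypothetical names. It also assumes that Markdown `text` values can simply be concatenated, as the streaming loop above does; the exact meaning of the `partial` flag is not covered here.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class AgentEvent:
    """Local stand-in with the documented shape (type + data)."""
    type: str
    data: dict


@dataclass
class TurnSummary:
    """Hypothetical container for the outcome of one turn."""
    conversation_id: str = ""
    markdown: str = ""
    tools: List[str] = field(default_factory=list)
    completed: bool = False


def summarise(events: Iterable[AgentEvent]) -> TurnSummary:
    """Fold a stream of events into a TurnSummary."""
    summary = TurnSummary()
    for ev in events:
        if ev.type == "ConversationStarted":
            summary.conversation_id = ev.data["conversationId"]
        elif ev.type == "TaskMessage":
            content = ev.data.get("content") or {}
            # Only Markdown content is user-facing text; skip tool input/output.
            if content.get("kind") == "Markdown":
                summary.markdown += content.get("text", "")
        elif ev.type == "ToolCallStart":
            summary.tools.append(ev.data["tool_call"]["name"])
        elif ev.type == "TaskComplete":
            summary.completed = True
    return summary
```

With the real SDK, `summarise(graff.chat("..."))` should work the same way, since it only relies on `ev.type` and `ev.data`.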
Pass conversation_id back on later turns, or use GraffSession to track it:
```python
session = graff.session(model="claude-opus-4-7")
for ev in session.send("add a logout button"):
    ...  # render
for ev in session.send("now write a test for it"):
    ...  # render
print(session.conversation_id)
```

`session(*, conversation_id=None, model=None)` returns a `GraffSession`; the
first `send()` captures and stores the new `conversation_id` automatically.
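The capture step can be pictured with a small wrapper. This is a sketch of the idea only, not the SDK's `GraffSession`: `TrackingSession` is a hypothetical class, and it takes any callable with the same shape as `chat()` so it can be exercised without the native module.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class AgentEvent:
    """Local stand-in with the documented shape (type + data)."""
    type: str
    data: dict


class TrackingSession:
    """Hypothetical wrapper that remembers the conversation id across turns."""

    def __init__(
        self,
        chat: Callable[..., Iterator[AgentEvent]],
        conversation_id: Optional[str] = None,
    ) -> None:
        self._chat = chat
        self.conversation_id = conversation_id

    def send(self, prompt: str) -> Iterator[AgentEvent]:
        # Pass the stored id back so later turns continue the same conversation.
        for ev in self._chat(prompt, conversation_id=self.conversation_id):
            if ev.type == "ConversationStarted" and self.conversation_id is None:
                self.conversation_id = ev.data["conversationId"]
            yield ev
```

Because `send()` is a generator, the id is only captured once the caller starts iterating, so read `conversation_id` after consuming at least the first event.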
```python
recent = graff.list_conversations(20)            # list[dict]
last = graff.last_conversation()                 # dict | None
one = graff.get_conversation(last["id"])         # dict | None
result = graff.compact_conversation(last["id"])  # dict: {original_tokens, compacted_tokens, ...}
graff.delete_conversation(last["id"])
```

```python
agents = graff.get_agent_infos()  # list[dict] (no provider required)
graff.upsert_credential("anthropic", "sk-ant-…")
graff.remove_credential("anthropic")
print(graff.version())
```

Isolated cloud VMs for shell exec and file ops, managed through the CodeGraff
gateway. The sandbox client authenticates with CODEGRAFF_API_KEY (or
CG_API_KEY); override the endpoint with CODEGRAFF_GATEWAY_URL (default
https://gateway.codegraff.com).
```python
sandbox = graff.create_sandbox(language="javascript", auto_stop_minutes=30)
res = sandbox.exec("node -v")  # {"exitCode": 0, "result": "v22.x\n", ...}
sandbox.upload("hello\n", "/tmp/hello.txt")
data = sandbox.download("/tmp/hello.txt")  # bytes
sandbox.stop()     # preserves state, stops billing
sandbox.start()    # resume
sandbox.destroy()  # permanent

graff.list_sandboxes()  # list[dict]
graff.get_sandbox(sandbox.id)
```

`Sandbox` methods: `create(graff, *, language="javascript", auto_stop_minutes=30, labels=None)`,
`get`, `list`, `info()`, `exec(command, *, cwd=None, env=None, timeout_seconds=300)`,
`upload(content, dest_path)`, `download(path)`, `stop()`, `start()`, `destroy()`.
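Since a sandbox is a remote resource, it can be convenient to tie its lifetime to a `with` block so it is destroyed even when the work inside raises. The sketch below is one way to do that; `temporary_sandbox` is a hypothetical helper, not an SDK function, and it only relies on the documented `destroy()` method.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

S = TypeVar("S")


@contextmanager
def temporary_sandbox(create: Callable[[], S]) -> Iterator[S]:
    """Create a sandbox, hand it to the with-block, and always destroy it."""
    sandbox = create()
    try:
        yield sandbox
    finally:
        # destroy() is permanent; use stop() instead if state must be kept.
        sandbox.destroy()
```

A possible call site would be `with temporary_sandbox(lambda: graff.create_sandbox(language="javascript")) as sb: sb.exec("node -v")`.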
Graff(...) and chat(...) validate their arguments with
dhi (dhi>=1.3.3) before any request leaves
the client — the same Pydantic-/Zod-compatible contract the TypeScript SDK uses.
Invalid input (e.g. a non-string prompt) is rejected at the boundary rather
than deep inside a request.
A complete example lives in sdk/python/example/:
the agent runs in-process and is exposed as a fast HTTP API on
turboAPI's Zig HTTP core, with request
bodies validated by dhi before the handler runs.
```python
import json
import os

from turboapi import TurboAPI, HTTPException
from turboapi.responses import StreamingResponse
from dhi import BaseModel
from codegraff import AgentEvent, Graff, version as sdk_version

graff = Graff(provider="codegraff", api_key=os.environ.get("CODEGRAFF_API_KEY"))
app = TurboAPI()


class ChatRequest(BaseModel):  # validated by dhi's Zig core before the handler runs
    prompt: str
    model: str | None = None
    conversation_id: str | None = None


@app.post("/chat/stream")
def chat_stream(req: ChatRequest):
    if not req.prompt:
        raise HTTPException(status_code=400, detail="prompt is required")

    def sse():
        # Re-emit each agent event as one server-sent event.
        for ev in graff.chat(req.prompt, conversation_id=req.conversation_id, model=req.model):
            yield f"data: {json.dumps({'type': ev.type, 'data': ev.data})}\n\n"
        yield "data: [DONE]\n\n"

    return StreamingResponse(sse(), media_type="text/event-stream")
```

```bash
curl -N localhost:8000/chat/stream -H 'content-type: application/json' \
  -d '{"prompt":"List three things you can do."}'
# data: {"type": "ConversationStarted", "data": {"conversationId": "..."}}
# data: {"type": "TaskMessage", "data": {"content": {"kind": "Markdown", "text": "..."}}}
# ...
# data: [DONE]
```

Runtime caveat: turboAPI requires free-threaded Python 3.14t and a Zig
toolchain. CodeGraff currently ships cp313 wheels and does not yet declare
free-threading support, so for this example you build CodeGraff from source
against your 3.14t venv (maturin develop --release). Importing it on a 3.14t
interpreter re-enables the GIL with a warning — but chat() releases the GIL
while waiting on the agent regardless, so agent calls aren't GIL-bound. Tracked
as a follow-up. See the example's README
for full details.
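On the client side, the `/chat/stream` output shown above can be decoded with a few lines of Python. This is a minimal sketch that assumes the framing of that example (one JSON object per `data:` line, terminated by `data: [DONE]`); `parse_sse` is a hypothetical helper and does not implement the full server-sent events format.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield decoded event dicts from 'data: ...' lines until [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream marker used by the example server
        yield json.loads(payload)
```

Any source of text lines works as input, for example the line iterator of a streaming HTTP response from whichever client library you use.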
```python
from typing import Iterator

from codegraff import Graff, GraffSession, Sandbox, AgentEvent, version


class Graff:
    def __init__(self, *, cwd=None, provider=None, api_key=None, model=None, max_tokens=None): ...
    def chat(self, prompt, *, conversation_id=None, model=None) -> Iterator[AgentEvent]: ...
    def session(self, *, conversation_id=None, model=None) -> GraffSession: ...

    # conversations
    def list_conversations(self, limit=None) -> list[dict]: ...
    def get_conversation(self, id) -> dict | None: ...
    def last_conversation(self) -> dict | None: ...
    def delete_conversation(self, id) -> None: ...
    def compact_conversation(self, id) -> dict: ...

    # agents / auth
    def get_agent_infos(self) -> list[dict]: ...
    def upsert_credential(self, provider_id, api_key, extra_params=None) -> None: ...
    def remove_credential(self, provider_id) -> None: ...

    # sandboxes
    def create_sandbox(self, **kwargs) -> Sandbox: ...
    def get_sandbox(self, id) -> Sandbox: ...
    def list_sandboxes(self) -> list[dict]: ...

    def version(self) -> str: ...


class GraffSession:
    conversation_id: str | None
    def send(self, prompt) -> Iterator[AgentEvent]: ...


def version() -> str: ...
```

| Capability | TS (`@codegraff/sdk`) | Python (`codegraff`) |
|---|---|---|
| Init | `await Graff.init({...})` | `Graff(...)` |
| Streaming | `for await` (async) | `for` (sync generator) |
| Per-call `systemPrompt` / `customRules` / `temperature` | ✅ | — (planned) |
| `rename_conversation` | ✅ | — |
| `list_trajectory` | ✅ | — |
| MCP config read/write | ✅ | — |
| Sandboxes | ✅ | ✅ |
| dhi validation | ✅ | ✅ |
- Source: `sdk/python/` · example: `example/`
- TypeScript SDK: docs/sdk/typescript.md
- CLI: README