python sdk multimodal #9
…ypes (SUBCON-468)

Generated Pydantic schemas from monorepo packages/schemas/ JSON Schema source of truth. Adds Image helper (from_path/bytes/url/blob_ref), extends RunInput.content with canonical ContentBlock list, client-side capability check + 5 MB size guard, densify_trace async helper for post-training data collection with bounded concurrency + streaming JSONL writer.

MM-11 subconscious._schemas/ + Image helper + content types
MM-12 client serialization + capability snapshot + RequestTooLargeError
MM-13 densify_trace + python -m subconscious densify CLI

Min Python bumped to 3.10 (datamodel-code-generator requires it). pydantic>=2.0.0 added as dep. 33 tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… - should actually work now
  # Run status types
- RunStatus = Literal["queued", "running", "succeeded", "failed", "canceled", "timed_out"]
+ RunStatus = Literal['queued', 'running', 'succeeded', 'failed', 'canceled', 'timed_out']
Do we use timed_out? If not, let's delete it and count it as a kind of failure.
Engine = Literal[
    'tim',
    'tim-edge',
    'tim-claude',
    'tim-claude-heavy',
    'tim-oss-local',
    'tim-1.5',
    'tim-gpt-heavy-tc',
]
See Slack. I'd imagine this will change often, so not high priority.
The only models we should publicly support are:
tim-1.5 (default)
tim-claude
tim-claude-heavy
tim (offline)
tim-edge (offline)
Remove from SDK and platform:
tim-gpt
tim-gpt-heavy
tim-gpt-heavy-tc
tim-oss-local
answer: str = ''
reasoning: list[ReasoningTask] | None = None
parsed_answer: Any = None
Looks like a cleaner way to handle answer_format. As long as Hongyin signs off this looks good
answer_format: Optional[OutputSchema] = None
tools: list[Tool] = field(default_factory=list)
resources: list[str] | None = None
skills: list[str] | None = None
What is the skills plan here? Is this the complete skill dumped into the request, or just the headline? How is the rest retrieved?
I think you can leave skills out for the moment, and we can handle this appropriately in the future. If you feel strongly, your call.
"""JSON Schema for the answer output format. Use pydantic_to_schema() to generate from Pydantic."""
reasoning_format: Optional[OutputSchema] = None
"""JSON Schema for the reasoning output format. Use pydantic_to_schema() to generate from Pydantic."""
content: list[ContentBlock] | None = None
Summary
This PR ships the full multimodal surface for the Python SDK and bumps the package to 1.0.0. It's a breaking change: response types are overhauled to match the API wire format, Python 3.8/3.9 are dropped, and `pydantic` becomes a hard dependency.

What changed and why

Multimodal input (`Image` helper + `ContentBlock` types)

New: `Image.from_path()`, `Image.from_bytes()`, `Image.from_url()`, `Image.from_blob_ref()` in `subconscious/content.py`. Users can now pass images alongside text instructions in a `content: list[ContentBlock]` field on `RunInput`.

Why: SUBCON-468. The API has accepted multimodal content blocks for a while; the SDK didn't expose them. This surfaces the full `TextContent | ImageContent` union with all three image source kinds (`base64`, `url`, `blob_ref`) and enforces MIME validation client-side.

`ToolResponse` type

New: `ToolResponse.build(tool_call_id, content)` lets function-tool endpoints return images alongside text in a single normalized envelope.

Why: Tools that return screenshots, charts, or captured images previously had no clean path. This unblocks visual tool results end-to-end.
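
To make the client-side MIME validation concrete, here is a minimal stdlib-only sketch of building a text block and a base64 image block. It is not the SDK's implementation: the dict shape, the allow-list of MIME types, and the helper names `text_block` and `image_block_from_bytes` are all assumptions made for illustration. Only the source kind name `base64` comes from this PR.

```python
import base64

# Hypothetical allow-list; the SDK's real set of accepted MIME types is not stated in this PR.
ALLOWED_IMAGE_MIME_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def text_block(text: str) -> dict:
    """Build a text content block (assumed shape, not the confirmed wire format)."""
    return {"type": "text", "text": text}


def image_block_from_bytes(data: bytes, mime_type: str) -> dict:
    """Build a base64 image block, rejecting unsupported MIME types before any request is made."""
    if mime_type not in ALLOWED_IMAGE_MIME_TYPES:
        raise ValueError(f"unsupported image MIME type: {mime_type!r}")
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image",
        "source": {"kind": "base64", "mime_type": mime_type, "data": encoded},
    }


# A content list mixing text and an image (here just the 8-byte PNG signature).
content = [
    text_block("Describe this image."),
    image_block_from_bytes(b"\x89PNG\r\n\x1a\n", "image/png"),
]
```

Validating at construction time means a bad MIME type fails where the block is built, not as a server-side rejection after upload.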

Response models → Pydantic `BaseModel` (breaking)

`Run`, `RunResult`, `Usage`, `RunError`, `ReasoningTask`, `AgentToolUse` are now Pydantic `BaseModel` instead of `@dataclass`. Field aliases handle the camelCase ↔ snake_case mapping automatically, so `_parse_run` is now just `Run.model_validate(data)`.

Why: The old dataclass construction in `_parse_run` was manual and fragile: it silently dropped any field the API added. Pydantic aliases give us automatic wire-format handling, validation at the boundary, and a natural path to strict mode.

Breaking: type checks must use `isinstance(run, BaseModel)`, not `is_dataclass(run)`. Construction syntax is unchanged.

`Usage` flattened (breaking)

`Usage.models` and `Usage.platform_tools` are removed. New fields: `input_tokens`, `output_tokens`, `duration_ms`, matching the actual `RunUsage` shape in the monorepo.

Why: The old nested `ModelUsage` / `PlatformToolUsage` lists never matched what the API returned. Any code accessing `run.usage.models[0].input_tokens` was already silently broken.

`ReasoningNode` → `ReasoningTask` (breaking)

`ReasoningNode` is deleted. `ReasoningTask` replaces it. `subtask` is renamed to `subtasks`. `tooluse` is now `Optional[AgentToolUse]` (typed) instead of `List[Any]`.

Why: Aligns with the monorepo `ReasoningTask` schema. The old name and shape were leftovers from an earlier design.

`RunResult.reasoning` is now `list[ReasoningTask] | None` (breaking)

Was `Optional[ReasoningNode]` (a single node). Now a list.

Why: The API always returns a list at the top level. The old model forced callers to treat a tree root specially.
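
For callers migrating from the single-node shape, the list-based `reasoning` field can be walked with one small recursive helper. The sketch below uses a stand-in dataclass so it runs without the SDK. The real `ReasoningTask` is a Pydantic model with more fields; only `title` and `subtasks` are taken from this PR, and `iter_titles` is a hypothetical name.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator


@dataclass
class ReasoningTask:
    """Stand-in for the SDK model; only the fields named in this PR are included."""

    title: str
    subtasks: list[ReasoningTask] | None = None


def iter_titles(reasoning: list[ReasoningTask] | None) -> Iterator[str]:
    """Yield task titles depth-first; None is treated like an empty list."""
    for task in reasoning or []:
        yield task.title
        yield from iter_titles(task.subtasks)


reasoning = [
    ReasoningTask("plan", subtasks=[ReasoningTask("search"), ReasoningTask("summarize")]),
    ReasoningTask("answer"),
]
titles = list(iter_titles(reasoning))
```

Because the top level and every `subtasks` field share the same `list | None` shape, the same function handles both and no special case for a tree root is needed.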

`Run.error`: new field

A `RunError` model with `code` and `message` is now populated on failed runs.

Why: Previously a failed run had `status="failed"` with no detail. This surfaces the error without requiring a separate fetch.

`Engine` literal expanded

Added: `"tim"`, `"tim-edge"`, `"tim-oss-local"`, `"tim-1.5"`, `"tim-gpt-heavy-tc"`.

Why: These engines are live in the monorepo and were missing from the type.

Request wire-format (`CreateRunBody` + `RunInputWire`)

New: `CreateRunBody.build(engine, input).to_dict()` centralizes all payload construction for both `run()` and `stream()`. Built-in 5 MB size check raises `RequestTooLargeError` before the request goes out.

Why: Payload construction was duplicated between `run()` and `stream()` with diverging logic. Consolidating into Pydantic models validates structure, enforces the size limit, and normalizes tools/content in one place.

Python 3.10+ required (breaking)

`requires-python = ">=3.10"`. CI drops 3.8 and 3.9 test matrix rows, adds 3.13.

Why: `X | Y` union syntax and built-in generics (`dict[str, Any]`, `list[T]`) are used throughout. Backporting to 3.8 would require `from __future__ import annotations` everywhere and `Optional[X]` / `Union[X, Y]` in all signatures, a net negative for readability.

Tooling: `ruff` replaces `black` + `isort`, `pre-commit` added

Why: Faster, single tool, consistent with other repos in the workspace. Pre-commit hooks enforce formatting before commit and in CI.
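
Returning to the request size check described under the wire-format section above, a pre-flight guard of that kind can be sketched with the standard library alone. This is an illustration under assumptions, not the SDK's code: the exception class here is a local stand-in for `RequestTooLargeError`, the limit assumes "5 MB" means 5 MiB, and the `serialize_checked` name and the `instructions` payload key are hypothetical.

```python
import json

# Assumes "5 MB" means 5 MiB; the SDK's exact threshold may differ.
MAX_REQUEST_BYTES = 5 * 1024 * 1024


class RequestTooLargeError(ValueError):
    """Local stand-in for the SDK exception of the same name."""


def serialize_checked(payload: dict, limit: int = MAX_REQUEST_BYTES) -> bytes:
    """Serialize the payload to JSON bytes, raising before anything is sent if it is too large."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) > limit:
        raise RequestTooLargeError(f"request body is {len(body)} bytes; limit is {limit}")
    return body


# Hypothetical payload shape, used only to exercise the guard.
small = serialize_checked({"engine": "tim-1.5", "input": {"instructions": "hi"}})
```

Measuring the encoded bytes, not the Python object, matters for multimodal input, since base64 image data inflates the body by roughly a third over the raw file size.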

`pydantic>=2.0.0` added as a hard dependency

Why: Required by the new Pydantic response models and `ToolResponse`. Previously it was an assumed-present optional dep for structured output users.

Breaking changes at a glance
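
- `ModelUsage`, `PlatformToolUsage`: removed
- `ReasoningNode` → `ReasoningTask`
- `run.usage.models` / `.platform_tools` → `run.usage.input_tokens` / `.output_tokens`
- `ReasoningTask.subtask` → `.subtasks`
- `run.result.reasoning`: single `ReasoningNode` → `list[ReasoningTask] | None`
- `Run`, `RunResult`, `Usage`, etc.: `@dataclass` → `BaseModel`
- `requires-python`: `>=3.8` → `>=3.10`
- `pydantic`: optional → hard dependency

Full migration guide with before/after code: `.cursor/skills/sdk-migration/SKILL.md`

Things to check before merging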

Examples (`subconscious/` repo)

- Access to `run.usage.models` or `run.usage.platform_tools` will raise `AttributeError` at runtime
- Imports of `ReasoningNode`, `ModelUsage`, or `PlatformToolUsage` will fail at import
- Uses of `run.result.reasoning.title` (treating reasoning as a single node) need to move to `run.result.reasoning[0].title` or iteration

Customer projects that may break

- `run.usage` access patterns: the flat model is a silent behavior change for any code that was iterating `.models`
- `isinstance(run, Run)` still works; `is_dataclass(run)` will now be `False`

Internal: Co-op and other product surfaces

- `Usage(models=[...], platform_tools=[...])` will fail; update to `Usage(input_tokens=..., output_tokens=...)`
- Code constructing `ReasoningNode(...)` directly needs to migrate to `ReasoningTask`
- Code walking a `.reasoning` node's `.subtask` list should move to `.subtasks` on the new list-based model

Docs (`subconscious-docs/`)

- `run.usage.models`: update to `run.usage.input_tokens`
- `ReasoningNode`: update to `ReasoningTask` + list shape
- `Image.from_path()`
- `pydantic` dependency and Python 3.10+ requirement in install instructions

Migration guide (quick reference)