python sdk multimodal by pint-drinker · Pull Request #9 · subconscious-systems/subconscious-python

pint-drinker · 2026-04-20T13:04:21Z

Summary

This PR ships the full multimodal surface for the Python SDK and bumps the package to 1.0.0. It's a breaking change — response types are overhauled to match the API wire format, Python 3.8/3.9 are dropped, and pydantic becomes a hard dependency.

What changed and why

Multimodal input (`Image` helper + `ContentBlock` types)

New: Image.from_path(), Image.from_bytes(), Image.from_url(), Image.from_blob_ref() in subconscious/content.py. Users can now pass images alongside text instructions in a content: list[ContentBlock] field on RunInput.

Why: SUBCON-468. The API has accepted multimodal content blocks for a while; the SDK didn't expose them. This surfaces the full TextContent | ImageContent union with all three image source kinds (base64, url, blob_ref) and enforces MIME validation client-side.
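As a rough illustration of the client-side MIME validation described above, here is a minimal standalone sketch. It does not use the SDK: the function names, the allow-list, and the dict layout of the content block are all assumptions made for illustration, not the actual `subconscious/content.py` implementation.

```python
import base64
import mimetypes

# Assumption: the SDK's real allow-list may differ; these are common image types.
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


def guess_image_mime(filename: str) -> str:
    """Guess the MIME type from a file name and reject non-image types."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"cannot use {filename!r} as an image (guessed {mime_type})")
    return mime_type


def image_block_from_bytes(data: bytes, mime_type: str) -> dict:
    """Build a base64 image content block (hypothetical wire shape)."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported image MIME type: {mime_type}")
    return {
        "type": "image",
        "source": {
            "kind": "base64",
            "mime_type": mime_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }


def image_block_from_url(url: str) -> dict:
    """Build a URL-backed image content block (hypothetical wire shape)."""
    return {"type": "image", "source": {"kind": "url", "url": url}}
```

Validating before encoding means a bad file fails locally with a clear message instead of as a server-side rejection.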

`ToolResponse` type

New: ToolResponse.build(tool_call_id, content) lets function-tool endpoints return images alongside text in a single normalized envelope.

Why: Tools that return screenshots, charts, or captured images previously had no clean path. This unblocks visual tool results end-to-end.
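To make the "single normalized envelope" idea concrete, the sketch below normalizes mixed text and image content into a list of typed blocks. It is a standalone approximation: `ImageRef`, `build_tool_response`, and the envelope keys are hypothetical stand-ins, and the real `ToolResponse.build` may shape its output differently.

```python
from dataclasses import dataclass


@dataclass
class ImageRef:
    """Hypothetical stand-in for an image content block."""

    url: str


def build_tool_response(tool_call_id: str, content) -> dict:
    """Normalize a str, an ImageRef, or a list of either into one envelope."""
    items = content if isinstance(content, list) else [content]
    blocks = []
    for item in items:
        if isinstance(item, str):
            blocks.append({"type": "text", "text": item})
        elif isinstance(item, ImageRef):
            blocks.append({"type": "image", "source": {"kind": "url", "url": item.url}})
        else:
            raise TypeError(f"unsupported tool content: {type(item).__name__}")
    return {"tool_call_id": tool_call_id, "content": blocks}
```

Accepting a bare string as well as a list keeps the common text-only case short while still giving the caller one output shape to handle.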

Response models → Pydantic `BaseModel` (breaking)

Run, RunResult, Usage, RunError, ReasoningTask, AgentToolUse are now Pydantic BaseModel instead of @dataclass. Field aliases handle the camelCase ↔ snake_case mapping automatically — _parse_run is now just Run.model_validate(data).

Why: The old dataclass construction in _parse_run was manual and fragile — it silently dropped any field the API added. Pydantic aliases give us automatic wire-format handling, validation at the boundary, and a natural path to strict mode.

Breaking: code that checks is_dataclass(run) must switch to isinstance(run, BaseModel). Construction syntax is unchanged.

`Usage` flattened (breaking)

Usage.models and Usage.platform_tools are removed. New fields: input_tokens, output_tokens, duration_ms — matching the actual RunUsage shape in the monorepo.

Why: The old nested ModelUsage/PlatformToolUsage lists never matched what the API returned, so any code accessing run.usage.models[0].input_tokens was already silently broken.

`ReasoningNode` → `ReasoningTask` (breaking)

ReasoningNode is deleted. ReasoningTask replaces it. subtask renamed to subtasks. tooluse is now Optional[AgentToolUse] (typed) instead of List[Any].

Why: Aligns with the monorepo ReasoningTask schema. The old name and shape were leftovers from an earlier design.

`RunResult.reasoning` is now `list[ReasoningTask] | None` (breaking)

Was Optional[ReasoningNode] (a single node). Now a list.

Why: The API always returns a list at the top level. The old model forced callers to treat a tree root specially.
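With a list at the top level, callers can treat the root and nested levels uniformly. The sketch below walks such a structure depth-first. `Task` is a local stand-in for `ReasoningTask` that carries only `title` and `subtasks`; the real model has more fields.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Task:
    """Stand-in for ReasoningTask; only the fields used here."""

    title: str
    subtasks: "list[Task] | None" = None


def walk(tasks: "list[Task] | None", depth: int = 0) -> Iterator[tuple[int, str]]:
    """Yield (depth, title) for every task, depth-first."""
    for task in tasks or []:
        yield depth, task.title
        yield from walk(task.subtasks, depth + 1)
```

Because `walk` accepts `None`, it can presumably be called directly on `run.result.reasoning` without a separate guard.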

`Run.error` — new field

RunError model with code and message is now populated on failed runs.

Why: Previously a failed run had status="failed" with no detail. This surfaces the error without requiring a separate fetch.

`Engine` literal expanded

Added: "tim", "tim-edge", "tim-oss-local", "tim-1.5", "tim-gpt-heavy-tc".

Why: These engines are live in the monorepo and were missing from the type.

Request wire-format (`CreateRunBody` + `RunInputWire`)

New: CreateRunBody.build(engine, input).to_dict() centralizes all payload construction for both run() and stream(). A built-in 5 MB size check raises RequestTooLargeError before the request goes out.

Why: Payload construction was duplicated between run() and stream() with diverging logic. Consolidating into Pydantic models validates structure, enforces the size limit, and normalizes tools/content in one place.
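The size guard can be pictured as a single serialization step shared by both call paths. The following is a standalone sketch, not the SDK code: `encode_body` is a hypothetical helper, `RequestTooLargeError` is redefined locally, and reading "5 MB" as 5 × 1024 × 1024 bytes is an assumption.

```python
import json

MAX_REQUEST_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB


class RequestTooLargeError(ValueError):
    """Local stand-in for the SDK error of the same name."""


def encode_body(engine: str, run_input: dict) -> bytes:
    """Serialize a request body once and reject it if it exceeds the limit."""
    body = json.dumps({"engine": engine, "input": run_input}).encode("utf-8")
    if len(body) > MAX_REQUEST_BYTES:
        raise RequestTooLargeError(
            f"request body is {len(body)} bytes; limit is {MAX_REQUEST_BYTES}"
        )
    return body
```

Measuring the encoded bytes rather than the Python object matters for multimodal requests, since base64 image data inflates the payload by roughly a third.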

Python 3.10+ required (breaking)

requires-python = ">=3.10". CI drops 3.8 and 3.9 test matrix rows, adds 3.13.

Why: X | Y union syntax and built-in generics (dict[str, Any], list[T]) are used throughout. Backporting to 3.8 would require from __future__ import annotations everywhere and Optional[X] / Union[X, Y] in all signatures — net negative for readability.

Tooling: `ruff` replaces `black` + `isort`, `pre-commit` added

Why: Faster, single tool, consistent with other repos in the workspace. Pre-commit hooks enforce formatting before commit and in CI.

`pydantic>=2.0.0` added as a hard dependency

Why: Required by the new Pydantic response models and ToolResponse. Previously it was an assumed-present optional dep for structured output users.

Breaking changes at a glance

Symbol	Change
`ModelUsage`	Removed
`PlatformToolUsage`	Removed
`ReasoningNode`	Removed → use `ReasoningTask`
`run.usage.models` / `.platform_tools`	Removed → use `run.usage.input_tokens` / `.output_tokens`
`ReasoningTask.subtask`	Renamed → `.subtasks`
`run.result.reasoning`	Type changed: `Optional[ReasoningNode]` → `list[ReasoningTask] | None`
`Run`, `RunResult`, `Usage`, etc.	`@dataclass` → `BaseModel`
`requires-python`	`>=3.8` → `>=3.10`
`pydantic`	Now a required dependency

Full migration guide with before/after code: .cursor/skills/sdk-migration/SKILL.md

Things to check before merging

Examples (`subconscious/` repo)

Any example using run.usage.models or run.usage.platform_tools will raise AttributeError at runtime
Any example importing ReasoningNode, ModelUsage, or PlatformToolUsage will fail at import
Examples using run.result.reasoning.title (treating reasoning as a single node) need to move to run.result.reasoning[0].title or iteration
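One way to run these checks is a small text scan for the removed or renamed names. The helper below is a hypothetical sketch (`find_legacy_usage` is not part of the SDK or this PR). It matches on plain text, so it can flag comments and unrelated attributes, and it will not catch `run.result.reasoning.title` style access.

```python
import re

# Patterns taken from the removed or renamed symbols listed in this PR.
LEGACY_PATTERNS = {
    "ReasoningNode": re.compile(r"\bReasoningNode\b"),
    "ModelUsage": re.compile(r"\bModelUsage\b"),
    "PlatformToolUsage": re.compile(r"\bPlatformToolUsage\b"),
    "usage.models": re.compile(r"\busage\.models\b"),
    "usage.platform_tools": re.compile(r"\busage\.platform_tools\b"),
    ".subtask": re.compile(r"\.subtask\b"),  # does not match ".subtasks"
}


def find_legacy_usage(source: str) -> list[tuple[int, str]]:
    """Return (line_number, symbol) pairs for each legacy reference found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for symbol, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, symbol))
    return hits
```

Feeding each example file's text through `find_legacy_usage` gives a quick list of lines to review by hand.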

Customer projects that may break

Customers on Python 3.8 or 3.9 will get a hard install failure — confirm no known customers are on older Pythons before shipping to PyPI
run.usage access patterns: the flat model is a silent behavior change for any code that was iterating .models
isinstance(run, Run) still works; is_dataclass(run) will now be False

Internal — Co-op and other product surfaces

Any internal code constructing Usage(models=[...], platform_tools=[...]) will fail — update to Usage(input_tokens=..., output_tokens=...)
Any internal code calling ReasoningNode(...) directly needs to migrate to ReasoningTask
Co-op's reasoning display: if it iterates a single .reasoning node's .subtask list, move to .subtasks on the new list-based model
Check any internal dashboards or observability code that parses raw usage payloads from run responses

Docs (`subconscious-docs/`)

Usage section references run.usage.models — update to run.usage.input_tokens
Reasoning section references ReasoningNode — update to ReasoningTask + list shape
Add multimodal quickstart showing Image.from_path()
Note pydantic dependency and Python 3.10+ requirement in install instructions

Migration guide (quick reference)

# Install
pip install "subconscious-sdk>=1.0.0"
# pydantic is now a required dep — no separate install needed

# Usage — was
run.usage.models[0].input_tokens   # broken / wrong shape
# now
run.usage.input_tokens
run.usage.output_tokens
run.usage.duration_ms              # Optional[int]

# Reasoning — was
from subconscious import ReasoningNode
node = run.result.reasoning        # single node
node.subtask                       # list

# now
from subconscious import ReasoningTask
tasks = run.result.reasoning or [] # list
tasks[0].subtasks                  # Optional[List[ReasoningTask]]
tasks[0].tooluse.tool_name         # typed AgentToolUse

# Failed runs — new
if run.status == "failed" and run.error:
    print(f"{run.error.code}: {run.error.message}")

# Multimodal — new
from subconscious import Image, RunInput
client.run(
    engine="tim-claude",
    input=RunInput(
        instructions="What's in this image?",
        content=[Image.from_path("screenshot.png")],
    ),
)

# Tool responses with images — new
from subconscious import ToolResponse, Image
return ToolResponse.build(tool_call_id, [
    "Here's the screenshot:",
    Image.from_path("result.png"),
])

…ypes (SUBCON-468)

Generated Pydantic schemas from monorepo packages/schemas/ JSON Schema source of truth. Adds Image helper (from_path/bytes/url/blob_ref), extends RunInput.content with canonical ContentBlock list, client-side capability check + 5MB size guard, densify_trace async helper for post-training data collection with bounded concurrency + streaming JSONL writer.

MM-11 subconscious._schemas/ + Image helper + content types
MM-12 client serialization + capability snapshot + RequestTooLargeError
MM-13 densify_trace + python -m subconscious densify CLI

Min Python bumped to 3.10 (datamodel-code-generator requires it). pydantic>=2.0.0 added as dep. 33 tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… - should actually work now

jfobrien29 · 2026-04-23T20:10:03Z

+
 # Run status types
-RunStatus = Literal["queued", "running", "succeeded", "failed", "canceled", "timed_out"]
+RunStatus = Literal['queued', 'running', 'succeeded', 'failed', 'canceled', 'timed_out']


Do we use timed_out? If not let's delete and count this as a version of failure.

jfobrien29 · 2026-04-23T20:10:24Z

+Engine = Literal[
+    'tim',
+    'tim-edge',
+    'tim-claude',
+    'tim-claude-heavy',
+    'tim-oss-local',
+    'tim-1.5',
+    'tim-gpt-heavy-tc',
+]


See slack, I'd imagine this will change often so not high priority

The only models we should publicly support are:

tim-1.5 (default)
tim-claude
tim-claude-heavy
tim (offline)
tim-edge (offline)

Remove from SDK and platform

tim-gpt
tim-gpt-heavy
tim-gpt-heavy-tc
tim-oss-local

jfobrien29 · 2026-04-23T20:12:05Z


+    answer: str = ''
+    reasoning: list[ReasoningTask] | None = None
+    parsed_answer: Any = None


Looks like a cleaner way to handle answer_format. As long as Hongyin signs off this looks good

jfobrien29 · 2026-04-23T20:16:55Z

-    answer_format: Optional[OutputSchema] = None
+    tools: list[Tool] = field(default_factory=list)
+    resources: list[str] | None = None
+    skills: list[str] | None = None


What is the skills plan here? is this the complete skill dumped into the request? Or just the headline? How is the rest retrieved?

I think you can leave skills out for the moment, and we can handle this appropriately in the future. If you feel strongly your call

jfobrien29 · 2026-04-23T20:17:07Z

    """JSON Schema for the answer output format. Use pydantic_to_schema() to generate from Pydantic."""
-    reasoning_format: Optional[OutputSchema] = None
-    """JSON Schema for the reasoning output format. Use pydantic_to_schema() to generate from Pydantic."""
+    content: list[ContentBlock] | None = None


This is great

pint-drinker and others added 12 commits April 17, 2026 10:56

Dropping 3.8 and 3.9 support tests

d208c2a

walking back old schema generated stuff and simplifying

3674aad

uprevving, migration guide, better typing to match what api gives you…

40a6e85

… - should actually work now

enhancing error parsing

9d62bdb

adding tool response tests

ae429f5

removing ReasoningNode legacy reference

d0b75a3

adding pre-commit hooks and adding to CI

942da89

removing capabilities on client side

bb4532d

tightening up types

6303c37

updating readme

a2287f2

adding back styling

0ffe78a

pint-drinker marked this pull request as ready for review April 22, 2026 12:30

pint-drinker requested review from AjayaRamachandran, jfobrien29, luohongyin, ostepan8 and wfangtw April 22, 2026 12:33

pint-drinker added 5 commits April 22, 2026 18:42

mirroring types

4464d28

refining types

42dc0be

capturing skills and widening run options and adding tests for polling

c080eb9

updating ci to run more frequently

e8d56b2

dropping reasoning format and hardening answer format

2e194b2

jfobrien29 reviewed Apr 23, 2026

dropping enum for engine

9494f11

pint-drinker merged commit b2ecb17 into main Apr 23, 2026
5 checks passed

pint-drinker merged 18 commits into main from subcon-468-full-multimodal-support-images
