feat: static OpenAPI schema generation with cog-schema-gen #2774

Open
tempusfrangit wants to merge 22 commits into main from feat/build-time-schema

@tempusfrangit tempusfrangit commented Feb 24, 2026

Summary

Replace Python runtime schema generation, input validation, and output encoding with static/Rust-native alternatives. This eliminates the need to boot a Python container for schema extraction at build time, and moves hot-path logic from Python into coglet (Rust).

Changes

Static Schema Generation (cog-schema-gen)

  • New Rust binary using tree-sitter to parse Python source files statically and produce OpenAPI 3.0.2 JSON schemas at build time
  • Schema generated before Docker build, written to .cog/openapi_schema.json, bundled into the image
  • Binary embedded in the cog CLI via embed+exec pattern
  • Supports predict and train modes, Input() metadata, choices, all cog types

COG_SCHEMA_GEN_TOOL Environment Variable

The schema generator binary is always embedded in the cog CLI. For advanced use cases (testing custom builds, CI overrides), the COG_SCHEMA_GEN_TOOL env var accepts:

  • Local file path — use a specific binary on disk
  • URL (https://...) — download the binary once and cache it under ~/.cache/cog/bin/

Resolution order:

  1. COG_SCHEMA_GEN_TOOL env var (path or URL)
  2. Embedded binary (always present in release builds)
  3. dist/cog-schema-gen relative to cwd or exe (dev builds only)
  4. PATH lookup (dev builds only)

Production builds only use steps 1–2. If the binary cannot be resolved, the build fails with a clear error (no silent fallback).
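The resolution order above can be sketched as a small pure function. This is an illustration in Python for brevity (the real logic lives in the Go CLI); the function name, its parameters, and the returned labels are hypothetical and only mirror the four steps listed in this description.

```python
from typing import Callable, Mapping, Optional


def resolve_schema_gen(
    env: Mapping[str, str],
    embedded: Optional[str],
    is_dev: bool,
    dist_candidate: Optional[str] = None,
    path_lookup: Callable[[str], Optional[str]] = lambda name: None,
) -> str:
    """Return a label describing where the binary comes from, or raise."""
    # Step 1: explicit override, either a local path or an https:// URL.
    override = env.get("COG_SCHEMA_GEN_TOOL")
    if override:
        kind = "url" if override.startswith("https://") else "path"
        return f"{kind}:{override}"
    # Step 2: the binary embedded in the CLI.
    if embedded:
        return f"embedded:{embedded}"
    # Steps 3 and 4 are only consulted in dev builds.
    if is_dev:
        if dist_candidate:
            return f"dist:{dist_candidate}"
        found = path_lookup("cog-schema-gen")
        if found:
            return f"path-lookup:{found}"
    # No silent fallback: an unresolved binary is a hard error.
    raise RuntimeError("cog-schema-gen binary could not be resolved")
```

Passing the environment and lookups in as arguments keeps the sketch testable without touching the real filesystem or PATH.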

Coglet: Input Validation in Rust

  • New input_validation.rs — compiles OpenAPI components.schemas.Input into a jsonschema::Validator at schema-set time
  • Validates at the HTTP edge (routes.rs), returns 422 with pydantic-compatible detail array (loc: ["body", "input", field_name])
  • Injects additionalProperties: false for pydantic parity
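A minimal sketch of the two ideas above, in Python rather than Rust: copy the Input schema, inject additionalProperties: false, and report problems in a pydantic-style detail array. It only checks missing and unknown fields and is not a JSON Schema validator; the function names and the msg/type strings are assumptions modelled on pydantic v1 conventions, not taken from coglet.

```python
import copy


def strict_input_schema(openapi: dict) -> dict:
    """Copy components.schemas.Input and forbid unknown fields."""
    schema = copy.deepcopy(openapi["components"]["schemas"]["Input"])
    schema["additionalProperties"] = False
    return schema


def validation_detail(schema: dict, payload: dict) -> list:
    """Return pydantic-style detail entries for missing and unknown fields."""
    detail = []
    for name in schema.get("required", []):
        if name not in payload:
            detail.append({
                "loc": ["body", "input", name],
                "msg": "field required",  # assumed wording
                "type": "value_error.missing",  # assumed type tag
            })
    if schema.get("additionalProperties") is False:
        known = schema.get("properties", {})
        for name in payload:
            if name not in known:
                detail.append({
                    "loc": ["body", "input", name],
                    "msg": "extra fields not permitted",  # assumed wording
                    "type": "value_error.extra",  # assumed type tag
                })
    return detail
```

A non-empty detail list would correspond to a 422 response; an empty list means the request can be dispatched to the worker.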

Coglet: Output Encoding in Rust

  • Replaced Python trampolines (cog.json.make_encodeable, cog.files) with native Rust implementations via PyO3
  • Handles Pydantic models, dataclasses, enums, datetime, numpy types, PathLike→base64, IOBase→base64
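For readers unfamiliar with what the encoder does, here is a rough Python sketch of the same kind of recursive conversion, limited to standard-library types. The data URL format for PathLike values is an assumption, and Pydantic models, numpy types, and IOBase are deliberately left out; the Rust implementation in this PR is not reproduced here.

```python
import base64
import dataclasses
import datetime
import enum
import os


def make_encodeable(obj):
    """Recursively convert a value into JSON-friendly primitives."""
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return {f.name: make_encodeable(getattr(obj, f.name)) for f in dataclasses.fields(obj)}
    if isinstance(obj, enum.Enum):
        return make_encodeable(obj.value)
    if isinstance(obj, datetime.datetime):
        return obj.isoformat()
    if isinstance(obj, dict):
        return {key: make_encodeable(value) for key, value in obj.items()}
    if isinstance(obj, (list, tuple, set)):
        return [make_encodeable(value) for value in obj]
    if isinstance(obj, os.PathLike):
        # Assumed output format: a base64 data URL with a generic MIME type.
        with open(obj, "rb") as handle:
            payload = base64.b64encode(handle.read()).decode("ascii")
        return f"data:application/octet-stream;base64,{payload}"
    return obj
```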

Coglet: File vs Path Input Coercion

  • prepare_input now introspects the predict/train function's type annotations to distinguish File vs Path parameters
  • File-typed inputs get File.validate() → IO-like URLFile
  • Path-typed inputs get Path.validate() → URLPath (downloaded to temp file)
  • Previously all URL strings were unconditionally coerced to URLPath

Go CLI: Schema-Aware Type Coercion

  • NewInputs() now resolves allOf/$ref wrappers when looking up field types from the OpenAPI schema
  • Fixes integer/number coercion for enum/choices fields (e.g. -i int_choices=3 correctly sends 3 as integer, not "3" as string)

Python SDK Slimmed Down (26 files → 8 files)

  • Deleted: cog.json, cog.files, cog.exceptions, cog.coder, cog.coders/, cog.config, cog.logging, cog._adt, cog._inspector, cog.server.runner, cog.server.scope, cog.server.worker, cog.command/
  • Remaining: __init__.py, _version.py, types.py, predictor.py, input.py, model.py, server/__init__.py, server/http.py
  • cog.yaml is never read at runtime — config flows through Dockerfile ENV vars
  • CancelationException re-exported directly from coglet in __init__.py
  • Passing default_factory to Input() now raises a hard error (TypeError) at class definition time

Coglet Installation

  • Coglet installed via SDK dependency (coglet package) instead of explicit Dockerfile generator logic
  • Legacy SDK (<0.17.0) falls back to old Python server, never gets coglet

CI / Docs

  • Fixed pre-existing B017 ruff lint in test_input.py
  • Fixed stale cog.json import in test_model.py
  • Updated docs/python.md, AGENTS.md, architecture/02-schema.md for deleted modules
  • Added MIT-0 to crates/deny.toml allow list (for jsonschema dep)
  • Regenerated docs/llms.txt

What Rust calls in the remaining Python SDK

| Rust file | Python module | Functions used |
| --- | --- | --- |
| lib.rs | cog | __version__ |
| predictor.rs | cog.predictor | load_predictor_from_ref, has_setup_weights, extract_setup_weights |
| predictor.rs | cog.input | FieldInfo |
| input.rs | cog.types | Path, File, URLPath |
| worker_bridge.rs | cog | BasePredictor (SDK detection) |

Testing

  • mise run test:rust — 157/157 pass
  • mise run test:python — 48/48 pass
  • mise run lint — all clean (go, rust, python)
  • mise run typecheck — all clean
  • mise run stub:check + stub:typecheck — pass
  • mise run docs:llm:check + docs:cli:check — pass
  • Integration tests: predict_many_inputs_image, file_list_input, int_predictor, granite_project — all pass

…ribution

Replace Docker container-based OpenAPI schema generation with a Rust binary
(cog-schema-gen) that uses tree-sitter to parse Python source files statically.
Schema generation now happens locally before the Docker build, not after.

Rust crate (crates/schema-gen/):
- Tree-sitter Python parser handles all predictor patterns: BasePredictor,
  non-BasePredictor classes, standalone functions, async methods
- Supports Input() kwargs, shared Input definitions (cog-flux pattern),
  Optional/union types, Iterator/ConcatenateIterator, BaseModel outputs
- Size-optimized release-small profile: LTO, opt-level=z, panic=abort,
  strip=symbols, no clap/anyhow — binary is ~879KB
- 19 unit tests, 22 integration test fixtures all passing

Go integration (pkg/schemagen/):
- Embed+exec pattern: binary embedded via go:embed, extracted to
  ~/.cache/cog/bin/cog-schema-gen-{version} on first use
- Resolution: COG_SCHEMA_GEN_BINARY env > embedded > dist/ > PATH
- pkg/image/build.go now calls schemagen.Generate() instead of booting
  a Docker container with python -m cog.command.openapi_schema

Build & CI:
- mise build:schema-gen uses --profile release-small
- mise build:cog depends on build:schema-gen, copies with platform suffix
- goreleaser per-build pre-hook embeds platform-matched binary
- CI: build-schema-gen job builds once, stashes artifact for build-cog
- Release: matrix build (4 platforms), standalone binaries attached to
  GitHub releases alongside cog CLI binaries

Schema generation now runs before the Docker build starts, failing fast
on schema errors before any container work begins.

Build side (pkg/image/build.go):
- Schema generation block moved above Docker image build
- .cog/openapi_schema.json is written before the build context is created
- No functional change to schema gen itself (still uses schemagen.Generate)

Coglet side (crates/coglet/src/worker.rs):
- Loads schema from .cog/openapi_schema.json instead of calling Python
  cog._schemas.to_json_schema() via PyO3 at runtime
- Missing schema file: clear warning, predictor accepts any input
- Corrupt schema file: clear warning, predictor accepts any input
- Images built with older cog versions (no schema file) continue to work
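The fallback behaviour described for the coglet side can be sketched as follows. This is a Python illustration of the logic, not the Rust code in worker.rs; the function name and logger name are hypothetical, and returning None stands for "no schema, accept any input".

```python
import json
import logging
from pathlib import Path
from typing import Optional

log = logging.getLogger("coglet-sketch")  # hypothetical logger name


def load_schema(path: str = ".cog/openapi_schema.json") -> Optional[dict]:
    """Load the build-time schema; return None if it is missing or corrupt."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except FileNotFoundError:
        # Images built with older cog versions have no schema file.
        log.warning("schema file %s not found; accepting any input", path)
        return None
    try:
        schema = json.loads(text)
    except json.JSONDecodeError as exc:
        log.warning("schema file %s is corrupt (%s); accepting any input", path, exc)
        return None
    if not isinstance(schema, dict):
        log.warning("schema file %s is not a JSON object; accepting any input", path)
        return None
    return schema
```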
Now that coglet loads the OpenAPI schema from .cog/openapi_schema.json
(generated at build time by cog-schema-gen), the runtime Python schema
generation path is dead code.

Deleted:
- pkg/image/openapi_schema.go — Docker-based GenerateOpenAPISchema()
  that booted a container to run python -m cog.command.openapi_schema
- python/cog/_schemas.py — Python schema generation (to_json_schema)
- python/cog/command/openapi_schema.py — CLI entry point for above

Removed from Rust:
- Handler::schema() trait method from worker.rs (default None impl)
- WorkerBridge::schema() delegation in worker_bridge.rs
- PythonPredictor::schema() in predictor.rs that called cog._schemas
  via PyO3 at runtime

Still kept (used by coglet-python for input validation):
- python/cog/_inspector.py — check_input(), create_predictor()
- python/cog/_adt.py — PredictorInfo types, SDK detection
@tempusfrangit tempusfrangit requested a review from a team as a code owner February 24, 2026 22:38
@tempusfrangit tempusfrangit changed the title from "feat: static OpenAPI schema generation with cog-schema-ge" to "feat: static OpenAPI schema generation with cog-schema-gen" Feb 24, 2026
…lation

When SDK version is explicitly pinned below 0.17.0 (e.g. COG_SDK_WHEEL=pypi:0.16.12),
fall back to runtime schema generation and skip coglet installation:

- Add GenerateCombined() to merge predict+train schemas (fixes missing /trainings)
- Add canUseStaticSchemaGen() with binary availability check (graceful fallback)
- Add isLegacySDK() to skip coglet for SDK < 0.17.0
- Add DetectLocalSDKVersion() to resolve SDK version from dist/ wheels
- Restore GenerateOpenAPISchema() for legacy Docker-based schema generation
- Add legacy_sdk_schema integration test
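The legacy check amounts to a version comparison on an explicitly pinned SDK. A minimal Python sketch, assuming the pypi:<version> form shown above for COG_SDK_WHEEL; the helper names are hypothetical and the Go isLegacySDK() may handle more cases (local wheels, pre-release tags).

```python
import re
from typing import Optional, Tuple

_VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?")


def pinned_sdk_version(spec: Optional[str]) -> Optional[Tuple[int, ...]]:
    """Parse 'pypi:X.Y.Z' into a tuple; return None if nothing is pinned."""
    if not spec or not spec.startswith("pypi:"):
        return None
    match = _VERSION.match(spec[len("pypi:"):])
    if match is None:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def is_legacy_sdk(spec: Optional[str]) -> bool:
    """True only when the SDK is explicitly pinned below 0.17.0."""
    version = pinned_sdk_version(spec)
    return version is not None and version < (0, 17, 0)
```

With this shape, an unpinned SDK is never treated as legacy, which matches the "explicitly pinned" wording above.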
Input validation is now handled at the HTTP edge using the OpenAPI schema
generated at build time. The worker no longer needs to introspect Python
types at runtime via _inspector/_adt.

Removed:
- python/cog/_adt.py (type ADT for runtime introspection)
- python/cog/_inspector.py (predictor introspection and input validation)
- python/tests/test_adt.py, python/tests/test_inspector.py

Simplified in coglet:
- input.rs: removed Runtime enum, InputProcessor trait, CogInputProcessor,
  detect_runtime(), try_cog_runtime(). Replaced with simple prepare_input()
  that only handles URLPath downloads.
- worker_bridge.rs: SDK detection uses cog.BasePredictor instead of cog._adt
- predictor.rs: removed runtime/input_processor fields from PythonPredictor
@tempusfrangit tempusfrangit marked this pull request as draft February 25, 2026 02:43
…ile generator

Coglet is now only installed explicitly in the Dockerfile when there's a
specific source (COGLET_WHEEL env var, local file, or pinned PyPI version).
Otherwise, pip install cog handles it — cog >= 0.17.0 declares coglet as
a hard dependency, older versions don't pull it in.

Also prevents coglet from being installed alongside legacy SDK < 0.17.0,
even when a coglet wheel is auto-detected from dist/.
… defaults in Rust

Dead code removed:
- schema.py (PredictionRequest/Response — unused)
- suppress_output.py (never imported)
- command/ directory (legacy schema gen entry point — already deleted)
- PredictorNotSet exception, get_predictor_types, requires_gpu (config.py)
- get_predict, get_train, wait_for_env, get_healthcheck (predictor.py)
- put_file_to_signed_endpoint, guess_filename, ensure_trailing_slash (files.py)
- ExperimentalFeatureWarning (types.py)

Input() changes:
- default_factory now raises TypeError at class definition time
- Stripped all mutable-default/factory machinery from Input()
- Schema-gen already errors on default_factory at build time

FieldInfo default patching (coglet predictor.rs):
- At predictor load time, replaces FieldInfo defaults with their .default
  values on predict/train method signatures so Python uses actual defaults
  instead of FieldInfo wrapper objects for missing inputs
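The effect of the default patching can be shown in pure Python. This sketch uses a stand-in FieldInfo dataclass rather than cog.input.FieldInfo, and patch_defaults is a hypothetical name; coglet does the equivalent from Rust via PyO3.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class FieldInfo:  # stand-in for cog.input.FieldInfo
    default: Any = None
    description: str = ""


def _unwrap(value):
    """Return the wrapped default for FieldInfo objects, else the value itself."""
    return value.default if isinstance(value, FieldInfo) else value


def patch_defaults(func):
    """Replace FieldInfo defaults on a function with their .default values."""
    if func.__defaults__:
        func.__defaults__ = tuple(_unwrap(d) for d in func.__defaults__)
    if func.__kwdefaults__:
        func.__kwdefaults__ = {k: _unwrap(v) for k, v in func.__kwdefaults__.items()}
    return func


class Predictor:
    def predict(self, prompt: str = FieldInfo(default="hello"), *, steps: int = FieldInfo(default=4)):
        return prompt, steps


patch_defaults(Predictor.predict)
print(Predictor().predict())  # missing inputs now fall back to plain values
```

Without the patch, a call that omits prompt would receive the FieldInfo object itself instead of the string default.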
…te dead SDK modules

Emit COG_PREDICT_TYPE_STUB, COG_TRAIN_TYPE_STUB, and COG_MAX_CONCURRENCY
as ENV lines in the generated Dockerfile so the container never needs to
parse cog.yaml at startup. http.py reads these env vars directly; coglet
reads COG_MAX_CONCURRENCY from the environment instead of calling into
Python config code.

Delete config.py, errors.py, mode.py, and logging.py from the Python SDK
as they are no longer imported by any remaining code.
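A rough sketch of both halves of this change: emitting the ENV lines at build time and reading COG_MAX_CONCURRENCY at startup. The three variable names come from this PR; the function names, the example value format, and the default of 1 are assumptions for illustration only.

```python
import os
from typing import List, Mapping, Optional


def dockerfile_env_lines(predict: Optional[str], train: Optional[str], max_concurrency: int) -> List[str]:
    """Build the ENV lines that replace runtime parsing of cog.yaml."""
    lines = []
    if predict:
        lines.append(f"ENV COG_PREDICT_TYPE_STUB={predict}")
    if train:
        lines.append(f"ENV COG_TRAIN_TYPE_STUB={train}")
    lines.append(f"ENV COG_MAX_CONCURRENCY={max_concurrency}")
    return lines


def max_concurrency_from_env(env: Mapping[str, str] = os.environ, default: int = 1) -> int:
    """Read COG_MAX_CONCURRENCY, falling back to an assumed default."""
    try:
        value = int(env.get("COG_MAX_CONCURRENCY", ""))
    except ValueError:
        return default
    return value if value >= 1 else default
```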
… I/O)

Delete coder.py and coders/ directory (DataclassCoder, JsonCoder, SetCoder).
The Coder.register()/lookup() mechanism was used by the old FastAPI server
to encode/decode custom types. With coglet handling all I/O at the Rust
layer, no code calls Coder.lookup() anymore. Zero references in integration
tests, Python tests, or Rust code.

Also remove unused logging import from http.py.

exceptions.py was a single-line re-export from coglet. Move the import
directly into __init__.py to eliminate the indirection.

Rewrite output.rs to handle make_encodeable() and file-to-base64 encoding
directly in Rust via PyO3 isinstance checks, replacing the Python trampolines
to cog.json and cog.files. Delete both Python modules.

Add input validation in coglet's HTTP layer: compile the OpenAPI Input
schema (with additionalProperties:false injected) into a jsonschema
validator at startup, and validate prediction inputs before dispatching
to the Python worker. This replaces the pydantic validation that was
previously done by the ADT/inspector code removed earlier in this branch.

Validation errors are returned as pydantic-compatible 422 responses with
per-field detail entries that the Go CLI already knows how to parse.
…llback

- COG_SCHEMA_GEN_TOOL now accepts https:// URLs in addition to local paths;
  downloaded binaries are cached under ~/.cache/cog/bin/
- Remove binary-not-found fallback in canUseStaticSchemaGen — for SDK >= 0.17.0,
  the embedded binary is always available; missing binary is now a hard error
  instead of silently falling back to deleted cog.command
- Gate PATH and dist/ fallbacks behind isDev so production builds only use
  the embedded binary (or explicit COG_SCHEMA_GEN_TOOL override)

The Go CLI looks up property types from the OpenAPI schema to coerce
string inputs (e.g. "3" → integer 3). cog-schema-gen emits enum/choices
fields as allOf:[{$ref: ...}] where the type lives on the referenced
schema, not the wrapper. Without resolving the ref, the type check falls
through and the value is sent as a string, causing validation errors like
'"3" is not one of [3,4,5]'.
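The lookup described above can be sketched as follows. This is a Python illustration of the idea rather than the Go code in NewInputs(); the function names are hypothetical and only single-entry allOf wrappers and local #/components/schemas/ refs are handled.

```python
def resolve_property_type(openapi: dict, field: str, schema_name: str = "Input"):
    """Follow allOf/$ref wrappers until a node with a 'type' is found."""
    schemas = openapi["components"]["schemas"]
    node = schemas[schema_name]["properties"][field]
    for _ in range(8):  # bounded so a reference cycle cannot loop forever
        if "type" in node:
            return node["type"]
        if "$ref" in node:
            node = schemas[node["$ref"].rsplit("/", 1)[-1]]
        elif node.get("allOf"):
            node = node["allOf"][0]
        else:
            return None
    return None


def coerce(raw: str, type_name):
    """Coerce a CLI string to the JSON type named by the schema."""
    if type_name == "integer":
        return int(raw)
    if type_name == "number":
        return float(raw)
    if type_name == "boolean":
        return raw.lower() == "true"
    return raw


doc = {"components": {"schemas": {
    "Input": {"properties": {"int_choices": {"allOf": [{"$ref": "#/components/schemas/int_choices"}]}}},
    "int_choices": {"type": "integer", "enum": [3, 4, 5]},
}}}
print(coerce("3", resolve_property_type(doc, "int_choices")))
```

Without the ref resolution the type lookup returns nothing, so the value stays the string "3" and fails the enum check.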
…ath-typed

The coerce_url_strings function now introspects the predict/train function's
type annotations to distinguish File vs Path parameters. File-typed inputs
get File.validate() which returns an IO-like URLFile, while Path-typed inputs
get Path.validate() which returns a URLPath for download. Previously all URL
strings were coerced via Path.validate(), causing 'Path object has no attribute
read' errors for File inputs.
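A simplified Python sketch of the annotation-based dispatch. The Path, File, URLPath, and URLFile classes below are stand-ins for the cog.types classes, and the sketch assumes plain (non-Optional, non-list) annotations and leaves inputs with other annotations untouched; the real coerce_url_strings lives in coglet and may differ.

```python
import typing


class URLPath(str):
    """Stand-in: a URL that will be downloaded to a temp file."""


class URLFile(str):
    """Stand-in: a URL exposed as an IO-like object."""


class Path:
    @classmethod
    def validate(cls, value: str) -> URLPath:
        return URLPath(value)


class File:
    @classmethod
    def validate(cls, value: str) -> URLFile:
        return URLFile(value)


def coerce_url_strings(func, inputs: dict) -> dict:
    """Coerce URL strings according to the parameter's declared type."""
    hints = typing.get_type_hints(func)
    coerced = {}
    for name, value in inputs.items():
        is_url = isinstance(value, str) and value.startswith(("http://", "https://", "data:"))
        if is_url and hints.get(name) is File:
            coerced[name] = File.validate(value)
        elif is_url and hints.get(name) is Path:
            coerced[name] = Path.validate(value)
        else:
            coerced[name] = value
    return coerced


def predict(image: Path, audio: File, prompt: str) -> str:
    return prompt
```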
Multiple parallel cog build processes extracting the embedded schema-gen
binary to the same .tmp path caused ETXTBSY (text file busy) errors on
Linux. Each process now gets a unique temp file via os.CreateTemp, then
atomically renames to the versioned cache path.
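The same write-to-unique-temp-then-rename pattern, sketched in Python (the fix itself is in Go using os.CreateTemp). The function name is hypothetical; tempfile.mkstemp plays the role of os.CreateTemp and os.replace performs the atomic rename.

```python
import os
import tempfile


def extract_binary(data: bytes, dest: str) -> str:
    """Write data to dest atomically, safe under concurrent extraction."""
    if os.path.exists(dest):
        return dest  # already extracted by this or another process
    cache_dir = os.path.dirname(dest) or "."
    os.makedirs(cache_dir, exist_ok=True)
    # Each process gets its own temp file in the same directory, so no two
    # writers share a path and the final rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=cache_dir, prefix=os.path.basename(dest) + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.chmod(tmp, 0o755)
        os.replace(tmp, dest)  # atomic on POSIX; last writer wins
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
    return dest
```

Because nobody executes the temp file and the destination is only ever replaced by rename, a process never writes to a path that another process is running.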
…e separately

When both predict and train are configured, cog-schema-gen now emits
TrainingInput/TrainingOutput for train mode and Input/Output for predict
mode. This prevents the merged schema from dropping train's Input schema
(since both previously used the same key). Coglet validates /trainings
requests against TrainingInput and /predictions against Input. The Go
CLI also resolves TrainingInput for cog train -i type coercion.
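Why distinct keys matter can be seen in a small merge sketch. The function names and the shape of the merge are assumptions for illustration; only the Input/TrainingInput key names and the /predictions and /trainings routes come from this PR.

```python
def merge_schemas(predict_doc: dict, train_doc: dict) -> dict:
    """Merge predict and train OpenAPI documents into one."""
    merged = {"openapi": "3.0.2", "paths": {}, "components": {"schemas": {}}}
    for doc in (predict_doc, train_doc):
        merged["paths"].update(doc.get("paths", {}))
        # With distinct keys, neither document overwrites the other's schemas.
        merged["components"]["schemas"].update(doc.get("components", {}).get("schemas", {}))
    return merged


def input_schema_key(route: str) -> str:
    """Pick the schema used to validate a request on the given route."""
    return "TrainingInput" if route.startswith("/trainings") else "Input"


predict_doc = {"paths": {"/predictions": {}}, "components": {"schemas": {"Input": {"properties": {"prompt": {}}}}}}
train_doc = {"paths": {"/trainings": {}}, "components": {"schemas": {"TrainingInput": {"properties": {"epochs": {}}}}}}
merged = merge_schemas(predict_doc, train_doc)
print(sorted(merged["components"]["schemas"]))
```

If both modes had emitted Input, the second update would have replaced the first, which is the dropped-schema problem this commit fixes.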
@tempusfrangit tempusfrangit marked this pull request as ready for review February 25, 2026 23:31