feat: static OpenAPI schema generation with cog-schema-gen #2774
Open
tempusfrangit wants to merge 22 commits into main from …
…ribution
Replace Docker container-based OpenAPI schema generation with a Rust binary
(cog-schema-gen) that uses tree-sitter to parse Python source files statically.
Schema generation now happens locally before the Docker build, not after.
Rust crate (crates/schema-gen/):
- Tree-sitter Python parser handles all predictor patterns: BasePredictor,
non-BasePredictor classes, standalone functions, async methods
- Supports Input() kwargs, shared Input definitions (cog-flux pattern),
Optional/union types, Iterator/ConcatenateIterator, BaseModel outputs
- Size-optimized release-small profile: LTO, opt-level=z, panic=abort,
strip=symbols, no clap/anyhow — binary is ~879KB
- 19 unit tests, 22 integration test fixtures all passing
Go integration (pkg/schemagen/):
- Embed+exec pattern: binary embedded via go:embed, extracted to
~/.cache/cog/bin/cog-schema-gen-{version} on first use
- Resolution: COG_SCHEMA_GEN_BINARY env > embedded > dist/ > PATH
- pkg/image/build.go now calls schemagen.Generate() instead of booting
a Docker container with python -m cog.command.openapi_schema
Build & CI:
- mise build:schema-gen uses --profile release-small
- mise build:cog depends on build:schema-gen, copies with platform suffix
- goreleaser per-build pre-hook embeds platform-matched binary
- CI: build-schema-gen job builds once, stashes artifact for build-cog
- Release: matrix build (4 platforms), standalone binaries attached to
GitHub releases alongside cog CLI binaries
Schema generation now runs before the Docker build starts, failing fast on schema errors before any container work begins.
Build side (pkg/image/build.go):
- Schema generation block moved above Docker image build
- .cog/openapi_schema.json is written before the build context is created
- No functional change to schema gen itself (still uses schemagen.Generate)
Coglet side (crates/coglet/src/worker.rs):
- Loads schema from .cog/openapi_schema.json instead of calling Python cog._schemas.to_json_schema() via PyO3 at runtime
- Missing schema file: clear warning, predictor accepts any input
- Corrupt schema file: clear warning, predictor accepts any input
- Images built with older cog versions (no schema file) continue to work
Now that coglet loads the OpenAPI schema from .cog/openapi_schema.json (generated at build time by cog-schema-gen), the runtime Python schema generation path is dead code.
Deleted:
- pkg/image/openapi_schema.go: Docker-based GenerateOpenAPISchema() that booted a container to run python -m cog.command.openapi_schema
- python/cog/_schemas.py: Python schema generation (to_json_schema)
- python/cog/command/openapi_schema.py: CLI entry point for above
Removed from Rust:
- Handler::schema() trait method from worker.rs (default None impl)
- WorkerBridge::schema() delegation in worker_bridge.rs
- PythonPredictor::schema() in predictor.rs that called cog._schemas via PyO3 at runtime
Still kept (used by coglet-python for input validation):
- python/cog/_inspector.py: check_input(), create_predictor()
- python/cog/_adt.py: PredictorInfo types, SDK detection
# Conflicts:
#	mise.toml
…lation
When SDK version is explicitly pinned below 0.17.0 (e.g. COG_SDK_WHEEL=pypi:0.16.12), fall back to runtime schema generation and skip coglet installation:
- Add GenerateCombined() to merge predict+train schemas (fixes missing /trainings)
- Add canUseStaticSchemaGen() with binary availability check (graceful fallback)
- Add isLegacySDK() to skip coglet for SDK < 0.17.0
- Add DetectLocalSDKVersion() to resolve SDK version from dist/ wheels
- Restore GenerateOpenAPISchema() for legacy Docker-based schema generation
- Add legacy_sdk_schema integration test
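The version gate in this commit can be illustrated with a small Python sketch. It is not the Go code from the PR: `parse_version` and `is_legacy_sdk` are hypothetical names that only mirror the intent of isLegacySDK(), and the sketch assumes plain dotted numeric versions (any pre-release suffix is ignored).

```python
from typing import Optional, Tuple

# Threshold taken from the commit message: SDKs below 0.17.0 are "legacy".
MIN_STATIC_SCHEMA_SDK = (0, 17, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse '0.16.12' into (0, 16, 12). Hypothetical helper, numeric parts only."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at a suffix such as 'a1' or 'rc2'
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad so that '0.17' compares equal to '0.17.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_legacy_sdk(pinned: Optional[str]) -> bool:
    """True only when a version is explicitly pinned and it is below 0.17.0."""
    if not pinned:
        return False  # no explicit pin: treat as a current SDK
    return parse_version(pinned) < MIN_STATIC_SCHEMA_SDK
```

Comparing integer tuples rather than strings matters here: as strings, "0.9.0" would sort after "0.17.0".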
Input validation is now handled at the HTTP edge using the OpenAPI schema generated at build time. The worker no longer needs to introspect Python types at runtime via _inspector/_adt.
Removed:
- python/cog/_adt.py (type ADT for runtime introspection)
- python/cog/_inspector.py (predictor introspection and input validation)
- python/tests/test_adt.py, python/tests/test_inspector.py
Simplified in coglet:
- input.rs: removed Runtime enum, InputProcessor trait, CogInputProcessor, detect_runtime(), try_cog_runtime(). Replaced with simple prepare_input() that only handles URLPath downloads.
- worker_bridge.rs: SDK detection uses cog.BasePredictor instead of cog._adt
- predictor.rs: removed runtime/input_processor fields from PythonPredictor
…ile generator
Coglet is now only installed explicitly in the Dockerfile when there's a specific source (COGLET_WHEEL env var, local file, or pinned PyPI version). Otherwise, pip install cog handles it: cog >= 0.17.0 declares coglet as a hard dependency, older versions don't pull it in.
Also prevents coglet from being installed alongside legacy SDK < 0.17.0, even when a coglet wheel is auto-detected from dist/.
… defaults in Rust
Dead code removed:
- schema.py (PredictionRequest/Response, unused)
- suppress_output.py (never imported)
- command/ directory (legacy schema gen entry point, already deleted)
- PredictorNotSet exception, get_predictor_types, requires_gpu (config.py)
- get_predict, get_train, wait_for_env, get_healthcheck (predictor.py)
- put_file_to_signed_endpoint, guess_filename, ensure_trailing_slash (files.py)
- ExperimentalFeatureWarning (types.py)
Input() changes:
- default_factory now raises TypeError at class definition time
- Stripped all mutable-default/factory machinery from Input()
- Schema-gen already errors on default_factory at build time
FieldInfo default patching (coglet predictor.rs):
- At predictor load time, replaces FieldInfo defaults with their .default values on predict/train method signatures so Python uses actual defaults instead of FieldInfo wrapper objects for missing inputs
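The FieldInfo default patching can be pictured in pure Python. The PR does this from Rust via PyO3; the sketch below only shows the idea. The `FieldInfo` class here is a stand-in (only a `.default` attribute is assumed) and `patch_fieldinfo_defaults` is a hypothetical name.

```python
import inspect


class FieldInfo:
    """Stand-in for cog.input.FieldInfo; only the .default attribute is assumed."""

    def __init__(self, default=None, description=""):
        self.default = default
        self.description = description


def patch_fieldinfo_defaults(func):
    """Replace FieldInfo defaults on a plain function with their .default values."""
    positional = []
    keyword_only = {}
    for param in inspect.signature(func).parameters.values():
        if param.default is inspect.Parameter.empty:
            continue  # parameter has no default, nothing to unwrap
        value = param.default
        if isinstance(value, FieldInfo):
            value = value.default
        if param.kind is inspect.Parameter.KEYWORD_ONLY:
            keyword_only[param.name] = value
        else:
            positional.append(value)
    # __defaults__ covers trailing positional parameters, in declaration order.
    func.__defaults__ = tuple(positional) or None
    func.__kwdefaults__ = keyword_only or None
    return func


def predict(prompt: str = FieldInfo(default="hello"), *, steps: int = FieldInfo(default=4)):
    return prompt, steps


patch_fieldinfo_defaults(predict)
```

After patching, calling `predict()` with no arguments sees the plain values rather than the wrapper objects. For a bound method, the same function would be applied to its `__func__`.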
…te dead SDK modules
Emit COG_PREDICT_TYPE_STUB, COG_TRAIN_TYPE_STUB, and COG_MAX_CONCURRENCY as ENV lines in the generated Dockerfile so the container never needs to parse cog.yaml at startup. http.py reads these env vars directly; coglet reads COG_MAX_CONCURRENCY from the environment instead of calling into Python config code.
Delete config.py, errors.py, mode.py, and logging.py from the Python SDK as they are no longer imported by any remaining code.
… I/O)
Delete coder.py and coders/ directory (DataclassCoder, JsonCoder, SetCoder). The Coder.register()/lookup() mechanism was used by the old FastAPI server to encode/decode custom types. With coglet handling all I/O at the Rust layer, no code calls Coder.lookup() anymore. Zero references in integration tests, Python tests, or Rust code.
Also remove unused logging import from http.py.
exceptions.py was a single-line re-export from coglet. Move the import directly into __init__.py to eliminate the indirection.
Rewrite output.rs to handle make_encodeable() and file-to-base64 encoding directly in Rust via PyO3 isinstance checks, replacing the Python trampolines to cog.json and cog.files. Delete both Python modules.
Add input validation in coglet's HTTP layer: compile the OpenAPI Input schema (with additionalProperties:false injected) into a jsonschema validator at startup, and validate prediction inputs before dispatching to the Python worker. This replaces the pydantic validation that was previously done by the ADT/inspector code removed earlier in this branch.
Validation errors are returned as pydantic-compatible 422 responses with per-field detail entries that the Go CLI already knows how to parse.
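The validation flow can be sketched in Python without a JSON Schema library. This is not coglet's Rust code, which compiles the schema into a jsonschema validator. The sketch covers only required and unknown fields, and the `msg` and `type` strings are assumptions; only the `loc` shape (["body", "input", field_name]) comes from the PR description.

```python
import copy


def inject_additional_properties_false(input_schema: dict) -> dict:
    """Return a copy of the Input schema that rejects unknown fields."""
    schema = copy.deepcopy(input_schema)
    schema["additionalProperties"] = False
    return schema


def validate_input(schema: dict, payload: dict) -> list:
    """Return pydantic-style detail entries; an empty list means the input is valid."""
    detail = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in payload:
            detail.append({
                "loc": ["body", "input", name],
                "msg": "Field required",      # assumed wording
                "type": "missing",            # assumed error type
            })
    if schema.get("additionalProperties") is False:
        for name in payload:
            if name not in properties:
                detail.append({
                    "loc": ["body", "input", name],
                    "msg": "Unexpected field",        # assumed wording
                    "type": "additional_properties",  # assumed error type
                })
    return detail


INPUT_SCHEMA = inject_additional_properties_false({
    "type": "object",
    "properties": {"prompt": {"type": "string"}},
    "required": ["prompt"],
})
```

A server would return HTTP 422 with `{"detail": [...]}` whenever `validate_input` yields a non-empty list, and dispatch to the worker otherwise.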
…llback
- COG_SCHEMA_GEN_TOOL now accepts https:// URLs in addition to local paths; downloaded binaries are cached under ~/.cache/cog/bin/
- Remove binary-not-found fallback in canUseStaticSchemaGen: for SDK >= 0.17.0, the embedded binary is always available; missing binary is now a hard error instead of silently falling back to deleted cog.command
- Gate PATH and dist/ fallbacks behind isDev so production builds only use the embedded binary (or explicit COG_SCHEMA_GEN_TOOL override)
The Go CLI looks up property types from the OpenAPI schema to coerce
string inputs (e.g. "3" → integer 3). cog-schema-gen emits enum/choices
fields as allOf:[{$ref: ...}] where the type lives on the referenced
schema, not the wrapper. Without resolving the ref, the type check falls
through and the value is sent as a string, causing validation errors like
'"3" is not one of [3,4,5]'.
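To make the failure mode concrete, here is a minimal Python sketch of the lookup. The real logic lives in the Go CLI; `resolve_property_type` and `coerce` are hypothetical names, and the referenced schema name `int_choices` is an assumption about how the enum is emitted.

```python
SCHEMA = {
    "components": {
        "schemas": {
            "Input": {
                "type": "object",
                "properties": {
                    # The wrapper carries no "type"; it lives on the referenced schema.
                    "int_choices": {"allOf": [{"$ref": "#/components/schemas/int_choices"}]},
                    "prompt": {"type": "string"},
                },
            },
            "int_choices": {"type": "integer", "enum": [3, 4, 5]},
        }
    }
}

REF_PREFIX = "#/components/schemas/"


def resolve_property_type(schema: dict, field: str):
    """Return the JSON type of a field, following allOf/$ref wrappers one level deep."""
    schemas = schema["components"]["schemas"]
    prop = schemas["Input"]["properties"].get(field, {})
    if "type" in prop:
        return prop["type"]
    for wrapper in prop.get("allOf", []):
        ref = wrapper.get("$ref", "")
        if ref.startswith(REF_PREFIX):
            target = schemas.get(ref[len(REF_PREFIX):], {})
            if "type" in target:
                return target["type"]
    return None


def coerce(schema: dict, field: str, raw: str):
    """Coerce a CLI string such as '3' to the type the schema declares."""
    kind = resolve_property_type(schema, field)
    if kind == "integer":
        return int(raw)
    if kind == "number":
        return float(raw)
    if kind == "boolean":
        return raw.lower() == "true"
    return raw  # strings and unknown types pass through unchanged
```

Without the `allOf` branch, `resolve_property_type` would return `None` for `int_choices` and the value would be sent as the string "3".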
…ath-typed
The coerce_url_strings function now introspects the predict/train function's type annotations to distinguish File vs Path parameters. File-typed inputs get File.validate() which returns an IO-like URLFile, while Path-typed inputs get Path.validate() which returns a URLPath for download. Previously all URL strings were coerced via Path.validate(), causing 'Path object has no attribute read' errors for File inputs.
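The annotation-based dispatch might look roughly like this in Python. The classes below are stand-ins for cog.types (their real behavior is not reproduced), and the sketch only handles bare `File` and `Path` annotations; how Optional or list-typed parameters are treated is not specified in the commit and is left out.

```python
import typing


class URLFile:
    """Stand-in for the IO-like object returned for File inputs."""
    def __init__(self, url: str):
        self.url = url


class URLPath:
    """Stand-in for the path-like object that is downloaded later."""
    def __init__(self, url: str):
        self.url = url


class File:
    @classmethod
    def validate(cls, value: str) -> URLFile:
        return URLFile(value)


class Path:
    @classmethod
    def validate(cls, value: str) -> URLPath:
        return URLPath(value)


def coerce_url_strings(func, inputs: dict) -> dict:
    """Coerce URL strings according to the annotation of the matching parameter."""
    hints = typing.get_type_hints(func)
    coerced = dict(inputs)
    for name, value in inputs.items():
        if not (isinstance(value, str) and value.startswith(("http://", "https://"))):
            continue
        annotation = hints.get(name)
        if annotation is File:
            coerced[name] = File.validate(value)
        elif annotation is Path:
            coerced[name] = Path.validate(value)
        # Any other annotation (e.g. str) keeps the raw string in this sketch.
    return coerced


def predict(image: File, weights: Path, prompt: str):
    return None
```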
Multiple parallel cog build processes extracting the embedded schema-gen binary to the same .tmp path caused ETXTBSY (text file busy) errors on Linux.
Each process now gets a unique temp file via os.CreateTemp, then atomically renames to the versioned cache path.
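The same write-to-unique-temp-then-rename pattern, shown in Python for illustration (the PR implements it in Go with os.CreateTemp). `extract_binary` is a hypothetical name; the sketch relies on `os.replace` being atomic when source and destination are on the same filesystem, which is why the temp file is created next to the target.

```python
import os
import tempfile


def extract_binary(payload: bytes, cache_path: str) -> str:
    """Write payload to cache_path without ever exposing a partially written file."""
    if os.path.exists(cache_path):
        return cache_path  # another process (or an earlier run) already extracted it
    cache_dir = os.path.dirname(cache_path) or "."
    os.makedirs(cache_dir, exist_ok=True)
    # Unique name per process, in the same directory so the rename stays atomic.
    fd, tmp_path = tempfile.mkstemp(
        dir=cache_dir, prefix=os.path.basename(cache_path) + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
        os.chmod(tmp_path, 0o755)  # make the extracted binary executable
        os.replace(tmp_path, cache_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return cache_path
```

Because no two processes share a temp path, none of them writes to a file that another is about to execute; if two processes race, the last rename wins and both end up with identical contents at the cache path.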
…e separately
When both predict and train are configured, cog-schema-gen now emits TrainingInput/TrainingOutput for train mode and Input/Output for predict mode. This prevents the merged schema from dropping train's Input schema (since both previously used the same key).
Coglet validates /trainings requests against TrainingInput and /predictions against Input. The Go CLI also resolves TrainingInput for cog train -i type coercion.
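A small Python sketch shows why distinct keys matter when the two schemas are merged. `merge_schemas` and `input_schema_for` are hypothetical helpers, not functions from the PR, and raising on a conflicting key is a choice made for this sketch only.

```python
def merge_schemas(predict_schema: dict, train_schema: dict) -> dict:
    """Merge components.schemas from both modes, refusing silent overwrites."""
    merged = {"components": {"schemas": {}}}
    target = merged["components"]["schemas"]
    for source in (predict_schema, train_schema):
        for name, definition in source["components"]["schemas"].items():
            if name in target and target[name] != definition:
                # With a shared "Input" key, one mode's schema would be lost here.
                raise ValueError("conflicting schema for %r" % name)
            target[name] = definition
    return merged


def input_schema_for(merged: dict, route: str) -> dict:
    """Pick the schema a request is validated against, based on its route."""
    key = "TrainingInput" if route == "/trainings" else "Input"
    return merged["components"]["schemas"][key]


PREDICT = {"components": {"schemas": {
    "Input": {"type": "object", "properties": {"prompt": {"type": "string"}}},
    "Output": {"type": "string"},
}}}
TRAIN = {"components": {"schemas": {
    "TrainingInput": {"type": "object", "properties": {"dataset": {"type": "string"}}},
    "TrainingOutput": {"type": "string"},
}}}
MERGED = merge_schemas(PREDICT, TRAIN)
```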
Summary
Replace Python runtime schema generation, input validation, and output encoding with static/Rust-native alternatives. This eliminates the need to boot a Python container for schema extraction at build time, and moves hot-path logic from Python into coglet (Rust).
Changes
Static Schema Generation (cog-schema-gen)
- … .cog/openapi_schema.json, bundled into the image
- … cog CLI via embed+exec pattern
- … predict and train modes, Input() metadata, choices, all cog types
COG_SCHEMA_GEN_TOOL Environment Variable
The schema generator binary is always embedded in the cog CLI. For advanced use cases (testing custom builds, CI overrides), the COG_SCHEMA_GEN_TOOL env var accepts:
- a local path to the binary
- a URL (https://...): download the binary once and cache it under ~/.cache/cog/bin/
Resolution order:
1. COG_SCHEMA_GEN_TOOL env var (path or URL)
2. Embedded binary
3. dist/cog-schema-gen relative to cwd or exe (dev builds only)
4. PATH lookup (dev builds only)
Production builds only use steps 1–2. If the binary cannot be resolved, the build fails with a clear error (no silent fallback).
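The resolution order can be read as a simple first-match chain. The sketch below is Python for illustration only; the actual resolver is Go code in pkg/schemagen. `resolve_schema_gen` and its parameters are hypothetical, URL downloading is left out, and only the cwd-relative dist/ lookup is modeled.

```python
import os
import shutil
from typing import Optional


def resolve_schema_gen(env: dict, embedded_path: Optional[str], is_dev: bool, cwd: str = ".") -> str:
    """Return the first available binary location, following the documented order."""
    # 1. Explicit override (local path or https:// URL; fetching is out of scope here).
    override = env.get("COG_SCHEMA_GEN_TOOL")
    if override:
        return override
    # 2. Binary embedded in the CLI, already extracted to a cache path.
    if embedded_path:
        return embedded_path
    # 3 and 4. Development-only fallbacks.
    if is_dev:
        candidate = os.path.join(cwd, "dist", "cog-schema-gen")
        if os.path.isfile(candidate):
            return candidate
        found = shutil.which("cog-schema-gen", path=env.get("PATH"))
        if found:
            return found
    # No silent fallback: fail the build with a clear error.
    raise FileNotFoundError("cog-schema-gen binary could not be resolved")
```

Passing the environment in as a dictionary keeps the function easy to test; a caller would normally hand it `os.environ`.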
Coglet: Input Validation in Rust
- input_validation.rs: compiles OpenAPI components.schemas.Input into a jsonschema::Validator at schema-set time
- … detail array (loc: ["body", "input", field_name])
- … additionalProperties: false for pydantic parity
Coglet: Output Encoding in Rust
- … cog.json.make_encodeable, cog.files) with native Rust implementations via PyO3
Coglet: File vs Path Input Coercion
- prepare_input now introspects the predict/train function's type annotations to distinguish File vs Path parameters
- File-typed inputs get File.validate() → IO-like URLFile
- Path-typed inputs get Path.validate() → URLPath (downloaded to temp file)
- … URLPath
Go CLI: Schema-Aware Type Coercion
- NewInputs() now resolves allOf/$ref wrappers when looking up field types from the OpenAPI schema
- … -i int_choices=3 correctly sends 3 as integer, not "3" as string)
Python SDK Slimmed Down (26 files → 8 files)
- … cog.json, cog.files, cog.exceptions, cog.coder, cog.coders/, cog.config, cog.logging, cog._adt, cog._inspector, cog.server.runner, cog.server.scope, cog.server.worker, cog.command/
- … __init__.py, _version.py, types.py, predictor.py, input.py, model.py, server/__init__.py, server/http.py
- cog.yaml is never read at runtime: config flows through Dockerfile ENV vars
- CancelationException re-exported directly from coglet in __init__.py
- default_factory raises hard error in Input() at class definition time
Coglet Installation
- … coglet package) instead of explicit Dockerfile generator logic
CI / Docs
- … test_input.py
- … cog.json import in test_model.py
- … docs/python.md, AGENTS.md, architecture/02-schema.md for deleted modules
- … MIT-0 to crates/deny.toml allow list (for jsonschema dep)
- … docs/llms.txt
What Rust calls in the remaining Python SDK
| Rust file | Python module | Symbols |
| lib.rs | cog | __version__ |
| predictor.rs | cog.predictor | load_predictor_from_ref, has_setup_weights, extract_setup_weights |
| predictor.rs | cog.input | FieldInfo |
| input.rs | cog.types | Path, File, URLPath |
| worker_bridge.rs | cog | BasePredictor (SDK detection) |
Testing
- mise run test:rust: 157/157 pass
- mise run test:python: 48/48 pass
- mise run lint: all clean (go, rust, python)
- mise run typecheck: all clean
- mise run stub:check + stub:typecheck: pass
- mise run docs:llm:check + docs:cli:check: pass
- … predict_many_inputs_image, file_list_input, int_predictor, granite_project: all pass