fix: generate OpenAPI schemas for forms and models#177
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: efea2792b2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if components is not None and name not in components: | ||
| components[name] = {} | ||
| model_schema = _model_to_schema(model_class, components) | ||
| components[name].update(model_schema) |
There was a problem hiding this comment.
Disambiguate same-named model components
When two distinct request or response model classes share the same __name__ (for example models named Item in different modules or factory-created classes), the first schema registered under that name wins and every later model returns a $ref to that existing component. That makes the generated OpenAPI document advertise the wrong fields for any subsequent same-named model; use a unique component name or detect and resolve collisions instead of keying solely on __name__.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d0ad94e44a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if isinstance(defs, dict): | ||
| for name, value in defs.items(): | ||
| _rewrite_component_refs(value) | ||
| schema_context.components.setdefault(name, value) |
There was a problem hiding this comment.
Disambiguate nested model definitions before storing components
When two Pydantic/DHI models include nested models that share the same class name but have different fields, their generated $defs/definitions entries use the same key, and this setdefault keeps the first nested schema while the later parent schema is still rewritten to reference #/components/schemas/<that key>. In that scenario the second operation's request/response schema points at the first nested model's shape, so the OpenAPI contract is wrong despite the top-level model names being disambiguated.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0049fb5eca
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| ) | ||
| component_names[(defs_key, name)] = component_name | ||
| reserved_names.add(component_name) | ||
| ref_rewrites[f"#/{defs_key}/{name}"] = f"#/components/schemas/{component_name}" |
There was a problem hiding this comment.
Rewrite component-template refs when renaming definitions
When a model schema generator honors the ref_template="#/components/schemas/{model}" passed above, nested refs arrive already in component form (for example #/components/schemas/Nested) while the definitions still get renamed here on collision (for example to Nested_2). This map only rewrites the local #/$defs/... and #/definitions/... forms, so the second colliding parent keeps pointing at the first component and documents the wrong nested schema for real Pydantic/dhi-style output.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3cc77977e8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if ( | ||
| base_name in schema_context.components | ||
| and base_name not in schema_context.component_models | ||
| and schema_context.components[base_name] == value | ||
| ): |
There was a problem hiding this comment.
Do not reuse raw nested definitions before ref rewriting
When two parent schemas both define the same nested component name with an identical outer shape but that outer schema points to another colliding nested definition, this raw equality check reuses the first component before the inner $refs are rewritten. For example, two $defs.Child schemas that both contain #/components/schemas/Inner but whose $defs.Inner bodies differ will make the second parent still reference the original Child, while only Inner_2 is added; the second parent's child schema then describes the first parent's inner type. Compare definitions only after applying the per-parent ref rewrites, or allocate a new component whenever any referenced definition is renamed.
Useful? React with 👍 / 👎.
| for item in metadata: | ||
| if isinstance(item, _PARAM_MARKER_TYPES): | ||
| return annotation, item |
There was a problem hiding this comment.
Do not document Annotated form markers until runtime parses them
This now treats FastAPI-style Annotated[str, Form()] / Annotated[..., File()] metadata as real OpenAPI markers and emits a form or multipart request body, but the runtime form path I checked in python/turboapi/request_handler.py only enables and parses form fields when param.default is a Form/File instance. In that scenario clients following /docs send form data, while the handler will not have _has_form_params set and the Annotated form arguments are not populated; either the runtime needs the same Annotated marker handling or OpenAPI should not advertise these parameters as form fields.
Useful? React with 👍 / 👎.
|
@codex review Pushed a1d73e6 to address the latest feedback: nested definitions are no longer reused before ref rewriting, and unsupported |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a1d73e6203
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if isinstance(item, _PARAM_MARKER_TYPES): | ||
| return annotation, item |
There was a problem hiding this comment.
Stop documenting unsupported Annotated markers
When a handler uses Annotated with non-form markers, for example q: Annotated[int, Query(default=10, alias="item-query")], this branch now advertises item-query as an optional integer query parameter with a default. The runtime paths do not consume those markers: QueryParamParser.parse_query_params only matches raw query keys against signature parameter names and coerces the unwrapped annotation is never used, while defaults/aliases are only read from param.default in the request handlers/testclient. As a result /items is documented as callable without q or with ?item-query=..., but the handler still gets a missing q argument or ignores the alias until runtime parsing supports Annotated[..., Query()/Body()/Header()/Cookie()/Path()] too.
Useful? React with 👍 / 👎.
|
@codex review Pushed a20712e to address the latest Annotated-marker feedback. OpenAPI now ignores all |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a20712ed3f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| marker, | ||
| ) | ||
| ) | ||
| elif isinstance(marker, Cookie): |
There was a problem hiding this comment.
Don't advertise Cookie parameters before parsing them
When a route declares a normal parameter like session_id: str = Cookie(), this branch now documents it as an OpenAPI cookie parameter, but the request path never consumes Cookie markers: request_handler.py only has parsers for query/path/header/body/form/file and a repo-wide search shows Cookie is otherwise only used by security helpers. In that scenario a client generated from /openapi.json will send the cookie, while the handler receives the Cookie() default object or no bound value instead of the cookie string.
Useful? React with 👍 / 👎.
| "schema": param_schema, | ||
| "required": required, | ||
| "media_type": marker.media_type, | ||
| "direct": _is_model_class(annotation) and not marker.embed, |
There was a problem hiding this comment.
Don't document Body(embed=True) until binding supports it
For a single model parameter declared as item: Model = Body(embed=True), this sets direct to false, so OpenAPI tells clients to send {"item": {...}}; however RequestBodyParser.parse_json_body does not inspect Body markers and its single-model branch validates the entire JSON object against Model. Clients following the generated schema will therefore send a wrapped body that the runtime does not bind as documented.
Useful? React with 👍 / 👎.
|
@codex review Pushed 505d7bf to address the latest Cookie and Body(embed=True) feedback. OpenAPI no longer advertises unsupported Cookie route params, and Body(embed=True) model parameters now document the direct model schema to match current runtime binding. The PR description and testing summary are updated. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 505d7bfefa
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| # Try to get schema from Satya/Pydantic models | ||
| def _resolve_annotation_and_marker(param: inspect.Parameter) -> tuple[Any, Any | None]: | ||
| annotation, metadata = _unwrap_annotated(param.annotation) |
There was a problem hiding this comment.
Do not document unsupported Annotated model bodies
When a POST handler uses item: Annotated[Item, Body()] (including Body(embed=True)) without a default marker, this unwraps the annotation and drops the marker, so the later _is_model_class(annotation) branch documents the request as a direct Item body. The JSON body parser still inspects the raw param.annotation and does not unwrap Annotated, so it will not validate/bind a raw model body for that route; the generated OpenAPI therefore tells clients to send a shape the runtime does not accept.
Useful? React with 👍 / 👎.
justrach
left a comment
There was a problem hiding this comment.
I’m not approving/accepting this PR yet. I ran through the current PR head (505d7bfefa) and I agree with Codex’s latest P2 finding.
Quoting Codex:
Do not document unsupported Annotated model bodies
When a POST handler uses
item: Annotated[Item, Body()](includingBody(embed=True)) without a default marker, this unwraps the annotation and drops the marker, so the later_is_model_class(annotation)branch documents the request as a directItembody. The JSON body parser still inspects the rawparam.annotationand does not unwrapAnnotated, so it will not validate/bind a raw model body for that route; the generated OpenAPI therefore tells clients to send a shape the runtime does not accept.
What I verified:
- The existing OpenAPI test class still passes on the PR head (
11 passed), so this specific mismatch is not covered yet. - A smoke check on the PR head shows
item: Annotated[Item, Body()]is documented as{'$ref': '#/components/schemas/Item'}and registersItemincomponents.schemas. That does not match the current runtime parser path, which does not unwrapAnnotatedfor JSON model binding. - I do not see a direct auth/RCE-style security issue in this OpenAPI-only change, but this is a correctness/API-contract risk: generated clients can be told to send a model body that TurboAPI does not actually bind the same way at runtime.
Requested before merge:
- Keep unsupported
Annotated[..., Body()/Query()/Header()/Cookie()/Path()/Form()/File()]marker metadata out of OpenAPI until runtime parsing supports those markers. ForAnnotated[Item, Body()], that means not unwrapping through to a direct model schema. - Add a regression test for
Annotated[Item, Body()]proving we do not emit a directItemrequest body or register theItemcomponent for that unsupported route shape. - After that, rerun the focused OpenAPI tests and lint for
python/turboapi/openapi.py/tests/test_fastapi_parity.py.
Once that last Codex item is fixed, the rest of the PR direction looks aligned with the current runtime behavior: it skips dependencies, handles supported default-value markers, avoids unsupported cookie route docs, and disambiguates component naming/ref rewrites.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c386c19583
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| _parameter_alias(param_name, marker), | ||
| "query", | ||
| required, |
There was a problem hiding this comment.
Do not advertise unsupported Query defaults
When a route declares a parameter such as include_archived: bool = Query(default=False, alias="archived"), this branch emits the alias and marks the parameter optional/defaulted. The runtime query parsers I checked (QueryParamParser.parse_query_params and TestClient) only match incoming query keys against the Python parameter name and do not substitute Query.default when the value is omitted, so clients following this schema can call ?archived=true or omit the parameter while the handler receives no bound value/the Query marker object instead.
Useful? React with 👍 / 👎.
a461876 to
e2abd6f
Compare
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e2abd6f2ea
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| body_entries.append( | ||
| { | ||
| "name": _parameter_alias(param_name, marker), | ||
| "schema": param_schema, |
There was a problem hiding this comment.
Avoid typed form schemas until values are coerced
When a form field is annotated as a non-string type, e.g. age: int = Form(), this stores the annotation-derived schema so OpenAPI advertises an integer form field. The enhanced runtime path I checked in request_handler.py assigns _form_fields[key] directly for Form parameters without the coercion that query/body parsing gets, so clients following the schema still deliver a string to the handler and integer/bool handlers can behave incorrectly.
Useful? React with 👍 / 👎.
| { | ||
| "name": _parameter_alias(param_name, marker), | ||
| "schema": param_schema, | ||
| "required": required, |
There was a problem hiding this comment.
Avoid optional Body defaults until runtime honors them
For Body(default=...) parameters this uses the marker default to mark the request body/property optional and publish the default, but RequestBodyParser.parse_json_body does not unwrap Body.default: when the field is absent it falls back to param.default, which is the Body marker object itself. In that scenario clients can omit the field according to /openapi.json, while the handler receives a marker object instead of the documented default value.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7758e55ee3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| "name": _parameter_alias(param_name, marker), | ||
| "schema": param_schema, | ||
| "required": body_required, | ||
| "media_type": marker.media_type, |
There was a problem hiding this comment.
Keep non-JSON Body media types out of the schema
When a route declares a body marker such as payload: str = Body(media_type="text/plain"), this advertises a text/plain request body, but the runtime path I checked in request_handler.py still always calls RequestBodyParser.parse_json_body() and json.loads() for non-form bodies. Clients generated from this schema will send raw text and receive a bad-request parse error instead of the documented binding; either keep documenting JSON here or add runtime parsing for the declared media type.
Useful? React with 👍 / 👎.
| _parameter_alias(param_name, marker, location="header"), | ||
| "header", | ||
| required, | ||
| param_schema, |
There was a problem hiding this comment.
Avoid typed header schemas until headers are coerced
When a handler uses a non-string header like x_limit: int = Header(), this publishes the annotation-derived integer schema, but HeaderParser.parse_headers() copies the matched header value directly as a string and the enhanced handler passes it through without coercion. Clients following the OpenAPI contract can send a numeric header while the handler still receives str, which breaks code that relies on the annotated type.
Useful? React with 👍 / 👎.
Summary
Fix OpenAPI schema generation so form fields, dependency parameters, DHI/Pydantic-style models, response models, and colliding model component names serialize correctly without advertising unsupported runtime behavior.
Context
Issue #176 exposed that the OpenAPI generator treated POST parameters too broadly as JSON body fields, leaked dependency/default marker objects into generated schemas, and did not register model schemas in
components.schemas. This broke JSON serialization and produced incorrect docs for forms and typed request/response models.Codex review also flagged schema-collision and parity cases: distinct top-level model classes sharing
__name__, nested$defs/definitionsentries sharing local keys across parent models, component-template refs that still pointed at pre-rename components, raw nested-definition reuse before refs were rewritten, FastAPI-styleAnnotated[..., ...]marker metadata being documented before the runtime consumes those markers, cookie route params being documented before runtime binding exists, andBody(embed=True)documenting a wrapped model body that runtime parsing does not currently bind.Changes
components.schemasand emit$refreferences.__name__, while preserving stable refs for repeated uses of the same class.$defs/definitionsbefore moving them intocomponents.schemas.$defs/definitionsrefs and already-componentized#/components/schemas/{model}refs to the selected component names when nested definitions are renamed.Annotated[..., Query()/Header()/Cookie()/Path()/Body()/Form()/File()]metadata out of generated docs until runtime parsing supports that style; default-value markers and directUploadFiledocs remain supported.Cookie()route parameters out of OpenAPI until request parsing binds them.Body(embed=True)model parameters according to current runtime behavior by referencing the direct model schema rather than a wrapped object.Key Implementation Details
The generator now unwraps
Annotatedtype wrappers for the underlying Python type, detects TurboAPI marker defaults that the runtime actually binds, builds request bodies by media type, and recursively converts supported Python/model annotations into OpenAPI-compatible schemas. Model schemas are copied into components and referenced from operations to match FastAPI-style output.A per-schema generation context tracks model-class-to-component-name mappings so subsequent references to the same model stay stable, while later distinct models or nested definitions with colliding names receive unique component keys. Nested schemas are deep-copied before component insertion, and all relevant local/component refs are recursively rewritten to avoid stale refs after renaming.
Security Review
/openapi.jsonserialization failures.Testing
Links