CUMULUS-4553 Centralize schema generation#4295
Conversation
a42b0d3 to
3fec667
Compare
3fec667 to
05d723f
Compare
mikedorfman
left a comment
There was a problem hiding this comment.
I was able to get partway through this - wanted to post my comments before CoB for me. I'll try to get the rest of the way through it either tomorrow or Monday.
There was a problem hiding this comment.
Just confirming my understanding - this is duplicating the definition here, right? I'm not sure where the authoritative definition/schema is for CNM and what references the podaac schema, but we might want processes pointed there to point here at some point so we have a single source of truth.
There was a problem hiding this comment.
Yea true. I believe that @yenes56 originally generated the pydantic models from that schema you linked. I went through and cleaned them up a little by hand because the auto generation was a little crude and duplicated a bunch of stuff that I was able to combine.
But that is a good question, how do we ensure that updates to the CNM schema are reflected here? One thing I'm a bit concerned with is the version number enum. Let's say a new version of CNM is released, tasks that use this schema will reject that new version with a validation error until the schemas are updated, and a new version of cumulus is released even if structurally nothing has changed in the schema.
| print(f"Wrote schema to {output_path}") | ||
|
|
||
| if encountered_error: | ||
| sys.exit(-1) |
There was a problem hiding this comment.
I think you could just do raise?
There was a problem hiding this comment.
I guess I didn't want it to spew a python exception trace to the screen.
There was a problem hiding this comment.
Any reason why we're not generating/saving a uv lockfile here?
There was a problem hiding this comment.
Yea, for libraries it doesn't make sense to commit the lock file since the library itself doesn't have just one possible environment. It has a range of compatible versions which will eventually be resolved to a particular environment in the application that is using the library. In this case every task that pulls this library in as a dev dependency has a lock file in which the particular environment for that task will be captured. It's possible that each task gets slightly different versions of the dependencies based on what is compatible with everything else in the dependency tree.
| """ | ||
| LOGGER.setMetadata(event, context) | ||
| cumulus_task_return = run_cumulus_task(lambda_adapter, event, context, SCHEMAS) | ||
| cumulus_task_return = run_cumulus_task(lambda_adapter, event, context) |
There was a problem hiding this comment.
Is there a specific reason you're using the implicit/path-based schema validation instead of passing the paths to run_cumulus_task? I really wish run_cumulus_task complained/failed if it didn't find the schemas (particularly when passing the paths explicitly).
There was a problem hiding this comment.
I guess I figured we should be sticking to the same default paths that the rest of the core tasks use and it would be less likely for someone to be tempted to do it differently if we just use the implicit defaults. Not exactly sure, it felt redundant to me to pass them in explicitly.
Yea, if it failed when they were passed in explicitly that might be a reason to do it.
05d723f to
bfe8686
Compare
bfe8686 to
e75fa31
Compare
e75fa31 to
35ecc05
Compare
Summary: Refactors all python task schemas to use pydantic to generate the jsonschemas. The functionality for doing the generation is encapsulated in the
python-schemaspackage which is referenced as a dev dependency. This package also includes some reusable schemas that can be imported when needed across multiple tasks e.g.GranuleorCnm.Task schemas can be generated for the whole repo using:
This will regenerate the jsonschemas for both the python and typescript tasks.
Addresses CUMULUS-4553: Centralize Coreification input/output schemas
Changes
python-schemaspackage for converting pydantic models to jsonschema and housing reusable models needed by multiple tasks.uv-exportscript topackage.jsons so that requirements.txt can be updated withlerna run uv-exportuv-lockscript topackage.jsons to upgrade the dependencies in all lock files withlerna run uv-lockPR Checklist
📝 Note:
For most pull requests, please Squash and merge to maintain a clean and readable commit history.