This repository was archived by the owner on Sep 26, 2023. It is now read-only.
This set of steps should be generalizable to other datasets, but we will start with OMI Ozone.
We will deploy one workflow (an AWS Step Functions state machine) per dataset. Each dataset workflow will be triggered on a schedule to discover new files, generate COGs, and publish metadata for them.
We will write infrastructure as code (CDK) to:
Deploy the workflow (Step Functions state machine)
Trigger the workflow to run on a schedule (CloudWatch Events rules)
Deploy the file URL generation step as step 1 of the workflow
Deploy the COG generation (and STAC metadata publishing) step as step 2 of the workflow
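As a rough illustration of the scheduled trigger, the sketch below builds the parameters a CloudWatch Events rule and its target would need, using the key names of the `PutRule` and `PutTargets` APIs. It is plain Python rather than CDK, and the rule naming convention, the daily rate, and the ARNs are assumptions for illustration only.

```python
import json


def rate_expression(value: int, unit: str) -> str:
    """Build a schedule expression such as 'rate(1 day)' or 'rate(6 hours)'."""
    if value < 1:
        raise ValueError("value must be a positive integer")
    if unit not in ("minute", "hour", "day"):
        raise ValueError(f"unsupported unit: {unit}")
    # The unit is singular for 1 and plural otherwise.
    return f"rate({value} {unit if value == 1 else unit + 's'})"


def schedule_rule(dataset_id: str, value: int = 1, unit: str = "day") -> dict:
    """Rule parameters (PutRule key names). The '-schedule' suffix is hypothetical."""
    return {
        "Name": f"{dataset_id}-schedule",
        "ScheduleExpression": rate_expression(value, unit),
        "State": "ENABLED",
    }


def schedule_target(rule_name: str, state_machine_arn: str, role_arn: str,
                    workflow_input: dict) -> dict:
    """Target parameters (PutTargets key names) pointing the rule at the workflow."""
    return {
        "Rule": rule_name,
        "Targets": [{
            "Id": "dataset-workflow",
            "Arn": state_machine_arn,
            "RoleArn": role_arn,
            # The input is passed to the state machine as a JSON string.
            "Input": json.dumps(workflow_input),
        }],
    }


if __name__ == "__main__":
    rule = schedule_rule("omdoao3-003")
    target = schedule_target(
        rule["Name"],
        "arn:aws:states:us-east-1:123456789012:stateMachine:placeholder",  # placeholder ARN
        "arn:aws:iam::123456789012:role/placeholder",  # placeholder ARN
        {"short_name": "OMDOAO3", "version": "003"},
    )
    print(json.dumps(rule, indent=2))
    print(json.dumps(target, indent=2))
```

In a CDK stack the same information would be expressed through a rule construct with a schedule and a state machine target, so this is only meant to show which pieces have to exist.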
Tasks breakdown
Infra as code for workflow
1. Deploy a skeleton workflow
2. Deploy trigger to schedule workflow
3. Add the workflow steps to the workflow deployment
Update the workflow to use the processing script for OMDOAO3 version 003
We will need a way to run the processing in parallel for each file discovered in step 1 of the workflow. AWS Step Functions has a Map state type that looks like it can be used for this.
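A minimal sketch of how a Map state could fan out over the file URLs, written as an Amazon States Language definition built in Python. The state names, the `$.file_urls` path, the concurrency limit, and the task ARNs are all assumptions, and newer versions of the States Language call the `Iterator` field `ItemProcessor`.

```python
import json


def build_definition(list_urls_arn: str, make_cog_arn: str,
                     max_concurrency: int = 10) -> dict:
    """Two-step workflow: list file URLs, then process each URL in parallel."""
    return {
        "Comment": "Per-dataset workflow (illustrative sketch)",
        "StartAt": "GenerateFileUrls",
        "States": {
            # Step 1: a task that returns a list of file URLs.
            "GenerateFileUrls": {
                "Type": "Task",
                "Resource": list_urls_arn,
                "ResultPath": "$.file_urls",
                "Next": "ProcessFiles",
            },
            # Step 2: the Map state runs its inner workflow once per list item.
            "ProcessFiles": {
                "Type": "Map",
                "ItemsPath": "$.file_urls",
                "MaxConcurrency": max_concurrency,
                "Iterator": {
                    "StartAt": "GenerateCog",
                    "States": {
                        "GenerateCog": {
                            "Type": "Task",
                            "Resource": make_cog_arn,
                            "End": True,
                        }
                    },
                },
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder ARNs; real ones would come from the deployed resources.
    definition = build_definition("arn:placeholder:list-urls", "arn:placeholder:make-cog")
    print(json.dumps(definition, indent=2))
```

`MaxConcurrency` is worth setting explicitly so that a large backlog of files does not overwhelm the downstream tasks.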
Steps in workflow
1. Docker which generates list of file URLs
Write a script which queries CMR for a given collection short name, version, and temporal range and outputs a list of file URLs
Check it works for this dataset (OMDOAO3 version 003)
2. Docker which creates COG from file URL
3. Docker which creates the COG also publishes its metadata to STAC
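The sketch below illustrates steps 1 and 3 with the standard library only: building a CMR granule search URL, pulling data links out of a CMR JSON response, and assembling a minimal STAC Item for a generated COG. The response layout (`feed.entry[].links` with a `rel` ending in `/data#`), the page size, the asset key `cog`, and the STAC version are assumptions to verify against the CMR and STAC documentation. COG creation itself (step 2) is not shown because it needs a raster library.

```python
from urllib.parse import urlencode

CMR_GRANULES_URL = "https://cmr.earthdata.nasa.gov/search/granules.json"
COG_MEDIA_TYPE = "image/tiff; application=geotiff; profile=cloud-optimized"


def granule_search_url(short_name: str, version: str, start: str, end: str,
                       page_size: int = 200) -> str:
    """Build a granule search URL for a collection and temporal range."""
    params = {
        "short_name": short_name,
        "version": version,
        "temporal": f"{start},{end}",  # ISO 8601 start,end
        "page_size": page_size,
    }
    return f"{CMR_GRANULES_URL}?{urlencode(params)}"


def extract_file_urls(response: dict) -> list:
    """Collect data links from a parsed CMR JSON response (assumed layout)."""
    urls = []
    for entry in response.get("feed", {}).get("entry", []):
        for link in entry.get("links", []):
            if link.get("inherited"):
                continue  # skip links inherited from the collection
            if link.get("rel", "").endswith("/data#"):
                urls.append(link["href"])
    return urls


def stac_item_for_cog(item_id: str, cog_href: str, bbox: tuple,
                      datetime_utc: str) -> dict:
    """Assemble a minimal STAC Item whose only asset is the COG."""
    west, south, east, north = bbox
    ring = [[west, south], [east, south], [east, north],
            [west, north], [west, south]]  # closed polygon ring
    return {
        "type": "Feature",
        "stac_version": "1.0.0",  # assumed version
        "id": item_id,
        "bbox": [west, south, east, north],
        "geometry": {"type": "Polygon", "coordinates": [ring]},
        "properties": {"datetime": datetime_utc},
        "links": [],
        "assets": {
            "cog": {"href": cog_href, "type": COG_MEDIA_TYPE, "roles": ["data"]},
        },
    }


if __name__ == "__main__":
    print(granule_search_url("OMDOAO3", "003",
                             "2020-01-01T00:00:00Z", "2020-01-02T00:00:00Z"))
```

Keeping the URL construction and the response parsing as separate functions makes it possible to test the parsing against a saved response without calling CMR.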
Infrastructure details for workflow
Technologies proposed: CDK + AWS Step Functions (Lambda, Fargate) + AWS CloudWatch
We are using the AWS Step Functions Map state
Alternatives: Prefect (cloud-agnostic), AWS Batch (necessary?)