Refactor harvest orchestration and enhance Prefect worker setup #98

Open
JessyBarrette wants to merge 20 commits into feat/harvest-dashboard from rev/harvest-dasboard-jessy
JessyBarrette commented Jun 5, 2026

Improve the Prefect worker setup for in-process harvesting and add support for remote workers. Update the README for clarity on metadata storage and configuration handling. Introduce Coolify-specific docker-compose overrides and streamline the orchestration of harvest jobs per server. Enhance error handling and logging in the ERDDAP harvester, and publish dataset status and logs as artifacts for better visibility. Normalize cron schedule handling and improve documentation throughout.

This pull request introduces significant improvements to the deployment, scaling, and orchestration of the harvester and related services, with a focus on better support for Prefect-based orchestration, multi-host/remote worker setups, and environment-specific overrides. The changes streamline how harvest flows are run and scheduled, improve documentation, and introduce new Docker Compose configurations for both production and specialized environments like Coolify.

Prefect orchestration and worker management:

  • Replaces the previous Prefect deployment/worker setup with a new prefect_worker service that registers work pools and deployments on startup, runs harvest flows in-process (no per-run containers or Docker socket), and can be scaled horizontally with Docker Compose. Adds environment variables to control scheduling, deployment registration, and on-deploy triggers (HARVESTER_CRON, VERNACULARS_CRON, RUN_ON_DEPLOY, REGISTER_DEPLOYMENTS). ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-0a1c3356cafa536f2da1e810fe8ae075ca001848b63c20d86b004626789cfa88L76-L122), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL54-R67), [[3]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL68-R85), [[4]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L175-R215))

  • Adds a new docker-compose.worker.yaml for launching remote Prefect workers on additional hosts, allowing for distributed harvesting capacity. These workers poll the central Prefect server, do not register deployments, and require access to the central database and Prefect API. ([docker-compose.worker.yamlR1-R51](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-84721629841278ceb728e5039aec40cf3bc45a21ad3dd65df058de2cc1e044ebR1-R51))

  • Updates the Prefect server to use a non-conda image with asyncpg, ensuring metadata is stored in Postgres (not SQLite) for better concurrency and reliability. Includes logic to auto-create the prefect database if missing. ([docker-compose.production.yamlL139-R161](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-0a1c3356cafa536f2da1e810fe8ae075ca001848b63c20d86b004626789cfa88L139-R161))
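For reviewers unfamiliar with the Postgres-backed setup, here is a rough sketch of the two pieces the last bullet implies: building an asyncpg-style connection URL for the Prefect server, and the SQL needed to create the `prefect` database when it is absent. The helper names (`prefect_db_url`, `create_database_sql`), the default port, and the choice to pass the URL through `PREFECT_API_DATABASE_CONNECTION_URL` are assumptions for illustration; the actual logic in `docker-compose.production.yaml` may look different.

```python
from urllib.parse import quote

# Query a bootstrap step could run (for example with asyncpg) to check
# whether the database already exists. asyncpg uses $1-style placeholders.
DATABASE_EXISTS_SQL = "SELECT 1 FROM pg_database WHERE datname = $1"


def prefect_db_url(user: str, password: str, host: str,
                   port: int = 5432, database: str = "prefect") -> str:
    """Build a SQLAlchemy URL that selects the asyncpg driver.

    Hypothetical helper: credentials are percent-encoded so characters
    such as '@' or '/' in a password do not break URL parsing.
    """
    return (
        f"postgresql+asyncpg://{quote(user, safe='')}:"
        f"{quote(password, safe='')}@{host}:{port}/{database}"
    )


def create_database_sql(database: str = "prefect") -> str:
    """Return a CREATE DATABASE statement with the identifier quoted.

    CREATE DATABASE cannot take bind parameters, so the name is quoted
    by doubling any embedded double quotes.
    """
    escaped = database.replace('"', '""')
    return f'CREATE DATABASE "{escaped}"'


if __name__ == "__main__":
    # Value one might export as PREFECT_API_DATABASE_CONNECTION_URL.
    print(prefect_db_url("cde", "p@ss/word", "db"))
    print(create_database_sql())
```

A bootstrap step would run `DATABASE_EXISTS_SQL` against the default `postgres` database and issue the `CREATE DATABASE` statement only when no row comes back.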

Environment and deployment configuration:

  • Adds docker-compose.coolify.yml as a Coolify-specific override, which removes published host ports and sets up environment variables for Coolify's proxy/FQDN system. ([docker-compose.coolify.ymlR1-R40](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-28660a25c4eefbe9ae070d880f200978aea222a171ea49e62d00a5516e3a9eb0R1-R40))

  • Removes the sample local development override file (docker-compose.override.yaml.sample) to avoid confusion and clarify deployment practices. ([docker-compose.override.yaml.sampleL1-L18](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-416e8201f68e086bd5fc13472e9ef05dca0310f18ba4247453d278907fb8053aL1-L18))

  • Updates .env.sample with new variables for harvest scheduling, config file selection, and deployment registration, along with improved documentation. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL54-R67), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL68-R85))
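As an illustration of how a worker entrypoint might consume the new `.env.sample` variables, the sketch below reads `HARVESTER_CRON`, `VERNACULARS_CRON`, `RUN_ON_DEPLOY`, and `REGISTER_DEPLOYMENTS` and normalizes the cron strings. Everything beyond the variable names is an assumption: the `WorkerSettings` class, the helper functions, the accepted truthy values, the defaults, and the rule that an empty cron disables the schedule are all hypothetical and may not match the PR's implementation.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_TRUTHY = {"1", "true", "yes", "on"}  # assumed set of accepted truthy values


def parse_bool(value: Optional[str], default: bool = False) -> bool:
    """Interpret an environment string as a boolean (unset or empty -> default)."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in _TRUTHY


def normalize_cron(value: Optional[str]) -> Optional[str]:
    """Strip quotes and extra whitespace from a 5-field cron expression.

    Returns None when the variable is unset or empty, which this sketch
    treats as "no schedule" (an assumption).
    """
    if value is None:
        return None
    fields = value.strip().strip('"').strip("'").split()
    if not fields:
        return None
    if len(fields) != 5:
        raise ValueError(f"expected 5 cron fields, got {len(fields)}: {value!r}")
    return " ".join(fields)


@dataclass(frozen=True)
class WorkerSettings:
    harvester_cron: Optional[str]
    vernaculars_cron: Optional[str]
    run_on_deploy: bool
    register_deployments: bool


def load_settings(env: Mapping[str, str] = os.environ) -> WorkerSettings:
    """Collect the scheduling and registration settings from an env mapping."""
    return WorkerSettings(
        harvester_cron=normalize_cron(env.get("HARVESTER_CRON")),
        vernaculars_cron=normalize_cron(env.get("VERNACULARS_CRON")),
        run_on_deploy=parse_bool(env.get("RUN_ON_DEPLOY"), default=False),
        # Assumed default: register unless explicitly disabled, as a
        # remote worker would do by setting REGISTER_DEPLOYMENTS=false.
        register_deployments=parse_bool(env.get("REGISTER_DEPLOYMENTS"), default=True),
    )


if __name__ == "__main__":
    print(load_settings())
```

Taking the environment as a plain mapping keeps the parsing testable without touching the real process environment.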

Docker Compose networking and environment updates:

  • Modifies docker-compose.yaml to publish required ports for db and nginx by default, simplifying local development and aligning with the new override strategy for production/Coolify. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L5-R6), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L33-R41))

  • Removes Coolify-specific environment variables and logic from the base Compose file, moving them to the dedicated override. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L33-R41), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L56-R54))

Code and task changes:

  • Refactors db-loader/cde_db_loader/__main__.py to define the main loader as a Prefect @task instead of a @flow, aligning it with the new orchestration approach. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b72bd0dde5d880ec74272e3030892ede678350af9f592f91242363b1c39a9e71L20-R20), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b72bd0dde5d880ec74272e3030892ede678350af9f592f91242363b1c39a9e71L190-R190))

Prefect orchestration and scaling:

  • Replaces previous Prefect worker/deployment model with a scalable, in-process prefect_worker service and new scheduling/environment controls (HARVESTER_CRON, VERNACULARS_CRON, RUN_ON_DEPLOY, REGISTER_DEPLOYMENTS). ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-0a1c3356cafa536f2da1e810fe8ae075ca001848b63c20d86b004626789cfa88L76-L122), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL54-R67), [[3]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL68-R85), [[4]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5L175-R215))
  • Adds docker-compose.worker.yaml for launching remote Prefect workers on other hosts, enabling distributed harvesting. ([docker-compose.worker.yamlR1-R51](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-84721629841278ceb728e5039aec40cf3bc45a21ad3dd65df058de2cc1e044ebR1-R51))
  • Updates Prefect server to use Postgres via asyncpg, with logic to create the prefect DB if needed. ([docker-compose.production.yamlL139-R161](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-0a1c3356cafa536f2da1e810fe8ae075ca001848b63c20d86b004626789cfa88L139-R161))

Deployment/environment configuration:

  • Introduces docker-compose.coolify.yml for Coolify-specific overrides, removing host port exposure and integrating with Coolify's FQDN/proxy system. ([docker-compose.coolify.ymlR1-R40](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-28660a25c4eefbe9ae070d880f200978aea222a171ea49e62d00a5516e3a9eb0R1-R40))
  • Removes the local development override sample to clarify deployment practices. ([docker-compose.override.yaml.sampleL1-L18](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-416e8201f68e086bd5fc13472e9ef05dca0310f18ba4247453d278907fb8053aL1-L18))
  • Updates .env.sample with new scheduling and config variables, and improved documentation. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL54-R67), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-088d9f35d23a4347d221d71dd49b02b95001dff4abe637a40fe0bc04d502049cL68-R85))

Docker Compose and networking:

  • Changes base Compose files to publish required ports by default and moves Coolify-specific logic to the override file. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L5-R6), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L33-R41), [[3]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-3fde9d1a396e140fefc7676e1bd237d67b6864552b6f45af1ebcc27bcd0bb6e9L56-R54))

Code/task refactoring:

  • Refactors the main DB loader entrypoint to use a Prefect @task instead of @flow. ([[1]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b72bd0dde5d880ec74272e3030892ede678350af9f592f91242363b1c39a9e71L20-R20), [[2]](https://github.com/cioos-siooc/explore-cioos/pull/98/files#diff-b72bd0dde5d880ec74272e3030892ede678350af9f592f91242363b1c39a9e71L190-R190))

…atus and log files as artifacts for better visibility in the Prefect UI
…et harvesting logic for clarity and efficiency
…etching and CSV writing, improve error handling, and streamline CKAN record fetching with caching
…corresponding API routes

- Implement HarvestRun component to display details of a specific harvest run.
- Implement HarvestServer component to show datasets from a specific server with filtering options.
- Implement Sparkline component for visualizing dataset status history.
- Implement StatusBadge component for displaying status labels.
- Add slug utility functions for encoding and decoding URLs.
- Create useHarvestFetch hook for fetching data from the harvest API.
- Add styles for Harvest components and tables.
- Update routing in index.js to include new Harvest routes.
- Add translations for Harvest-related text in English and French.
- Implement new API routes for harvest data retrieval in the backend.