If you just want to see how a dataset is harvested by CDE:
- Start your Python environment:
  conda create -n cde python=3.10
  conda activate cde
- Install the package in editable mode:
  pip install -e .
- Run the harvester for one dataset:
  python -m cde_harvester --urls https://data.cioospacific.ca/erddap --dataset_ids ECCC_MSC_BUOYS
- See files in the harvest folder
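To inspect what a harvest run produced, a small helper along these lines may be handy. It is only a sketch: `summarize_harvest` is a hypothetical name, and the only assumption taken from the steps above is that output lands in a folder called harvest. It makes no assumption about the file names or formats inside.

```shell
# Hypothetical helper: print each file in the harvest output folder with its line count.
# Assumes only that the harvester wrote regular files into the given folder
# (default: ./harvest); file names and formats are not assumed.
summarize_harvest() {
  local dir="${1:-harvest}" f
  if [ ! -d "$dir" ]; then
    echo "no such folder: $dir" >&2
    return 1
  fi
  for f in "$dir"/*; do
    # Skip subfolders and the unexpanded glob when the folder is empty.
    [ -f "$f" ] || continue
    printf '%s\t%s\n' "$(basename "$f")" "$(wc -l < "$f" | tr -d ' ')"
  done
}
```

Calling `summarize_harvest` with no argument after a run lists the contents of ./harvest; pass a different path if your output folder is elsewhere.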
- Install Docker and Docker Compose. New versions of Docker include docker compose.
- Rename the file .env.sample to .env and change any settings if needed. If you are running on your local machine, these settings don't need to change.
- Copy harvest_config.sample.yaml to harvest_config.yaml and modify if needed.
- Run locally with docker compose:
  - Development environment:
    docker compose up -d
  - Production environment:
    docker compose -f docker-compose.production.yaml up -d
- See website at http://localhost:8098
- To update the database and reharvest datasets:
  - Full reload (clears all data, reloads everything):
    - Development:
      docker compose up -d harvester
    - Production:
      docker compose -f docker-compose.production.yaml up -d harvester
  - Incremental update (only updates changed datasets, much faster):
    - Development:
      docker compose run --rm -e INCREMENTAL_MODE=true harvester
    - Production:
      docker compose -f docker-compose.production.yaml run --rm -e INCREMENTAL_MODE=true harvester
    - Or use the convenience script:
      ./run_harvester.sh --incremental
  - Custom config file (use a different harvest configuration):
    - Set the HARVEST_CONFIG_FILE environment variable, or override it at runtime:
    - Development:
      docker compose run --rm -e HARVEST_CONFIG_FILE=/app/harvester/custom_config.yaml harvester
    - Production:
      docker compose -f docker-compose.production.yaml run --rm -e HARVEST_CONFIG_FILE=/app/harvester/custom_config.yaml harvester
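The full and incremental variants above differ only in the compose file and the subcommand, so they can be assembled by a small function. The sketch below is a hypothetical helper, not the repository's run_harvester.sh; the name `harvester_command` and its two arguments are assumptions, and it only prints commands that already appear in the list above.

```shell
# Hypothetical helper (not the repository's run_harvester.sh): print the docker compose
# command for an environment (development|production) and a mode (full|incremental).
harvester_command() {
  local env="$1" mode="$2" compose="docker compose"
  case "$env" in
    development) ;;
    production) compose="$compose -f docker-compose.production.yaml" ;;
    *) echo "unknown environment: $env" >&2; return 1 ;;
  esac
  case "$mode" in
    # Full reload: clears all data and reloads everything.
    full) echo "$compose up -d harvester" ;;
    # Incremental: only updates changed datasets.
    incremental) echo "$compose run --rm -e INCREMENTAL_MODE=true harvester" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}
```

Printing the command instead of running it lets you review it first; for example, `harvester_command production incremental | sh` would execute the production incremental update.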
There are two main approaches for frontend development:
Run the frontend locally while using Docker Compose for all backend services (recommended for full-stack development).
- Rename .env.sample in the root directory to .env and change any settings if needed. If you are running on your local machine, these settings don't need to change.
- Start all backend services using Docker Compose:
  docker compose up -d
- Start the frontend locally:
  cd frontend
  npm install
  npm start
- See website at http://localhost:8000
Run only the frontend locally and connect to a remote API (recommended for frontend-only development).
- Start the frontend with a custom API URL:
  cd frontend
  npm install
  REACT_APP_API_URL=https://your-remote-api.com/api npm start
- See website at http://localhost:8000
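The `REACT_APP_API_URL=... npm start` form sets the variable for that single command only. The sketch below illustrates this shell behaviour with `sh -c` standing in for `npm start`; the URL is the placeholder from the step above, and the variable names `child` and `parent` are just for illustration.

```shell
# A VAR=value prefix applies to one command only; the calling shell is unchanged.
unset REACT_APP_API_URL

# The child process (standing in for "npm start") sees the value...
child=$(REACT_APP_API_URL=https://your-remote-api.com/api sh -c 'printf %s "$REACT_APP_API_URL"')

# ...while the parent shell still has no such variable afterwards.
parent=${REACT_APP_API_URL:-unset}

echo "child process saw: $child"
echo "parent shell has:  $parent"
```

To keep the setting for a whole terminal session instead, you could run `export REACT_APP_API_URL=...` once before `npm start`.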
For complete local development with all services running outside Docker (advanced):
- Rename .env.sample in the root directory to .env and change any settings if needed.
- Start a local database using Docker:
  docker compose up -d db
- Set up a Python virtual environment and install the Python modules:
  conda create -n cde python=3.10
  conda activate cde
  pip install -e ./downloader -e ./download_scheduler -e ./harvester -e ./db-loader
- Start the API:
  cd web-api
  npm install
  npm start
- Start the download scheduler:
  python -m download_scheduler
- Start the frontend:
  cd frontend
  npm install
  npm start
- Harvest a single dataset and load CKAN data:
  sh data_loader_test.sh
- See website at http://localhost:8000
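With several services started by hand, it can take a moment before the site responds. One way to check is to poll the URL until it answers. This is a minimal sketch: `wait_for_url` is a hypothetical helper, the attempt count and delay defaults are arbitrary, and it assumes `curl` is installed.

```shell
# Hypothetical helper: poll a URL until it answers successfully or attempts run out.
# Arguments: URL, number of attempts (default 30), seconds between attempts (default 2).
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl return non-zero on HTTP errors such as 404 or 500.
    if curl --silent --fail --output /dev/null "$url"; then
      echo "up after $i attempt(s): $url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s): $url" >&2
  return 1
}
```

For the setup above, `wait_for_url http://localhost:8000` would return once the frontend is serving pages, or fail after about a minute.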
Pushes to master and development automatically deploy to the corresponding environment via the Deploy workflow. The workflow connects to the remote server over WireGuard VPN, syncs the repository to the exact commit that triggered the run, injects secrets from 1Password, and brings up the Docker Compose stack.
Deploy CDE to production using Docker Compose with the production configuration file.
- Rename .env.sample to production.env and configure with production settings.
- Copy harvest_config.sample.yaml to harvest_config.yaml and configure the datasets to harvest.
- Delete old redis and postgres data (if needed):
  sudo docker volume rm cde_postgres-data cde_redis-data
- Start all services using the production Docker Compose file:
  sudo docker compose -f docker-compose.production.yaml up -d --build
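Before bringing the stack up, it may be worth confirming that the files named in the steps above are in place. The function below is a hypothetical pre-flight check, not part of the repository; it only tests that the three files exist in the given directory and does not validate their contents.

```shell
# Hypothetical pre-flight check: verify the files named in the deployment steps exist.
# Argument: repository directory (default: current directory).
check_production_files() {
  local dir="${1:-.}" missing=0 f
  for f in production.env harvest_config.yaml docker-compose.production.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

A typical use would be `check_production_files && sudo docker compose -f docker-compose.production.yaml up -d --build`, so the stack only starts when nothing is missing.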
The harvester should be run on a schedule to keep the data up to date. Set up a cron job to run the harvester container:
- Edit your crontab:
  crontab -e
- Add an entry to run the harvester nightly (example runs at 2 AM):
  0 2 * * * cd /path/to/explore-cioos && docker compose -f docker-compose.production.yaml up harvester
  Or to run weekly (example runs Sunday at 2 AM):
  0 2 * * 0 cd /path/to/explore-cioos && docker compose -f docker-compose.production.yaml up harvester
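If a harvest could ever take longer than the interval between scheduled runs, you may want to stop two runs from overlapping. This is optional and not something the repository is known to provide: `run_locked` is a hypothetical wrapper, and the lock path is whatever you choose. It relies on `mkdir` being atomic, so only one caller can create the lock directory.

```shell
# Hypothetical wrapper: run a command only if no other run holds the lock directory.
# Arguments: lock directory path, then the command and its arguments.
run_locked() {
  local lock="$1" rc=0
  shift
  # mkdir fails if the directory already exists, which means another run is active.
  if ! mkdir "$lock" 2>/dev/null; then
    echo "another run holds $lock, skipping" >&2
    return 75
  fi
  # Run the command, remember its exit status, and always release the lock.
  "$@" || rc=$?
  rmdir "$lock"
  return "$rc"
}
```

Saved in a script that the cron entry calls, this could wrap the harvester command, for example `run_locked /tmp/cde-harvest.lock docker compose -f docker-compose.production.yaml up harvester`. The exit code 75 for a skipped run is an arbitrary choice (it matches EX_TEMPFAIL in sysexits.h). Note that a crash that kills the shell would leave the lock directory behind, and it would then need to be removed by hand.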