Migrate#211
Conversation
- Document narr variable reference addition (closes #194) - Document narr variable download test coverage - Document download_data edgar dispatch fix Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add two @note entries to download_narr(): - Points to the PSL NARR Dailies server as the data source - Explains that the 88 supported variables are the complete set available as individual NetCDF files on PSL; additional variables in the NARR archive (cloud water, ice mixing ratio, friction velocity, momentum fluxes, static land/soil fields) exist only in raw merged GRIB files and cannot be downloaded with this function Remove the nolint-wrapped PDF href from @param (PDF uses wgrib uppercase abbreviations that differ from PSL NetCDF filenames and caused user confusion per issue #194). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…oad functions - Add rate_limit=2 parameter to download_ecoregion, download_nlcd, download_groads, download_population, and download_koppen_geiger, replacing the previously hardcoded literal value passed to download_run_method() - Add rate_limit=2 and nasa_earth_data_token=NULL as explicit parameters to the download_data() wrapper with Roxygen @param documentation - Refactor download_data() body to use do.call() so rate_limit is always forwarded; nasa_earth_data_token is conditionally forwarded only to functions whose formals() include it (geos, merra2, modis), preventing unexpected-argument errors on non-NASA datasets - Add 56 new unit tests covering: formals checks on all 20 download functions for rate_limit; formals check on wrapper for both new params; token routing logic; rate_limit pass-through via wrapper Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…arams Run devtools::document() to sync Rd files with updated Roxygen2 source. Six man pages were stale after the previous parameter additions: - download_data.Rd: add rate_limit and nasa_earth_data_token - download_ecoregion.Rd: add rate_limit - download_nlcd.Rd: add rate_limit - download_groads.Rd: add rate_limit - download_population.Rd: add rate_limit - download_koppen_geiger.Rd: add rate_limit Fixes R CMD check warning: 'Argument names in code not in docs: rate_limit' Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The CMR URL filter only matched '.hdf' extensions, causing VNP46A2 (VIIRS Black Marble) granules to be silently dropped and triggering the 'No granules found' error. VNP46A2 files use HDF5 (.h5) format, so extend the regex to match both .hdf and .h5. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…calar year URLs - download_run_method: add retry_on_failure=TRUE for transport-level errors (SSL drops, connection resets) and http_version param for server compatibility; exposed via new http_version= argument - download_nlcd: pass http_version=2L to force HTTP/1.1, avoiding nghttp2 PROTOCOL_ERROR from www.mrlc.gov - download_nei: fix URL construction for scalar year input; previously when called with year=2017 alone (as dispatched by targets pattern=map()), both 2017 and 2020 URL suffixes were appended to the 2017 base URL giving a 404; replace with named-vector lookup via vapply so each year maps to its correct URL independently; also fix sprintf format from %d to %s - tests: add unit tests for download_run_method http_version/retry_on_failure, download_nlcd HTTP/1.1 passthrough, and download_nei scalar/vector year URL construction and unknown-year error Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
gROADS data is hosted on data.earthdata.nasa.gov which requires Bearer token auth. download_groads previously hardcoded token=NULL, causing HTTP 401 on every download attempt (silent failure: targets marked job as complete with empty output directory). - Add nasa_earth_data_token param with get_token() retrieval from NASA_EARTHDATA_TOKEN env var (matching pattern used by download_geos, download_merra2, download_modis) - Pass token to download_run_method - Update @note and @param in Roxygen docs - Add 4 unit tests: param exists, token passed through, env-var lookup, and existing file skip path Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
check_url_status made a single httr2 HEAD request with no retry logic. Any transient SSL drop or connection reset caused it to return FALSE, which downstream callers (download_aqs, download_merra2, etc.) treat as an invalid URL and raise an error -- even when the URL is perfectly valid (confirmed HTTP 200 via curl). - Add max_tries=3L parameter and req_retry(retry_on_failure=TRUE) to the HEAD request so transient transport errors are retried before concluding the URL is bad - Add 2 unit tests verifying the new parameter and retry_on_failure flag Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… mismatch New max_tries parameter added in prior commit was missing from Roxygen docs, causing 'Codoc mismatches' WARNING in R CMD check. Add @param and @importFrom httr2 req_retry. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
download_groads now requires nasa_earth_data_token for NASA EarthData authentication. Update the test that checks which functions accept this parameter to reflect the new groads requirement. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ownload_run_method - download_modis CMR granule query: add req_retry(retry_on_failure=TRUE, max_tries=3L) + req_timeout(60) so transient DNS timeouts do not permanently fail the entire MODIS download - download_run_method: cap exponential backoff at 30 s (was httr2 default 60 s) and add req_options(connecttimeout=30L) so each connection attempt times out in 30 s rather than hanging; prevents cascading slowdowns when DNS resolution is intermittently unreliable - tests: add connecttimeout assertion and CMR body inspection test Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add nasa_earth_data_token param (default NULL) - Call get_token() to fetch from env if not provided - Pass token to download_run_method instead of hardcoded NULL - Move download_population from non_nasa_fns to nasa_fns in tests - Regenerated man/download_population.Rd Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Wrap CMR query in tryCatch with actionable error pointing to NASA EarthData status page - Add req_options(connecttimeout=30L) for fast DNS failure detection - Increase max_tries 3->5 and req_timeout 60->120 for slow networks - Add live CMR tests (skip_if_offline) for MOD11A1 and MCD19A2 that verify granule URLs are returned for a known date/bbox - Update body test: check connecttimeout, tryCatch, and that timeout is not the old 60s value Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pipe continuations inside tryCatch must be indented 6 spaces per lintr indentation_linter rules. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR completes the migration of download_* functionality to httr2, updates documentation/vignettes to match the new authentication + download flow, and substantially refactors/expands the test suite to align with the new “URL discovery + httr2 download runner” pattern.
Changes:
- Update vignettes and Rd docs to remove legacy
wget/curlcommand-file workflow and document NASA token setup viasetup_nasa_token()/get_token(). - Refactor many
testthattests to assert returned URL lists, add deprecation-warning coverage, and add mock-download coverage forhashbranches. - Add agent prompt files + update repo tooling/config (pkgdown nav, lintr config, CI env vars, build ignores).
Reviewed changes
Copilot reviewed 86 out of 89 changed files in this pull request and generated 10 comments.
Show a summary per file
| File | Description |
|---|---|
| vignettes/terraclimate_workflow.Rmd | Removes legacy args from examples; keeps workflow aligned with new download behavior. |
| vignettes/protected_datasets.Rmd | Rewrites NASA auth guidance around tokens + setup_nasa_token(). |
| vignettes/narr_workflow.Rmd | Removes legacy download/remove_command args from examples. |
| vignettes/modis_workflow.Rmd | Updates env var name + removes legacy args from examples/output. |
| vignettes/gridmet_workflow.Rmd | Removes legacy download/remove_command args from examples. |
| vignettes/epa_download.Rmd | Reworks vignette into download_data()-first workflow and modernizes chunk options. |
| tests/testthat/test-tri.R | Updates TRI tests to assert URL discovery + adds mock download/hash + deprecation coverage. |
| tests/testthat/test-terraclimate.R | Updates TerraClimate tests to assert URL discovery + adds mock download/hash + deprecation coverage. |
| tests/testthat/test-prism.R | Refactors PRISM tests for URL discovery + adds mock download branches. |
| tests/testthat/test-koppen-geiger.R | Simplifies download tests + adds deprecation warning + mock download branches. |
| tests/testthat/test-huc.R | Refactors HUC tests for URL discovery + adds deprecation + mock download branches. |
| tests/testthat/test-hms.R | Refactors HMS tests, improves live-test skipping, and adds additional branch coverage. |
| tests/testthat/test-groads.R | Refactors gROADS tests and adds NASA token plumbing coverage. |
| tests/testthat/test-gridmet.R | Refactors gridMET tests and adds deprecation + mock download branches. |
| tests/testthat/test-gmted.R | Heavily refactors GMTED tests; adds deprecation + mock download branches and new URL parsing path. |
| tests/testthat/test-geos.R | Refactors GEOS tests to use NASA token env var + adds mock download/hash coverage. |
| tests/testthat/test-ecoregion.R | Expands ecoregion tests (URL discovery, structure, live, integration, branches). |
| tests/testthat/test-cropscape.R | Refactors CropScape tests and adds deprecation + mock download/hash coverage. |
| tests/testthat/NA | Empty file added (likely accidental placeholder). |
| man/test.Rd | Removes internal helper doc. |
| man/setup_nasa_token.Rd | Adds docs for token setup helper. |
| man/interactive.Rd | Removes internal helper doc. |
| man/get_token.Rd | Adds docs for token resolution helper. |
| man/download_tri.Rd | Updates TRI download docs for new download defaults/arguments. |
| man/download_terraclimate.Rd | Updates TerraClimate download docs for new defaults/arguments. |
| man/download_run_method.Rd | Adds docs for the httr2-based download runner. |
| man/download_run.Rd | Marks legacy download runner as deprecated in docs. |
| man/download_prism.Rd | Updates PRISM docs (new unzip/remove_zip args + httr2 settings). |
| man/download_population.Rd | Updates population docs to include token handling + new args. |
| man/download_nlcd.Rd | Updates NLCD docs for new defaults and argument text. |
| man/download_nei.Rd | Updates NEI docs for new defaults and argument text. |
| man/download_narr.Rd | Expands NARR documentation (including full variable reference section). |
| man/download_modis.Rd | Updates MODIS docs to document token sources + httr2-related args. |
| man/download_merra2.Rd | Updates MERRA-2 docs to include token arg + httr2 args. |
| man/download_koppen_geiger.Rd | Updates Köppen-Geiger docs for new defaults and argument text. |
| man/download_huc.Rd | Updates HUC docs for new defaults and argument text. |
| man/download_hms.Rd | Updates HMS docs for new defaults and argument text. |
| man/download_groads.Rd | Updates gROADS docs to include token arg + httr2 args. |
| man/download_gridmet.Rd | Updates gridMET docs for new defaults and argument text. |
| man/download_gmted.Rd | Updates GMTED docs for new defaults and argument text. |
| man/download_geos.Rd | Updates GEOS docs for token arg + httr2 args. |
| man/download_edgar.Rd | Updates EDGAR docs for new defaults and argument text. |
| man/download_ecoregion.Rd | Updates ecoregion docs for new defaults and argument text. |
| man/download_data.Rd | Updates wrapper docs to include NASA token + rate limiting. |
| man/download_cropscape.Rd | Updates CropScape docs for new defaults and argument text. |
| man/download_aqs.Rd | Updates AQS docs for new defaults and argument text. |
| man/cov.Rd | Removes internal helper doc. |
| man/check_urls.Rd | Minor formatting update. |
| man/check_url_status.Rd | Documents retry parameter. |
| man/calculate_modis_daily.Rd | Doc line wrapping changes. |
| man/calculate_modis.Rd | Doc line wrapping changes. |
| agents/test-agent.yaml | Adds testing specialist agent metadata. |
| agents/test-agent.md | Adds testing specialist agent prompt. |
| agents/process-agent.yaml | Adds process-tier specialist agent metadata. |
| agents/process-agent.md | Adds process-tier specialist agent prompt. |
| agents/download-agent.yaml | Adds download-tier specialist agent metadata. |
| agents/download-agent.md | Adds download-tier specialist agent prompt. |
| agents/calculate-agent.yaml | Adds calculate-tier specialist agent metadata. |
| agents/calculate-agent.md | Adds calculate-tier specialist agent prompt. |
| agents/README.md | Documents agent files and intended usage. |
| agent.md | Adds repository-wide agent guide. |
| _pkgdown.yml | Updates navbar label and includes setup_nasa_token in reference index. |
| R/process.R | Improves NLCD file discovery and aux.xml handling; adjusts HMS missing-data logic. |
| R/ignore.R | Refactors long URL string assembly to satisfy line length rules. |
| R/helpers.R | Removes internal container helper functions. |
| R/calculate_covariates_auxiliary.R | Minor doc formatting updates. |
| R/calculate_covariates.R | Adjusts calculate_nei geometry-return logic; minor doc formatting updates. |
| NEWS.md | Adds 1.3.3 release notes for httr2 migration. |
| NAMESPACE | Exports new helpers and updates httr2 imports. |
| DESCRIPTION | Bumps version and updates Imports list. |
| AGENTS.md | Adds Copilot-oriented repository instructions. |
| .lintr | Tightens line length limit and changes exclusions. |
| .gitignore | Adds ignores for tutorials, migration guide, and testthat problem artifacts. |
| .github/workflows/test-coverage.R | Stops treating warnings as errors during coverage run. |
| .github/workflows/test-coverage-local.yaml | Adds NASA_EARTHDATA_TOKEN env var mapping in CI. |
| .github/workflows/check-standard.yaml | Adds NASA_EARTHDATA_TOKEN env var mapping in CI. |
| .Rbuildignore | Excludes agent prompt directories/files from build. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| * `hash = TRUE`: generate unique SHA-1 hash for the downloaded files. | ||
|
|
There was a problem hiding this comment.
The vignette claims hash is a SHA-1 hash, but download_hash() currently computes a combined md5sum (and uses an external md5sum binary). Please align the wording with the implementation (or update the implementation).
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
| full.names = TRUE, | ||
| recursive = TRUE, # ADD THIS - files may be in subdirectories | ||
| ignore.case = TRUE # ADD THIS - for robustness | ||
| ) |
There was a problem hiding this comment.
Please remove development-time annotations like “ADD THIS” / “FIXED” from comments before merging; they add noise and make it harder to understand the intent of the logic long-term.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
| * `hash = TRUE`: generate unique SHA-1 hash for the downloaded files. | ||
|
|
There was a problem hiding this comment.
This bullet says hash generates a “SHA-1” hash, but download_hash() currently computes a combined md5sum (and uses an external md5sum binary). Please update the vignette text (or update the hashing implementation) so users aren’t misled.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
| * `hash = TRUE`: generate unique SHA-1 hash for the downloaded files. | ||
|
|
There was a problem hiding this comment.
This bullet says hash generates a “SHA-1” hash, but download_hash() currently computes a combined md5sum (and uses an external md5sum binary). Please update the vignette text (or update the hashing implementation) so users aren’t misled.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
| * `hash = TRUE`: generate unique SHA-1 hash for the downloaded files. | ||
|
|
There was a problem hiding this comment.
This bullet says hash generates a “SHA-1” hash, but download_hash() currently computes a combined md5sum (and uses an external md5sum binary). Please update the vignette text (or update the hashing implementation) so users aren’t misled.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
@kyle-messier I've opened a new pull request, #212, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@kyle-messier I've opened a new pull request, #213, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
@kyle-messier I've opened a new pull request, #214, to work on those changes. Once the pull request is ready, I'll request review from you. |
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The new daily_agg tests for GOES, MERRA2, and GEOS build NetCDF fixtures via terra::writeCDF, which requires the ncdf4 package. Without it, testthat skipped 10 tests covering the daily_agg branches, dropping patch coverage from ~99.3% to 98.69% and failing the coverage gate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Bump runner to ubuntu-24.04 - Add workflow_dispatch filter input for targeted reruns - Add ncdf4, withr, furrr, mirai, targets to extra-packages - Add issues: write permission - Open or update a tracking issue (label: live-test-failure) on failure using JasonEtco/create-an-issue@v2 + issue template - Repo watchers receive an email when the issue is created Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…polygon example - Fix stray processed_data refs by using processed_mean - Add 'Choosing a daily aggregation' subsection (mean vs sum) - Add polygon (areal) extraction example using a synthetic sf polygon - Cross-link the Computational Considerations vignette Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ions vignette, pkgdown dev site - README: add missing USGS HUC row to dataset table - README: add 'NASA Earthdata authentication with setup_nasa_token()' section - README: add 'Computational considerations' section linking to new vignette - New vignettes/computational_considerations.Rmd with purrr/furrr/mirai/targets examples - _pkgdown.yml: register new vignette in articles + navbar; enable development: mode: auto - .github/workflows/pkgdown.yaml: trigger on migrate/dev branches - DESCRIPTION: add furrr, mirai, targets to Suggests Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
~3-5 inputs per dataset chosen to exercise each major branch of the corresponding download_* function. Each parameter combination gets its own testthat::test_that block with a descriptive name so failures identify the exact input under test. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…se layer time metadata safely, and no longer crash in calc_message()/calc_time().
download_*functions are being migrated fully tohttr2package