|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Commands |
| 6 | + |
| 7 | +```bash |
| 8 | +# Install package locally |
| 9 | +make install # runs devtools::install() |
| 10 | + |
| 11 | +# Rebuild roxygen2 documentation and NAMESPACE |
| 12 | +make build-docs # runs devtools::document() |
| 13 | + |
| 14 | +# R CMD CHECK equivalent (standard R package check) |
| 15 | +Rscript -e "devtools::check()" |
| 16 | + |
| 17 | +# Run a single function's tests interactively |
| 18 | +Rscript -e "devtools::load_all(); <function_call>" |
| 19 | + |
| 20 | +# Rebuild pkgdown site |
| 21 | +make deploy-docs # builds and deploys to GitHub Pages |
| 22 | +``` |
| 23 | + |
| 24 | +There are no formal unit tests (no `tests/` directory). Validation is done through vignettes in `vignettes/`. |
| 25 | + |
| 26 | +## Architecture |
| 27 | + |
| 28 | +**mbg** is an R package for Model-Based Geostatistics — estimating continuous spatial surfaces from point-referenced observations combined with raster covariates. |
| 29 | + |
| 30 | +### Core Components |
| 31 | + |
| 32 | +**`MbgModelRunner` R6 class** ([R/MbgModelRunner.R](R/MbgModelRunner.R)) is the main orchestrator. It holds all inputs as fields and calls the modular functions below in sequence. All functions can also be called standalone. |
| 33 | + |
| 34 | +**Pipeline stages:** |
| 35 | + |
| 36 | +1. **Input loading** — `load_covariates()`, `build_id_raster()`, `build_aggregation_table()` prepare rasters and pixel-to-polygon mappings. |
| 37 | + |
| 38 | +2. **ML stacking (optional)** — `run_regression_submodels()` ([R/run_ml_models.R](R/run_ml_models.R)) fits caret models (Random Forest, GAM, etc.) to produce predictive raster surfaces used as additional covariates in the INLA model. |
| 39 | + |
| 40 | +3. **INLA preparation** — `prepare_inla_data_stack()` ([R/prepare_inla_data_stack.R](R/prepare_inla_data_stack.R)) builds the INLA data stack with fixed effects (covariates), SPDE spatial mesh (Gaussian process), optional admin-level IID random effects, and optional nugget term. |
| 41 | + |
| 42 | +4. **Model fitting** — `fit_inla_model()` ([R/fit_inla_model.R](R/fit_inla_model.R)) wraps `INLA::inla()` with PC priors and sensible defaults. |
| 43 | + |
| 44 | +5. **Prediction** — `generate_cell_draws_and_summarize()` ([R/generate_cell_draws_and_summarize.R](R/generate_cell_draws_and_summarize.R)) draws 250 posterior predictive samples per pixel, applies inverse link function, and produces mean/median/UI rasters. |
| 45 | + |
| 46 | +6. **Aggregation** — `aggregate_draws_to_polygons()` ([R/aggregate_grid_to_polygons.R](R/aggregate_grid_to_polygons.R)) propagates uncertainty through aggregation from raster cells to polygon-level summaries. |
| 47 | + |
| 48 | +### Key Design Decisions |
| 49 | + |
| 50 | +- **INLA is a Suggested dependency** (not required), checked at runtime in [R/zzz.R](R/zzz.R). The package can be installed without INLA. |
| 51 | +- **terra >= 1.9.1 required** for efficient aggregation table support. |
| 52 | +- Raster operations use **terra**, vector/polygon operations use **sf**, tabular data uses **data.table**. |
| 53 | +- The `id_raster` (created by `build_id_raster()`) defines the spatial template — all rasters and predictions align to it. |
| 54 | +- Aggregation uses fractional pixel allocation, preserving uncertainty through the `cell_draws` matrix (pixels × samples). |
| 55 | + |
| 56 | +### Example Data |
| 57 | + |
| 58 | +`inst/extdata/` contains Benin communes boundaries and child stunting observations, used in the vignettes as a worked example. |
0 commit comments