Add support for the CPA (Country Programmable Aid) dataset #38

@jm-rivera

Background

Following up on Matt's question: is CPA a separately published dataset derived from CRS, and should we pull it from source or recalculate it ourselves?

The OECD publishes CPA separately, as its own SDMX dataflows under the OECD.DCD.FSD agency (confirmed against the live registry). There are two:

| Dataflow | Name | Notes |
| --- | --- | --- |
| DSD_CPA@DF_CRS_CPA (v1.4) | Country Programmable Aid (CPA) | New, activity-level measure derived entirely from the CRS. 2010+, replicable, published with per-year bulk dotStat files (CRS CPA <year> (dotStat format), 2010–2024). |
| DSD_DAC2@DF_CPA (v1.1) | Country programmable aid (CPA) | Old aggregate (provider × partner) measure, part of the DAC2 family. Series ends 2021. |

The OECD computes CPA by subtracting specific categories from gross bilateral ODA (debt relief, humanitarian, admin, in-donor student/refugee costs, development awareness, core NGO support, equity, unallocated, etc. — full methodology is in the dataflow description).

Decision: pull from source, don't recalculate

Since the OECD publishes CPA directly — with their official methodology and a versioned, activity-level series — we should add a thin reader that pulls it from source rather than reimplementing the methodology against CRS ourselves (which would risk silently diverging from the official figures whenever OECD revises the rules).

Target: the new DSD_CPA@DF_CRS_CPA. It is activity-level and CRS-shaped, so this is a small wrapper that closely mirrors crs.py. (The dataflow ID contains CRS, so QueryBuilder already routes it to the CRS base URL.) The old aggregate measure (DSD_DAC2@DF_CPA) is out of scope — we will not use it.
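
The routing remark above can be pictured roughly as follows. This is a minimal sketch, not the actual QueryBuilder code: the function name `pick_base_url` and both URL constants are placeholders invented for illustration. The only behaviour taken from the text is that a dataflow ID containing "CRS" is routed to the CRS base URL.

```python
# Hypothetical placeholders; the real base URLs are defined inside oda_reader.
DEFAULT_BASE_URL = "https://example.org/default/rest/data"
CRS_BASE_URL = "https://example.org/crs/rest/data"


def pick_base_url(dataflow_id: str) -> str:
    """Return the CRS base URL when the dataflow ID mentions CRS."""
    return CRS_BASE_URL if "CRS" in dataflow_id else DEFAULT_BASE_URL


# The new CPA dataflow ID contains "CRS"; the old DAC2 one does not.
print(pick_base_url("DSD_CPA@DF_CRS_CPA"))
print(pick_base_url("DSD_DAC2@DF_CPA"))
```

Under this assumption the new dataflow needs no extra routing logic, while the out-of-scope DAC2 dataflow would fall through to the default.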

Scope

Mirror the existing CRS wrapper pattern. Files to add/modify:

  • src/oda_reader/cpa.py — new module: DATAFLOW_ID = "DSD_CPA@DF_CRS_CPA", DATAFLOW_VERSION, download_cpa(...) (calls download(version="cpa", ...), @cache_info), and download_cpa_file(year=...) for the per-year bulk files via get_bulk_file_id(flow_url=CPA_FLOW_URL, search_string="CRS CPA <year> (dotStat format)") + bulk_download_parquet.
  • src/oda_reader/download/download_tools.py — add CPA_FLOW_URL and register "cpa" in the version_functions dispatch (filter_builder + convert_func).
  • src/oda_reader/download/query_builder.py — add build_cpa_filter(...), or reuse build_crs_filter if the dimension order is identical (confirm against the dataflow structure).
  • src/oda_reader/schemas/ — schema mapping (cpa_dotstat.json) + convert_cpa_to_dotstat_codes, or reuse the CRS schema/convert_crs_to_dotstat_codes if columns match.
  • src/oda_reader/__init__.py — import and add download_cpa (+ download_cpa_file) to __all__.
  • src/oda_reader/tools.py — add a "cpa" case to get_available_filters.
  • Tests: tests/datasets/cpa/{unit,integration} (e2e mirroring test_crs_e2e.py) + a build_cpa_filter unit test + fixtures.
  • Docs — a "Downloading CPA Data" section in README.md and a CHANGELOG.md entry.
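
As a rough illustration of the per-year bulk path in the proposed cpa.py, the sketch below builds the search string and chains the two helper steps. It is a sketch under stated assumptions: `find_file_id` and `fetch_parquet` are stand-ins for get_bulk_file_id and bulk_download_parquet, whose real signatures are not shown in this issue, and the stub functions exist only so the example runs without network access. The version string is the one listed for the dataflow in the Background table.

```python
from typing import Callable

DATAFLOW_ID = "DSD_CPA@DF_CRS_CPA"
DATAFLOW_VERSION = "1.4"  # version listed for this dataflow in the Background table


def cpa_bulk_search_string(year: int) -> str:
    """Build the label used to locate one per-year bulk file."""
    return f"CRS CPA {year} (dotStat format)"


def download_cpa_file(
    year: int,
    find_file_id: Callable[[str], str],
    fetch_parquet: Callable[[str], object],
) -> object:
    """Locate the bulk file for `year`, then download it.

    The two callables are hypothetical stand-ins for get_bulk_file_id and
    bulk_download_parquet; the real helpers may take different arguments.
    """
    file_id = find_file_id(cpa_bulk_search_string(year))
    return fetch_parquet(file_id)


# Stub helpers (illustration only) so the sketch runs offline.
def _fake_find(search_string: str) -> str:
    return f"id-for::{search_string}"


def _fake_fetch(file_id: str) -> dict:
    return {"file_id": file_id}


result = download_cpa_file(2022, _fake_find, _fake_fetch)
print(result)
```

Passing the helpers in as arguments is only a device to keep the example self-contained; the real module would presumably import them directly, as crs.py does.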

Open questions

  1. Confirm the API dataflow is queryable (not bulk-only) and confirm its exact dimension order — decides whether build_cpa_filter is new or reuses build_crs_filter.
  2. Confirm whether a single full-dataset bulk file exists, or only the per-year dotStat files observed in the registry (drives whether we expose a bulk_download_cpa equivalent).
  3. Confirm existing codelists cover the CPA-specific dimension(s) (e.g. the CPA flag); add to codelist drift coverage if needed.
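
Once the dimension order from question 1 is known, the reuse decision reduces to an ordered comparison. The sketch below shows one way to express it; `choose_filter_builder` is a hypothetical helper, and the dimension names in the example are invented placeholders, not the real CRS or CPA dimensions.

```python
from typing import Sequence


def choose_filter_builder(crs_dims: Sequence[str], cpa_dims: Sequence[str]) -> str:
    """Name the filter builder to use, given both dimension orders.

    Reuse is only safe when the dimensions match in both name and position,
    because an SDMX filter key is positional.
    """
    if list(crs_dims) == list(cpa_dims):
        return "build_crs_filter"
    return "build_cpa_filter"


# Placeholder dimension names, for illustration only.
crs_like = ["DIM_A", "DIM_B", "DIM_C"]
same_order = ["DIM_A", "DIM_B", "DIM_C"]
extra_dim = ["DIM_A", "DIM_B", "DIM_C", "DIM_CPA_FLAG"]

print(choose_filter_builder(crs_like, same_order))
print(choose_filter_builder(crs_like, extra_dim))
```

An extra CPA-specific dimension, such as the CPA flag mentioned in question 3, would be enough to require a separate build_cpa_filter.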

Acceptance criteria

  • download_cpa(start_year=..., end_year=...) returns the CPA dataset in the .stat schema, consistent with other download_* wrappers (pre_process / dotstat_codes honored).
  • Per-year bulk access works and is cached like CRS.
  • Output matches the OECD-published CPA figures for a spot-checked donor/year.
  • Tests and docs added.
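
The spot-check criterion could be expressed as a small tolerance comparison. This is a sketch only: the row keys (`donor`, `year`, `value`), the helper names, the tolerance, and all figures are made up for illustration, and the real test would read a DataFrame returned by download_cpa and compare it with a figure taken from the OECD publication.

```python
import math
from typing import Iterable, Mapping


def cpa_total(rows: Iterable[Mapping], donor: str, year: int) -> float:
    """Sum the value column for one donor and year."""
    return math.fsum(
        r["value"] for r in rows if r["donor"] == donor and r["year"] == year
    )


def matches_published(
    rows: Iterable[Mapping],
    donor: str,
    year: int,
    published_total: float,
    rel_tol: float = 1e-3,
) -> bool:
    """Check the downloaded total against a published figure within rel_tol."""
    return math.isclose(cpa_total(rows, donor, year), published_total, rel_tol=rel_tol)


# Made-up rows and totals; not real CPA figures.
rows = [
    {"donor": "X", "year": 2020, "value": 10.0},
    {"donor": "X", "year": 2020, "value": 5.5},
    {"donor": "Y", "year": 2020, "value": 7.0},
]

print(matches_published(rows, "X", 2020, published_total=15.5))
print(matches_published(rows, "X", 2020, published_total=16.0))
```

A relative tolerance is used here rather than exact equality because published aggregates are usually rounded; the right tolerance depends on the units and rounding of the source figure.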

Labels: enhancement (New feature or request)
