Add support for the CPA (Country Programmable Aid) dataset
Background
Following up on Matt's question: is CPA a separately published dataset derived from CRS, and should we pull it from source or recalculate it ourselves?
CPA is separately published by the OECD as its own SDMX dataflow(s) under the OECD.DCD.FSD agency — confirmed against the live registry. There are two:
| Dataflow | Name | Notes |
| --- | --- | --- |
| DSD_CPA@DF_CRS_CPA (v1.4) | Country Programmable Aid (CPA) | New, activity-level measure derived entirely from the CRS. 2010+, replicable, published with per-year bulk dotStat files (CRS CPA <year> (dotStat format), 2010–2024). |
| DSD_DAC2@DF_CPA (v1.1) | Country programmable aid (CPA) | Old aggregate (provider × partner) measure, part of the DAC2 family. Series ends 2021. |
The OECD computes CPA by subtracting specific categories from gross bilateral ODA: debt relief, humanitarian aid, administrative costs, in-donor student/refugee costs, development awareness, core NGO support, equity, unallocated, and others. The full methodology is in the dataflow description.
Decision: pull from source, don't recalculate
Since the OECD publishes CPA directly, with its official methodology and a versioned, activity-level series, we should add a thin reader that pulls it from source rather than reimplementing the methodology against CRS ourselves. Reimplementing would risk silently diverging from the official figures whenever the OECD revises the rules.
Target: the new DSD_CPA@DF_CRS_CPA. It is activity-level and CRS-shaped, so this is a small wrapper that closely mirrors crs.py. (The dataflow ID contains CRS, so QueryBuilder already routes it to the CRS base URL.) The old aggregate measure (DSD_DAC2@DF_CPA) is out of scope — we will not use it.
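A minimal sketch of what the thin reader might look like, assuming the wrapper only fills in the CPA dataflow settings and hands off to the shared download helper. The download_fn parameter is a hypothetical stand-in, injected so the sketch runs without oda_reader installed; the keyword names other than version, the helper cpa_bulk_search_string, and the lower-bound year check are assumptions for illustration, not the library's confirmed API.

```python
from typing import Any, Callable, Optional

DATAFLOW_ID = "DSD_CPA@DF_CRS_CPA"
DATAFLOW_VERSION = "1.4"  # version observed in the registry
FIRST_CPA_YEAR = 2010  # the activity-level series starts in 2010


def cpa_bulk_search_string(year: int) -> str:
    """Build the registry search string for one per-year bulk file (hypothetical helper)."""
    if year < FIRST_CPA_YEAR:
        raise ValueError(f"CPA bulk files start in {FIRST_CPA_YEAR}, got {year}")
    return f"CRS CPA {year} (dotStat format)"


def download_cpa(
    start_year: Optional[int] = None,
    end_year: Optional[int] = None,
    *,
    download_fn: Callable[..., Any],  # stand-in for the shared download helper
    **kwargs: Any,
) -> Any:
    """Delegate to the shared download helper with the CPA dataflow settings."""
    return download_fn(
        version="cpa",
        dataflow_id=DATAFLOW_ID,  # assumed keyword name
        dataflow_version=DATAFLOW_VERSION,  # assumed keyword name
        start_year=start_year,
        end_year=end_year,
        **kwargs,
    )
```

Passing a fake download_fn that records its keyword arguments is enough to unit-test that the wrapper sends version="cpa" and the right dataflow ID, without touching the network.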
Scope
Mirror the existing CRS wrapper pattern. Files to add/modify:
- src/oda_reader/cpa.py: new module with DATAFLOW_ID = "DSD_CPA@DF_CRS_CPA", DATAFLOW_VERSION, download_cpa(...) (calls download(version="cpa", ...), with @cache_info), and download_cpa_file(year=...) for the per-year bulk files via get_bulk_file_id(flow_url=CPA_FLOW_URL, search_string="CRS CPA <year> (dotStat format)") + bulk_download_parquet.
- src/oda_reader/download/download_tools.py: add CPA_FLOW_URL and register "cpa" in the version_functions dispatch (filter_builder + convert_func).
- src/oda_reader/download/query_builder.py: add build_cpa_filter(...), or reuse build_crs_filter if the dimension order is identical (confirm against the dataflow structure).
- src/oda_reader/schemas/: schema mapping (cpa_dotstat.json) + convert_cpa_to_dotstat_codes, or reuse the CRS schema/convert_crs_to_dotstat_codes if columns match.
- src/oda_reader/__init__.py: import and add download_cpa (+ download_cpa_file) to __all__.
- src/oda_reader/tools.py: add a "cpa" case to get_available_filters.
- tests/datasets/cpa/{unit,integration} (e2e mirroring test_crs_e2e.py) + a build_cpa_filter unit test + fixtures.
- README.md and a CHANGELOG.md entry.
Open questions
- Confirm the API dataflow is queryable (not bulk-only) and confirm its exact dimension order; this decides whether build_cpa_filter is new or reuses build_crs_filter.
- Confirm whether a single full-dataset bulk file exists, or only the per-year dotStat files observed in the registry (drives whether we expose a bulk_download_cpa equivalent).
- Confirm existing codelists cover the CPA-specific dimension(s) (e.g. the CPA flag); add to codelist drift coverage if needed.
Acceptance criteria
- download_cpa(start_year=..., end_year=...) returns the CPA dataset in the .stat schema, consistent with other download_* wrappers (pre_process / dotstat_codes honored).
- Per-year bulk access works and is cached like CRS.
- Output matches the OECD-published CPA figures for a spot-checked donor/year.
- Tests and docs added.
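The spot-check criterion could be expressed as a small comparison helper along these lines. This is a sketch under stated assumptions: the row keys donor, year, and value are hypothetical and would need to be mapped to the actual .stat column names, the relative tolerance is an arbitrary choice, and the published total must be taken from the OECD figures by whoever writes the test.

```python
import math
from typing import Any, Iterable, Mapping


def matches_published_total(
    rows: Iterable[Mapping[str, Any]],
    *,
    donor: str,
    year: int,
    published_total: float,
    rel_tol: float = 1e-6,  # assumed tolerance; tune to the published precision
) -> bool:
    """Return True if the summed values for one donor/year match the published figure."""
    # Sum only the rows for the requested donor and year (hypothetical keys).
    total = math.fsum(
        float(row["value"])
        for row in rows
        if row["donor"] == donor and row["year"] == year
    )
    return math.isclose(total, published_total, rel_tol=rel_tol)
```

In an integration test, the rows would come from download_cpa output converted to records, and published_total from the OECD-published figure for the chosen donor and year.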