Add CPA per-year bulk download (download_cpa_file) once OECD fixes malformed dotStat files

## Add CPA per-year bulk download (`download_cpa_file`) once OECD fixes the malformed dotStat files

Follow-up to #38, which shipped the CPA API reader `download_cpa()`. The **per-year bulk path is
blocked** on an upstream OECD data bug, so `download_cpa_file()` is intentionally **not** included
yet.

### Why it's blocked

The per-year "CRS CPA \<year\> (dotStat format)" `.txt` bulk files (dataflow
`OECD.DCD.FSD:DSD_CPA@DF_CRS_CPA`) are **malformed**: a large fraction of rows have more
pipe-delimited fields than the 49-column header declares, so they can't be parsed against their own
header. The same records via the SDMX API are clean.

Affected-row counts (rows whose field count ≠ 49):

| Year | Rows | Ragged % | Rows with >49 *non-empty* fields |
|---|---|---|---|
| 2023 | 251,992 | 69% | 47,345 |
| 2022 | 233,467 | 48% | 25,704 |
| 2021 | 252,976 | 33% | 24,469 |
| 2020 | 188,902 | 35% | 12,837 |
| 2015 | 152,954 | 0% | 0 |

Tens of thousands of rows per year have **more than 49 non-empty fields**, which is structurally
impossible for a 49-column row. The API confirms that the real text fields contain zero `|`
characters, so the extra delimiters are spurious. An extra field (empty or duplicated text) is
injected mid-row, shifting all later columns, and the shift is not deterministically recoverable.
This has been reported to the OECD.
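The two counts in the table can be reproduced with a small check like the one below. This is a minimal sketch, not the script that produced the numbers above: the names `RaggedStats` and `ragged_stats` are hypothetical, and it assumes plain `|` splitting with no quoting or escaping, which matches the observation that the real text fields contain no `|` characters.

```python
import io
from typing import Iterable, NamedTuple


class RaggedStats(NamedTuple):
    rows: int      # data rows seen (header and blank lines excluded)
    ragged: int    # rows whose field count differs from the header's
    overfull: int  # rows with more non-empty fields than header columns


def ragged_stats(lines: Iterable[str], sep: str = "|") -> RaggedStats:
    """Count malformed rows in a delimited text stream.

    Assumption: fields are separated by a bare `sep` with no quoting.
    """
    it = iter(lines)
    header = next(it).rstrip("\r\n")
    n_cols = len(header.split(sep))

    rows = ragged = overfull = 0
    for line in it:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        fields = line.split(sep)
        rows += 1
        if len(fields) != n_cols:
            ragged += 1
        # More non-empty values than columns cannot be explained by
        # trailing or padding delimiters alone.
        if sum(1 for f in fields if f.strip()) > n_cols:
            overfull += 1
    return RaggedStats(rows, ragged, overfull)


# Tiny 3-column illustration: one clean row, one row with an injected
# empty field, one row with an injected duplicated field.
sample = io.StringIO("a|b|c\n1|2|3\n1||2|3\n1|x|x|3\n")
stats = ragged_stats(sample)
```

On a real file, pass an open text handle (the encoding of the OECD files is not stated here, so it would need to be checked) and compare `stats.ragged / stats.rows` against the "Ragged %" column.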

### Implementation once the upstream files are fixed

The change is small: clone the CRS bulk path (`download_crs_file` / `get_year_crs_zip_id` in `crs.py`):

- Add `download_cpa_file(year, save_to_path=None, *, as_iterator=False, use_raw_cache=True)` to
  `src/oda_reader/cpa.py`, plus `get_year_cpa_zip_id(year)` using
  `search_string=f"CRS CPA {year} (dotStat format)"` and `CPA_FLOW_URL` (=
  `BASE_DATAFLOW + "DSD_CPA@DF_CRS_CPA/"`).
- Export `download_cpa_file` from `__init__.py` (`__all__`).
- Bulk is **per-year only** (2010–2024); there is no all-years/full-dataset CPA bulk file, so no
  `bulk_download_cpa()`.
- Add a unit test (mock `bulk_download_parquet`) + an integration test, and a README "Per-year CPA
  files" subsection.
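As a rough illustration of the year handling described in the list above, the search string could be built behind a range check. This is a sketch under stated assumptions: `CPA_BULK_YEARS` and `cpa_bulk_search_string` are hypothetical names that do not exist in `oda_reader`, and the real `get_year_cpa_zip_id` should mirror whatever `get_year_crs_zip_id` does in `crs.py`, which is not reproduced here.

```python
# Hypothetical helper names; only the search-string format and the
# 2010-2024 range come from this issue.
CPA_BULK_YEARS = range(2010, 2025)  # 2010 through 2024 inclusive


def cpa_bulk_search_string(year: int) -> str:
    """Return the bulk-file title to search for, or raise for an unsupported year."""
    if year not in CPA_BULK_YEARS:
        raise ValueError(
            f"CPA bulk files are per-year only, "
            f"{CPA_BULK_YEARS.start}-{CPA_BULK_YEARS.stop - 1}; got {year!r}"
        )
    return f"CRS CPA {year} (dotStat format)"


search_2023 = cpa_bulk_search_string(2023)
```

Failing early on an out-of-range year gives a clearer error than a failed lookup against the dataflow, and it makes the "per-year only" constraint explicit in code.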

### Acceptance

- `download_cpa_file(<year>)` returns/saves a year of CPA data and parses cleanly (no ragged-row
  errors) once OECD has corrected the files.
- Re-verify raggedness is 0% on the corrected files before implementing.


Add CPA per-year bulk download (download_cpa_file) once OECD fixes malformed dotStat files #39
