weo.fetch_data() broken in v1.3.0: IMF retired the bulk SDMX download (fixed in v1.4.0+) #22

Description

@lpicci96

Summary

weo.fetch_data() fails to retrieve any WEO vintage in v1.3.0 (and earlier). The IMF retired the "Download Entire Database" bulk SDMX zip that the v1.3.0 scraper depends on, and moved WEO distribution to the SDMX REST API. As a result the 1.3.0 HTML-scraper path can no longer locate or download the data.

Note: the latest released code on main (v1.4.0+, currently v1.5.0) already handles this — it fetches from the IMF SDMX 3.0 API. This issue is filed to document the root cause and the resolution for users still pinned to 1.3.0 (e.g. via transitive dependencies such as bblocks-data-importers).

Affected versions

  • Broken: 1.3.0 and earlier (scraper-only path).
  • Fixed: 1.4.0 and later (SDMX 3.0 API path, with the old scraper kept as a fallback). Verified on 1.5.0 (see below).

Minimal reproduction (on v1.3.0)

from imf_reader import weo
weo.fetch_data()                  # latest
weo.fetch_data(("April", 2025))   # specific vintage

Observed on 1.3.0:

  • Latest / recent vintages (Oct 2025, April 2026): NoDataError: SDMX data not found.
  • Older vintages (April 2025, Oct 2024, April 2024): DataExtractionError: ... File is not a zip file.

Root cause

In imf_reader/weo/scraper.py (v1.3.0), fetch_data -> _fetch -> SDMXScraper.scrape does:

def get_soup(month, year):
    url = f"{BASE_URL}/en/Publications/WEO/weo-database/{year}/{month}/download-entire-database"
    response = make_request(url)
    return BeautifulSoup(response.content, "html.parser")

class SDMXScraper:
    @staticmethod
    def get_sdmx_url(soup):
        try:
            href = soup.find("a", string="SDMX Data").get("href")   # find() now returns None, so .get raises AttributeError
        except AttributeError:
            raise NoDataError("SDMX data not found")
        ...
    @staticmethod
    def get_sdmx_folder(sdmx_url):
        response = make_request(sdmx_url)
        folder = ZipFile(io.BytesIO(response.content))   # "File is not a zip file"
        ...

The IMF "download-entire-database" page no longer serves a static <a> whose exact text is "SDMX Data" pointing at a downloadable bulk zip:

  • For recent vintages the anchor is absent from the static HTML that the requests library receives, so soup.find("a", string="SDMX Data") returns None and .get(...) raises AttributeError, which the scraper re-raises as NoDataError: SDMX data not found.
  • For older vintages a link is still found but the URL it resolves to no longer returns a zip (it returns a small XML stub / HTML), so ZipFile(...) raises "File is not a zip file" (surfaced as DataExtractionError).
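
The second failure mode above can be reproduced without touching the network. The sketch below is not part of imf_reader; describe_payload and open_bulk_zip are hypothetical helper names. It uses only the standard library to check whether a downloaded payload is really a zip archive before handing it to ZipFile, which is roughly the guard the v1.3.0 path lacks.

```python
import io
import zipfile


def describe_payload(content: bytes) -> str:
    """Classify downloaded bytes as 'zip', 'markup' or 'unknown'."""
    if zipfile.is_zipfile(io.BytesIO(content)):
        return "zip"
    # An XML stub or an HTML page starts with '<' once leading whitespace is dropped.
    if content.lstrip()[:1] == b"<":
        return "markup"
    return "unknown"


def open_bulk_zip(content: bytes) -> zipfile.ZipFile:
    """Open the payload as a zip, failing with a descriptive error otherwise."""
    kind = describe_payload(content)
    if kind != "zip":
        raise ValueError(
            f"expected a zip archive, got {kind} content ({len(content)} bytes)"
        )
    return zipfile.ZipFile(io.BytesIO(content))
```

Passing a small XML stub to open_bulk_zip raises the ValueError with the payload size in the message, instead of the bare "File is not a zip file" from ZipFile.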

Per the v1.4.0 changelog: "The October 2025 release of WEO removed bulk downloads and moved everything towards the SDMX API." This is the underlying change.

Guessed legacy bulk URLs (e.g. https://www.imf.org/-/media/Files/Publications/WEO/WEO-Database/<year>/<month>/WEO<Mon><Year>all.ashx) now return 404 or a ~215-byte XML stub, confirming the old media path is stale.

Resolution / current behaviour

main (v1.4.0+) added imf_reader/weo/api.py, which fetches WEO from the IMF SDMX 3.0 REST API:

  • Version discovery: https://api.imf.org/external/sdmx/3.0/structure/dataflow/IMF.RES/WEO/*?detail=full
  • Data: https://api.imf.org/external/sdmx/3.0/data/dataflow/IMF.RES/WEO/<api_version>/* with Accept: text/csv
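
For readers who want to call the two endpoints above directly, here is a minimal standard-library sketch. It is not the code in imf_reader/weo/api.py; the function names are hypothetical, and api_version must be a value obtained from the version-discovery endpoint. Whether the API needs any header beyond Accept: text/csv has not been verified here.

```python
import urllib.request

# Base URL taken from the endpoints listed in this issue.
SDMX_BASE = "https://api.imf.org/external/sdmx/3.0"


def weo_versions_url() -> str:
    """URL of the dataflow listing used for version discovery."""
    return f"{SDMX_BASE}/structure/dataflow/IMF.RES/WEO/*?detail=full"


def weo_data_request(api_version: str) -> urllib.request.Request:
    """Build (but do not send) a CSV request for one WEO dataflow version."""
    url = f"{SDMX_BASE}/data/dataflow/IMF.RES/WEO/{api_version}/*"
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def download_weo_csv(api_version: str, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw CSV bytes (performs network I/O)."""
    request = weo_data_request(api_version)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping request construction separate from the download makes it possible to inspect the URL and headers without hitting the IMF servers.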

reader.fetch_data now tries the API first and falls back to the old scraper. The public API (weo.fetch_data, April/October vintage selection) and the result schema are preserved.
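
The "API first, scraper as fallback" behaviour described above can be pictured as the following sketch. It illustrates the pattern only and is not the actual reader.fetch_data implementation; fetch_with_fallback and the fetcher callables are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")
Version = Optional[Tuple[str, int]]  # e.g. ("April", 2025), or None for latest


def fetch_with_fallback(
    fetchers: Sequence[Callable[[Version], T]],
    version: Version = None,
) -> T:
    """Try each fetcher in order and return the first successful result."""
    errors: List[str] = []
    for fetcher in fetchers:
        try:
            return fetcher(version)
        except Exception as exc:  # sketch only: real code would catch narrower errors
            name = getattr(fetcher, "__name__", repr(fetcher))
            errors.append(f"{name}: {exc}")
    # Every source failed; report all collected errors together.
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

With a list such as [fetch_from_api, fetch_from_scraper] (both hypothetical), the scraper is only consulted when the API call raises, and the caller sees every failure reason if both paths break.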

Live verification (installed main @ v1.5.0)

weo.fetch_data()                # ('October', 2025): 361,736 rows, 145 indicators
weo.fetch_data(("April", 2025)) # 355,661 rows

Both vintages return non-empty frames. Indicator coverage is the full WEO set (145 distinct CONCEPT_CODEs), not the ~15-indicator DataMapper subset:

  • NGDPD (GDP, current USD) — present
  • GGX_NGDP (general government total expenditure, % of GDP) — present
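
A coverage check like the one above can be scripted. The helper below is a hypothetical sketch that accepts any iterable of indicator codes, so it works on a plain list as well as on a column of the returned frame (assuming, as this issue suggests, that the column is named CONCEPT_CODE).

```python
from typing import Iterable, List


def missing_indicators(codes: Iterable[str], required: Iterable[str]) -> List[str]:
    """Return the required indicator codes that do not appear in codes."""
    present = set(codes)
    return [code for code in required if code not in present]


# Assumed usage with the frame returned by weo.fetch_data():
#   missing_indicators(df["CONCEPT_CODE"], ["NGDPD", "GGX_NGDP"])
# An empty list means both indicators are present.
```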

Recommendation

Users hitting this on 1.3.0 should upgrade to >=1.4.0 (latest 1.5.0). For downstream packages that pin imf-reader, bump the lower bound to >=1.4.0.
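
To check whether an environment is still on an affected release, a small standard-library sketch such as the following may help. The helper names are hypothetical, and the version parsing is deliberately simplified: it reads only the leading numeric X.Y.Z part and ignores pre-release suffixes.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

FIRST_FIXED = (1, 4, 0)  # first release with the SDMX 3.0 API path, per this issue


def parse_release(version: str) -> Tuple[int, int, int]:
    """Parse the leading 'X.Y.Z' of a version string; missing parts count as 0."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch = (int(group) if group else 0 for group in match.groups())
    return (major, minor, patch)


def is_affected(version: str) -> bool:
    """True when the version predates the fix."""
    return parse_release(version) < FIRST_FIXED


def installed_imf_reader_version() -> Optional[str]:
    """Installed imf-reader version, or None if the package is not installed."""
    try:
        return metadata.version("imf-reader")
    except metadata.PackageNotFoundError:
        return None
```

Calling is_affected(installed_imf_reader_version()) when the latter is not None tells you whether an upgrade is needed; this is useful when imf-reader arrives as a transitive dependency.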

No code change to main is required for this issue.
