Merged
1 change: 1 addition & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
@@ -8,3 +8,4 @@
^pkgdown$
^\.github$
^CRAN-SUBMISSION$
^vignettes/articles$
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
.Rproj.user
docs
inst/doc
7 changes: 7 additions & 0 deletions DESCRIPTION
@@ -26,10 +26,17 @@ Imports:
wbids,
wbwdi
Suggests:
knitr,
rmarkdown,
dplyr,
forcats,
ggplot2,
testthat (>= 3.0.0)
URL: https://github.com/tidy-intelligence/r-econdataverse, https://tidy-intelligence.github.io/r-econdataverse/
BugReports: https://github.com/tidy-intelligence/r-econdataverse/issues
Config/testthat/edition: 3
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.3
VignetteBuilder: knitr
Config/Needs/website: rmarkdown
2 changes: 2 additions & 0 deletions vignettes/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R
2 changes: 2 additions & 0 deletions vignettes/articles/.gitignore
@@ -0,0 +1,2 @@
*.html
*.R
225 changes: 225 additions & 0 deletions vignettes/articles/introducing-the-econdataverse.Rmd
@@ -0,0 +1,225 @@
---
title: "Introducing the EconDataverse: A Universe of R Packages for Economic Data"
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## The Challenge of Fragmented Economic Data

Economic data is essential for research and policy analysis, yet accessing it efficiently through R has been a persistent challenge. Data scientists routinely spend more time acquiring and cleaning data than actually analyzing it. For economists and policymakers, especially those in developing countries, this creates real barriers: expensive commercial data subscriptions, time-consuming manual processing, and delays in evidence-based decision-making.

The [EconDataverse](https://www.econdataverse.org/) project, supported by the [R Consortium ISC Grant](https://r-consortium.org/all-projects/call-for-proposals.html), addresses these challenges by creating a unified ecosystem of R packages that provide consistent, tidy access to major economic data sources. Each package targets a specific data source while sharing a common design philosophy: consistent function naming, tidy data formats, and cross-source compatibility. The result is significantly less time spent on data acquisition and preparation.

You can install the meta-package, which provides access to all implemented data sources and core helper packages, from CRAN:

```{r, eval = FALSE}
install.packages("econdataverse")
```

Loading the package attaches all component packages in one step:

```{r, message = TRUE}
library(econdataverse)
```

We additionally use the following packages for data manipulation and visualization:

```{r}
library(dplyr)
library(forcats)
library(ggplot2)
```

## Core Packages

The EconDataverse currently includes the following packages:

| Package | Data Source | Description |
|---------|-------------|-------------|
| `wbwdi` | World Bank | World Development Indicators |
| `wbids` | World Bank | International Debt Statistics |
| `imfweo` | IMF | World Economic Outlook |
| `imfapi` | IMF | International Monetary Fund API |
| `owidapi` | Our World in Data | Long-term economic and social indicators |
| `uisapi` | UNESCO | Education and research statistics |
| `oecdoda` | OECD | Official Development Assistance |
| `econid` | — | Standardized country/region identifiers |
| `econtools` | — | Common economic data utilities |
| `econdatasets` | — | Publicly hosted preprocessed datasets |

Let's see how these packages work in practice.

## Example: Accessing World Development Indicators

Let us fetch GDP in current USD for a selection of countries:

```{r, message = FALSE, warning = FALSE}
gdp_data <- wdi_get(
  indicators = "NY.GDP.MKTP.CD",
  entities = c("USA", "CHN", "DEU", "IND", "JPN"),
  start_year = 2000,
  end_year = 2024
)
gdp_data
```

Each observation contains identifiers for entities and indicators. We deliberately use consistent and descriptive primary key column names (e.g., `entity_id`, `indicator_id`, `series_id`) across packages. This simplifies joins and makes data structures predictable across different data sources.
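
To illustrate why shared key names matter, here is a minimal sketch with two made-up tables. The values and the `toy_` object names are purely illustrative and are not fetched from any API. Because both tables use `entity_id` and `year` as keys, they can be joined without renaming any columns:

```{r}
library(dplyr)

# Hypothetical toy data; all values are invented for illustration only
toy_gdp <- tibble(
  entity_id = c("USA", "USA", "DEU"),
  year = c(2020L, 2021L, 2020L),
  gdp = c(21, 23, 4)
)

toy_population <- tibble(
  entity_id = c("USA", "USA", "DEU"),
  year = c(2020L, 2021L, 2020L),
  population = c(331, 332, 83)
)

# Shared key names mean no renaming is needed before joining
toy_joined <- left_join(toy_gdp, toy_population, join_by(entity_id, year))
toy_joined
```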

The `econid` package provides a convenient way to add standardized entity names for labeling and plotting:

```{r}
gdp_data <- standardize_entity(gdp_data, entity_id)
gdp_data
```

The [package website](https://teal-insights.github.io/r-econid/) provides additional examples and use cases.

Now we can use these standardized labels to easily visualize GDP trends:

```{r, fig.alt = "Line chart showing GDP in current USD for the USA, China, Germany, India, and Japan from 2000 to 2024"}
ggplot(gdp_data, aes(x = year, y = value, color = entity_name)) +
  geom_line(linewidth = 1) +
  labs(
    title = "GDP in Current USD Over Time",
    x = NULL,
    y = NULL,
    color = "Country"
  ) +
  scale_y_continuous(
    labels = scales::label_dollar(scale = 1e-12, suffix = "T")
  )
```

The `econtools` package provides convenience functions to enrich existing data. For instance, you can add a population column, which is the first step toward calculating GDP per capita:

```{r}
add_population_column(
  gdp_data,
  id_column = "entity_id",
  date_column = "year"
)
```
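
Once a population column is available, GDP per capita follows from a single `mutate()`. The sketch below does not call `add_population_column()`; it uses invented toy values and assumes the added column is named `population`, which should be checked against the `econtools` documentation:

```{r}
library(dplyr)

# Hypothetical stand-in for the output of add_population_column();
# the column name `population` and all values are assumptions
toy_with_population <- tibble(
  entity_id = c("AAA", "BBB"),
  year = c(2020L, 2020L),
  value = c(2e12, 5e11), # GDP in current USD
  population = c(1e8, 2.5e7)
)

# GDP per capita is GDP divided by population
toy_gdp_per_capita <- toy_with_population |>
  mutate(gdp_per_capita = value / population)
toy_gdp_per_capita
```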

We refer to the [package documentation](https://tidy-intelligence.github.io/r-econtools/) for additional use cases.

## Combining Multiple Data Sources

A key benefit of the EconDataverse is the ability to combine data from different sources thanks to a shared design philosophy. The `wbwdi` package provides World Development Indicators (WDI), with observations by entity and year. The `wbids` package provides International Debt Statistics (IDS), which are structured by entity, counterpart, and year. Because both packages use the same `entity_id` and `year` columns, their outputs can be combined directly.

In the following examples, we focus on Thailand (ISO-3 country code "THA"). Understanding the composition of a country’s debt—distinguishing between total government debt and external debt owed to foreign creditors—helps assess fiscal vulnerability and identify potential risks.

We begin by comparing total government debt with external debt over time. Since IDS data is reported in USD, we express government debt in USD as well by multiplying central government debt as a percentage of GDP ("GC.DOD.TOTL.GD.ZS") by total GDP in USD ("NY.GDP.MKTP.CD"):

```{r, warning = FALSE, fig.dim = c(7, 4)}
government_debt <- wdi_get(
  entities = "THA",
  indicators = c("NY.GDP.MKTP.CD", "GC.DOD.TOTL.GD.ZS"),
  start_year = 2014,
  end_year = 2024,
  format = "wide"
) |>
  mutate(
    debt = `GC.DOD.TOTL.GD.ZS` / 100 * `NY.GDP.MKTP.CD`,
    type = "Government"
  ) |>
  select(entity_id, year, debt, type)
```
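
As a quick sanity check on this conversion, consider made-up numbers: a debt ratio of 40 percent of GDP and a GDP of USD 500 billion imply a debt level of USD 200 billion. The inputs below are invented and only mirror the formula used in the `mutate()` call:

```{r}
# Invented inputs, for illustration only
toy_debt_ratio_pct <- 40 # debt as a percentage of GDP
toy_gdp_usd <- 500e9 # GDP in current USD

# Same formula as above: percentage / 100 * GDP
toy_debt_usd <- toy_debt_ratio_pct / 100 * toy_gdp_usd
toy_debt_usd
```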

Next, we fetch external debt from the IDS series "DT.DOD.DPPG.CD" (external debt stocks, public and publicly guaranteed) across all counterparts. The counterpart "WLD" represents the whole world and is used to construct aggregate external debt levels. This structure allows us to distinguish total debt from bilateral creditor exposures.

```{r}
external_debt <- ids_get(
  entities = "THA",
  series = "DT.DOD.DPPG.CD",
  counterparts = "all",
  start_year = 2014,
  end_year = 2024
)

external_debt_total <- external_debt |>
  filter(counterpart_id == "WLD") |>
  select(entity_id, year, debt = value) |>
  mutate(type = "External Debt")

debt_levels <- bind_rows(government_debt, external_debt_total)
```

We can now visualize total government and external debt in Thailand:

```{r, warning = FALSE, fig.dim = c(7, 4)}
debt_levels |>
  ggplot(aes(x = year, y = debt, color = type)) +
  geom_line() +
  labs(
    x = NULL,
    y = NULL,
    color = "Debt Type",
    title = "Total Government and External Debt in Thailand in Current USD"
  ) +
  scale_y_continuous(
    labels = scales::label_dollar(scale = 1e-9, suffix = "B")
  )
```

A key advantage of IDS is the ability to break down external debt by creditor, revealing who holds a country’s debt:

```{r, fig.dim = c(7, 4)}
# Keep individual creditors for 2024 and attach their names
debt_breakdown <- external_debt |>
  filter(counterpart_id != "WLD", year == 2024) |>
  left_join(
    ids_list_counterparts(),
    join_by(counterpart_id)
  )

# Plot the five largest creditors, ordered by debt stock
debt_breakdown |>
  arrange(desc(value)) |>
  slice_head(n = 5) |>
  ggplot(aes(x = value, y = fct_reorder(counterpart_name, value))) +
  geom_col() +
  labs(
    x = "External debt (USD)",
    y = NULL,
    title = "Top 5 Creditors of Thailand in 2024",
    subtitle = "External debt stocks, public and publicly guaranteed"
  ) +
  scale_x_continuous(
    labels = scales::label_dollar(scale = 1e-9, suffix = "B")
  )
```
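
A natural follow-up is to express each creditor's position as a share of the aggregate. The sketch below does this on invented toy data that only assumes the `counterpart_id` and `value` columns used above; the creditor codes other than "WLD" are hypothetical. In real data, the individual counterparts need not sum exactly to the "WLD" aggregate.

```{r}
library(dplyr)

# Hypothetical toy data; codes other than "WLD" and all values are invented
toy_external_debt <- tibble(
  counterpart_id = c("WLD", "AAA", "BBB", "CCC"),
  value = c(100, 50, 30, 20)
)

# The "WLD" row serves as the denominator
toy_total <- toy_external_debt |>
  filter(counterpart_id == "WLD") |>
  pull(value)

# Share of each individual creditor in the aggregate, largest first
toy_shares <- toy_external_debt |>
  filter(counterpart_id != "WLD") |>
  mutate(share = value / toy_total) |>
  arrange(desc(share))
toy_shares
```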

For more applications and insights into international debt data, see [Teal Insights' Guide to Working with the World Bank International Debt Statistics](https://teal-insights.github.io/teal-insights-guide-to-wbids/).

## Real-World Impact

The EconDataverse is already demonstrating practical impact through several Shiny applications:

- **[Debt Path Explorer](https://tealinsights.shinyapps.io/nature_finance_debt_dynamics_app/)**: Helps policymakers in climate-vulnerable countries simulate how different sustainability targets and climate policies affect long-term debt trajectories.

- **[Economic Outlook Explorer](https://apps.econdataverse.org/economic-outlook-explorer/index.html)**: Allows researchers to interactively explore IMF World Economic Outlook projections across countries and time horizons.

- **[Debt Network Visualizer](https://apps.econdataverse.org/debt-network-visualizer/index.html)**: Enables exploration of global lending networks, highlighting major creditors and cross-country debt linkages.

## The Team

This project is a collaboration between:

- **Christoph Scheuch** — Co-creator of [Tidy Finance](https://www.tidy-finance.org/), Lecturer at Humboldt-University of Berlin
- **Teal Emery** — Founder of [Teal Insights](https://www.tealemery.com/), Adjunct Lecturer at Johns Hopkins SAIS
- **Christopher C. Smith** — President of [Promptly Technologies](https://promptlytechnologies.com/)

We welcome contributions! You can:

- **Use the packages** and provide feedback via [GitHub Issues](https://github.com/tidy-intelligence/r-econdataverse/issues)
- **Contribute code** by following our [contribution guidelines](https://github.com/tidy-intelligence/r-econdataverse)
- **Spread the word** by sharing with colleagues who work with economic data

If you want to request the development of a package for a data source of your choice, feel free to get in touch with [Christoph Scheuch](https://christophscheuch.github.io/).

## Acknowledgments

We thank the R Consortium for funding this project through the ISC Grant program. This support enables us to build infrastructure that democratizes access to economic data for researchers, analysts, and policymakers worldwide. The original project proposal is available on [GitHub](https://christophscheuch.github.io/isc-proposal-econdataverse/).