This repository contains Python scripts and Jupyter notebooks for analyzing the Oxford Economics dataset from 2021. The analysis focuses on population characteristics, GDP data, and related indicators across cities and countries. The scripts use several visualization techniques to compare metrics for selected cities, benchmark cities, and national averages.
To run the code, you need the following Python libraries:
- pandas
- numpy
- matplotlib
- seaborn
- geopandas
You can install the required dependencies via pip:
pip install pandas numpy matplotlib seaborn geopandas

The dataset analyzed is stored in two CSV files:
- Main Dataset: GC_WB_Nov21_061122_clean.csv contains the primary data for city and country analysis.
- Metadata: metadata.csv provides the descriptions, units, and scales for various indicators used in the analysis.
Both files should be placed in a data/ directory for the code to work correctly.
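As a quick check that the files are where the code expects them, the two CSVs can be loaded with pandas. This is a minimal sketch, not the repository's own loader: the helper name load_data() is hypothetical, and it assumes both files are plain comma-separated CSVs readable with default pandas settings.

```python
from pathlib import Path

import pandas as pd

# File names as given above; the data/ location follows the README.
MAIN_FILE = "GC_WB_Nov21_061122_clean.csv"
META_FILE = "metadata.csv"


def load_data(data_dir="data"):
    """Load the main dataset and the metadata table from data_dir.

    Hypothetical helper for illustration; the notebooks may load the
    files differently (for example with extra read_csv options).
    """
    data_dir = Path(data_dir)
    paths = [data_dir / MAIN_FILE, data_dir / META_FILE]

    # Fail early with a clear message if a file is missing.
    for path in paths:
        if not path.is_file():
            raise FileNotFoundError(f"Expected data file not found: {path}")

    main_df = pd.read_csv(paths[0])
    meta_df = pd.read_csv(paths[1])
    return main_df, meta_df
```

Calling main_df, meta_df = load_data() from the repository root should then return both tables as DataFrames.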
- The notebook contains functions to plot population-related metrics such as population size, growth rates, birth and death rates, and dependency ratios.
- These metrics are visualized over time for selected cities, benchmark cities, and national averages.
- The GDP analysis includes visualizations of GDP per capita, growth rates, and city shares of national GDP.
- The script compares the selected city to benchmark cities and national averages over time.
- This section provides age distribution plots for selected cities, benchmark cities, and national averages.
- The dependency ratio and the composition of youth, working-age, and elderly populations are also analyzed.
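For readers unfamiliar with the dependency ratio, the sketch below shows the usual definition: dependents (youth plus elderly) per 100 working-age people, alongside the share of each age group in the total population. The function names are hypothetical, and the age cut-offs behind the three groups are whatever the dataset uses; the notebook may compute these quantities differently or read them directly from the data.

```python
def dependency_ratio(youth, working_age, elderly):
    """Return the number of dependents per 100 working-age people."""
    if working_age <= 0:
        raise ValueError("working_age must be positive")
    return 100 * (youth + elderly) / working_age


def age_composition(youth, working_age, elderly):
    """Return each age group's share of the total population, in percent."""
    total = youth + working_age + elderly
    if total <= 0:
        raise ValueError("total population must be positive")
    return {
        "youth": 100 * youth / total,
        "working_age": 100 * working_age / total,
        "elderly": 100 * elderly / total,
    }


# Illustrative numbers only (not taken from the dataset), in thousands.
ratio = dependency_ratio(youth=300, working_age=600, elderly=100)
shares = age_composition(youth=300, working_age=600, elderly=100)
print(f"Dependency ratio: {ratio:.1f}")  # prints "Dependency ratio: 66.7"
print(shares)
```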
The code includes several functions to automate plotting and data processing tasks:
- plot_line(): Plots time series data for a selected city, with optional comparisons to benchmark cities or national averages.
- plot_age_distribution(): Visualizes the age distribution of the population in a city for a given year.
- plot_age_composition(): Displays the composition of different age groups (youth, working, elderly) as a percentage of the total population.
- plot_bar_comparison(): Compares GDP per capita across a selected city and benchmark cities using bar charts.
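To give a feel for what a time series helper of this kind involves, here is a minimal sketch of a function in the spirit of plot_line(). It is not the repository's implementation: the name plot_line_sketch(), the long-format layout, and the column names city, year, indicator, and value are all assumptions, and the demo values are made-up placeholders rather than figures from the dataset.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd


def plot_line_sketch(df, city, indicator, benchmark=None):
    """Plot one indicator over time for a city and optional benchmark cities.

    Assumes a long-format DataFrame with the hypothetical columns
    city, year, indicator, and value.
    """
    cities = [city] + list(benchmark or [])
    subset = df[(df["indicator"] == indicator) & (df["city"].isin(cities))]

    fig, ax = plt.subplots(figsize=(8, 4))
    for name in cities:
        series = subset[subset["city"] == name].sort_values("year")
        if series.empty:
            continue  # skip cities with no data for this indicator
        # Draw the selected city with a heavier line than the benchmarks.
        width = 2.5 if name == city else 1.0
        ax.plot(series["year"], series["value"], label=name, linewidth=width)

    ax.set_xlabel("Year")
    ax.set_ylabel(indicator)
    ax.legend()
    return ax


# Placeholder values for illustration only, not real data.
demo = pd.DataFrame(
    {
        "city": ["Maseru", "Maseru", "Beira", "Beira"],
        "year": [2020, 2021, 2020, 2021],
        "indicator": ["GROWTHRATE"] * 4,
        "value": [1.0, 1.2, 2.0, 1.8],
    }
)
ax = plot_line_sketch(demo, "Maseru", "GROWTHRATE", benchmark=["Beira"])
```

Returning the Axes object rather than calling plt.show() keeps the sketch usable both in notebooks and in scripts that save figures to disk.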
To plot population growth rates for a city and benchmark it against other cities:
city = 'Maseru'
benchmark_cities = ["Beira", "Bloemfontein", "Buffalo City", "Bulawayo"]
plot_line(city=city, indicator='GROWTHRATE', benchmark=benchmark_cities)

To visualize the age distribution for a city in 2024:

plot_age_distribution(city='Maseru', year=2024)

- The dataset includes projection data; projections beyond 2024 can be plotted by passing projection=True to the relevant functions.
- The code handles both time series and cross-sectional comparisons of cities and countries.