danielcpratama/SocioEcon-Scan

Socio Economic Scan

Oxford Economics Dataset Analysis - 2021

This repository contains Python scripts and Jupyter notebooks for analyzing the 2021 Oxford Economics dataset. The analysis focuses on population characteristics, GDP, and related indicators across cities and countries. The scripts plot each metric for a selected city alongside benchmark cities and national averages.

Requirements

To run the code, you need the following Python libraries:

  • pandas
  • numpy
  • matplotlib
  • seaborn
  • geopandas

You can install the required dependencies via pip:

pip install pandas numpy matplotlib seaborn geopandas

Data

The dataset analyzed is stored in two CSV files:

  1. Main Dataset: GC_WB_Nov21_061122_clean.csv contains the primary data for city and country analysis.
  2. Metadata: metadata.csv provides the descriptions, units, and scales for various indicators used in the analysis.

Both files should be placed in a data/ directory for the code to work correctly.
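As a minimal sketch of how the two files might be read, assuming the data/ directory sits next to the notebook: the helper name load_data is hypothetical and does not come from the repository; only the file names are taken from the list above.

```python
from pathlib import Path

import pandas as pd


def load_data(data_dir="data"):
    """Hypothetical helper: read the main dataset and the metadata file."""
    data_dir = Path(data_dir)
    # File names as given in the Data section.
    main = pd.read_csv(data_dir / "GC_WB_Nov21_061122_clean.csv")
    meta = pd.read_csv(data_dir / "metadata.csv")
    return main, meta
```

Calling load_data() with no arguments reads from data/ relative to the current working directory; pandas raises FileNotFoundError if either file is missing.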

Analysis Overview

1. Population Analysis

  • The notebook contains functions to plot population-related metrics such as population size, growth rates, birth and death rates, and dependency ratios.
  • These metrics are visualized over time for selected cities, benchmark cities, and national averages.

2. GDP Analysis

  • The GDP analysis includes visualizations of GDP per capita, growth rates, and city shares of national GDP.
  • The script compares the selected city to benchmark cities and national averages over time.
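The growth rate and city share metrics can be illustrated with a small pandas sketch. The column names and figures below are made up for illustration and are not taken from the dataset, whose actual layout is not documented here.

```python
import pandas as pd

# Illustrative values only; not taken from the Oxford Economics dataset.
gdp = pd.DataFrame(
    {
        "year": [2019, 2020, 2021],
        "city_gdp": [100.0, 110.0, 121.0],
        "national_gdp": [1000.0, 1000.0, 1100.0],
    }
)

# Year-on-year growth in percent; the first year has no predecessor, so it is NaN.
gdp["city_gdp_growth_pct"] = gdp["city_gdp"].pct_change() * 100

# City share of national GDP in percent.
gdp["city_share_pct"] = gdp["city_gdp"] / gdp["national_gdp"] * 100
```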

3. Age Distribution and Composition

  • This section provides age distribution plots for selected cities, benchmark cities, and national averages.
  • The dependency ratio and the composition of youth, working-age, and elderly populations are also analyzed.
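For reference, the dependency ratio is conventionally defined as the number of dependents (youth plus elderly) per 100 working-age people. The sketch below uses that conventional definition; whether the notebook computes it the same way, and which age cutoffs it uses, is an assumption. Both function names are hypothetical.

```python
def dependency_ratio(youth, working_age, elderly):
    """Dependents (youth + elderly) per 100 working-age people."""
    if working_age <= 0:
        raise ValueError("working_age must be positive")
    return (youth + elderly) / working_age * 100


def age_composition(youth, working_age, elderly):
    """Share of each age group as a percentage of the total population."""
    total = youth + working_age + elderly
    if total <= 0:
        raise ValueError("total population must be positive")
    return {
        "youth": youth / total * 100,
        "working_age": working_age / total * 100,
        "elderly": elderly / total * 100,
    }
```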

Functions

The code includes several functions to automate plotting and data processing tasks:

  • plot_line(): Plots time series data for a selected city, with optional comparisons to benchmark cities or national averages.
  • plot_age_distribution(): Visualizes the age distribution of the population in a city for a given year.
  • plot_age_composition(): Displays the composition of different age groups (youth, working, elderly) as a percentage of the total population.
  • plot_bar_comparison(): Compares GDP per capita across a selected city and benchmark cities using bar charts.

Example Usage

To plot population growth rates for a city and benchmark it against other cities:

city = 'Maseru'
benchmark_cities = ["Beira", "Bloemfontein", "Buffalo City", "Bulawayo"]
plot_line(city=city, indicator='GROWTHRATE', benchmark=benchmark_cities)

To visualize the age distribution for a city in 2024:

plot_age_distribution(city='Maseru', year=2024)

Notes

  • The dataset includes projection data, and projections beyond 2024 can be plotted using the projection=True option in relevant functions.
  • The code handles both time series and cross-sectional comparisons of cities and countries.
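How the projection=True option is implemented is not shown in this README. One plausible approach, sketched below under that assumption, is to filter rows by year before plotting. The function name, the year column, and the cutoff constant are all hypothetical.

```python
import pandas as pd

# Assumed cutoff, based on the note that projections lie beyond 2024.
LAST_HISTORICAL_YEAR = 2024


def select_years(df, projection=False, year_col="year"):
    """Hypothetical filter: drop years after the cutoff unless projection=True."""
    if projection:
        return df
    return df[df[year_col] <= LAST_HISTORICAL_YEAR]
```

With this shape, a plotting function can call select_years(df, projection=projection) once at the top and leave the rest of its logic unchanged.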
