World Bank Poverty Prediction Challenge 🌍

Ranked #339 out of 1,322 global competitors (top 26%)

Description

This repo hosts my solution to the competition organized by DrivenData and the World Bank. The challenge was to train a model that accurately predicts household-level consumption and to calculate the poverty rate strictly below given thresholds.

It includes:

Code for data preprocessing, some feature engineering, and model fitting
Poverty prediction model
Data imputer and encoding pipelines

My approach:

Development Environment

Google Colab

Data engineering:

Merged the feature dataset with the ground-truth (target) dataset to create a single table for unified pipeline model training.
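
A minimal sketch of such a merge with pandas. The column names (household_id, household_size, region, consumption), the values, and the choice of an inner join on a shared ID are assumptions made for illustration; they are not taken from the competition data or from Entire_Code.py.

```python
import pandas as pd

# Hypothetical feature and ground-truth tables sharing one key column.
features = pd.DataFrame({
    "household_id": [1, 2, 3],
    "household_size": [4, 2, 6],
    "region": ["north", "south", "north"],
})
targets = pd.DataFrame({
    "household_id": [1, 2, 3],
    "consumption": [5.2, 11.8, 3.1],
})

# validate="one_to_one" raises an error if the key is duplicated on either
# side, which guards against silently multiplying rows during the merge.
merged = features.merge(
    targets, on="household_id", how="inner", validate="one_to_one"
)

# Split back into model inputs and the target once the rows are aligned.
X = merged.drop(columns=["household_id", "consumption"])
y = merged["consumption"]
```

Merging first keeps every feature row aligned with its target, so later steps such as imputation and encoding can run on one table.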

Data preprocessing:

Handled missing data with imputation, using the median strategy for numerical features and the most-frequent strategy for categorical features.
Encoded 80+ categorical features using ordinal encoding. This was chosen over one-hot encoding to handle high cardinality and avoid the curse of dimensionality.
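
The two preprocessing steps above could be wired together roughly as follows with scikit-learn. This is a sketch under assumptions: the column names and values are invented, and combining both steps in one ColumnTransformer is an illustrative choice (the repo stores the imputer and encoder as separate files, so the original code may be organized differently).

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data with one numerical and one categorical column.
df = pd.DataFrame({
    "household_size": [4.0, np.nan, 6.0, 2.0],
    "region": ["north", "south", np.nan, "north"],
})
numeric_cols = ["household_size"]
categorical_cols = ["region"]

# Categorical columns: fill gaps with the most frequent value, then map
# each category to an integer. Categories unseen at fit time become -1.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OrdinalEncoder(handle_unknown="use_encoded_value",
                              unknown_value=-1)),
])

preprocess = ColumnTransformer([
    # Numerical columns: fill gaps with the column median.
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

X = preprocess.fit_transform(df)
```

Ordinal encoding keeps one output column per categorical feature, whereas one-hot encoding would add one column per category. The integer codes imply an arbitrary ordering, which tree-based models generally tolerate better than linear models do.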

Model selection:

Chose XGBoost because it is well suited to tabular data.

Feature engineering:

Handled a skewed target distribution (Figure 1) by applying a logarithmic transformation to normalize the target variable (Figure 2). This improved the MAE (mean absolute error) by roughly 4.5%, reducing the error from 3.27 to 3.12.
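
A minimal sketch of the target transformation with NumPy. The use of log1p/expm1 (rather than a plain log) and the sample values are assumptions for illustration; the README only states that a logarithmic transformation was applied.

```python
import numpy as np

# Made-up, right-skewed target values (one large outlier).
y = np.array([1.2, 2.5, 3.1, 4.0, 48.0])

# Train on the log scale: log1p(y) = log(1 + y) compresses the long right
# tail and is defined at y = 0.
y_log = np.log1p(y)

# Model predictions on the log scale are mapped back with the inverse,
# expm1, before the error is measured in the original units.
y_back = np.expm1(y_log)


def mae(y_true, y_pred):
    """Mean absolute error, computed on the original (untransformed) scale."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

In practice the model would be fitted on y_log, and its predictions passed through np.expm1 before calling mae, so that scores with and without the transformation are compared in the same units.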

Figure 1: Target distribution before the logarithmic transformation

Figure 2: Target distribution after the logarithmic transformation
