This project aims to build machine learning models for predicting which political party a voter is likely to vote for, based on survey data

Hunter764/Election_Voting_Prediction

Election Voting Prediction Model

📌 Project Overview

This project builds machine learning models that predict which political party a voter is likely to vote for, based on survey data. The dataset covers 1,525 voters and 9 variables, collected as part of an exit poll by a leading news channel, CNBE. The predictions help estimate the overall election outcome in the surveyed regions.

📝 Problem Statement

You have been hired by CNBE to analyze voter behavior using a dataset of 9 variables (the target, vote, plus 8 predictors). The goal is to develop machine learning models that predict the party a voter will choose, helping CNBE project election results more accurately.

🎯 Project Objectives

  • Build effective classification machine learning models.
  • Conduct thorough exploratory data analysis (EDA).
  • Evaluate models using appropriate performance metrics.
  • Select the best-performing model based on evaluation criteria.

📂 Dataset Details

  • Total Records: 1,525
  • Variables:
    1. vote – Political party/candidate the voter chose.
    2. age – Respondent's age.
    3. economic.cond.national – Perceived national economic condition.
    4. economic.cond.household – Household economic condition perception.
    5. Blair – Opinion rating of Tony Blair.
    6. Hague – Opinion rating of William Hague.
    7. Europe – Stance on European Union issues.
    8. political.knowledge – Level of political knowledge.
    9. gender – Respondent's gender.
  • Data Source: The dataset consists of two tabs: "Data" and "Data Dictionary." Only the "Data" tab is used for analysis.
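As a minimal sketch of reading only the "Data" tab, the snippet below uses `pandas.read_excel` with `sheet_name="Data"`. The helper names `load_data_tab` and `missing_columns` are hypothetical, the workbook path is left to the caller because the file name is not given here, and the `gender` column is inferred from the Key Insights section rather than from the variable list.

```python
import pandas as pd

# Column names taken from the variable list above; "gender" is an assumption
# inferred from the Key Insights section.
EXPECTED_COLUMNS = [
    "vote",
    "age",
    "economic.cond.national",
    "economic.cond.household",
    "Blair",
    "Hague",
    "Europe",
    "political.knowledge",
    "gender",
]


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected columns that are absent from df."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


def load_data_tab(path: str) -> pd.DataFrame:
    """Read only the "Data" sheet and fail early if columns are missing.

    `path` is supplied by the caller; reading .xlsx files typically
    requires the openpyxl package to be installed.
    """
    df = pd.read_excel(path, sheet_name="Data")
    missing = missing_columns(df)
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df
```

Checking column names right after loading makes a renamed or dropped column fail loudly instead of surfacing later during modelling.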

🔍 Data Preprocessing & EDA

  • Data Cleaning: Checked for missing values and duplicate records.
  • Univariate & Bivariate Analysis: Studied the distribution and relationships between variables.
  • Outlier Treatment: Applied the Interquartile Range (IQR) method to handle outliers in economic variables.
  • Feature Engineering: Removed redundant features and optimized the dataset.
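The cleaning checks and IQR treatment described above might look roughly like the sketch below. It assumes outliers are capped at the usual 1.5 × IQR fences rather than dropped (the exact treatment is not stated here), and it runs on made-up toy ratings, not the survey data. The helper name `cap_outliers_iqr` is hypothetical.

```python
import pandas as pd


def cap_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    series = series.astype(float)
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)


# Toy ratings (made-up values, not the real survey responses).
toy = pd.DataFrame({
    "economic.cond.national": [3, 3, 4, 3, 4, 3, 1, 4, 3, 3],
    "economic.cond.household": [2, 4, 3, 3, 4, 2, 3, 4, 3, 3],
})

# Data-cleaning checks: count missing cells and fully duplicated rows.
n_missing = int(toy.isnull().sum().sum())
n_duplicates = int(toy.duplicated().sum())

# Apply the IQR cap column by column.
capped = toy.apply(cap_outliers_iqr)

print(f"missing cells: {n_missing}, duplicate rows: {n_duplicates}")
print(capped.describe().loc[["min", "max"]])
```

On short ordinal scales the IQR can be very small or even zero, in which case the fences collapse onto the median, so it is worth inspecting the fences before applying the cap.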

📊 Model Building

  • Train-Test Split: Split data into training and testing sets.
  • Applied ML Models:
    • Logistic Regression
    • Decision Tree (with pruning)
    • Naïve Bayes (optimized using Grid Search)
    • K-Nearest Neighbors (KNN with K=3, 5, 7)
    • Bagging & Boosting (Gradient Boosting, XGBoost)
  • Model Evaluation:
    • Accuracy
    • Precision, Recall, F1-score
    • Confusion Matrix
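The workflow above can be sketched end to end with scikit-learn. This is an illustration under several assumptions, not the notebook's actual code: the data is a synthetic stand-in from `make_classification`, the 70/30 split, the `ccp_alpha` pruning value, and the `var_smoothing` grid are all arbitrary choices, and XGBoost is left out to keep the sketch free of extra dependencies.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the survey: 1,525 rows, 8 predictors, ~30/70 classes.
X, y = make_classification(
    n_samples=1525, n_features=8, n_informative=5,
    weights=[0.3, 0.7], random_state=42,
)

# Stratified split keeps the class ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42,
)

models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)),
    # Cost-complexity pruning; the alpha value is an assumption.
    "Decision Tree (pruned)": DecisionTreeClassifier(
        ccp_alpha=0.005, random_state=42),
    # Grid search over the smoothing parameter with 5-fold cross-validation.
    "Naive Bayes (grid search)": GridSearchCV(
        GaussianNB(), {"var_smoothing": [1e-9, 1e-7, 1e-5, 1e-3]}, cv=5),
    "Bagging": BaggingClassifier(n_estimators=50, random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}
# KNN is distance-based, so features are standardized first.
for k in (3, 5, 7):
    models[f"KNN (K={k})"] = make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=k))

# Fit every model and record its test-set accuracy.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name:28s} accuracy = {acc:.3f}")

# Detailed metrics for the top model by test accuracy.
best_name = max(results, key=results.get)
best_pred = models[best_name].predict(X_test)
cm = confusion_matrix(y_test, best_pred)
print(best_name)
print(cm)
print(classification_report(y_test, best_pred))
```

Picking the winner on the same test set used for reporting tends to give an optimistic figure; a separate validation split or cross-validation would be a more careful way to choose between models.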

🚀 Technical Requirements

  • Languages: Python
  • Tools: VS Code / Google Colab / Jupyter Notebook
  • Libraries: NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn, XGBoost

📈 Key Insights

  • The dataset had a roughly balanced gender representation (~53% female, ~47% male).
  • Labour had a higher voter count (69.7%) compared to Conservative (30.3%).
  • Logistic Regression and XGBoost performed well, with accuracy around 85%.
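To put the ~85% figure in context, the short calculation below compares it with a majority-class baseline, i.e. always predicting Labour. It uses only the shares quoted above and assumes the evaluated sample has the same class mix as the full dataset, which is not confirmed here.

```python
# Class shares quoted in the insights above.
labour_share = 0.697
conservative_share = 0.303

# A model that always predicts the larger class is right this often.
baseline_accuracy = max(labour_share, conservative_share)

# Approximate accuracy reported for Logistic Regression and XGBoost.
reported_accuracy = 0.85

# Absolute gain over the baseline, and share of baseline errors removed.
absolute_gain = reported_accuracy - baseline_accuracy
error_reduction = absolute_gain / (1 - baseline_accuracy)

print(f"baseline accuracy: {baseline_accuracy:.3f}")
print(f"absolute gain:     {absolute_gain:.3f}")
print(f"error reduction:   {error_reduction:.1%}")
```

Because the classes are imbalanced, precision, recall, and F1 for the Conservative class say more about model quality than accuracy alone.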

📑 Deliverables

  • Jupyter Notebook with full implementation.
  • Business report in PDF format (excluding code).
  • Visualizations and insights from model predictions.
