Airline Passenger Satisfaction — Data Mining Project

This repository contains an end-to-end data mining project built around a simple goal: understand what drives airline passenger satisfaction and predict it reliably.
Along with model training and evaluation, a working React and Flask web app is included so predictions and insights can be explored through a clean dashboard instead of only notebooks.

Project Demo (YouTube)

https://youtu.be/TPWopQruvhI

GitHub Repository

https://github.com/ttabirami12062/DataMining-AirlineSatisfaction

What this project does

This project supports three useful outcomes:

1) Predict satisfaction for an individual passenger
Given travel and service details, the system returns:

a satisfaction label (Satisfied / Not Satisfied)
a satisfaction score (from a regression model, on an approximately 0–1 scale)
a passenger segment (cluster assignment)

2) Explain the main drivers behind satisfaction
Regression models are used to make the relationship between features and satisfaction easier to interpret (service scores, delays, and travel context).

3) Segment passengers into groups
PCA and K-Means are used to group passengers into clusters that represent different experience patterns (e.g., high-service vs. low-service groups).

Methods Included

Regression (scoring + interpretability)

Used to produce a satisfaction score and understand feature impact:

OLS
Ridge
Lasso
Piecewise regression (where applicable)
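
As a rough illustration of how OLS, Ridge, and Lasso differ, the sketch below fits a single centered feature in closed form. It is not the project's code: the function name `fit_slopes`, the toy data, the `alpha` value, and the exact scaling of each penalty are all assumptions chosen to keep the arithmetic easy to follow.

```python
def fit_slopes(x, y, alpha):
    """Return (ols, ridge, lasso) slopes for one feature, after centering x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    xc = [v - mean_x for v in x]
    yc = [v - mean_y for v in y]
    sxx = sum(v * v for v in xc)
    sxy = sum(a * b for a, b in zip(xc, yc))

    # OLS: minimize sum((y - b*x)^2)
    ols = sxy / sxx
    # Ridge: minimize sum((y - b*x)^2) + alpha * b^2  -> slope shrinks toward 0
    ridge = sxy / (sxx + alpha)
    # Lasso: minimize 0.5 * sum((y - b*x)^2) + alpha * |b|  -> soft-thresholding,
    # which can set the slope exactly to 0 when alpha is large enough
    shrunk = max(abs(sxy) - alpha, 0.0)
    lasso = (shrunk if sxy >= 0 else -shrunk) / sxx
    return ols, ridge, lasso


# Made-up data: an average service rating and a binary satisfaction target.
service_score = [1.0, 2.0, 3.0, 4.0, 5.0]
satisfaction = [0.0, 0.0, 1.0, 1.0, 1.0]

ols, ridge, lasso = fit_slopes(service_score, satisfaction, alpha=1.0)
_, _, lasso_strong = fit_slopes(service_score, satisfaction, alpha=5.0)
print(ols, ridge, lasso, lasso_strong)
```

Both penalties pull the slope toward zero relative to OLS. Only the Lasso penalty can remove a feature entirely, which is why it is often used as a feature-selection aid when interpreting drivers.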

Clustering (passenger segmentation)

PCA for dimensionality reduction
K-Means for clustering
Cluster quality is checked with the elbow method, silhouette score, or Calinski-Harabasz (CH) score, depending on the experiment design
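
The clustering step can be sketched in plain Python as follows. This is a minimal illustration, not the project's implementation: the 2D points stand in for PCA-projected passengers, and the function names, the random initialization, and the iteration count are assumptions. The returned inertia (within-cluster sum of squared distances) is the quantity an elbow plot tracks as k grows.

```python
import random


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain K-Means: returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Made-up 2D coordinates with two well-separated groups.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]

labels, centroids, inertia_k2 = kmeans(points, k=2)
_, _, inertia_k1 = kmeans(points, k=1)
print(labels, inertia_k1, inertia_k2)
```

A sharp drop in inertia from k=1 to k=2, followed by only small gains for larger k, is the "elbow" that suggests two segments for data like this.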

Classification (main prediction)

Random Forest (implemented from scratch in the project workflow)
Tuned to improve generalization and reduce overfitting
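
To give a sense of what "from scratch" involves, here is a deliberately simplified sketch of the core Random Forest ideas: bootstrap sampling, a random feature subset per tree, Gini-based splits, and majority voting. It uses one-split stumps instead of full trees to stay short, and every name, parameter, and data value in it is an assumption and not the project's actual code.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, threshold, majority(left), majority(right))
    if best is None:  # no valid split: always predict the majority class
        m = majority(labels)
        return (0, float("inf"), m, m)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, max_features=1, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]       # bootstrap sample
        feats = rng.sample(range(n_features), max_features)  # random feature subset
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Made-up rows: (average service score, total delay in minutes); 1 = satisfied.
rows = [(4.5, 5), (4.0, 0), (4.2, 10), (3.9, 15), (2.0, 60), (1.5, 45), (2.5, 90), (1.8, 30)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

forest = fit_forest(rows, labels)
happy = predict(forest, (4.8, 0))
unhappy = predict(forest, (1.2, 120))
print(happy, unhappy)
```

In a fuller implementation, the knobs that typically control overfitting are the number of trees, the tree depth, the minimum samples per leaf, and the number of features considered per split.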

Dataset

Dataset source: Kaggle – Airline Passenger Satisfaction
Typical fields include passenger type, travel type/class, delays, and multiple service rating categories.
The target is converted into a binary label:

Satisfied = 1
Neutral/Dissatisfied = 0
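
The label conversion above could look like the following sketch. The raw strings are assumed to match the Kaggle file's wording ("satisfied" and "neutral or dissatisfied"), and the function name `to_binary_label` is hypothetical; check the values against the actual data before relying on it.

```python
def to_binary_label(raw):
    """Map the raw satisfaction string to 1 (satisfied) or 0 (neutral/dissatisfied)."""
    normalized = raw.strip().lower()
    if normalized == "satisfied":
        return 1
    if normalized in {"neutral or dissatisfied", "neutral/dissatisfied"}:
        return 0
    # Fail loudly on unexpected values instead of silently mislabeling rows.
    raise ValueError(f"Unexpected satisfaction value: {raw!r}")


encoded = [to_binary_label(v) for v in ["satisfied", "neutral or dissatisfied", " Satisfied "]]
print(encoded)
```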

Data Preparation (high-level)

The dataset is prepared for modeling by:

removing non-informative ID columns
handling missing values (median for numeric, mode for categorical)
encoding categorical variables (binary mapping + one-hot encoding where needed)
engineering practical features such as:
- Total_Delay = Departure Delay + Arrival Delay
- Total_Service_Score = average of service rating features
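
The preparation steps above can be sketched as follows, using only the standard library. The column names, the two-column subset of service ratings, and the `prepare` function are assumptions for illustration; categorical encoding is left out to keep the example short.

```python
from statistics import median, mode

ID_COLUMNS = {"id"}                                  # assumed ID column name
NUMERIC_COLUMNS = ["Departure Delay", "Arrival Delay"]
CATEGORICAL_COLUMNS = ["Class"]
SERVICE_COLUMNS = ["Seat comfort", "Cleanliness"]    # assumed subset of rating fields


def prepare(rows):
    """Drop ID columns, impute missing values, then add engineered features."""
    rows = [{k: v for k, v in r.items() if k not in ID_COLUMNS} for r in rows]

    # Median for numeric columns, mode for categorical ones (missing = None).
    fills = {}
    for col in NUMERIC_COLUMNS:
        fills[col] = median([r[col] for r in rows if r[col] is not None])
    for col in CATEGORICAL_COLUMNS:
        fills[col] = mode([r[col] for r in rows if r[col] is not None])

    prepared = []
    for r in rows:
        row = {k: (fills[k] if v is None and k in fills else v) for k, v in r.items()}
        row["Total_Delay"] = row["Departure Delay"] + row["Arrival Delay"]
        row["Total_Service_Score"] = sum(row[c] for c in SERVICE_COLUMNS) / len(SERVICE_COLUMNS)
        prepared.append(row)
    return prepared


# Made-up passengers with one missing delay and one missing class.
raw_rows = [
    {"id": 1, "Class": "Business", "Departure Delay": 10, "Arrival Delay": None, "Seat comfort": 4, "Cleanliness": 5},
    {"id": 2, "Class": "Eco", "Departure Delay": 0, "Arrival Delay": 5, "Seat comfort": 2, "Cleanliness": 3},
    {"id": 3, "Class": None, "Departure Delay": 30, "Arrival Delay": 25, "Seat comfort": 3, "Cleanliness": 4},
    {"id": 4, "Class": "Eco", "Departure Delay": 5, "Arrival Delay": 15, "Seat comfort": 1, "Cleanliness": 2},
]

prepared = prepare(raw_rows)
print(prepared[0])
```

Because the backend recreates these features at inference time, keeping the logic in one shared function is a simple way to avoid training and serving drifting apart.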

Web App Overview (React + Flask)

Frontend (React)

Pages include Home, About, Predict, Reports, and Dashboard
The Predict page accepts user input and displays the model outputs (label, score, and segment)

Backend (Flask)

Loads trained models (.pkl)
Recreates engineered features during inference
Exposes endpoints for predictions and dashboard summaries

Example endpoints:

POST /api/predict
GET /api/dashboard-summary
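
A client call to the prediction endpoint might be built like this. Only the path `/api/predict` and the POST method come from the list above; the base URL (Flask's default local port), the JSON field names, and the helper `build_predict_request` are assumptions, so check `app.py` or `server.py` for the payload the backend actually expects.

```python
import json
from urllib.request import Request


def build_predict_request(base_url, passenger):
    """Build (but do not send) a JSON POST request for /api/predict."""
    body = json.dumps(passenger).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical payload; the real field names depend on the backend.
passenger = {
    "Class": "Business",
    "Departure Delay": 10,
    "Arrival Delay": 15,
    "Seat comfort": 4,
    "Cleanliness": 5,
}

request = build_predict_request("http://localhost:5000", passenger)
print(request.get_method(), request.full_url)
```

With the backend running, passing `request` to `urllib.request.urlopen` would send it; the response is expected to carry the label, score, and segment described earlier.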
