# ๐ Employee Retention Analysis using Logistic Regression
This project analyzes employee attrition using **logistic regression**, a binary classification model. Created as part of a data science course, it uses an HR dataset from **Kaggle** to explore factors that influence why employees stay or leave an organization.
---
## ๐ Objectives
- Predict whether an employee is likely to leave
- Identify key features contributing to attrition
- Visualize patterns and relationships within the data
---
## ๐งพ Dataset Columns
- `satisfaction_level`
- `last_evaluation`
- `number_project`
- `average_montly_hours`
- `time_spend_company`
- `Work_accident`
- `promotion_last_5years`
- `Department`
- `salary`
- `left` *(target variable)*
---
## ๐ ๏ธ Tools & Libraries
- `pandas` for data handling
- `matplotlib` & `seaborn` for visualization
- `scikit-learn` for machine learning (Logistic Regression)
---
## ๐ Exploratory Data Analysis
Visualizations include:
- Box plots for satisfaction level vs attrition
- Count plots of departments vs left
- Correlation heatmaps
Example:
```python
import seaborn as sns
import matplotlib.pyplot as plt
sns.boxplot(x='left', y='satisfaction_level', data=df)
plt.title("Satisfaction Level vs Attrition")
plt.show()from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# Feature selection
X = df[['satisfaction_level', 'number_project', 'average_montly_hours', 'Work_accident']]
y = df['left']
# Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Model training
model = LogisticRegression()
model.fit(X_train, y_train)# Sample input
sample = [[0.4, 4, 160, 0]] # satisfaction, project count, hours, accident
prediction = model.predict(sample)
print("Prediction:", "Left" if prediction[0] == 1 else "Stayed")This logistic regression model gives HR teams insight into which factors are most associated with employee turnover. With visual exploration and a simple predictive model, this project demonstrates the practical value of machine learning in workforce analytics.
HR Analytics: Job Change of Data Scientists โ Kaggle