Data Source: https://www.kaggle.com/datasets/rishikeshkonapure/hr-analytics-prediction/data Objective: Determine key insights related to employee attrition, satisfaction, and performance.
Primary questions to explore:
- What factors contribute to employee attrition?
- How do job satisfaction and work-life balance impact employee retention?
- What is the relationship between income, job role, and attrition?
- How do years of service affect promotion rates and attrition?
Defining data points: JobSatisfaction 1 'Low' 2 'Medium' 3 'High' 4 'Very High'
PerformanceRating 1 'Low' 2 'Good' 3 'Excellent' 4 'Outstanding'
RelationshipSatisfaction 1 'Low' 2 'Medium' 3 'High' 4 'Very High'
WorkLifeBalance 1 'Bad' 2 'Good' 3 'Better' 4 'Best'
In this hypothetical scenario. The executive leadership wants to understand the root cause of attrition and develop data driven strategies to improve employee retention, engagement and overall satisfaction. As a HR analyst, your role is to analyze employee data and uncover actionable insights that can support strategic decision making.
**Tasks to perform: ** Data Cleaning: Add primary key Deleting redundant columns. Renaming the columns. Dropping duplicates. Cleaning individual columns. Remove the NaN values from the dataset
SQL codes use hr_data;
-- rename column
ALTER TABLE hr_employee RENAME COLUMN Age to Age;
SELECT * FROM hr_employee;
-- Give each row a distinct employeeID
Alter TABLE hr_employee ADD column id INT AUTO_INCREMENT PRIMARY KEY;
-- Overall attrition rate
SELECT SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) as count_yes, SUM(CASE WHEN Attrition = 'No' THEN 1 else 0 END) as count_no, COUNT(*) as total_employees, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) 100/ COUNT()) as attrition_rate_perc FROM hr_employee;
-- Attrition rate for male and female
SELECT gender, SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) as count_yes, SUM(CASE WHEN Attrition = 'No' THEN 1 else 0 END) as count_no, COUNT(*) as total_employees, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) 100/ COUNT()) as attrition_rate_perc FROM hr_employee GROUP BY gender;
-- Attrition rate for department
SELECT department, SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) as count_yes, SUM(CASE WHEN Attrition = 'No' THEN 1 else 0 END) as count_no, COUNT(*) as total_employees, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 else 0 END) 100/ COUNT()) as attrition_rate_perc FROM hr_employee GROUP BY department;
-- Finding relationship between job satisfaction and attrition
SELECT JobSatisfaction, COUNT() AS total_employees, SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) AS attrition_count, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) * 100.0 / COUNT()) AS attrition_rate_percentage FROM hr_employee GROUP BY JobSatisfaction ORDER BY JobSatisfaction;
-- job satisfaction by departments
SELECT department, JobSatisfaction, COUNT() AS total_employees, SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) AS attrition_count, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) * 100.0 / COUNT()) AS attrition_rate_percentage FROM hr_employee GROUP BY department,JobSatisfaction ORDER BY JobSatisfaction desc, attrition_rate_percentage desc;
There looks to be a correlation between job satisfaction and attrition rate. There are also significant difference in attrition rate between each department with sales department coming up with highest overall attrition rate.
-- Work life balance between departments and attrition rate
SELECT Department, WorkLifeBalance, COUNT() AS total_employees, SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) AS attrition_count, (SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) * 100.0 / COUNT()) AS attrition_rate_percentage FROM hr_employee GROUP BY department, WorkLifeBalance ORDER BY WorkLifeBalance, attrition_rate_percentage desc;
Further analysis conducted on Tableau.
P.S. The employee Payroll data is undefined. Hourly rate, monthly rate and daily pay data does not align mathematically. I will be using Monthly Income as salary indicator.





