yasserramzy/Superstore-Python-Analysis

US Superstore Sales Analysis (Python)

This project performs an end-to-end exploratory and analytical study of the US Superstore dataset using Python.
The goal is to extract actionable business insights related to sales performance, profitability, customer behavior, product performance, and operational efficiency.

This project is designed as a portfolio-level data analytics project demonstrating Python, Pandas, and data visualization skills.

Dataset

  • Source: US Superstore Sales Dataset
  • Format: CSV
  • Key Features:
    • Order and shipping dates
    • Sales, profit, and discount
    • Customer, product, and regional data

The raw dataset is stored in the "data/" folder and remains unchanged for transparency.

Tools & Technologies

  • Python
  • Pandas
  • Matplotlib
  • Seaborn
  • Jupyter Notebook

Analysis Workflow

1. Data Preparation & Cleaning

  • Loaded raw CSV data using Pandas
  • Cleaned column names for consistency
  • Converted date columns to datetime format
  • Created additional time-based columns:
    • Order Year
    • Order Month
    • Order Year-Month
  • Calculated shipping duration in days
  • Checked for missing values and duplicates
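
The preparation steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`Order Date`, `Ship Date`, `Sales`, `Profit`), the snake_case naming scheme, and the in-memory CSV standing in for the file in "data/" are all assumptions.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the raw CSV (hypothetical rows and column names).
RAW_CSV = """Order ID,Order Date,Ship Date,Sales,Profit
A-1,2017-01-03,2017-01-07,100.0,20.0
A-2,2017-01-15,2017-01-16,50.0,-5.0
A-2,2017-01-15,2017-01-16,50.0,-5.0
A-3,2017-02-01,2017-02-06,75.5,10.0
"""


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy; the raw frame is left untouched."""
    out = df.copy()
    # Clean column names: trim, lowercase, spaces/hyphens -> underscores.
    out.columns = (
        out.columns.str.strip().str.lower().str.replace(r"[\s\-]+", "_", regex=True)
    )
    # Convert date columns to datetime.
    out["order_date"] = pd.to_datetime(out["order_date"])
    out["ship_date"] = pd.to_datetime(out["ship_date"])
    # Time-based columns.
    out["order_year"] = out["order_date"].dt.year
    out["order_month"] = out["order_date"].dt.month
    out["order_year_month"] = out["order_date"].dt.to_period("M").astype(str)
    # Shipping duration in whole days.
    out["shipping_days"] = (out["ship_date"] - out["order_date"]).dt.days
    return out


raw = pd.read_csv(io.StringIO(RAW_CSV))
clean = prepare(raw)

# Data-quality checks: missing values per column and fully duplicated rows.
missing_per_column = clean.isna().sum()
duplicate_rows = int(clean.duplicated().sum())
```

With a real file, `pd.read_csv` would point at the CSV path instead of the `io.StringIO` wrapper.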

2. Feature Engineering

  • Time-based features for trend analysis
  • Shipping duration for logistics insights
  • Aggregated metrics for customer, product, and regional analysis
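
As one possible shape for the aggregated metrics, the sketch below groups orders by a chosen key and sums revenue and profit. The helper name `summarize`, the column names, and the sample rows are hypothetical and only illustrate the GroupBy pattern.

```python
import pandas as pd

# Hypothetical sample rows; the real dataset has many more columns.
orders = pd.DataFrame(
    {
        "customer": ["Ann", "Ann", "Bob", "Cy"],
        "region": ["West", "West", "East", "East"],
        "sales": [100.0, 50.0, 200.0, 30.0],
        "profit": [20.0, -5.0, 40.0, -10.0],
    }
)


def summarize(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Aggregate revenue, profit, and row count per value of `key`."""
    summary = df.groupby(key).agg(
        revenue=("sales", "sum"),
        profit=("profit", "sum"),
        order_lines=("sales", "size"),
    )
    # Highest-revenue groups first.
    return summary.sort_values("revenue", ascending=False)


by_customer = summarize(orders, "customer")
by_region = summarize(orders, "region")
```

The same helper can be reused for product or category columns by passing a different key.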

3. Exploratory Data Analysis & Business Insights

The analysis focuses on multiple business dimensions:

  • Time-based performance
  • Customer value
  • Product profitability
  • Regional and operational efficiency
  • Discount and profitability relationships

Key Analyses & Insights

  • Time-Based Analysis

    • Monthly revenue and profit trends
    • Sales and profit growth patterns over time
  • Customer Analysis

    • Top 10 customers by revenue
    • Top 10 customers by profit
  • Product Analysis

    • Revenue and profit by category and sub-category
    • Top 10 products by revenue and profit
    • Identification of low-performing products
  • Geographical Analysis

    • Revenue and profit by region
    • Identification of loss-making regions
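
A few of the analyses listed above could be expressed along these lines. The data and column names are invented for illustration, and "loss-making" is assumed here to mean a negative total profit.

```python
import pandas as pd

# Hypothetical sample orders.
orders = pd.DataFrame(
    {
        "order_date": pd.to_datetime(
            ["2017-01-05", "2017-01-20", "2017-02-11", "2017-02-25", "2017-03-02"]
        ),
        "customer": ["Ann", "Bob", "Ann", "Cy", "Bob"],
        "region": ["West", "East", "West", "Central", "East"],
        "sales": [100.0, 200.0, 50.0, 80.0, 120.0],
        "profit": [20.0, 30.0, -5.0, -25.0, 10.0],
    }
)

# Monthly revenue and profit, plus month-over-month revenue growth.
month = orders["order_date"].dt.to_period("M")
monthly = orders.groupby(month)[["sales", "profit"]].sum()
monthly["sales_growth"] = monthly["sales"].pct_change()

# Top 10 customers by revenue (fewer if the data has fewer customers).
top_customers = orders.groupby("customer")["sales"].sum().nlargest(10)

# Regions whose total profit is below zero.
region_profit = orders.groupby("region")["profit"].sum()
loss_making = region_profit[region_profit < 0]
```

Swapping `"sales"` for `"profit"` in the `nlargest` line gives the top customers by profit.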

Operational & Risk Analysis

  • Shipping duration vs profit by region
  • Discount vs profit relationship
  • Impact of high discounts on profitability
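
One way to examine the discount and profit relationship is to combine a correlation with average profit per discount band, as sketched below. The band edges (20% and 40%) and the sample values are arbitrary assumptions, not thresholds taken from the project.

```python
import pandas as pd

# Hypothetical order lines: discount as a fraction, profit in currency units.
orders = pd.DataFrame(
    {
        "discount": [0.0, 0.0, 0.1, 0.2, 0.3, 0.5, 0.7, 0.8],
        "profit": [40.0, 35.0, 25.0, 10.0, -5.0, -20.0, -60.0, -90.0],
    }
)

# Pearson correlation between discount and profit.
correlation = orders["discount"].corr(orders["profit"])

# Bucket discounts into bands (edges are illustrative) and compare profit.
bands = pd.cut(
    orders["discount"],
    bins=[0.0, 0.2, 0.4, 1.0],
    labels=["0-20%", "21-40%", "41%+"],
    include_lowest=True,
)
profit_by_band = orders.groupby(bands, observed=True)["profit"].agg(["mean", "count"])
```

A negative correlation together with falling mean profit across bands would point to heavy discounts eroding profitability.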

Skills Demonstrated

Programming & Libraries

  • Python
  • Pandas – data manipulation, aggregation, feature engineering
  • NumPy – numerical operations
  • Matplotlib – data visualization
  • Seaborn – advanced statistical visualizations

Data Cleaning & Preparation

  • Handling missing values
  • Data type conversions
  • Removing inconsistencies
  • Creating derived columns (e.g., order year, order month, shipping duration)
  • Date-time processing

Feature Engineering

  • Extracting year and month from order dates
  • Creating time-based features for trend analysis
  • Calculating shipping duration
  • Preparing categorical features for grouping and aggregation

Exploratory Data Analysis (EDA)

  • Sales and profit trend analysis
  • Category and sub-category performance analysis
  • Customer segmentation (top customers by revenue & profit)
  • Product-level performance analysis
  • Discount impact analysis
  • Regional performance comparison

Data Aggregation & Analysis Techniques

  • GroupBy operations
  • Sorting and ranking
  • Time-series aggregation
  • Comparative analysis across categories and regions

Data Visualization

  • Line charts (monthly revenue & profit trends)
  • Bar charts (top customers and products)
  • Heatmaps (category and sub-category profitability)
  • Scatter plots (discount vs profit, shipping duration vs profit)
  • Proper labeling, legends, and figure sizing
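
As an illustration of the line-chart style described above, the sketch below plots monthly revenue and profit with Matplotlib, including a title, axis labels, a legend, and an explicit figure size. The monthly figures are made up, and the `Agg` backend line is only there so the snippet runs without a display; it can be dropped inside Jupyter.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; not needed in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical monthly totals, indexed by year-month.
monthly = pd.DataFrame(
    {"sales": [300.0, 130.0, 120.0], "profit": [50.0, -30.0, 10.0]},
    index=pd.PeriodIndex(["2017-01", "2017-02", "2017-03"], freq="M"),
)

fig, ax = plt.subplots(figsize=(8, 4))
x = monthly.index.astype(str).tolist()  # plot months as category labels
ax.plot(x, monthly["sales"], marker="o", label="Revenue")
ax.plot(x, monthly["profit"], marker="o", label="Profit")
ax.axhline(0, color="grey", linewidth=0.8)  # break-even reference line
ax.set_title("Monthly revenue and profit")
ax.set_xlabel("Order month")
ax.set_ylabel("Amount")
ax.legend()
fig.tight_layout()
```

Calling `fig.savefig(...)` with a path of your choice, or displaying the figure in a notebook cell, renders the chart.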

Business & Analytical Skills

  • Translating data into actionable business insights
  • Identifying profitability drivers and risks
  • Detecting operational inefficiencies
  • Insight summarization and storytelling
  • Decision-oriented analysis

Project & Workflow Skills

  • Structured notebook design (layer-based analysis)
  • Reproducible analysis workflow
  • Clear documentation using Markdown
  • GitHub project organization
  • Version control fundamentals

About

An end-to-end exploratory data analysis of the Superstore sales dataset using Python. The project covers data cleaning, feature engineering, time-series analysis, customer and product performance, profitability insights, and business-focused visualizations.
