A polished collection of PostgreSQL analytics case studies built around realistic business and reporting requests.
This repository demonstrates how SQL can be used to translate stakeholder questions into clean, accurate, and business-ready analytical outputs. The project covers data quality checks, product analysis, customer segmentation, sales reporting, academic reporting, ranking, recursive CTEs, date logic, and window functions.
Each case study includes a business problem, a SQL solution, and a clear explanation of the logic behind the query.
This project is designed as a SQL analytics portfolio.
It shows the full analytical process:
- Understand the stakeholder request
- Identify the relevant tables
- Translate the business question into SQL logic
- Write a clean query
- Explain the approach clearly
- Produce an output that supports decision-making
The repository is structured to show not just SQL syntax, but practical business problem-solving.
📊 Business Analytics Queries
Solves realistic stakeholder requests involving sales, products, customers, orders, and student records.
🧩 Advanced SQL Logic
Uses joins, CTEs, recursive CTEs, window functions, ranking functions, date logic, and conditional aggregation.
🧹 Data Quality Checks
Identifies missing records, duplicate records, and mismatches between related tables.
👥 Customer Segmentation
Analyzes top customers, customer spending brackets, and customer order behavior over time.
📈 Sales Trend Reporting
Calculates monthly sales, cumulative sales, and six-month moving averages.
📝 Clear Documentation
Every case study includes the business problem, SQL solution, and step-by-step explanation.
Identifies products that exist in one table but not another using a FULL JOIN.
This type of query helps detect data integrity issues before they affect dashboards, sales reports, or revenue calculations.
queries/01_data_quality_missing_products/
Compares products against each other to find candy pairs within $0.25 of each other in unit price.
This demonstrates self joins and pairwise comparison logic.
queries/02_similar_product_prices/
Adds a sequential transaction number for each customer using ROW_NUMBER().
This is useful for customer journey analysis, repeat purchase tracking, and retention reporting.
queries/07_customer_transaction_number/
Generates dynamic $10 spending brackets using a recursive CTE.
This helps segment customers based on total spending.
queries/12_customer_spending_brackets/
Creates a monthly sales report with total sales, cumulative sales, and a six-month moving average.
This demonstrates time-series reporting using window functions.
queries/17_monthly_sales_moving_average/
| # | Case Study | Main Concepts |
|---|---|---|
| 01 | Data Quality Check: Missing Products | FULL JOIN, NULL filtering, data reconciliation |
| 02 | Similar Product Prices | Self join, ABS(), pairwise comparison |
| 03 | Price Deviation From Average | AVG() OVER, calculated columns |
| 04 | Products by Factory | COUNT() OVER, PARTITION BY |
| 05 | Products Cheaper Than Wicked Choccy's | CTE, MIN(), comparison filtering |
| 06 | Large Orders Over $200 | Joins, aggregation, CTEs, COUNT() OVER |
| 07 | Customer Transaction Number | ROW_NUMBER(), customer-level sequencing |
| 08 | Product Popularity Rank | DENSE_RANK(), ranking within orders |
| 09 | Second Most Popular Product | CTE, ranking, filtering ranked results |
| 10 | Customer Order Change Over Time | LAG(), order-level aggregation |
| 11 | Top 1 Percent Customers | NTILE(100), customer segmentation |
| 12 | Customer Spending Brackets | Recursive CTE, range bucketing |
| 13 | Q2 2024 Orders With Ship Date | Date filtering, INTERVAL |
| 14 | Student Duplicate Records | ROW_NUMBER(), duplicate removal |
| 15 | Student Highest Grade | DENSE_RANK(), top record per group |
| 16 | Department Grade Summary | CASE, conditional aggregation, pivot-style summary |
| 17 | Monthly Sales Moving Average | Time-series aggregation, cumulative sum, moving average |
The project uses multiple join types to connect and compare datasets.
INNER JOINLEFT JOINFULL JOIN- Self joins
CROSS JOIN
Examples include finding missing products, calculating order values, comparing product prices, and joining student records to grades.
Detailed notes are available in:
concepts/joins.md
CTEs are used to break complex logic into clear steps.
They help make the queries easier to read, explain, and maintain.
Examples include calculating order totals, ranking customers, filtering ranked records, and preparing monthly sales data.
Detailed notes are available in:
concepts/ctes.md
Window functions are used throughout the project to calculate rankings, sequence numbers, previous values, group counts, cumulative totals, and moving averages.
Functions used include:
ROW_NUMBER()DENSE_RANK()NTILE()LAG()COUNT() OVERAVG() OVERSUM() OVER
Detailed notes are available in:
concepts/window_functions.md
Recursive CTEs are used to generate dynamic customer spending brackets.
This avoids hard-coding ranges and makes the query flexible as the data changes.
Detailed notes are available in:
concepts/recursive_ctes.md
Date logic is used for quarterly filtering, calculated shipping dates, customer order sequencing, and monthly sales reporting.
Concepts include:
- Date ranges
INTERVALEXTRACT()TO_CHAR()DATE_TRUNC()
Detailed notes are available in:
concepts/date_functions.md
Ranking functions are used to identify top records, second-ranked records, duplicate records, and customer percentile groups.
Functions covered include:
ROW_NUMBER()RANK()DENSE_RANK()NTILE()
Detailed notes are available in:
concepts/ranking_functions.md
The repository includes queries that identify missing and duplicate records.
Examples:
- Products that exist in one table but not another
- Duplicate student records
These queries demonstrate how SQL can improve data reliability before reporting.
The project includes queries that help evaluate pricing, popularity, and production distribution.
Examples:
- Products with similar prices
- Products above or below average price
- Products produced by each factory
- Products ranked by popularity within each order
These queries support pricing reviews, inventory planning, product bundling, and product strategy.
The project includes sales-focused queries that calculate order values, customer spending, and sales trends.
Examples:
- Orders over
$200 - Monthly sales totals
- Cumulative sales
- Six-month moving average of sales
These queries support sales reporting, executive dashboards, and business performance reviews.
The project includes customer-focused queries that analyze spending and order behavior.
Examples:
- Customer transaction numbering
- Customer order changes over time
- Top 1% of customers by spending
- Customer spending brackets
These queries support customer segmentation, retention analysis, and loyalty program targeting.
The project also applies SQL analytics patterns to student records.
Examples:
- Removing duplicate student records
- Finding each student's highest grade
- Summarizing average grades by department and grade level
These examples show that the SQL patterns are transferable across different domains.
sql-analytics-case-studies/
│
├── README.md
├── LICENSE
├── .gitignore
├── portfolio_summary.md
│
├── data/
│ ├── README.md
│ ├── data_dictionary.md
│ └── sample_data_note.md
│
├── schema/
│ ├── README.md
│ ├── create_tables.sql
│ └── entity_relationship_notes.md
│
├── queries/
│ │
│ ├── 01_data_quality_missing_products/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 02_similar_product_prices/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 03_price_deviation_from_average/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 04_products_by_factory/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 05_products_cheaper_than_wicked_choccy/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 06_large_orders_over_200/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 07_customer_transaction_number/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 08_product_popularity_rank/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 09_second_most_popular_product/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 10_customer_order_change_over_time/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 11_top_1_percent_customers/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 12_customer_spending_brackets/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 13_q2_2024_orders_with_ship_date/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 14_student_duplicate_records/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 15_student_highest_grade/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ ├── 16_department_grade_summary/
│ │ ├── problem.md
│ │ ├── solution.sql
│ │ └── explanation.md
│ │
│ └── 17_monthly_sales_moving_average/
│ ├── problem.md
│ ├── solution.sql
│ └── explanation.md
│
└── concepts/
├── joins.md
├── ctes.md
├── window_functions.md
├── recursive_ctes.md
├── date_functions.md
└── ranking_functions.md
The repository includes documentation for the tables used in the case studies.
The main subject areas are:
Used for product, order, customer, factory, pricing, and sales analysis.
Main tables:
productsorders
Used for academic reporting and student performance analysis.
Main tables:
studentsstudent_grades
Data documentation is available in:
data/data_dictionary.md
Schema documentation is available in:
schema/entity_relationship_notes.md
A simplified mock schema is available in:
schema/create_tables.sql
The original learning exercises were completed using a PostgreSQL database in pgAdmin 4.
The original schema and dataset may come from a course-provided learning environment.
To respect data ownership, this repository focuses on:
- Original SQL solutions
- Query explanations
- Business logic
- Documentation
- A simplified mock schema for practice
This repository should not redistribute proprietary course data unless permission is explicitly granted by the course provider.
Each case study is located in the queries/ folder.
Every case study includes:
problem.md
solution.sql
explanation.md
Start with problem.md to understand the business request.
Open solution.sql to see the final query.
Use explanation.md to understand the logic step by step.
Use the concepts/ folder to review the SQL techniques used across the project.
queries/09_second_most_popular_product/
├── problem.md
├── solution.sql
└── explanation.md
The problem.md file explains the stakeholder request.
The solution.sql file contains the SQL query.
The explanation.md file breaks down the logic step by step.
- PostgreSQL
- pgAdmin 4
- SQL
- GitHub
This project is relevant for roles such as:
- Data Analyst
- Business Analyst
- Business Intelligence Analyst
- Analytics Engineer
- Reporting Analyst
- Data Science Intern
- Business Analytics Graduate Student
Possible future improvements include:
- Creating a fully original mock dataset
- Adding sample query outputs
- Adding screenshots from pgAdmin
- Adding an entity relationship diagram
- Adding performance notes for selected queries
- Creating a dashboard from selected outputs
- Adding multi-year-safe versions of time-series queries
- Adding SQL validation checks for the mock schema
This project is licensed under the MIT License.
See LICENSE file for details.
Adham Elkhouly
- Data Analytics and Reporting Specialist
- Power Platform Functional Consultant Associate
Developed as a SQL analytics portfolio project to demonstrate practical data analysis, business reporting, and stakeholder-focused query writing using PostgreSQL.