Skip to content

Latest commit

 

History

History
191 lines (138 loc) · 6.76 KB

File metadata and controls

191 lines (138 loc) · 6.76 KB

CardPulse Analytics

A card payments and customer analytics project for the German retail banking market. Covers 250,000 customers, 367K cards, and 1.5M transactions from 2023-2025.

Built end-to-end across SQL, Excel, and Power BI covering the full analyst toolkit a German bank would use.


Headline Findings

  • Revenue grew 48% over 3 years (EUR 42.3M to EUR 62.6M) with steady +21% YoY growth.
  • 23% of customers (Champions tier) drive 63% of revenue..
  • Premium credit cards generate 58x more profit per card than Standard debit (EUR 394 vs EUR 6.71).
  • 7,400 high-value customers at risk of churning represent EUR 7.7M in revenue.
  • 46% of fraud incidents happen in the afternoon (12:00-17:00), driving fraud-team staffing recommendations.

Full writeup in INSIGHTS.md.


Power BI Dashboards

Executive Overview

High-level KPIs, monthly revenue trend, and segment-level revenue concentration.

Executive Overview

Cards & Customer Deep-Dive

Customer distribution, fraud patterns by hour, and a churn-risk operational list.

Customer Details


Excel Analysis

KPI Dashboard

Eight KPI cards, yearly summary, and the top 10 churn risks with conditional formatting.

Excel KPI Dashboard

PivotTable + PivotChart

Monthly revenue PivotTable with grouped year/month structure and a clustered column chart.

Excel Pivot Table


Project Structure

CardPulse_Analytics/
├── README.md                                # This file
├── INSIGHTS.md                              # Business analysis writeup
├── CardPulse BI Analysis.pbix               # Power BI dashboards (2 pages, 11 measures)
├── Excel Analysis.xlsx                      # Excel workbook (5 sheets)
├── CardPulse_01_Data_Generation.ipynb       # Python data generation notebook
├── Data/                                    # 8 reference CSVs (no full transactions,cards and customers files due to Github web limit)
├── Screenshots/                             # Dashboard images
└── sql/                                     # 18 SQL files in 5 folders
    ├── 01_revenue_overview/
    ├── 02_customer_segmentation/
    ├── 03_merchant_analysis/
    ├── 04_fraud_analysis/
    └── 05_lifecycle_retention/

Tools and Stack

Layer Tool Purpose
Data generation Python (Pandas, NumPy, Faker) Synthetic dataset with German locale
Database PostgreSQL on Supabase (Frankfurt EU) Cloud SQL backend
SQL client DBeaver Query development and result export
Tabular analysis Excel (PivotTables, XLOOKUP, Power Query) Pre-aggregated analysis layer
Dashboarding Power BI Desktop (DAX, data model) Two interactive dashboards

Dataset

Nine relational tables modeled on real banking schemas:

Table Rows Purpose
customers 250,000 KYC and segmentation data
accounts 250,000 One checking account per customer
cards 367,610 Card issuance (1-3 cards per customer)
transactions 1,499,985 Card transaction fact table
merchants 48,960 German merchant brands by location
merchant_categories 25 Real ISO 18245 MCC codes
branches 174 Bank branches across 15 German cities
card_products 10 Card tier catalog (Standard / Premium / Business)
fx_rates 4,384 Daily EUR exchange rates

Date range: 2023-01-01 to 2025-12-31 (36 months) Total rows: ~2.42M MCC codes: Real-world Visa/Mastercard codes (e.g., 5411 = Grocery, 4511 = Airlines) Merchant brands: German market (Edeka, Aldi, Lidl, Lufthansa, Deutsche Telekom, etc.)

The full transactions.csv (163 MB), cards.csv (25.7 MB), and customers.csv (25.7 MB) are excluded from this repo due to GitHub web upload size limits. The 6 smaller reference CSVs are included in /Data/. To regenerate the full dataset, run CardPulse_01_Data_Generation.ipynb in Google Colab.


SQL Analysis (18 Queries)

Five themes covering executive-level questions a Head of Cards would ask:

Theme 1 - Revenue & Volume Overview (3 queries)

  • Yearly volume and growth
  • Monthly trends with YoY/MoM growth (window functions)
  • Revenue by card tier

Theme 2 - Customer Segmentation (4 queries)

  • Segment revenue contribution (Pareto analysis)
  • Top 10 cities by volume
  • Demographics (age x gender x segment)
  • Cards-per-customer distribution

Theme 3 - Merchant & Category Analysis (3 queries)

  • Top 10 merchants by location AND by brand
  • Spending by MCC category group
  • Online vs in-store channel split

Theme 4 - Fraud Analysis (3 queries)

  • Fraud rate by category
  • Fraud by hour of day (staffing insight)
  • Customer fraud exposure by city

Theme 5 - Customer Lifecycle & Retention (5 queries)

  • RFM segmentation (Recency, Frequency, Monetary)
  • Customer activity levels (Active / Casual / Dormant)
  • Cohort analysis by signup year
  • Card product profitability (fees + interchange - rewards)
  • Churn risk operational list

All queries use joins, CTEs, window functions, and conditional aggregations.


Excel Analysis (5 Sheets)

Sheet Skills Used
KPI Dashboard INDEX formulas, custom number formatting, conditional formatting
Monthly Trends PivotTable + PivotChart with grouped year/month
Customer Segments PivotTable with Slicer (interactive filtering)
Merchant Lookup XLOOKUP with error handling
Power Query Demo Filter and group-by transformation pipeline

Power BI Dashboards

Page 1 - Executive Overview

  • 4 KPI cards (Revenue, Transactions, Customers, Cards)
  • Monthly Revenue Trend (line chart)
  • Revenue by Customer Segment (bar chart)
  • YoY Revenue Growth (column chart)
  • Top 5 Merchants (table)

Page 2 - Cards & Customer Deep-Dive

  • 3 KPI cards (Avg Transaction, Fraud Rate, Churn Risks)
  • Customer Distribution by Segment (donut chart)
  • Top 10 Churn Risks (table)
  • Churn Risk by City (stacked bar chart)
  • Fraud Incidents by Hour of Day (stacked column chart)

Eleven DAX measures power the visuals, including ratio calculations with DIVIDE() and handle null denominators safely.


How to Reproduce

  1. Clone or download this repo
  2. Open CardPulse_01_Data_Generation.ipynb in Google Colab
  3. Mount Google Drive and run all cells (~10 minutes for full dataset)
  4. Load CSVs into a PostgreSQL database (e.g., Supabase)
  5. Run SQL files in numeric order from /sql/
  6. Open Excel Analysis.xlsx to explore PivotTables
  7. Open CardPulse BI Analysis.pbix in Power BI Desktop

Contact

Pavan Raj Kotagiri LinkedIn