📊 AI-powered GitHub analytics with intelligent insights and professional reporting.

xindixu/github-activity-analyzer

GitHub PR Analytics Suite

🤖 AI-powered GitHub PR analytics tool that transforms your pull request history into professional markdown reports with intelligent project categorization, pattern analysis, and performance insights.

✨ Features

  • 📊 Complete Workflow: Fetches PRs and generates AI analysis in one command
  • 🤖 AI-Powered Insights: Uses OpenAI to generate concise summaries and comprehensive pattern analysis
  • 📁 Project Categorization: Extracts project names from PR titles ([CS-1234] ProjectName: description)
  • 📝 Professional Reports: Generates beautiful markdown reports perfect for performance reviews
  • 🔍 Comprehensive Analysis: 5-section analysis covering project focus, technical themes, development velocity, cross-project insights, and key accomplishments
  • 📈 Smart Metrics: Tracks lines changed, PR distribution, and project priorities
  • 🎯 Performance Review Ready: Actionable insights for self-assessments and project planning

πŸ—οΈ Project Structure

pr/
├── src/
│   ├── github_pr_fetcher.py                     # Fetches PRs from GitHub
│   └── pr_summarizer.py                         # AI-powered analysis & summarization
├── output/
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_detailed.csv    # Raw PR data
│   ├── pr_YYYY-MM-DD_YYYY-MM-DD_summarized.csv  # With AI summaries
│   └── pr_YYYY-MM-DD_YYYY-MM-DD_summary.md      # Markdown analysis report
├── main.py                                      # Complete workflow entry point
├── requirements.txt                             # Dependencies
├── .env.example                                 # Configuration template
└── README.md
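
The output file names follow a start-date/end-date pattern. A minimal sketch of how such paths could be built is shown here. The function name `output_paths` and the rule "start = end minus DAYS" are assumptions, inferred from the sample run in this README (180 days ending 2025-07-14 gives a start of 2025-01-15), not taken from the tool's source.

```python
from datetime import date, timedelta
from pathlib import Path


def output_paths(days: int, end: date, out_dir: str = "output") -> dict[str, Path]:
    """Build the three output paths for a window of `days` ending on `end`.

    Hypothetical helper: the real tool may compute the window differently.
    """
    start = end - timedelta(days=days)
    stem = f"pr_{start.isoformat()}_{end.isoformat()}"
    base = Path(out_dir)
    return {
        "detailed": base / f"{stem}_detailed.csv",      # raw PR data
        "summarized": base / f"{stem}_summarized.csv",  # raw data plus AI summaries
        "report": base / f"{stem}_summary.md",          # markdown analysis report
    }


paths = output_paths(180, date(2025, 7, 14))
print(paths["detailed"].as_posix())
```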

🚀 Quick Start

1. Installation

git clone <your-repo>
cd pr
pip install -r requirements.txt

2. Setup

Create a .env file with your configuration:

cp .env.example .env

Edit .env and add your credentials:

# GitHub Configuration
GITHUB_TOKEN=your_personal_access_token_here
GITHUB_REPO=owner/repository
GITHUB_USERNAME=your_username  # Optional: specific user to search for

# Time Range (optional)
DAYS=14  # Default: 14 days

# AI Configuration (for summarization)
OPENAI_API_KEY=your_openai_api_key_here
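
As a rough illustration of how these settings might be read, the sketch below pulls them from the process environment and applies the 14-day default. It assumes the values have already been loaded into the environment (however the tool reads .env), that GITHUB_TOKEN and GITHUB_REPO are the required ones, and that trailing `# ...` comments may need stripping. `Settings` and `load_settings` are hypothetical names, not part of the tool.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    github_token: str
    github_repo: str
    github_username: Optional[str]
    days: int
    openai_api_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables; treating the first two as required is an assumption."""
    missing = [key for key in ("GITHUB_TOKEN", "GITHUB_REPO") if not env.get(key)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    # Drop any inline "# comment" and fall back to the documented default of 14.
    raw_days = (env.get("DAYS") or "14").split("#")[0].strip()
    return Settings(
        github_token=env["GITHUB_TOKEN"],
        github_repo=env["GITHUB_REPO"],
        github_username=env.get("GITHUB_USERNAME") or None,
        days=int(raw_days),
        openai_api_key=env.get("OPENAI_API_KEY") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out without touching real credentials.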

Getting GitHub Token

  1. Go to https://github.com/settings/tokens
  2. Click "Generate new token (classic)"
  3. Select scopes: repo (for private repos) or public_repo (for public repos)
  4. Copy the generated token

Getting OpenAI API Key

  1. Go to https://platform.openai.com/api-keys
  2. Create a new API key
  3. Copy the key (keep it secure!)

3. Run Complete Analysis

# Analyze last 14 days (default)
python main.py

# Custom time range
DAYS=30 python main.py

# Last 7 days
DAYS=7 python main.py

This will:

  1. 📊 Fetch all your PRs from the specified time range
  2. 🤖 Generate AI summaries for each PR
  3. 🔍 Analyze patterns and categorize by project
  4. 📝 Create a professional markdown report

📊 Output Files

1. Detailed CSV (pr_YYYY-MM-DD_YYYY-MM-DD_detailed.csv)

Raw PR data with clean descriptions:

  • pr_url - Direct link to the PR
  • title - PR title
  • description - Clean description (removes boilerplate/templates)
  • lines_of_code_changes - Total lines changed
  • additions - Lines added
  • deletions - Lines deleted
  • created_at - PR creation timestamp
  • state - PR state (open/closed)
  • merged - Whether PR was merged
  • attachments - URLs of attachments found in PR
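
Because the detailed file is a plain CSV, the headline numbers in the run summary (total PRs, total lines changed, average lines per PR) can be recomputed with the standard library. The sketch below assumes the column names listed above. The two rows and their URLs are made up for illustration, and `summarize_rows` is a hypothetical helper.

```python
import csv
import io


def summarize_rows(rows):
    """Return PR count, total lines changed, and the per-PR average."""
    total_prs = 0
    total_lines = 0
    for row in rows:
        total_prs += 1
        # Treat an empty cell as zero rather than failing on int("").
        total_lines += int(row["lines_of_code_changes"] or 0)
    average = round(total_lines / total_prs, 1) if total_prs else 0.0
    return {"total_prs": total_prs, "total_lines": total_lines, "average": average}


# Two made-up rows using a subset of the documented columns.
sample = io.StringIO(
    "pr_url,title,lines_of_code_changes,additions,deletions,merged\n"
    "https://example.com/pr/1,First change,165,120,45,True\n"
    "https://example.com/pr/2,Second change,11,6,5,False\n"
)
stats = summarize_rows(csv.DictReader(sample))
print(stats)
```

For a real file, replace the `io.StringIO` sample with `open(path, newline="", encoding="utf-8")`.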

2. Summarized CSV (pr_YYYY-MM-DD_YYYY-MM-DD_summarized.csv)

All detailed data plus:

  • ai_summary - Concise AI-generated summary of each PR
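
Conceptually, the summarized file is each detailed row with one extra field. The sketch below shows that shape with a pluggable `summarize` callable. The stand-in `first_line` function simply truncates the title and is not how the tool produces summaries (it uses OpenAI); both function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator


def add_summaries(
    rows: Iterable[Dict[str, str]], summarize: Callable[[str], str]
) -> Iterator[Dict[str, str]]:
    """Yield a copy of each row with an added ai_summary field."""
    for row in rows:
        enriched = dict(row)  # copy so the input rows are left untouched
        text = f"{row.get('title', '')}\n\n{row.get('description', '')}".strip()
        enriched["ai_summary"] = summarize(text)
        yield enriched


def first_line(text: str) -> str:
    """Stand-in for a model call: return the first line, capped at 80 characters."""
    lines = text.splitlines()
    return lines[0][:80] if lines else ""


rows = [{"title": "[INFRA-123] Docker: Update base images", "description": "Bumps images."}]
for enriched in add_summaries(rows, first_line):
    print(enriched["ai_summary"])
```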

3. Analysis Report (pr_YYYY-MM-DD_YYYY-MM-DD_summary.md)

Professional markdown report with:

  • 📊 Executive Summary: Period, totals, averages
  • 🎯 Project Focus & Impact: Which projects got the most attention
  • ⚡ Technical Themes & Patterns: Performance, security, infrastructure initiatives
  • 🚀 Development Velocity & Scale: Work distribution and iteration patterns
  • 🔗 Cross-Project Insights: Shared challenges and dependencies
  • 🏆 Key Accomplishments & Trends: Significant achievements and innovation
  • 📋 Individual PR Details: Each PR with project categorization, status, and summary

🎯 Project Categorization

The tool intelligently extracts project names from PR titles using the format:

[TICKET-123] ProjectName: Summary of changes

Examples:

  • [CS-6304] Roles: DB schema for RBAC → Roles project
  • [CS-5916] Connector Catalog: update cypress tests → Connector Catalog project
  • [INFRA-123] Docker: Update base images → Docker project

PRs that don't match this pattern are categorized as "Uncategorized" and handled separately.
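
The title format above lends itself to a regular expression. The following is one possible sketch, not the tool's actual pattern: the ticket shape (uppercase prefix, hyphen, digits) is an assumption based on the examples, and `categorize` is a hypothetical name.

```python
import re
from collections import Counter

# [TICKET-123] ProjectName: Summary of changes
TITLE_PATTERN = re.compile(
    r"^\[(?P<ticket>[A-Z][A-Z0-9]*-\d+)\]\s*(?P<project>[^:]+?)\s*:\s*(?P<summary>.+)$"
)


def categorize(title: str) -> str:
    """Return the project name from a PR title, or "Uncategorized" if it does not match."""
    match = TITLE_PATTERN.match(title.strip())
    return match.group("project") if match else "Uncategorized"


titles = [
    "[CS-6304] Roles: DB schema for RBAC",
    "[CS-5916] Connector Catalog: update cypress tests",
    "[INFRA-123] Docker: Update base images",
    "Fix typo in README",
]
# Count PRs per project, most frequent first.
distribution = Counter(categorize(title) for title in titles)
print(distribution.most_common())
```

Because the project group cannot contain a colon, the first colon after the ticket always ends the project name, so colons later in the summary are harmless.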

βš™οΈ Advanced Usage

Run Components Separately

# Just fetch PR data
python src/github_pr_fetcher.py

# Just run AI analysis on existing CSV
python src/pr_summarizer.py
python src/pr_summarizer.py output/specific_file.csv

Time Range Options

# Last week
DAYS=7 python main.py

# Last month
DAYS=30 python main.py

# Last quarter
DAYS=90 python main.py

# Last year
DAYS=365 python main.py

🎨 Sample Output

Program output

🚀 Starting GitHub PR Analytics Suite
==================================================
📊 Step 1: Fetching PRs from GitHub...
Searching for PRs created by: xindixu
Time range: Past 180 day(s)
Fetching PRs from instabase/instabase created by 'xindixu' after 2025-01-15...
Found 325 PRs matching criteria
Added PR: [CS-6454] DS UI: Disable indexing modal should list connected chatbots (165 lines changed)
Added PR: [CS-6435] DS UI: Indexing items over limit + use real feature flag (165 lines changed)
Added PR: [CS-0000] Roles: GA in 25.30 (11 lines changed)
Added PR: [CS-6452] Roles: gate mount backend for create/edit/delete mount points (50 lines changed)
Processed 5/325 PRs...
...

Exported 325 PRs to output/pr_2025-01-15_2025-07-14_detailed.csv

Summary:
Total PRs found: 325
Total lines changed: 165431
Average lines per PR: 509.0
✅ PR data saved to: output/pr_2025-01-15_2025-07-14_detailed.csv

🤖 Step 2: Generating AI summaries...
📊 Loaded 325 PRs from output/pr_2025-01-15_2025-07-14_detailed.csv
✅ OpenAI client initialized with model: gpt-3.5-turbo
🤖 Generating AI summaries...
Processing PR 1/325: [CS-6454] DS UI: Disable indexing modal should lis...
Processing PR 2/325: [CS-6435] DS UI: Indexing items over limit + use r...
...
πŸ” Analyzing patterns...
πŸ’Ύ Saved summarized data to output/pr_2025-01-15_2025-07-14_summarized.csv
πŸ“ Saved pattern analysis to output/pr_2025-01-15_2025-07-14_summary.md

🎯 QUICK ANALYSIS
==================================================
...

🎉 WORKFLOW COMPLETE!
==================================================
📁 Detailed PR data: output/pr_2025-01-15_2025-07-14_detailed.csv
🤖 AI summarized data: output/pr_2025-01-15_2025-07-14_summarized.csv
📝 Pattern analysis: output/pr_2025-01-15_2025-07-14_summary.md

Example Files

Check out the examples/ directory for complete sample output.
