AI Issue Triage

An issue analysis tool that uses Google's Gemini models to triage software issues and analyze them against the content of your codebase.

🚀 Quick Start

Want to set up automated issue analysis in your GitHub repo?

👉 See the QUICKSTART Guide for step-by-step instructions to:

Set up automated GitHub Actions workflows
Configure AI-powered issue analysis
Analyze issues against your codebase automatically
Get started in under 10 minutes

Looking to use this as a CLI tool or library? Continue reading below.

Features

AI-Powered Analysis: Uses Google Gemini 2.0 Flash for intelligent issue analysis with the latest Google Gen AI SDK
Two-Pass Architecture: Librarian identifies relevant files, Surgeon performs deep analysis on targeted context
Root Cause Analysis: Identifies primary causes and contributing factors
Solution Generation: Proposes specific code changes with rationale
Issue Triage: Automatically classifies issues as bugs, enhancements, or feature requests
Severity Assessment: Rates issues from low to critical priority
Code Location Mapping: Identifies relevant files, functions, and classes
Export Capabilities: Export analysis results in JSON format
Fast Analysis: Defaults to the gemini-2.0-flash-001 model; use --model to select a different Gemini model
Smart Retry Mechanism: Automatically retries analysis if low-quality responses are detected
Security Protection: Built-in prompt injection detection to protect against malicious inputs
Duplicate Detection: Automatically identifies and flags duplicate issues
GitHub Actions Integration: Multiple workflow options including label-based filtering for selective analysis
PR Review & Analysis: AI-powered pull request review with code quality feedback and suggestions

Setup

1. Install the Package

The easiest way to get started is to install the package:

# Clone the repository
git clone https://github.com/shvenkat-rh/AI-Issue-Triage.git
cd AI-Issue-Triage

# Install in development mode (editable)
pip install -e .

This will:

Install all dependencies
Make the utils package importable
Install CLI commands: ai-triage, ai-triage-duplicate, ai-triage-cosine, ai-triage-pr

Alternative (manual dependency installation):

pip install -r requirements.txt

2. Get Gemini API Key

⚠️ Important Notes:

Red Hat employees: Do NOT follow these steps. Please refer to the RH Internal Guidelines for generating your API keys.

Already have a GCP/Gemini API key? You can skip this section and use your existing key.

Visit Google AI Studio
Create a new API key
Copy env_example.txt to .env and add your API key:

cp env_example.txt .env
# Edit .env and add your API key

Note: The application uses the Google Gen AI SDK and defaults to the gemini-2.0-flash-001 model.

3. Prepare Your Codebase

By default, the analyzer looks for a repomix-output.txt file in the project directory. This file should contain your codebase content generated by Repomix.

To generate this file:

# Install repomix
npm install -g repomix

# Generate codebase file in your project directory
repomix --output repomix-output.txt

Alternative: You can use any text file containing your codebase content and specify its path using the --source-path option (see CLI usage below).

Usage

The AI Issue Triage system can be used in multiple ways:

1. GitHub Actions Workflow (Automated) - ⚡ RECOMMENDED

📖 For complete setup instructions, see the QUICKSTART Guide

Quick Overview

The most powerful way to use this system is through automated GitHub Actions workflows. The system provides four workflow options:

Automatic Workflows (All Issues)

Single Issue Analysis: Analyzes each new issue as it's created
Bulk Issue Analysis: Re-analyzes all open issues when code changes (with smart duplicate detection)

Label-Based Workflows (Selective Analysis)

Labeled Issue Analysis: Only analyzes issues with "Gemini Analyze" label
Bulk Labeled Analysis: Re-analyzes only labeled issues on PR merge

Common Features

Formatted Comments: GitHub-flavored Markdown with emojis and collapsible sections
Security Checks: Built-in prompt injection detection
Duplicate Detection: Identifies similar/duplicate issues automatically

Quick Setup

For detailed step-by-step instructions, see the QUICKSTART Guide.

TL;DR:

Copy workflows from cutlery/workflows/ to .github/workflows/ in your repo
Add GEMINI_API_KEY secret in GitHub repository settings
Create triage.config.json (example in cutlery/triage.config.json)
(Optional) Create "Gemini Analyze" label for selective analysis
Push changes and create a test issue

How It Works

Single Issue Analysis - When a new issue is opened:

Security Check: Scans for prompt injection attempts to protect the AI system
Duplicate Detection: Compares against existing issues to identify duplicates
AI Analysis: Performs comprehensive issue analysis using your codebase
Auto-Labeling: Adds appropriate labels based on issue type and severity
Comment Generation: Posts detailed analysis results as issue comments

Bulk Issue Analysis - When a PR is merged to main:

Fetches all open issues (sorted oldest → newest for canonical duplicate handling)
For each issue in sequential order:
- Prompt Injection Check: Scans and posts security report comment
- Duplicate Detection: Compares against previously analyzed issues in this run
  - If duplicate: adds label, posts duplicate comment with confidence score, skips AI analysis
- AI Analysis: Re-analyzes with updated codebase (only if not duplicate and safe)
  - Posts "Updated AI Analysis" comment with fresh insights
Automatically labels and comments on all issues

Note: Bulk analysis is efficient - duplicates and high-risk issues skip expensive AI analysis, saving API calls and time.
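
The oldest-to-newest ordering described above can be sketched as follows. This is a minimal illustration, not the workflow's actual code: `word_overlap` is a hypothetical stand-in for the Gemini comparison, the 0.6 threshold is arbitrary, and the issue field names (`number`, `created_at`, `title`) are assumptions.

```python
from typing import Callable, Dict, List, Optional


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets (stand-in for AI similarity)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def triage_in_order(
    issues: List[Dict],
    similarity: Callable[[str, str], float] = word_overlap,
    threshold: float = 0.6,
) -> Dict[int, Optional[int]]:
    """Map each issue number to the canonical issue it duplicates, or None."""
    canonical: List[Dict] = []  # non-duplicate issues seen so far in this run
    result: Dict[int, Optional[int]] = {}
    for issue in sorted(issues, key=lambda i: i["created_at"]):  # oldest first
        match = next(
            (c for c in canonical
             if similarity(issue["title"], c["title"]) >= threshold),
            None,
        )
        if match is not None:
            result[issue["number"]] = match["number"]  # duplicate: skip AI analysis
        else:
            canonical.append(issue)  # becomes a reference for newer issues
            result[issue["number"]] = None
    return result


issues = [
    {"number": 12, "created_at": "2024-03-02", "title": "Login page crashes on mobile"},
    {"number": 7, "created_at": "2024-01-15", "title": "login page crashes on mobile devices"},
    {"number": 15, "created_at": "2024-04-01", "title": "Add dark mode"},
]
print(triage_in_order(issues))
```

Because issue 7 is older, it becomes the canonical reference and issue 12 is reported as its duplicate, even though 12 appears first in the input list.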

Workflow Features

Security Protection: Automatically detects and flags malicious prompt injection attempts
Smart Duplicate Detection:
- Single issue workflow: checks against all existing open issues
- Bulk analysis workflow: processes issues oldest → newest, comparing each against the issues already analyzed in the same run
- Older issues become "canonical" references for newer duplicates
- Duplicates are marked and skipped to save API calls
Smart Labeling: Adds labels like type:bug, severity:high, gemini-analyzed, duplicate, security-alert
Comprehensive Comments: Posts three types of comments:
- Prompt injection security reports (all issues)
- Duplicate detection results (when duplicates found)
- Updated AI analysis (non-duplicate, safe issues)
Artifact Storage: Saves analysis results and debug logs for review
Fast Processing: Uses latest Gemini 2.0 Flash model for quick analysis
Flexible Filtering: Choose between automatic (all issues) or label-based (selective) workflows

Available Workflows

The system provides four workflow files in cutlery/workflows/:

| Workflow File | Trigger | Filter | Use Case |
| --- | --- | --- | --- |
| `gemini-issue-analysis.yml` | Issue opened | None | Analyze all new issues automatically |
| `gemini-labeled-issue-analysis.yml` | Issue opened/labeled | "Gemini Analyze" label | Selective analysis - only labeled issues |
| `ai-bulk-issue-analysis.yml` | PR merged to main | None | Re-analyze all open issues |
| `ai-bulk-labeled-issue-analysis.yml` | PR merged to main | "Gemini Analyze" label | Re-analyze only labeled issues |

When to Use Label-Based Workflows:

🎯 Cost Control: Reduce API usage by analyzing only selected issues
🔍 Manual Triage: Team decides which issues need AI analysis
⚡ Complex Issues: Focus AI resources on difficult problems
📊 Gradual Rollout: Test AI analysis on select issues first

Workflow Configuration Example:

# Excerpt from the automatic workflow file: analyzes all issues
name: AI Issue Analysis
on:
  issues:
    types: [opened]

# Excerpt from the label-based workflow file (a separate file):
# only analyzes issues that carry the "Gemini Analyze" label
name: AI Labeled Issue Analysis
on:
  issues:
    types: [labeled, opened]
jobs:
  analyze-issue:
    if: contains(github.event.issue.labels.*.name, 'Gemini Analyze')

Key Configuration Options:

Trigger Events: Modify on.issues.types to include edited, reopened, etc.
Label Filter: Change 'Gemini Analyze' to use a different label name
Repository Source: Change the AI-Issue-Triage repository reference if using a fork
Node.js Version: Adjust Node.js version for repomix compatibility
Python Version: Modify Python version based on your requirements
Artifact Retention: Adjust how long analysis artifacts are stored

Workflow Artifacts

The workflow generates several artifacts for debugging and audit purposes:

prompt_injection_result.json: Security scan results
prompt_injection_debug.log: Debug information for security checks
duplicate_result.json: Duplicate detection results
analysis_result.json: Complete AI analysis in JSON format
analysis_result.txt: Human-readable analysis results
repomix-output.txt: Generated codebase content

2. Web Interface (Interactive) - 🚧 Work in Progress

⚠️ Note: The Streamlit web interface is currently under development and not recommended for production use.

streamlit run ui/streamlit_app.py

This will open a web interface where you can:

Enter your Gemini API key in the sidebar
Provide issue details:
- Issue Title
- Detailed Description
Click "Analyze Issue" to get comprehensive analysis
Review results including:
- Issue classification and severity
- Root cause analysis
- Proposed solutions with code changes
- Confidence score
Export results as JSON for further use

Status: We're actively working on improving the UI/UX. For production use, please use the GitHub Actions workflow or CLI.

3. Command Line Interface (CLI) - Scripting & Automation

The analyzer provides a powerful command-line interface for automation and scripting.

Quick Start

Using installed commands (after pip install -e .):

# Interactive mode - prompts for title and description
ai-triage

# Direct analysis
ai-triage --title "Login bug" --description "Users can't login on mobile devices"

# Analyze from file
ai-triage --file sample_issue.txt

# Use custom source of truth file
ai-triage --title "Bug" --description "Description" --source-path /path/to/my-codebase.txt

# Use custom prompt template
ai-triage --title "Bug" --description "Description" --custom-prompt /path/to/custom_prompt.txt

# Use a different Gemini model
ai-triage --title "Bug" --description "Description" --model gemini-1.5-pro

# Save output to file
ai-triage --title "Bug" --description "Description" --output analysis.txt

# JSON output for automation
ai-triage --title "Bug" --description "Description" --format json

# Quiet mode (no progress messages)
ai-triage --quiet --title "Bug" --description "Description"

# Configure retry attempts for better quality
ai-triage --title "Bug" --description "Description" --retries 3

Alternative (using Python module):

python -m cli.analyze --title "Bug" --description "Details"
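
For scripting, the JSON output mode can be driven from Python. The sketch below uses only flags listed in this README; the assumption that stdout contains nothing but the JSON document when `--quiet` is set has not been verified, and `build_command` and `run_triage` are hypothetical helper names.

```python
import json
import shlex
import subprocess
from typing import List


def build_command(title: str, description: str, retries: int = 2) -> List[str]:
    """Assemble an ai-triage invocation from flags documented in this README."""
    return [
        "ai-triage",
        "--quiet",
        "--title", title,
        "--description", description,
        "--format", "json",
        "--retries", str(retries),
    ]


def run_triage(title: str, description: str) -> dict:
    """Run the CLI and parse stdout as JSON (assumes stdout holds only JSON)."""
    completed = subprocess.run(
        build_command(title, description),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Print the command instead of running it, so the sketch works without an API key.
print(shlex.join(build_command("Login bug", "Users can't log in on mobile")))
```

Passing the arguments as a list avoids shell quoting problems when titles or descriptions contain quotes or spaces.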

CLI Options

positional arguments:
  none

options:
  -h, --help            show this help message and exit
  --title TITLE, -t TITLE
                        Issue title
  --description DESCRIPTION, -d DESCRIPTION
                        Issue description  
  --file FILE, -f FILE  Read issue from file (title on first line, description below)
  --output OUTPUT, -o OUTPUT
                        Output file (default: stdout)
  --format {text,json}  Output format (default: text)
  --source-path SOURCE_PATH, -s SOURCE_PATH
                        Path to source of truth file (default: repomix-output.txt)
  --custom-prompt CUSTOM_PROMPT
                        Path to custom prompt template file
  --api-key API_KEY     Gemini API key (default: from GEMINI_API_KEY env var)
  --model MODEL         Gemini model name (default: gemini-2.0-flash-001)
  --retries RETRIES     Maximum retry attempts for low quality responses (default: 2)
  --quiet, -q           Suppress progress messages
  --no-clean            Disable data cleaning (preserve raw input)
  --version             show program's version number and exit

File Format for --file option

Issue Title Here
Issue description starts here.
Can be multiple lines.
Include all relevant details.
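
A minimal sketch of how this layout can be parsed, assuming the first non-empty line is the title and everything after it is the description. `parse_issue_text` is a hypothetical helper, not the CLI's own parser.

```python
from typing import Tuple


def parse_issue_text(text: str) -> Tuple[str, str]:
    """Split issue text into (title, description): first line, then the rest."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("issue file is empty")
    title = lines[0].strip()
    description = "\n".join(lines[1:]).strip()
    return title, description


sample = """Issue Title Here
Issue description starts here.
Can be multiple lines.
Include all relevant details.
"""
title, description = parse_issue_text(sample)
print(title)
print(description)
```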

Additional Command-Line Tools

The project includes several specialized CLI tools for specific tasks:

1. Duplicate Issue Detection

Detect duplicate issues using AI-powered semantic analysis:

Using installed commands:

# Check if a new issue is duplicate
ai-triage-duplicate --title "Issue title" --description "Issue details" --issues issues.json

# Use a different Gemini model
ai-triage-duplicate --title "Issue title" --description "Details" --issues issues.json --model gemini-1.5-pro

# Batch check multiple issues
ai-triage-duplicate --file new-issues.json --issues existing-issues.json

Alternative (using Python module):

python -m cli.duplicate_check --title "..." --description "..." --issues issues.json

Features:

AI-powered semantic similarity detection
Compares against existing open issues
Provides similarity scores and recommendations
Configurable Gemini model selection

Status: ✅ Stable and ready for use

2. Cosine Similarity Duplicate Detection

Alternative duplicate detection using TF-IDF and cosine similarity:

Using installed commands:

ai-triage-cosine --title "Issue title" --description "Details" --issues issues.json

Alternative (using Python module):

python -m cli.cosine_check --title "..." --description "..." --issues issues.json

Features:

Fast; no API key or API calls required
Uses scikit-learn TF-IDF vectorization
Good for offline/local analysis

Status: 🚧 Experimental - We're still refining the similarity thresholds and accuracy. Use with caution.
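
To illustrate the technique, here is a standard-library sketch of TF-IDF weighting followed by cosine similarity. It is not the project's implementation, which uses scikit-learn; the whitespace tokenizer is a simplification, and the IDF formula is the smoothed form `ln((1 + n) / (1 + df)) + 1`.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_vectors(docs: List[str]) -> List[Dict[str, float]]:
    """Build one sparse TF-IDF vector per document using smoothed IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


existing = ["Login page crashes on mobile", "Add dark mode to settings"]
new_issue = "App crashes on mobile login"
vectors = tfidf_vectors(existing + [new_issue])
scores = [cosine(vectors[-1], v) for v in vectors[:-1]]
print(scores)
```

The new issue shares several terms with the first existing issue and none with the second, so only the first receives a non-zero score. Where to set the duplicate threshold is exactly the part the Status note above describes as still being refined.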

4. Pull Request Review

AI-powered pull request analysis and code review:

Using installed command (after pip install -e .):

# Review a PR from JSON file
ai-triage-pr --pr-file pr_data.json

# Review with custom configuration
ai-triage-pr --pr-file pr.json --config pr_prompt_config.yml

# Review and save to file
ai-triage-pr --pr-file pr.json --output review.md --format markdown

# Review with repo URL for context-specific prompts
ai-triage-pr --pr-file pr.json --repo-url "https://github.com/user/repo"

# Review with inline data
ai-triage-pr --title "Add feature" --body "Description" --files changes.json

Using as a module:

# Review a PR from JSON file
python -m cli.pr_review --pr-file pr_data.json

# Review with custom configuration
python -m cli.pr_review --pr-file pr.json --config pr_prompt_config.yml

# Review and save to file
python -m cli.pr_review --pr-file pr.json --output review.md --format markdown

# Review with repo URL for context-specific prompts
python -m cli.pr_review --pr-file pr.json --repo-url "https://github.com/user/repo"

# Review with inline data
python -m cli.pr_review --title "Add feature" --body "Description" --files changes.json

PR JSON file format:

{
  "title": "PR title",
  "body": "PR description",
  "repo_url": "https://github.com/user/repo",
  "file_changes": [
    {
      "filename": "path/to/file.py",
      "status": "modified",
      "additions": 10,
      "deletions": 5,
      "patch": "@@ -1,5 +1,10 @@\n..."
    }
  ]
}
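
A file in this shape can be generated from a script. The sketch below builds the structure shown above and checks each file entry; treating every key in the example entry as required is an assumption, and `build_pr_payload` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List

# Assumption: every key shown in the example entry is expected for each file.
FILE_CHANGE_KEYS = {"filename", "status", "additions", "deletions", "patch"}


def build_pr_payload(
    title: str,
    body: str,
    repo_url: str,
    file_changes: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a dict shaped like the PR JSON example, checking each file entry."""
    for change in file_changes:
        missing = FILE_CHANGE_KEYS - set(change)
        if missing:
            raise ValueError(f"file change is missing keys: {sorted(missing)}")
    return {
        "title": title,
        "body": body,
        "repo_url": repo_url,
        "file_changes": file_changes,
    }


payload = build_pr_payload(
    title="Add feature",
    body="Description",
    repo_url="https://github.com/user/repo",
    file_changes=[
        {
            "filename": "path/to/file.py",
            "status": "modified",
            "additions": 10,
            "deletions": 5,
            "patch": "@@ -1,5 +1,10 @@\n...",
        }
    ],
)
print(json.dumps(payload, indent=2))
```

Writing the printed JSON to a file such as pr_data.json gives an input suitable for the --pr-file option.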

Features:

Comprehensive code review with AI
File-specific comments with line numbers
Identifies strengths, issues, and suggestions
Configurable prompts for different repo types (Python, AI/ML, etc.)
Workflow analysis for GitHub Actions
Markdown and JSON output formats

Prompt Configuration:

The PR analyzer uses a YAML configuration file (pr_prompt_config.yml) to customize review behavior based on repository type:

# Repository URL patterns
repo_mappings:
  python:
    - 'github.com/.*/.*-python.*'
  ai_ml:
    - 'github.com/.*/.*AI.*'

# Custom prompts per repo type
prompts:
  python:
    pr_review:
      system_role: 'Python expert code reviewer...'
      review_structure: |
        Focus on:
        - PEP 8 compliance
        - Type hints
        - Docstrings
        ...
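
One plausible way to resolve a repository URL against `repo_mappings` is sketched below. It assumes the patterns are regular expressions searched anywhere in the URL and that the first matching type wins; the analyzer's actual matching rules may differ, and `detect_repo_type` is a hypothetical helper.

```python
import re
from typing import Dict, List, Optional

# Same patterns as the YAML example above, expressed as a Python dict.
REPO_MAPPINGS: Dict[str, List[str]] = {
    "python": [r"github.com/.*/.*-python.*"],
    "ai_ml": [r"github.com/.*/.*AI.*"],
}


def detect_repo_type(
    repo_url: str, mappings: Dict[str, List[str]] = REPO_MAPPINGS
) -> Optional[str]:
    """Return the first repo type with a matching pattern, or None."""
    for repo_type, patterns in mappings.items():
        if any(re.search(pattern, repo_url) for pattern in patterns):
            return repo_type
    return None


print(detect_repo_type("https://github.com/user/my-python-lib"))
print(detect_repo_type("https://github.com/shvenkat-rh/AI-Issue-Triage"))
print(detect_repo_type("https://github.com/user/repo"))
```

Note that the unescaped `.` in `github.com` matches any character and the `AI` pattern is case-sensitive, so patterns may need tightening for real repositories.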

Status: ✅ Stable and ready for use

3. Prompt Injection Detection

Security tool to detect malicious prompt injection attempts:

Using as a module:

python -m utils.security.prompt_injection "title" "description"

Using as a library:

from utils.security import PromptInjectionDetector

detector = PromptInjectionDetector()
result = detector.detect("Issue content")
print(f"Risk Level: {result.risk_level}")

Features:

Detects prompt injection patterns
ML-based detection using pytector
Pattern-based heuristics
Risk level classification

Status: ✅ Stable - Automatically integrated into GitHub Actions workflows

5. Two-Pass Architecture (Librarian + Surgeon)

For complex issues requiring full code context, use the Two-Pass Architecture that intelligently breaks down the codebase into directory chunks and identifies relevant files before deep analysis:

How It Works:

Directory Chunking: Repository is cloned and divided into per-directory compressed repomix files
Pass 1 - Librarian: Analyzes each directory chunk to identify relevant files (with dependency tracking)
Pass 2 - Surgeon: Creates targeted repomix with only identified files for deep analysis

Pass 1 - Librarian (File Identification):

# Librarian analyzes directory chunks to identify relevant files
python -m cli.librarian \
  --title "Bug in authentication flow" \
  --description "Users cannot login after password reset" \
  --chunks-dir repomix-chunks \
  --output relevant_files.json

Pass 2 - Surgeon (Deep Analysis) - Use existing analyzer with targeted files:

# Surgeon pass uses the standard analyzer with targeted repomix
# (see GitHub Actions workflow for automated integration)

Benefits:

Scalable: Works with repos of any size by breaking into chunks
Token Efficient: Avoids 1M+ token limits by analyzing directories separately
Smart Dependencies: If file A imports file B, both are included
Precise Context: Surgeon gets only relevant files, not entire codebase
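
The dependency rule ("if file A imports file B, both are included") amounts to a transitive closure over an import graph. In the project this selection is made by the Librarian model; the deterministic sketch below only illustrates the idea, with a hand-written import map and a hypothetical `with_dependencies` helper.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def with_dependencies(seeds: Iterable[str], imports: Dict[str, List[str]]) -> Set[str]:
    """Return the seed files plus everything they import, directly or indirectly."""
    selected = set(seeds)
    queue = deque(selected)
    while queue:
        current = queue.popleft()
        for dependency in imports.get(current, []):
            if dependency not in selected:  # also guards against import cycles
                selected.add(dependency)
                queue.append(dependency)
    return selected


# Hypothetical import map: file -> files it imports.
imports = {
    "auth/login.py": ["auth/tokens.py"],
    "auth/tokens.py": ["utils/crypto.py"],
    "ui/page.py": ["auth/login.py"],
}
print(sorted(with_dependencies(["auth/login.py"], imports)))
```

Here `ui/page.py` is left out because it imports the seed file rather than being imported by it.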

Automated Workflow: The ai-lib-triage.yml workflow automatically handles:

Repository cloning and directory tree generation
Per-directory repomix generation with compression
Librarian analysis across all chunks
Targeted repomix creation with identified files
Surgeon analysis with full context of relevant files
All security, duplicate detection, and labeling features

End-to-End Flow:

Librarian analyzes compressed codebase skeleton
AI identifies ALL relevant files (no arbitrary limits)
Includes dependency chains (if file A imports B, both included)
Creates targeted repomix with only identified files
Surgeon performs deep analysis with focused context
Results in more accurate analysis with lower token usage

Features:

AI determines relevant file count (no manual limits)
Automatic dependency inclusion
Targeted codebase generation
Integrates with existing analyzer (Surgeon)
GitHub Actions workflow available (ai-lib-triage.yml)

GitHub Workflow: The ai-lib-triage.yml workflow provides:

Label-triggered ("AI_Triage" or bypass label)
Security checks with prompt injection detection
Duplicate detection
Two-pass analysis (Librarian → Targeted Repomix → Surgeon)
Auto-labeling based on results
Comprehensive comments with file lists

When to Use:

✅ Subtle bugs requiring full code context
✅ Complex issues spanning multiple files
✅ Issues where file location is unclear
✅ Large codebases where full context exceeds token limits

Status: ✅ Stable and ready for use

Programmatic Usage (Python Library)

You can also use the analyzer programmatically:

# Import from the package
from utils import GeminiIssueAnalyzer, IssueAnalysis, IssueType, Severity

# Or import specific modules
from utils.analyzer import GeminiIssueAnalyzer
from utils.duplicate import CosineDuplicateAnalyzer, GeminiDuplicateAnalyzer
from utils.models import IssueAnalysis, IssueType, Severity
from utils.security import PromptInjectionDetector
from utils.pr_analyzer import PRAnalyzer
from utils.librarian import LibrarianAnalyzer

# Initialize analyzer with default source path
analyzer = GeminiIssueAnalyzer(api_key="your-api-key")

# Or initialize with custom source path
analyzer = GeminiIssueAnalyzer(
    api_key="your-api-key",
    source_path="/path/to/your/codebase.txt"
)

# Or use a different Gemini model
analyzer = GeminiIssueAnalyzer(
    api_key="your-api-key",
    model_name="gemini-1.5-pro"
)

# Note: The analyzer uses the Google Gen AI SDK with gemini-2.0-flash-001 by default

# Analyze an issue
analysis = analyzer.analyze_issue(
    title="Login page crashes on mobile",
    issue_description="When users try to login on mobile devices, the app crashes..."
)

print(f"Issue Type: {analysis.issue_type}")
print(f"Severity: {analysis.severity}")
print(f"Root Cause: {analysis.root_cause_analysis.primary_cause}")

# Use duplicate detection
duplicate_analyzer = GeminiDuplicateAnalyzer(
    api_key="your-api-key",
    model_name="gemini-1.5-pro"  # Optional
)
result = duplicate_analyzer.detect_duplicate(
    new_issue_title="Bug title",
    new_issue_description="Details",
    existing_issues=[...]
)

# Use security detection
security = PromptInjectionDetector()
check = security.detect("User input text")
print(f"Risk: {check.risk_level}")

# Use PR analyzer
pr_analyzer = PRAnalyzer(
    api_key="your-api-key",
    model_name="gemini-2.0-flash-001"  # Optional
)
review = pr_analyzer.review_pr(
    title="Add new feature",
    body="Description of changes",
    file_changes=[
        {
            "filename": "src/feature.py",
            "status": "modified",
            "additions": 10,
            "deletions": 5,
            "patch": "@@ -1,5 +1,10 @@\n..."
        }
    ],
    repo_url="https://github.com/user/repo"
)
print(f"Overall Assessment: {review.overall_assessment}")
print(f"Issues Found: {len(review.issues_found)}")

# Format review for display
formatted_review = pr_analyzer.format_review_summary(review)
print(formatted_review)

# Use Two-Pass Architecture (Librarian + Surgeon)
librarian = LibrarianAnalyzer(
    api_key="your-api-key",
    chunks_dir="repomix-chunks"
)

# Pass 1: Identify relevant files from directory chunks
result = librarian.identify_relevant_files(
    title="Authentication Bug",
    issue_description="Users cannot login after password reset"
)
print(f"Analysis: {result['analysis_summary']}")
print(f"Identified {len(result['relevant_files'])} relevant files")

# Pass 2: Use standard analyzer with targeted context
# (create targeted repomix with only relevant_files, then use GeminiIssueAnalyzer)

Source of Truth Configuration

The analyzer uses a "source of truth" file containing your codebase content to perform intelligent analysis. This gives the AI context about your specific code structure, patterns, and implementation details.

Default Behavior

By default, the analyzer looks for repomix-output.txt in the current directory
This file should contain your complete codebase content

Custom Source Path

You can specify a different source file using the --source-path option:

# Use a custom codebase file
ai-triage --source-path /path/to/my-project-dump.txt --title "Issue" --description "Details"

# Use a file in a different directory
ai-triage -s ../other-project/codebase.txt --title "Issue" --description "Details"

Supported File Formats

Any plain text file containing your codebase
Generated by tools like Repomix
Manual concatenation of source files
Output from other code analysis tools

Best Practices

Include all relevant source files in your source of truth
Keep the file updated when your codebase changes
Consider excluding large binary files or dependencies
Include configuration files, documentation, and tests for better analysis
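
If Repomix is not available, a source-of-truth file can be produced by manual concatenation. The sketch below is one way to do that under stated assumptions: the `=====` header format and the suffix allow-list are arbitrary choices, not a format the analyzer requires, and `build_source_of_truth` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def build_source_of_truth(
    root: Path,
    output: Path,
    suffixes: Iterable[str] = (".py", ".md", ".yml", ".json"),
) -> int:
    """Concatenate matching text files under root into output; return the file count."""
    wanted = set(suffixes)
    count = 0
    with output.open("w", encoding="utf-8") as out:
        for path in sorted(root.rglob("*")):
            if not path.is_file() or path.suffix not in wanted:
                continue  # skips directories, binaries, and other unlisted types
            if path.resolve() == output.resolve():
                continue  # never include the output file in itself
            out.write(f"===== {path.relative_to(root).as_posix()} =====\n")
            out.write(path.read_text(encoding="utf-8", errors="replace"))
            out.write("\n\n")
            count += 1
    return count


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app.py").write_text("print('hello')\n", encoding="utf-8")
    (root / "logo.png").write_bytes(b"\x89PNG")
    n = build_source_of_truth(root, root / "codebase.txt")
    dump = (root / "codebase.txt").read_text(encoding="utf-8")
    print(n)
```

The resulting file can then be passed to the analyzer with --source-path.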

Custom Prompt Templates

You can customize how the AI analyzes your issues by providing your own prompt template. This gives you complete control over the analysis style and focus areas.

Creating a Custom Prompt

Create a text file with your custom prompt template
Use placeholders for dynamic content:
- {title} - Issue title
- {issue_description} - Issue description
- {codebase_content} - Full codebase content
Example custom prompt (my_prompt.txt):

You are a security-focused code reviewer analyzing the following issue:

Title: {title}
Description: {issue_description}

Codebase: {codebase_content}

Focus on:
- Security vulnerabilities
- Input validation issues
- Authentication/authorization problems
- Data exposure risks

Provide analysis in JSON format with security_risks field.
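
To preview how a template will look once filled in, the placeholders can be substituted locally. The sketch below replaces only the three documented placeholders in a single pass and leaves other braces (for example, JSON in the instructions) untouched. How the analyzer itself performs the substitution is not stated here, so treat `render_prompt` as a hypothetical preview helper.

```python
import re

# Matches exactly the three placeholders documented above.
PLACEHOLDER = re.compile(r"\{(title|issue_description|codebase_content)\}")


def render_prompt(
    template: str, title: str, issue_description: str, codebase_content: str
) -> str:
    """Fill the documented placeholders; any other braces are left as-is."""
    values = {
        "title": title,
        "issue_description": issue_description,
        "codebase_content": codebase_content,
    }
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


template = (
    "Title: {title}\n"
    "Description: {issue_description}\n"
    "Codebase: {codebase_content}\n"
    'Return JSON like {"security_risks": []}'
)
rendered = render_prompt(template, "Bug", "Crash on login", "def login(): ...")
print(rendered)
```

A single-pass regex substitution is used instead of `str.format` so that literal braces in the template do not raise errors, and so that placeholder-like text inside the issue content is not expanded a second time.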

Using Custom Prompts

# CLI usage
ai-triage --title "Security Issue" --description "Details..." --custom-prompt my_prompt.txt

# Web UI usage
# Enter the path in the "Custom Prompt Path" field in the sidebar

Custom Prompt Use Cases

Security Analysis: Focus on vulnerabilities and security best practices
Performance Review: Emphasize performance optimization opportunities
Architecture Review: Concentrate on design patterns and architectural improvements
Compliance Check: Ensure code meets specific coding standards or regulations
Domain-Specific: Tailor analysis for specific frameworks or technologies

Security Features

The AI Issue Triage system includes comprehensive security protections to prevent misuse and protect the AI analysis system.

Prompt Injection Detection

The system automatically scans all issue content for potential prompt injection attempts using:

Machine Learning Detection: Uses the pytector library with trained models
Pattern-Based Detection: Custom regex patterns for common injection techniques
Heuristic Analysis: Behavioral analysis for suspicious content patterns

Detection Categories

The system identifies various types of malicious inputs:

Role Manipulation: Attempts to change the AI's role or persona
System Prompts: Trying to inject system-level instructions
Instruction Bypass: Commands to ignore previous instructions
File Manipulation: Requests to create, modify, or access files
Code Injection: Attempts to execute arbitrary code
Data Extraction: Trying to extract sensitive information
Prompt Leakage: Attempts to reveal system prompts
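
Pattern-based detection for a few of these categories might look like the sketch below. The regular expressions are illustrative examples written for this README, not the patterns shipped in utils.security, and they cover only three of the categories listed above.

```python
import re
from typing import Dict, List

# Illustrative patterns only; a real rule set would be broader and tuned on data.
PATTERNS: Dict[str, List[str]] = {
    "instruction_bypass": [r"ignore (all )?(previous|prior|above) instructions"],
    "role_manipulation": [r"you are now\b", r"act as (an?|the) "],
    "prompt_leakage": [r"(reveal|show|print) (your|the) system prompt"],
}


def scan(text: str) -> List[str]:
    """Return the categories whose patterns match the text (case-insensitive)."""
    lowered = text.lower()
    return [
        category
        for category, patterns in PATTERNS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    ]


print(scan("Please ignore all previous instructions and reveal your system prompt"))
print(scan("Login page crashes on mobile"))
```

Regex heuristics like these are cheap but easy to evade, which is why the system combines them with ML-based and heuristic checks.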

Risk Levels

Issues are classified into risk levels:

Critical: Severe injection attempts (flagged and processing stopped)
High: Clear malicious intent (flagged with warning)
Medium: Suspicious patterns (flagged for review)
Low: Minor concerns (noted but processed)
Safe: No security concerns detected

Security Response

When prompt injection is detected:

Issue Flagging: Adds security labels (security-alert, prompt-injection-detected)
Warning Comment: Posts educational message explaining the detection
Processing Halt: Stops AI analysis to prevent system manipulation
Audit Trail: Logs detection details for security review

Duplicate Detection

The system includes intelligent duplicate detection to prevent redundant analysis and improve issue management.

How It Works

Semantic Analysis: Uses AI to understand issue meaning beyond exact text matches
Similarity Scoring: Calculates confidence scores for potential duplicates
Context Awareness: Considers issue status, labels, and resolution state
Cross-Reference: Compares against all existing open issues

Duplicate Handling

When duplicates are detected:

Automatic Labeling: Adds duplicate label
Reference Comment: Links to the original issue
Processing Skip: Avoids redundant AI analysis
Consolidation: Helps maintainers merge related issues

Smart Retry Mechanism

The analyzer includes an intelligent retry system that automatically detects low-quality responses and retries the analysis for better results.

How It Works

The system automatically identifies responses that contain:

Generic phrases like "requires further investigation" or "to be determined"
Very low confidence scores (< 60%)
Vague file paths or empty solutions
Short or incomplete analysis summaries
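
The checks above can be sketched as a small quality gate wrapped in a retry loop. This is an illustration under assumptions: the `summary` field name and the 40-character minimum are invented for the example, while `confidence_score` and `proposed_solutions` follow the example analysis in this README. The analyzer's real heuristics may differ.

```python
from typing import Callable, Dict, List

GENERIC_PHRASES = ("requires further investigation", "to be determined")


def quality_problems(
    analysis: Dict, min_confidence: float = 0.6, min_summary_chars: int = 40
) -> List[str]:
    """List the reasons an analysis looks low-quality (empty list means OK)."""
    problems = []
    summary = analysis.get("summary", "")
    if any(phrase in summary.lower() for phrase in GENERIC_PHRASES):
        problems.append("generic phrasing")
    if analysis.get("confidence_score", 0.0) < min_confidence:
        problems.append("low confidence")
    if not analysis.get("proposed_solutions"):
        problems.append("no solutions")
    if len(summary) < min_summary_chars:
        problems.append("short summary")
    return problems


def analyze_with_retries(analyze: Callable[[], Dict], retries: int = 2) -> Dict:
    """Call analyze() up to retries + 1 times, keeping the best result seen."""
    best = analyze()
    for _ in range(retries):
        if not quality_problems(best):
            break  # good enough, stop spending API calls
        candidate = analyze()
        if len(quality_problems(candidate)) <= len(quality_problems(best)):
            best = candidate
    return best


# Simulated model responses: a vague first answer, then a specific one.
responses = iter([
    {"summary": "Requires further investigation.", "confidence_score": 0.4,
     "proposed_solutions": []},
    {"summary": "Missing timeout handling in authenticate_user causes the crash.",
     "confidence_score": 0.85,
     "proposed_solutions": [{"description": "Add timeout handling"}]},
])
result = analyze_with_retries(lambda: next(responses))
print(result["confidence_score"])
```

With `retries=0` the loop body never runs, which matches the "Disable retries" behaviour of the --retries 0 option.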

Configuration

# Default: 2 retries
ai-triage --title "Issue" --description "Details"

# Custom retry count
ai-triage --title "Issue" --description "Details" --retries 3

# Disable retries
ai-triage --title "Issue" --description "Details" --retries 0

Benefits

Higher Quality: Automatically improves analysis quality
Reliability: Reduces chance of getting generic responses
Transparency: Shows retry attempts in progress messages
Configurable: Adjust retry count based on your needs

Analysis Components

Issue Classification

Bug: Issues that represent errors or defects
Enhancement: Improvements to existing functionality
Feature Request: New functionality requests

Severity Levels

Critical: System-breaking issues requiring immediate attention
High: Important issues affecting core functionality
Medium: Moderate impact issues
Low: Minor issues with minimal impact

Root Cause Analysis

Primary cause identification
Contributing factors
Affected components
Related code locations

Solution Proposals

Specific code changes
Implementation rationale
Target locations (files, functions, classes)
Step-by-step implementation guidance

Example Analysis

{
  "title": "Authentication timeout not handled properly",
  "issue_type": "bug",
  "severity": "high",
  "root_cause_analysis": {
    "primary_cause": "Missing timeout exception handling in auth module",
    "contributing_factors": [
      "No retry mechanism implemented",
      "User feedback on timeout missing"
    ],
    "affected_components": ["authentication", "user_session"],
    "related_code_locations": [
      {
        "file_path": "src/auth/login.py",
        "line_number": 45,
        "function_name": "authenticate_user"
      }
    ]
  },
  "proposed_solutions": [
    {
      "description": "Add timeout exception handling with user feedback",
      "code_changes": "try:\n    response = auth_request()\nexcept TimeoutError:\n    return {'error': 'Authentication timeout'}",
      "location": {
        "file_path": "src/auth/login.py",
        "function_name": "authenticate_user"
      },
      "rationale": "Provides graceful error handling and user feedback"
    }
  ],
  "confidence_score": 0.85
}
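
A result in this shape is straightforward to consume downstream. The sketch below derives labels in the `type:` / `severity:` style mentioned under Smart Labeling and lists the referenced code locations. The exact label mapping used by the workflows is an assumption, and `labels_for` and `locations_for` are hypothetical helpers.

```python
import json
from typing import List

# Abbreviated copy of the example analysis above.
raw = """
{
  "title": "Authentication timeout not handled properly",
  "issue_type": "bug",
  "severity": "high",
  "root_cause_analysis": {
    "primary_cause": "Missing timeout exception handling in auth module",
    "related_code_locations": [
      {"file_path": "src/auth/login.py", "line_number": 45,
       "function_name": "authenticate_user"}
    ]
  },
  "confidence_score": 0.85
}
"""


def labels_for(analysis: dict) -> List[str]:
    """Derive issue labels from the analysis (label scheme is an assumption)."""
    return [
        f"type:{analysis['issue_type']}",
        f"severity:{analysis['severity']}",
        "gemini-analyzed",
    ]


def locations_for(analysis: dict) -> List[str]:
    """Format related code locations as path:line (function) strings."""
    locations = analysis["root_cause_analysis"].get("related_code_locations", [])
    return [
        f"{loc['file_path']}:{loc['line_number']} ({loc['function_name']})"
        for loc in locations
    ]


analysis = json.loads(raw)
print(labels_for(analysis))
print(locations_for(analysis))
```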

Testing

The project includes a comprehensive test suite to ensure code quality and reliability.

Continuous Integration

Automated quality checks run on every pull request and push to main via GitHub Actions.

CI Workflow (`ci.yml`) - All-in-One Status Check

The main CI workflow combines all checks into a single unified status:

Unit Tests (Matrix Strategy)

Runs on Python 3.11, 3.12, and 3.13
All versions run in parallel
fail-fast: false - All Python versions complete even if one fails

Lint Checks

Black: Code formatting validation
isort: Import sorting validation
Flake8: Code quality linting
Blocking: PRs cannot merge if linting fails

All Checks Pass Job

Single unified status check
Only passes if all unit tests AND linting succeed
Perfect for branch protection rules

See .github/workflows/ci.yml for configuration details.

Running Tests Locally

# Run all tests
pytest tests/

# Run with verbose output
pytest tests/ -v

# Run with coverage report (optional, requires pytest-cov)
# pytest tests/ --cov=. --cov-report=html

# Run only unit tests (no API required)
pytest tests/ -m unit -v

# Run only integration tests (requires API key)
pytest tests/ -m integration -v

# Run specific test file
pytest tests/test_models.py -v

# Use the test runner script
python run_tests.py

Running Linting Checks Locally

Before pushing code, run these checks locally:

# Install linting tools
pip install black isort flake8 flake8-docstrings flake8-bugbear

# Auto-fix formatting issues
black .
isort .

# Check formatting without fixing
black --check --diff .
isort --check-only --diff .

# Run flake8 linting
flake8 . --max-line-length=127 --extend-ignore=E203,W503

# Auto-format, then lint, in one command
black . && isort . && flake8 .

Test Organization

tests/
├── __init__.py                       # Package initialization
├── conftest.py                       # Pytest configuration & fixtures
├── test_models.py                    # Tests for data models
├── test_gemini_analyzer.py           # Tests for Gemini analyzer
├── test_duplicate_analyzer.py        # Tests for Gemini duplicate detection
└── test_cosine_duplicate_analyzer.py # Tests for cosine similarity analyzer

Test Features

Comprehensive test coverage for all major functionality
Fixtures: Reusable test data and setup
Markers: Categorize tests by type (unit, integration, slow)
Unit tests: No API key required, fast execution
Integration tests: Require GEMINI_API_KEY for full functionality
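
As a rough illustration of how the markers and fixtures listed above might fit together, here is a small sketch. The fixture name `sample_issue`, its fields, and both test functions are hypothetical and do not reflect the actual contents of `conftest.py`. The `skipif` guard is likewise an assumption about how an integration test could be gated on `GEMINI_API_KEY`. The `unit` and `integration` markers are assumed to be registered in `pytest.ini`.

```python
import os

import pytest


@pytest.fixture
def sample_issue():
    """Reusable test data; the field names are illustrative only."""
    return {"title": "Login fails with 500 error", "description": "Steps to reproduce..."}


@pytest.mark.unit
def test_issue_has_title(sample_issue):
    # Unit test: pure data check, no API key or network access needed.
    assert sample_issue["title"].strip()


@pytest.mark.integration
@pytest.mark.skipif(not os.getenv("GEMINI_API_KEY"), reason="GEMINI_API_KEY not set")
def test_reads_api_key():
    # Integration test: skipped automatically when the key is missing.
    assert os.environ["GEMINI_API_KEY"]
```

With tests marked this way, `pytest tests/ -m unit -v` selects only the first test, and `pytest tests/ -m integration -v` selects only the second.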

Project Structure

AI-Issue-Triage/
├── .github/
│   └── workflows/
│       ├── gemini-issue-analysis.yml  # (Example) Auto issue analysis workflow
│       └── ci.yml                     # Combined CI workflow (tests + lint)
│
├── utils/                      # 📦 Core Library Package
│   ├── __init__.py            # Package exports
│   ├── models.py              # Pydantic data models
│   ├── analyzer.py            # Main issue analyzer (Surgeon)
│   ├── librarian.py           # ✅ File identification analyzer (Librarian - Pass 1)
│   ├── pr_analyzer.py         # ✅ PR review analyzer
│   ├── duplicate/             # Duplicate detection module
│   │   ├── __init__.py
│   │   ├── gemini_duplicate.py    # ✅ AI-powered duplicate detection
│   │   └── cosine_duplicate.py    # 🚧 TF-IDF based detection (WIP)
│   └── security/              # Security module
│       ├── __init__.py
│       └── prompt_injection.py    # ✅ Prompt injection detection
│
├── cli/                        # 🖥️ Command-Line Tools
│   ├── __init__.py
│   ├── analyze.py             # ✅ Main CLI (ai-triage / Surgeon)
│   ├── duplicate_check.py     # ✅ Duplicate check CLI (ai-triage-duplicate)
│   ├── cosine_check.py        # 🚧 Cosine check CLI (ai-triage-cosine, WIP)
│   ├── pr_review.py           # ✅ PR review CLI
│   └── librarian.py           # ✅ Librarian CLI (Pass 1 - file identification)
│
├── ui/                         # 🎨 User Interface
│   ├── __init__.py
│   ├── streamlit_app.py       # 🚧 Streamlit web UI (WIP)
│   └── run_app.py             # Application runner
│
├── tests/                      # ✅ Comprehensive Test Suite
│   ├── __init__.py
│   ├── conftest.py            # Pytest configuration & fixtures
│   ├── test_models.py         # Data models tests
│   ├── test_gemini_analyzer.py        # Analyzer tests
│   ├── test_duplicate_analyzer.py     # Duplicate detection tests
│   ├── test_cosine_duplicate_analyzer.py  # Cosine similarity tests
│   └── test_pr_analyzer.py    # ✅ PR analyzer tests
│
├── cutlery/                    # 🚀 Quick Start Resources
│   ├── QUICKSTART.md          # Complete setup guide
│   ├── workflows/             # GitHub Actions workflow templates
│   │   ├── gemini-issue-analysis.yml           # ✅ Auto: Single issue
│   │   ├── gemini-labeled-issue-analysis.yml   # ✅ Label: Single issue
│   │   ├── ai-bulk-issue-analysis.yml          # ✅ Auto: Bulk issues
│   │   ├── ai-bulk-labeled-issue-analysis.yml  # ✅ Label: Bulk issues
│   │   ├── ai-pr-review.yml                    # ✅ PR review (label-triggered)
│   │   └── ai-lib-triage.yml                   # ✅ Two-Pass Architecture (Librarian+Surgeon)
│   ├── triage.config.json     # Example configuration
│   └── samples/               # Sample files for testing
│
├── Configuration Files
│   ├── setup.py               # Package installation config
│   ├── requirements.txt       # Python dependencies
│   ├── pytest.ini             # Pytest configuration
│   ├── pyproject.toml         # Black, isort configuration
│   ├── .flake8                # Flake8 linting configuration
│   ├── pr_prompt_config.yml   # ✅ PR review prompt configuration
│   └── env_example.txt        # Environment variables template
│
└── Documentation & Samples
    ├── README.md              # This documentation
    ├── run_tests.py           # Test runner with options
    ├── sample_issue.txt       # Example issue for testing
    └── sample_issues.json     # Sample issues data

Legend:
✅ = Stable and ready for production use
🚧 = Work in progress, use with caution
🚀 = Recommended starting point
📦 = Pip-installable package
🖥️ = Command-line interface
🎨 = Web interface

Contributing

We welcome contributions! Please follow these steps:

Development Workflow

Fork and clone the repository

git clone https://github.com/YOUR_USERNAME/AI-Issue-Triage.git
cd AI-Issue-Triage

Create a feature branch

git checkout -b feature/your-feature-name

Install dependencies

pip install -r requirements.txt
pip install black isort flake8 pytest

Make your changes and format code

# Auto-format code
black .
isort .

# Check linting
flake8 .

Add tests for new functionality
- Add unit tests in tests/
- Run tests locally: pytest tests/ -m unit -v

Run CI checks locally

# Run all unit tests
pytest tests/ -m unit -v

# Check formatting
black --check .
isort --check-only .
flake8 .

Commit and push

git add .
git commit -m "Description of your changes"
git push origin feature/your-feature-name

Submit a pull request
- CI will automatically run tests and linting
- All checks must pass before merging
- The "CI / All Checks Pass" status must be green

Code Standards

Python versions: Must support 3.11, 3.12, and 3.13
Formatting: Use black with a 127-character line length
Import sorting: Use isort with black-compatible settings
Linting: Must pass flake8 checks
Testing: Add tests for new features
Documentation: Update README for significant changes

License

This project is licensed under the Apache License 2.0.

Support

For issues and questions:

Check the existing issues
Create a new issue with a detailed description
Include your environment details and error messages

📚 Additional Resources

QUICKSTART Guide - Complete setup guide for GitHub Actions workflows
Test Examples - Sample files for testing and configuration
Workflow Templates - Ready-to-use GitHub Actions workflows

Note: This tool requires a valid Google Gemini API key. Usage may incur costs based on Google's pricing for the Gemini API.
