Skip to content

ridash2005/GNN-based-Fraud-Detection

Repository files navigation

GraphGE: Uncertainty-Aware Fraud Detection with GraphSAGE

Python PyTorch PyG Scikit-Learn Pandas License: MIT

An implementation of Graph Neural Networks for Bitcoin transaction fraud detection featuring principled uncertainty quantification using Monte Carlo Dropout.

Overview

This project implements a GraphSAGE-based fraud detection system on the Elliptic Bitcoin Dataset. It prioritizes reliable uncertainty estimation for high-stakes financial applications, achieving well-calibrated predictions (ECE < 0.05) through epistemic and aleatoric uncertainty decomposition.

Key Features

  • Bayesian Uncertainty Estimation: Monte Carlo Dropout (T=30) for robust prediction intervals.
  • Class Imbalance Mitigation: Inverse frequency weighting (7.63x) for the fraud class.
  • Temporal Analysis: Detection of distribution drift through time-series uncertainty monitoring.
  • Rigorous Evaluation: Comprehensive ablations for dropout rates, hidden dimensions, and feature engineering.
  • Model Calibration: Calibration curves and risk-coverage analysis for selective prediction.

Architecture

GraphSAGE Model:
├── 2 Graph Convolutional Layers (64 hidden dims)
├── Dropout (p=0.5) for uncertainty quantification
├── RobustScaler preprocessing
└── Node degree features (in/out-degree)

The model utilizes Negative Log-Likelihood Loss with class weights, achieving an F1-score of 0.42 and PR-AUC of 0.40 after threshold tuning.

Results

Performance Metrics

Metric Value Interpretation
F1 Score 0.4209 +8.9% improvement via post-hoc threshold optimization
PR-AUC 0.3979 Effective handling of 7.6:1 class imbalance
ECE 0.0450 Well-calibrated confidence estimates
Entropy-AUC 0.1400 Strong separation of correct/incorrect predictions by uncertainty

Ablation Studies

  • Dropout Rate: An optimal rate of 0.2 provides the best balance between uncertainty quantification and regularization.
  • Hidden Dimensions: 64-dimensional layers provide optimal capacity without overfitting.
  • Feature Engineering: Degree features enhance F1-score by 3% and significantly improve uncertainty separation.

Installation

# Clone repository
git clone https://github.com/ridash2005/GNN-Based-Fraud-Detection.git
cd GNN-Based-Fraud-Detection

# Install dependencies
pip install torch torch-geometric scikit-learn pandas numpy matplotlib seaborn

Dataset

The Elliptic Bitcoin Dataset consists of:

  • 203,769 Bitcoin transactions (nodes)
  • 166-dimensional node features
  • Temporal graph structure (49 time steps)
  • Binary labels: licit (0) vs illicit (1)

Project Structure

GNN-Based-Fraud-Detection/
├── graphge/
│   ├── src/
│   │   ├── load_data.py          # Data loading utilities
│   │   ├── models.py             # GraphSAGE implementation
│   │   └── uncertainty.py        # MC Dropout functions
│   └── results/
│       ├── metrics.csv           # Performance logs
│       └── figures/              # Calibration and ablation plots
├── GNN_Fraud_Detection_Pipeline.ipynb     # Primary implementation pipeline
├── Uncertainty_Quantification_Study.ipynb # Detailed Bayesian UQ analysis
├── Extended_Experimental_Ablations.ipynb  # Comprehensive ablation experiments
├── Detailed_Report.md                     # Technical experimental report
└── README.md

Citation

@software{graphge2025,
  author = {Rickarya Das},
  title = {GraphGE: Uncertainty-Aware Fraud Detection with GraphSAGE},
  year = {2025},
  url = {https://github.com/ridash2005/GNN-Based-Fraud-Detection}
}

References

  • Hamilton et al. (2017) - "Inductive Representation Learning on Large Graphs"
  • Gal & Ghahramani (2016) - "Dropout as a Bayesian Approximation"
  • Weber et al. (2019) - "Anti-Money Laundering in Bitcoin: Experimenting with Graph Convolutional Networks"

License

MIT License - see LICENSE file for details.

Contact

Rickarya Das - GitHub

About

GraphGE: Uncertainty-aware fraud detection on Bitcoin transactions using GraphSAGE. Implements Bayesian uncertainty quantification via Monte Carlo Dropout, class-imbalance mitigation, and selective prediction with calibrated probabilities (ECE < 0.05).

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors