
πŸ›‘οΈ SentinEL: Adversarial ML Robustness Research

Lead Researcher & ML Architect: Mourya Reddy Udumula Operations & Pipeline Lead: Jeet Anand Upadhyaya Presented at: Indrashil University Research Symposium, January 2026


## 🧠 What Is SentinEL?

SentinEL is an adversarial ML robustness research platform for phishing URL detection. The central finding of this research is that standard Random Forest classifiers, even those achieving 97%+ accuracy on clean data, can be systematically bypassed using Unicode character-encoding manipulation.

The project quantifies this robustness gap, builds a production-grade detection pipeline with explainable verdicts, and evaluates alternative explainability methods under real SOC latency constraints.

**Research question:** Can pattern-matching ML detectors defend against character-encoding manipulation, or do they learn superficial string features rather than semantic intent? **Answer:** They learn superficial patterns; the 15.8-percentage-point robustness gap is the evidence.


## 📊 Key Research Metrics

| Metric | Value |
| --- | --- |
| Baseline accuracy (clean data) | 97.2% |
| Accuracy under Cyrillic homoglyph attacks | 81.4% |
| Robustness gap | 15.8 percentage points |
| Native XAI attribution latency | < 2 ms |
| SHAP / LIME latency (rejected alternative) | 50–80 ms |
| Feature dimensions | 17 |
| ROC-AUC | 1.00 |

The 15.8-point degradation is not a model training failure; it is a fundamental property of pattern-matching classifiers, which cannot reason about attacker intent, only character sequences.


## 🔬 Research Methodology

### Stage 1: Baseline Model

A Random Forest classifier was trained on a standard phishing dataset using 17 hand-engineered URL features (domain age, entropy, special character ratios, TLD risk scores, WHOIS forensics, etc.). Baseline accuracy: 97.2%.
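The Stage 1 setup can be sketched as follows. This is a minimal illustration, assuming a scikit-learn Random Forest; the feature matrix and labels are synthetic stand-ins, not the project's real 17-feature schema or dataset:

```python
# Illustrative baseline: a Random Forest over a 17-dimensional feature
# matrix. Data and the label rule are toy placeholders, not the project's
# actual phishing dataset or feature definitions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000
X = rng.random((n, 17))                      # stand-in for 17 forensic features
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)    # toy label rule for the demo

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"clean accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
```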

### Stage 2: Adversarial Attack Design

Cyrillic homoglyph substitution attacks were designed around the same Unicode ambiguity that real phishing actors exploit. Latin characters (e.g., a, e, o, p, c) are replaced with visually identical Cyrillic equivalents (e.g., а, е, о, р, с). The URL is visually indistinguishable to a human but has entirely different byte-level features.
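The substitution can be sketched with a small lookup table. The mapping below is an illustrative subset, not the project's full `homoglyph_map.py` table:

```python
# Minimal sketch of the Stage 2 attack: swap Latin characters for
# visually identical Cyrillic ones. Illustrative subset only.
LATIN_TO_CYRILLIC = {
    "a": "\u0430",  # а  CYRILLIC SMALL LETTER A
    "e": "\u0435",  # е  CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # о  CYRILLIC SMALL LETTER O
    "p": "\u0440",  # р  CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # с  CYRILLIC SMALL LETTER ES
}

def homoglyph_attack(url: str) -> str:
    """Replace mapped Latin letters with their Cyrillic look-alikes."""
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in url)

spoofed = homoglyph_attack("paypal.com")
print(spoofed)                   # renders like "paypal.com" to a human eye
print(spoofed == "paypal.com")   # False: different underlying code points
```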

### Stage 3: Robustness Evaluation

The trained model was evaluated against the adversarial corpus. Accuracy fell from 97.2% to 81.4%, a 15.8-percentage-point degradation, confirming the classifier had learned string-level patterns rather than semantic structure.
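The Stage 3 measurement reduces to comparing one model's accuracy on paired clean and perturbed inputs. The toy data and the feature-level perturbation below are stand-ins for the homoglyph corpus:

```python
# Sketch of a robustness-gap measurement: same model, same labels,
# clean features vs. adversarially perturbed features. Toy data only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def robustness_gap(model, X_clean, X_adv, y) -> float:
    """Accuracy drop, in percentage points, from clean to adversarial input."""
    acc_clean = accuracy_score(y, model.predict(X_clean))
    acc_adv = accuracy_score(y, model.predict(X_adv))
    return 100.0 * (acc_clean - acc_adv)

# Demo: randomizing the feature the model relies on degrades accuracy.
rng = np.random.default_rng(0)
X = rng.random((500, 3))
y = (X[:, 0] > 0.5).astype(int)
model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
X_adv = X.copy()
X_adv[:, 0] = rng.random(500)    # stand-in for a homoglyph perturbation
print(f"gap: {robustness_gap(model, X, X_adv, y):.1f} points")
```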

### Stage 4: Explainability Framework

SHAP and LIME were evaluated as explainability methods but rejected due to 50–80 ms inference overhead, which is incompatible with SOCs that process 500+ alerts/hour and require real-time verdicts. A native attribution method using Gini importance weights was implemented instead, achieving <2 ms latency while producing human-readable forensic justifications.
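A minimal sketch of the native attribution idea, using scikit-learn's precomputed `feature_importances_` (Gini importances) so no per-URL explanation model is fit. The feature names are hypothetical, and a real per-verdict explanation would presumably also weigh the scanned URL's own feature values:

```python
# Native attribution sketch: rank verdict drivers by the forest's
# precomputed Gini importances instead of running SHAP/LIME per URL.
# Feature names and data are illustrative, not the project's schema.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["domain_age_days", "url_entropy", "special_char_ratio"]  # hypothetical

rng = np.random.default_rng(0)
X = rng.random((400, len(FEATURES)))
y = (X[:, 1] > 0.5).astype(int)   # in this toy, only url_entropy matters
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

def top_reasons(model, names, k=2):
    """k most important features by Gini importance (no per-URL cost)."""
    order = np.argsort(model.feature_importances_)[::-1][:k]
    return [(names[i], round(float(model.feature_importances_[i]), 3))
            for i in order]

print(top_reasons(clf, FEATURES))
```

Because the importances are computed once at training time, turning them into a ranked reason list is a dictionary lookup, which is why sub-2 ms latency is achievable.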


πŸ—οΈ System Architecture

```
Input URL
    │
    ▼
┌──────────────────────────────────────────────────────┐
│                  Triage Pipeline                     │
│                                                      │
│  Stage 1: Reputation Allowlist                       │
│    └── Known-safe domains → LEGITIMATE (instant)     │
│                                                      │
│  Stage 2: Forensic Heuristics                        │
│    └── WHOIS age, entropy, TLD, IP-in-URL checks     │
│                                                      │
│  Stage 3: Random Forest Classifier (17 features)     │
│    └── Probability score → Verdict                   │
│                                                      │
│  Stage 4: Native XAI Attribution (<2 ms)             │
│    └── Gini importance → Human-readable reasons      │
└──────────────────────────────────────────────────────┘
    │
    ▼
Verdict: LEGITIMATE / SUSPICIOUS / PHISHING + Reasons
```
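The staged flow can be sketched as a single triage function. The allowlist contents, verdict thresholds, and the `score_fn` stub are illustrative assumptions, not the project's actual configuration:

```python
# Sketch of the four-stage triage flow. Thresholds (0.5 / 0.8), the
# allowlist, and the scoring stub are hypothetical values for the demo.
ALLOWLIST = {"example.com"}  # hypothetical known-safe domains

def triage(url: str, domain: str, score_fn) -> tuple[str, list[str]]:
    # Stage 1: reputation allowlist short-circuits known-safe domains.
    if domain in ALLOWLIST:
        return "LEGITIMATE", ["domain on reputation allowlist"]
    # Stages 2-3: heuristics feed the classifier, yielding a probability.
    p = score_fn(url)
    # Stage 4: map probability to a verdict; reasons come from attribution.
    if p >= 0.8:
        return "PHISHING", [f"model probability {p:.2f}"]
    if p >= 0.5:
        return "SUSPICIOUS", [f"model probability {p:.2f}"]
    return "LEGITIMATE", [f"model probability {p:.2f}"]

print(triage("https://example.com/login", "example.com", lambda u: 0.9))
```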

## 📂 Engineering Attribution

| Module | Lead | Core Technology |
| --- | --- | --- |
| classifier.py | Mourya Udumula | Random Forest, adversarial testing, XAI attribution |
| feature_extractor.py | Mourya Udumula | 17-dimensional forensic URL vectorization |
| Analytics / experiments/ | Mourya Udumula | ROC-AUC, confusion matrix, model calibration viz |
| Batch Triage / pipeline/ | Jeet Upadhyaya | Concurrent IOC ingestion pipeline |
| Audit Logs | Jeet Upadhyaya | Session event logging and export |
| Forensic heuristics | Jeet Upadhyaya | WHOIS forensics, domain age, IP reputation |

## 🖥️ Dashboard Features

### 🔍 Forensic Scanner

Live single-URL scan with full verdict breakdown. Enter any URL and receive a probability score, verdict classification (LEGITIMATE / SUSPICIOUS / PHISHING), and a list of forensic reasons derived from the XAI attribution layer. Analysts can flag false positives to build a session-level whitelist.

### 📦 Batch Triage

Upload or paste a list of URLs for concurrent batch evaluation. Results are returned with per-URL verdicts, probability scores, and downloadable CSV export. Built for SOC analysts processing bulk IOC feeds.

### 📊 Analytics

Model performance visualization: confusion matrix, ROC curve (AUC = 1.00), F1/Precision/Recall metrics. Displays the statistical validation underpinning the 97.2% baseline accuracy claim.

πŸ“ Audit Logs

Full session event log with timestamps, event categories (SCAN, BATCH, OVERRIDE, FEEDBACK, SYSTEM), and CSV download for offline analysis.


## 🔧 Installation & Quickstart

```bash
# Clone
git clone https://github.com/Maze-6/SentinEL-Adversarial-ML.git
cd SentinEL-Adversarial-ML

# Install dependencies
pip install -r requirements.txt

# Run tests
python -m pytest tests/ -v

# Run adversarial robustness evaluation
python experiments/adversarial_eval.py

# Run baseline evaluation
python experiments/baseline_eval.py

# Run batch evaluation on a CSV
python pipeline/batch_evaluate.py --csv data/url_list.csv
```

## 📦 Requirements

```text
streamlit
pandas
scikit-learn
matplotlib
seaborn
requests
dnspython
beautifulsoup4
python-whois
```

πŸ“ Repository Structure

SentinEL-Adversarial-ML/
β”œβ”€β”€ classifier.py            # Random Forest model + train/predict pipeline
β”œβ”€β”€ feature_extractor.py     # 17-feature URL vectorization (offline + enriched modes)
β”œβ”€β”€ explainer.py             # Native Gini XAI attribution (<2ms)
β”œβ”€β”€ config.py                # Model paths, thresholds, dataset config
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ adversarial/
β”‚   β”œβ”€β”€ homoglyph_map.py     # Cyrillic ↔ Latin character substitution map
β”‚   └── attack_generator.py  # Adversarial URL corpus generation
β”œβ”€β”€ experiments/
β”‚   β”œβ”€β”€ baseline_eval.py     # Clean-data evaluation (ROC-AUC, F1, FNR, FPR, 5-fold CV)
β”‚   β”œβ”€β”€ adversarial_eval.py  # Homoglyph attack robustness evaluation
β”‚   β”œβ”€β”€ explainability_bench.py  # Gini vs SHAP vs LIME latency comparison
β”‚   └── generate_graphs.py   # Publication-quality chart generation
β”œβ”€β”€ pipeline/
β”‚   └── batch_evaluate.py    # Single-URL and CSV batch evaluation CLI
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ dataset.csv          # Training dataset (phishing + legitimate URLs)
β”‚   └── build_dataset.py     # Dataset construction script (PhishTank + Tranco)
β”œβ”€β”€ tests/
β”‚   └── test_sentinEL.py     # 27-test pytest suite
β”œβ”€β”€ docs/
β”‚   └── SentinEL_Technical_Report.pdf
└── assets/
    β”œβ”€β”€ adversarial_detection.png
    β”œβ”€β”€ analytics_metrics.png
    └── ...

## 🔬 Key Research Findings

### Finding 1: Pattern-matching classifiers cannot generalize to adversarial inputs

The model learned that certain character sequences are associated with phishing. Cyrillic substitution changes those sequences while preserving visual appearance, bypassing learned patterns entirely. This is not a hyperparameter problem; it is a fundamental architectural limitation.

### Finding 2: SHAP and LIME are impractical for production SOC deployment

Both methods require inference times of 50–80 ms per URL. At that rate, explaining a bulk feed on the order of 500,000 URLs would consume roughly 7–11 hours of computation alone, and even single verdicts arrive too slowly for real-time triage. Native Gini attribution at <2 ms makes real-time deployment viable.
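As a sanity check on these figures, a quick latency-budget calculation can be run. The 500,000-URL feed size is an illustrative assumption, not a number from the project; the per-URL latencies are the benchmark figures above:

```python
# Back-of-envelope explainability budget at bulk-IOC scale.
# The 500,000-URL workload is a hypothetical illustration.
def explain_hours(n_urls: int, ms_per_url: float) -> float:
    """Total hours spent on attribution for n_urls at ms_per_url each."""
    return n_urls * ms_per_url / 1000.0 / 3600.0

for ms in (50, 80, 2):
    print(f"{ms:>2} ms/URL x 500,000 URLs = {explain_hours(500_000, ms):.1f} h")
```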

### Finding 3: Solving adversarial brittleness requires constraint-based reasoning, not better features

Adding more features trained on clean data will not close the robustness gap, because the attack operates at the Unicode level, below the abstraction layer the model reasons about. Closing this gap requires either constraint-based URL normalization at the feature extraction stage, or a reasoning layer that understands attacker intent.
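One concrete form of the normalization defense is confusable folding before feature extraction, in the spirit of Unicode TR #39 skeletons, so homoglyph variants collapse to the same string. The fold table below is a tiny illustrative subset, not a complete confusables mapping:

```python
# Sketch of a normalization defense: fold known confusable code points
# back to ASCII before feature extraction. Illustrative subset only;
# a real implementation would use the full Unicode confusables data.
CONFUSABLE_FOLD = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p", "\u0441": "c",
}

def normalize_url(url: str) -> str:
    """Map Cyrillic look-alikes back to their Latin counterparts."""
    return "".join(CONFUSABLE_FOLD.get(ch, ch) for ch in url)

# A homoglyph-spoofed domain folds back to its Latin original.
print(normalize_url("\u0440\u0430ypal.com") == "paypal.com")
```

Note that plain `unicodedata.normalize("NFKC", ...)` does not perform this folding; Cyrillic а and Latin a are distinct letters under all standard normalization forms, which is why an explicit confusables table (or an intent-aware reasoning layer) is needed.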

This finding motivates the proposed MSc thesis direction: integrating human analyst-labeled adversarial examples into a Case-Based Reasoning system for adaptive phishing detection.


## 📖 Technical Report

A 70-page technical report covering full methodology, statistical analysis, adversarial attack taxonomy, and explainability framework evaluation is available in docs/SentinEL_Technical_Report.pdf.


## 🔗 Related Project

**VaultZero**: Fault-tolerant distributed storage with threshold cryptography


Senior capstone research, Indrashil University. Contact: mouryaudumula@gmail.com
