# Adversarial ML Tester

Comprehensive testing framework for ML model robustness against adversarial content attacks
```
╔══════════════════════════════════════════════════════════════╗
║  "Testing the boundaries so the boundaries don't break you"  ║
╚══════════════════════════════════════════════════════════════╝
```
## Quick Start

```bash
# Clone and install
git clone https://github.com/franksx/adversarial-ml-tester.git
cd adversarial-ml-tester
pip install -r requirements.txt

# Generate adversarial test content
python -m adversarial_ml_tester generate -c 1000 --verbose

# Test your model's robustness
python -m adversarial_ml_tester test -m http://your-model-api.com/predict

# Validate responses
python -m adversarial_ml_tester validate -i responses.json
```

## Attack Vectors

| Attack | Description | Example |
|---|---|---|
| Homoglyph | Cyrillic/Latin confusion | аdmin vs admin |
| Invisible | Zero-width characters | `user\u200Bname` |
| ZWJ | Zero-width joiner | `f\u200Dr\u200Da\u200Dn\u200Dk` |
| RTL | Right-to-left override | `\u202Eresu\u202D` |
| Case | Random case | UsErNaMe |
| Leet | 1337 speak | 4dm1n |
| Glitch | Combining marks | a̷d̷m̷i̷n̷ |
| Punycode | IDN homographs | xn--admin-wmc |
| Emoji | Emoji injection | user🦊name |
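The first three attack classes in the table can be reproduced in a few lines of Python. This is an illustrative sketch; the helper names (`to_homoglyph`, `insert_zero_width`, `weave_zwj`) are not part of this package's API:

```python
# Sketch of three perturbations from the table above: homoglyph
# substitution, zero-width insertion, and zero-width-joiner weaving.
# Helper names are illustrative, not this package's actual API.

# Latin -> Cyrillic homoglyph map (a small sample)
HOMOGLYPHS = {"a": "\u0430", "o": "\u043e", "e": "\u0435", "i": "\u0456"}

def to_homoglyph(text: str) -> str:
    """Replace Latin letters with visually identical Cyrillic ones."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text)

def insert_zero_width(text: str, pos: int) -> str:
    """Insert a zero-width space (U+200B) at the given index."""
    return text[:pos] + "\u200b" + text[pos:]

def weave_zwj(text: str) -> str:
    """Join every character with a zero-width joiner (U+200D)."""
    return "\u200d".join(text)

if __name__ == "__main__":
    print(to_homoglyph("admin"))            # renders like "admin", bytes differ
    print(repr(insert_zero_width("username", 4)))
    print(repr(weave_zwj("frank")))
```

The perturbed strings are visually identical (or nearly so) to the originals, which is exactly what defeats naive string matching in downstream models.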
## Validation Checks

- ✅ PII Detection - Identifies personal information leakage
- ✅ Injection Detection - XSS/script injection attempts
- ✅ Encoding Validation - Suspicious encoding detection
- ✅ Prompt Leakage - System prompt exposure detection
- ✅ Consistency Check - Output consistency verification
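A minimal version of the injection, PII, and encoding checks might look like the following. The regex patterns and function names here are illustrative assumptions, not the validator's actual rules:

```python
import re

# Illustrative patterns; the real validator's rules may be stricter.
SCRIPT_RE = re.compile(r"<\s*script\b|javascript:", re.IGNORECASE)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def flags_injection(text: str) -> bool:
    """Flag likely XSS/script-injection content."""
    return bool(SCRIPT_RE.search(text))

def flags_pii(text: str) -> bool:
    """Flag likely PII leakage (here: email addresses only)."""
    return bool(EMAIL_RE.search(text))

def has_suspicious_encoding(text: str) -> bool:
    """Flag zero-width and bidi-override characters."""
    return any(ch in "\u200b\u200c\u200d\u202d\u202e" for ch in text)
```

In practice a production validator layers many such rules; the point is that each check reduces to a cheap predicate over the model's output text.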
## Example Output

Generated adversarial profile:

```json
{
  "username": "аdmin\u200bistrator",
  "first_name": "Jоhn",
  "last_name": "Smіth",
  "address": "123 Mаin St, New Yоrk",
  "description": "Hi, I'm Jоhn. I love cоding...",
  "attack_vectors": ["homoglyph", "invisible"],
  "byte_hash": "a3f9e2b8c1d4e5f6"
}
```

Robustness test report:

```
Total: 6 tests
✅ Passed: 4
❌ Failed: 1
⚠️ Warnings: 1
Average Score: 0.82

homoglyph_robustness: pass (score: 0.85)
invisible_character_handling: pass (score: 0.90)
case_sensitivity: warning (score: 0.60)
prompt_injection_resistance: fail (score: 0.45)
length_boundary_handling: pass (score: 0.95)
encoding_robustness: pass (score: 0.88)
```
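The pass/warning/fail statuses in a report like the one above are consistent with simple score thresholds. The cut-offs of 0.7 and 0.5 below are an assumption inferred from the sample output, not documented behavior:

```python
def status_for(score: float, pass_at: float = 0.7, warn_at: float = 0.5) -> str:
    """Map a robustness score to a report status (assumed thresholds)."""
    if score >= pass_at:
        return "pass"
    if score >= warn_at:
        return "warning"
    return "fail"

scores = {
    "homoglyph_robustness": 0.85,
    "invisible_character_handling": 0.90,
    "case_sensitivity": 0.60,
    "prompt_injection_resistance": 0.45,
    "length_boundary_handling": 0.95,
    "encoding_robustness": 0.88,
}
for name, score in scores.items():
    print(f"{name}: {status_for(score)} (score: {score:.2f})")
```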
## Installation

```bash
git clone https://github.com/yourusername/adversarial-ml-tester.git
cd adversarial-ml-tester
pip install -r requirements.txt
pip install -e .
```

## Usage

```bash
# Generate adversarial profiles
python -m adversarial_ml_tester generate -c 100 -o profiles.json

# Test model robustness
python -m adversarial_ml_tester test -m http://api.example.com/predict

# Validate responses
python -m adversarial_ml_tester validate -i responses.json

# Fuzzing mode
python -m adversarial_ml_tester fuzz --verbose -o findings.json

# Generate report
python -m adversarial_ml_tester report -o report.json
```

## Python API

```python
from generators.content_generator import ContentGenerator
from adversarial.robustness_tester import RobustnessTester
from validators.response_validator import ContentValidator

# Generate content
gen = ContentGenerator(seed=42)
profile = gen.generate_profile()

# Test robustness
def my_model(text):
    return {"prediction": "class_1", "confidence": 0.95}

tester = RobustnessTester(my_model)
results = tester.run_full_suite("test input")

# Validate responses (model_output holds responses collected from your model)
validator = ContentValidator()
reports = validator.validate_all(model_output)
```

## Testing

```bash
# Run unit tests
python tests/test_suite.py

# Run example demos
python scripts/examples.py

# Generate and test
python -m adversarial_ml_tester generate -c 100
python -m adversarial_ml_tester test
```

## Project Structure

```
adversarial_ml_tester/
├── generators/              # Content generation
│   └── content_generator.py
├── adversarial/             # Robustness testing
│   └── robustness_tester.py
├── validators/              # Response validation
│   └── response_validator.py
├── tests/                   # Unit tests
│   └── test_suite.py
├── docs/                    # Documentation
│   ├── USAGE_GUIDE.md
│   └── ATTACK_REFERENCE.md
├── scripts/                 # Examples
│   └── examples.py
├── __main__.py              # CLI entry
├── README.md                # This file
├── requirements.txt         # Dependencies
├── setup.py                 # Package setup
└── LICENSE                  # MIT License
```
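As a self-contained illustration of what a `run_full_suite`-style call does conceptually (the names, perturbations, and scoring here are assumptions for the sketch, not this package's implementation), a robustness tester perturbs an input, re-queries the model, and scores how often the prediction survives:

```python
from typing import Callable, Dict

def run_suite(model: Callable[[str], dict], text: str) -> Dict[str, float]:
    """Score prediction stability under simple perturbations (a sketch)."""
    perturbations = {
        "homoglyph": text.replace("a", "\u0430"),   # Latin a -> Cyrillic а
        "zero_width": "\u200b".join(text),          # zero-width spaces woven in
        "case": text.swapcase(),                    # random-case proxy
    }
    baseline = model(text)["prediction"]
    return {
        name: 1.0 if model(variant)["prediction"] == baseline else 0.0
        for name, variant in perturbations.items()
    }

def my_model(text: str) -> dict:
    # Stub model: ignores its input, so every perturbation scores 1.0.
    return {"prediction": "class_1", "confidence": 0.95}

print(run_suite(my_model, "test input"))
```

A real model whose tokenizer is confused by zero-width characters would score below 1.0 on that axis, which is exactly the signal the test suite surfaces.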
## Ethical Use

✅ Appropriate Use:
- Testing your own ML models
- Security research with permission
- Educational purposes
- Improving model robustness

❌ Inappropriate Use:
- Attacking systems without authorization
- Generating harmful content
- Bypassing security controls
- Impersonating real users
## Documentation

- [Usage Guide](docs/USAGE_GUIDE.md) - Detailed usage instructions
- [Attack Reference](docs/ATTACK_REFERENCE.md) - Complete attack documentation
- Package Summary - Package overview
## Contributing

Contributions welcome! Please:

- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request

## License

MIT License - see LICENSE file
---

**13th Hour Productions**

*"Testing the boundaries so the boundaries don't break you"*

**Note**: This tool is designed for defensive security testing. Use responsibly and only on systems you own or have explicit permission to test.