GitHub - wkusnierczyk/redoxide: High-performance, modular, extensible LLM Red Teaming tool written in Rust.

RedOxide is a high-performance, modular, and extensible LLM Red Teaming tool written in Rust. It is designed to evaluate the safety and robustness of Large Language Models (LLMs) by simulating various adversarial attacks.

Note

The redoxide crate is available from crates.io.
The redoxide documentation is available on docs.rs.
The redoxide readme on GitHub pages.

Introduction

RedOxide mimics the architecture of professional security tools but remains lightweight and completely open-source. It supports:

Concurrency: Uses tokio streams to run parallel attacks for high throughput.
Modularity: Plug-and-play architecture for new Attack Strategies and Evaluation logic.
LLM Judge: Optional integration with GPT-4 to grade the safety of responses more accurately than simple keyword matching.

What is Red Teaming?

Red Teaming in the context of AI involves actively attempting to "break" or bypass the safety filters of an LLM. The goal is to elicit harmful, unethical, or illegal responses (e.g., bomb-making instructions, hate speech) to identify vulnerabilities before bad actors do.

Popular References:

Lakera Red Team: A leading commercial platform for AI security.
Garak: An open-source LLM vulnerability scanner (Python-based).
AdvBench: A dataset of adversarial prompts used for academic benchmarks.

RedOxide provides a Rust-native alternative that focuses on speed and developer extensibility.

Project Structure

The codebase is organized as a library with a CLI wrapper, enabling you to use it as a standalone tool or import its modules into your own Rust applications.

red_oxide/
├── Cargo.toml            # Dependencies and Package info
├── .github/              # CI/CD Workflows
├── src/
│   ├── lib.rs            # Library entry point & Error types
│   ├── main.rs           # CLI application logic
│   ├── target.rs         # LLM Interface (OpenAI, Local models)
│   ├── strategy.rs       # Attack generators (Jailbreaks, Obfuscation)
│   ├── evaluator.rs      # Grading logic (Keywords, LLM Judge)
│   └── runner.rs         # Async engine using Tokio streams
└── tests/
    └── integration.rs    # Full pipeline tests using Mock Targets

Installation & Build

Prerequisites

Rust (latest stable)
An OpenAI API Key (exported as OPENAI_API_KEY)

Build

# Clone the repository
git clone https://github.com/wkusnierczyk/redoxide.git
cd redoxide

# Build release binary
cargo build --release

Usage Guide

Run the tool using cargo run or the compiled binary.

Scanning

The primary command is scan. By default, it runs a basic jailbreak test against gpt-3.5-turbo.

export OPENAI_API_KEY=<your-api-key>

# Run a basic scan
cargo run -- scan

Command Line Options

Option	Short	Default	Description
`--model`	`-m`	`gpt-3.5-turbo`	The target model ID to attack.
`--file`	`-f`	`None`	Path to a file containing prompts (one per line).
`--strategy`	`-s`	`jailbreak`	The attack strategy (`jailbreak`, `splitting`, `research`).
`--use-judge`		`false`	Use GPT-4 as a judge (more accurate, costs $).
`--concurrency`	`-c`	`5`	Number of parallel requests to run.
`--output`	`-o`	`report.json`	Filename for the JSON results.

Examples:

# Attack using a file of prompts with the "Payload Splitting" strategy
cargo run -- scan --file attacks/simple.txt --strategy splitting

# Use GPT-4 as a judge for higher accuracy (slower/costlier)
cargo run -- scan --use-judge --model gpt-4

Strategies

Jailbreak: Wraps prompts in templates like "DAN" (Do Anything Now) or fictional storytelling frames.
Splitting: Obfuscates keywords (e.g., "B-O-M-B") to bypass simple blocklists.
Research: Frames the malicious request as a theoretical or educational inquiry.

Note
You can add your strategy by implementing the Strategy trait in src/strategy.rs.

Evaluators

Keyword (Default): Fast and free. Checks for refusal phrases like "I cannot", "As an AI".
LLM Judge: Uses a separate LLM call to analyze the response contextually. Recommended for production use.

Testing & Benchmarking

RedOxide includes a comprehensive suite of tests and benchmarks.

Run Unit & Integration Tests

cargo test

Unit Tests: Verify logic in strategy.rs and evaluator.rs.
Integration Tests: Run the full pipeline against a Mock Target to verify the Runner without network costs.

Run Performance Benchmarks

We use Criterion to measure the overhead of the async runner.

cargo bench

Contributing

Fork the repository.
Create a feature branch (git checkout -b feature/amazing-attack).
Commit your changes.
Push to the branch.
Open a Pull Request.

Please ensure cargo test and cargo clippy pass before submitting.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
.github		.github
attacks		attacks
benches		benches
graphics		graphics
src		src
tests		tests
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Repository files navigation

Contents

Introduction

What is Red Teaming?

Project Structure

Installation & Build

Prerequisites

Build

Usage Guide

Scanning

Command Line Options

Strategies

Evaluators

Testing & Benchmarking

Run Unit & Integration Tests

Run Performance Benchmarks

Contributing

About

Uh oh!

Releases 3

Sponsor this project

Uh oh!

Packages

Uh oh!

Languages

Uh oh!

License

wkusnierczyk/redoxide

Folders and files

Latest commit

History

Repository files navigation

Contents

Introduction

What is Red Teaming?

Project Structure

Installation & Build

Prerequisites

Build

Usage Guide

Scanning

Command Line Options

Strategies

Evaluators

Testing & Benchmarking

Run Unit & Integration Tests

Run Performance Benchmarks

Contributing

About

Topics

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Sponsor this project

Uh oh!

Packages 0

Uh oh!

Languages

Packages