promptfoo LLM red teaming and evaluation framework with CI/CD integration
Garak LLM vulnerability scanner
AI-Infra-Guard LLM vulnerability scanner with Web UI, REST APIs, and Dockerized
LLM Guard Security toolkit for LLM interactions
Agentic Security Security toolkit for AI agents
DeepTeam LLM red teaming framework (prompt injection, hallucination, data leaks, jailbreaks)
AI-Scanner AI model safety scanner built on NVIDIA garak
LLMmap Tool for mapping LLM vulnerabilities
LLaMator Framework for testing vulnerabilities of LLMs
Plexiglass Security toolbox for testing and safeguarding LLMs
Inkog AI agent security scanner (CLI + MCP server) detects prompt injection, SQLi via LLM
llm-security-scanner Prompt injection with OWASP LLM Top 10 and Turkish payloads
AgentBench: Benchmark to evaluate LLMs as agents
Agentic Radar Open-source CLI security scanner for agentic workflows
MCP Scanner Scan MCP servers for potential threats & security findings
Awesome MCP Security Curated list of MCP security resources
MCP Shield Security scanner for MCP servers
Invariant Trace analysis tool for AI agents
Agent Threat Rules Detection rule standard for AI agent threats (e.g., prompt injection and MCP attacks)
MCP Safety Scanner Automated MCP safety auditing and remediation using Agents
Agent Security Scanner MCP MCP server for scanning code for web vulnerabilities, prompt injection, and AI-hallucinated package detection
Tenuo Capability-based authorization for AI agents
Awesome LLM Agent Security LLM agent security resources, attacks, vulnerabilities
Armorer Guard Local Rust scanner for AI-agent prompt injection and dangerous tool-call context
Ziran Security testing framework for AI agents
MCPs-audit OWASP Security Scanner for MCP Servers
Agent Guard Runtime governance firewall for AI agents, policy enforcement, MCP tool scanning
PoisonedRAG Poisoned RAG systems
- RAG Attacks and Mitigations RAG attacks, mitigations, and defense strategies
- Awesome Jailbreak on LLMs - RAG Attacks RAG-based LLM attack techniques
Jailbreak LLMs: Real-world prompt jailbreak dataset (15k+ examples)
Awesome Jailbreak LLMs: Collection of jailbreak techniques, datasets, and defenses
Jailbreaking LLMs (PAIR): Black-box jailbreak generation via automatic prompt refinement
Prompt Fuzzer: Harden your GenAI applications
Open Prompt Injection: Evaluate prompt injection attacks and defenses on benchmark datasets
LLMFuzzer: Fuzzing framework for LLM prompt generation
Spikee: Prompt injection toolkit
Jailbreak Evaluation: Python package for language model jailbreak evaluation.
Shannon
Strix
PentAGI
PentestGPT
CAI
PentestAgent
Raptor
HackingBuddyGPT
Pentest-Swarm-AI Go-native agents to autonomously perform full-cycle pentests.
Pentest-Copilot
BreachSeek - PENA
Guardrails: Add structured validation and policy enforcement for LLMs
NeMo Guardrails: Protects against jailbreak and hallucinations with customizable rulesets
PurpleLlama: Tools to assess and improve LLM security from META
PyRIT: Python Risk Identification Tool for generative AI
LLM-Guard: Tool for securing LLM interactions (replaced rebuff)
LangKit: Functions for jailbreak detection, prompt injection, and sensitive information detection
Prompt Injection Defenses: Practical and proposed defenses against prompt injection
Vigil: Prompt injection detection toolkit and REST API for LLM security risk scoring
Plexiglass: Security tool for LLM applications
Last Layer: Low-latency pre-filter for prompt injection prevention
ShellWard: AI Agent security middleware
LocalMod: Self-hosted content moderation API with prompt injection detection, toxicity filtering, PII detection, and NSFW filtering
Veritensor: AI model scanner to detect Pickle/PyTorch malware, check licenses, and verify HF hashes
Tenuo: Capability tokens for AI agents with task-scoped TTLs, offline verification, and proof-of-possession binding
LLM Confidentiality: Tool for ensuring confidentiality in LLMs
Aigis: Zero-dependency Python firewall for AI agents. 180+ patterns across OWASP LLM Top 10, StruQ-style structured prompts, goal-conditioned FSM, RAG context filter, MCP 3-stage scanning, MemoryGraft defence, judge-manipulation detection. Multi-layer: 4 walls + L4-L7 capability/AEP/safety/FSM
TrustGate: Generative Application Firewall for GenAI Applications
OpenClaw Security Suite: Defensive security suite for AI agent workspaces (prompt injection, integrity verification, secret scanning, supply chain analysis)
Acgs-lite: Governance layer for AI agents that blocks unsafe actions before execution, enforces MACI separation of powers, and keeps tamper-evident audit trails
Prompt Shield: GitHub Action for detecting indirect prompt injection in CI/CD pipelines. 4-layer defense architecture
- AIDEFEND: Practical knowledge base for AI security defenses
- OWASP Agent Memory Guard: Reference implementation for ASI06 (Memory Poisoning). Runtime defense for LLM agent memory.
- JailbreakBench: Evaluating and analyzing jailbreak methods for LLMs
L1B3RT45: AI jailbreaking tools
Easy Jailbreak: Python framework to generate adversarial jailbreak prompts
PALLMs (Payloads for Attacking Large Language Models)
Lakera PINT Benchmark: Benchmark for prompt injection detection
LLM Hacking Database: Attacks against LLMs
- ThreatModels: Repository for LLM threat models
- Pangea Attack Taxonomy: Comprehensive taxonomy of AI/LLM attacks and vulnerabilities
- AI Risk Taxonomy
- AIR-Bench 2024
- Gandalf: Prompt injection wargame
Damn Vulnerable LLM Agent
PromptMe
LLM CV Screener
PwnzzAI OWASP LLM ToP 10 vulnerabilities
- PromptTrace: Prompt injection and AI security 10 labs + 15-level CTF with real LLMs.
CipherChat: Secure communication tool for LLMs
LLMs Finetuning Safety: Safety for fine-tuning LLMs
Visual Adversarial Examples: Jailbreaking LLMs with visual adversarial examples
FigStep: Jailbreaking vision-language models via typographic visual prompts
OWASP Agentic AI: OWASP Top 10 for Agentic AI
BrokenHill: Automated attack tool for GCG attack
Weak-to-Strong Generalization: Eliciting strong capabilities with weak supervision
AnyDoor: Arbitrary backdoor instances in LLMs
Image Hijacks: Image-based hijacks of LLMs
Imperio: Robust prompt engineering for anchoring LLMs
LMSanitator: Defending LLMs against stealthy prompt injection
Virtual Prompt Injection: Tool for virtual prompt injection
CBA: Consciousness-Based Authentication for LLM Security
PromptWare: PromptWares for GenAI-powered applications
MuScleLoRA: Multi-scenario backdoor fine-tuning of LLMs
TrojText: Trojan attacks on text classifiers
BadActs: Backdoor attacks via activation steering
Backdoor Attacks on Fine-tuned LLaMA: Backdoor attacks on fine-tuned LLaMA
- AI Security Explained: Short essential theoretical knowledge
- AI Agents for Pentest: Using agents for penetration testing
- Prompt Injection and Jailbreaking: Practical short lab studies
WhistleBlower: Infer the system prompt of an AI agent based on its generated text outputs.
- LLM Security startups
- LLM Security Problems at DEFCON31 Quals: The world's top security competition
- 0din GenAI Bug Bounty from Mozilla: GenAI models vulnerabilities (prompt injection, training data poisoning, DoS)
- Adversarial Prompting: Documentation
- OWASP Top 10 for LLMs: Official list of key LLM risks including prompt injection
- π¦ X: @llm_sec
- π¦ X: @SanderSchullhoff
- π Blog: LLM Security (by @llm_sec)
- π Blog: Embrace The Red
- π Blog: Simon Willison
- π° Newsletter: AI safety takes
- π° Newsletter & Blog: Hackstery
This repository is actively maintained as a fork of the original project. It includes pending contributions, removes broken links, and separates academic papers from other resources for better organization.
Contributions are always welcome. Please read the Contribution Guidelines before contributing.
Alternative: Awesome LLMSecOps