feat(redteam): add built-in red teaming support by kevmyung · Pull Request #184 · strands-agents/evals

kevmyung · 2026-03-31T23:14:30Z

Description

Adds built-in red teaming capabilities to strands-evals, enabling automated adversarial testing of AI agents.

Core components:

Attack presets (jailbreak, prompt_extraction, harmful_content): Pre-built actor profiles, goals, seed inputs, and per-preset evaluation metrics
Strategy system: Pluggable attack strategies separated from presets. Ships with gradual_escalation — an adaptive multi-turn strategy that analyzes target responses and pivots techniques dynamically
RedTeamJudgeEvaluator: Composite safety evaluator with 3 metrics (guardrail_breach, harmfulness, prompt_leakage). Dynamically builds judge prompts based on only the metrics relevant to each attack pattern
run_red_team() entry point: End-to-end orchestration — case generation, multi-turn attack simulation via ActorSimulator, and safety evaluation in a single call
Target-aware goal generation: Optional target_info parameter for LLM-generated attack goals tailored to the specific target system

Related Issues

Closes #177

Type of Change

New feature

Testing

I ran hatch run prepare
Unit tests for presets, runner, and judge evaluator (49 tests passing)
Integration tested against mock compliant target and Claude Haiku target

Checklist

I have read the CONTRIBUTING document
I have added any necessary tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
My changes generate no new warnings
Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

…gnment

…exit condition tuning

…pport - AttackStrategy ABC, RiskCategory, AttackGoal shared types - red_team() entry point with Agent auto-extraction and tool trace capture - AttackSuccessEvaluator with continuous 0.0-1.0 scoring - Strategy cross-product expansion and custom case injection - RedTeamReport with grouped views

poshinchen

Could you use built-in python | / list instead of typing's deprecated Union, List and so on?

kevmyung · 2026-05-01T20:27:02Z

Could you use built-in python | / list instead of typing's deprecated Union, List and so on?

Quick heads-up – fixed it in 438f9e0

kevmyung added 5 commits March 31, 2026 13:32

feat(redteam): add initial attack presets and system prompt template

d0cb839

feat(redteam): add runner helpers and RedTeamJudgeEvaluator

97bb0d3

feat(redteam): add target-specific goal generation and model type ali…

570dd28

…gnment

test(redteam): add unit tests for presets, runner, and judge evaluator

476814d

refactor(redteam): add evaluation metrics, dynamic judge prompt, and …

db33b7c

…exit condition tuning

kevmyung had a problem deploying to manual-approval March 31, 2026 23:14 — with GitHub Actions Failure

fix(redteam): resolve mypy type errors in runner

87cef42

kevmyung had a problem deploying to manual-approval March 31, 2026 23:21 — with GitHub Actions Failure

kevmyung requested a deployment to manual-approval April 25, 2026 00:17 — with GitHub Actions Waiting

kevmyung force-pushed the feat/red-team-foundation branch from 8d7d3f5 to c9f5845 Compare April 25, 2026 04:18

kevmyung requested a deployment to manual-approval April 25, 2026 04:19 — with GitHub Actions Waiting

poshinchen reviewed May 1, 2026

View reviewed changes

refactor(redteam): replace Union with built-in | syntax

438f9e0

kevmyung requested a deployment to manual-approval May 1, 2026 15:23 — with GitHub Actions Waiting

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(redteam): add built-in red teaming support#184

feat(redteam): add built-in red teaming support#184
kevmyung wants to merge 8 commits intostrands-agents:mainfrom
kevmyung:feat/red-team-foundation

kevmyung commented Mar 31, 2026 •

edited

Loading

Uh oh!

poshinchen left a comment

Uh oh!

kevmyung commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kevmyung commented Mar 31, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issues

Type of Change

Testing

Checklist

Uh oh!

poshinchen left a comment

Choose a reason for hiding this comment

Uh oh!

kevmyung commented May 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kevmyung commented Mar 31, 2026 •

edited

Loading