Feature Request: Conditional Extractors with when Blocks
Summary
Add support for conditional execution of extractors using when blocks, similar to the existing functionality in state transitions. This would allow extractors to run only when specific conditions are met, improving flexibility and reducing unnecessary processing.
Motivation
Currently, all extractors defined in a state are executed unconditionally. This creates several limitations:
Problem 1: Error-prone extraction
When a response doesn't contain the expected data (e.g., 404 error, validation failure), extractors still attempt to run and may:
- Generate misleading error messages
- Extract incorrect values from error pages
- Pollute the context with invalid data
Problem 2: Status-dependent extraction
Different HTTP status codes often require different extraction logic:
- 200 OK → Extract session token
- 401 Unauthorized → Extract error message
- 429 Too Many Requests → Extract rate limit info
Currently, you must either:
- Use complex regex patterns that handle all cases
- Accept failed extractions in error scenarios
- Create separate states for each status code
Problem 3: Content-type dependent extraction
JSON and HTML responses need different extractors, but currently every extractor runs regardless of the response's Content-Type.
Proposed Solution
Extend the extractor syntax to support optional when conditions, using the same syntax as state transitions:
Basic Syntax
    states:
      login:
        request: |
          POST /api/login HTTP/1.1
          ...
        extract:
          # Conditional extractor - only runs when condition is true
          session_token:
            type: jpath
            pattern: "$.token"
            when:
              on_status: 200
          # Another condition example
          error_message:
            type: jpath
            pattern: "$.error"
            when:
              on_status: [400, 401, 403]
Advanced Examples
Example 1: Status-based extraction
    extract:
      # Success case
      auth_token:
        type: regex
        pattern: 'token=([a-zA-Z0-9]+)'
        when:
          on_status: 200
      # Error cases
      error_code:
        type: jpath
        pattern: "$.error.code"
        when:
          on_status: [400, 401, 403, 500]
      rate_limit_reset:
        type: header
        pattern: "X-RateLimit-Reset: (\\d+)"
        when:
          on_status: 429
Example 2: Content-type based extraction
    extract:
      # JSON extraction
      user_id:
        type: jpath
        pattern: "$.user.id"
        when:
          on_header:
            name: "Content-Type"
            pattern: "application/json"
      # HTML extraction
      csrf_token:
        type: regex
        pattern: 'name="csrf" value="([^"]+)"'
        when:
          on_header:
            name: "Content-Type"
            pattern: "text/html"
Example 3: Body pattern matching
    extract:
      # Only extract if response contains success indicator
      transaction_id:
        type: regex
        pattern: 'transaction_id=([0-9]+)'
        when:
          on_body:
            pattern: "status.*success"
      # Extract error details only if error present
      error_details:
        type: jpath
        pattern: "$.error"
        when:
          on_body:
            pattern: '"error":'
Example 4: Multiple conditions (AND logic)
    extract:
      api_key:
        type: jpath
        pattern: "$.data.api_key"
        when:
          on_status: 200
          on_header:
            name: "Content-Type"
            pattern: "application/json"
          on_body:
            pattern: '"api_key":'
Example 5: Race condition scenarios
    states:
      race_attack:
        race:
          threads: 20
        request: |
          POST /api/redeem-coupon HTTP/1.1
          ...
        extract:
          # Only extract discount if successful
          discount_applied:
            type: jpath
            pattern: "$.discount"
            when:
              on_status: 200
          # Track failures separately
          already_used:
            type: jpath
            pattern: "$.error"
            when:
              on_status: 409
              on_body:
                pattern: "already.*used"
Implementation Details
Supported when Conditions
All condition types from state transitions should be supported:
| Condition | Description | Example |
|---|---|---|
| on_status | HTTP status code(s) | on_status: 200 or on_status: [200, 201] |
| on_header | Response header pattern | on_header: {name: "Content-Type", pattern: "json"} |
| on_body | Response body pattern | on_body: {pattern: "success"} |
| on_response_time | Response time threshold | on_response_time: {less_than: 100} |
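To make the table concrete, here is a minimal Python sketch of how a when block could be evaluated against a response. Response, when_matches and elapsed_ms are hypothetical names invented for this illustration. The sketch also assumes that patterns are regular expressions searched anywhere in the value, that header names are matched case-insensitively, and that on_response_time thresholds are in milliseconds; the existing transition code may behave differently on each point.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Response:
    """Hypothetical, minimal response object used only in this sketch."""
    status: int
    headers: dict = field(default_factory=dict)
    body: str = ""
    elapsed_ms: float = 0.0  # assumption: response time measured in ms


def when_matches(when: dict, resp: Response) -> bool:
    """Return True only if every condition in the when block holds (AND logic)."""
    for key, spec in when.items():
        if key == "on_status":
            allowed = spec if isinstance(spec, list) else [spec]
            ok = resp.status in allowed
        elif key == "on_header":
            # Assumption: header names are compared case-insensitively.
            value = next(
                (v for k, v in resp.headers.items()
                 if k.lower() == spec["name"].lower()),
                None,
            )
            ok = value is not None and re.search(spec["pattern"], value) is not None
        elif key == "on_body":
            ok = re.search(spec["pattern"], resp.body) is not None
        elif key == "on_response_time":
            # Only less_than appears in the table, so only it is handled here.
            ok = resp.elapsed_ms < spec["less_than"]
        else:
            raise ValueError(f"unknown condition: {key}")
        if not ok:
            return False
    return True  # an empty when block imposes no restriction
```

Returning False at the first failing condition gives the AND semantics of Example 4 without evaluating the remaining, possibly more expensive, body patterns.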
Behavior Specification
When condition is TRUE:
- Extractor executes normally
- If extraction fails, the outcome depends on the extractor's configuration (for example, required: true raises an error; see Error Handling)
- Extracted value is added to context
When condition is FALSE:
- Extractor is skipped entirely
- No error is raised
- No value is added to context
- Variable remains undefined (or keeps previous value if already set)
When condition is UNDEFINED:
- If when block is not present → Always execute (backward compatible)
- If when block is present but evaluates to undefined → Skip extraction
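The skip semantics above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's implementation: run_extractors and status_matches are hypothetical names, only on_status is evaluated, and the jpath extractor is replaced by a toy lookup that handles top-level "$.key" patterns only. The "present but undefined" case is not modeled.

```python
import json


def status_matches(when, status):
    """Evaluate only on_status, to keep the sketch short."""
    allowed = when["on_status"]
    return status in (allowed if isinstance(allowed, list) else [allowed])


def run_extractors(extractors, status, body, context):
    """Run toy JSON extractors, honoring optional when blocks.

    Returns the names of the extractors that were skipped.
    """
    data = json.loads(body)
    skipped = []
    for name, cfg in extractors.items():
        when = cfg.get("when")
        if when is not None and not status_matches(when, status):
            # Condition FALSE: no error is raised and the context is left
            # untouched, so a previously set value survives.
            skipped.append(name)
            continue
        # No when block, or condition TRUE: extract as usual.
        key = cfg["pattern"][2:]  # toy jpath: strip the leading "$."
        if key in data:
            context[name] = data[key]
    return skipped


extractors = {
    "session_token": {"pattern": "$.token", "when": {"on_status": 200}},
    "error_message": {"pattern": "$.error", "when": {"on_status": [400, 401, 403]}},
}
context = {"error_message": "stale value"}
skipped = run_extractors(extractors, 200, '{"token": "abc"}', context)
print(skipped)  # ['error_message']
print(context)  # {'error_message': 'stale value', 'session_token': 'abc'}
```

The stale error_message in the output shows the "keeps previous value if already set" branch, which may be worth discussing: callers that loop over a state could see values left over from an earlier iteration.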
Error Handling
    extract:
      token:
        type: jpath
        pattern: "$.token"
        when:
          on_status: 200
        required: true  # Fail if condition is true but extraction fails
required: true + condition FALSE → No error (extractor skipped)
required: true + condition TRUE + extraction fails → Error raised
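The two rules above amount to a small decision function. The Python sketch below is one way to express it; ExtractionError and apply_result are hypothetical names, and the "empty" outcome for a non-required extractor that finds nothing is an assumption, since the request does not specify that case.

```python
class ExtractionError(Exception):
    """Raised when a required extractor runs but finds nothing."""


def apply_result(name, cfg, condition_met, value, context):
    """Combine the when outcome with the required flag.

    condition_met: result of evaluating the when block (True if absent).
    value: the extracted value, or None when the pattern found nothing.
    """
    if not condition_met:
        # required is ignored for skipped extractors.
        return "skipped"
    if value is None:
        if cfg.get("required", False):
            raise ExtractionError(f"required extractor '{name}' found no match")
        return "empty"  # assumption: non-required misses are silent
    context[name] = value
    return "extracted"
```

Checking the condition before looking at required keeps the flag meaningful: it asserts "this must be present whenever the extractor applies", not "this must be present on every response".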
Use Cases
1. Multi-status Response Handling
    # Current approach (problematic)
    extract:
      token:
        type: jpath
        pattern: "$.token"  # Fails on error responses!

    # Proposed approach (clean)
    extract:
      token:
        type: jpath
        pattern: "$.token"
        when:
          on_status: 200
      error:
        type: jpath
        pattern: "$.error"
        when:
          on_status: [400, 401, 403, 500]
2. Rate Limit Handling
    extract:
      # Extract data on success
      results:
        type: jpath
        pattern: "$.results"
        when:
          on_status: 200
      # Extract rate limit info on 429
      retry_after:
        type: header
        pattern: "Retry-After: (\\d+)"
        when:
          on_status: 429
3. Race Condition Analysis
    states:
      race_coupon:
        race:
          threads: 50
        extract:
          # Count successes
          coupon_success:
            type: jpath
            pattern: "$.success"
            when:
              on_status: 200
          # Count rate limits (shouldn't happen in race!)
          rate_limited:
            type: literal
            value: true
            when:
              on_status: 429
          # Count conflicts
          already_used:
            type: literal
            value: true
            when:
              on_status: 409
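Because a skipped extractor leaves its variable undefined, counting outcomes across race threads reduces to counting which variables exist in each thread's context. The Python sketch below assumes, hypothetically, that each thread's extracted values are available as a plain dict; tally_outcomes is an invented helper, not an existing API.

```python
from collections import Counter


def tally_outcomes(thread_contexts, names):
    """Count, per variable name, how many threads defined it."""
    counts = Counter()
    for ctx in thread_contexts:
        for name in names:
            if name in ctx:
                counts[name] += 1
    return counts


# Four simulated threads: one success, two conflicts, one with no match.
contexts = [
    {"coupon_success": True},
    {"already_used": True},
    {"already_used": True},
    {},
]
counts = tally_outcomes(contexts, ["coupon_success", "rate_limited", "already_used"])
print(counts["coupon_success"], counts["rate_limited"], counts["already_used"])  # 1 0 2
```

More than one coupon_success in such a tally would indicate that the race window was hit.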
4. Content-Type Negotiation
    extract:
      # Try JSON first
      data_json:
        type: jpath
        pattern: "$.data"
        when:
          on_header:
            name: "Content-Type"
            pattern: "application/json"
      # Fallback to XML
      data_xml:
        type: xpath
        pattern: "//data"
        when:
          on_header:
            name: "Content-Type"
            pattern: "application/xml"
Benefits
- Cleaner Code: No need for complex regex patterns that handle all cases
- Better Error Messages: Extractors only run when expected to succeed
- Performance: Skip unnecessary extraction attempts
- Flexibility: Handle multiple response types in single state
- Debugging: Clear which extractors ran vs were skipped
- Race Conditions: Better analysis of different outcomes per thread
Backward Compatibility
✅ Fully backward compatible
- Existing configurations without when blocks continue to work unchanged
- when block is optional
- Default behavior (no when) = always execute (current behavior)
Related Features
This feature would pair well with:
- State-level when blocks (already implemented)
- Conditional state transitions (already implemented)
- Logger conditional output (could be a future enhancement)
Example: Complete Login Flow
    states:
      login:
        request: |
          POST /api/login HTTP/1.1
          Content-Type: application/json

          {"username": "{{ username }}", "password": "{{ password }}"}
        extract:
          # Success extractors
          session_token:
            type: jpath
            pattern: "$.token"
            when:
              on_status: 200
          user_id:
            type: jpath
            pattern: "$.user.id"
            when:
              on_status: 200
          # Error extractors
          error_message:
            type: jpath
            pattern: "$.error.message"
            when:
              on_status: [400, 401, 403]
          # Rate limit extractors
          retry_after:
            type: header
            pattern: "Retry-After: (\\d+)"
            when:
              on_status: 429
          rate_limit_remaining:
            type: header
            pattern: "X-RateLimit-Remaining: (\\d+)"
            when:
              on_status: 429
        logger:
          on_state_leave: |
            Status: {{ status }}
            {% if session_token %}
            ✅ Login successful! Token: {{ session_token[:16] }}...
            {% elif error_message %}
            ❌ Login failed: {{ error_message }}
            {% elif retry_after %}
            ⚠️ Rate limited. Retry after {{ retry_after }} seconds
            {% endif %}
        next:
          - on_status: 200
            goto: authenticated_action
          - on_status: 401
            goto: login_failed
          - on_status: 429
            goto: rate_limited
Implementation Checklist
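Phase 1: Core Functionality
- Add when field to ExtractorConfig model
- on_status conditions
- on_header conditions
- on_body conditions
Phase 2: Advanced Features
- on_response_time conditions
- required field interaction with when
Phase 3: Documentation
Phase 4: Validation
- when blocks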
Testing Strategy
    # Test case: Multiple conditions
    test_conditional_extractors:
      request: GET /test
      extract:
        success_token:
          type: regex
          pattern: "token=([a-z]+)"
          when:
            on_status: 200
        error_code:
          type: regex
          pattern: "error=([0-9]+)"
          when:
            on_status: 400
      assertions:
        # On 200: success_token defined, error_code undefined
        # On 400: success_token undefined, error_code defined
        # On 500: both undefined
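The three expectations in the comments above could be checked with a table-driven test along these lines. This is a self-contained Python sketch rather than a test against the real engine: extract is a hypothetical stand-in that understands only on_status and regex extractors.

```python
import re

EXTRACTORS = {
    "success_token": {"pattern": r"token=([a-z]+)", "when": {"on_status": 200}},
    "error_code": {"pattern": r"error=([0-9]+)", "when": {"on_status": 400}},
}


def extract(status, body):
    """Return the context produced for one response (on_status only)."""
    context = {}
    for name, cfg in EXTRACTORS.items():
        if status != cfg["when"]["on_status"]:
            continue  # condition FALSE: the variable stays undefined
        match = re.search(cfg["pattern"], body)
        if match:
            context[name] = match.group(1)
    return context


# The body deliberately matches both patterns, so only the when blocks
# decide which variable ends up defined.
BODY = "token=abc error=42"
EXPECTED = {
    200: {"success_token": "abc"},
    400: {"error_code": "42"},
    500: {},
}
for status, expected in EXPECTED.items():
    assert extract(status, BODY) == expected, status
```

Using a body that satisfies every pattern is the important design choice: a test whose error body simply lacks the token would pass even if when blocks were ignored.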
Alternative Designs Considered
Alternative 1: Separate extractor lists per status
    extract:
      on_success:  # Only on 2xx
        token: {type: jpath, pattern: "$.token"}
      on_error:  # Only on 4xx/5xx
        error: {type: jpath, pattern: "$.error"}
Rejected because:
- Less flexible (can't use custom conditions)
- Harder to read for complex scenarios
- Doesn't support other condition types
Alternative 2: Post-processing filter
    extract:
      token:
        type: jpath
        pattern: "$.token"
    filter_extractions:
      - if: "{{ status != 200 }}"
        remove: ["token"]
Rejected because:
- Extraction still happens (performance cost)
- More complex syntax
- Errors still occur during extraction
Questions for Discussion
- Should we support OR logic for multiple conditions?
    when:
      any_of:
        - on_status: 200
        - on_status: 201
- Should we support negation?
    when:
      not:
        on_status: [400, 401, 403]
- Should skipped extractors be logged in verbose mode?
- Should we allow extractors to reference other extracted values in conditions?
    when:
      on_variable:
        name: "content_type"
        pattern: "json"
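For the OR and negation questions, a recursive evaluator would keep the existing AND behavior for sibling keys while letting any_of and not nest. The Python sketch below is only meant to show that the two proposals compose; evaluate is a hypothetical function, any_of and not are the proposed (not existing) keys, and on_status stands in for all leaf conditions.

```python
def evaluate(when, status):
    """Evaluate a when block supporting on_status, any_of and not.

    Sibling keys are combined with AND, as in the current proposal.
    """
    results = []
    for key, spec in when.items():
        if key == "on_status":
            allowed = spec if isinstance(spec, list) else [spec]
            results.append(status in allowed)
        elif key == "any_of":
            # OR over a list of nested when blocks.
            results.append(any(evaluate(sub, status) for sub in spec))
        elif key == "not":
            # Negation of a single nested when block.
            results.append(not evaluate(spec, status))
        else:
            raise ValueError(f"unknown condition: {key}")
    return all(results)


either = {"any_of": [{"on_status": 200}, {"on_status": 201}]}
negated = {"not": {"on_status": [400, 401, 403]}}
print(evaluate(either, 201), evaluate(either, 404))    # True False
print(evaluate(negated, 200), evaluate(negated, 401))  # True False
```

Note that for status codes alone, any_of adds nothing over the list form on_status: [200, 201]; it becomes useful only when the alternatives are of different condition types.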
References
- Current when block implementation in state transitions
- Extractor documentation: docs/source/extractors.rst
- Schema: schema/treco-config.schema.json
- Related: PortSwigger time-sensitive lab (needs conditional extraction)
Priority: Medium-High
Complexity: Medium
Breaking Changes: None (fully backward compatible)
Related Labs: Time-sensitive, Rate-limit-bypass (would benefit from this)