feat(extractors): conditional Extractors with `when` Blocks

# Feature Request: Conditional Extractors with `when` Blocks

## Summary

Add support for conditional execution of extractors using `when` blocks, similar to the existing functionality in state transitions. This would allow extractors to run only when specific conditions are met, improving flexibility and reducing unnecessary processing.

## Motivation

Currently, all extractors defined in a state are executed unconditionally. This creates several limitations:

### Problem 1: Error-prone extraction
When a response doesn't contain the expected data (e.g., 404 error, validation failure), extractors still attempt to run and may:
- Generate misleading error messages
- Extract incorrect values from error pages
- Pollute the context with invalid data

### Problem 2: Status-dependent extraction
Different HTTP status codes often require different extraction logic:
- 200 OK → Extract session token
- 401 Unauthorized → Extract error message
- 403 Forbidden → Extract rate limit info

Currently, you must either:
- Use complex regex patterns that handle all cases
- Accept failed extractions in error scenarios
- Create separate states for each status code

### Problem 3: Content-type dependent extraction
JSON vs HTML responses require different extractors, but currently both execute regardless of response type.

## Proposed Solution

Extend the extractor syntax to support optional `when` conditions, using the same syntax as state transitions:

### Basic Syntax

```yaml
states:
  login:
    request: |
      POST /api/login HTTP/1.1
      ...
    
    extract:
      # Conditional extractor - only runs when condition is true
      session_token:
        type: jpath
        pattern: "$.token"
        when:
          on_status: 200
      
      # Another condition example
      error_message:
        type: jpath
        pattern: "$.error"
        when:
          on_status: [400, 401, 403]
```

### Advanced Examples

#### Example 1: Status-based extraction

```yaml
extract:
  # Success case
  auth_token:
    type: regex
    pattern: 'token=([a-zA-Z0-9]+)'
    when:
      on_status: 200
  
  # Error cases
  error_code:
    type: jpath
    pattern: "$.error.code"
    when:
      on_status: [400, 401, 403, 500]
  
  rate_limit_reset:
    type: header
    pattern: "X-RateLimit-Reset: (\\d+)"
    when:
      on_status: 429
```

#### Example 2: Content-type based extraction

```yaml
extract:
  # JSON extraction
  user_id:
    type: jpath
    pattern: "$.user.id"
    when:
      on_header:
        name: "Content-Type"
        pattern: "application/json"
  
  # HTML extraction
  csrf_token:
    type: regex
    pattern: 'name="csrf" value="([^"]+)"'
    when:
      on_header:
        name: "Content-Type"
        pattern: "text/html"
```

#### Example 3: Body pattern matching

```yaml
extract:
  # Only extract if response contains success indicator
  transaction_id:
    type: regex
    pattern: 'transaction_id=([0-9]+)'
    when:
      on_body:
        pattern: "status.*success"
  
  # Extract error details only if error present
  error_details:
    type: jpath
    pattern: "$.error"
    when:
      on_body:
        pattern: '"error":'
```

#### Example 4: Multiple conditions (AND logic)

```yaml
extract:
  api_key:
    type: jpath
    pattern: "$.data.api_key"
    when:
      on_status: 200
      on_header:
        name: "Content-Type"
        pattern: "application/json"
      on_body:
        pattern: '"api_key":'
```

#### Example 5: Race condition scenarios

```yaml
states:
  race_attack:
    race:
      threads: 20
    
    request: |
      POST /api/redeem-coupon HTTP/1.1
      ...
    
    extract:
      # Only extract discount if successful
      discount_applied:
        type: jpath
        pattern: "$.discount"
        when:
          on_status: 200
      
      # Track failures separately
      already_used:
        type: jpath
        pattern: "$.error"
        when:
          on_status: 409
          on_body:
            pattern: "already.*used"
```

## Implementation Details

### Supported `when` Conditions

All condition types from state transitions should be supported:

| Condition | Description | Example |
|-----------|-------------|---------|
| `on_status` | HTTP status code(s) | `on_status: 200` or `on_status: [200, 201]` |
| `on_header` | Response header pattern | `on_header: {name: "Content-Type", pattern: "json"}` |
| `on_body` | Response body pattern | `on_body: {pattern: "success"}` |
| `on_response_time` | Response time threshold | `on_response_time: {less_than: 100}` |

### Behavior Specification

#### When condition is TRUE:
- Extractor executes normally
- If extraction fails, behavior depends on extractor configuration
- Extracted value is added to context

#### When condition is FALSE:
- Extractor is **skipped entirely**
- No error is raised
- No value is added to context
- Variable remains undefined (or keeps previous value if already set)

#### When condition is UNDEFINED:
- If `when` block is not present → Always execute (backward compatible)
- If `when` block is present but evaluates to undefined → Skip extraction

### Error Handling

```yaml
extract:
  token:
    type: jpath
    pattern: "$.token"
    when:
      on_status: 200
    required: true  # Fail if condition is true but extraction fails
```

- `required: true` + condition FALSE → No error (extractor skipped)
- `required: true` + condition TRUE + extraction fails → Error raised

## Use Cases

### 1. Multi-status Response Handling

```yaml
# Current approach (problematic)
extract:
  token:
    type: jpath
    pattern: "$.token"  # Fails on error responses!

# Proposed approach (clean)
extract:
  token:
    type: jpath
    pattern: "$.token"
    when:
      on_status: 200
  
  error:
    type: jpath
    pattern: "$.error"
    when:
      on_status: [400, 401, 403, 500]
```

### 2. Rate Limit Handling

```yaml
extract:
  # Extract data on success
  results:
    type: jpath
    pattern: "$.results"
    when:
      on_status: 200
  
  # Extract rate limit info on 429
  retry_after:
    type: header
    pattern: "Retry-After: (\\d+)"
    when:
      on_status: 429
```

### 3. Race Condition Analysis

```yaml
states:
  race_coupon:
    race:
      threads: 50
    
    extract:
      # Count successes
      coupon_success:
        type: jpath
        pattern: "$.success"
        when:
          on_status: 200
      
      # Count rate limits (shouldn't happen in race!)
      rate_limited:
        type: literal
        value: true
        when:
          on_status: 429
      
      # Count conflicts
      already_used:
        type: literal
        value: true
        when:
          on_status: 409
```

### 4. Content-Type Negotiation

```yaml
extract:
  # Try JSON first
  data_json:
    type: jpath
    pattern: "$.data"
    when:
      on_header:
        name: "Content-Type"
        pattern: "application/json"
  
  # Fallback to XML
  data_xml:
    type: xpath
    pattern: "//data"
    when:
      on_header:
        name: "Content-Type"
        pattern: "application/xml"
```

## Benefits

1. **Cleaner Code**: No need for complex regex patterns that handle all cases
2. **Better Error Messages**: Extractors only run when expected to succeed
3. **Performance**: Skip unnecessary extraction attempts
4. **Flexibility**: Handle multiple response types in single state
5. **Debugging**: Clear which extractors ran vs were skipped
6. **Race Conditions**: Better analysis of different outcomes per thread

## Backward Compatibility

✅ **Fully backward compatible**

- Existing configurations without `when` blocks continue to work unchanged
- `when` block is optional
- Default behavior (no `when`) = always execute (current behavior)

## Related Features

This feature would pair well with:
- State-level `when` blocks (already implemented)
- Conditional state transitions (already implemented)
- Logger conditional output (could be a future enhancement)

## Example: Complete Login Flow

```yaml
states:
  login:
    request: |
      POST /api/login HTTP/1.1
      Content-Type: application/json
      
      {"username": "{{ username }}", "password": "{{ password }}"}
    
    extract:
      # Success extractors
      session_token:
        type: jpath
        pattern: "$.token"
        when:
          on_status: 200
      
      user_id:
        type: jpath
        pattern: "$.user.id"
        when:
          on_status: 200
      
      # Error extractors
      error_message:
        type: jpath
        pattern: "$.error.message"
        when:
          on_status: [400, 401, 403]
      
      # Rate limit extractors
      retry_after:
        type: header
        pattern: "Retry-After: (\\d+)"
        when:
          on_status: 429
      
      rate_limit_remaining:
        type: header
        pattern: "X-RateLimit-Remaining: (\\d+)"
        when:
          on_status: 429
    
    logger:
      on_state_leave: |
        Status: {{ status }}
        {% if session_token %}
        ✅ Login successful! Token: {{ session_token[:16] }}...
        {% elif error_message %}
        ❌ Login failed: {{ error_message }}
        {% elif retry_after %}
        ⚠️  Rate limited. Retry after {{ retry_after }} seconds
        {% endif %}
    
    next:
      - on_status: 200
        goto: authenticated_action
      - on_status: 401
        goto: login_failed
      - on_status: 429
        goto: rate_limited
```

## Implementation Checklist

### Phase 1: Core Functionality
- [ ] Add `when` field to `ExtractorConfig` model
- [ ] Implement condition evaluation in extractor execution
- [ ] Support `on_status` conditions
- [ ] Support `on_header` conditions
- [ ] Support `on_body` conditions
- [ ] Update extractor execution logic to check conditions
- [ ] Skip extractor if condition is false

### Phase 2: Advanced Features
- [ ] Support `on_response_time` conditions
- [ ] Support multiple conditions (AND logic)
- [ ] Add `required` field interaction with `when`
- [ ] Logging for skipped extractors (debug mode)

### Phase 3: Documentation
- [ ] Update extractors.rst documentation
- [ ] Add examples to README
- [ ] Update schema validation
- [ ] Add tests for conditional extractors
- [ ] Update PortSwigger lab examples

### Phase 4: Validation
- [ ] Schema validation for `when` blocks
- [ ] Error messages for invalid conditions
- [ ] Warning for unreachable extractors (always false condition)

## Testing Strategy

```yaml
# Test case: Multiple conditions
test_conditional_extractors:
  request: GET /test
  
  extract:
    success_token:
      type: regex
      pattern: "token=([a-z]+)"
      when:
        on_status: 200
    
    error_code:
      type: regex
      pattern: "error=([0-9]+)"
      when:
        on_status: 400
  
  assertions:
    # On 200: success_token defined, error_code undefined
    # On 400: success_token undefined, error_code defined
    # On 500: both undefined
```

## Alternative Designs Considered

### Alternative 1: Separate extractor lists per status

```yaml
extract:
  on_success:  # Only on 2xx
    token: {type: jpath, pattern: "$.token"}
  on_error:    # Only on 4xx/5xx
    error: {type: jpath, pattern: "$.error"}
```

**Rejected because:**
- Less flexible (can't use custom conditions)
- Harder to read for complex scenarios
- Doesn't support other condition types

### Alternative 2: Post-processing filter

```yaml
extract:
  token:
    type: jpath
    pattern: "$.token"
  
filter_extractions:
  - if: "{{ status != 200 }}"
    remove: ["token"]
```

**Rejected because:**
- Extraction still happens (performance cost)
- More complex syntax
- Errors still occur during extraction

## Questions for Discussion

1. Should we support OR logic for multiple conditions?
   ```yaml
   when:
     any_of:
       - on_status: 200
       - on_status: 201
   ```

2. Should we support negation?
   ```yaml
   when:
     not:
       on_status: [400, 401, 403]
   ```

3. Should skipped extractors be logged in verbose mode?

4. Should we allow extractors to reference other extracted values in conditions?
   ```yaml
   when:
     on_variable:
       name: "content_type"
       pattern: "json"
   ```

## References

- Current `when` block implementation in state transitions
- Extractor documentation: `docs/source/extractors.rst`
- Schema: `schema/treco-config.schema.json`
- Related: PortSwigger time-sensitive lab (needs conditional extraction)

---

**Priority:** Medium-High  
**Complexity:** Medium  
**Breaking Changes:** None (fully backward compatible)  
**Related Labs:** Time-sensitive, Rate-limit-bypass (would benefit from this)

Condition	Description	Example
`on_status`	HTTP status code(s)	`on_status: 200` or `on_status: [200, 201]`
`on_header`	Response header pattern	`on_header: {name: "Content-Type", pattern: "json"}`
`on_body`	Response body pattern	`on_body: {pattern: "success"}`
`on_response_time`	Response time threshold	`on_response_time: {less_than: 100}`

feat(extractors): conditional Extractors with when Blocks #51

Description

Feature Request: Conditional Extractors with when Blocks

Summary

Motivation

Problem 1: Error-prone extraction

Problem 2: Status-dependent extraction

Problem 3: Content-type dependent extraction

Proposed Solution

Basic Syntax

Advanced Examples

Example 1: Status-based extraction

Example 2: Content-type based extraction

Example 3: Body pattern matching

Example 4: Multiple conditions (AND logic)

Example 5: Race condition scenarios

Implementation Details

Supported when Conditions

Behavior Specification

When condition is TRUE:

When condition is FALSE:

When condition is UNDEFINED:

Error Handling

Use Cases

1. Multi-status Response Handling

2. Rate Limit Handling

3. Race Condition Analysis

4. Content-Type Negotiation

Benefits

Backward Compatibility

Related Features

Example: Complete Login Flow

Implementation Checklist

Phase 1: Core Functionality

Phase 2: Advanced Features

Phase 3: Documentation

Phase 4: Validation

Testing Strategy

Alternative Designs Considered

Alternative 1: Separate extractor lists per status

Alternative 2: Post-processing filter

Questions for Discussion

References

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

feat(extractors): conditional Extractors with `when` Blocks #51

Feature Request: Conditional Extractors with `when` Blocks

Supported `when` Conditions