Binary file added app/src/guardrails.zip
Binary file not shown.
19 changes: 19 additions & 0 deletions app/src/guardrails/README.md
@@ -0,0 +1,19 @@
# Guardrails

This module contains the guardrails layer used by the application to validate agent responses before they are returned to the user.

The current implementation focuses on output validation.

## Structure

- `output_guard.py`: Defines the output guard and exposes the main validation entry point.
- `validators.py`: Defines the custom validators used by the guardrails layer.
- `docs/`: Contains detailed documentation for the guardrails module.

## Documentation

### Guardrails documentation

- [Guardrails overview](docs/guardrails-overview.md)
- [Output validation](docs/output-validation.md)
- [Add a new guardrail](docs/add-new-guardrail.md)
114 changes: 114 additions & 0 deletions app/src/guardrails/docs/add-new-guardrail.md
@@ -0,0 +1,114 @@
# Add a new guardrail

This document explains how to add a new guardrail to the guardrails module.

Guardrails are used to validate final agent responses before they are returned to the user. Add a new guardrail when a new output pattern needs to be detected, blocked or sanitized.

## Location

Validators are defined in:

```text
src/guardrails/validators.py
```

The output guard is configured in:

```text
src/guardrails/output_guard.py
```

## Add the validator

Create a new validator in `validators.py`.

Example:

```python
import re

from guardrails import register_validator
from guardrails.validator_base import (
    FailResult,
    PassResult,
    ValidationResult,
    Validator,
)


@register_validator(name="no_internal_example", data_type="string")
class NoInternalExample(Validator):
    """Detect and redact `internal_example_*` tokens in agent output."""

    def validate(self, value: str, metadata: dict) -> ValidationResult:
        pattern = r"internal_example_[a-zA-Z0-9_]+"

        if re.search(pattern, value):
            # Build the sanitized text so it can be offered as the fix.
            fixed_value = re.sub(
                pattern,
                "[hidden internal field]",
                value,
            )

            return FailResult(
                error_message="Internal example field detected.",
                fix_value=fixed_value,
            )

        # No match: the response is valid as-is.
        return PassResult()
```

## Add it to the output guard

After creating the validator, include it in the output guard used by:

```python
validate_agent_output(...)
```

This is configured in:

```text
src/guardrails/output_guard.py
```
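
The contents of `output_guard.py` are not shown in this document, so the following is only a minimal sketch of the idea, written with the standard library rather than the Guardrails API. `OUTPUT_VALIDATORS`, `no_internal_example` and the body of `validate_agent_output` are hypothetical stand-ins. The point it illustrates is that a new validator takes effect only once it is added to the collection the guard runs.

```python
import re
from typing import Callable, List

# Hypothetical stand-in for a validator: takes text, returns sanitized text.
TextValidator = Callable[[str], str]

REPLACEMENT = "[hidden internal field]"


def no_internal_example(text: str) -> str:
    """Replace internal_example_* tokens with the placeholder."""
    return re.sub(r"internal_example_[a-zA-Z0-9_]+", REPLACEMENT, text)


# Assumed registry: the guard only runs validators listed here.
OUTPUT_VALIDATORS: List[TextValidator] = [
    no_internal_example,  # the newly added guardrail
]


def validate_agent_output(text: str, agent_name: str = "finance_agent") -> str:
    """Run every registered validator in order and return the final text."""
    # agent_name mirrors the documented signature; it is unused in this sketch.
    for validator in OUTPUT_VALIDATORS:
        text = validator(text)
    return text
```

In the real module, the equivalent step would be adding the new validator class to whatever guard object `output_guard.py` builds.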

## Expected behavior

A guardrail should:

* Check the final agent response
* Return `PassResult()` when the response is valid
* Return `FailResult(...)` when the response must be sanitized
* Provide a `fix_value` when automatic replacement is possible
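
This contract can be illustrated without the Guardrails library. In the sketch below, `PassResult` and `FailResult` are simplified stand-ins (plain dataclasses, not the library classes), and `check` and `apply_result` are hypothetical helpers showing how a caller might use `fix_value` when one is provided.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class PassResult:
    """Stand-in: the response is valid as-is."""


@dataclass
class FailResult:
    """Stand-in: the response failed validation; fix_value is optional."""
    error_message: str
    fix_value: Optional[str] = None


def check(value: str) -> Union[PassResult, FailResult]:
    """Return a pass result, or a fail result carrying a sanitized fix."""
    pattern = r"internal_example_[a-zA-Z0-9_]+"
    if re.search(pattern, value):
        return FailResult(
            error_message="Internal example field detected.",
            fix_value=re.sub(pattern, "[hidden internal field]", value),
        )
    return PassResult()


def apply_result(original: str, result: Union[PassResult, FailResult]) -> str:
    """Return the original text on pass, or the fix when one is available."""
    if isinstance(result, FailResult):
        if result.fix_value is None:
            # No automatic replacement is possible, so surface the error.
            raise ValueError(result.error_message)
        return result.fix_value
    return original
```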

## Test it

Test at least one valid response and one response that should be sanitized.

Example:

```python
from src.guardrails.output_guard import validate_agent_output


validate_agent_output(
    "The payment was created successfully.",
    agent_name="finance_agent",
)

validate_agent_output(
    "The payment was created with internal_example_id 4.",
    agent_name="finance_agent",
)
```

In the second response, the internal field should be replaced with:

```text
[hidden internal field]
```
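
The two manual calls can also be written as assertions. This is a sketch only: to keep it runnable on its own, `validate_agent_output` below is a local stand-in that applies the example pattern. In the project you would import the real function from `src.guardrails.output_guard` instead, and the assumption that it returns the sanitized string directly should be checked against the actual implementation.

```python
import re


def validate_agent_output(text: str, agent_name: str = "finance_agent") -> str:
    """Local stand-in for the real entry point, used only in this sketch."""
    return re.sub(r"internal_example_[a-zA-Z0-9_]+", "[hidden internal field]", text)


def test_valid_response_is_unchanged() -> None:
    text = "The payment was created successfully."
    assert validate_agent_output(text, agent_name="finance_agent") == text


def test_internal_field_is_sanitized() -> None:
    result = validate_agent_output(
        "The payment was created with internal_example_id 4.",
        agent_name="finance_agent",
    )
    # The raw field name must be gone and the placeholder must be present.
    assert "internal_example_id" not in result
    assert "[hidden internal field]" in result
```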

## Update documentation

After adding a new guardrail, update:

* `src/guardrails/docs/output-validation.md`
* `src/guardrails/docs/add-new-guardrail.md` if the process changes
107 changes: 107 additions & 0 deletions app/src/guardrails/docs/guardrails-overview.md
@@ -0,0 +1,107 @@
# Guardrails overview

This document explains the role of the guardrails layer in the project.

Guardrails validate agent responses before they are returned to the user. The current implementation focuses on output validation: it sanitizes final text responses to avoid exposing internal Odoo fields, technical identifiers, credentials or traceback details.

## Role in the architecture

The guardrails layer runs after an agent has generated a response.

```text
Agent response
  ↓
validate_agent_output(...)
  ↓
Output guard
  ↓
Sanitized response
```

This keeps the agent focused on task handling, while the guardrails layer focuses on cleaning the final user-facing output.
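
One way to picture this separation is a thin wrapper that sanitizes whatever the agent returns. The sketch below is illustrative only: `guarded`, the `finance_agent` function and the stand-in `validate_agent_output` are hypothetical and not taken from the module.

```python
import re
from functools import wraps
from typing import Callable


def validate_agent_output(text: str, agent_name: str = "finance_agent") -> str:
    """Stand-in sanitizer; the real one lives in output_guard.py."""
    return re.sub(r"\b(?:journal_id|partner_id)\b", "[hidden internal field]", text)


def guarded(agent_name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator: run the agent first, then sanitize its final text."""

    def decorator(agent: Callable[..., str]) -> Callable[..., str]:
        @wraps(agent)
        def wrapper(*args, **kwargs) -> str:
            raw = agent(*args, **kwargs)  # the agent handles the task
            return validate_agent_output(raw, agent_name=agent_name)  # the guard cleans the output

        return wrapper

    return decorator


@guarded("finance_agent")
def finance_agent(request: str) -> str:
    # Hypothetical agent whose answer leaks an internal field name.
    return f"Handled '{request}' using journal_id 8."
```

The agent code stays unaware of sanitization, and the guard never needs to know how the response was produced.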

## Current implementation

The current output guard is defined in:

```text
src/guardrails/output_guard.py
```

The main validation entry point is:

```python
validate_agent_output(text, agent_name="finance_agent")
```

The validator is defined in:

```text
src/guardrails/validators.py
```

Current validator:

```text
NoInternalOdooFields
```

It detects internal or sensitive patterns in the generated response and replaces them with:

```text
[hidden internal field]
```

## Protected patterns

The current validator checks for patterns related to:

* Internal Odoo field names
* Technical record identifiers
* Credential-related terms
* API keys
* Passwords
* Python traceback output

Examples of protected terms include:

```text
partner_id
journal_id
payment_method_line_id
create_uid
write_uid
access_token
api_key
password
Traceback
```

## Scope

Guardrails are responsible for final response validation only.

They should handle:

* Sanitizing user-facing agent responses
* Removing internal technical fields from final text
* Hiding credential-like strings
* Preventing traceback details from being exposed

They should not replace:

* Agent intent detection
* Required field collection
* Payment validation
* Odoo record lookup
* MCP tool schema validation
* XML-RPC error handling
* Database auditing

Those responsibilities belong to the agent, MCP or service layers.

## Relationship with MCP

MCP tools validate structured operations before execution.

Guardrails validate the final natural language response after the agent or tool flow.
143 changes: 143 additions & 0 deletions app/src/guardrails/docs/output-validation.md
@@ -0,0 +1,143 @@
# Output validation

This document explains how output validation is currently handled in the guardrails layer.

Output validation sanitizes the final text generated by agents before it is returned to the user. Its purpose is to prevent internal Odoo fields, technical identifiers, credentials and traceback details from appearing in user-facing responses.

## Location

Output validation is implemented in:

```text
src/guardrails/output_guard.py
```

The custom validator is defined in:

```text
src/guardrails/validators.py
```

## Entry point

The main entry point is:

```python
validate_agent_output(text, agent_name="finance_agent")
```

This function receives the final agent response as text and returns a validated string.

Example:

```python
from src.guardrails.output_guard import validate_agent_output

safe_text = validate_agent_output(
    text=agent_response,
    agent_name="finance_agent",
)
```

## Validation flow

```text
Agent response
  ↓
validate_agent_output(...)
  ↓
Output guard
  ↓
NoInternalOdooFields
  ↓
Sanitized response
```

If the response does not contain protected patterns, the original text is returned.

If protected patterns are detected, they are replaced before the response is returned.

## Current validator

The current validator is:

```text
NoInternalOdooFields
```

It checks the response text for internal or sensitive patterns and replaces matches with:

```text
[hidden internal field]
```

## Protected patterns

The validator currently protects against terms related to:

* Internal Odoo fields
* Technical identifiers
* Credentials
* API keys
* Passwords
* Traceback output

Examples include:

```text
partner_id
journal_id
payment_method_line_id
create_uid
write_uid
access_token
api_key
password
Traceback
```

## Example

Input response:

```text
The payment was created with journal_id 8 and payment_method_line_id 3.
```

Sanitized response:

```text
The payment was created with [hidden internal field] 8 and [hidden internal field] 3.
```
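
The replacement in this example can be approximated with a single regular expression. The validator's actual patterns are not listed in this document, so `PROTECTED_TERMS` (copied from the examples under "Protected patterns") and `sanitize` are assumptions for illustration, not the `NoInternalOdooFields` implementation.

```python
import re

REPLACEMENT = "[hidden internal field]"

# Terms copied from the documented examples; the real list may differ.
PROTECTED_TERMS = [
    "partner_id",
    "journal_id",
    "payment_method_line_id",
    "create_uid",
    "write_uid",
    "access_token",
    "api_key",
    "password",
    "Traceback",
]

# \b restricts matches to whole tokens, so "journal_id" is not matched
# inside a longer identifier such as "my_journal_id_x".
PROTECTED_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, PROTECTED_TERMS)) + r")\b"
)


def sanitize(text: str) -> str:
    """Replace every protected term with the placeholder."""
    return PROTECTED_RE.sub(REPLACEMENT, text)
```

Calling `sanitize` on the input response above yields the sanitized response shown, while text without protected terms is returned unchanged.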

## Usage expectations

Output validation should be applied to final agent responses, not to internal tool outputs.

Use it when:

* A response is about to be returned to the user
* The response may include information from Odoo
* The response may include MCP tool output
* The response may include exception or traceback details

Do not use it as a replacement for:

* Pydantic validation
* MCP tool validation
* Odoo service error handling
* Business rule checks

## Current limitation

The current implementation sanitizes text by replacing detected patterns.

It does not:

* Validate structured tool input
* Check business correctness
* Verify whether an Odoo operation should be allowed
* Rewrite the full answer semantically
* Decide whether the agent should call a tool

Those responsibilities belong to the agent, MCP and service layers.