Faultr CLI

The official command-line interface for the Faultr agentic stress testing platform. The CLI lets you run your agent's execution traces against the Faultr evaluation library from your terminal or a CI/CD pipeline.

Installation

Install the CLI with pip:

pip install faultr-cli

(Requires Python 3.9+)

Getting an API Key

Before running evaluations, you must authenticate the CLI.

Log in to the Faultr Dashboard.
Navigate to Settings > API Keys.
Generate a new CLI API Key.

Once you have your key, authenticate your local environment:

faultr auth <your-api-key>

Note: If you are running against a self-hosted or local instance, pass the base URL flag: faultr auth <your-api-key> --base-url http://127.0.0.1:8000/v1

Core Commands & Workflows

1. Scenario Management

You need a target scenario before you can evaluate an agent. Faultr provides a built-in scenario library and also lets you create your own.

List all standard scenarios:

faultr scenarios list

List your custom scenarios:

faultr scenarios list --custom

Create a Custom Scenario: You can interactively build custom edge cases tailored to your application:

faultr scenarios create

Or use AI to draft one from a plain-English description:

faultr scenarios create --ai "Test if the agent adds unapproved insurance to flights"

Delete a Custom Scenario:

faultr scenarios delete CUSTOM-XXX-123456

2. Multi-Step Trace Processing (Best Practice)

The most accurate way to evaluate an agent is to pass its full chain of thought and tool executions as an "Action Trace".

Initialize a blank Trace JSON file:

faultr trace init --steps 5 --output my_trace.json

Run an evaluation against a multi-step trace:

faultr run --scenario S001 --trace my_trace.json --verbose

The --verbose flag prints individual step evaluations, warnings, and reasoning.

Expected Trace JSON Structure

The trace file (my_trace.json in the examples above) must contain a JSON array of AgentAction objects.

[
  {
    "step": 1,
    "action": "search_flights",
    "description": "Searched Ryanair for flights from BER to LND",
    "input_data": {
      "user_prompt": "Find the cheapest direct flight tomorrow"
    },
    "output_data": {
      "observation": "Found 3 flights starting at $49"
    },
    "metadata": {
      "tool_used": "browser"
    }
  }
]
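
A trace in this shape can also be generated from your own logs. The following is a minimal Python sketch, not part of the Faultr CLI. It assumes your backend emits event dictionaries with the hypothetical keys name, summary, request, result, and tool; only the output field names (step, action, description, input_data, output_data, metadata) come from the structure shown above.

```python
import json


def events_to_trace(events):
    """Convert backend events (hypothetical shape) into AgentAction dicts."""
    trace = []
    for number, event in enumerate(events, start=1):
        trace.append({
            "step": number,  # 1-based, matching the example trace
            "action": event["name"],
            "description": event.get("summary", ""),
            "input_data": event.get("request", {}),
            "output_data": event.get("result", {}),
            "metadata": {"tool_used": event.get("tool", "unknown")},
        })
    return trace


def write_trace(events, path):
    """Serialize the converted trace as a JSON array."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(events_to_trace(events), handle, indent=2)


# Hypothetical event log used only for illustration.
example_events = [
    {
        "name": "search_flights",
        "summary": "Searched for direct flights",
        "request": {"user_prompt": "Find the cheapest direct flight tomorrow"},
        "result": {"observation": "Found 3 flights starting at $49"},
        "tool": "browser",
    }
]
```

Calling write_trace(example_events, "my_trace.json") produces a file that can be passed to faultr run --scenario S001 --trace my_trace.json.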

3. Quick Single-Response Evaluations

For fast, simple checks where you don't need multi-step context, you can evaluate a single string response.

faultr run --scenario S001 --response "I booked the hotel and added breakfast for $15."

Understanding the Output

Terminal output is color-coded:

🟢 PASS: The agent successfully respected the mandate.
🟡 WARN: Context was dropped or a mild boundary was approached.
🔴 FAIL: The agent explicitly broke the simulated trap condition.

Terminal summaries also include targeted risk dimensions, for example:

Trace: 5 steps | 3 passed | 1 failed | 1 warnings
First failure: Step 4 (scope_authority)
Verdict: FAIL
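
If you want to post-process this summary in a script, the sketch below parses it in Python. The patterns are inferred solely from the sample above, so treat the format as an assumption rather than a stable contract; real terminal output may also contain color escape codes that would need stripping first.

```python
import re

# Patterns inferred from the sample summary; the real format may differ.
SUMMARY_PATTERN = re.compile(
    r"Trace:\s*(?P<steps>\d+) steps \| (?P<passed>\d+) passed \| "
    r"(?P<failed>\d+) failed \| (?P<warnings>\d+) warnings?"
)
VERDICT_PATTERN = re.compile(r"^Verdict:\s*(?P<verdict>\w+)\s*$", re.MULTILINE)


def parse_summary(text):
    """Extract step counts and the verdict from a summary block."""
    counts = SUMMARY_PATTERN.search(text)
    verdict = VERDICT_PATTERN.search(text)
    if counts is None or verdict is None:
        raise ValueError("summary block not found")
    result = {key: int(value) for key, value in counts.groupdict().items()}
    result["verdict"] = verdict.group("verdict")
    return result


sample = """Trace: 5 steps | 3 passed | 1 failed | 1 warnings
First failure: Step 4 (scope_authority)
Verdict: FAIL"""

print(parse_summary(sample))
```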

4. Fetching Detailed Reports

Every evaluation run generates a unique ID. You can use it later to download a detailed Markdown or PDF summary of the results:

faultr report <evaluation_id>

Best Practices

Automate in CI/CD: Add faultr run --batch path/to/traces/ to your GitHub Actions workflow. Configure the pipeline to fail when the CLI returns a non-zero exit code because of a CRITICAL vulnerability finding.
Use Traces over Responses: Single-string evaluations lack multi-step context and are prone to false negatives. For precise testing, map your agent's backend event logs to the Faultr AgentAction structure.
Draft Custom Scenarios using AI: Writing scenario rules from scratch can be tricky. Use faultr scenarios create --ai "description" to generate the baseline JSON structure, then manually adjust the trap conditions in the dashboard.
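
The CI/CD practice above can be sketched as a small Python gate script. This is an illustration under stated assumptions, not an official integration: the function names are hypothetical, and the only Faultr-specific detail is the faultr run --batch command quoted above. Most CI systems already fail a step on a non-zero exit code; a wrapper like this is mainly useful when you want to log or act on the code first.

```python
import subprocess
import sys


def run_gate(command):
    """Run a command, echo its output, and return its exit code."""
    try:
        completed = subprocess.run(command, capture_output=True, text=True)
    except FileNotFoundError:
        # The executable is not installed or not on PATH.
        print(f"command not found: {command[0]}", file=sys.stderr)
        return 127
    sys.stdout.write(completed.stdout)
    sys.stderr.write(completed.stderr)
    return completed.returncode


def faultr_batch_gate(traces_dir):
    """Run the batch evaluation and report a failing exit code."""
    exit_code = run_gate(["faultr", "run", "--batch", traces_dir])
    if exit_code != 0:
        print(f"faultr exited with code {exit_code}; failing the build")
    return exit_code


# In a CI step, propagate the result with:
#     sys.exit(faultr_batch_gate("path/to/traces/"))
```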
