Audit-to-Foundry Validation Pipeline

This repository is a research artifact for turning natural-language smart contract audit findings into executable Foundry validation tests. Given an audit finding and the corresponding contract project, the pipeline extracts vulnerability-relevant code context, prompts an LLM to generate a Foundry test, executes the generated test, and optionally applies a differential validation stage when the finding can be checked through before/after state observations.

The goal is not to replace human auditors or automate offensive exploitation. The goal is to make audit claims, privileged-risk scenarios, and security assumptions more reproducible, inspectable, and falsifiable through executable artifacts.

Project Highlights

Research focus: converting audit-report findings into runnable Foundry validation tests
Ethereum relevance: improves reproducibility, reviewability, and falsifiability of audit claims
Implemented artifact: code pipeline, curated examples, and runnable generated tests are already included in the repository
Evidence included: benchmark-style cases and real audit-report cases are both documented under example/README.md

Why This Matters for Ethereum

Ethereum security work depends heavily on audit reports, but audit findings are usually published as natural-language claims rather than runnable evidence. That makes it difficult to:

reproduce a reported issue quickly
compare findings across tools and auditors
falsify a claim when the described exploit path is incomplete
build evaluation datasets for future security research

This project explores a workflow where an audit finding is treated as a hypothesis to be operationalized into code. For Ethereum researchers and reviewers, that means a report can be transformed into something closer to executable evidence rather than remaining only a narrative conclusion.

Core Contributions

Audit finding to executable test generation: the pipeline converts audit-report excerpts into Foundry .t.sol tests that import and interact with the original target contract.
Static-analysis-assisted context extraction: Slither-based analysis and custom helpers recover functions, call relationships, public state, and source snippets relevant to the finding.
Dynamic execution, iterative repair, and differential validation: generated tests are executed in Foundry, repaired using compiler/runtime feedback, and optionally extended with before/after observations when the vulnerability supports differential checking.
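
The differential stage can be pictured as a comparison of observations taken before and after the critical call in a generated test. The following is a minimal Python sketch under stated assumptions: observations are modelled as name-to-integer mappings, and `diff_observations`, `supports_finding`, and the example balances are hypothetical names and values, not the repository's actual data model.

```python
from typing import Dict

Observation = Dict[str, int]  # hypothetical shape: observed name -> integer value


def diff_observations(before: Observation, after: Observation) -> Dict[str, int]:
    """Return the signed change for every name whose value differs."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


def supports_finding(delta: Dict[str, int], expected_signs: Dict[str, int]) -> bool:
    """Check that each expected name changed in the expected direction.

    expected_signs maps a name to +1 (should increase) or -1 (should decrease).
    """
    return all(
        name in delta and delta[name] * sign > 0
        for name, sign in expected_signs.items()
    )


# Illustrative values only: a vault drained in favour of an attacker address.
before = {"vault_balance": 10, "attacker_balance": 1}
after = {"vault_balance": 0, "attacker_balance": 11}
delta = diff_observations(before, after)
print(delta)
print(supports_finding(delta, {"vault_balance": -1, "attacker_balance": +1}))
```

Checking only the direction of each change, rather than exact amounts, keeps the check usable when a report describes the effect qualitatively. A finding with no observable state change would not fit this shape, which is consistent with the stage being optional.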

What Is Implemented in This Repo

The current repository contains the following real components:

scripts/main.py: Main research pipeline entrypoint for preprocessing, generation, execution, and differential validation.
scripts/MySlither/: Slither-based code analysis utilities used to recover CFG and call-graph related context.
scripts/tools/: Prompt assembly, Foundry project helpers, test correction helpers, statistics utilities, and smaller support scripts.
scripts/Preprocess/: Preprocessing utilities for report filtering and metadata extraction. Some of these scripts remain experimental and are not part of the main reproduction path.
scripts/example/: Example report and generated-test pairs used in prompting.
example/: Curated runnable artifacts for both benchmark-style and audit-report-driven cases.

The repository is intentionally kept as a research codebase under scripts/ rather than repackaged as a production-grade Python library.

Curated Examples And Generated Artifacts

To make the repository easier to review, it also includes a curated artifact gallery under example/, documented in example/README.md. This directory contains ready-to-inspect Foundry projects generated or organized from:

SmartBugs benchmark-style cases
real audit-report findings

These examples are important for the research story of the repository. They show that the project does not stop at describing a pipeline architecture. It already produces concrete, runnable outputs that a reviewer can inspect directly:

vulnerable source projects
generated validation tests
verification-oriented logs or execution records

For Ethereum Foundation-style review, this matters because it strengthens three signals at once:

the work addresses a real Ethereum security problem
the outputs are concrete rather than aspirational
the artifact is inspectable without reconstructing the entire experiment stack first

If you want the fastest route through the repository, start with:

Repository Status

This repository should be read as an open research artifact:

appropriate for understanding the pipeline design
appropriate for reproducing or extending experiment-oriented workflows
appropriate for reviewing the implementation choices behind the artifact
not presented as a production-ready tool

Several components still reflect research iteration rather than hardened software packaging. This is intentional and documented rather than hidden.

How To Run

Requirements

Python 3.10+
Foundry
Slither
An OpenAI-compatible API endpoint
RPC endpoints if fork-based execution is required by the generated tests
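
Before a first run, it can help to confirm that the local toolchain is visible. The sketch below is one possible check, not part of the repository: `missing_requirements` is a hypothetical helper, and the executable names `forge` (Foundry) and `slither` are assumptions about how those tools are installed on your PATH.

```python
import shutil
import sys
from typing import Callable, List, Optional, Tuple

# Assumed executable names: Foundry's test runner and the Slither CLI.
REQUIRED_TOOLS = ("forge", "slither")


def missing_requirements(
    which: Callable[[str], Optional[str]] = shutil.which,
    version: Tuple[int, int] = tuple(sys.version_info[:2]),
) -> List[str]:
    """Return human-readable descriptions of unmet requirements."""
    problems = []
    if version < (3, 10):
        problems.append(f"Python 3.10+ required, found {version[0]}.{version[1]}")
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"`{tool}` not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

The lookup function and version are parameters so the check can be exercised without the tools installed. API and RPC endpoint reachability are not covered here.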

Install Python dependencies:

pip install -r requirements.txt

Configure Environment

Copy .env.example into your local environment setup and provide values for the required variables:

OPENAI_API_KEY
OPENAI_BASE_URL
LLM_MODEL
RPC_ENDPOINTS_FILE
AUDIT_REPORT_DIR
CONTRACT_PROJECT_DIR
PIPELINE_OUTPUT_DIR

The main script also supports command-line overrides:

python3 scripts/main.py \
  --reports-dir /path/to/audit_reports \
  --contracts-dir /path/to/contract_projects \
  --output-dir /path/to/output \
  --model your-model-name \
  --rpc-file /path/to/rpc_endpoints.json \
  --example no_differential_\$solar_1

Run python3 scripts/main.py --help for the full argument list and default behavior.
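
One common way to combine environment configuration with command-line overrides is to use each environment variable as the default for its flag. The sketch below illustrates that pattern only; the pairing of flags to variables in `FLAG_TO_ENV` is an assumption, `parse_config` is a hypothetical name, and the authoritative behavior is whatever the --help output of scripts/main.py reports.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence

# Assumed pairing of flags to environment variables (not confirmed by the
# repository; check `python3 scripts/main.py --help` for the real defaults).
FLAG_TO_ENV = {
    "--reports-dir": "AUDIT_REPORT_DIR",
    "--contracts-dir": "CONTRACT_PROJECT_DIR",
    "--output-dir": "PIPELINE_OUTPUT_DIR",
    "--model": "LLM_MODEL",
    "--rpc-file": "RPC_ENDPOINTS_FILE",
}


def parse_config(
    argv: Optional[Sequence[str]] = None,
    env: Mapping[str, str] = os.environ,
) -> argparse.Namespace:
    """Parse flags, falling back to the environment when a flag is omitted."""
    parser = argparse.ArgumentParser(description="hypothetical config resolver")
    for flag, variable in FLAG_TO_ENV.items():
        parser.add_argument(flag, default=env.get(variable))
    return parser.parse_args(argv)


# An explicit flag wins; an omitted flag takes the environment value, if any.
config = parse_config(
    ["--model", "your-model-name"],
    env={"LLM_MODEL": "env-model", "PIPELINE_OUTPUT_DIR": "/tmp/out"},
)
print(config.model, config.output_dir, config.reports_dir)
```

With this precedence, a shared .env file can hold stable paths while a single run overrides the model or output directory from the command line.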

Expected Inputs

The current artifact assumes:

a directory of audit-report files
a directory of target contract projects
preprocessing artifacts under scripts/Preprocess/save/
example prompt pairs under scripts/example/

The preprocessing utilities in scripts/Preprocess/ help prepare these inputs, but they should be treated as artifact support scripts rather than a polished CLI suite.

Pipeline Overview

At a high level, the implemented workflow is:

Select findings of interest from audit-report inputs.
Extract relevant Solidity code context and public state.
Build prompts for Foundry validation test generation.
Execute generated tests and iteratively repair them with runtime feedback.
Apply differential validation when the finding supports before/after comparison.
Save generated tests, logs, and verification artifacts under the configured output directory.
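
The execute-and-repair portion of the steps above can be sketched as a bounded loop. This is an illustrative outline, not the code in scripts/main.py: `generate_and_repair`, `run_forge`, `forge_command`, and the `max_rounds` limit are hypothetical, and the generation and repair steps are passed in as callables so the loop does not assume any particular LLM client.

```python
import subprocess
from typing import Callable, List, Tuple

RunResult = Tuple[bool, str]  # (passed, combined compiler/runtime output)


def forge_command(test_path: str) -> List[str]:
    """Build a `forge test` invocation restricted to one test file."""
    return ["forge", "test", "--match-path", test_path, "-vvv"]


def run_forge(project_dir: str, test_path: str) -> RunResult:
    """Run Foundry in project_dir and capture its output for the repair step."""
    proc = subprocess.run(
        forge_command(test_path), cwd=project_dir, capture_output=True, text=True
    )
    return proc.returncode == 0, proc.stdout + proc.stderr


def generate_and_repair(
    generate: Callable[[], str],
    repair: Callable[[str, str], str],
    run: Callable[[str], RunResult],
    max_rounds: int = 3,
) -> Tuple[str, bool, int]:
    """Return (final test source, passed, number of runs performed)."""
    source = generate()
    for attempt in range(1, max_rounds + 1):
        passed, feedback = run(source)
        if passed:
            return source, True, attempt
        if attempt < max_rounds:
            # Feed the compiler/runtime output back into the repair step.
            source = repair(source, feedback)
    return source, False, max_rounds
```

In practice, `run` would write the source to a .t.sol file inside the Foundry project and then call something like `run_forge`. Bounding the number of rounds keeps a test that cannot be repaired from consuming API calls indefinitely.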

Example Coverage

The curated examples currently demonstrate two kinds of evidence:

benchmark-aligned vulnerability reproduction for canonical bugs such as reentrancy and tx.origin misuse
real audit-finding operationalization for report-driven cases such as privileged fee misconfiguration and authorized-user asset movement

This distinction is useful because benchmark cases show the pipeline can handle widely recognized vulnerability patterns, while audit-report cases show the more ambitious research contribution: turning natural-language findings, including owner-controlled or authorized-caller risk scenarios, into executable validation tests.

Limitations

The current artifact has clear limits, and stating them is part of an honest evaluation of the repository:

It works best for findings that can be reproduced dynamically.
Differential validation is not applicable to every vulnerability class.
Reproducibility still depends on local environment setup, dataset preparation, and toolchain availability.
Some helper scripts remain experimental and are preserved mainly for transparency and extension.
Test quality depends on model behavior, prompt quality, contract complexity, and whether the report contains enough operational detail.
Some report-driven cases describe privileged-risk or owner-abuse scenarios rather than permissionless exploits; the examples are documented accordingly and should be read with those trust assumptions in mind.

Safety

This repository is intended for:

defensive validation of smart contract audit findings
benchmarking and reproducibility research
audit artifact inspection

It is not intended as:

a general-purpose offensive exploitation toolkit
a guarantee of real-world exploitability
a substitute for manual review by experienced auditors

Any reproduction involving deployed systems, live infrastructure, or forked chain state should be done only with appropriate authorization.

Open-Source Checklist

Before publishing or sharing derived work, review:

local path assumptions
RPC endpoint files
API credentials and .env handling
dataset-specific identifiers that should not be made public
generated logs and artifacts that may contain environment-specific details
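
Part of this checklist can be supported by a quick text scan before sharing. The sketch below is a rough aid under stated assumptions, not a substitute for manual review: `scan_text` and both patterns are hypothetical, and they only catch a filled-in OPENAI_API_KEY assignment and absolute home-directory paths.

```python
import re
from typing import Dict, List

# Hypothetical patterns; extend them for the credentials and paths you use.
PATTERNS: Dict[str, re.Pattern] = {
    "api key assignment": re.compile(r"OPENAI_API_KEY[ \t]*=[ \t]*\S+"),
    "absolute home path": re.compile(r"/(?:home|Users)/[^/\s]+/"),
}


def scan_text(text: str) -> List[str]:
    """Return 'line N: label' for each line that matches a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append(f"line {number}: {label}")
    return hits


# Illustrative input: an empty template entry, a filled-in key, and a log path.
sample = (
    "OPENAI_API_KEY=\n"
    "OPENAI_API_KEY=sk-not-a-real-key\n"
    "log written to /home/alice/out/run.log\n"
)
print(scan_text(sample))
```

An empty assignment, as would be expected in a template file, is deliberately not flagged. RPC endpoint files and dataset-specific identifiers still need a manual pass.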

This repository now uses requirements.txt and environment-based configuration rather than requiring inline credential edits in source files.
