PerfEvolve

From static documentation to executable skills for agentic PostgreSQL tuning.

A Case for Agentic Tuning: From Documentation to Action in PostgreSQL
Hongyu Lin*, Mingyu Li*, Weichen Zhang, Mingjie Xing, Yihang Lou, Yanjun Wu, Haibo Chen
arXiv: 2605.19988v1

Modern database documentation is excellent at recording what experts recommend, but it rarely preserves how experts arrive at those recommendations. As hardware, workloads, and software versions evolve, static parameter values can become stale, context-insensitive, or even harmful.

PerfEvolve turns tuning knowledge into executable procedural skills that LLM-based agents can follow, verify, and adapt on a live PostgreSQL deployment.

Why PerfEvolve?

Documentation-driven tuning systems often feed manuals directly into an LLM or an optimizer. This narrows the search space, but it also inherits the weaknesses of static documentation:

Staleness: recommendations may reflect old hardware or older PostgreSQL behavior.
Context insensitivity: one value rarely fits OLTP, OLAP, read-heavy, and write-heavy workloads equally well.
Missing interactions: parameter dependencies are often absent from per-parameter documentation.

PerfEvolve instead encodes the process of tuning: what to measure, how to interpret measurements, when to branch, and which parameters must be optimized together.

In short, PerfEvolve shifts tuning knowledge from a static recommendation ("set knob X to value Y") to an executable skill ("measure signal S, compare it with criterion C, then decide how to tune knob X").

Method Overview

PerfEvolve has two phases:

Offline profiling runs once per system version. It discovers which parameters matter, where safe operating ranges lie, and which parameters interact.
Online deployment gives the generated procedural skill document to a tuning agent, which executes targeted profiling and optimization against the target deployment.

The generated document is a DAG of skills. Each skill contains:
- preconditions: what must hold before execution;
- procedures: concrete actions such as benchmark, measure, compare, and branch;
- decision criteria: rules for choosing the next action;
- postconditions: expected outcomes after execution;
- reference data: empirical profiling data that anchors the decisions.

For PostgreSQL v16, PerfEvolve generates 23 executable skills and 160 parameter profiles.
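The skill structure described above can be sketched as a small data model. This is a minimal illustration, not the schema shipped in doctrl/docgen: the `Skill` class, its field names, and the rule that skill B depends on skill A when one of B's preconditions is one of A's postconditions are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Skill:
    """Hypothetical in-memory form of one skill (not the repository's schema)."""
    id: str
    preconditions: list[str] = field(default_factory=list)
    procedure: list[str] = field(default_factory=list)
    decision_criteria: list[str] = field(default_factory=list)
    postconditions: list[str] = field(default_factory=list)
    reference_data: dict = field(default_factory=dict)


def execution_order(skills: list[Skill]) -> list[str]:
    """Return skill ids in an order that respects the DAG.

    Assumption: B depends on A when a precondition of B is a postcondition
    of A. Preconditions that no skill produces are treated as external
    facts about the environment and create no edge.
    """
    producer = {post: s.id for s in skills for post in s.postconditions}
    graph = {
        s.id: {producer[p] for p in s.preconditions if p in producer}
        for s in skills
    }
    # static_order() raises graphlib.CycleError if the skills are not a DAG.
    return list(TopologicalSorter(graph).static_order())


skills = [
    Skill(id="candidate_selection",
          preconditions=["baseline_result_recorded"],
          postconditions=["candidate_knobs_selected"]),
    Skill(id="baseline_measurement",
          preconditions=["postgresql_instance_available"],
          postconditions=["baseline_result_recorded"]),
]
print(execution_order(skills))
```

Deriving edges from matching pre- and postconditions keeps each skill self-describing; an explicit `depends_on` list would work equally well if the generated document provides one.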

Key Techniques

1. Dimensionality Reduction

PerfEvolve scans PostgreSQL performance-relevant parameters and ranks them by workload-specific sensitivity. In the paper's PostgreSQL study, this reduces the effective tuning set from 116 parameters to roughly 15 high-impact parameters, avoiding blind search over hundreds of knobs.

The generated artifact records:

sensitivity ranking;
empirical safe ranges;
response-curve shapes;
workload-specific sensitivity profiles.
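As a rough illustration of how a sensitivity ranking could be derived from single-knob scans, the sketch below scores each knob by the spread of its measured throughput relative to the baseline. The scoring formula, the function name `rank_by_sensitivity`, and the sample numbers are assumptions for this example; the paper's actual metric may differ.

```python
def rank_by_sensitivity(
    scans: dict[str, dict[str, float]], baseline: float
) -> list[tuple[str, float]]:
    """Rank knobs by (max - min) throughput across scanned values, over baseline.

    scans maps knob name -> {knob value -> measured throughput}.
    This relative-spread score is an assumed stand-in for the paper's metric.
    """
    scores = {
        knob: (max(results.values()) - min(results.values())) / baseline
        for knob, results in scans.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up throughput numbers, for illustration only.
scans = {
    "geqo": {"on": 1000.0, "off": 1010.0},
    "shared_buffers": {"128MB": 1000.0, "1GB": 1300.0, "4GB": 1500.0},
}
ranking = rank_by_sensitivity(scans, baseline=1000.0)
print(ranking)
```

Keeping only the top entries of such a ranking is one way to arrive at a small high-impact set before any joint optimization is attempted.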

2. Topology Discovery

High-impact parameters are not independent. PerfEvolve screens parameter pairs with factorial experiments and ANOVA, then builds an interaction graph. Connected components in this graph are optimized jointly, while independent nodes can be handled separately.

This makes joint optimization practical: instead of searching the full global configuration space, PerfEvolve decomposes it into small interacting groups.
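The decomposition described above can be sketched in two steps: estimate a pairwise interaction from a 2x2 factorial experiment, then group knobs by connected components. This sketch uses the plain interaction contrast with a fixed threshold rather than a full ANOVA significance test, and the function names, threshold, and numbers are assumptions for illustration.

```python
from collections import defaultdict


def interaction_effect(y_ll: float, y_lh: float, y_hl: float, y_hh: float) -> float:
    """Interaction contrast of a 2x2 factorial design.

    y_ab is the metric with knob A at level a and knob B at level b
    (l = low, h = high). The result is half the difference between the
    effect of A when B is high and the effect of A when B is low;
    0 means the two knobs act additively.
    """
    return ((y_hh - y_lh) - (y_hl - y_ll)) / 2


def interaction_groups(
    knobs: list[str], effects: dict[tuple[str, str], float], threshold: float
) -> list[list[str]]:
    """Connected components of the graph whose edges are strong interactions."""
    adjacent = defaultdict(set)
    for (a, b), effect in effects.items():
        if abs(effect) > threshold:
            adjacent[a].add(b)
            adjacent[b].add(a)

    seen: set[str] = set()
    groups = []
    for knob in knobs:
        if knob in seen:
            continue
        stack, group = [knob], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(adjacent[node] - seen)
        groups.append(sorted(group))
    return groups


# Made-up measurements, for illustration only.
effects = {
    ("shared_buffers", "effective_cache_size"): interaction_effect(100, 110, 120, 170),
    ("work_mem", "wal_buffers"): 0.5,
}
knobs = ["shared_buffers", "work_mem", "effective_cache_size", "wal_buffers"]
groups = interaction_groups(knobs, effects, threshold=5.0)
print(groups)
```

Groups with more than one member would be optimized jointly; single-knob groups can be tuned on their own.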

3. Agent-Compatible Skill Generation

The profiling artifacts are compiled into structured skills that can be consumed by an LLM-based agent or exported into existing tuning frameworks such as GPTuner and E2ETune.

PerfEvolve can be used in two ways:

Full procedural execution: an agent follows the complete skill DAG step by step.
Knowledge injection: PerfEvolve exports structured knowledge such as top-k knobs, safe ranges, and interaction hints to another tuning framework.
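For the knowledge-injection mode, the exported hints could look roughly like the JSON produced below. This is a generic sketch: `export_hints`, the key names, and the sample values are hypothetical, and the output is not the input format that GPTuner or E2ETune actually expect.

```python
import json


def export_hints(
    ranking: list[tuple[str, float]],
    safe_ranges: dict[str, list[str]],
    groups: list[list[str]],
    k: int = 5,
) -> str:
    """Serialize top-k knobs, their safe ranges, and interaction hints as JSON.

    All key names here are assumptions, not a format defined by PerfEvolve.
    """
    top = [knob for knob, _ in ranking[:k]]
    keep = set(top)
    hints = []
    for group in groups:
        members = [knob for knob in group if knob in keep]
        if len(members) > 1:  # only multi-knob groups carry interaction info
            hints.append(members)
    return json.dumps(
        {
            "top_knobs": top,
            "safe_ranges": {knob: safe_ranges[knob] for knob in top if knob in safe_ranges},
            "interaction_groups": hints,
        },
        indent=2,
    )


# Made-up values, for illustration only.
payload = export_hints(
    ranking=[("shared_buffers", 0.5), ("work_mem", 0.2), ("geqo", 0.01)],
    safe_ranges={"shared_buffers": ["1GB", "8GB"], "work_mem": ["4MB", "64MB"]},
    groups=[["shared_buffers", "work_mem"], ["geqo"]],
    k=2,
)
print(payload)
```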

Highlights from the Paper

PerfEvolve was evaluated on PostgreSQL using TPC-C and TPC-H workloads.

| Result | Takeaway |
| --- | --- |
| Up to 35.2% performance improvement | Procedural skills outperform static documentation-driven tuning. |
| Up to 58.9 percentage-point recovery under cross-hardware transfer | Procedural knowledge helps rescue documentation-based tuning from severe degradation. |
| Structure-only procedural knowledge already helps | The form of knowledge matters, not only the amount of knowledge. |

See the paper for the full experimental setup, baselines, hardware details, and ablation studies.

What This Repository Contains

This repository provides a lightweight research artifact for generating and inspecting PerfEvolve-style procedural tuning documents.

It currently includes:

procedural skill schema and document generation logic;
parsers for profiling and documentation data;
aggregation and analysis utilities;
PostgreSQL knob metadata;
sample profiling and interaction results;
a generated perfevolve_document.yaml artifact.

This repository is mainly intended for:

understanding the PerfEvolve skill representation;
regenerating the procedural tuning document from sample profiling artifacts;
adapting the generator to your own profiling results;
using the generated document as input to an LLM-based tuning agent.

Repository Structure

.
├── doctrl/
│   ├── docgen/        # procedural skill schema, templates, and generator
│   ├── parsers/       # parsers for profiling and documentation data
│   ├── aggregation/   # aggregation utilities for profiling runs
│   ├── analysis/      # knob classification and sensitivity utilities
│   └── models/        # data models used by the artifact
├── data/              # PostgreSQL knob metadata
├── res/               # sample profiling and interaction results
├── figs/              # figures used in the README
├── perfevolve_document.yaml
└── pyproject.toml

Requirements

For document generation:

Python 3.11+
pip

For executing the generated tuning skills against a live database, you may also need:

PostgreSQL 16
a benchmark driver such as BenchBase
representative OLTP / OLAP workloads
an isolated test environment or VM cluster
an LLM-based agent or tuning framework that can consume the generated skill document

Safety warning: PerfEvolve is a system tuning artifact. Generated skills may suggest testing PostgreSQL configurations that affect throughput, latency, memory usage, or stability. Always run tuning experiments in an isolated test environment before applying any configuration to production.

Quick Start

Clone the repository:

git clone https://github.com/ISCAS-OSLab/PerfEvolve.git
cd PerfEvolve

Install the package in editable mode:

python -m pip install -e .

Generate the default procedural skill document:

python -m doctrl.docgen.generator --output perfevolve_document.yaml

The default generator reads sample profiling results from:

res/result-0.json
res/result-e5a.json
res/result-e124.json

You can also provide your own profiling files:

python -m doctrl.docgen.generator \
  --single-knob path/to/single_knob.json \
  --screening path/to/screening.json \
  --e-series path/to/e_series.json \
  --output generated_document.yaml

Output

The generated perfevolve_document.yaml contains a procedural tuning guide for PostgreSQL. It is designed to guide an agent through:

environment and version checks
baseline measurement
candidate knob selection
single-knob sensitivity scans
interaction screening
component-wise joint optimization
cross-workload verification

A simplified example of a generated skill looks like this:

skills:
  - id: baseline_measurement
    type: orchestration
    preconditions:
      - postgresql_instance_available
      - benchmark_driver_available
    procedure:
      - measure_default_configuration
      - record_baseline_throughput
      - record_baseline_latency
    decision_criteria:
      - if baseline_is_unstable: repeat_measurement
      - otherwise: proceed_to_candidate_selection
    postconditions:
      - baseline_result_recorded

  - id: interaction_screening
    type: component_skill
    preconditions:
      - candidate_knobs_selected
      - safe_ranges_available
    procedure:
      - run_factorial_experiments
      - compute_interaction_strength
      - build_correlation_graph
    decision_criteria:
      - if interaction_strength > threshold: trigger_joint_optimization
      - otherwise: tune_independently
    postconditions:
      - correlation_components_identified

The actual generated document contains richer parameter profiles, reference data, safe ranges, and decision rules.

Using PerfEvolve with an Agent

A typical agent workflow is:

1. Load perfevolve_document.yaml
2. Check PostgreSQL version and deployment environment
3. Measure default performance under the target workload
4. Select candidate knobs from the procedural document
5. Verify sensitivity and safe ranges on the target deployment
6. Detect correlated knobs
7. Run component-wise joint optimization
8. Validate the final configuration across workloads

PerfEvolve does not require the agent to blindly trust fixed recommended values. Instead, it provides executable procedures and empirical reference data that help the agent decide what to measure and how to adapt.
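Step 3 of the workflow above depends on a trustworthy baseline. One possible way for an agent to decide whether to repeat a measurement is to require a low coefficient of variation over the most recent runs, as sketched here. The `run_benchmark` callable, the window size, and the 5% threshold are assumptions for this example, not values taken from the generated document.

```python
import statistics
from typing import Callable


def measure_baseline(
    run_benchmark: Callable[[], float],
    window: int = 3,
    max_runs: int = 10,
    max_cv: float = 0.05,
) -> tuple[float, list[float]]:
    """Repeat a benchmark until the last `window` runs are stable.

    Stability is judged by the coefficient of variation (stdev / mean) of
    the most recent runs, so noisy warm-up runs eventually drop out.
    run_benchmark is a hypothetical callable returning throughput.
    """
    samples = [run_benchmark() for _ in range(window)]
    while True:
        recent = samples[-window:]
        mean = statistics.mean(recent)
        if statistics.stdev(recent) / mean <= max_cv:
            return mean, samples
        if len(samples) >= max_runs:
            raise RuntimeError("baseline did not stabilize within max_runs")
        samples.append(run_benchmark())


# Fake benchmark with a cold-cache warm-up, for illustration only.
fake_runs = iter([500.0, 900.0, 1000.0, 1010.0, 990.0])
baseline_mean, all_samples = measure_baseline(lambda: next(fake_runs))
print(baseline_mean, len(all_samples))
```

Raising an error when the baseline never stabilizes forces the agent to stop and investigate the environment instead of tuning against noise.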

Limitations

PerfEvolve is designed as a procedural knowledge representation for system tuning, not as a universal optimizer. Please keep the following limitations in mind:

Profiling cost: full offline profiling can be expensive, although most runs are parallelizable and performed once per system version.
Hardware dependence: empirical safe ranges, sensitivity strengths, and interaction weights may shift across hardware platforms.
Workload dependence: the generated skills assume that offline profiling workloads are representative of the target deployment.
Higher-order interactions: the current design focuses on pairwise and low-dimensional parameter interactions; some higher-order effects may be missed.
Agent reliability: the generated skills reduce ambiguity, but the agent must still correctly run benchmarks, parse logs, compare results, and apply decision rules.
Current scope: the released artifact focuses on PostgreSQL tuning. Applying the methodology to other systems may require additional profiling, rollback, sandboxing, and safety mechanisms.

Citation

If you find PerfEvolve useful, please cite our paper:

@misc{lin2026perfevolve,
  title        = {A Case for Agentic Tuning: From Documentation to Action in PostgreSQL},
  author       = {Hongyu Lin and Mingyu Li and Weichen Zhang and Mingjie Xing and Yihang Lou and Yanjun Wu and Haibo Chen},
  year         = {2026},
  eprint       = {2605.19988},
  archivePrefix= {arXiv},
  primaryClass = {cs.DB},
  url          = {https://arxiv.org/abs/2605.19988v1}
}

Feedback

Contributions and feedback are greatly appreciated! Whether you've found a bug, have a question, or want to suggest improvements, please open an issue. Your input helps make PerfEvolve better for everyone.
