PerfEvolve

From static documentation to executable skills for agentic PostgreSQL tuning.


A Case for Agentic Tuning: From Documentation to Action in PostgreSQL
Hongyu Lin*, Mingyu Li*, Weichen Zhang, Mingjie Xing, Yihang Lou, Yanjun Wu, Haibo Chen
arXiv: 2605.19988v1

Modern database documentation is excellent at recording what experts recommend, but it rarely preserves how experts arrive at those recommendations. As hardware, workloads, and software versions evolve, static parameter values can become stale, context-insensitive, or even harmful.

PerfEvolve turns tuning knowledge into executable procedural skills that LLM-based agents can follow, verify, and adapt on a live PostgreSQL deployment.

Architecture

Why PerfEvolve?

Documentation-driven tuning systems often feed manuals directly into an LLM or an optimizer. This narrows the search space, but it also inherits the weaknesses of static documentation:

  • Staleness: recommendations may reflect old hardware or older PostgreSQL behavior.
  • Context insensitivity: one value rarely fits OLTP, OLAP, read-heavy, and write-heavy workloads equally well.
  • Missing interactions: parameter dependencies are often absent from per-parameter documentation.

PerfEvolve instead encodes the process of tuning: what to measure, how to interpret measurements, when to branch, and which parameters must be optimized together.

In short, PerfEvolve shifts tuning knowledge from a static recommendation ("set knob X to value Y") to an executable skill ("measure signal S, compare it with criterion C, then decide how to tune knob X").

Method Overview

PerfEvolve has two phases:

  1. Offline profiling runs once per system version. It discovers which parameters matter, where safe operating ranges lie, and which parameters interact.
  2. Online deployment gives the generated procedural skill document to a tuning agent, which executes targeted profiling and optimization against the target deployment.

The generated document is a DAG of skills. Each skill contains:

  • preconditions: what must hold before execution;
  • procedures: concrete actions such as benchmark, measure, compare, and branch;
  • decision criteria: rules for choosing the next action;
  • postconditions: expected outcomes after execution;
  • reference data: empirical profiling data that anchors the decisions.
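
To make the skill structure concrete, here is a minimal sketch of how a skill and the ordering of the DAG could be modeled. It is not the schema shipped in doctrl/docgen. The `Skill` class, the `execution_order` helper, and the `candidate_selection` skill id are hypothetical, and the sketch assumes that DAG edges can be derived by matching one skill's postconditions to another skill's preconditions.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Skill:
    """One node of the skill DAG (hypothetical model, not the repository schema)."""
    id: str
    preconditions: list[str] = field(default_factory=list)
    procedure: list[str] = field(default_factory=list)
    decision_criteria: list[str] = field(default_factory=list)
    postconditions: list[str] = field(default_factory=list)
    reference_data: dict = field(default_factory=dict)


def execution_order(skills: list[Skill]) -> list[str]:
    """Order skills so each runs after the skills that establish its preconditions."""
    # Map every postcondition to the skill that produces it.
    producer = {post: s.id for s in skills for post in s.postconditions}
    # A skill depends on the producers of its preconditions. Preconditions that
    # no skill produces are treated as external facts (e.g. "database is up").
    deps = {
        s.id: {producer[p] for p in s.preconditions if p in producer}
        for s in skills
    }
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    return list(TopologicalSorter(deps).static_order())


skills = [
    Skill("interaction_screening",
          preconditions=["candidate_knobs_selected"],
          postconditions=["correlation_components_identified"]),
    Skill("baseline_measurement",
          preconditions=["postgresql_instance_available"],
          postconditions=["baseline_result_recorded"]),
    Skill("candidate_selection",  # hypothetical skill id for illustration
          preconditions=["baseline_result_recorded"],
          postconditions=["candidate_knobs_selected"]),
]
print(execution_order(skills))
```

Deriving edges from conditions keeps each skill self-describing: adding a skill only requires stating what it needs and what it establishes, and a cyclic dependency is rejected when the order is computed.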

For PostgreSQL v16, PerfEvolve generates 23 executable skills and 160 parameter profiles.

Key Techniques

1. Dimensionality Reduction

PerfEvolve scans PostgreSQL performance-relevant parameters and ranks them by workload-specific sensitivity. In the paper's PostgreSQL study, this reduces the effective tuning set from 116 parameters to roughly 15 high-impact parameters, avoiding blind search over hundreds of knobs.

The generated artifact records:

  • sensitivity ranking;
  • empirical safe ranges;
  • response-curve shapes;
  • workload-specific sensitivity profiles.
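
As a rough illustration of sensitivity ranking, the sketch below scores each knob by how much a metric moves across a single-knob scan and keeps the top-k knobs. The scoring rule (relative spread against the baseline), the 5% tolerance used for "safe" values, and all throughput numbers are assumptions made up for this example. They are not taken from the paper or from the files in res/.

```python
def sensitivity(baseline: float, scan: dict[str, float]) -> float:
    """Relative spread of the metric across the scanned values of one knob."""
    values = list(scan.values())
    return (max(values) - min(values)) / baseline


def rank_knobs(baseline: float, scans: dict[str, dict[str, float]], top_k: int) -> list[str]:
    """Return the top_k knobs ordered from most to least sensitive."""
    scores = {knob: sensitivity(baseline, scan) for knob, scan in scans.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def safe_values(baseline: float, scan: dict[str, float], tolerance: float = 0.05) -> list[str]:
    """Settings whose metric stays within `tolerance` below the baseline (assumed rule)."""
    return [value for value, metric in scan.items() if metric >= baseline * (1 - tolerance)]


# Invented throughput numbers (transactions per second) for three real knob names.
baseline_tps = 1000.0
scans = {
    "shared_buffers": {"16MB": 700.0, "128MB": 1000.0, "1GB": 1400.0, "4GB": 1500.0},
    "work_mem": {"4MB": 1000.0, "64MB": 1050.0},
    "deadlock_timeout": {"1s": 1000.0, "10s": 1005.0},
}

print(rank_knobs(baseline_tps, scans, top_k=2))
print(safe_values(baseline_tps, scans["shared_buffers"]))
```

Running the same ranking once per workload would give the workload-specific profiles listed above, since a knob that dominates under TPC-C may barely register under TPC-H.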

2. Topology Discovery

High-impact parameters are not independent. PerfEvolve screens parameter pairs with factorial experiments and ANOVA, then builds an interaction graph. Connected components in this graph are optimized jointly, while independent nodes can be handled separately.

This makes joint optimization practical: instead of searching the full global configuration space, PerfEvolve decomposes it into small interacting groups.
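
The decomposition can be sketched as follows. This is a simplified stand-in, not the repository's analysis code: instead of an ANOVA significance test it thresholds the interaction contrast of a 2x2 factorial design, and the threshold, the measurements, and the helper names are all assumptions for illustration.

```python
def interaction_effect(y00: float, y10: float, y01: float, y11: float) -> float:
    """AB interaction contrast of a 2x2 factorial design.

    y10 means knob A high and knob B low; zero means the two effects are additive.
    """
    return (y11 + y00 - y10 - y01) / 2


def connected_components(nodes: list[str], edges: list[tuple[str, str]]) -> list[list[str]]:
    """Group nodes of an undirected graph with an iterative depth-first search."""
    adjacency = {n: set() for n in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(sorted(component))
    return components


knobs = ["shared_buffers", "effective_cache_size", "work_mem", "wal_buffers"]
# Invented throughput for (low/low, high/low, low/high, high/high) settings of each pair.
factorial_runs = {
    ("shared_buffers", "effective_cache_size"): (1000, 1100, 1050, 1400),
    ("shared_buffers", "work_mem"): (1000, 1100, 1020, 1120),
    ("work_mem", "wal_buffers"): (1000, 1020, 1010, 1032),
}
THRESHOLD = 50  # assumed cut-off, stands in for an ANOVA significance test

edges = [pair for pair, ys in factorial_runs.items()
         if abs(interaction_effect(*ys)) > THRESHOLD]
components = connected_components(knobs, edges)
print(components)

# Grid-search cost with 5 candidate values per knob: joint vs. per-component.
full_grid = 5 ** len(knobs)
decomposed_grid = sum(5 ** len(c) for c in components)
print(full_grid, decomposed_grid)
```

With these made-up numbers only one pair interacts, so the search shrinks from one four-dimensional grid to one two-knob grid plus two single-knob scans.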

3. Agent-Compatible Skill Generation

The profiling artifacts are compiled into structured skills that can be consumed by an LLM-based agent or exported into existing tuning frameworks such as GPTuner and E2ETune.

PerfEvolve can be used in two ways:

  • Full procedural execution: an agent follows the complete skill DAG step by step.
  • Knowledge injection: PerfEvolve exports structured knowledge such as top-k knobs, safe ranges, and interaction hints to another tuning framework.
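
For the knowledge-injection mode, a sketch of what an exported bundle might look like is given below. The layout, the key names (`top_knobs`, `safe_ranges`, `tune_jointly`), and the example ranges are hypothetical. They are not the input format of GPTuner or E2ETune, and an adapter for either framework would need to follow that framework's own conventions.

```python
import json


def export_knowledge(ranking, safe_ranges, components, top_k):
    """Bundle top-k knobs, their safe ranges, and joint-tuning hints."""
    top = ranking[:top_k]
    hints = []
    for component in components:
        kept = [knob for knob in component if knob in top]
        if len(kept) > 1:  # a hint only makes sense for two or more knobs
            hints.append(kept)
    return {
        "top_knobs": top,
        "safe_ranges": {knob: safe_ranges[knob] for knob in top if knob in safe_ranges},
        "tune_jointly": hints,
    }


# Invented ranking, ranges, and components, for illustration only.
knowledge = export_knowledge(
    ranking=["shared_buffers", "effective_cache_size", "work_mem", "wal_buffers"],
    safe_ranges={
        "shared_buffers": ["128MB", "4GB"],
        "effective_cache_size": ["1GB", "8GB"],
        "work_mem": ["4MB", "64MB"],
    },
    components=[["effective_cache_size", "shared_buffers"], ["work_mem"], ["wal_buffers"]],
    top_k=3,
)
print(json.dumps(knowledge, indent=2))
```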

Highlights from the Paper

PerfEvolve was evaluated on PostgreSQL using TPC-C and TPC-H workloads.

| Result | Takeaway |
| --- | --- |
| Up to 35.2% performance improvement | Procedural skills outperform static documentation-driven tuning. |
| Up to 58.9 percentage-point recovery under cross-hardware transfer | Procedural knowledge helps rescue documentation-based tuning from severe degradation. |
| Structure-only procedural knowledge already helps | The form of knowledge matters, not only the amount of knowledge. |

See the paper for the full experimental setup, baselines, hardware details, and ablation studies.

What This Repository Contains

This repository provides a lightweight research artifact for generating and inspecting PerfEvolve-style procedural tuning documents.

It currently includes:

  • procedural skill schema and document generation logic;
  • parsers for profiling and documentation data;
  • aggregation and analysis utilities;
  • PostgreSQL knob metadata;
  • sample profiling and interaction results;
  • a generated perfevolve_document.yaml artifact.

This repository is mainly intended for:

  • understanding the PerfEvolve skill representation;
  • regenerating the procedural tuning document from sample profiling artifacts;
  • adapting the generator to your own profiling results;
  • using the generated document as input to an LLM-based tuning agent.

Repository Structure

.
├── doctrl/
│   ├── docgen/        # procedural skill schema, templates, and generator
│   ├── parsers/       # parsers for profiling and documentation data
│   ├── aggregation/   # aggregation utilities for profiling runs
│   ├── analysis/      # knob classification and sensitivity utilities
│   └── models/        # data models used by the artifact
├── data/              # PostgreSQL knob metadata
├── res/               # sample profiling and interaction results
├── figs/              # figures used in the README
├── perfevolve_document.yaml
└── pyproject.toml

Requirements

For document generation:

  • Python 3.11+
  • pip

For executing the generated tuning skills against a live database, you may also need:

  • PostgreSQL 16
  • a benchmark driver such as BenchBase
  • representative OLTP / OLAP workloads
  • an isolated test environment or VM cluster
  • an LLM-based agent or tuning framework that can consume the generated skill document

Safety warning: PerfEvolve is a system tuning artifact. Generated skills may suggest testing PostgreSQL configurations that affect throughput, latency, memory usage, or stability. Always run tuning experiments in an isolated test environment before applying any configuration to production.

Quick Start

Clone the repository:

git clone https://github.com/ISCAS-OSLab/PerfEvolve.git
cd PerfEvolve

Install the package in editable mode:

python -m pip install -e .

Generate the default procedural skill document:

python -m doctrl.docgen.generator --output perfevolve_document.yaml

The default generator reads sample profiling results from:

  • res/result-0.json
  • res/result-e5a.json
  • res/result-e124.json

You can also provide your own profiling files:

python -m doctrl.docgen.generator \
  --single-knob path/to/single_knob.json \
  --screening path/to/screening.json \
  --e-series path/to/e_series.json \
  --output generated_document.yaml

Output

The generated perfevolve_document.yaml contains a procedural tuning guide for PostgreSQL. It is designed to guide an agent through:

  1. environment and version checks
  2. baseline measurement
  3. candidate knob selection
  4. single-knob sensitivity scans
  5. interaction screening
  6. component-wise joint optimization
  7. cross-workload verification

A simplified example of a generated skill looks like this:

skills:
  - id: baseline_measurement
    type: orchestration
    preconditions:
      - postgresql_instance_available
      - benchmark_driver_available
    procedure:
      - measure_default_configuration
      - record_baseline_throughput
      - record_baseline_latency
    decision_criteria:
      - if baseline_is_unstable: repeat_measurement
      - otherwise: proceed_to_candidate_selection
    postconditions:
      - baseline_result_recorded

  - id: interaction_screening
    type: component_skill
    preconditions:
      - candidate_knobs_selected
      - safe_ranges_available
    procedure:
      - run_factorial_experiments
      - compute_interaction_strength
      - build_correlation_graph
    decision_criteria:
      - if interaction_strength > threshold: trigger_joint_optimization
      - otherwise: tune_independently
    postconditions:
      - correlation_components_identified

The actual generated document contains richer parameter profiles, reference data, safe ranges, and decision rules.
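
One way an agent could ground the `baseline_is_unstable` criterion of the `baseline_measurement` skill is sketched below. The stability test (coefficient of variation over repeated runs), the 5% cut-off, and the sample values are assumptions for illustration. The generated document may define stability differently.

```python
from statistics import mean, stdev


def baseline_is_unstable(samples: list[float], max_cv: float = 0.05) -> bool:
    """True if the coefficient of variation of repeated runs exceeds max_cv.

    Needs at least two samples; statistics.stdev raises StatisticsError otherwise.
    """
    return stdev(samples) / mean(samples) > max_cv


def next_action(samples: list[float]) -> str:
    """Apply the two decision criteria of the baseline_measurement skill."""
    if baseline_is_unstable(samples):
        return "repeat_measurement"
    return "proceed_to_candidate_selection"


# Invented throughput samples from three repeated baseline runs.
print(next_action([1000.0, 1010.0, 990.0]))
print(next_action([1000.0, 1300.0, 700.0]))
```

Turning each symbolic criterion into a small, testable predicate like this keeps the agent's branching auditable: the measurement, the threshold, and the chosen action can all be logged together.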

Using PerfEvolve with an Agent

A typical agent workflow is:

1. Load perfevolve_document.yaml
2. Check PostgreSQL version and deployment environment
3. Measure default performance under the target workload
4. Select candidate knobs from the procedural document
5. Verify sensitivity and safe ranges on the target deployment
6. Detect correlated knobs
7. Run component-wise joint optimization
8. Validate the final configuration across workloads

PerfEvolve does not require the agent to blindly trust fixed recommended values. Instead, it provides executable procedures and empirical reference data that help the agent decide what to measure and how to adapt.

Limitations

PerfEvolve is designed as a procedural knowledge representation for system tuning, not as a universal optimizer. Please keep the following limitations in mind:

  • Profiling cost: full offline profiling can be expensive, although most runs are parallelizable and performed once per system version.
  • Hardware dependence: empirical safe ranges, sensitivity strengths, and interaction weights may shift across hardware platforms.
  • Workload dependence: the generated skills assume that offline profiling workloads are representative of the target deployment.
  • Higher-order interactions: the current design focuses on pairwise and low-dimensional parameter interactions; some higher-order effects may be missed.
  • Agent reliability: the generated skills reduce ambiguity, but the agent must still correctly run benchmarks, parse logs, compare results, and apply decision rules.
  • Current scope: the released artifact focuses on PostgreSQL tuning. Applying the methodology to other systems may require additional profiling, rollback, sandboxing, and safety mechanisms.

Citation

If you find PerfEvolve useful, please cite our paper:

@misc{lin2026perfevolve,
  title        = {A Case for Agentic Tuning: From Documentation to Action in PostgreSQL},
  author       = {Hongyu Lin and Mingyu Li and Weichen Zhang and Mingjie Xing and Yihang Lou and Yanjun Wu and Haibo Chen},
  year         = {2026},
  eprint       = {2605.19988},
  archivePrefix= {arXiv},
  primaryClass = {cs.DB},
  url          = {https://arxiv.org/abs/2605.19988v1}
}

Feedback

Contributions and feedback are greatly appreciated! Whether you've found a bug, have a question, or want to suggest improvements, please open an issue. Your input helps make PerfEvolve better for everyone.
