From static documentation to executable skills for agentic PostgreSQL tuning.
A Case for Agentic Tuning: From Documentation to Action in PostgreSQL
Hongyu Lin*, Mingyu Li*, Weichen Zhang, Mingjie Xing, Yihang Lou, Yanjun Wu, Haibo Chen
arXiv: 2605.19988v1
Modern database documentation is excellent at recording what experts recommend, but it rarely preserves how experts arrive at those recommendations. As hardware, workloads, and software versions evolve, static parameter values can become stale, context-insensitive, or even harmful.
PerfEvolve turns tuning knowledge into executable procedural skills that LLM-based agents can follow, verify, and adapt on a live PostgreSQL deployment.
Documentation-driven tuning systems often feed manuals directly into an LLM or an optimizer. This narrows the search space, but it also inherits the weaknesses of static documentation:
- Staleness: recommendations may reflect old hardware or older PostgreSQL behavior.
- Context insensitivity: one value rarely fits OLTP, OLAP, read-heavy, and write-heavy workloads equally well.
- Missing interactions: parameter dependencies are often absent from per-parameter documentation.
PerfEvolve instead encodes the process of tuning: what to measure, how to interpret measurements, when to branch, and which parameters must be optimized together.
In short, PerfEvolve shifts tuning knowledge from a static recommendation ("set knob X to value Y") to an executable skill ("measure signal S, compare it with criterion C, then decide how to tune knob X").
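The contrast between the two forms of knowledge can be made concrete with a minimal sketch. Everything here is illustrative: `ProceduralRule`, its fields, the knob name, and the numbers are assumptions made for this example, not part of the PerfEvolve artifact.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProceduralRule:
    """One 'measure S, compare with C, decide' step (hypothetical structure)."""
    knob: str
    measure: Callable[[], float]  # returns the observed signal S
    criterion: float              # threshold C
    if_above: str                 # action taken when S > C
    otherwise: str                # action taken when S <= C

    def decide(self) -> str:
        signal = self.measure()
        return self.if_above if signal > self.criterion else self.otherwise


# Static recommendation: a fixed value, applied regardless of context.
static_recommendation = {"knob_x": "value_y"}

# Procedural skill: the action depends on what is measured on the live system.
rule = ProceduralRule(
    knob="knob_x",
    measure=lambda: 0.42,  # stand-in for a real measurement
    criterion=0.30,
    if_above="increase knob_x and re-measure",
    otherwise="keep knob_x at its current value",
)
decision = rule.decide()
print(decision)
```

The static dictionary gives the same answer on every deployment, whereas the rule's answer changes as soon as the measured signal does.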
PerfEvolve has two phases:
- Offline profiling runs once per system version. It discovers which parameters matter, where safe operating ranges lie, and which parameters interact.
- Online deployment gives the generated procedural skill document to a tuning agent, which executes targeted profiling and optimization against the target deployment.
The generated document is a DAG of skills. Each skill contains:
- preconditions: what must hold before execution;
- procedures: concrete actions such as benchmark, measure, compare, and branch;
- decision criteria: rules for choosing the next action;
- postconditions: expected outcomes after execution;
- reference data: empirical profiling data that anchors the decisions.
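One way to picture the skill structure listed above is as a small data model plus a dependency ordering. The sketch below is an illustration, not the repository's schema: the `Skill` class, its field names, and `execution_order` are hypothetical, and it assumes that a DAG edge exists whenever one skill's postcondition appears among another skill's preconditions.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Skill:
    """Hypothetical in-memory form of one skill in the document."""
    skill_id: str
    preconditions: list[str] = field(default_factory=list)
    procedure: list[str] = field(default_factory=list)
    decision_criteria: list[str] = field(default_factory=list)
    postconditions: list[str] = field(default_factory=list)
    reference_data: dict = field(default_factory=dict)


def execution_order(skills: list[Skill]) -> list[str]:
    """Order skills so each runs after the skills that establish its preconditions."""
    # Map every postcondition to the skill that produces it.
    producer = {post: s.skill_id for s in skills for post in s.postconditions}
    # Preconditions that no skill produces are treated as environment facts
    # and contribute no edge.
    graph = {
        s.skill_id: {producer[p] for p in s.preconditions if p in producer}
        for s in skills
    }
    # static_order() raises graphlib.CycleError if the graph is not a DAG.
    return list(TopologicalSorter(graph).static_order())


skills = [
    Skill("interaction_screening",
          preconditions=["candidate_knobs_selected"],
          postconditions=["correlation_components_identified"]),
    Skill("baseline_measurement",
          preconditions=["postgresql_instance_available"],
          postconditions=["baseline_result_recorded"]),
    Skill("candidate_selection",
          preconditions=["baseline_result_recorded"],
          postconditions=["candidate_knobs_selected"]),
]
order = execution_order(skills)
print(order)
```

Deriving edges from pre- and postconditions keeps the ordering implicit in the skills themselves; a document that stores explicit edges would skip that step.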
For PostgreSQL v16, PerfEvolve generates 23 executable skills and 160 parameter profiles.
PerfEvolve scans PostgreSQL performance-relevant parameters and ranks them by workload-specific sensitivity. In the paper's PostgreSQL study, this reduces the effective tuning set from 116 parameters to roughly 15 high-impact parameters, avoiding blind search over hundreds of knobs.
The generated artifact records:
- sensitivity ranking;
- empirical safe ranges;
- response-curve shapes;
- workload-specific sensitivity profiles.
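As a rough illustration of how a sensitivity ranking could be derived from single-knob scans, the sketch below scores each knob by the relative spread of throughput across its tested values and keeps the top k. The metric, the function names, the knob names, and the numbers are assumptions chosen for this example; the paper's actual sensitivity measure may differ.

```python
def sensitivity(throughputs: list[float], baseline: float) -> float:
    """Relative throughput spread across the tested values of one knob."""
    return (max(throughputs) - min(throughputs)) / baseline


def rank_knobs(
    scans: dict[str, list[float]], baseline: float, top_k: int
) -> list[tuple[str, float]]:
    """Return the top_k knobs sorted by descending sensitivity."""
    scored = [(knob, sensitivity(values, baseline)) for knob, values in scans.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Made-up throughput readings, one list per knob, one entry per tested value.
scans = {
    "knob_a": [100.0, 140.0, 180.0],
    "knob_b": [99.0, 100.0, 101.0],
    "knob_c": [60.0, 100.0, 110.0],
}
top = rank_knobs(scans, baseline=100.0, top_k=2)
print(top)
```

Running the same ranking per workload would yield the workload-specific profiles mentioned above, since a knob that matters for OLTP may be flat under OLAP.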
High-impact parameters are not independent. PerfEvolve screens parameter pairs with factorial experiments and ANOVA, then builds an interaction graph. Connected components in this graph are optimized jointly, while independent nodes can be handled separately.
This makes joint optimization practical: instead of searching the full global configuration space, PerfEvolve decomposes it into small interacting groups.
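The decomposition can be sketched in two steps: estimate an interaction effect for each screened pair, then group knobs by connected component. This is a simplified stand-in, not the PerfEvolve implementation: it uses the plain 2x2 factorial interaction contrast with a fixed threshold instead of an ANOVA significance test, and all names, measurements, and the threshold are hypothetical.

```python
from collections import defaultdict


def interaction_effect(y_ll: float, y_lh: float, y_hl: float, y_hh: float) -> float:
    """2x2 factorial interaction contrast for knobs A and B.

    The first subscript is A's level and the second is B's level. The value is
    half the difference between A's effect at high B and A's effect at low B.
    """
    return ((y_hh - y_lh) - (y_hl - y_ll)) / 2.0


def connected_components(nodes, edges):
    """Group nodes that are linked, directly or transitively, by an edge."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(sorted(component))
    return components


knobs = ["knob_a", "knob_b", "knob_c", "knob_d", "knob_e"]
# Made-up throughput at (low, low), (low, high), (high, low), (high, high).
pair_results = {
    ("knob_a", "knob_b"): (100.0, 110.0, 120.0, 180.0),
    ("knob_b", "knob_c"): (100.0, 105.0, 110.0, 116.0),
    ("knob_c", "knob_d"): (100.0, 120.0, 130.0, 100.0),
}
THRESHOLD = 5.0  # assumed cutoff on the absolute interaction effect
edges = [
    pair for pair, ys in pair_results.items()
    if abs(interaction_effect(*ys)) > THRESHOLD
]
groups = connected_components(knobs, edges)
print(groups)
```

Each multi-knob group would then be optimized jointly, while a singleton such as `knob_e` can be tuned on its own.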
The profiling artifacts are compiled into structured skills that can be consumed by an LLM-based agent or exported into existing tuning frameworks such as GPTuner and E2ETune.
PerfEvolve can be used in two ways:
- Full procedural execution: an agent follows the complete skill DAG step by step.
- Knowledge injection: PerfEvolve exports structured knowledge such as top-k knobs, safe ranges, and interaction hints to another tuning framework.
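For the knowledge-injection mode, the exported hints could be bundled into a small JSON payload. The sketch below is only a guess at what such a payload might contain: the field names and the `export_knowledge` helper are hypothetical, and this is not the input format of GPTuner, E2ETune, or the PerfEvolve generator.

```python
import json


def export_knowledge(ranking, safe_ranges, components, top_k):
    """Bundle top-k knobs, their safe ranges, and interaction hints."""
    top = [knob for knob, _ in ranking[:top_k]]
    hints = []
    for group in components:
        # Keep only the selected knobs; a hint needs at least two of them.
        kept = [knob for knob in group if knob in top]
        if len(kept) > 1:
            hints.append(kept)
    return {
        "top_k_knobs": top,
        "safe_ranges": {k: safe_ranges[k] for k in top if k in safe_ranges},
        "interaction_hints": hints,
    }


# Made-up inputs standing in for real profiling artifacts.
ranking = [("knob_a", 0.8), ("knob_c", 0.5), ("knob_b", 0.02)]
safe_ranges = {"knob_a": [1, 64], "knob_b": [0, 1], "knob_c": [10, 500]}
components = [["knob_a", "knob_c"], ["knob_b"]]

payload = export_knowledge(ranking, safe_ranges, components, top_k=2)
print(json.dumps(payload, indent=2))
```

A consuming framework would typically use `top_k_knobs` and `safe_ranges` to shrink its search space and `interaction_hints` to decide which knobs to vary together.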
PerfEvolve was evaluated on PostgreSQL using TPC-C and TPC-H workloads.
| Result | Takeaway |
|---|---|
| Up to 35.2% performance improvement | Procedural skills outperform static documentation-driven tuning. |
| Up to 58.9 percentage-point recovery under cross-hardware transfer | Procedural knowledge helps rescue documentation-based tuning from severe degradation. |
| Structure-only procedural knowledge already helps | The form of knowledge matters, not only the amount of knowledge. |
See the paper for the full experimental setup, baselines, hardware details, and ablation studies.
This repository provides a lightweight research artifact for generating and inspecting PerfEvolve-style procedural tuning documents.
It currently includes:
- procedural skill schema and document generation logic;
- parsers for profiling and documentation data;
- aggregation and analysis utilities;
- PostgreSQL knob metadata;
- sample profiling and interaction results;
- a generated perfevolve_document.yaml artifact.
This repository is mainly intended for:
- understanding the PerfEvolve skill representation;
- regenerating the procedural tuning document from sample profiling artifacts;
- adapting the generator to your own profiling results;
- using the generated document as input to an LLM-based tuning agent.
.
├── doctrl/
│   ├── docgen/       # procedural skill schema, templates, and generator
│   ├── parsers/      # parsers for profiling and documentation data
│   ├── aggregation/  # aggregation utilities for profiling runs
│   ├── analysis/     # knob classification and sensitivity utilities
│   └── models/       # data models used by the artifact
├── data/             # PostgreSQL knob metadata
├── res/              # sample profiling and interaction results
├── figs/             # figures used in the README
├── perfevolve_document.yaml
└── pyproject.toml
For document generation:
- Python 3.11+
- pip
For executing the generated tuning skills against a live database, you may also need:
- PostgreSQL 16
- a benchmark driver such as BenchBase
- representative OLTP / OLAP workloads
- an isolated test environment or VM cluster
- an LLM-based agent or tuning framework that can consume the generated skill document
Safety warning: PerfEvolve is a system tuning artifact. Generated skills may suggest testing PostgreSQL configurations that affect throughput, latency, memory usage, or stability. Always run tuning experiments in an isolated test environment before applying any configuration to production.
Clone the repository:
git clone https://github.com/ISCAS-OSLab/PerfEvolve.git
cd PerfEvolve
Install the package in editable mode:
python -m pip install -e .
Generate the default procedural skill document:
python -m doctrl.docgen.generator --output perfevolve_document.yaml
The default generator reads sample profiling results from:
res/result-0.json
res/result-e5a.json
res/result-e124.json
You can also provide your own profiling files:
python -m doctrl.docgen.generator \
    --single-knob path/to/single_knob.json \
    --screening path/to/screening.json \
    --e-series path/to/e_series.json \
    --output generated_document.yaml
The generated perfevolve_document.yaml contains a procedural tuning guide for PostgreSQL. It is designed to guide an agent through:
- environment and version checks
- baseline measurement
- candidate knob selection
- single-knob sensitivity scans
- interaction screening
- component-wise joint optimization
- cross-workload verification
A simplified example of a generated skill looks like this:
skills:
  - id: baseline_measurement
    type: orchestration
    preconditions:
      - postgresql_instance_available
      - benchmark_driver_available
    procedure:
      - measure_default_configuration
      - record_baseline_throughput
      - record_baseline_latency
    decision_criteria:
      - if baseline_is_unstable: repeat_measurement
      - otherwise: proceed_to_candidate_selection
    postconditions:
      - baseline_result_recorded
  - id: interaction_screening
    type: component_skill
    preconditions:
      - candidate_knobs_selected
      - safe_ranges_available
    procedure:
      - run_factorial_experiments
      - compute_interaction_strength
      - build_correlation_graph
    decision_criteria:
      - if interaction_strength > threshold: trigger_joint_optimization
      - otherwise: tune_independently
    postconditions:
      - correlation_components_identified
The actual generated document contains richer parameter profiles, reference data, safe ranges, and decision rules.
A typical agent workflow is:
1. Load perfevolve_document.yaml
2. Check PostgreSQL version and deployment environment
3. Measure default performance under the target workload
4. Select candidate knobs from the procedural document
5. Verify sensitivity and safe ranges on the target deployment
6. Detect correlated knobs
7. Run component-wise joint optimization
8. Validate the final configuration across workloads
PerfEvolve does not require the agent to blindly trust fixed recommended values. Instead, it provides executable procedures and empirical reference data that help the agent decide what to measure and how to adapt.
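Step 3 of the workflow, together with the "repeat if the baseline is unstable" rule from the example skill, can be sketched as a small measurement loop. This is a minimal illustration under stated assumptions: `run_benchmark` is a hypothetical callable that returns one throughput reading, and using the coefficient of variation with a 5% cutoff as the stability test is a choice made for this example rather than something the artifact prescribes.

```python
import statistics
from typing import Callable


def measure_baseline(
    run_benchmark: Callable[[], float],
    runs: int = 3,
    max_cv: float = 0.05,
    max_attempts: int = 3,
) -> dict:
    """Repeat batches of benchmark runs until the readings are stable."""
    for attempt in range(1, max_attempts + 1):
        samples = [run_benchmark() for _ in range(runs)]
        mean = statistics.mean(samples)
        cv = statistics.stdev(samples) / mean  # coefficient of variation
        if cv <= max_cv:
            return {"throughput": mean, "cv": cv, "attempts": attempt}
    raise RuntimeError("baseline did not stabilize; check the test environment")


# Fake benchmark for illustration: a noisy first batch, then a stable one.
readings = iter([100.0, 150.0, 50.0, 100.0, 101.0, 99.0])
baseline = measure_baseline(lambda: next(readings))
print(baseline)
```

Failing loudly after a bounded number of attempts keeps the agent from tuning against a baseline it cannot trust.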
PerfEvolve is designed as a procedural knowledge representation for system tuning, not as a universal optimizer. Please keep the following limitations in mind:
- Profiling cost: full offline profiling can be expensive, although most runs are parallelizable and performed once per system version.
- Hardware dependence: empirical safe ranges, sensitivity strengths, and interaction weights may shift across hardware platforms.
- Workload dependence: the generated skills assume that offline profiling workloads are representative of the target deployment.
- Higher-order interactions: the current design focuses on pairwise and low-dimensional parameter interactions; some higher-order effects may be missed.
- Agent reliability: the generated skills reduce ambiguity, but the agent must still correctly run benchmarks, parse logs, compare results, and apply decision rules.
- Current scope: the released artifact focuses on PostgreSQL tuning. Applying the methodology to other systems may require additional profiling, rollback, sandboxing, and safety mechanisms.
If you find PerfEvolve useful, please cite our paper:
@misc{lin2026perfevolve,
  title         = {A Case for Agentic Tuning: From Documentation to Action in PostgreSQL},
  author        = {Hongyu Lin and Mingyu Li and Weichen Zhang and Mingjie Xing and Yihang Lou and Yanjun Wu and Haibo Chen},
  year          = {2026},
  eprint        = {2605.19988},
  archivePrefix = {arXiv},
  primaryClass  = {cs.DB},
  url           = {https://arxiv.org/abs/2605.19988v1}
}
Contributions and feedback are greatly appreciated! Whether you've found a bug, have a question, or want to suggest improvements, please open an issue. Your input helps make PerfEvolve better for everyone.
