
Commit c388bb1

feat: enhance parameter docs with structured alerts
Improve readability and highlight critical information using GitHub-style alert blocks (NOTE, TIP, WARNING, IMPORTANT). Convert formulas to code blocks, add backticks to parameters/commands, and reorganize recommendations for better discoverability. Makes warnings about OOM risks, Windows limits, and security practices more prominent and scannable.

Signed-off-by: Sebastian Webber <sebastian@swebber.me>
1 parent 5b38419 commit c388bb1

1 file changed

Lines changed: 108 additions & 57 deletions

File tree

rules.yml

@@ -4,7 +4,7 @@ categories:
 abstract: |
 Allocates shared memory for caching data pages. Acts as PostgreSQL's main disk cache, similar to Oracle's SGA buffer.
 
-Start with **25% of RAM** as a baseline. For optimal tuning, use the **pg_buffercache extension** to analyze cache hit ratios for your specific workload.
+Start with **25% of RAM** as a baseline. For optimal tuning, use the `pg_buffercache` extension to analyze cache hit ratios for your specific workload.
 recomendations:
 Tuning Your PostgreSQL Server: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server#shared_buffers
 Determining optimal shared_buffers using pg_buffercache: https://aws.amazon.com/blogs/database/determining-the-optimal-value-for-shared_buffers-using-the-pg_buffercache-extension-in-postgresql/
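The 25%-of-RAM baseline in the hunk above is simple arithmetic; a minimal sketch (the helper name and the 16 GB server are illustrative assumptions, not part of the commit):

```python
def shared_buffers_baseline_mb(total_ram_mb: int, fraction: float = 0.25) -> int:
    """Starting shared_buffers in MB, as a fraction of total RAM (25% baseline)."""
    return int(total_ram_mb * fraction)

# A 16 GB (16384 MB) server would start at 4096 MB, i.e. shared_buffers = 4GB.
print(shared_buffers_baseline_mb(16384))  # 4096
```

From that starting point, the abstract suggests refining with `pg_buffercache` cache-hit analysis rather than the rule of thumb.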
@@ -27,17 +27,28 @@ categories:
 Optimize PostgreSQL Server Performance Through Configuration: https://blog.crunchydata.com/blog/optimize-postgresql-server-performance
 work_mem:
 abstract: |
-Memory per operation for sorts, hash joins, and aggregates. Each query can use **multiple work_mem buffers** simultaneously.
-
-**⚠️ Warning**: With high concurrency and large datasets, you can easily trigger **OOM kills** in Kubernetes pods or cloud instances.
-
-Maximum potential memory = `work_mem × operations × parallel_workers × connections`
-
-Example worst-case: 128MB × 3 operations × 2 workers × 100 connections = **102GB**
-
-**Windows ≤ PostgreSQL 17**: Maximum value is ~2GB (2097151 kB) due to Windows LLP64 model where `sizeof(long)==4` even on 64-bit systems. Fixed in [PostgreSQL 18](https://www.postgresql.org/message-id/flat/1a01f0-66ec2d80-3b-68487680@27595217) which increased the limit to 2TB. See also [pgvector issue #667](https://github.com/pgvector/pgvector/issues/667).
-
-Monitor temp file usage with `log_temp_files`. Consider **per-session** tuning (`SET work_mem`) for heavy queries instead of global settings.
+Memory per operation for sorts, hash joins, and aggregates. Each query can use multiple `work_mem` buffers simultaneously.
+
+> [!WARNING]
+> With high concurrency and large datasets, you can easily trigger **OOM kills** in Kubernetes pods or cloud instances.
+>
+> Maximum potential memory:
+> ```
+> max = work_mem × operations × parallel_workers × connections
+> ```
+>
+> Example worst-case:
+> ```
+> 128MB × 3 operations × 2 workers × 100 connections = 102GB
+> ```
+
+> [!NOTE]
+> **Windows ≤ PostgreSQL 17**: Maximum value is ~2GB (2097151 kB) due to Windows LLP64 model where `sizeof(long)==4` even on 64-bit systems.
+>
+> Fixed in [PostgreSQL 18](https://www.postgresql.org/message-id/flat/1a01f0-66ec2d80-3b-68487680@27595217) which increased the limit to 2TB. See also [pgvector issue #667](https://github.com/pgvector/pgvector/issues/667).
+
+> [!TIP]
+> Monitor temp file usage with `log_temp_files`. Consider **per-session** tuning (`SET work_mem`) for heavy queries instead of global settings.
 details:
 - Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to four megabytes (4MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.
 recomendations:
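The worst-case formula quoted in the hunk above is worth evaluating when budgeting pod or instance memory; a minimal sketch (the helper name and the 4MB/3/2/100 inputs are illustrative assumptions, using the stock 4MB `work_mem` default):

```python
def worst_case_work_mem_mb(work_mem_mb: int, operations: int,
                           parallel_workers: int, connections: int) -> int:
    """Upper bound: work_mem × operations × parallel_workers × connections."""
    return work_mem_mb * operations * parallel_workers * connections

# Even the 4MB default adds up under load:
# 4MB × 3 operations × 2 workers × 100 connections = 2400 MB (~2.3 GB).
print(worst_case_work_mem_mb(4, 3, 2, 100))  # 2400
```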
@@ -49,15 +60,22 @@ categories:
 Let's get back to basics - PostgreSQL Memory Components: https://www.postgresql.fastware.com/blog/back-to-basics-with-postgresql-memory-components
 maintenance_work_mem:
 abstract: |
-Memory for maintenance operations: **VACUUM**, **CREATE INDEX**, **ALTER TABLE**, and autovacuum workers.
-
-Can be set higher than work_mem since fewer concurrent maintenance operations run.
-
-**Important**: Total usage = `maintenance_work_mem × autovacuum_max_workers`. Consider using `autovacuum_work_mem` separately.
-
-**PostgreSQL ≤16**: 1GB limit (~179M dead tuples per pass). **PostgreSQL 17+**: No limit (uses radix trees).
-
-**Windows ≤ PostgreSQL 17**: Maximum value is ~2GB (2097151 kB) due to Windows LLP64 model where `sizeof(long)==4` even on 64-bit systems. Fixed in [PostgreSQL 18](https://www.postgresql.org/message-id/flat/1a01f0-66ec2d80-3b-68487680@27595217) which increased the limit to 2TB. See also [pgvector issue #667](https://github.com/pgvector/pgvector/issues/667).
+Memory for maintenance operations: `VACUUM`, `CREATE INDEX`, `ALTER TABLE`, and autovacuum workers.
+
+Can be set higher than `work_mem` since fewer concurrent maintenance operations run.
+
+> [!IMPORTANT]
+> Total usage:
+> ```
+> total = maintenance_work_mem × autovacuum_max_workers
+> ```
+>
+> Consider using `autovacuum_work_mem` separately.
+
+> [!NOTE]
+> **PostgreSQL ≤16**: 1GB limit (~179M dead tuples per pass). **PostgreSQL 17+**: No limit (uses radix trees).
+>
+> **Windows ≤ PostgreSQL 17**: Maximum value is ~2GB (2097151 kB) due to Windows LLP64 model where `sizeof(long)==4` even on 64-bit systems. Fixed in [PostgreSQL 18](https://www.postgresql.org/message-id/flat/1a01f0-66ec2d80-3b-68487680@27595217) which increased the limit to 2TB. See also [pgvector issue #667](https://github.com/pgvector/pgvector/issues/667).
 recomendations:
 Adjusting maintenance_work_mem: https://www.cybertec-postgresql.com/en/adjusting-maintenance_work_mem/
 How Much maintenance_work_mem Do I Need?: http://rhaas.blogspot.com/2019/01/how-much-maintenanceworkmem-do-i-need.html
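The `total = maintenance_work_mem × autovacuum_max_workers` budget above can be checked the same way (a sketch; the 1 GB setting is an illustrative assumption, and 3 is the stock `autovacuum_max_workers` default):

```python
def autovacuum_memory_budget_mb(maintenance_work_mem_mb: int,
                                autovacuum_max_workers: int = 3) -> int:
    """Worst case if every autovacuum worker uses a full maintenance_work_mem."""
    return maintenance_work_mem_mb * autovacuum_max_workers

# maintenance_work_mem = 1GB with the default 3 workers can consume 3 GB.
print(autovacuum_memory_budget_mb(1024))  # 3072
```

Setting `autovacuum_work_mem` separately, as the alert suggests, caps the worker term independently of manual `VACUUM`/`CREATE INDEX` sessions.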
@@ -69,7 +87,7 @@ categories:
 checkpoint_related:
 min_wal_size:
 abstract: |
-Minimum size of pg_wal directory (pg_xlog in versions <10). WAL files are **recycled** rather than removed when below this threshold.
+Minimum size of `pg_wal` directory (`pg_xlog` in versions <10). WAL files are **recycled** rather than removed when below this threshold.
 
 Useful to handle **WAL spikes** during batch jobs or high write periods.
 recomendations:
@@ -79,11 +97,12 @@ categories:
 "Tuning Your Postgres Database for High Write Loads": https://www.crunchydata.com/blog/tuning-your-postgres-database-for-high-write-loads
 max_wal_size:
 abstract: |
-Triggers checkpoint when pg_wal exceeds this size. Larger values reduce checkpoint frequency but increase crash recovery time.
-
-**Recommendation**: Set to hold **1 hour of WAL**. Write-heavy systems may need significantly more.
+Triggers checkpoint when `pg_wal` exceeds this size. Larger values reduce checkpoint frequency but increase crash recovery time.
 
-Monitor `pg_stat_bgwriter` to ensure most checkpoints are **timed** (not requested).
+> [!TIP]
+> Set to hold **1 hour of WAL**. Write-heavy systems may need significantly more.
+>
+> Monitor `pg_stat_bgwriter` to ensure most checkpoints are **timed** (not requested).
 recomendations:
 "Basics of Tuning Checkpoints": https://www.enterprisedb.com/blog/basics-tuning-checkpoints
 "Tuning max_wal_size in PostgreSQL": https://www.enterprisedb.com/blog/tuning-maxwalsize-postgresql
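Sizing to "1 hour of WAL" presupposes knowing the WAL rate, which can be sampled with `pg_wal_lsn_diff(pg_current_wal_lsn(), <earlier lsn>)` between two points in time. The scaling arithmetic, as a sketch (the 20 MB/min rate is an illustrative assumption):

```python
def max_wal_size_for_window_mb(wal_mb_per_minute: float,
                               window_minutes: int = 60) -> int:
    """max_wal_size sized to hold the WAL produced over the given window."""
    return int(wal_mb_per_minute * window_minutes)

# A workload writing ~20 MB of WAL per minute → max_wal_size ≈ 1200 MB.
print(max_wal_size_for_window_mb(20))  # 1200
```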
@@ -95,8 +114,14 @@ categories:
 abstract: |
 Spreads checkpoint writes over this fraction of `checkpoint_timeout` to reduce I/O spikes.
 
-**Example**: `checkpoint_timeout = 5min` and `checkpoint_completion_target = 0.9`
-→ Checkpoint spreads writes over **270 seconds (4min 30s)**, leaving 30s buffer for sync overhead.
+> [!TIP]
+> Example:
+> ```
+> checkpoint_timeout = 5min
+> checkpoint_completion_target = 0.9
+> ```
+>
+> Checkpoint spreads writes over **270 seconds (4min 30s)**, leaving 30s buffer for sync overhead.
 
 Values higher than 0.9 risk checkpoint delays. Monitor via `pg_stat_bgwriter`.
 recomendations:
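The 270-second figure in the example follows directly from multiplying the two settings; as a worked check:

```python
def checkpoint_write_window_s(checkpoint_timeout_s: int,
                              completion_target: float) -> float:
    """Seconds over which checkpoint writes are spread."""
    return checkpoint_timeout_s * completion_target

# checkpoint_timeout = 5min (300 s) × target 0.9 → 270 s of writes, 30 s spare.
print(checkpoint_write_window_s(300, 0.9))  # 270.0
```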
@@ -127,22 +152,35 @@ categories:
 abstract: |
 Network interfaces PostgreSQL listens on for connections.
 
-**Security**: Default is `localhost` (local-only). Never use `*` or `0.0.0.0` exposed to internet.
-
-Use specific IPs with `pg_hba.conf` rules, or SSH tunnels/VPN for remote access.
+> [!WARNING]
+> **Security**: Default is `localhost` (local-only). Avoid `*` or `0.0.0.0` exposed to internet.
+>
+> Use specific IPs with `pg_hba.conf` rules, or SSH tunnels/VPN for remote access.
+>
+> If exposing PostgreSQL over network, **always enable SSL/TLS** (`ssl = on` + certificates) and enforce `hostssl` in `pg_hba.conf`.
 recomendations:
 "PostgreSQL Connections and Authentication": https://www.postgresql.org/docs/current/runtime-config-connection.html
 "PostgreSQL Security: 12 rules for database hardening": https://www.cybertec-postgresql.com/en/postgresql-security-things-to-avoid-in-real-life/
 "Postgres security best practices": https://www.bytebase.com/reference/postgres/how-to/postgres-security-best-practices/
 max_connections:
 abstract: |
-Maximum concurrent database connections. Each connection consumes memory (~10MB + work_mem per operation).
-
-**Best practice**: Use **connection pooling** (PgBouncer, pgpool) instead of high max_connections.
-
-With pooling: 20-50 connections. Without pooling: 100-200 (but review memory impact).
-
-Formula: `(RAM - shared_buffers) / (work_mem × avg_operations_per_query)` for rough estimate.
+Maximum concurrent database connections. Each connection consumes memory (~10MB + `work_mem` per operation).
+
+> [!TIP]
+> Use **connection pooling** instead of high `max_connections`:
+> - [PgBouncer](https://www.pgbouncer.org/) - Lightweight, battle-tested
+> - [PgCat](https://github.com/postgresml/pgcat) - Modern, written in Rust
+> - [Pgpool-II](https://www.pgpool.net/) - Feature-rich with query caching
+>
+> | Scenario | Recommended Connections |
+> |----------|------------------------|
+> | With pooling | 20-50 |
+> | Without pooling | 100-200 (review memory impact) |
+>
+> Memory estimation formula:
+> ```
+> max_connections_limit = (RAM - shared_buffers) / (work_mem × avg_operations_per_query)
+> ```
 recomendations:
 "Tuning max_connections in PostgreSQL": https://www.cybertec-postgresql.com/en/tuning-max_connections-in-postgresql/
 "Why you should use Connection Pooling": https://www.enterprisedb.com/postgres-tutorials/why-you-should-use-connection-pooling-when-setting-maxconnections-postgres
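The estimation formula added in the hunk above, in code form (a rough sketch; the 16 GB / 4 GB / 64 MB / 3-operation inputs are illustrative assumptions, not from the commit):

```python
def max_connections_estimate(ram_mb: int, shared_buffers_mb: int,
                             work_mem_mb: int, avg_operations_per_query: int) -> int:
    """Rough ceiling: (RAM - shared_buffers) / (work_mem × avg ops per query)."""
    return (ram_mb - shared_buffers_mb) // (work_mem_mb * avg_operations_per_query)

# 16 GB RAM, 4 GB shared_buffers, 64 MB work_mem, ~3 operations/query → ~64.
print(max_connections_estimate(16384, 4096, 64, 3))  # 64
```

A pooler in front of the database keeps the real connection count well under this ceiling.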
@@ -156,13 +194,16 @@ categories:
 
 Lower values favor index scans, higher values favor sequential scans. Sequential scans become more efficient when queries return ~5-10% or more of table rows, common in analytical/DW workloads.
 
-**Debate (2025)**: Some experts advocate keeping higher values (4.0) for **plan stability** across cache states, while others recommend lower values (1.1-2.0) for SSD to favor index scans.
+> [!NOTE]
+> **Ongoing debate (2025)**: Some experts advocate keeping higher values (4.0) for **plan stability** across cache states, while others recommend lower values (1.1-2.0) for SSD to favor index scans.
+>
+> Check suggested readings #1 and #2 for detailed analysis.
 
 Test with `EXPLAIN ANALYZE` to verify query plan choices for your workload.
 recomendations:
-"How a single PostgreSQL config change improved slow query performance by 50x": https://amplitude.engineering/how-a-single-postgresql-config-change-improved-slow-query-performance-by-50x-85593b8991b0
-"Better PostgreSQL performance on SSDs": https://www.cybertec-postgresql.com/en/better-postgresql-performance-on-ssds/
 "PostgreSQL with modern storage: what about a lower random_page_cost?": https://dev.to/aws-heroes/postgresql-with-modern-storage-what-about-a-lower-randompagecost-5b7f
+"Better PostgreSQL performance on SSDs": https://www.cybertec-postgresql.com/en/better-postgresql-performance-on-ssds/
+"How a single PostgreSQL config change improved slow query performance by 50x": https://amplitude.engineering/how-a-single-postgresql-config-change-improved-slow-query-performance-by-50x-85593b8991b0
 "Postgres Scan Types in EXPLAIN Plans": https://www.crunchydata.com/blog/postgres-scan-types-in-explain-plans
 "Tuning Your PostgreSQL Server": https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server
 effective_io_concurrency:
@@ -171,22 +212,22 @@ categories:
 
 Bitmap scans are used when queries need to fetch moderate result sets (too many rows for index scans, too few for sequential scans) or when combining multiple indexes. They're more common in analytical workloads.
 
-PostgreSQL 18 changes the default from 1 to 16. Values above 200 show diminishing returns in benchmarks.
+> [!NOTE]
+> **PostgreSQL 18** changes the default from `1` to `16`. Values above `200` show diminishing returns in benchmarks.
 recomendations:
 "PostgreSQL: effective_io_concurrency benchmarked": https://portavita.github.io/2019-07-19-PostgreSQL_effective_io_concurrency_benchmarked/
 "Bitmap Heap Scan - pganalyze": https://pganalyze.com/docs/explain/scan-nodes/bitmap-heap-scan
 "PostgreSQL indexing: Index scan vs. Bitmap scan vs. Sequential scan (basics)": https://www.cybertec-postgresql.com/en/postgresql-indexing-index-scan-vs-bitmap-scan-vs-sequential-scan-basics/
 io_method:
 abstract: |
-Selects the async I/O implementation for read operations (PostgreSQL 18+).
-
-**worker** (default): Uses dedicated background processes. Best for most workloads, especially high-bandwidth sequential scans. Recommended as default.
-
-**io_uring** (Linux only): Kernel-level async I/O. Only switch after extensive testing proves benefit for your specific low-latency random-read patterns. Can hit file descriptor limits with high max_connections.
+Selects the async I/O implementation for read operations (PostgreSQL 18+):
 
-**sync**: Traditional synchronous I/O. Slower than async methods - avoid unless debugging or testing.
+- **`worker`** (default): Uses dedicated background processes. Best for most workloads, especially high-bandwidth sequential scans. Recommended as default.
+- **`io_uring`** (Linux only): Kernel-level async I/O. Only switch after extensive testing proves benefit for your specific low-latency random-read patterns. Can hit file descriptor limits with high `max_connections`.
+- **`sync`**: Traditional synchronous I/O. Slower than async methods - avoid unless debugging or testing.
 
-Note: Only affects reads. Writes, checkpoints, and WAL still use sync I/O.
+> [!NOTE]
+> Only affects reads. Writes, checkpoints, and WAL still use sync I/O.
 recomendations:
 "Tuning AIO in PostgreSQL 18 - Tomas Vondra": https://vondra.me/posts/tuning-aio-in-postgresql-18/
 "Waiting for Postgres 18: Accelerating Disk Reads with Asynchronous I/O - pganalyze": https://pganalyze.com/blog/postgres-18-async-io
@@ -197,7 +238,10 @@ categories:
 abstract: |
 Background worker processes for async I/O when `io_method = worker`.
 
-Default of 3 is too low for modern multi-core systems. Recommendation: **10-40% of CPU cores** depending on workload.
+> [!TIP]
+> Default of `3` is too low for modern multi-core systems.
+>
+> **Recommendation**: 10-40% of CPU cores depending on workload.
 
 Higher values benefit workloads with:
 - Sequential scans (DW/analytical queries)
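The 10-40% guidance above translates to a small per-machine range; a sketch (clamping to the stock default of 3 as a floor is my assumption, not stated in the commit):

```python
def io_workers_range(cpu_cores: int) -> tuple:
    """(low, high) io_workers candidates: 10% to 40% of cores, floor of 3."""
    low = max(3, cpu_cores * 10 // 100)
    high = max(3, cpu_cores * 40 // 100)
    return (low, high)

# 32 cores → try between 3 and 12 workers depending on workload.
print(io_workers_range(32))  # (3, 12)
```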
@@ -247,7 +291,10 @@ categories:
 abstract: |
 Hard limit on concurrent I/O operations per backend process (PostgreSQL 18+).
 
-Controls read-ahead with async I/O. Formula: `max read-ahead = effective_io_concurrency × io_combine_limit`
+Controls read-ahead with async I/O:
+```
+max_read_ahead = effective_io_concurrency × io_combine_limit
+```
 
 Higher values benefit high-latency storage (cloud/EBS) with high IOPS. Watch memory usage - high concurrency increases memory pressure.
 recomendations:
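The read-ahead formula above, in code (a sketch; I assume the shipped `io_combine_limit` default of 128 kB — verify on your build with `SHOW io_combine_limit`):

```python
def max_read_ahead_kb(effective_io_concurrency: int,
                      io_combine_limit_kb: int = 128) -> int:
    """Read-ahead upper bound: effective_io_concurrency × io_combine_limit."""
    return effective_io_concurrency * io_combine_limit_kb

# PostgreSQL 18's default effective_io_concurrency of 16 → 16 × 128 kB = 2 MB.
print(max_read_ahead_kb(16))  # 2048
```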
@@ -258,9 +305,10 @@ categories:
 "PostgreSQL 18 Asynchronous I/O - Neon": https://neon.com/postgresql/postgresql-18/asynchronous-io
 file_copy_method:
 abstract: |
-Method for copying files during **CREATE DATABASE** and **ALTER DATABASE SET TABLESPACE** (PostgreSQL 18+).
+Method for copying files during `CREATE DATABASE` and `ALTER DATABASE SET TABLESPACE` (PostgreSQL 18+).
 
-Recommendation: Use **clone** if your filesystem supports it - dramatically faster (200-600ms for 100s of GB) and initially consumes zero extra disk space.
+> [!TIP]
+> Use `clone` if your filesystem supports it - dramatically faster (200-600ms for 100s of GB) and initially consumes zero extra disk space.
 recomendations:
 "Instant database clones with PostgreSQL 18": https://boringsql.com/posts/instant-database-clones/
 "Instant Per-Branch Databases with PostgreSQL 18's clone": https://medium.com/axial-engineering/instant-per-branch-databases-with-postgresql-18s-clone-file-copy-and-copy-on-write-filesystems-1b1930bddbaa
@@ -273,9 +321,10 @@ categories:
 Pool from which all background workers are drawn. Must accommodate:
 - Parallel query workers (`max_parallel_workers`)
 - Logical replication workers
-- Extensions (pg_stat_statements, etc.)
+- Extensions (`pg_stat_statements`, etc.)
 
-Recommendation: Set to **CPU core count** or at least **25% of vCPUs**. Requires restart.
+> [!TIP]
+> Set to **CPU core count** or at least **25% of vCPUs**. Requires restart.
 recomendations:
 "PostgreSQL Performance Tuning Best Practices 2025": https://www.mydbops.com/blog/postgresql-parameter-tuning-best-practices
 "PostgreSQL Performance Tuning: Key Parameters": https://www.tigerdata.com/learn/postgresql-performance-tuning-key-parameters
@@ -284,7 +333,8 @@ categories:
 abstract: |
 Maximum parallel workers per query executor node.
 
-Each worker consumes resources individually (work_mem, CPU, I/O). A query with 4 workers uses 5x resources (1 leader + 4 workers).
+> [!IMPORTANT]
+> Each worker consumes resources individually (`work_mem`, CPU, I/O). A query with 4 workers uses 5x resources (1 leader + 4 workers).
 recomendations:
 "Increasing max parallel workers per gather in Postgres": https://www.pgmustard.com/blog/max-parallel-workers-per-gather
 "Postgres Tuning & Performance for Analytics Data": https://www.crunchydata.com/blog/postgres-tuning-and-performance-for-analytics-data
@@ -296,7 +346,8 @@ categories:
 
 Limits total parallel workers from the `max_worker_processes` pool. Cannot exceed `max_worker_processes`.
 
-Recommendation: Set equal to **CPU core count** or `max_worker_processes`.
+> [!TIP]
+> Set equal to **CPU core count** or `max_worker_processes`.
 recomendations:
 "Parallel Queries in Postgres": https://www.crunchydata.com/blog/parallel-queries-in-postgres
 "PostgreSQL Performance Tuning Best Practices 2025": https://www.mydbops.com/blog/postgresql-parameter-tuning-best-practices
