Static analysis for PostgreSQL schemas. Finds the schema smells that break replication, slow down queries, or quietly accumulate bad data — missing primary keys, foreign keys without indexes, type mismatches across FKs, redundant indexes, and more.
Works against PostgreSQL 10+. Pure-SQL checks against the system catalogs — no extensions, no superuser, read-only.
From source:

```bash
git clone https://github.com/pirr/pgsleuth.git
cd pgsleuth
pip install -e '.[dev]'
```

Or with Docker (no Python on the host required):
```bash
docker build -t pgsleuth .
docker run --rm pgsleuth check --dsn postgresql://user:pw@host/db
```

The image is built on python:3.13-alpine and runs as a non-root user.
Passing a config file: files on your host aren't visible inside the container — mount the config as a volume and point --config at the in-container path:
```bash
docker run --rm \
  -v "$PWD/pgsleuth.toml:/pgsleuth.toml:ro" \
  pgsleuth check \
  --dsn postgresql://user:pw@host/db \
  --config /pgsleuth.toml
```

The same pattern works for --baseline or any other path-typed flag.
```bash
pgsleuth check --dsn postgresql://user:pw@host/db
```

Or via env var:

```bash
export PGSLEUTH_DSN=postgresql://user:pw@host/db
pgsleuth check
```

Filter to specific checkers, change output format, or raise the severity floor:
```bash
pgsleuth check --dsn $DSN --checkers missing_primary_key,missing_fk_index
pgsleuth check --dsn $DSN --format json
pgsleuth check --dsn $DSN --min-severity warning
```

List every available check:

```bash
pgsleuth list-checkers
```

Exit codes: 0 = clean, 1 = issues found, 2 = error.
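In a CI wrapper it can be worth mapping those exit codes explicitly rather than treating any non-zero status as a hard failure. A minimal Python sketch — the code-to-meaning mapping is from the list above; the helper function and its name are illustrative, not part of pgsleuth:

```python
import subprocess

# Exit-code meanings documented above: 0 = clean, 1 = issues found, 2 = error.
LABELS = {0: "clean", 1: "issues found", 2: "error"}

def run_pgsleuth(dsn: str) -> str:
    """Run pgsleuth and return a human-readable verdict (illustrative helper)."""
    rc = subprocess.run(["pgsleuth", "check", "--dsn", dsn]).returncode
    return LABELS.get(rc, f"unexpected exit code {rc}")
```

Distinguishing 1 from 2 matters in automation: findings are actionable review feedback, while 2 (e.g. a bad DSN) should fail the pipeline loudly.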
| Name | Catches |
|---|---|
| missing_primary_key | Ordinary tables without a PK (breaks logical replication, ambiguous rows). |
| missing_replica_identity | Tables whose effective REPLICA IDENTITY is empty — UPDATE/DELETE won't replicate logically. |
| primary_key_type | Primary keys whose type can run out of values (e.g. int4 on a hot table). |
| column_value_at_risk | Sequence-backed columns whose sequence is past 70% of its type's max value. |
| missing_fk_index | Foreign-key columns without a covering index — slow joins, slow ON DELETE. |
| fk_type_mismatch | FK columns whose type differs from the referenced PK/unique column. |
| fk_without_on_delete | FKs with no explicit ON DELETE (defaults to NO ACTION, rarely the intended choice). |
| redundant_index | Indexes whose column list is a strict prefix of another index on the table. |
| unused_index | Secondary indexes that have never been scanned (pg_stat_user_indexes.idx_scan = 0). |
| not_valid_constraints | FK and CHECK constraints stuck at NOT VALID (convalidated = false). |
| sequence_drift | Sequences whose nextval would collide with rows already in the table. |
| three_state_boolean | Boolean columns without NOT NULL (true / false / null is rarely intended). |
| json_over_jsonb | Columns typed as json; jsonb supports indexing and is faster on read. |
| timestamp_without_tz | Columns typed as timestamp (without time zone) — silent timezone bugs. |
| varchar_length | varchar(N) columns where text is equivalent in Postgres and avoids the cap. |
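To make one of these concrete: the 70% threshold behind column_value_at_risk is plain arithmetic over the backing type's maximum value. A Python sketch of that check — the names and structure here are illustrative, not pgsleuth's internals:

```python
# Maximum values of the integer types that commonly back sequences.
TYPE_MAX = {
    "int2": 2**15 - 1,  # smallint
    "int4": 2**31 - 1,  # integer
    "int8": 2**63 - 1,  # bigint
}

def sequence_at_risk(last_value: int, col_type: str, threshold: float = 0.70) -> bool:
    """True when the sequence has consumed `threshold` of the type's range."""
    return last_value >= TYPE_MAX[col_type] * threshold
```

An int4-backed sequence at 2 billion is well past 70% of 2,147,483,647, so it would be flagged long before inserts start failing.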
Every rule has a dedicated page covering rationale, examples, fix SQL, and "when to ignore." Each reported Issue carries a docs_url pointing at the rule's page, plus a severity (info / warning / error), a fully-qualified object name, a human-readable message, and a suggested fix.
Severity is intentional, not arbitrary — see docs/severity.md for the full philosophy, the per-rule rationale, and --min-severity guidance for CI.
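Mechanically, the --min-severity floor is just an ordering over the three levels. A sketch — the numeric ranks are illustrative; only the level names and their order come from the description above:

```python
# The three severities, lowest to highest, as listed above.
SEVERITY_ORDER = {"info": 0, "warning": 1, "error": 2}

def passes_floor(severity: str, min_severity: str) -> bool:
    """Keep a finding only if it meets the severity floor."""
    return SEVERITY_ORDER[severity] >= SEVERITY_ORDER[min_severity]
```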
For brownfield databases with hundreds of pre-existing findings, baseline mode lets you adopt pgsleuth in CI without fixing all of them first. Snapshot the current findings to a JSON file, commit it; from then on CI fails only on new findings.
```bash
# Day 1 — one-time setup
pgsleuth baseline write --dsn $DSN
git add pgsleuth.baseline.json
git commit -m "baseline: snapshot of accepted schema findings"

# Every CI run (auto-discovers ./pgsleuth.baseline.json)
pgsleuth check --dsn $DSN
# pgsleuth: using pgsleuth.baseline.json (auto-discovered; pass --no-baseline to skip)
# Suppressed 116 findings via baseline.
# No issues found.
# exit 0
```

When a migration introduces a new finding, CI fails on just that one finding:
```
[WARNING] public.audit_log(user_id)
  Foreign key 'audit_log_user_id_fkey' has no covering index.
  suggestion: CREATE INDEX ON public.audit_log (user_id);

Suppressed 116 findings via baseline.
Summary: 1 warning
exit 1
```
The author either fixes it (preferred) or runs pgsleuth baseline write again to accept it — visible in the PR diff for review.
```bash
pgsleuth baseline show                         # human-readable view of the file (no DB)
pgsleuth baseline prune --dsn $DSN             # remove entries that no longer reproduce
pgsleuth baseline prune --dry-run --dsn $DSN   # preview what prune would remove
```

Run prune periodically (every few months, or before reviewing the baseline). Without it, fixed-but-still-baselined entries silently mask regressions.
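Conceptually, prune is an intersection: keep only the committed entries that still reproduce against the live database. A sketch of that idea, not pgsleuth's actual implementation:

```python
def prune(baseline: set[str], reproduced: set[str]) -> set[str]:
    """Drop baseline fingerprints that no longer reproduce against the live DB."""
    return baseline & reproduced
```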
| Flag | Effect |
|---|---|
| (none) | Auto-discovers ./pgsleuth.baseline.json if present. Prints a one-line stderr notice when it does. |
| --baseline PATH | Use the baseline at PATH (overrides auto-discovery). |
| --no-baseline | Disable auto-discovery for this run; report every finding. Useful for periodic full audits. |
Each finding gets a stable fingerprint of (checker, object_name) — sha256("pgsleuth/baseline/v1\0checker\0object_name"). The fingerprint excludes the human-readable message, so rephrasing the message in a future pgsleuth release doesn't break committed baselines. The file is sorted deterministically by (checker, object) for clean diffs in code review.
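The quoted format pins the scheme down exactly; in Python it would read as follows (a sketch of the documented scheme, not pgsleuth's internal code):

```python
import hashlib

def fingerprint(checker: str, object_name: str) -> str:
    # sha256("pgsleuth/baseline/v1\0checker\0object_name"), per the scheme above.
    payload = f"pgsleuth/baseline/v1\0{checker}\0{object_name}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

The NUL separators and version prefix make the hash unambiguous (no checker/object pair can collide with another by concatenation) and leave room to change the scheme later without silently matching old baselines.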
Pass a TOML config file with --config:

```toml
[pgsleuth]
exclude_schemas = ["pg_catalog", "information_schema", "pg_toast", "audit"]
exclude_tables = ["^tmp_", "_archive$"]  # regex, matched against table name

[pgsleuth.checkers.three_state_boolean]
severity = "error"

[pgsleuth.checkers.redundant_index]
enabled = false
```

The same filters can be passed on the command line:
```bash
pgsleuth check --dsn $DSN \
  --exclude-schema audit \
  --exclude-table '^tmp_'
```

pgsleuth exits non-zero on findings, so it drops straight into a CI step:
```yaml
- run: pgsleuth check --dsn $DATABASE_URL --min-severity warning
  # If pgsleuth.baseline.json is in the repo root, it's auto-discovered
  # and CI fails only on new findings beyond it. See "Baseline mode" above.
```

Use --format json if you want to pipe results into a reporter.
```bash
pip install -e '.[dev]'
pytest -v
```

The test suite spins up real PostgreSQL containers (10, 13, 15, 17) via testcontainers and runs every check against each — Docker required.
Adding a new rule? See docs/writing_checkers.md for the framework, the two base classes, the wire-up steps, and a worked example.
See ROADMAP.md for what's coming and how to pick up an item.
MIT.