read_only open racing another process's checkpoint SIGSEGVs (0.17.1, 0.18.0): reliable repro, plus verification that #615 fixes it #642

Description

@M0nkeyFl0wer

Version: ladybug==0.18.0 (also reproduces on 0.17.1)
OS: Linux 6.11.0-29-generic (Ubuntu 24.04)
Python: 3.12

Summary

This is the crash #615 addresses. Filing as a tracking issue since #615 has no
linked issue, with two contributions: a reliable minimal repro, and independent
verification that #615 fixes it (built from source and tested today).

When one process opens a database with read_only=True while another process is
write-churning it (insert batch, CHECKPOINT, close, in a loop), the reader's
Database() constructor usually fails cleanly with:

RuntimeError: Runtime exception: Couldn't replay shadow pages under read-only mode.
RuntimeError: IO exception: Cannot open file <db>.wal: No such file or directory

Both are catchable, and both are fine. Occasionally the same race dereferences bad
memory instead, and the reader process dies with SIGSEGV. This matches #615's
description of recovery observing intermediate checkpoint artifacts.

A note on the bug class, since most recent crash fixes here have been in the
non-exhaustive-switch family (#619, #566, the arrow_array_scan reports): this one
is temporal rather than structural. No switch is missing a case. The detection
already exists (the clean shadow-pages error above proves it), but one
interleaving slips between detecting mid-checkpoint state and dereferencing it.
Still present in 0.18.0, so #611/#612 (single-process rel-delete plus checkpoint)
is a different bug.

Reproducer

A single self-contained script (see Repro script below) seeds a fresh DB with
200k small nodes (--seed-rows 200000; the script's default is 50000), then runs
3 reader processes (loop: read_only open, MATCH count, close) against 1 writer
process (loop: open, 20 inserts, CHECKPOINT, close).

Observed on this machine, 60 to 75 second runs:

  • 3/3 reader processes exit -11 (SIGSEGV) on both 0.17.1 and 0.18.0, every
    run, interleaved with hundreds of the clean RuntimeErrors before the fatal one.
  • A small seed (2k rows) did not reproduce within 45 s. The window appears to
    scale with recovery/replay work; 200k rows reproduces reliably.
  • Under gdb the crash did not fire (11,307 opens, 1,706 clean errors, 0 SEGV).
    Timing-sensitive, consistent with a race.
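
For readers unfamiliar with the "-11" above: Python's subprocess module reports a child killed by a signal as the negated signal number. A minimal sketch of decoding such exit codes follows; the helper name describe_exit is hypothetical and not part of the repro script.

```python
import signal


def describe_exit(code):
    """Turn a subprocess return code into a short human-readable label."""
    if code == 0:
        return "clean exit"
    if code < 0:
        # A negative return code means the child was terminated by signal -code.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = f"signal {-code}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    for rc in (0, 1, -signal.SIGSEGV):
        print(rc, "->", describe_exit(rc))
```

This distinguishes a reader that died from SIGSEGV from one that merely exited non-zero for an unrelated reason.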

faulthandler output at the moment of death shows the crash inside the native
Database() constructor (recover/replay path), not in query execution:

Fatal Python error: Segmentation fault
  File ".../ladybug/database.py", line 260 in init_pybind_database
  File ".../ladybug/database.py", line 221 in init_database
  File ".../ladybug/database.py", line 136 in __init__

A production coredump of the same signature (0.17.1) backtraced through
Database::Database → StorageManager::recover → NodeTable::insert → lookupPK.

Verification: #615 fixes it

Built ericyuanhui/ladybug@main_test (head 2b73562, base 0.18.0) from source
and ran the identical repro, with module provenance asserted before each run:

Build                       Trials                 Reader crashes
stock 0.18.0 (PyPI)         75 s, 3 readers        3/3 SIGSEGV, every run
#615 build (0.18.0 + fix)   2 × 75 s, 3 readers    0/6

Under the fix, each reader completed roughly 4,700 successful opens and 5,200
clean catchable RuntimeErrors per 40 s. That is the desired behavior: race
detected, error raised, process alive. Since the review on #615 is waiting on
more careful consideration post-0.18.x, this repro may be useful as a regression
harness: it validates any revision of the fix within about a minute. Happy to
re-run it against future revisions.
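
The "module provenance asserted before each run" step above could be done along these lines. This is only a sketch of one way to check which installation an import resolves to; the helper name assert_module_from and the example path are hypothetical, and the actual check used for the runs above is not shown in this issue.

```python
import importlib.util
from pathlib import Path


def assert_module_from(name, expected_root):
    """Raise unless module `name` would be imported from under `expected_root`."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        raise RuntimeError(f"module {name!r} not found")
    origin = Path(spec.origin).resolve()
    root = Path(expected_root).resolve()
    # The module file must live somewhere below the expected directory.
    if root not in origin.parents:
        raise RuntimeError(f"{name!r} resolves to {origin}, expected under {root}")
    return origin


# Hypothetical usage before a trial (the path is an assumption):
#   assert_module_from("ladybug", "/path/to/source-build/site-packages")
```

Printing the resolved path next to the version string in each run's log makes it harder to test the PyPI wheel by accident when a source build was intended.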

Expected behaviour

The reader always gets the existing clean error, never a SIGSEGV. Applications
can retry an error; a dead process is uncatchable. #615 achieves exactly this.
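
To illustrate why the clean error is acceptable: a reader can wrap the open in a small retry loop. The sketch below is an assumption about how an application might do this, not code from any deployment; open_with_retry and its parameters are hypothetical, and the opener is passed in as a callable so the loop itself does not depend on ladybug.

```python
import time


def open_with_retry(open_fn, attempts=5, delay=0.05):
    """Call open_fn() until it succeeds or `attempts` tries have failed.

    Only RuntimeError is retried, since that is what the clean race
    errors quoted in this issue surface as in Python.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return open_fn()
        except RuntimeError as exc:
            last_error = exc
            # Simple linear backoff so the writer can finish its checkpoint.
            time.sleep(delay * (attempt + 1))
    raise last_error


# Hypothetical usage with the API calls shown in the repro script:
#   import ladybug as lb
#   db = open_with_retry(lambda: lb.Database(path, read_only=True))
```

No such loop can help when the process is killed by SIGSEGV, which is the point of the expected behaviour above.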

Repro note

Run readers with ulimit -c 0. With coredumps enabled, the kernel attempts to
dump the ~8 TB sparse buffer-pool reservation; we observed crashed readers stuck
unkillable at 100% CPU in the dump path for 10+ minutes (SIGKILL queued behind
the in-kernel dump). Possibly worth a separate small issue: MADV_DONTDUMP on
the buffer-pool reservation would make any crash far cheaper to debug.
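
If wrapping the command in a shell is inconvenient, the same effect as ulimit -c 0 can be had from inside the process on POSIX systems. A minimal sketch, assuming Linux or another platform that provides the standard resource module; the helper name disable_core_dumps is hypothetical.

```python
import resource


def disable_core_dumps():
    """Set the soft core-file size limit to 0 for this process and its children."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Lower only the soft limit; the hard limit is left untouched.
    resource.setrlimit(resource.RLIMIT_CORE, (0, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    print("RLIMIT_CORE is now", disable_core_dumps())
```

Calling this at the top of a reader process, before the database is opened, should keep the kernel from attempting to dump the sparse reservation if the process then crashes.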

Workaround (deployed)

Snapshot-swap on the read path: the writer owns the live file and atomically
refreshes a read-only snapshot after each checkpoint; readers open only the
snapshot. No shared open, no race. Works, at the cost of a file copy per write
cycle and some read staleness.
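
The snapshot refresh can be sketched as a copy to a temporary file followed by an atomic rename. This is an illustration under stated assumptions, not the deployed code: refresh_snapshot is a hypothetical name, and it assumes the database is a single file that is fully checkpointed and closed by the writer when the copy is made.

```python
import os
import shutil
import tempfile
from pathlib import Path


def refresh_snapshot(live_path, snapshot_path):
    """Copy the live DB file aside, then atomically swap it into place."""
    live = Path(live_path)
    snapshot = Path(snapshot_path)
    # The temp file must be in the snapshot's directory so that os.replace
    # stays on one filesystem and is therefore atomic.
    fd, tmp_name = tempfile.mkstemp(
        dir=snapshot.parent, prefix=snapshot.name + ".", suffix=".tmp"
    )
    os.close(fd)
    try:
        shutil.copyfile(live, tmp_name)
        os.replace(tmp_name, snapshot)
    except BaseException:
        # Do not leave a partial copy behind on failure.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return snapshot
```

Readers then open only the snapshot path with read_only=True. On POSIX, a reader that already has the previous snapshot open keeps reading the old file until it closes it, which is where the read staleness comes from.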

Repro script

#!/usr/bin/env python3
"""Minimal SYNTHETIC repro: read_only Database open racing a writer's CHECKPOINT.

Self-contained — creates its own DB from scratch, no external data, plain
`import ladybug`. Expected (correct) behavior for a reader that races the
checkpoint: the clean `RuntimeError: Couldn't replay shadow pages under
read-only mode`. Observed bug: the same race path occasionally dereferences
bad memory and the reader process dies with SIGSEGV.

Usage: python repro_synthetic.py [--seconds 60] [--readers 3] [--seed-rows 50000]
"""
import argparse
import faulthandler
import subprocess
import sys
import time
from pathlib import Path

def seed(path, rows):
    import ladybug as lb
    db = lb.Database(path)
    conn = lb.Connection(db)
    conn.execute("CREATE NODE TABLE IF NOT EXISTS N(id INT64, payload STRING, PRIMARY KEY(id))")
    conn.execute(f"UNWIND range(0, {rows-1}) AS i "
                 f"CREATE (:N {{id: i, payload: '{'x'*256}'}})")
    conn.execute("CHECKPOINT")
    conn.close()
    db.close()
    print(f"[seed] {rows} rows, file={Path(path).stat().st_size/1e6:.0f} MB", flush=True)

def worker(role, path, seconds):
    faulthandler.enable()
    import ladybug as lb
    end = time.time() + seconds
    if role == "writer":
        n = int(time.time() * 1000) * 1000  # unique base per run
        while time.time() < end:
            try:
                db = lb.Database(path, buffer_pool_size=256*1024*1024)
                conn = lb.Connection(db)
                for _ in range(20):
                    conn.execute("CREATE (:N {id: $i, payload: $p})",
                                 {"i": n, "p": "y" * 256})
                    n += 1
                conn.execute("CHECKPOINT")
                conn.close()
                db.close()
            except Exception as e:
                print(f"[writer] caught: {type(e).__name__}: {str(e)[:70]}", flush=True)
            time.sleep(0.05)
        print("[writer] DONE", flush=True)
    else:
        opens = caught = 0
        while time.time() < end:
            try:
                db = lb.Database(path, read_only=True, buffer_pool_size=128*1024*1024)
                conn = lb.Connection(db)
                r = conn.execute("MATCH (n:N) RETURN count(n)")
                while r.has_next():
                    r.get_next()
                conn.close()
                db.close()
                opens += 1
            except Exception as e:
                caught += 1
                if caught <= 2:
                    print(f"[reader] caught (clean): {type(e).__name__}: {str(e)[:70]}", flush=True)
        print(f"[reader] DONE opens={opens} clean_errors={caught}", flush=True)

def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--role", default="")
    ap.add_argument("--path", default="")
    ap.add_argument("--seconds", type=float, default=60.0)
    ap.add_argument("--readers", type=int, default=3)
    ap.add_argument("--seed-rows", type=int, default=50000)
    ap.add_argument("--scratch", default="/tmp/lb-synth-repro")
    a = ap.parse_args()
    if a.role:
        worker(a.role, a.path, a.seconds)
        return 0
    import shutil
    scratch = Path(a.scratch)
    if scratch.exists():
        shutil.rmtree(scratch)
    scratch.mkdir(parents=True)
    dbpath = str(scratch / "synth.ldb")
    import ladybug as lb
    print(f"ladybug {lb.__version__}", flush=True)
    seed(dbpath, a.seed_rows)
    base = [sys.executable, __file__, "--path", dbpath, "--seconds", str(a.seconds)]
    def spawn(role):
        return subprocess.Popen(["bash","-c",'ulimit -c 0; exec "$@"',"_",*base,"--role",role])
    procs = {f"reader{i}": spawn("reader") for i in range(1, a.readers+1)}
    procs["writer"] = spawn("writer")
    codes = {n: p.wait() for n, p in procs.items()}
    print("\n=== EXIT CODES (-11 = SIGSEGV) ===")
    for n, c in codes.items():
        print(f"  {n}: {c}")
    # Count only readers actually killed by SIGSEGV (-11), not every non-zero exit.
    segv = any(c == -11 for n, c in codes.items() if n.startswith("reader"))
    print("RESULT:", "SEGV REPRODUCED" if segv else "no SEGV this run")
    return 0

if __name__ == "__main__":
    raise SystemExit(main())
