Summary
db.iter_rows_keyset(...) hangs indefinitely when the by column has NULL values. Reproduces on SQLite (aiosqlite) and Postgres (asyncpg, psycopg).
Repro (SQLite, no setup)
import asyncio
from etchdb import DB, Row
class T(Row):
__table__ = "t"
id: int
label: str | None = None
async def main():
db = await DB.from_url("sqlite+aiosqlite:///:memory:")
try:
await db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, label TEXT)")
await db.execute("INSERT INTO t VALUES (1, NULL), (2, NULL), (3, 'a'), (4, 'b')")
rows = []
async for r in db.iter_rows_keyset(T, by="label", batch_size=2):
rows.append((r.id, r.label))
if len(rows) > 10:
print(f"infinite loop: {len(rows)} yields, expected 4")
print(f" first 6: {rows[:6]}")
return
print(f"finished with {len(rows)} rows")
finally:
await db.close()
asyncio.run(main())
Expected: 4 rows, then terminates.
Actual: yields (1, None), (2, None) forever.
Also reproduces on Postgres
Same shape, slightly different trigger: Postgres sorts NULLs last (SQLite sorts them first), so the bug needs >= batch_size + 1 NULL rows so that one lands at the end of a non-final page.
# postgres: 3 NULL rows + 1 non-null + batch_size=2 reproduces
await db.execute("INSERT INTO t VALUES (1, NULL), (2, NULL), (3, NULL), (4, 'a')")
Root cause
In src/etchdb/db.py:294:
last_seen = rows[-1][by] # becomes None if last row has NULL in `by`
In src/etchdb/sql/__init__.py:169:
if last_seen is not None:
where_parts.append(f"{by} > {placeholder(len(params))}")
When last_seen is None, the cursor predicate is skipped entirely, so the next page is identical to the previous one - same rows, same last_seen=None, looping forever.
On the documented contract
The docstring says by must be "monotonic-ordered and unique enough that no two rows tie" - NULL technically violates that. But a silent hang is a much worse failure mode than the docstring's wording prepares the user for: no exception, no log line, just a coroutine that never ends. Easy to ship to production unnoticed.
Fix shape - two options
Option A (small, narrow): runtime guard. Keeps the single-column keyset narrow, converts the hang into a clear error.
last_seen = rows[-1][by]
if last_seen is None:
raise ValueError(
f"iter_rows_keyset: {by!r} is NULL at the page boundary; the "
f"cursor cannot advance. Pick a non-nullable column for `by`, "
f"or filter NULLs out via raw SQL."
)
This is the minimum to stop the silent hang. Doesn't address tied non-NULL values, which silently drop rows from the same root cause (WHERE by > last_seen can't advance through equal values either) — separate issue, same family.
Option B (bigger): compound cursor (by, pk) with NULL-aware comparison (IS NOT DISTINCT FROM on Postgres, IS on SQLite). Tie-breaks on PK so the cursor can advance through both NULL rows and tied by values. Fixes the NULL hang and the silent-tie-skip together. Bigger surface - dialect-specific SQL, more involved testing.
Environment
- etchdb 0.6.0
- Python 3.14
- aiosqlite 0.22.1 / asyncpg 0.31.0 / psycopg 3.3.4
Summary
db.iter_rows_keyset(...)hangs indefinitely when thebycolumn has NULL values. Reproduces on SQLite (aiosqlite) and Postgres (asyncpg, psycopg).Repro (SQLite, no setup)
Expected: 4 rows, then terminates.
Actual: yields
(1, None), (2, None)forever.Also reproduces on Postgres
Same shape, slightly different trigger: Postgres sorts NULLs last (SQLite sorts them first), so the bug needs
>= batch_size + 1NULL rows so that one lands at the end of a non-final page.Root cause
In
src/etchdb/db.py:294:In
src/etchdb/sql/__init__.py:169:When
last_seen is None, the cursor predicate is skipped entirely, so the next page is identical to the previous one - same rows, samelast_seen=None, looping forever.On the documented contract
The docstring says
bymust be "monotonic-ordered and unique enough that no two rows tie" - NULL technically violates that. But a silent hang is a much worse failure mode than the docstring's wording prepares the user for: no exception, no log line, just a coroutine that never ends. Easy to ship to production unnoticed.Fix shape - two options
Option A (small, narrow): runtime guard. Keeps the single-column keyset narrow, converts the hang into a clear error.
This is the minimum to stop the silent hang. Doesn't address tied non-NULL values, which silently drop rows from the same root cause (
WHERE by > last_seencan't advance through equal values either) — separate issue, same family.Option B (bigger): compound cursor
(by, pk)with NULL-aware comparison (IS NOT DISTINCT FROMon Postgres,ISon SQLite). Tie-breaks on PK so the cursor can advance through both NULL rows and tiedbyvalues. Fixes the NULL hang and the silent-tie-skip together. Bigger surface - dialect-specific SQL, more involved testing.Environment