
fix: reclaim expired item leases#5

Merged: ccarvalho-eng merged 2 commits into main from fix/reclaim-expired-leases on May 22, 2026

@ccarvalho-eng (Collaborator) commented May 22, 2026

Why

Leasing uses the item vesting_time to hide active work from other consumers. Once the lease expiry has passed, that same item should become visible again so another worker can reclaim stale work.

Current behavior keeps expired items hidden because Item.visible?/2 rejects any item with a lease_id, even when lease_expires_at has passed. Store.obtain_lease/6 also rejects those items for the same reason, so dequeue/5 cannot recover a stale lease.
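The difference between the two visibility rules can be modeled in a few lines. This is a minimal Python sketch of the logic described above, not the project's Elixir code: the `Item` dataclass and the `visible_before`/`visible_after` helpers are hypothetical names, and treating a lease as expired when `lease_expires_at <= now` is an assumption taken from the reproduction steps.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    # Field names mirror the PR description; the class itself is illustrative.
    vesting_time: int
    lease_id: Optional[str] = None
    lease_expires_at: Optional[int] = None


def visible_before(item: Item, now: int) -> bool:
    """Old rule: any item that carries a lease_id stays hidden."""
    return item.lease_id is None and item.vesting_time <= now


def visible_after(item: Item, now: int) -> bool:
    """New rule: an expired lease no longer hides the item."""
    if item.vesting_time > now:
        return False  # not due yet, leased or not
    if item.lease_id is None:
        return True  # never leased
    # Leased: visible only once the lease expiry has passed (assumed inclusive).
    return item.lease_expires_at is not None and item.lease_expires_at <= now


stale = Item(vesting_time=1_000, lease_id="worker-a", lease_expires_at=1_000)
```

With `stale` evaluated at `now = 1_000`, the old rule keeps the item hidden while the new rule exposes it. One millisecond earlier both rules hide it.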

This also fixes first-retry timing. Store.requeue/4 increments error_count before calculating delay, so using base_delay * 2^error_count makes the first retry wait 2x base_delay. The first retry should use base_delay; later retries can grow from there.
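The off-by-one in the exponent is easy to check with numbers. The following Python sketch compares the two formulas; the function names are hypothetical, `error_count` is the value after the increment, and the `max(..., 0)` guard is an added assumption rather than something the PR states.

```python
def retry_delay_before(base_delay: int, error_count: int) -> int:
    """Old formula: base_delay * 2^error_count."""
    return base_delay * 2 ** error_count


def retry_delay_after(base_delay: int, error_count: int) -> int:
    """New formula: base_delay * 2^(error_count - 1).

    The max() guard against a negative exponent is an assumption of this sketch.
    """
    return base_delay * 2 ** max(error_count - 1, 0)


# error_count has already been incremented to 1 when the first retry is scheduled.
first_before = retry_delay_before(1_000, 1)
first_after = retry_delay_after(1_000, 1)
```

Here `first_before` is 2_000 and `first_after` is 1_000, matching the retry-timing reproduction in this description.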

Steps To Reproduce

Expired lease recovery:

  1. Enqueue or store an item with vesting_time = now.
  2. Lease it with lease_duration = 1_000.
  3. After leasing, the item has a non-nil lease_id, lease_expires_at = now + 1_000, and vesting_time = now + 1_000.
  4. At now + 1_000, call Store.peek/4 or Store.dequeue/5.
  5. Before this change, the item is still hidden because lease_id is set. After this change, the expired lease is visible and can be claimed by a new worker.
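The steps above can be traced with a small in-memory model. This Python sketch only illustrates the post-change behavior: `Item`, `obtain_lease`, and the worker names are hypothetical stand-ins and do not reproduce the signature of `Store.obtain_lease/6`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    vesting_time: int
    lease_id: Optional[str] = None
    lease_expires_at: Optional[int] = None


def obtain_lease(item: Item, lease_id: str, now: int, lease_duration: int) -> bool:
    """Hypothetical lease step; returns True when the lease was written."""
    if item.vesting_time > now:
        return False  # item is not due yet
    lease_active = item.lease_id is not None and (
        item.lease_expires_at is None or item.lease_expires_at > now
    )
    if lease_active:
        return False  # another worker still holds a live lease
    # Write (or replace) the lease and hide the item until it expires.
    item.lease_id = lease_id
    item.lease_expires_at = now + lease_duration
    item.vesting_time = now + lease_duration
    return True


now = 0
item = Item(vesting_time=now)
first = obtain_lease(item, "worker-a", now, 1_000)              # step 2
blocked = obtain_lease(item, "worker-b", now + 999, 1_000)      # lease still live
reclaimed = obtain_lease(item, "worker-b", now + 1_000, 1_000)  # step 4
```

In this model the second worker is refused at `now + 999` and succeeds at `now + 1_000`, after which the item carries the new lease.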

Retry timing:

  1. Lease an item with error_count = 0.
  2. Call Store.requeue/4 with base_delay: 1_000.
  3. Before this change, the first retry is not visible at now + 1_000; it waits until now + 2_000.
  4. After this change, the first retry is visible at now + 1_000.

What Changed

  • Treat items whose lease_expires_at has passed as visible once their vesting_time has also passed.
  • Allow obtain_lease/6 to replace an expired item lease.
  • Avoid moving pending/processing counters when replacing an expired lease, because the item is already counted as processing.
  • Calculate retry delays with 2^(error_count - 1) so the first retry uses the configured base delay.
  • Add regression coverage for expired lease visibility, stale lease reclaim, and first retry visibility.
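The counter rule in the list above can be sketched as follows. This is a hypothetical Python model of the bookkeeping, assuming one `pending` and one `processing` counter per queue; the `Counters` class and `record_lease` helper are illustrative names, not the project's API.

```python
from dataclasses import dataclass


@dataclass
class Counters:
    pending: int = 0
    processing: int = 0


def record_lease(counters: Counters, replacing_expired_lease: bool) -> None:
    """Move an item from pending to processing unless it is already counted."""
    if replacing_expired_lease:
        # The item was counted as processing when it was first leased,
        # so reclaiming it must not move the counters a second time.
        return
    counters.pending -= 1
    counters.processing += 1


counters = Counters(pending=1)
record_lease(counters, replacing_expired_lease=False)  # first lease
record_lease(counters, replacing_expired_lease=True)   # stale lease reclaimed
```

After both calls the counters read `pending=0, processing=1`. Without the early return, the reclaim would push `pending` to -1 and `processing` to 2.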

Allow expired item leases to become visible and be claimed by another worker. Also align retry backoff so the first retry uses the configured base delay instead of doubling it.
Extract the lease write path so expired lease reclamation keeps the same behavior while satisfying the strict Credo checks used by CI.
@ccarvalho-eng requested a review from jallum May 22, 2026 01:31
@ccarvalho-eng merged commit 7c867e2 into main May 22, 2026
2 checks passed
@ccarvalho-eng deleted the fix/reclaim-expired-leases branch May 22, 2026 11:11