Skip to content

fix(nostr): advance discovery cursor only after successful event processing#835

Open
keshav0479 wants to merge 1 commit into
citadel-tech:masterfrom
keshav0479:feat/nostr-cursor-ordering-fix
Open

fix(nostr): advance discovery cursor only after successful event processing#835
keshav0479 wants to merge 1 commit into
citadel-tech:masterfrom
keshav0479:feat/nostr-cursor-ordering-fix

Conversation

@keshav0479
Copy link
Copy Markdown

@keshav0479 keshav0479 commented Apr 15, 2026

Description

Small correctness fix for the watchtower's Nostr discovery path. Not responding to a reported incident, found while re-reading code around #829.

Before this PR, handle_relay_message in src/watch_tower/nostr_discovery.rs did things in this order:

  1. registry.save_nostr_cursor(relay_url, event.created_at) persists cursor to disk
  2. seen_txid.insert(txid) marks txid as processed
  3. bitcoin_rpc.get_raw_tx(&txid) fetches the transaction and can fail transiently
  4. early return on failure

Issue: if step 3 fails, progress was already committed. The cursor moved past the event and the txid was in the seen set. On reconnect or replay, that fidelity announcement could be skipped instead of retried.

When can step 3 realistically fail:

  • bitcoind restart or upgrade
  • brief RPC unavailability or network blip between watcher and bitcoind
  • the tx has not propagated to the watcher's mempool yet
  • REST endpoint hiccup or transient rate limit

Makers do re-broadcast fidelity events before expiry, so the protocol can self-heal over time. But relying on that leaves a blind spot where the watcher has stale bond state. The watchtower's job is reliable fidelity-bond tracking, so it should not commit progress for an event it did not actually fetch.

What the fix does

This keeps the existing routine small, but moves the state commits later:

  • fetch the raw tx first
  • if the fetch fails, return without touching the seen cache or cursor
  • after a successful fetch, process the event
  • save the cursor only after the event has reached the processed path
  • already-seen txids still advance the cursor
  • logs in this routine are now info / warn instead of debug

Post-fix behavior

scenario cursor advanced txid marked seen
get_raw_tx transient fail no no
already-seen replay after successful fetch yes already was
process_fidelity returns None yes yes
happy path yes yes

Known tradeoff

Already-seen txids now fetch the raw tx before hitting the seen-cache skip. That can mean an extra RPC call for duplicate relay delivery, but it keeps the patch small and avoids adding a new helper or rollback logic in code that is already queued for refactor.

Related Issue(s)

None. Found while auditing code around #829.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Code refactor / performance improvement
  • Documentation update only
  • CI / Docker / Build changes
  • Other (please describe):

Protocol Version(s) Affected

  • Legacy (ECDSA: messages.rs, contract.rs, handlers.rs)
  • Taproot-Musig2 (messages2.rs, contract2.rs, handlers2.rs)
  • Both
  • Neither (infrastructure / tooling only)

Affected Component(s)

  • makerd (background daemon)
  • taker (CLI client)
  • maker-cli (command-line tool)
  • Core protocol / cryptography / library
  • watch_tower (contract monitoring & recovery)
  • Docker / deployment scripts
  • Tests / test framework
  • Documentation (docs/)
  • Other (please specify):

Testing

  • cargo check
  • cargo test --lib watch_tower::
  • cargo +stable clippy --all-features --lib --bins --tests -- -D warnings

Copilot AI review requested due to automatic review settings April 15, 2026 08:04
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 15, 2026

📝 Walkthrough

Walkthrough

Control flow refactoring in Nostr transaction discovery processing: cursor persistence is deferred to occur conditionally after transaction handling rather than unconditionally before cache checks. Error handling for transaction fetches uses explicit match statements with early returns. Multiple logging severity levels are increased from debug to warn/info for various event and processing scenarios.

Changes

Cohort / File(s) Summary
Nostr Discovery Control Flow
src/watch_tower/nostr_discovery.rs
Refactored handle_relay_message to defer cursor persistence until after transaction processing; improved error handling for raw transaction fetches with explicit match statements and early returns; increased logging severity for invalid fidelity checks, expired events, and duplicate transaction IDs.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A cursor's journey, now deferred with care,
No hasty saves without a transaction there,
Errors matched and logged with proper weight,
The discovery flows at just the right rate!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: reordering event handler logic to save cursor only after successful processing, which is the core fix in this correctness bug.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description check ✅ Passed The pull request provides a comprehensive description with clear problem statement, solution, behavior table, and testing details following the template structure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/watch_tower/nostr_discovery.rs`:
- Around line 231-234: The log message "Received invalid txid" is misleading
because bitcoin_rpc.get_raw_tx can fail transiently; update the error handling
around bitcoin_rpc.get_raw_tx(&txid) (the Ok(tx) else branch) to log a
descriptive message indicating an RPC/fetch failure rather than an invalid txid,
include the captured error details (error variable from the Err arm) and the
txid in the message (e.g., "Failed to fetch raw tx for {txid}: {error}"), and
keep the same control flow (returning Ok(())) after logging.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: aa1a0861-fb8b-486e-95b8-dfc0a9726d8f

📥 Commits

Reviewing files that changed from the base of the PR and between 3480328 and 33288c2.

📒 Files selected for processing (2)
  • src/watch_tower/nostr_discovery.rs
  • src/watch_tower/utils.rs

Comment thread src/watch_tower/nostr_discovery.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR targets a correctness issue in the watchtower’s Nostr-based discovery flow by changing when the per-relay discovery cursor and txid dedup cache are advanced, aiming to avoid permanently skipping events when Bitcoin REST lookups fail.

Changes:

  • Added a read-only SeenTxids::contains helper to support pre-fetch dedup checks.
  • Reordered handle_relay_message to run dedup and transaction fetch/processing before committing seen_txid and the persisted Nostr cursor.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/watch_tower/utils.rs Adds SeenTxids::contains to allow dedup checks without mutating the seen set.
src/watch_tower/nostr_discovery.rs Reorders Nostr event handling to defer committing cursor/seen-txid until after successful processing.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/watch_tower/nostr_discovery.rs Outdated
Comment thread src/watch_tower/nostr_discovery.rs Outdated
@keshav0479 keshav0479 force-pushed the feat/nostr-cursor-ordering-fix branch from 33288c2 to 0598da2 Compare April 15, 2026 08:08
@mojoX911
Copy link
Copy Markdown

This section of the codebase is going to be heavily refactored soon.

Looping in @stark-3k . Pls check if theres anything in this PR you can find useful.

@stark-3k
Copy link
Copy Markdown

This section of the codebase is going to be heavily refactored soon.

Looping in @stark-3k . Pls check if theres anything in this PR you can find useful.

yes this part will be refactored but we can keep the PR on hold for now

Copy link
Copy Markdown

@mojoX911 mojoX911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some pointers on the codes in this routine.

Same effect can be done by moving the the cursor save within the Ok() branch.

Change the debug logs into info, warning or error logs in this function. They should not be debugs.

Comment on lines -227 to -230
if seen_txid.lock()?.insert(txid) {
log::debug!("add new cache {}", txid);
let Ok(tx) = bitcoin_rpc.get_raw_tx(&txid) else {
log::debug!("Received invalid txid: {txid:?}");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just save the nostr cursor within this Ok() branch to get the same effect. That would be a much smaller change set. No need to restructure the whole logic.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stark-3k do you think we can get this a separate PR or you wanna handle this in your rewrite?

@keshav0479 keshav0479 force-pushed the feat/nostr-cursor-ordering-fix branch from 0598da2 to 0f3d2b4 Compare April 28, 2026 21:33
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/watch_tower/nostr_discovery.rs`:
- Around line 225-233: The code currently inserts txid into the SeenTxids set
before fetching/processing the transaction (seen_txid.lock()?.insert(txid)
happens before bitcoin_rpc.get_raw_tx), which causes txids to be marked seen on
fetch failure and later silently skipped; fix by deferring the insertion into
SeenTxids until after bitcoin_rpc.get_raw_tx succeeds and after any processing
that must not be retried, i.e., move the call to seen_txid.lock()?.insert(txid)
to immediately after successful retrieval/processing of tx (and do the same
adjustment for the duplicate-handling branch around the code at lines handling
the duplicate path), ensuring the cursor-advancement logic only runs for txids
actually processed successfully.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 02d6b409-2701-4ff7-9b51-4352dfa13177

📥 Commits

Reviewing files that changed from the base of the PR and between 33288c2 and 0f3d2b4.

📒 Files selected for processing (1)
  • src/watch_tower/nostr_discovery.rs

Comment thread src/watch_tower/nostr_discovery.rs Outdated
@keshav0479 keshav0479 force-pushed the feat/nostr-cursor-ordering-fix branch from 0f3d2b4 to 05cfc53 Compare April 28, 2026 21:38
@keshav0479
Copy link
Copy Markdown
Author

Updated this after rechecking retry path, raw tx fetch now happens before touching the seen cache or cursor, so a fetch failure leaves both untouched. Kept it to the small one file change, no extra helper, and changed the logs in this routine to info/warn.

Copy link
Copy Markdown

@mojoX911 mojoX911 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@stark-3k this code now looks solid to me. You can either cherry pick this in your PR or we can merge this and you rebase on top of it. I suppose you will need this logic section in the new design too.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.80%. Comparing base (776b75d) to head (05cfc53).
⚠️ Report is 154 commits behind head on master.

Files with missing lines Patch % Lines
src/watch_tower/nostr_discovery.rs 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #835      +/-   ##
==========================================
+ Coverage   68.87%   78.80%   +9.93%     
==========================================
  Files          35       57      +22     
  Lines        4932    15628   +10696     
==========================================
+ Hits         3397    12316    +8919     
- Misses       1535     3312    +1777     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants