Skip to content

fix: race condition#8

Merged
beer-1 merged 2 commits into
mainfrom
fix/race-condition
Jan 14, 2026
Merged

fix: race condition#8
beer-1 merged 2 commits into
mainfrom
fix/race-condition

Conversation

@beer-1
Copy link
Copy Markdown
Member

@beer-1 beer-1 commented Jan 14, 2026

Summary by CodeRabbit

  • Tests

    • Added an integration test verifying keyservers reject simultaneous conflicting votes at the same height and improved test synchronization to better validate signing behavior and error handling.
  • Bug Fixes

    • Prevented older in-memory signing state from overwriting newer persisted state and tightened state-update synchronization to reduce race conditions during signing.

✏️ Tip: You can customize this high-level summary in your review settings.

Copilot AI review requested due to automatic review settings January 14, 2026 08:47
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jan 14, 2026

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a test hook and an integration test that exercise concurrent signing to ensure the keyserver rejects conflicting votes at the same height by introducing controlled timing inside the validator's last-sign-state sync path.

Changes

Cohort / File(s) Summary
Validator hook & locking
pkg/keyserver/validator.go
Adds var syncLastSignStateHook func(point string, state *fsm.LastSignState). Refactors PrivValidator.syncLastSignState to use defer for unlocking and to call the hook at two points: "before-raft" and "before-write". Ensures state write occurs while locked.
LastSignState guard
pkg/fsm/last_sign_state.go
Adds an early-return check in LastSignState.CopyToFilePV to avoid overwriting FilePVLastSignState when the in-memory state is older than the pv's HRS.
Integration test
pkg/keyserver/signing_integration_test.go
Adds TestKeyserverRejectsConflictingVotesAtSameHeight, which sets up a 3-node Raft cluster, coordinates two validators to attempt conflicting votes concurrently, uses the hook to delay state writes, and asserts at least one signer fails with a conflicting-vote error.

Sequence Diagram(s)

sequenceDiagram
    participant Test
    participant PrivVal as PrivValidator
    participant Raft
    participant Disk

    Test->>PrivVal: start concurrent Sign attempts (A and B)
    PrivVal->>PrivVal: acquire lock
    PrivVal->>PrivVal: compute new LastSignState
    PrivVal->>PrivVal: call hook "before-raft" (if set)
    PrivVal->>Raft: sync state via Raft
    Raft-->>PrivVal: ack
    PrivVal->>PrivVal: call hook "before-write" (if set)
    alt conflict detected
        PrivVal-->>Test: return RemoteSignerError (conflicting vote)
    else no conflict
        PrivVal->>Disk: write LastSignState
        Disk-->>PrivVal: persist ack
        PrivVal-->>Test: return success
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

Hop hop, I held a sync with care,
Two signers racing — what a dare!
I paused the ink, then checked the file,
One conflict chased away with style 🐇✨

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'fix: race condition' is vague and generic. It mentions fixing a race condition but doesn't specify which race condition or what part of the system is affected, making it unclear what the primary change actually is. Consider a more specific title that identifies the particular race condition or affected component, such as 'fix: race condition in LastSignState synchronization' or similar.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4aaaebd and d635b58.

📒 Files selected for processing (1)
  • pkg/fsm/last_sign_state.go
🔇 Additional comments (1)
pkg/fsm/last_sign_state.go (1)

35-41: LGTM! Guard correctly prevents state regression.

The early return when s.Less(...) is true properly prevents overwriting a newer FilePVLastSignState with an older in-memory state, which is the right fix for the race condition.

Note: When H/R/S are equal, the copy still proceeds, allowing Signature and SignBytes to be updated. This appears intentional for idempotent updates at the same signing height.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes a race condition in the validator's state synchronization logic that could potentially allow conflicting signatures. The fix ensures that the entire state sync operation is atomic by holding the mutex for the duration of syncLastSignState() instead of releasing and re-acquiring it.

Changes:

  • Modified locking in syncLastSignState() to use defer l.mu.Unlock() for the entire function duration instead of explicit lock/unlock pairs
  • Added a test hook (syncLastSignStateHook) to enable testing of concurrent execution scenarios
  • Added new integration test TestKeyserverRejectsConflictingVotesAtSameHeight to verify the race condition fix

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/keyserver/validator.go Fixed race condition by holding mutex for entire syncLastSignState operation; added test hook for coordinating concurrent test scenarios
pkg/keyserver/signing_integration_test.go Added integration test to verify conflicting votes at the same height are properly rejected under concurrent access

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread pkg/keyserver/validator.go
Comment thread pkg/keyserver/signing_integration_test.go
@beer-1 beer-1 merged commit 92ca0a2 into main Jan 14, 2026
5 checks passed
@beer-1 beer-1 deleted the fix/race-condition branch January 14, 2026 09:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants