Skip to content

Conversation

@eureka928
Copy link
Contributor

@eureka928 eureka928 commented Jan 8, 2026

Summary

Adds validation to detect when validators submit different weights during the commit-reveal finalization phase. Previously, the system silently used only the first submission, potentially masking consensus issues.

Problem

File: crates/epoch/src/commit_reveal.rs

The finalize() function assumed all validator submissions were identical:

// Before: Takes first submission, silently ignores the rest
let aggregated = submissions
    .into_iter()
    .next()
    .map(|w| weights::normalize_weights(w))
    .unwrap_or_default();

Issues:

  • No validation that submissions match
  • Silent data loss if validators disagreed
  • No observability into consensus problems
  • HashMap iteration order is non-deterministic

Solution

Added check_submission_divergence() method that compares all submissions against the first:

fn check_submission_divergence(&self, submissions: &[Vec<WeightAssignment>]) -> bool {
    let first_weights: HashMap<&str, f64> = first
        .iter()
        .map(|w| (w.hotkey.as_str(), w.weight))
        .collect();

    for (idx, submission) in submissions.iter().enumerate().skip(1) {
        // Check count, hotkeys, and weight values
    }
}

Validation checks:

Check Detects
Count mismatch Validators evaluated different number of miners
Missing hotkey Validators have different miner sets
Weight difference (>0.1%) Validators computed different scores

Logging:

  • warn! for each specific divergence with details
  • error! summary when finalizing with divergence

Behavior: Still uses first submission to avoid breaking production. Divergence indicates a bug or network issue requiring investigation.

Changes

  • Added check_submission_divergence() helper method
  • Updated finalize() to call validation before aggregation
  • Added HashSet import and error log level

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced weight submission validation during finalization to detect inconsistencies and improved error logging for diagnostics.

✏️ Tip: You can customize this high-level summary in your review settings.

Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461

@coderabbitai
Copy link

coderabbitai bot commented Jan 8, 2026

📝 Walkthrough

Walkthrough

The PR introduces divergence validation during weight submission finalization in the commit-reveal module. A new helper function detects inconsistencies in hotkeys and weights across submissions with a 0.1% tolerance threshold. When divergence is detected, an error is logged and the first submission is used for aggregation instead of proceeding without validation.

Changes

Cohort / File(s) Summary
Divergence Validation for Weight Submissions
crates/epoch/src/commit_reveal.rs
Added internal helper function detect_divergence() to validate consistency across submissions. Enhanced finalization logic to inspect all submissions, compare hotkeys and weights with tolerance, emit error logs on divergence, and use first submission for aggregation. Expanded imports to include HashSet and error-level logging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Hopping through submissions, checking each with care,
Weights must align or divergence flares,
Tolerance of point-one guards the way,
First submission's truth saves the day!
Validation hops on! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Add Weight Submission Divergence Validation' directly and clearly describes the main change: adding validation logic to detect divergence in weight submissions during commit-reveal finalization.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @crates/epoch/src/commit_reveal.rs:
- Around line 144-157: The current code picks the baseline submission from
submissions collected via self.reveals.values(), which is non-deterministic; to
fix, build submissions deterministically (e.g., collect (validator_hotkey,
submission) pairs from self.reveals.iter(), sort by validator_hotkey) and then
use the first element of that sorted list as the baseline before calling
check_submission_divergence and weights::normalize_weights; update the logic
around submissions, divergence detection (check_submission_divergence) and the
aggregated = weights::normalize_weights(...) call to use the deterministically
chosen submission.
- Around line 202-262: The function check_submission_divergence assumes hotkeys
are unique within each Vec<WeightAssignment> but doesn't validate that; add a
defensive duplicate-hotkey check at the start of the function (before comparing
to the first submission) that iterates each submission Vec<WeightAssignment>,
builds a HashSet of hotkey strings, and if a duplicate is encountered logs a
warn (include self.epoch and a truncated hotkey similar to existing messages)
and returns true to signal divergence; keep the rest of
check_submission_divergence logic unchanged.
🧹 Nitpick comments (2)
crates/epoch/src/commit_reveal.rs (2)

9-9: Remove unused HashSet import.

The HashSet type is imported but never used in this file. The divergence check uses only HashMap.

♻️ Proposed fix
-use std::collections::{HashMap, HashSet};
+use std::collections::HashMap;

217-218: Document the tolerance rationale.

The 0.1% tolerance for weight differences is reasonable, but consider documenting why this specific threshold was chosen to help future maintainers understand the tradeoff between strictness and floating-point precision.

📝 Suggested documentation
-// Tolerance for floating-point comparison (0.1% difference allowed)
+// Tolerance for floating-point comparison (0.1% difference allowed)
+// This threshold accounts for minor floating-point arithmetic variations
+// while still detecting meaningful weight divergence between validators
 const WEIGHT_TOLERANCE: f64 = 0.001;
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 138fbf9 and d586a89.

📒 Files selected for processing (1)
  • crates/epoch/src/commit_reveal.rs
🧰 Additional context used
🧬 Code graph analysis (1)
crates/epoch/src/commit_reveal.rs (1)
crates/challenge-sdk/src/weights.rs (1)
  • normalize_weights (36-46)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test
  • GitHub Check: Clippy
🔇 Additional comments (1)
crates/epoch/src/commit_reveal.rs (1)

239-239: LGTM: Safe string truncation for logging.

The hotkey truncation using .min(weight.hotkey.len()) safely prevents panics while keeping logs readable.

Also applies to: 249-249

Comment on lines +144 to +157
// Validate that all submissions are consistent
// All validators read from shared chain DB, so submissions should be identical
// Just take the first one and normalize
let aggregated = submissions
.into_iter()
.next()
.map(|w| weights::normalize_weights(w))
.unwrap_or_default();
let first = &submissions[0];
let divergence_detected = self.check_submission_divergence(&submissions);

if divergence_detected {
error!(
"Epoch {}: Weight submissions diverged across {} validators! Using first submission.",
self.epoch,
submissions.len()
);
}

let aggregated = weights::normalize_weights(first.clone());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Critical: Non-deterministic "first submission" selection undermines validation.

The PR objectives state that non-deterministic HashMap iteration order is a problem being addressed, but lines 134-135 collect submissions from self.reveals.values(), which iterates a HashMap in non-deterministic order. This means:

  1. Which submission is considered "first" (the baseline) varies between runs
  2. Different validators' submissions may be used as the baseline in different executions
  3. If validator A and validator B submit different weights, sometimes A's weights will be used (when A is "first"), and sometimes B's will be used

This makes the divergence detection inconsistent and doesn't solve the stated problem of non-deterministic behavior.

🔧 Recommended fix

Use a deterministic ordering for submissions, such as sorting by validator hotkey:

 // Collect valid submissions
-let submissions: Vec<Vec<WeightAssignment>> =
-    self.reveals.values().map(|r| r.weights.clone()).collect();
+let mut reveals_sorted: Vec<_> = self.reveals.iter().collect();
+reveals_sorted.sort_by(|a, b| a.0.cmp(b.0));  // Sort by hotkey
+let submissions: Vec<Vec<WeightAssignment>> =
+    reveals_sorted.iter().map(|(_, r)| r.weights.clone()).collect();

This ensures the same validator's submission is always used as the baseline when present.

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @crates/epoch/src/commit_reveal.rs around lines 144 - 157, The current code
picks the baseline submission from submissions collected via
self.reveals.values(), which is non-deterministic; to fix, build submissions
deterministically (e.g., collect (validator_hotkey, submission) pairs from
self.reveals.iter(), sort by validator_hotkey) and then use the first element of
that sorted list as the baseline before calling check_submission_divergence and
weights::normalize_weights; update the logic around submissions, divergence
detection (check_submission_divergence) and the aggregated =
weights::normalize_weights(...) call to use the deterministically chosen
submission.

Comment on lines +202 to +262
/// Check if submissions from different validators have diverged.
/// Returns true if divergence is detected.
fn check_submission_divergence(&self, submissions: &[Vec<WeightAssignment>]) -> bool {
if submissions.len() <= 1 {
return false;
}

let first = &submissions[0];

// Build a map of hotkey -> weight for the first submission
let first_weights: HashMap<&str, f64> = first
.iter()
.map(|w| (w.hotkey.as_str(), w.weight))
.collect();

// Tolerance for floating-point comparison (0.1% difference allowed)
const WEIGHT_TOLERANCE: f64 = 0.001;

for (idx, submission) in submissions.iter().enumerate().skip(1) {
// Check if same number of weight assignments
if submission.len() != first.len() {
warn!(
"Epoch {}: Submission {} has {} weights, first has {}",
self.epoch,
idx,
submission.len(),
first.len()
);
return true;
}

// Check if same hotkeys with similar weights
for weight in submission {
match first_weights.get(weight.hotkey.as_str()) {
None => {
warn!(
"Epoch {}: Submission {} has hotkey {} not in first submission",
self.epoch, idx, &weight.hotkey[..16.min(weight.hotkey.len())]
);
return true;
}
Some(&first_weight) => {
let diff = (weight.weight - first_weight).abs();
if diff > WEIGHT_TOLERANCE {
warn!(
"Epoch {}: Weight divergence for hotkey {}: {} vs {} (diff: {:.4})",
self.epoch,
&weight.hotkey[..16.min(weight.hotkey.len())],
first_weight,
weight.weight,
diff
);
return true;
}
}
}
}
}

false
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for uniqueness validation on WeightAssignment collections
echo "=== Searching for uniqueness/duplicate checks ==="
rg -n -C3 'unique|duplicat|distinct' --type=rust crates/epoch/src/

echo -e "\n=== Checking WeightAssignment struct definition ==="
rg -n -C5 'struct WeightAssignment' --type=rust

echo -e "\n=== Looking for deduplication logic ==="
rg -n '\.dedup\(\)' --type=rust crates/epoch/src/

echo -e "\n=== Searching for weight validation or collection building ==="
rg -n -C3 'WeightAssignment|submissions' --type=rust crates/epoch/src/commit_reveal.rs | head -100

Repository: PlatformNetwork/platform

Length of output: 4985


Document the uniqueness assumption for hotkeys within submissions or add validation.

The divergence detection logic assumes hotkeys are unique within each submission. While no explicit validation exists in the codebase, this assumption is reasonable given that submissions all derive from the shared chain DB (as noted in the code comment). However, the check would fail to detect certain divergences if duplicate hotkeys ever occurred (e.g., first has [A, A, B] and another has [A, B, B]—both pass the length and existence checks).

Consider either:

  1. Adding a comment explicitly documenting that hotkeys must be unique per submission, or
  2. Adding a defensive check to validate this assumption during submission processing.
🤖 Prompt for AI Agents
In @crates/epoch/src/commit_reveal.rs around lines 202 - 262, The function
check_submission_divergence assumes hotkeys are unique within each
Vec<WeightAssignment> but doesn't validate that; add a defensive
duplicate-hotkey check at the start of the function (before comparing to the
first submission) that iterates each submission Vec<WeightAssignment>, builds a
HashSet of hotkey strings, and if a duplicate is encountered logs a warn
(include self.epoch and a truncated hotkey similar to existing messages) and
returns true to signal divergence; keep the rest of check_submission_divergence
logic unchanged.

@eureka928
Copy link
Contributor Author

@echobt I'd love to get your review and approval
Thank you for your time

@eureka928 eureka928 closed this Jan 9, 2026
@eureka928 eureka928 reopened this Jan 9, 2026
Copy link
Contributor

@echobt echobt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

@echobt echobt merged commit f4f9ad1 into PlatformNetwork:main Jan 9, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants