Skip to content

Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#4

Closed
aliceb-nv wants to merge 228 commits into
mainfrom
gpudet
Closed

Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#4
aliceb-nv wants to merge 228 commits into
mainfrom
gpudet

Conversation

@aliceb-nv
Copy link
Copy Markdown
Owner

local test PR to trigger coderabbit review

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

Adds deterministic/work-unit-based execution and incumbent tracking across the MIP stack: new termination/work-limit abstractions, solution-origin metadata and extended callbacks, deterministic B&B & heuristics changes, recombiner/work accounting updates, many timer→work_limit_timer API adjustments, work-unit prediction, and extensive determinism-focused tests and sanitizer suppressions.

Changes

Cohort / File(s) Summary
Incumbent tracking & callbacks
benchmarks/linear_programming/cuopt/run_mip.cpp, cpp/src/pdlp/cuopt_c.cpp, cpp/include/cuopt/linear_programming/cuopt_c.h
Adds incumbent_record_t and incumbent_tracker_t; new extended callback type/registration (cuOptMIPSolutionCallbackInfo, cuOptMIPGetSolutionCallbackExt, cuOptSetMIPGetSolutionCallbackExt) and C++ wrapper to dispatch extended callbacks.
Determinism constants & origin metadata
cpp/include/cuopt/linear_programming/constants.h, cpp/include/cuopt/linear_programming/utilities/internals.hpp
Introduces bitset-based CUOPT_DETERMINISM_* macros and CUOPT_MIP_SOLUTION_ORIGIN_* constants; adds mip_solution_origin_t enum, string mapper, and extended callback info type; adds GET_SOLUTION_EXT callback class.
Termination / work-limit framework
cpp/src/utilities/termination_checker.hpp, cpp/src/utilities/work_limit_timer.hpp, cpp/src/utilities/work_limit_context.hpp
New termination_checker_t (alias work_limit_timer_t) supporting deterministic (work-unit) and non-deterministic modes, parent chaining, callbacks, and work recording; work_limit_context_t extended with producer sync, scaling, and record_work semantics.
Solver settings & validation
cpp/include/cuopt/linear_programming/mip/solver_settings.hpp, cpp/src/math_optimization/solver_settings.cu
Adds cpufj/gpu_heur/bnb work-unit scale fields and gpu_heur_wait_for_exploration; determinism_mode default adjusted to bitset semantics; validation bounds updated to new constants.
Branch-and-bound determinism
cpp/src/branch_and_bound/... (cpp/src/branch_and_bound/branch_and_bound.cpp, .hpp, deterministic_workers.hpp, pseudo_costs.*)
Major deterministic integration: origin/work timestamps for queued/external solutions, emit_solution_callback/retire helper, deterministic replay/sorting, strong-branching work accounting and API extensions.
Timer→Work-limit API migrations
multiple files under mip_heuristics/* (local_search, diversity, presolve, feasibility_jump, rounding, constraint_prop, bounds_repair, lb_*, etc.)
Widespread replacement of timer_t → work_limit_timer_t / termination_checker_t in public and internal APIs; added max_timer/related ctor changes and deterministic work checks; many signatures updated accordingly.
Population, solutions & hashes
cpp/src/mip_heuristics/diversity/population.*, cpp/src/mip_heuristics/solution/solution.*, cpp/src/mip_heuristics/diversity/population.cuh
Replaces solution_origin_t with internals::mip_solution_origin_t; introduces drained_external_solution_t preserving origin; adds solution get_hash(); removes lower_slack/upper_slack; updates population APIs to carry origin.
Recombiners & heuristics work accounting
cpp/src/mip_heuristics/diversity/recombiners/*, recombiner_stats.hpp, diversity_manager.*
Recombine APIs now return tuple (solution,bool,double) including work metric; deterministic logging and ordering fixes; recombiner init accepts context and records last_recombiner_work; diversity manager and configs updated for deterministic paths.
Feasibility jump & kernels
cpp/src/mip_heuristics/feasibility_jump/*
Adds get_feature_vector, ML predictor hooks, work_budget and work_unit scaling; changes scoring types to int32_t in places; introduces deterministic tie-breaking, TPB propagation, contiguous_set::sort helper, and kernel signature changes.
Relaxed LP & PDLP
cpp/src/mip_heuristics/relaxed_lp/*, cpp/src/pdlp/pdlp.cu
Adds iteration_limit, work_limit, work_context to relaxed LP settings; deterministic LP-mode estimation and logging; ensures primal copy before restart-to-average clamp.
Main solver & context
cpp/src/mip_heuristics/solve.cu, solver.cu/.cuh, solver_context.cuh, solution_callbacks.cuh
Timer→termination_checker migration; adds bb_observer_adapter_t, solution_publication_t and solution_injection_t to centralize publishing/injection with origin/metadata; integrates work_unit_predictors and deterministic GPU heuristics paths.
Work-unit prediction & models
cpp/src/utilities/models/fj_predictor/*, cpp/src/utilities/work_unit_predictor.*, work_unit_predictor.hpp
Adds fj_predictor header and quantize implementation; new templated work_unit_predictor with cpu/gpu scalers, cache, and explicit instantiation.
Utilities & helpers
cpp/src/utilities/* (cuda_helpers, copy_helpers, determinism_log, seed_generator, version_info, work_unit_scheduler)
NVTX memory-mark helpers; make_span overloads for rmm::device_scalar; determinism_log header (noop macro); seed_generator refactor (base_seed_/epoch_ and per-thread state); get_cpu_max_clock_mhz(); work_limit_scheduler now uses set_current_work; work_limit_timer alias header.
Presolve & probing
cpp/src/mip_heuristics/presolve/*
Adds WORK_LIMIT termination criterion; initializes bounds arrays with NaN on resize; probing cache and presolve methods gain deterministic gating and parameter; third_party_presolve gains deterministic flag and setter.
Testing & CI
cpp/tests/mip/*, cpp/tests/CMakeLists.txt, cpp/tests/CMakeLists.txt, ci/compute-sanitizer-suppressions.xml
Massive determinism test additions and utilities (detection, spin_stream_raii_t, parameterized tests for diversity/local_search/presolve/FJ), CMake helpers for CUPTI/PAPI, new tests enabled, and Compute Sanitizer suppression XML added.
Build & small fixes
cpp/CMakeLists.txt, cpp/src/CMakeLists.txt, dataset script
Adds optional PAPI linking block; registers new utility sources; adds dataset entry supportcase42; minor numeric/logging guards and defensive checks across multiple modules.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch gpudet

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 7

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
cpp/src/mip_heuristics/solution/solution.cu (1)

92-108: ⚠️ Potential issue | 🔴 Critical

Avoid copying potentially uninitialized derived buffers in copy_from.

Lines 92-99 mark source buffers/scalars as initialized, and Lines 101-108 immediately copy them. If source state is actually uninitialized, this propagates indeterminate values and hides a real initialization bug.

🛠️ Safer direction (don’t copy unknown-derived state; force recompute)
-  // slack, excess, and constraint value may be uninitialized (and computed later). Mark them as
-  // such
-  cuopt::mark_span_as_initialized(make_span(other_sol.lower_excess), handle_ptr->get_stream());
-  cuopt::mark_span_as_initialized(make_span(other_sol.upper_excess), handle_ptr->get_stream());
-  cuopt::mark_span_as_initialized(make_span(other_sol.constraint_value), handle_ptr->get_stream());
-  cuopt::mark_span_as_initialized(make_span(other_sol.obj_val), handle_ptr->get_stream());
-  cuopt::mark_span_as_initialized(make_span(other_sol.n_feasible_constraints),
-                                  handle_ptr->get_stream());
-
-  expand_device_copy(lower_excess, other_sol.lower_excess, handle_ptr->get_stream());
-  expand_device_copy(upper_excess, other_sol.upper_excess, handle_ptr->get_stream());
-  expand_device_copy(constraint_value, other_sol.constraint_value, handle_ptr->get_stream());
-  raft::copy(obj_val.data(), other_sol.obj_val.data(), 1, handle_ptr->get_stream());
-  raft::copy(n_feasible_constraints.data(),
-             other_sol.n_feasible_constraints.data(),
-             1,
-             handle_ptr->get_stream());
+  // Derived feasibility/objective buffers can be stale or unset in the source.
+  // Keep deterministic state by resetting and recomputing on demand.
+  lower_excess.resize(problem_ptr->n_constraints, handle_ptr->get_stream());
+  upper_excess.resize(problem_ptr->n_constraints, handle_ptr->get_stream());
+  constraint_value.resize(problem_ptr->n_constraints, handle_ptr->get_stream());
+  thrust::fill(handle_ptr->get_thrust_policy(), lower_excess.begin(), lower_excess.end(), f_t{0});
+  thrust::fill(handle_ptr->get_thrust_policy(), upper_excess.begin(), upper_excess.end(), f_t{0});
+  thrust::fill(
+    handle_ptr->get_thrust_policy(), constraint_value.begin(), constraint_value.end(), f_t{0});
+  obj_val.set_value_to_zero_async(handle_ptr->get_stream());
+  n_feasible_constraints.set_value_to_zero_async(handle_ptr->get_stream());
+  is_feasible = false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution/solution.cu` around lines 92 - 108, The
current copy_from implementation marks and then blindly copies derived fields
(other_sol.lower_excess, other_sol.upper_excess, other_sol.constraint_value,
other_sol.obj_val, other_sol.n_feasible_constraints) which can propagate
uninitialized data; remove the cuopt::mark_span_as_initialized calls for those
derived buffers and stop calling expand_device_copy/raft::copy on them in
copy_from — instead leave the destination fields (lower_excess, upper_excess,
constraint_value, obj_val, n_feasible_constraints) as uninitialized or
explicitly set them to safe defaults (e.g., zero) and set/clear the appropriate
"needs recompute" flag so they will be recomputed later; adjust copy_from logic
accordingly (refer to functions/objects: copy_from, mark_span_as_initialized,
expand_device_copy, raft::copy, other_sol.* and destination members) so no
unknown-derived state is silently copied.
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1491-1502: ⚠️ Potential issue | 🟠 Major

This work-budget check is too coarse for deterministic scheduling.

work_units_elapsed is only updated and compared every 100 iterations. Any budget smaller than one batch is effectively rounded up to the next checkpoint, so CPUFJ can overshoot the requested budget by a large amount before Line 1500 ever fires.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1491 - 1502,
The work-budget check currently only runs at the batch checkpoint, so
fj_cpu.work_units_elapsed (updated by biased_work) can overshoot for budgets
smaller than a batch; modify the loop that accumulates biased_work so you update
and compare fj_cpu.work_units_elapsed against fj_cpu.work_budget on each
iteration (or immediately after adding biased_work) and break out as soon as the
budget is exceeded; ensure you use the same atomic load/store semantics as other
sites (work_units_elapsed.load/store with std::memory_order_relaxed or the
existing pattern) and preserve the producer_sync/notify_progress and
CUOPT_LOG_TRACE calls so they still run when exiting early.
cpp/include/cuopt/linear_programming/utilities/internals.hpp (1)

27-116: 🛠️ Refactor suggestion | 🟠 Major

Document the new callback ABI in this public header.

The new origin enum, mip_solution_callback_info_t, and GET_SOLUTION_EXT interface need Doxygen here: pointer types, struct_size/lifetime semantics, and callback threading expectations are not obvious from the signatures alone.

As per coding guidelines, cpp/include/cuopt/**/*: "Check if new public functions/classes have documentation comments (Doxygen format)", "Flag API changes that may need corresponding docs/ updates", and "Suggest documenting thread-safety, GPU requirements, and numerical behavior".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 27
- 116, Add Doxygen comments to the public ABI symbols introduced: document
mip_solution_origin_t (enumerators and their semantic meanings),
mip_solution_callback_info_t (explain each field including struct_size, valid
lifetime, ownership, alignment and versioning expectations),
base_solution_callback_type (describe GET_SOLUTION_EXT purpose), and
get_solution_callback_ext_t::get_solution (document pointer parameter types,
which may be null, whether objective_value/solution_bound are floats or doubles
depending on isFloat, threading/callback concurrency guarantees, GPU vs CPU
context restrictions, and numerical behavior/precision expectations). Also
annotate base_solution_callback_t::set_user_data/get_user_data to clarify
ownership of user_data. Keep comments in Doxygen format in this header so API
docs and maintainers can find pointer semantics, thread-safety, and
ABI/versioning rules.
cpp/src/mip_heuristics/solver.cu (1)

304-310: ⚠️ Potential issue | 🟠 Major

Deterministic B&B still never ingests GPU-heuristic incumbents.

In this branch branch_and_bound_callback remains unset, so heuristic incumbents cannot tighten the B&B bound or affect pruning. That leaves the new deterministic “parallel B&B + GPU exploration” mode only half-wired.

If you want, I can turn that TODO into a concrete follow-up issue with the queueing and retirement steps spelled out.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver.cu` around lines 304 - 310, In the
deterministic B&B branch ensure GPU-heuristic incumbents are forwarded by
setting context.problem_ptr->branch_and_bound_callback: capture the
BranchAndBound instance (branch_and_bound.get()) and assign a lambda that calls
bb->queue_external_solution_deterministic(solution, 0.0) so GPU heuristics can
supply deterministic incumbents; keep the existing
set_concurrent_lp_root_solve(false) and restore the commented TODO into an
active callback assignment using the exact symbol branch_and_bound_callback and
the method queue_external_solution_deterministic to enable pruning and bound
tightening.
🟡 Minor comments (13)
cpp/tests/mip/unit_test.cu-286-305 (1)

286-305: ⚠️ Potential issue | 🟡 Minor

Commented-out test has issues that must be fixed before uncommenting.

Three problems need attention before this test can be used:

  1. Line 299 member name: settings.presolve = true; should be settings.presolver = cuopt::linear_programming::presolver_t::None; to match all other tests in the file (lines 198, 213, 245, 266). The current name is incompatible with the settings struct.

  2. Unused extracted parameters: scaling and heuristics_only are extracted from GetParam() on lines 289-290 but immediately overwritten with hardcoded true values on lines 297-298. Either use the parameters or remove the extractions.

  3. Missing time limit: Unlike all other tests in the file, this test does not set settings.time_limit = 5;. Add it to prevent indefinite test runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/unit_test.cu` around lines 286 - 305, The commented-out test
has three fixes: replace the invalid member assignment settings.presolve = true
with settings.presolver = cuopt::linear_programming::presolver_t::None to match
the settings struct and other tests; stop overwriting the parameters extracted
from GetParam() by either removing the unused local variables scaling and
heuristics_only or by using them when setting settings.mip_scaling and
settings.heuristics_only (i.e., set settings.mip_scaling = scaling and
settings.heuristics_only = heuristics_only); and add a time limit like
settings.time_limit = 5 to avoid indefinite runs.
ci/compute-sanitizer-suppressions.xml-209-222 (1)

209-222: ⚠️ Potential issue | 🟡 Minor

Typo in comment and broad suppression scope.

  1. Line 209 has a typo: "actualy" → "actually".

  2. This suppression is notably broad—it suppresses all cuMemcpyDtoDAsync uninitialized memory access errors regardless of the calling context. While the comment acknowledges the risk, relying on "errors may be caught later on" is optimistic and could mask genuine issues in device-to-device copies.

Consider whether a tighter hostStack constraint is feasible, or document specific known call sites that trigger this.

🔧 Proposed fix for the typo
-  <!-- Uninitialized device-to-device copies are usually harmless - if actualy bogus, errors may be caught later on -->
+  <!-- Uninitialized device-to-device copies are usually harmless - if actually bogus, errors may be caught later on -->
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/compute-sanitizer-suppressions.xml` around lines 209 - 222, The
suppression comment in the <record> for InitcheckApiError has a typo ("actualy"
→ "actually") and the hostStack pattern matching cuMemcpyDtoDAsync is too broad;
fix the typo in the comment text and tighten the suppression by narrowing the
<hostStack>/<frame>/<func> or adding additional <frame> entries to match known
safe call sites (e.g., include specific wrapper function names or modules
instead of ".*libcuda.so.*"), or else document the exact safe call sites in a
new comment above the <record>; locate the record that references
cuMemcpyDtoDAsync and update the comment text and hostStack constraints (and
optionally add a note listing the verified call sites) so the suppression only
covers intended, reviewed contexts.
cpp/src/dual_simplex/basis_updates.cpp-2204-2206 (1)

2204-2206: ⚠️ Potential issue | 🟡 Minor

Work estimate formula uses incorrect variable for sort operation.

The std::sort at line 2204 operates on nz elements (range from m to m + nz), but the work estimate uses (m + nz) * log2(m + nz) instead of nz * log2(nz). This inflates the work estimate, especially when m is large. Additionally, the guard (m + nz) > 1 can pass when nz == 0 (if m >= 2), adding spurious work for an empty sort operation.

Contrast with line 2217 in the same file, which correctly uses etilde.i.size() for both the guard and the work formula when estimating sort work.

Suggested fix
   std::sort(xi_workspace_.begin() + m, xi_workspace_.begin() + m + nz, std::less<i_t>());
-  if ((m + nz) > 1) { work_estimate_ += (m + nz) * std::log2((f_t)(m + nz)); }
+  if (nz > 1) { work_estimate_ += nz * std::log2((f_t)nz); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/dual_simplex/basis_updates.cpp` around lines 2204 - 2206, The
work-estimate after sorting xi_workspace_ is using the wrong size and guard:
change the guard from (m + nz) > 1 to nz > 1 and compute the cost as nz *
log2(nz) (with the same cast style used elsewhere, e.g. (f_t)nz) and add that to
work_estimate_ instead of using (m + nz) * log2((f_t)(m + nz)); update the block
around the std::sort call that references xi_workspace_, m, nz and
work_estimate_ accordingly.
cpp/src/utilities/timer.hpp-37-50 (1)

37-50: ⚠️ Potential issue | 🟡 Minor

Add [[maybe_unused]] to suppress compiler warnings on unused parameters.

Parameters caller, file, and line are not used in the active code path (lines 42–49 are commented out). With -Werror enabled in the build (cpp/CMakeLists.txt:78, 144), this will cause compilation failures. Annotate these parameters with [[maybe_unused]] to silence the warning, matching the existing convention used elsewhere in the codebase.

Suggested fix
-  bool check_time_limit(const char* caller = __builtin_FUNCTION(),
-                        const char* file   = __builtin_FILE(),
-                        int line           = __builtin_LINE()) const noexcept
+  bool check_time_limit([[maybe_unused]] const char* caller = __builtin_FUNCTION(),
+                        [[maybe_unused]] const char* file   = __builtin_FILE(),
+                        [[maybe_unused]] int line           = __builtin_LINE()) const noexcept
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/timer.hpp` around lines 37 - 50, The parameters caller,
file, and line in the method check_time_limit(...) are currently unused (the
logging block is commented out) and trigger -Werror warnings; annotate those
parameters with [[maybe_unused]] (e.g., change parameter declarations for
caller, file, and line in check_time_limit) to suppress unused-parameter
warnings while preserving the current defaults and behavior.
cpp/tests/mip/diversity_test.cu-71-73 (1)

71-73: ⚠️ Potential issue | 🟡 Minor

Remove or use the seed parameter.

The seed parameter with default value std::random_device{}() is declared in test_full_run_determinism, test_initial_solution_determinism, and test_recombiners_determinism, but it is never used within the function bodies. Either remove the parameter or use it to seed the random number generation.

Also applies to: 132-134, 192-194

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 71 - 73, The three functions
test_full_run_determinism, test_initial_solution_determinism, and
test_recombiners_determinism declare an unused seed parameter; either remove the
seed parameter from their signatures or actually apply it to the RNG used in
each test (e.g., seed whatever std::mt19937 / std::default_random_engine or CUDA
host RNG the tests use) so the tests become deterministic; update all
overloads/call sites accordingly and ensure the default behavior remains
unchanged if you keep the default argument.
cpp/tests/mip/diversity_test.cu-285-293 (1)

285-293: ⚠️ Potential issue | 🟡 Minor

Avoid exit(1) in test code; use GTest assertions instead.

Calling exit(1) bypasses the test framework's cleanup and reporting. Use FAIL() or ASSERT_EQ to properly fail the test.

🧪 Suggested fix
         if (hash_map[std::make_tuple(path, i, j, recombiner)] != offspring_hash) {
           printf("%s: hash mismatch for %d,%d: %d != %d\n",
                  path.c_str(),
                  i,
                  j,
                  hash_map[std::make_tuple(path, i, j, recombiner)],
                  offspring_hash);
-          exit(1);
+          FAIL() << "Hash mismatch for " << path << " at (" << i << "," << j << ")";
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 285 - 293, Replace the direct
process exit with a GTest assertion: where the code compares
hash_map[std::make_tuple(path, i, j, recombiner)] to offspring_hash, remove
exit(1) and use ASSERT_EQ or EXPECT_EQ to report the mismatch (for example
ASSERT_EQ(hash_map[std::make_tuple(path, i, j, recombiner)], offspring_hash) <<
"hash mismatch for " << path << ":" << i << "," << j;), or use FAIL() <<
formatted message if you need to include both values; this ensures the test
framework records the failure and performs proper cleanup instead of calling
exit.
cpp/src/utilities/seed_generator.cuh-55-65 (1)

55-65: ⚠️ Potential issue | 🟡 Minor

Potential issue with SEED_GENERATOR_DEBUG macro check.

Using #if SEED_GENERATOR_DEBUG without first checking if the macro is defined will cause a compilation warning or treat an undefined macro as 0. Consider using #ifdef SEED_GENERATOR_DEBUG or #if defined(SEED_GENERATOR_DEBUG) && SEED_GENERATOR_DEBUG.

🛠️ Suggested fix
-#if SEED_GENERATOR_DEBUG
+#ifdef SEED_GENERATOR_DEBUG
   static int64_t get_seed(const char* caller = __builtin_FUNCTION(),
                           const char* file   = __builtin_FILE(),
                           int line           = __builtin_LINE())
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/seed_generator.cuh` around lines 55 - 65, The preprocessor
check for SEED_GENERATOR_DEBUG should guard against the macro being undefined;
update the existing conditional around the get_seed overloads to use either
`#ifdef` SEED_GENERATOR_DEBUG or `#if` defined(SEED_GENERATOR_DEBUG) &&
SEED_GENERATOR_DEBUG so the compiler won't warn or treat an undefined macro as
0, keeping the debug variant of static int64_t get_seed(const char* caller =
__builtin_FUNCTION(), const char* file = __builtin_FILE(), int line =
__builtin_LINE()) and the non-debug static int64_t get_seed() { return
local_state().counter++; } behavior unchanged; ensure you modify the opening
conditional only (and keep the corresponding `#else/`#endif) so get_seed is
selected correctly based on the macro definition.
cpp/tests/mip/determinism_test.cu-36-62 (1)

36-62: ⚠️ Potential issue | 🟡 Minor

Consider safer error handling in scoped_env_var_t.

Two concerns with this RAII helper:

  1. Line 59: name_ stores a raw pointer from the constructor argument. If the caller passes a temporary or short-lived string, this will become a dangling pointer. Consider storing a std::string copy.

  2. Lines 46, 51-52: Using assert() for setenv/unsetenv return values means errors are silently ignored in release builds. Consider logging failures or using a different assertion mechanism.

🛠️ Suggested fix
 class scoped_env_var_t {
  public:
   scoped_env_var_t(const char* name, const char* value) : name_(name)
   {
     cuopt_assert(name != nullptr, "Environment variable name must be non-null");
     cuopt_assert(value != nullptr, "Environment variable value must be non-null");
     const char* original_value = std::getenv(name_);
     was_set_                   = (original_value != nullptr);
     if (was_set_) { original_value_ = original_value; }
     const int status = setenv(name_, value, 1);
-    assert(status == 0);
+    cuopt_assert(status == 0, "setenv failed");
   }

   ~scoped_env_var_t()
   {
     const int status = was_set_ ? setenv(name_, original_value_.c_str(), 1) : unsetenv(name_);
-    assert(status == 0);
+    // Note: Cannot throw or assert in destructor; log on failure
+    if (status != 0) { CUOPT_LOG_WARN("Failed to restore env var %s", name_); }
   }

   scoped_env_var_t(const scoped_env_var_t&)            = delete;
   scoped_env_var_t& operator=(const scoped_env_var_t&) = delete;

  private:
-  const char* name_;
+  std::string name_;  // Own the string to avoid dangling pointer
   std::string original_value_;
   bool was_set_{false};
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/determinism_test.cu` around lines 36 - 62, scoped_env_var_t
currently stores a raw const char* (name_) and uses assert() on setenv/unsetenv
which is unsafe; update the constructor to copy the environment variable name
into a std::string member (e.g., change name_ to std::string name_) to avoid
dangling pointers, and replace the assert(status == 0) checks in the constructor
and destructor with proper error handling: check the return value of
setenv/unsetenv and either log the error (with errno) or throw a runtime_error
from the constructor/destructor context as appropriate so failures are not
ignored in release builds; ensure you update references to name_ in the
constructor/destructor to use the std::string.
cpp/tests/mip/miplib_test.cu-76-83 (1)

76-83: ⚠️ Potential issue | 🟡 Minor

Commented-out test has incorrect function call.

If the commented test is uncommented, line 81 would fail to compile. The call test_miplib_file(test_instance, true) passes true as the second argument, but the function signature expects mip_solver_settings_t<int, double> as the second parameter.

Should likely be:

mip_solver_settings_t<int, double> settings;
// configure determinism settings...
test_miplib_file(test_instance, settings, true);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/miplib_test.cu` around lines 76 - 83, The commented-out test
calls test_miplib_file(test_instance, true) but the function signature requires
a mip_solver_settings_t<int, double> settings object as the second parameter;
update the test to construct a mip_solver_settings_t<int, double> (e.g.,
mip_solver_settings_t<int,double> settings), set any determinism-related fields
on settings as needed, and then call test_miplib_file(test_instance, settings,
true) so the correct overload is invoked.
cpp/src/utilities/version_info.cpp-188-196 (1)

188-196: ⚠️ Potential issue | 🟡 Minor

Potential uncaught exception from std::stod.

If /proc/cpuinfo contains a malformed "cpu MHz" line, std::stod at line 192 can throw std::invalid_argument or std::out_of_range. Since this is called from a lambda initializing a thread_local static, an uncaught exception could terminate the program.

🛡️ Proposed fix to add exception handling
       if (line.find("cpu MHz") != std::string::npos) {
         std::size_t colon = line.find(':');
         if (colon != std::string::npos) {
-          double mhz = std::stod(line.substr(colon + 1));
-          if (mhz > max_mhz) { max_mhz = mhz; }
+          try {
+            double mhz = std::stod(line.substr(colon + 1));
+            if (mhz > max_mhz) { max_mhz = mhz; }
+          } catch (const std::exception&) {
+            // Skip malformed entries
+          }
         }
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/version_info.cpp` around lines 188 - 196, The loop that
parses "/proc/cpuinfo" calls std::stod on line substrings and can throw
std::invalid_argument or std::out_of_range (risking terminate since this runs in
a thread_local static initializer). Wrap the std::stod call inside a try-catch
that catches std::invalid_argument and std::out_of_range (or std::exception) and
simply skip the malformed "cpu MHz" line when an exception is thrown, leaving
max_mhz unchanged; update references in this block around std::stod, max_mhz,
cpuinfo and line to implement the handling and optionally log a debug message
instead of letting the exception propagate.
cpp/src/utilities/determinism_log.hpp-25-35 (1)

25-35: ⚠️ Potential issue | 🟡 Minor

Redundant macro definitions are dead code.

The #ifndef CUOPT_DETERMINISM_LOG guard on line 19 ensures that subsequent definitions on lines 25-35 are never evaluated. Additionally, line 26 has a different signature (missing the logger parameter) compared to line 20, suggesting either incomplete refactoring or a copy-paste oversight.

Consider removing the redundant blocks or clarifying the intended macro signature.

🧹 Proposed fix to remove dead code
 `#ifndef` CUOPT_DETERMINISM_LOG
 `#define` CUOPT_DETERMINISM_LOG(logger, ...) \
   do {                                     \
   } while (0)
 `#endif`
-
-#ifndef CUOPT_DETERMINISM_LOG
-#define CUOPT_DETERMINISM_LOG(...) \
-  do {                             \
-  } while (0)
-#endif
-
-#ifndef CUOPT_DETERMINISM_LOG
-#define CUOPT_DETERMINISM_LOG(...) \
-  do {                             \
-  } while (0)
-#endif
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/determinism_log.hpp` around lines 25 - 35, The file defines
duplicate dead-code macro blocks for CUOPT_DETERMINISM_LOG (one without the
expected logger parameter) — remove the redundant second block and ensure only a
single macro definition exists with the intended signature (the one that takes
logger and variadic args), or if the no-logger variant is intended, reconcile
both definitions to a consistent, documented signature; update/remove the extra
`#ifndef/`#define/#endif pair so CUOPT_DETERMINISM_LOG is defined exactly once
with the correct parameter list.
cpp/tests/mip/CMakeLists.txt-42-45 (1)

42-45: ⚠️ Potential issue | 🟡 Minor

Stale comment conflicts with active test configuration.

Line 42 says “Disable for now,” but Line 43-Line 45 actively registers FEASIBILITY_JUMP_TEST. Please update/remove the comment to avoid confusion.

🧹 Proposed cleanup
-# Disable for now
 ConfigureTest(FEASIBILITY_JUMP_TEST
    ${CMAKE_CURRENT_SOURCE_DIR}/feasibility_jump_tests.cu
 )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/CMakeLists.txt` around lines 42 - 45, The comment "Disable for
now" is stale and conflicts with the active ConfigureTest registration for
FEASIBILITY_JUMP_TEST; remove or update that comment so it accurately reflects
the current state (either delete the line or change it to indicate the test is
enabled), and keep the ConfigureTest(FEASIBILITY_JUMP_TEST ...) block intact so
the test registration remains correct.
cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh-91-97 (1)

91-97: ⚠️ Potential issue | 🟡 Minor

Potential null pointer issue if work_context is not set.

When CUOPT_DETERMINISM_GPU_HEURISTICS is enabled, the code asserts that work_context != nullptr. However, if the determinism flag is set but gpu_heur_loop wasn't properly initialized, this will crash at runtime. Consider adding a more graceful fallback or earlier validation when the settings are configured.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines
91 - 97, The code asserts that lp_settings.work_context is non-null when
context.settings.determinism_mode includes CUOPT_DETERMINISM_GPU_HEURISTICS,
which can crash if context.gpu_heur_loop wasn't initialized; update the block
around context.settings.determinism_mode and lp_settings.work_context to first
check if this->context.gpu_heur_loop is non-null and, if null, either set a safe
fallback (e.g., a default work context or disable the deterministic GPU
heuristics by clearing the CUOPT_DETERMINISM_GPU_HEURISTICS flag) and log a
clear warning, or return/propagate an error earlier during settings validation
so gpu_heur_loop is guaranteed initialized; ensure modifications touch the same
scope where lp_settings.time_limit, lp_settings.work_limit,
lp_settings.work_context, this->context.gpu_heur_loop and cuopt_assert are used.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9a2b84b7-6e55-477e-bac8-6ef8b641f650

📥 Commits

Reviewing files that changed from the base of the PR and between c5116e1 and e80e545.

📒 Files selected for processing (107)
  • benchmarks/linear_programming/cuopt/run_mip.cpp
  • ci/compute-sanitizer-suppressions.xml
  • cpp/CMakeLists.txt
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/cuopt_c.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/utilities/internals.hpp
  • cpp/src/CMakeLists.txt
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/branch_and_bound/deterministic_workers.hpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/pseudo_costs.hpp
  • cpp/src/dual_simplex/basis_updates.cpp
  • cpp/src/dual_simplex/bound_flipping_ratio_test.cpp
  • cpp/src/dual_simplex/bound_flipping_ratio_test.hpp
  • cpp/src/dual_simplex/phase2.cpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/diversity/diversity_config.hpp
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/diversity/diversity_manager.cuh
  • cpp/src/mip_heuristics/diversity/lns/rins.cu
  • cpp/src/mip_heuristics/diversity/multi_armed_bandit.cuh
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/diversity/population.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hpp
  • cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh
  • cpp/src/mip_heuristics/diversity/weights.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh
  • cpp/src/mip_heuristics/feasibility_jump/utils.cuh
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cuh
  • cpp/src/mip_heuristics/local_search/local_search.cu
  • cpp/src/mip_heuristics/local_search/local_search.cuh
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh
  • cpp/src/mip_heuristics/local_search/rounding/simple_rounding_kernels.cuh
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/bounds_update_data.cu
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cuh
  • cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
  • cpp/src/mip_heuristics/presolve/third_party_presolve.hpp
  • cpp/src/mip_heuristics/presolve/utils.cuh
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/problem/problem_fixing.cuh
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cuh
  • cpp/src/mip_heuristics/solution/solution.cu
  • cpp/src/mip_heuristics/solution/solution.cuh
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver.cuh
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/src/pdlp/cuopt_c.cpp
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/utilities/copy_helpers.hpp
  • cpp/src/utilities/cuda_helpers.cuh
  • cpp/src/utilities/determinism_log.hpp
  • cpp/src/utilities/models/fj_predictor/header.h
  • cpp/src/utilities/models/fj_predictor/main.cpp
  • cpp/src/utilities/models/fj_predictor/quantize.cpp
  • cpp/src/utilities/seed_generator.cu
  • cpp/src/utilities/seed_generator.cuh
  • cpp/src/utilities/termination_checker.hpp
  • cpp/src/utilities/timer.hpp
  • cpp/src/utilities/version_info.cpp
  • cpp/src/utilities/version_info.hpp
  • cpp/src/utilities/work_limit_context.hpp
  • cpp/src/utilities/work_limit_timer.hpp
  • cpp/src/utilities/work_unit_predictor.cpp
  • cpp/src/utilities/work_unit_predictor.hpp
  • cpp/src/utilities/work_unit_scheduler.cpp
  • cpp/tests/CMakeLists.txt
  • cpp/tests/mip/CMakeLists.txt
  • cpp/tests/mip/determinism_test.cu
  • cpp/tests/mip/determinism_utils.cuh
  • cpp/tests/mip/diversity_test.cu
  • cpp/tests/mip/feasibility_jump_tests.cu
  • cpp/tests/mip/load_balancing_test.cu
  • cpp/tests/mip/local_search_test.cu
  • cpp/tests/mip/mip_utils.cuh
  • cpp/tests/mip/miplib_test.cu
  • cpp/tests/mip/multi_probe_test.cu
  • cpp/tests/mip/presolve_test.cu
  • cpp/tests/mip/unit_test.cu
  • datasets/mip/download_miplib_test_dataset.sh

*/
/* clang-format on */

#include <utilities/determinism_log.hpp>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix the missing determinism_log.hpp include.

Clang already fails to resolve this header, so the new deterministic logging path does not compile. The same include was added in multiple changed files, so this blocks the whole PR until the file is added or the include path is corrected.

🧰 Tools
🪛 Clang (14.0.6)

[error] 8-8: 'utilities/determinism_log.hpp' file not found

(clang-diagnostic-error)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` at line 8, The build fails
because the new include <utilities/determinism_log.hpp> referenced in
branch_and_bound.cpp (and several other changed files) cannot be found; either
add the missing header file into the project at the expected include path (so
<utilities/determinism_log.hpp> resolves) or change the include to the correct
relative path used by the codebase and update the build include directories
(CMakeLists or equivalent) so the compiler can locate determinism_log.hpp;
locate occurrences of <utilities/determinism_log.hpp> in branch_and_bound.cpp
and the other modified files and apply the chosen fix consistently.

Comment on lines 589 to 590
repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN});
mutex_repair_.unlock();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Populate a non-negative repair timestamp here.

queued_repair_solution_t::work_timestamp defaults to -1.0, and repaired incumbents later flow into emit_solution_callback_*(), which now asserts that timestamps are non-negative. With new_incumbent_callback enabled, a repaired injected solution can abort the solve on this path.

🛠️ Suggested fix
   if (attempt_repair) {
     mutex_repair_.lock();
-    repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN});
+    const double repair_work_timestamp =
+      settings_.deterministic ? work_unit_context_.current_work() : 0.0;
+    repair_queue_.push_back({solution,
+                             cuopt::internals::mip_solution_origin_t::UNKNOWN,
+                             repair_work_timestamp});
     mutex_repair_.unlock();
   }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN});
mutex_repair_.unlock();
const double repair_work_timestamp =
settings_.deterministic ? work_unit_context_.current_work() : 0.0;
repair_queue_.push_back({solution,
cuopt::internals::mip_solution_origin_t::UNKNOWN,
repair_work_timestamp});
mutex_repair_.unlock();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 589 - 590, The
queued repair entry is pushed with a default negative work_timestamp
(queued_repair_solution_t::work_timestamp == -1.0), which later triggers
assertions in emit_solution_callback_*() when new_incumbent_callback is enabled;
fix by populating a non-negative timestamp before pushing into repair_queue_
(replace the push_back call so the queued_repair_solution_t instance sets
work_timestamp to a valid value, e.g., current wall-clock time or 0.0), ensuring
the code path around repair_queue_.push_back and mutex_repair_.unlock uses the
updated queued_repair_solution_t with a non-negative work_timestamp.

Comment on lines +259 to +297
template <typename i_t, typename f_t>
f_t trial_branching_generic(const lp_problem_t<i_t, f_t>& original_lp,
const simplex_solver_settings_t<i_t, f_t>& settings,
const std::vector<variable_type_t>& var_types,
const std::vector<variable_status_t>& vstatus,
const std::vector<f_t>& edge_norms,
const basis_update_mpf_t<i_t, f_t>& basis_factors,
const std::vector<i_t>& basic_list,
const std::vector<i_t>& nonbasic_list,
i_t branch_var,
f_t branch_var_lower,
f_t branch_var_upper,
f_t upper_bound,
i_t bnb_lp_iter_per_node,
f_t start_time,
i_t upper_max_lp_iter,
i_t lower_max_lp_iter,
int64_t& total_lp_iter)
{
omp_atomic_t<int64_t> atomic_iter{0};
f_t result = trial_branching(original_lp,
settings,
var_types,
vstatus,
edge_norms,
basis_factors,
basic_list,
nonbasic_list,
branch_var,
branch_var_lower,
branch_var_upper,
upper_bound,
bnb_lp_iter_per_node,
start_time,
upper_max_lp_iter,
lower_max_lp_iter,
atomic_iter);
total_lp_iter += atomic_iter.load();
return result;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

The deterministic strong-branching counter still has a data race.

This overload writes to a shared int64_t& total_lp_iter, but reliable_variable_selection_core() passes that same reference into an OpenMP taskloop. total_lp_iter += atomic_iter.load() is therefore unsynchronized and can drop iterations nondeterministically, which feeds back into the reliability threshold.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 259 - 297, The
function trial_branching_generic writes to a shared int64_t& total_lp_iter
unsafely (called from reliable_variable_selection_core inside an OpenMP
taskloop), causing a data race; change the update to perform an atomic addition
instead of a plain += by either (a) changing the parameter to accept an
omp_atomic_t<int64_t>& (or std::atomic<int64_t>&) and calling its
fetch_add(atomic_iter.load())/operator+=, or (b) keep int64_t& but replace the
plain total_lp_iter += atomic_iter.load() with an OpenMP atomic update (e.g. use
`#pragma` omp atomic or an atomic_fetch_add wrapper) so the increment is
synchronized; ensure references to atomic_iter, total_lp_iter,
trial_branching_generic, and reliable_variable_selection_core are updated
consistently.

Comment on lines +356 to +359
if (solution.get_feasible()) {
population.add_solution(std::move(solution),
internals::mip_solution_origin_t::FEASIBILITY_JUMP);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Don't move the caller-owned solution out of run_fj_alone().

run_solver() logs and returns the same sol after this call. Moving it into population here leaves the returned object unspecified on the fj_only_run path.

Suggested fix
   ls.fj.solve(solution);
   if (solution.get_feasible()) {
-    population.add_solution(std::move(solution),
+    population.add_solution(solution_t<i_t, f_t>(solution),
                             internals::mip_solution_origin_t::FEASIBILITY_JUMP);
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 356 -
359, The code moves the caller-owned `solution` into `population` inside
`run_fj_alone()` which leaves the same `sol` returned by `run_solver()`
unspecified on the `fj_only_run` path; instead, avoid transferring ownership:
change the call to `population.add_solution` so it does not take `solution` via
move—either pass `solution` by const reference (if `add_solution` supports it)
or create an explicit copy/clone before moving (e.g. make a local copy and move
that) so the original `solution` remains valid for `run_solver()` return; update
the call site around `population.add_solution(std::move(solution),
internals::mip_solution_origin_t::FEASIBILITY_JUMP)` accordingly.

Comment on lines 125 to +128
if (!full_refresh && fj.pb.related_variables.size() > 0 &&
fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold) {
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Guard the fallback ratio check against zero work IDs.

If fj.work_ids_for_related_vars[*fj.selected_var] is 0, this divides by zero inside the kernel. That's enough to blow up the fallback path for isolated/empty-related variables.

Possible fix
-  if (!full_refresh && fj.pb.related_variables.size() > 0 &&
-      fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >=
+  auto related_work_ids = fj.work_ids_for_related_vars[*fj.selected_var];
+  if (!full_refresh && fj.pb.related_variables.size() > 0 && related_work_ids > 0 &&
+      fj.pb.n_variables / related_work_ids >=
         fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
       fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if (!full_refresh && fj.pb.related_variables.size() > 0 &&
fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold) {
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) {
auto related_work_ids = fj.work_ids_for_related_vars[*fj.selected_var];
if (!full_refresh && fj.pb.related_variables.size() > 0 && related_work_ids > 0 &&
fj.pb.n_variables / related_work_ids >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh` around lines 125
- 128, The division uses fj.work_ids_for_related_vars[*fj.selected_var] and can
divide by zero; update the condition to first fetch that value (e.g., auto
work_ids = fj.work_ids_for_related_vars[*fj.selected_var]) and guard the ratio
check with work_ids > 0 (or replace the division with a safe multiply
comparison) so the clause becomes: full_refresh is false && related_variables
non-empty && work_ids > 0 && fj.pb.n_variables / work_ids >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON. Ensure
you reference fj.selected_var and fj.work_ids_for_related_vars in the change.

const raft::handle_t* handle_ptr;
std::mt19937 gen;
timer_t timer{0.};
work_limit_timer_t timer;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for work_limit_timer_t definition to verify default constructor behavior
ast-grep --pattern 'class work_limit_timer_t {
  $$$
}'

# Also check for struct definition
ast-grep --pattern 'struct work_limit_timer_t {
  $$$
}'

# Search for using/typedef
rg -n 'using work_limit_timer_t' --type cpp
rg -n 'typedef.*work_limit_timer_t' --type cpp

Repository: aliceb-nv/cuopt-fork

Length of output: 159


🏁 Script executed:

#!/bin/bash
# Find the termination_checker_t definition in the file where the type alias is
cat cpp/src/utilities/termination_checker.hpp | head -250 | tail -100

# Also search for termination_checker_t class/struct definition more broadly
rg -n 'class termination_checker_t|struct termination_checker_t' --type cpp -A 20

Repository: aliceb-nv/cuopt-fork

Length of output: 4597


🏁 Script executed:

#!/bin/bash
# Search for timer_t definition
rg -n 'using timer_t|typedef.*timer_t|class timer_t|struct timer_t' --type cpp -A 5

Repository: aliceb-nv/cuopt-fork

Length of output: 398


🏁 Script executed:

#!/bin/bash
# Get the full timer_t class definition
sed -n '16,100p' cpp/src/utilities/timer.hpp

Repository: aliceb-nv/cuopt-fork

Length of output: 2852


🏁 Script executed:

#!/bin/bash
# Search for all constructors of termination_checker_t
rg -n 'termination_checker_t(' cpp/src/utilities/termination_checker.hpp | head -20

Repository: aliceb-nv/cuopt-fork

Length of output: 158


🏁 Script executed:

#!/bin/bash
# Search for all constructors of termination_checker_t
rg -n 'termination_checker_t\(' cpp/src/utilities/termination_checker.hpp

Repository: aliceb-nv/cuopt-fork

Length of output: 272


This change will not compile: work_limit_timer_t cannot be default-constructed.

The type alias work_limit_timer_t refers to termination_checker_t, which has no explicit default constructor and contains a member timer_t timer without an initializer. Since timer_t's default constructor is explicitly deleted, the implicit default constructor of termination_checker_t cannot be generated. The statement work_limit_timer_t timer; at line 147 will fail to compile.

The original code timer_t timer{0.} explicitly constructed the timer with a value. The replacement requires either:

  • An explicit default constructor for termination_checker_t that properly initializes timer_t
  • Or initializing timer_t with a valid argument at the declaration point
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh` at line 147,
work_limit_timer_t cannot be default-constructed because termination_checker_t
contains a timer_t whose default ctor is deleted; replace the problematic
declaration by either (A) initialize the local instance with a valid timer value
(e.g. construct the alias with the same argument used previously, e.g. use
work_limit_timer_t timer{0.} or the appropriate float/duration), or (B) add an
explicit default constructor to termination_checker_t that initializes its timer
member (timer_t timer) with the required argument (e.g. timer{0.}) so that
termination_checker_t can be default-constructed; update the declaration of
work_limit_timer_t timer accordingly to match the chosen approach.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 12

🧹 Nitpick comments (8)
cpp/src/mip_heuristics/solver.cu (1)

20-23: Consider documenting the macro override intent.

The #undef followed by #define overrides the determinism logging macro to redirect to CUOPT_LOG_INFO. While this is intentional for verbose logging during development/debugging, it may inadvertently flood production logs. Consider adding a comment explaining the intent or guarding with a build flag.

💡 Suggested documentation
 `#include` <utilities/determinism_log.hpp>

+// Override determinism logging to emit INFO-level logs for debugging work-unit accounting.
+// TODO: Consider making this conditional on a build flag for production builds.
 `#undef` CUOPT_DETERMINISM_LOG
 `#define` CUOPT_DETERMINISM_LOG(...) CUOPT_LOG_INFO(__VA_ARGS__)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver.cu` around lines 20 - 23, Add documentation and
a compilation guard around the macro override: explain why CUOPT_DETERMINISM_LOG
is being undefined and redefined to CUOPT_LOG_INFO (e.g., temporary verbose
logging for development/troubleshooting) and either wrap the `#undef/`#define in a
conditional build flag (such as `#ifdef` DEBUG or a project-specific
CUOPT_DETERMINISM_OVERRIDE) or add a clear comment referencing
utilities/determinism_log.hpp and CUOPT_LOG_INFO so maintainers know this change
is intentional and can disable it for production.
cpp/include/cuopt/linear_programming/utilities/internals.hpp (3)

107-118: Add Doxygen documentation for extended callback class.

This class extends the callback interface with origin metadata. Per coding guidelines, public classes should document their purpose and thread-safety characteristics.

📝 Suggested documentation
+/**
+ * `@brief` Extended solution callback that receives origin metadata alongside solution data.
+ *
+ * Subclass and implement get_solution() to receive solutions with provenance information.
+ * Register via solver settings to receive callbacks during the solve process.
+ *
+ * `@note` Callbacks may be invoked from worker threads; implementations should be thread-safe.
+ */
 class get_solution_callback_ext_t : public base_solution_callback_t {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines
107 - 118, Add Doxygen documentation to the public class
get_solution_callback_ext_t describing its purpose as an extended solution
callback that supplies origin/metadata alongside solution data, document the
virtual method get_solution parameters and expected behavior, and state
thread-safety guarantees (e.g., whether implementations must be thread-safe or
called from a single thread). Place the comment above the class definition and
mention the overridden get_type() behavior returning
base_solution_callback_type::GET_SOLUTION_EXT so readers can locate this
specialization.

27-42: Add Doxygen documentation for new public enum.

As per coding guidelines for public headers under cpp/include/cuopt/**/*, new public entities should have documentation comments. This enum introduces the solution origin taxonomy that external callbacks will receive.

📝 Suggested documentation
+/**
+ * `@brief` Origin type for MIP solutions, indicating which solver component found the solution.
+ *
+ * Used in extended callbacks (get_solution_callback_ext_t) to provide provenance metadata.
+ */
 enum class mip_solution_origin_t : uint32_t {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 27
- 42, Add Doxygen documentation to the new public enum mip_solution_origin_t and
each enumerator so external users and generated docs understand the
solution-origin taxonomy; update the enum definition in internals.hpp
(mip_solution_origin_t) to include a brief summary comment above the enum and
short `@enum/`@brief or per-enumerator `@brief` comments for values like UNKNOWN,
BRANCH_AND_BOUND_NODE, BRANCH_AND_BOUND_DIVING, FEASIBILITY_JUMP,
CPU_FEASIBILITY_JUMP, LOCAL_SEARCH, QUICK_FEASIBLE, LP_ROUNDING, RECOMBINATION,
SUB_MIP, USER_INITIAL, USER_INJECTED, RINS, and PRESOLVE describing what each
origin means and any callback implications.

65-69: Add Doxygen documentation for callback info struct.

This struct is part of the public extended callback API and should document its fields for downstream consumers.

📝 Suggested documentation
+/**
+ * `@brief` Extended callback metadata for MIP solution events.
+ *
+ * `@param` struct_size Size of this struct for ABI versioning.
+ * `@param` origin The solver component that produced the solution.
+ * `@param` work_timestamp Deterministic work-unit timestamp (-1.0 if unavailable).
+ */
 struct mip_solution_callback_info_t {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 65
- 69, The public callback struct mip_solution_callback_info_t lacks Doxygen
comments; add a brief Doxygen comment above the struct describing its role in
the extended callback API and document each member: document struct_size as the
size/version stamp (uint64_t) used for ABI compatibility, origin
(mip_solution_origin_t) as the provenance of the solution with possible enum
values, and work_timestamp (double) as the timestamp or monotonic time
associated with the callback; keep descriptions concise and use `@brief` and
`@param/`@var or `@field` tags so downstream consumers and generated docs see each
field purpose.
cpp/tests/mip/diversity_test.cu (1)

71-73: Remove unused seed parameter from test helper functions.

The seed parameter is declared in test_full_run_determinism, test_initial_solution_determinism, and test_recombiners_determinism but never used within these functions. The seed is instead set via cuopt::seed_generator::set_seed(seed) in the test cases before calling these helpers.

🧹 Proposed signature cleanup
-static uint32_t test_full_run_determinism(std::string path,
-                                          unsigned long seed = std::random_device{}(),
-                                          float work_limit   = 10.0f)
+static uint32_t test_full_run_determinism(std::string path,
+                                          float work_limit = 10.0f)

Apply similar changes to test_initial_solution_determinism and test_recombiners_determinism.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 71 - 73, Remove the unused seed
parameter from the helper functions by deleting the seed parameter (and its
default initializer) from the declarations and definitions of
test_full_run_determinism, test_initial_solution_determinism, and
test_recombiners_determinism; then update all call sites so they no longer pass
a seed argument (the tests already set the seed via
cuopt::seed_generator::set_seed(seed)), ensuring signatures only include the
necessary parameters (e.g., path and work_limit or whatever else the helper
truly needs).
cpp/src/branch_and_bound/branch_and_bound.cpp (3)

48-54: Debug macro redefinition pattern is unusual but functional.

The #undef followed by #define pattern works, but having the macro unconditionally enabled (despite the "uncomment to enable" comment) will produce verbose logs in production builds. Consider using a compile-time flag (e.g., #ifdef CUOPT_ENABLE_DETERMINISM_LOGGING) to conditionally enable this.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 48 - 54, The
current code unconditionally undefines and defines the CUOPT_DETERMINISM_LOG
macro (symbols: CUOPT_DETERMINISM_LOG, CUOPT_DETERMINISM_LOG(logger,...)) which
forces verbose logging; change it to be enabled only when a compile-time flag is
set by replacing the unconditional pattern with a conditional compile guard
(e.g., wrap the `#define` of CUOPT_DETERMINISM_LOG inside an `#ifdef`
CUOPT_ENABLE_DETERMINISM_LOGGING / `#endif` and remove the unconditional
undef/#define), and provide a no-op fallback macro when the flag is not defined
so production builds remain quiet.

591-593: Repair queue path is non-deterministic only—timestamp default is acceptable here.

I've traced the flow: set_new_solution asserts non-deterministic mode at line 527, and repair_heuristic_solutions is only called from non-deterministic schedulers. The emit_solution_callback assertion (line 496-497) only fails when deterministic=true, so the -1.0 default timestamp won't trigger failures.

However, for clarity and future-proofing, consider explicitly setting work_timestamp = 0.0 to document the intent:

♻️ Suggested improvement
   if (attempt_repair) {
     mutex_repair_.lock();
-    repair_queue_.push_back({solution, origin});
+    repair_queue_.push_back({solution, origin, 0.0});  // Non-deterministic path uses wall-clock time
     mutex_repair_.unlock();
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 591 - 593, The
repair queue push currently uses a default timestamp (-1.0) which is acceptable
but unclear; change the pushed tuple so the timestamp is explicitly 0.0 to
document non-deterministic-only intent: in the block guarded by attempt_repair
(around mutex_repair_.lock() and repair_queue_.push_back({solution, origin})),
replace the implicit/default timestamp with an explicit work_timestamp = 0.0
when constructing the pushed entry; this clarifies behavior relative to
set_new_solution, repair_heuristic_solutions, and emit_solution_callback without
changing semantics.

4192-4238: Load balancing uses backlog counts, not total work estimates.

The balancing logic at deterministic_balance_worker_loads distributes based on backlog.size() (node count) rather than estimated work. For heterogeneous nodes (varying LP difficulty), this could lead to uneven work distribution. Consider adding an objective-estimate-based metric if load imbalance becomes problematic in practice.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 4192 - 4238, The
balancing currently in deterministic_balance_worker_loads uses backlog.size()
(backlog_counts) which ignores per-node cost; change it to use an estimated-work
metric by summing each node's work estimate instead of counts: for each
mip_node_t<i_t,f_t>* in worker.backlog use an existing field or method (e.g.,
node->estimated_work or implement estimate_node_work(node)) to compute
per-worker total_work_estimates vector and total_work, max_work, min_work, and
use those to decide needs_balance. When redistributing, collect all nodes into
all_backlog_nodes then perform a greedy largest-first assignment (sort nodes by
estimated work descending and push each node to the worker with current smallest
total_assigned_work) rather than simple round-robin so heavy nodes are balanced;
update any references to backlog_counts/min/max to use work estimates. Ensure
worker.backlog.clear()/push logic still uses mip_node_t* but maintains the new
per-worker work totals during distribution.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks/linear_programming/cuopt/run_mip.cpp`:
- Around line 270-277: The log message passed to CUOPT_LOG_INFO contains a debug
artifact "1run_mip"; update the format string in the CUOPT_LOG_INFO call (the
call at/around function run_mip) to remove the leading "1" so it reads "run_mip
settings: heuristics_only=%d deterministic=%d determinism_mode=%d
time_limit=%.6f work_limit=%.6f" while leaving the argument list
((int)heuristics_only, (int)deterministic, settings.determinism_mode,
settings.time_limit, settings.work_limit) unchanged.
- Line 299: The commented-out debug line contains a stray debug prefix
"1solution"; update the comment to remove the prefix so the call reads as a
normal commented call (referencing write_to_sol_file, base_filename and
handle_.get_stream()) — e.g., change "//
1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());" to
"// solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());"
(or fully uncomment if intended to be active).

In `@cpp/include/cuopt/linear_programming/constants.h`:
- Around line 80-92: Typo in the comment for CUOPT_DETERMINISM_BB: change
"rleease" to "release" in the comment immediately following the
CUOPT_DETERMINISM_BB macro definition so the comment reads "...in the previous
release"; locate the comment next to the CUOPT_DETERMINISM_BB symbol in
constants.h and correct the spelling.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 3773-3781: The code unconditionally sets lp_settings.work_limit =
std::numeric_limits<f_t>::infinity(), which ignores any user-configured work
budget; change lp_settings.work_limit to respect settings_.work_limit by
computing a per-node budget (e.g., min(settings_.work_limit - used_work, a
reasonable per-node cap, or remaining_time-derived budget) before assigning to
lp_settings.work_limit), ensuring lp_settings.work_limit is finite when a user
limit exists and falls back to infinity only if settings_.work_limit is
infinite; update the assignment site (lp_settings.work_limit) and any logic that
computes remaining/horizon budgets so the node LP solve cannot exceed the
configured work budget.
- Around line 4446-4454: The diving branch sets lp_settings.work_limit to
infinity so individual LP solves ignore global work limits; change
lp_settings.work_limit to a finite remaining work value (mirror the BFS fix) by
assigning it to the available work budget (e.g., compute remaining_work =
worker.work_limit - worker.work_done or use the existing remaining_work
variable) and set lp_settings.work_limit = remaining_work instead of infinity;
update the code around lp_settings.work_limit in the same block that sets
cut_off and time_limit so diving LP respects the global work budget.

In `@cpp/src/mip_heuristics/problem/presolve_data.cu`:
- Around line 114-128: fixed_var_assignment is being used as shared device
scratch while callbacks may run on different override streams, causing races;
allocate a per-call device buffer (or obtain a per-stream scratch) instead of
referencing the shared member when launching the thrust kernel and when calling
expand_device_copy/cuopt::host_copy. Concretely, inside the presolve routine
where h (from handle_override or problem.handle_ptr), current_assignment,
variable_mapping, assgn, var_map and fixed_assgn are used, create a local device
buffer for the fixed assignment (or query a per-stream scratch tied to
h->get_stream()), use that local buffer in the thrust::for_each lambda and pass
it to expand_device_copy and cuopt::host_copy, and free/let it go out of scope;
apply the same change to the other occurrence around the 151-154 block.

In `@cpp/src/mip_heuristics/problem/problem.cu`:
- Around line 951-955: The deterministic early-return leaves
related_variables_offsets empty which breaks view_t::range_for_related_vars()
(it reads related_variables_offsets[v + 1]); instead of clearing to size 0,
resize related_variables_offsets to a zero-filled offsets array representing no
related vars (one zero per vertex plus a sentinel, e.g., set
related_variables_offsets.size = num_vertices + 1 and fill with 0) and ensure
related_variables is resized to 0 as before; update the deterministic branch
that currently calls related_variables.resize(0, ...) and
related_variables_offsets.resize(0, ...) to produce the zero-filled offsets so
callers get empty ranges safely.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh`:
- Around line 195-197: The callback currently records runtime SET_SOLUTION
injections with the wrong provenance by calling
on_injected(outside_sol.get_host_assignment(), outside_sol.get_objective(),
internals::mip_solution_origin_t::USER_INITIAL); — change the origin tag to
internals::mip_solution_origin_t::USER_INJECTED so runtime callback injections
are correctly classified; update the call site of on_injected(...) accordingly
(keep outside_sol.get_host_assignment() and outside_sol.get_objective() intact).
- Around line 193-194: The assertion uses a fixed absolute tolerance (1e-6)
which is brittle for large-magnitude objectives or float instantiations; modify
the cuopt_assert in the callback that checks outside_sol.get_user_objective() vs
outside_sol_objective to use a scaled/relative tolerance instead (e.g., compute
tol = rel_tol * std::max<f_t>(1,
std::max(std::abs(outside_sol.get_user_objective()),
std::abs(outside_sol_objective))) or include epsilon from
std::numeric_limits<f_t>::epsilon() when appropriate) and then assert that
std::abs(... - ...) <= tol; update the call site around cuopt_assert to use this
computed tol rather than the hard-coded 1e-6.
- Around line 157-177: The callback currently copies the host solution to
device, then applies scaling (scaling.scale_solutions) before calling
problem_ptr->pre_process_assignment, which can mis-scale when presolve changes
indices/sizes; change the order so that after set_sol_callback->set_solution
fills h_incumbent_assignment and raft::copy moves it into incumbent_assignment,
you call problem_ptr->pre_process_assignment(incumbent_assignment) first, and
only then, if settings.mip_scaling, call
scaling.scale_solutions(incumbent_assignment); keep the calls to
set_sol_callback, raft::copy, pre_process_assignment, and
scaling.scale_solutions in that sequence and preserve stream/synchronization
semantics.
- Around line 16-18: The header solution_callbacks.cuh uses std::isfinite and
std::abs but does not include <cmath>, relying on transitive includes; add
`#include` <cmath> to this header so it is self-contained and ensures
std::isfinite and std::abs are declared for the uses in solution_callbacks.cuh
(references: std::isfinite at the code using it and std::abs where used).

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 300-312: Remove the unreachable dead code after the early return:
keep the single return statement that calls detail::compute_hash(hashes) and
delete the subsequent block that references
diversity_manager.get_population_pointer(), the loop over
pop->population_to_vector() calling sol.get_hash(), the secondary call to
detail::compute_hash, the printf using path.c_str(), and the later return of
final_hash so no code remains after the initial return.

---

Nitpick comments:
In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp`:
- Around line 107-118: Add Doxygen documentation to the public class
get_solution_callback_ext_t describing its purpose as an extended solution
callback that supplies origin/metadata alongside solution data, document the
virtual method get_solution parameters and expected behavior, and state
thread-safety guarantees (e.g., whether implementations must be thread-safe or
called from a single thread). Place the comment above the class definition and
mention the overridden get_type() behavior returning
base_solution_callback_type::GET_SOLUTION_EXT so readers can locate this
specialization.
- Around line 27-42: Add Doxygen documentation to the new public enum
mip_solution_origin_t and each enumerator so external users and generated docs
understand the solution-origin taxonomy; update the enum definition in
internals.hpp (mip_solution_origin_t) to include a brief summary comment above
the enum and short `@enum/`@brief or per-enumerator `@brief` comments for values
like UNKNOWN, BRANCH_AND_BOUND_NODE, BRANCH_AND_BOUND_DIVING, FEASIBILITY_JUMP,
CPU_FEASIBILITY_JUMP, LOCAL_SEARCH, QUICK_FEASIBLE, LP_ROUNDING, RECOMBINATION,
SUB_MIP, USER_INITIAL, USER_INJECTED, RINS, and PRESOLVE describing what each
origin means and any callback implications.
- Around line 65-69: The public callback struct mip_solution_callback_info_t
lacks Doxygen comments; add a brief Doxygen comment above the struct describing
its role in the extended callback API and document each member: document
struct_size as the size/version stamp (uint64_t) used for ABI compatibility,
origin (mip_solution_origin_t) as the provenance of the solution with possible
enum values, and work_timestamp (double) as the timestamp or monotonic time
associated with the callback; keep descriptions concise and use `@brief` and
`@param/`@var or `@field` tags so downstream consumers and generated docs see each
field purpose.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 48-54: The current code unconditionally undefines and defines the
CUOPT_DETERMINISM_LOG macro (symbols: CUOPT_DETERMINISM_LOG,
CUOPT_DETERMINISM_LOG(logger,...)) which forces verbose logging; change it to be
enabled only when a compile-time flag is set by replacing the unconditional
pattern with a conditional compile guard (e.g., wrap the `#define` of
CUOPT_DETERMINISM_LOG inside an `#ifdef` CUOPT_ENABLE_DETERMINISM_LOGGING / `#endif`
and remove the unconditional undef/#define), and provide a no-op fallback macro
when the flag is not defined so production builds remain quiet.
- Around line 591-593: The repair queue push currently uses a default timestamp
(-1.0) which is acceptable but unclear; change the pushed tuple so the timestamp
is explicitly 0.0 to document non-deterministic-only intent: in the block
guarded by attempt_repair (around mutex_repair_.lock() and
repair_queue_.push_back({solution, origin})), replace the implicit/default
timestamp with an explicit work_timestamp = 0.0 when constructing the pushed
entry; this clarifies behavior relative to set_new_solution,
repair_heuristic_solutions, and emit_solution_callback without changing
semantics.
- Around line 4192-4238: The balancing currently in
deterministic_balance_worker_loads uses backlog.size() (backlog_counts) which
ignores per-node cost; change it to use an estimated-work metric by summing each
node's work estimate instead of counts: for each mip_node_t<i_t,f_t>* in
worker.backlog use an existing field or method (e.g., node->estimated_work or
implement estimate_node_work(node)) to compute per-worker total_work_estimates
vector and total_work, max_work, min_work, and use those to decide
needs_balance. When redistributing, collect all nodes into all_backlog_nodes
then perform a greedy largest-first assignment (sort nodes by estimated work
descending and push each node to the worker with current smallest
total_assigned_work) rather than simple round-robin so heavy nodes are balanced;
update any references to backlog_counts/min/max to use work estimates. Ensure
worker.backlog.clear()/push logic still uses mip_node_t* but maintains the new
per-worker work totals during distribution.

In `@cpp/src/mip_heuristics/solver.cu`:
- Around line 20-23: Add documentation and a compilation guard around the macro
override: explain why CUOPT_DETERMINISM_LOG is being undefined and redefined to
CUOPT_LOG_INFO (e.g., temporary verbose logging for development/troubleshooting)
and either wrap the `#undef/`#define in a conditional build flag (such as `#ifdef`
DEBUG or a project-specific CUOPT_DETERMINISM_OVERRIDE) or add a clear comment
referencing utilities/determinism_log.hpp and CUOPT_LOG_INFO so maintainers know
this change is intentional and can disable it for production.

In `@cpp/tests/mip/diversity_test.cu`:
- Around line 71-73: Remove the unused seed parameter from the helper functions
by deleting the seed parameter (and its default initializer) from the
declarations and definitions of test_full_run_determinism,
test_initial_solution_determinism, and test_recombiners_determinism; then update
all call sites so they no longer pass a seed argument (the tests already set the
seed via cuopt::seed_generator::set_seed(seed)), ensuring signatures only
include the necessary parameters (e.g., path and work_limit or whatever else the
helper truly needs).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c6ac0f60-5442-4c71-a41f-54397e72d31c

📥 Commits

Reviewing files that changed from the base of the PR and between e80e545 and b41dc3f.

📒 Files selected for processing (18)
  • benchmarks/linear_programming/cuopt/run_mip.cpp
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/utilities/internals.hpp
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/diversity/population.cuh
  • cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/problem/problem.cuh
  • cpp/src/mip_heuristics/solution/solution.cu
  • cpp/src/mip_heuristics/solution/solution.cuh
  • cpp/src/mip_heuristics/solution_callbacks.cuh
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/tests/mip/diversity_test.cu

Comment on lines +270 to +277
CUOPT_LOG_INFO(
"1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
"time_limit=%.6f work_limit=%.6f",
(int)heuristics_only,
(int)deterministic,
settings.determinism_mode,
settings.time_limit,
settings.work_limit);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove debug prefix from log message.

The log message contains "1run_mip" which appears to be a debug artifact. This should be "run_mip".

🔧 Proposed fix
   CUOPT_LOG_INFO(
-    "1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
+    "run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
     "time_limit=%.6f work_limit=%.6f",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
CUOPT_LOG_INFO(
"1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
"time_limit=%.6f work_limit=%.6f",
(int)heuristics_only,
(int)deterministic,
settings.determinism_mode,
settings.time_limit,
settings.work_limit);
CUOPT_LOG_INFO(
"run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
"time_limit=%.6f work_limit=%.6f",
(int)heuristics_only,
(int)deterministic,
settings.determinism_mode,
settings.time_limit,
settings.work_limit);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/linear_programming/cuopt/run_mip.cpp` around lines 270 - 277, The
log message passed to CUOPT_LOG_INFO contains a debug artifact "1run_mip";
update the format string in the CUOPT_LOG_INFO call (the call at/around function
run_mip) to remove the leading "1" so it reads "run_mip settings:
heuristics_only=%d deterministic=%d determinism_mode=%d time_limit=%.6f
work_limit=%.6f" while leaving the argument list ((int)heuristics_only,
(int)deterministic, settings.determinism_mode, settings.time_limit,
settings.work_limit) unchanged.

benchmark_info.last_improvement_of_best_feasible,
benchmark_info.last_improvement_after_recombination);
// solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
// 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove debug prefix from comment.

The commented-out code has "1solution" which appears to be a debug marker. If this code should remain commented out, clean up the debug prefix.

🔧 Proposed fix
-  // 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
+  // solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
// solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/linear_programming/cuopt/run_mip.cpp` at line 299, The
commented-out debug line contains a stray debug prefix "1solution"; update the
comment to remove the prefix so the call reads as a normal commented call
(referencing write_to_sol_file, base_filename and handle_.get_stream()) — e.g.,
change "// 1solution.write_to_sol_file(base_filename + ".sol",
handle_.get_stream());" to "// solution.write_to_sol_file(base_filename +
".sol", handle_.get_stream());" (or fully uncomment if intended to be active).

Comment on lines +80 to +92
/* @brief MIP determinism mode flags (bitset) */
#define CUOPT_DETERMINISM_NONE 0x0
#define CUOPT_DETERMINISM_BB \
0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous
// rleease
#define CUOPT_DETERMINISM_GPU_HEURISTICS 0x2
#define CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS)

/* Backward compatibility aliases */
#define CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE
#define CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL
#define CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB
#define CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix typo in comment.

Line 84 has "rleease" which should be "release".

🔧 Proposed fix
 `#define` CUOPT_DETERMINISM_BB \
-  0x1  // matches the previous value of '1' which was for B&B-only determinism in the previous
-       // rleease
+  0x1  // matches the previous value of '1' which was for B&B-only determinism in the previous
+       // release
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/* @brief MIP determinism mode flags (bitset) */
#define CUOPT_DETERMINISM_NONE 0x0
#define CUOPT_DETERMINISM_BB \
0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous
// rleease
#define CUOPT_DETERMINISM_GPU_HEURISTICS 0x2
#define CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS)
/* Backward compatibility aliases */
#define CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE
#define CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL
#define CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB
#define CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS
/* `@brief` MIP determinism mode flags (bitset) */
`#define` CUOPT_DETERMINISM_NONE 0x0
`#define` CUOPT_DETERMINISM_BB \
0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous
// release
`#define` CUOPT_DETERMINISM_GPU_HEURISTICS 0x2
`#define` CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS)
/* Backward compatibility aliases */
`#define` CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE
`#define` CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL
`#define` CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB
`#define` CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/constants.h` around lines 80 - 92, Typo
in the comment for CUOPT_DETERMINISM_BB: change "rleease" to "release" in the
comment immediately following the CUOPT_DETERMINISM_BB macro definition so the
comment reads "...in the previous release"; locate the comment next to the
CUOPT_DETERMINISM_BB symbol in constants.h and correct the spelling.

Comment on lines +3773 to +3781
if (original_lp_.objective_is_integral) {
lp_settings.cut_off =
std::ceil(worker.local_upper_bound - settings_.integer_tol) + settings_.dual_tol;
} else {
lp_settings.cut_off = worker.local_upper_bound + settings_.dual_tol;
}
lp_settings.inside_mip = 2;
lp_settings.time_limit = remaining_time;
lp_settings.work_limit = std::numeric_limits<f_t>::infinity();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Node LP work_limit unconditionally set to infinity ignores user-configured work limits.

Setting lp_settings.work_limit = std::numeric_limits<f_t>::infinity() means individual node LP solves will never respect the user's work limit. The work limit check only happens at horizon boundaries (line 3703-3704), so a single long-running node LP could exceed the budget before the next sync.

Consider propagating a per-node work budget derived from settings_.work_limit or at minimum documenting this design decision.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 3773 - 3781, The
code unconditionally sets lp_settings.work_limit =
std::numeric_limits<f_t>::infinity(), which ignores any user-configured work
budget; change lp_settings.work_limit to respect settings_.work_limit by
computing a per-node budget (e.g., min(settings_.work_limit - used_work, a
reasonable per-node cap, or remaining_time-derived budget) before assigning to
lp_settings.work_limit), ensuring lp_settings.work_limit is finite when a user
limit exists and falls back to infinity only if settings_.work_limit is
infinite; update the assignment site (lp_settings.work_limit) and any logic that
computes remaining/horizon budgets so the node LP solve cannot exceed the
configured work budget.

Comment on lines +4446 to +4454
if (original_lp_.objective_is_integral) {
lp_settings.cut_off =
std::ceil(worker.local_upper_bound - settings_.integer_tol) + settings_.dual_tol;
} else {
lp_settings.cut_off = worker.local_upper_bound + settings_.dual_tol;
}
lp_settings.inside_mip = 2;
lp_settings.time_limit = remaining_time;
lp_settings.work_limit = std::numeric_limits<f_t>::infinity();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Diving worker also sets work_limit to infinity—same concern as BFS worker.

Same issue as the deterministic BFS path at line 3781. Individual diving LP solves won't respect work limits.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 4446 - 4454, The
diving branch sets lp_settings.work_limit to infinity so individual LP solves
ignore global work limits; change lp_settings.work_limit to a finite remaining
work value (mirror the BFS fix) by assigning it to the available work budget
(e.g., compute remaining_work = worker.work_limit - worker.work_done or use the
existing remaining_work variable) and set lp_settings.work_limit =
remaining_work instead of infinity; update the code around
lp_settings.work_limit in the same block that sets cut_off and time_limit so
diving LP respects the global work budget.

Comment on lines +16 to +18
#include <limits>
#include <mutex>
#include <vector>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '10,25p' cpp/src/mip_heuristics/solution_callbacks.cuh
rg -n '\bstd::(isfinite|abs)\b' cpp/src/mip_heuristics/solution_callbacks.cuh

Repository: aliceb-nv/cuopt-fork

Length of output: 670


Include <cmath> in this header.

The header uses std::isfinite (line 75) and std::abs (line 193), both declared in <cmath>, but this header is not included. The code currently works only through transitive includes, making the header non-self-contained.

Proposed fix
+#include <cmath>
 `#include` <limits>
 `#include` <mutex>
 `#include` <vector>
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
#include <limits>
#include <mutex>
#include <vector>
`#include` <cmath>
`#include` <limits>
`#include` <mutex>
`#include` <vector>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 16 - 18, The
header solution_callbacks.cuh uses std::isfinite and std::abs but does not
include <cmath>, relying on transitive includes; add `#include` <cmath> to this
header so it is self-contained and ensures std::isfinite and std::abs are
declared for the uses in solution_callbacks.cuh (references: std::isfinite at
the code using it and std::abs where used).

Comment on lines +157 to +177
auto callback_num_variables = problem_ptr->original_problem_ptr->get_n_variables();
rmm::device_uvector<f_t> incumbent_assignment(callback_num_variables,
current_incumbent.handle_ptr->get_stream());
auto inf = std::numeric_limits<f_t>::infinity();
current_incumbent.handle_ptr->sync_stream();
std::vector<f_t> h_incumbent_assignment(incumbent_assignment.size());
std::vector<f_t> h_outside_sol_objective(1, inf);
std::vector<f_t> h_user_bound(1, user_bound);
set_sol_callback->set_solution(h_incumbent_assignment.data(),
h_outside_sol_objective.data(),
h_user_bound.data(),
set_sol_callback->get_user_data());
f_t outside_sol_objective = h_outside_sol_objective[0];
if (outside_sol_objective == inf) { continue; }

raft::copy(incumbent_assignment.data(),
h_incumbent_assignment.data(),
incumbent_assignment.size(),
current_incumbent.handle_ptr->get_stream());
if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); }
bool is_valid = problem_ptr->pre_process_assignment(incumbent_assignment);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Preprocess the injected assignment before scaling it.

Line 157 builds the callback buffer in original problem space, but Line 176 applies solver-space scaling before Line 177 crushes/preprocesses it. With presolve enabled, that can mis-scale the wrong indices or sizes. The inbound path should preprocess first, then scale.

🔁 Proposed fix
       raft::copy(incumbent_assignment.data(),
                  h_incumbent_assignment.data(),
                  incumbent_assignment.size(),
                  current_incumbent.handle_ptr->get_stream());
-      if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); }
       bool is_valid = problem_ptr->pre_process_assignment(incumbent_assignment);
       if (!is_valid) { continue; }
+      if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 157 - 177, The
callback currently copies the host solution to device, then applies scaling
(scaling.scale_solutions) before calling problem_ptr->pre_process_assignment,
which can mis-scale when presolve changes indices/sizes; change the order so
that after set_sol_callback->set_solution fills h_incumbent_assignment and
raft::copy moves it into incumbent_assignment, you call
problem_ptr->pre_process_assignment(incumbent_assignment) first, and only then,
if settings.mip_scaling, call scaling.scale_solutions(incumbent_assignment);
keep the calls to set_sol_callback, raft::copy, pre_process_assignment, and
scaling.scale_solutions in that sequence and preserve stream/synchronization
semantics.

Comment on lines +193 to +194
cuopt_assert(std::abs(outside_sol.get_user_objective() - outside_sol_objective) <= 1e-6,
"External solution objective mismatch");
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Use a scaled tolerance for the callback objective check.

The recomputed objective and the callback-reported objective come from different numeric paths, so a fixed 1e-6 absolute threshold here is brittle. Large-magnitude objectives, or float instantiations of f_t, can trip this assert from normal roundoff. Please compare with a relative/scaled tolerance instead of a hard-coded absolute one.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 193 - 194, The
assertion uses a fixed absolute tolerance (1e-6) which is brittle for
large-magnitude objectives or float instantiations; modify the cuopt_assert in
the callback that checks outside_sol.get_user_objective() vs
outside_sol_objective to use a scaled/relative tolerance instead (e.g., compute
tol = rel_tol * std::max<f_t>(1,
std::max(std::abs(outside_sol.get_user_objective()),
std::abs(outside_sol_objective))) or include epsilon from
std::numeric_limits<f_t>::epsilon() when appropriate) and then assert that
std::abs(... - ...) <= tol; update the call site around cuopt_assert to use this
computed tol rather than the hard-coded 1e-6.

Comment on lines +195 to +197
on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(),
internals::mip_solution_origin_t::USER_INITIAL);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Mark runtime callback injections as USER_INJECTED.

This helper is handling a SET_SOLUTION callback, so tagging the accepted candidate as USER_INITIAL rewrites its provenance. Any origin-based logging, stats, or callback metadata will misclassify a runtime injection as startup seeding.

🏷️ Proposed fix
       on_injected(outside_sol.get_host_assignment(),
                   outside_sol.get_objective(),
-                  internals::mip_solution_origin_t::USER_INITIAL);
+                  internals::mip_solution_origin_t::USER_INJECTED);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(),
internals::mip_solution_origin_t::USER_INITIAL);
on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(),
internals::mip_solution_origin_t::USER_INJECTED);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 195 - 197, The
callback currently records runtime SET_SOLUTION injections with the wrong
provenance by calling on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(), internals::mip_solution_origin_t::USER_INITIAL); —
change the origin tag to internals::mip_solution_origin_t::USER_INJECTED so
runtime callback injections are correctly classified; update the call site of
on_injected(...) accordingly (keep outside_sol.get_host_assignment() and
outside_sol.get_objective() intact).

Comment on lines +300 to +312
return detail::compute_hash(hashes);

auto pop = diversity_manager.get_population_pointer();
for (const auto& sol : pop->population_to_vector()) {
hashes.push_back(sol.get_hash());
}

uint32_t final_hash = detail::compute_hash(hashes);
printf("%s: final hash: 0x%x, pop size %d\n",
path.c_str(),
final_hash,
(int)pop->population_to_vector().size());
return final_hash;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Remove unreachable dead code after early return.

Line 300 returns before the code at lines 302-312 can execute. This appears to be leftover code from development that should be removed.

🧹 Proposed fix to remove dead code
   }
   return detail::compute_hash(hashes);
-
-  auto pop = diversity_manager.get_population_pointer();
-  for (const auto& sol : pop->population_to_vector()) {
-    hashes.push_back(sol.get_hash());
-  }
-
-  uint32_t final_hash = detail::compute_hash(hashes);
-  printf("%s: final hash: 0x%x, pop size %d\n",
-         path.c_str(),
-         final_hash,
-         (int)pop->population_to_vector().size());
-  return final_hash;
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
return detail::compute_hash(hashes);
auto pop = diversity_manager.get_population_pointer();
for (const auto& sol : pop->population_to_vector()) {
hashes.push_back(sol.get_hash());
}
uint32_t final_hash = detail::compute_hash(hashes);
printf("%s: final hash: 0x%x, pop size %d\n",
path.c_str(),
final_hash,
(int)pop->population_to_vector().size());
return final_hash;
return detail::compute_hash(hashes);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mip/diversity_test.cu` around lines 300 - 312, Remove the
unreachable dead code after the early return: keep the single return statement
that calls detail::compute_hash(hashes) and delete the subsequent block that
references diversity_manager.get_population_pointer(), the loop over
pop->population_to_vector() calling sol.get_hash(), the secondary call to
detail::compute_hash, the printf using path.c_str(), and the later return of
final_hash so no code remains after the initial return.

@aliceb-nv aliceb-nv closed this Mar 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant