Skip to content

Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#5

Closed
aliceb-nv wants to merge 247 commits into
release/26.04from
gpudet
Closed

Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#5
aliceb-nv wants to merge 247 commits into
release/26.04from
gpudet

Conversation

@aliceb-nv
Copy link
Copy Markdown
Owner

This is a test PR to trigger coderabbit review.

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 19, 2026

📝 Walkthrough

Walkthrough

This pull request introduces solution callback enhancements with origin metadata and work-unit tracking, refactors timer infrastructure for deterministic execution control, expands determinism logging and modes across MIP solving, redesigns solution publication/injection architecture, updates work-limit integration throughout GPU heuristics, and adds comprehensive deterministic reproducibility tests.

Changes

Cohort / File(s) Summary
Solution Callback API Extensions
cpp/include/cuopt/linear_programming/cuopt_c.h, cpp/include/cuopt/linear_programming/utilities/internals.hpp
Added extended MIP solution callback infrastructure: new cuOptMIPSolutionCallbackInfo struct, cuOptMIPGetSolutionCallbackExt callback type, and cuOptSetMIPGetSolutionCallbackExt C API function; introduced mip_solution_origin_t enum and get_solution_callback_ext_t class for tracking solution provenance (branch-and-bound, heuristics, presolve, user-injected, etc.).
Determinism Modes and Constants
cpp/include/cuopt/linear_programming/constants.h
Replaced scalar determinism constants with bitset-based flags: CUOPT_DETERMINISM_NONE, CUOPT_DETERMINISM_BB, CUOPT_DETERMINISM_GPU_HEURISTICS, CUOPT_DETERMINISM_FULL; added CUOPT_MIP_SOLUTION_ORIGIN_* enumerators for 14 solution sources; updated mode macros to use new bitwise-OR semantics.
Settings and Configuration
cpp/include/cuopt/linear_programming/mip/solver_settings.hpp, cpp/src/dual_simplex/simplex_solver_settings.hpp
Extended MIP settings with work-unit scales (cpufj_work_unit_scale, gpu_heur_work_unit_scale, bb_work_unit_scale) and GPU-heuristic timing control (gpu_heur_wait_for_exploration); renamed incumbent callback from solution_callback to new_incumbent_callback with extended signature including origin and work metadata; changed default determinism_mode to CUOPT_DETERMINISM_NONE.
Timer/Work-Limit Refactoring
cpp/src/utilities/termination_checker.hpp, cpp/src/utilities/work_limit_timer.hpp, cpp/src/utilities/timer.hpp
Introduced unified termination_checker_t combining timer and work-limit logic with parent chain support, mode selection (time vs work-based termination), and deterministic progress tracking; added work_limit_timer_t backward-compatibility alias; extended timer_t::check_time_limit() with debug location parameters.
Work-Limit Context
cpp/src/utilities/work_limit_context.hpp, cpp/src/utilities/seed_generator.cu, cpp/src/utilities/seed_generator.cuh
Extended work-limit context with producer synchronization, scaling factors, and copy/move semantics; refactored seed generator from global counter to thread-local with epoch-based reinitialization; added record_work_sync_on_horizon scaling and producer notifications.
Branch-and-Bound Enhancements
cpp/src/branch_and_bound/branch_and_bound.cpp, cpp/src/branch_and_bound/branch_and_bound.hpp, cpp/src/branch_and_bound/deterministic_workers.hpp
Redesigned solution callback emission with emit_solution_callback(...) methods including origin/timestamp; changed set_new_solution and queue_external_solution_deterministic signatures to carry mip_solution_origin_t; introduced exploration-start synchronization via wait_for_exploration_start(); extended deterministic replay with deterministic_replay_solution_t structure; added PENDING node status handling; updated pseudo-cost tracking with iteration deltas.
Solution Publication/Injection
cpp/src/mip_heuristics/solution_callbacks.cuh, cpp/src/pdlp/cuopt_c.cpp
Introduced solution_publication_t and solution_injection_t classes to centralize callback payload building, objective comparison, and user-callback invocation; new C wrapper c_get_solution_callback_ext_t for extended callback interface; improved callback handling with scaling/presolve post-processing.
Population/Diversity Management
cpp/src/mip_heuristics/diversity/population.cu, cpp/src/mip_heuristics/diversity/population.cuh, cpp/src/mip_heuristics/diversity/diversity_manager.cu, cpp/src/mip_heuristics/diversity/diversity_manager.cuh
Changed population's add_solution(...) and related methods to accept mip_solution_origin_t; introduced drained_external_solution_t bundling solution with origin; replaced local solution_origin_t enum with centralized internals::mip_solution_origin_t; updated diversity manager to use work_limit_timer_t and pass origin through population insertion; refactored external solution handling with deterministic BB early-return logic.
Recombiners
cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh, cpp/src/mip_heuristics/diversity/recombiners/*.cuh, cpp/src/mip_heuristics/diversity/multi_armed_bandit.cuh
Updated all recombiner recombine(...) return types from std::pair<solution_t, bool> to std::tuple<solution_t, bool, double> adding work metrics; changed init_enabled_recombiners signature to accept context for determinism-aware filtering; added recombiner_name(...) for logging; introduced determinism-capable sorting in assign_same_integer_values; updated MAB reward scaling for work-based vs time-based scoring.
Feasibility Jump
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu, cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh, cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
Added deterministic work-estimation infrastructure, feature-vector generation, and per-variable frontier-work tracking; changed move scoring from float to int32_t with explicit constructors; introduced work-unit scaling via environment variable; added load-balancing code-path selection and deterministic constraint/move ordering; extended constraint scoring to accept optional left/right weights.
Local Search and Rounding
cpp/src/mip_heuristics/local_search/local_search.cu, cpp/src/mip_heuristics/local_search/local_search.cuh, cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu, cpp/src/mip_heuristics/local_search/rounding/*
Globally replaced timer_t with work_limit_timer_t across method signatures; removed callback best-objective tracking; added determinism logging for projection/rounding/LP operations; integrated work-limit context for deterministic iterations; refactored repair/rounding loops with iteration caps and deterministic candidate ordering; added run_fp(...) parameter for FP iteration count.
Presolve and Relaxation
cpp/src/mip_heuristics/presolve/bounds_presolve.cu, cpp/src/mip_heuristics/presolve/bounds_update_data.cu, cpp/src/mip_heuristics/presolve/probing_cache.cu, cpp/src/mip_heuristics/presolve/third_party_presolve.hpp, cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
Updated compute_probing_cache and compute_cache_for_var to use work_limit_timer_t; changed timer type in bound_flipping_ratio_test_t with value initialization; added work-limit clamping and iteration-limit reductions in deterministic presolve; introduced set_deterministic(...) for third-party presolver; extended relaxed_lp_settings_t with iteration/work limits and work context; added LP call-ID tagging and determinism-mode iteration estimation.
Problem and Solution Management
cpp/src/mip_heuristics/problem/problem.cu, cpp/src/mip_heuristics/problem/problem.cuh, cpp/src/mip_heuristics/solution/solution.cu, cpp/src/mip_heuristics/solution/solution.cuh
Added post_process_assignment(...) and papilo_uncrush_assignment(...) handle overrides; changed branch_and_bound_callback signature to include mip_solution_origin_t; introduced solution constructor with explicit handle override; added get_hash() method; removed lower_slack/upper_slack members; implemented deterministic variable substitution via atomic-free accumulation; filled device buffers with sentinel values on resize.
Solver Infrastructure
cpp/src/mip_heuristics/solver.cu, cpp/src/mip_heuristics/solver.cuh, cpp/src/mip_heuristics/solver_context.cuh, cpp/src/mip_heuristics/solve.cu
Refactored mip_solver_t to use termination_checker_t& instead of timer_t; introduced bb_observer_adapter_t for deterministic incumbent handling; added solution publication/injection via context members; changed determinism control flow to bitmask checks (CUOPT_DETERMINISM_BB vs CUOPT_DETERMINISM_GPU_HEURISTICS); added exploration-preemption signaling; updated B&B device setup with explicit CUDA device setting; added work-unit summary logging.
Utilities
cpp/src/utilities/cuda_helpers.cuh, cpp/src/utilities/determinism_log.hpp, cpp/src/utilities/copy_helpers.hpp, cpp/src/utilities/work_unit_scheduler.cpp, cpp/src/utilities/version_info.cpp, cpp/src/utilities/version_info.hpp, cpp/src/utilities/mip_constants.hpp
Introduced NVTX memory-initialization marking for CUDA >= 12.0.80; added determinism logging macro (no-op stub); added make_span overloads for device_scalar<T>; refactored work-unit scheduler to use set_current_work(...); added get_cpu_max_clock_mhz() function; defined base work-scale constants (BB_BASE_WORK_SCALE, GPU_HEUR_BASE_WORK_SCALE, CPUFJ_BASE_WORK_SCALE).
Pseudo-Cost Branching
cpp/src/branch_and_bound/pseudo_costs.cpp, cpp/src/branch_and_bound/pseudo_costs.hpp
Introduced generalized reliable_variable_selection_core(...) template unifying deterministic/opportunistic branching; added per-task work-limit contexts in strong-branching OpenMP region; extended pseudo_cost_snapshot_t with strong_branching_lp_iter_ tracking; updated function signatures to accept optional work_unit_context; added work aggregation and recording on horizon.
Dual Simplex and Bounds
cpp/src/dual_simplex/basis_updates.cpp, cpp/src/dual_simplex/bound_flipping_ratio_test.cpp, cpp/src/dual_simplex/bound_flipping_ratio_test.hpp, cpp/src/dual_simplex/phase2.cpp
Made work-estimate logarithm accumulation conditional to avoid log2(0/1); added explicit float casts; value-initialized work_estimate_ in bound_flipping_ratio_test_t; added null-safety check for work-unit context in feature logging.
Benchmark and Configuration
benchmarks/linear_programming/cuopt/run_mip.cpp, cpp/CMakeLists.txt, ci/compute-sanitizer-suppressions.xml
Enhanced benchmark to capture incumbent updates via extended callback, record work timestamps and solution origins, write incumbents CSV trace; adjusted determinism mode selection with explicit conditions; set additional MIP tuning fields; added compute-sanitizer suppression rules for uninitialized memory reports; added commented-out PAPI dependency detection.
Tests
cpp/tests/mip/determinism_test.cu, cpp/tests/mip/determinism_utils.cuh, cpp/tests/mip/diversity_test.cu, cpp/tests/mip/feasibility_jump_tests.cu, cpp/tests/mip/local_search_test.cu, cpp/tests/mip/mip_utils.cuh, cpp/tests/mip/multi_probe_test.cu, cpp/tests/mip/presolve_test.cu, cpp/tests/mip/unit_test.cu, cpp/tests/mip/load_balancing_test.cu, cpp/tests/mip/miplib_test.cu, cpp/tests/CMakeLists.txt, cpp/tests/mip/CMakeLists.txt
Added extensive deterministic reproducibility test infrastructure: spin_stream_raii_t for concurrent GPU activity, callback-origin tracking helpers, hash-based test assertions; new test suites for diversity/local-search/presolve/feasibility-jump determinism; updated existing tests to use termination_checker_t and seed-controlled randomness; enabled FEASIBILITY_JUMP_TEST; added CUPTI dependency detection in CMake.
Dataset Configuration
datasets/mip/download_miplib_test_dataset.sh
Added supportcase42 to instance list for test dataset downloads.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch gpudet

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Note

Due to the large number of review comments, Critical severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1487-1502: ⚠️ Potential issue | 🟠 Major

Flush CPUFJ work before using it as a hard stop.

work_units_elapsed only advances every 100 iterations, and there is no flush on the way out of cpu_solve. The new budget check can therefore overshoot by almost a full batch, and any incumbent published inside that window carries a stale work timestamp.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1487 - 1502,
The loop only updates fj_cpu.work_units_elapsed every 100 iterations so the new
budget check can overshoot; before breaking out of cpu_solve (and before any
final incumbent is published) explicitly flush the pending CPU work by calling
fj_cpu.memory_aggregator.collect(), computing biased_work = (loads+stores) *
fj_cpu.work_unit_bias / 1e10, adding it to fj_cpu.work_units_elapsed, and
notifying fj_cpu.producer_sync (if non-null); then perform the work_budget
comparison using fj_cpu.work_units_elapsed.load(...). Ensure this same flush is
performed on the early-exit path where the budget is checked so incumbents carry
an up-to-date timestamp.
cpp/src/mip_heuristics/diversity/diversity_manager.cu (1)

124-141: ⚠️ Potential issue | 🔴 Critical

The deterministic work context is never enabled unless CUOPT_CONFIG_ID is set.

The early return at Line 125 exits before Lines 140-141 set context.gpu_heur_loop.deterministic. On the common path with no env override, every termination_checker_t(context.gpu_heur_loop, …) built later falls back to non-deterministic behavior.

Minimal fix
+  context.gpu_heur_loop.deterministic =
+    (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS);
+
   const char* env_config_id_raw = std::getenv("CUOPT_CONFIG_ID");
   if (env_config_id_raw == nullptr) { return; }
@@
-  context.gpu_heur_loop.deterministic =
-    (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 124 -
141, The code currently returns early when CUOPT_CONFIG_ID is unset, so
context.gpu_heur_loop.deterministic is never initialized; move or duplicate the
deterministic assignment so it always runs regardless of the CUOPT_CONFIG_ID
path: ensure context.gpu_heur_loop.deterministic =
(context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS) is
executed before any early returns (or placed before the getenv check) so that
the termination_checker_t built later observes the correct deterministic flag;
keep the existing env parsing and bounds checks for env_config_id but do not let
them skip setting the deterministic field.
cpp/src/branch_and_bound/branch_and_bound.cpp (1)

560-594: ⚠️ Potential issue | 🟠 Major

Publish callbacks for directly accepted injected incumbents.

When check_guess(...) succeeds, this path updates upper_bound_/incumbent_ and logs via report_heuristic(), but it never calls emit_solution_callback*(). Repaired injections do publish, so feasible direct injections currently skip new_incumbent_callback and lose their origin.

💡 Suggested fix
-  if (is_feasible) { report_heuristic(obj); }
+  if (is_feasible) {
+    report_heuristic(obj, -1.0);
+    emit_solution_callback_from_crushed(crushed_solution, obj, origin, -1.0);
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 560 - 594, When a
direct injected incumbent is accepted (inside the if (is_feasible && obj <
upper_bound_) branch after incumbent_.set_incumbent_solution(...)), call the
existing publishing callbacks so the new incumbent and its origin are emitted;
specifically invoke the emit_solution_callback variants (e.g.,
emit_solution_callback(...) and the origin-aware overload such as
emit_solution_callback_with_origin(...) or new_incumbent_callback(...)) with the
updated incumbent (crushed_solution / obj) and the origin, and do this while
still holding mutex_upper_ (i.e., before mutex_upper_.unlock()) so the origin is
preserved and the direct-injection path publishes just like the repaired path.
cpp/src/mip_heuristics/local_search/local_search.cu (1)

544-548: ⚠️ Potential issue | 🟠 Major

Pass the iteration budget into this run_fp() call.

The run_fp() declaration has a fourth parameter n_fp_iterations with a default value of std::numeric_limits<i_t>::max(). Line 545 calls it with only three arguments, so the binary_only || integer_only branch silently uses the default unlimited iterations instead of the 1000000 limit defined on line 547. This inconsistency should be resolved by moving n_fp_iterations outside the conditional and passing it to both branches.

Suggested fix
+  const i_t n_fp_iterations = 1000000;
   if (binary_only || integer_only) {
-    return run_fp(solution, timer, population_ptr);
+    return run_fp(solution, timer, population_ptr, n_fp_iterations);
   } else {
-    const i_t n_fp_iterations = 1000000;
     fp.cycle_queue.reset(solution);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/local_search.cu` around lines 544 - 548,
The code currently defines n_fp_iterations only in the else branch and calls
run_fp(solution, timer, population_ptr) in the binary_only || integer_only
branch so it unintentionally uses the default unlimited iterations; move the
declaration const i_t n_fp_iterations = 1000000; to before the if statement and
pass it as the fourth argument to run_fp in both branches (i.e. call
run_fp(solution, timer, population_ptr, n_fp_iterations)), ensuring
fp.cycle_queue.reset(solution) remains where needed.
🟠 Major comments (26)
cpp/src/mip_heuristics/diversity/weights.cuh-30-33 (1)

30-33: ⚠️ Potential issue | 🟠 Major

population.weights.get_hash() call at line 1044 lacks explicit stream, creating potential race condition.

weight_t members are initialized and modified on handle_ptr->get_stream(), but get_hash() defaults to rmm::cuda_stream_default. The call at diversity_manager.cu:1044 does not pass an explicit stream, causing objective_weight.value(stream) to read from the wrong stream without synchronization. This can produce stale or uninitialized values.

Pass the solver's stream explicitly:

CUOPT_LOG_TRACE("Recombining sol %x and %x with recombiner %d, weights %x",
                a.get_hash(),
                b.get_hash(),
                recombiner,
-               population.weights.get_hash());
+               population.weights.get_hash(population.weights_stream));

Or implement the suggested fix to store and default to the construction stream.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/weights.cuh` around lines 30 - 33, The
get_hash method uses rmm::cuda_stream_default causing race with members
initialized on another stream; modify get_hash to accept a rmm::cuda_stream_view
parameter (or read a stored construction stream) and use that stream when
calling compute_hash and objective_weight.value(stream), and then update all
callers (e.g., population.weights.get_hash() usage) to pass the solver's handle
stream so compute_hash(cstr_weights, stream) ^
compute_hash(objective_weight.value(stream)) reads from the correct stream.
ci/compute-sanitizer-suppressions.xml-210-221 (1)

210-221: ⚠️ Potential issue | 🟠 Major

Avoid blanket suppression of all cuMemcpyDtoDAsync uninitialized-access reports.

This record removes a high-signal diagnostic class across the entire codebase. It contains only a single frame constraint (the API function and its module) with no additional host stack scoping, allowing it to suppress errors from any callsite.

Instead, constrain the suppression to specific known-benign call paths by adding additional <frame> elements that match your host stack:

Suggested structure
  <record>
    <kind>InitcheckApiError</kind>
    <level>Error</level>
    <what>
      <text>Host API uninitialized memory access</text>
    </what>
    <hostStack>
      <frame>
        <func>cuMemcpyDtoDAsync.*</func>
        <module>.*libcuda.so.*</module>
      </frame>
      <frame>
        <module>.*libcuopt.so.*</module>
      </frame>
      <frame>
        <func>.*known_benign_callsite.*</func>
      </frame>
    </hostStack>
  </record>

Each additional <frame> element narrows the suppression to matching host stack frames at that depth. If a stable callsite cannot be identified, prefer removing this suppression and fixing the upstream initialization.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/compute-sanitizer-suppressions.xml` around lines 210 - 221, The
suppression is too broad: the <record> for InitcheckApiError currently matches
any cuMemcpyDtoDAsync call in libcuda and will silence real bugs; narrow it by
adding additional <frame> elements under hostStack that match the full
known-benign host call path (e.g., the upstream module like libcuopt.so and the
specific callsite function name) so the suppression only applies when the host
stack matches the exact benign sequence, or if you cannot identify a stable
callsite remove the suppression entirely; update the record containing the
cuMemcpyDtoDAsync frame (the InitcheckApiError entry) to include those
additional frames or delete the record.
ci/compute-sanitizer-suppressions.xml-157-176 (1)

157-176: ⚠️ Potential issue | 🟠 Major

Narrow this transform_kernel suppression to avoid hiding unrelated regressions.

At line 159 and line 162, the rule is over-broad: no size constraint and only generic CUDA runtime frames. This will suppress any uninitialized global memory read in transform_kernel regardless of actual size, masking genuine future bugs.

Other suppressions in this file use <size> to constrain matching; this one should too. Add a specific size and project-specific frames (if available in the actual error report) to narrowly scope the suppression.

Proposed narrowing (example)
   <record>
     <kind>Initcheck</kind>
     <what>
-      <text>Uninitialized __global__ memory read</text>
+      <text>Uninitialized __global__ memory read of size 16 bytes</text>
+      <size>16</size>
     </what>
     <where>
       <func>transform_kernel</func>
     </where>
     <hostStack>
       <frame>
         <func>cuLaunchKernel_ptsz</func>
         <module>.*libcuda.so.*</module>
       </frame>
       <frame>
-        <module>.*libcudart.so.*</module>
+        <func>your_project_function</func>
       </frame>
       <frame>
         <func>cudaLaunchKernel_ptsz</func>
       </frame>
     </hostStack>
   </record>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@ci/compute-sanitizer-suppressions.xml` around lines 157 - 176, The
suppression for transform_kernel is too broad; narrow it by adding a <size>
element with the specific byte size seen in the original error and tighten the
hostStack to include project-specific frames/modules (replace the generic
libcudart/libcuda entries with the actual module and function names from the
failure report). Update the <record> for transform_kernel to include
<size>...</size> and refine <hostStack> entries using precise <func> and
<module> values so only the intended uninitialized __global__ memory read is
suppressed.
cpp/include/cuopt/linear_programming/cuopt_c.h-74-78 (1)

74-78: ⚠️ Potential issue | 🟠 Major

Publish the callback metadata contract before exposing it in the C API.

cuOptMIPSolutionCallbackInfo::origin is a raw uint32_t, but this header does not export the enum/constants that give that value meaning, and work_timestamp has no documented units or semantics. As written, C clients cannot consume the new metadata without depending on internal headers or reverse-engineering numeric values. Please add public symbolic constants/docs for origin, document work_timestamp, and spell out the callback threading expectations alongside the new setter.

As per coding guidelines, "For public header files (C++ API): Check if new public functions/classes have documentation comments (Doxygen format) - Flag API changes that may need corresponding docs/ updates - Verify parameter descriptions match actual types/behavior - Suggest documenting thread-safety, GPU requirements, and numerical behavior".

Also applies to: 722-787

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/cuopt_c.h` around lines 74 - 78, The new
C struct cuOptMIPSolutionCallbackInfo exposes an opaque uint32_t field origin
and an undocumented double work_timestamp and lacks threading/usage docs;
publish a public enum or `#define` constants for the origin values (e.g.,
CUOPT_MIP_ORIGIN_*) in this header so C clients can interpret origin, add a
Doxygen comment on cuOptMIPSolutionCallbackInfo describing each field including
precise units and semantics for work_timestamp (e.g., seconds since epoch or
seconds since solve start, monotonic vs. wall-clock), and add a Doxygen note
near the callback setter function describing callback threading expectations,
whether the callback may be invoked from GPU/host threads, required
synchronization, and any lifetime/GPU-context constraints so C consumers know
how to safely use the callback.
cpp/src/pdlp/pdlp.cu-2260-2263 (1)

2260-2263: ⚠️ Potential issue | 🟠 Major

This turns the average warm-start vector back into scaled coordinates.

At Line 2260 you're copying pdhg_solver_.get_primal_solution() after the problem has been scaled, but the warm-start path later sets no_rescale_average = true and skips the unscale_solutions(unscaled_primal_avg_solution_, ...) call. With project_initial_primal + warm start, average_termination_strategy_ can therefore evaluate a scaled primal average against the unscaled problem. Either keep this copy in unscaled space or refresh the warm-start average/accumulator consistently before the first termination check.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/pdlp/pdlp.cu` around lines 2260 - 2263, The copy into
unscaled_primal_avg_solution_ is taking pdhg_solver_.get_primal_solution() while
the solver is still in scaled coordinates, which creates a mismatch when
no_rescale_average = true (used with project_initial_primal and warm start) so
average_termination_strategy_ may evaluate a scaled average against the unscaled
problem; fix by ensuring the buffer receives an unscaled vector or by
refreshing/unscaling the warm-start average/accumulator before the first
termination check — either call unscale_solutions(...) (or the equivalent
unscale path) before the raft::copy into unscaled_primal_avg_solution_, or
update the warm-start accumulator/average to the solver’s current (scaled) state
consistently so no_rescale_average logic remains correct (refer to
unscaled_primal_avg_solution_, pdhg_solver_.get_primal_solution(),
no_rescale_average, unscale_solutions, project_initial_primal,
average_termination_strategy_ and the accumulator/average refresh).
cpp/src/mip_heuristics/presolve/bounds_presolve.cu-174-178 (1)

174-178: ⚠️ Potential issue | 🟠 Major

Keep the determinism cap local to this solve.

Lines 175-177 permanently clamp the member settings.iteration_limit. After one deterministic call, every later solve() on this same bound_presolve_t instance will stay capped at 50 iterations, even when deterministic mode is off. Use a local loop bound instead.

Suggested fix
-  // CHANGE once we have a work predictor
-  if ((context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS)) {
-    timer                    = timer_t(std::numeric_limits<f_t>::infinity());
-    settings.iteration_limit = std::min(settings.iteration_limit, 50);
-  }
+  i_t iteration_limit = settings.iteration_limit;
+  // CHANGE once we have a work predictor
+  if ((context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS)) {
+    timer = timer_t(std::numeric_limits<f_t>::infinity());
+    iteration_limit = std::min(iteration_limit, static_cast<i_t>(50));
+  }
@@
-  for (iter = 0; iter < settings.iteration_limit; ++iter) {
+  for (iter = 0; iter < iteration_limit; ++iter) {

Also applies to: 182-182

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/presolve/bounds_presolve.cu` around lines 174 - 178,
The member settings.iteration_limit is being permanently clamped when
context.settings.determinism_mode has CUOPT_DETERMINISM_GPU_HEURISTICS; instead,
don't mutate the member on bound_presolve_t — compute a local loop bound (e.g.,
local_iteration_limit = std::min(settings.iteration_limit, 50u)) and use that
local variable in the solve() loop and any loop checks (replace uses at the two
places flagged), leaving settings.iteration_limit unchanged; ensure any timer_t
handling remains the same but the determinism cap is only applied to the local
loop bound.
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuh-106-108 (1)

106-108: ⚠️ Potential issue | 🟠 Major

Don’t make the new weights optional with NaN defaults.

If a caller omits either new argument, cstr_weight becomes NaN, which immediately violates the checks on Lines 136-137. In non-assert builds that also risks poisoning base_feas / bonus_robust with NaNs. Either require both weights or preserve the old fallback lookup from fj.

Safer signature
-  f_t current_lhs,
-  f_t cstr_left_weight  = std::numeric_limits<f_t>::quiet_NaN(),
-  f_t cstr_right_weight = std::numeric_limits<f_t>::quiet_NaN())
+  f_t current_lhs,
+  f_t cstr_left_weight,
+  f_t cstr_right_weight)

Also applies to: 128-137

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuh`
around lines 106 - 108, The new optional weight params cstr_left_weight and
cstr_right_weight should not default to NaN because callers that omit them will
cause the checks around base_feas/bonus_robust to fail and may poison values;
update the signature and call sites so either (A) both weights are required
(remove the NaN defaults) or (B) preserve the original fallback behavior by
detecting omitted values and looking them up from fj (the prior
constraint-weight lookup) before performing the checks in the function (the
routine that uses current_lhs, cstr_left_weight, cstr_right_weight and computes
base_feas/bonus_robust). Ensure you reference and adjust all uses of
cstr_left_weight/cstr_right_weight in this translation unit and related callers
so no NaNs reach the validation checks.
cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh-87-89 (1)

87-89: ⚠️ Potential issue | 🟠 Major

Don't ship placeholder recombiner work accounting.

work is currently just n_different_vars, and the inline TODO suggests that is temporary. This now feeds the deterministic work-budget path, so it will misrepresent actual line-segment effort whenever search depth, repair work, or early termination differs.

Do you want me to draft a concrete work metric based on the actual search/timer consumption?

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu-1333-1341 (1)

1333-1341: ⚠️ Potential issue | 🟠 Major

Don't let the env var override an explicit 1.0 setting.

With the current ternary, a caller that explicitly sets cpufj_work_unit_scale = 1.0 still loses to CUOPT_CPUFJ_WORK_UNIT_SCALE. That makes the public setting unable to opt out of process-global state. This needs an explicit “was user-set” sentinel/optional instead of comparing against the default value.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1333 - 1341,
The current ternary lets the CUOPT_CPUFJ_WORK_UNIT_SCALE env var override an
explicit setting of context.settings.cpufj_work_unit_scale==1.0; change the
logic to distinguish "user set" vs "unset" instead of comparing to 1.0. Update
the code that computes cpu_work_unit_scale to use an explicit sentinel/optional
(e.g. context.settings.cpufj_work_unit_scale_is_set or make
cpufj_work_unit_scale an std::optional<double>) and only call
read_positive_work_unit_scale("CUOPT_CPUFJ_WORK_UNIT_SCALE") when the setting is
not user-provided; then apply the resulting cpu_work_unit_scale to
fj_cpu->work_unit_bias and keep the CUOPT_DETERMINISM_LOG unchanged when scale
!= 1.0.
cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh-91-97 (1)

91-97: ⚠️ Potential issue | 🟠 Major

Clarify and address time limit handling in deterministic LP settings.

When determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS, the LP solver's time limit is set to infinity while relying on work_limit (0.05 work units) as the timeout. This is intentional—in deterministic mode, termination_checker_t checks work units via work_context instead of wall-clock time. However, the TODO on line 93 highlights a real concern: the global elapsed time budget is not enforced at this call site. If the work-unit counting mechanism is inaccurate or the work limit is exceeded unexpectedly, the LP could run longer than intended. Either confirm that work_limit provides a sufficient safety valve and remove the TODO, or implement global elapsed time tracking to ensure the call respects the solver's overall time budget.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines
91 - 97, When determinism is enabled (this->context.settings.determinism_mode &
CUOPT_DETERMINISM_GPU_HEURISTICS) don't leave lp_settings.time_limit as infinite
without enforcing the solver's overall time budget: either remove the TODO after
confirming that lp_settings.work_limit
(fp_recombiner_config_t::infeasibility_detection_time_limit) plus
termination_checker_t via lp_settings.work_context
(&this->context.gpu_heur_loop) is a sufficient safety valve, or implement global
elapsed-time enforcement by computing the remaining global budget from the
solver context and setting lp_settings.time_limit to that remaining time
(clamped to a small positive epsilon) before launching the LP so the call
respects the overall time budget while keeping work_limit/work_context intact.
cpp/src/utilities/seed_generator.cuh-22-35 (1)

22-35: ⚠️ Potential issue | 🟠 Major

This resets every worker to the same seed stream.

After set_seed(42), thread A and thread B both reinitialize counter from base_seed_ and both return 42 on their first get_seed(). That removes cross-thread seed uniqueness, so parallel heuristics can end up exploring identical pseudo-random sequences.

Also applies to: 64-67

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/seed_generator.cuh` around lines 22 - 35, local_state()
reinitializes every thread's thread_state_t.counter to the same base_seed_ when
epoch_ changes, causing all threads to produce identical seed streams; fix by
assigning each thread a unique starting seed when it first initializes or when
epoch_ advances—e.g., use an atomic allocation of per-thread offsets (via an
atomic counter or next_seed_) or derive a thread-unique offset from
std::this_thread::get_id() hashed, and set state.counter = base_seed_ +
unique_offset (update state.last_epoch accordingly). Update local_state() and
any related reinit logic so thread_state_t.counter is initialized with
base_seed_ plus the per-thread unique value instead of base_seed_ alone
(referencing thread_state_t, local_state(), epoch_, base_seed_, and counter).
cpp/src/mip_heuristics/solve.cu-273-276 (1)

273-276: ⚠️ Potential issue | 🟠 Major

Deterministic presolve currently bypasses the user time limit.

Setting presolve_time_limit to infinity means a finite settings.time_limit can be fully consumed inside Papilo before timer.check_time_limit() runs again. That changes the solver contract from “respect total solve time” to “respect it only after presolve”.

Suggested change
       double presolve_time_limit = std::min(0.1 * time_limit, 60.0);
-      if (deterministic_run) { presolve_time_limit = std::numeric_limits<double>::infinity(); }
+      if (deterministic_run) { presolve_time_limit = timer.remaining_time(); }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solve.cu` around lines 273 - 276, The deterministic
branch currently sets presolve_time_limit to infinity which lets Papilo consume
the entire user time_limit; instead keep presolve_time_limit finite by capping
it to the user time limit (and/or the existing 60s cap). Change the
deterministic_run branch so presolve_time_limit is set to a finite value (e.g.,
std::min(time_limit, 60.0) or reuse std::min(0.1 * time_limit, 60.0) but not
infinity) before creating/configuring presolver and calling
presolver->set_deterministic(deterministic_run).
cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu-517-520 (1)

517-520: ⚠️ Potential issue | 🟠 Major

Avoid hard-coding iter_limit = 20 in deterministic mode.

Line 519 ignores the actual remaining work budget and can force repair_problem(...) to exit with violations still present on larger models. Deterministic mode should stop because the timer says so, not because it hit a magic constant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu` around lines
517 - 520, Replace the hard-coded magic constant (iter_limit = 20) used when
timer.deterministic is true with a check that consults the timer's remaining
budget instead: stop the while loop based on the timer (e.g., a
timer.expired()/is_timeout()/remaining_time() check or by deriving iter_limit
from timer.remaining_budget()) rather than a fixed number so repair_problem(...)
can continue until the timer indicates it must stop; update the loop that uses
iter_limit and the timer.deterministic condition accordingly.
cpp/include/cuopt/linear_programming/constants.h-80-106 (1)

80-106: 🛠️ Refactor suggestion | 🟠 Major

Document these new public determinism/origin macros.

This header changes determinism from a single mode value to bitmask flags and adds public solution-origin constants, but the API here does not explain the valid combinations or migration path for callers that compare raw values. Please add Doxygen/migration notes before release.

As per coding guidelines, "For public header files (C++ API): Flag API changes that may need corresponding docs/ updates" and "For breaking changes, recommend updating docs and migration guides."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/cuopt/linear_programming/constants.h` around lines 80 - 106, Add
a Doxygen comment block above the new determinism and solution-origin macros
explaining the API change from a single-mode value to bitmask flags
(CUOPT_DETERMINISM_NONE, CUOPT_DETERMINISM_BB, CUOPT_DETERMINISM_GPU_HEURISTICS,
CUOPT_DETERMINISM_FULL and the CUOPT_MODE_* aliases) and documenting valid bit
combinations, recommended runtime checks (test bits rather than compare raw
values), and the migration path from the old single-mode value to the new flags
(e.g., how legacy values map to CUOPT_MODE_OPPORTUNISTIC /
CUOPT_MODE_DETERMINISTIC / CUOPT_MODE_DETERMINISTIC_BB), plus a short note for
callers about the new public solution-origin constants
(CUOPT_MIP_SOLUTION_ORIGIN_*) and whether those values are stable/extendable;
ensure the comment is in the public header near these symbols so consumers and
docs generation pick it up.
cpp/src/utilities/work_limit_context.hpp-56-62 (1)

56-62: ⚠️ Potential issue | 🟠 Major

Copying this context breaks registered producer progress pointers.

The copy/move operations allocate a new producer_work_units_elapsed atomic but keep the same producer_sync. Any producer holding the original producer_progress_ptr() will stop seeing progress after the copy, even though the copied context still emits notifications to the same sync object. That can stall deterministic horizon coordination. Share the atomic across copies, or make copies detach and require an explicit reattach.

Also applies to: 66-123, 130-133

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/work_limit_context.hpp` around lines 56 - 62, The copy/move
semantics currently create a new producer_work_units_elapsed (unique_ptr) while
retaining the same producer_sync_t*, breaking any external pointers returned by
producer_progress_ptr(); change producer_work_units_elapsed from
std::unique_ptr<std::atomic<double>> to std::shared_ptr<std::atomic<double>> so
copies share the same atomic, update the constructors, copy/move constructors
and assignment operators to copy the shared_ptr, and ensure
producer_progress_ptr() still returns a stable pointer/reference into that
shared atomic; alternatively, if you prefer detach semantics, make copies reset
producer_sync (or the atomic) and require explicit reattach, but the simpler fix
is to convert producer_work_units_elapsed to a shared_ptr and propagate that
through the class methods.
cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu-290-296 (1)

290-296: ⚠️ Potential issue | 🟠 Major

Sort on (variable_index, bound_shift) here.

Line 293 only sorts by variable_index. If curr_cstr contains the same variable more than once, the secondary bound_shift order is still unconstrained, so the new deterministic ordering is incomplete and later tie-breaking can still pick a different move.

Suggested fix
-  thrust::sort_by_key(handle_ptr->get_thrust_policy(),
-                      candidates.variable_index.begin(),
-                      candidates.variable_index.begin() + n_candidates,
-                      candidates.bound_shift.begin());
+  auto begin = thrust::make_zip_iterator(
+    thrust::make_tuple(candidates.variable_index.begin(), candidates.bound_shift.begin()));
+  auto end = thrust::make_zip_iterator(
+    thrust::make_tuple(candidates.variable_index.begin() + n_candidates,
+                       candidates.bound_shift.begin() + n_candidates));
+  thrust::sort(handle_ptr->get_thrust_policy(), begin, end, [] __device__(auto lhs, auto rhs) {
+    if (thrust::get<0>(lhs) != thrust::get<0>(rhs)) {
+      return thrust::get<0>(lhs) < thrust::get<0>(rhs);
+    }
+    return thrust::get<1>(lhs) < thrust::get<1>(rhs);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu` around lines
290 - 296, The current thrust::sort_by_key call only sorts by
candidates.variable_index, leaving ties in variable index unordered; change the
sort so the key is the pair (variable_index, bound_shift) to enforce
deterministic secondary ordering: use a zip/tie of candidates.variable_index and
candidates.bound_shift as the key (e.g. via thrust::make_zip_iterator /
thrust::make_tuple) with the same range length n_candidates and keep the values
range (currently candidates.bound_shift.begin()) as the associated values;
update the thrust::sort_by_key invocation (and include
handle_ptr->get_thrust_policy()) to sort by the composite key instead of just
variable_index.
cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh-229-273 (1)

229-273: ⚠️ Potential issue | 🟠 Major

Preserve declaration order when building enabled_recombiners.

Converting from std::unordered_set to std::vector at lines 252–253 introduces non-determinism in the deterministic path. The vector is directly iterated at line 814 of diversity_manager.cu, and direct indexing at line 1014 (enabled_recombiners[selected_index]) means the MAB arm selection maps to different actual recombiners depending on hash table iteration order. Additionally, lines 1017–1021 use std::find and std::distance on this vector to resolve indices—all of which are order-dependent.

Instead, preserve the order from recombiner_types by filtering into a vector directly, rather than using an unordered_set as an intermediate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh` around lines 229
- 273, The current code builds enabled_recombiners via an std::unordered_set
which loses the original order from recombiner_types and makes downstream
indexing non-deterministic; instead iterate recombiner_types in order and build
a std::vector<recombiner_enum_t> directly (e.g., start with an empty local
vector, for each recombiner in recombiner_types push_back only if not disabled
by disable_fp_and_submip_for_expensive_fix, disable_submip_for_continuous_limit,
or disable_submip_for_determinism), then assign that vector to
recombiner_t::enabled_recombiners and keep the rest of the logging/usage
(recombiner_name, CUOPT_DETERMINISM_LOG) unchanged so the index->recombiner
mapping remains deterministic.
cpp/src/utilities/termination_checker.hpp-153-164 (1)

153-164: ⚠️ Potential issue | 🟠 Major

remaining_time() ignores the parent checker.

check() honors parent_, but remaining_units()/remaining_time() do not. Several new call sites size nested LP/FJ/repair budgets from remaining_time(), so a child under a nearly expired parent can still hand out a much larger limit to callees. Please make the remaining-budget query parent-aware, or split wall-time vs work-budget queries so callers can't size sub-budgets from a stale local limit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/termination_checker.hpp` around lines 153 - 164,
remaining_time()/remaining_units() ignore parent_ and can return a larger budget
than parent allows; update remaining_units() (and remaining_time()) to be
parent-aware by computing the local remaining (using work_context, work_limit,
work_units_at_start, and timer as now) and then, if parent_ exists, take the
minimum of the local remaining and parent_->remaining_units() (or
parent_->remaining_time() for wall-time queries); ensure remaining_time() still
forwards to remaining_units() or split into explicit wall-time vs work-budget
accessors if callers need different semantics, and keep check() behavior
unchanged.
cpp/src/mip_heuristics/solver.cu-83-88 (1)

83-88: ⚠️ Potential issue | 🟠 Major

Don't drop incumbents before the diversity manager is published.

new_incumbent_callback() gates the population/RINS path on context->diversity_manager_ptr, but the async B&B thread starts at Line 347 and that pointer is only assigned at Line 355. An incumbent found in that window is silently skipped here and never reaches the heuristic side.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver.cu` around lines 83 - 88, An incumbent can be
dropped because the code only calls population.add_external_solution and
rins.new_best_incumbent_callback when context->diversity_manager_ptr is
non-null, but that pointer is published after the async B&B thread starts; fix
by making incumbent delivery tolerant to a null diversity manager: when
context->diversity_manager_ptr is null, push the (solution, objective,
info.origin) into a small thread-safe pending buffer (e.g.,
context->pending_incumbents) or route through a new_incumbent_callback() that
enqueues incumbents; then, when diversity_manager_ptr is assigned, flush
pending_incumbents into
diversity_manager_ptr->population.add_external_solution(...) and
diversity_manager_ptr->rins.new_best_incumbent_callback(...). Ensure the buffer
is thread-safe and cleared after flush so no incumbents are lost or
double-processed.
cpp/src/mip_heuristics/solution_callbacks.cuh-172-197 (1)

172-197: ⚠️ Potential issue | 🟠 Major

Reject all non-finite injected objectives, not just +inf.

Only +inf is treated as “no injected solution” here. NaN and -inf still flow into preprocessing and then trip the objective assertion below.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 172 - 197, The
code currently only skips injected objectives equal to +inf; change the check on
f_t outside_sol_objective (read from h_outside_sol_objective) to skip any
non-finite value (NaN, +inf, -inf) by using std::isfinite(outside_sol_objective)
and continue when it returns false (and add <cmath> if needed); this prevents
invalid values reaching pre_process_assignment, solution_t<i_t,f_t> outside_sol,
and the cuopt_assert comparing objectives.
cpp/src/mip_heuristics/diversity/diversity_manager.cu-1028-1036 (1)

1028-1036: ⚠️ Potential issue | 🟠 Major

The selection log advances the RNG.

This calls seed_generator::get_seed() only to print it. Because get_seed() is the consuming API, the log itself changes the next random draw and defeats the determinism you're trying to observe here. Use peek_seed() or cache the seed before logging.

Minimal fix
-    (unsigned int)cuopt::seed_generator::get_seed());
+    (unsigned int)cuopt::seed_generator::peek_seed());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 1028 -
1036, The logging call in CUOPT_DETERMINISM_LOG currently calls
cuopt::seed_generator::get_seed(), which consumes the RNG and breaks
determinism; change the logging to use the non-consuming seed access
(cuopt::seed_generator::peek_seed()) or read and cache the seed once before the
log and use that cached value. Locate the CUOPT_DETERMINISM_LOG invocation
around recombiner_t<i_t,f_t>::recombiner_name(...) and
mab_recombiner.last_chosen_option and replace the get_seed() call with
peek_seed() or a previously stored seed variable so the log does not advance the
RNG.
cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu-8-13 (1)

8-13: ⚠️ Potential issue | 🟠 Major

This turns determinism logging into a hot-path INFO log.

The macro is enabled unconditionally here, and the new FP call sites below compute hashes / sync streams inside CUOPT_DETERMINISM_LOG. That means every solve pays the logging cost, not just opt-in determinism debugging.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu`
around lines 8 - 13, The CUOPT_DETERMINISM_LOG macro is being enabled
unconditionally which forces hash computations and stream syncs at every call
site; change the macro to be enabled only when a dedicated compile-time flag is
set (e.g. CUOPT_DETERMINISM_LOG_ENABLED): wrap the verbose definition that calls
CUOPT_LOG_INFO in `#ifdef` CUOPT_DETERMINISM_LOG_ENABLED / `#endif` and provide an
empty variadic no-op fallback macro (e.g. `#define` CUOPT_DETERMINISM_LOG(...) do
{ } while (0)) when the flag is not set so arguments at call sites (hash/sync
computations) are completely elided; keep the macro name CUOPT_DETERMINISM_LOG
and the CUOPT_LOG_INFO call intact for the enabled branch.
cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu-1090-1093 (1)

1090-1093: ⚠️ Potential issue | 🟠 Major

apply_round() ignores the caller's sub-budget.

This rebuilds max_timer from *context.termination instead of the timer argument. In the FP path, that lets bounds propagation spend up to max_time_for_bounds_prop even when the caller intentionally passed a smaller bounds_prop_time_limit.

Minimal fix
-  max_timer =
-    work_limit_timer_t{context.gpu_heur_loop, max_time_for_bounds_prop, *context.termination};
+  max_timer = work_limit_timer_t{
+    context.gpu_heur_loop,
+    std::min<f_t>(max_time_for_bounds_prop, timer.remaining_time()),
+    timer};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu` around lines
1090 - 1093, apply_round() currently rebuilds max_timer from
context.gpu_heur_loop and max_time_for_bounds_prop (using *context.termination)
which ignores the caller's sub-budget passed as the timer argument and allows
bounds propagation to overrun bounds_prop_time_limit; change the code so
max_timer is derived from the provided timer argument (not *context.termination)
when setting up work_limit_timer_t for bounds propagation in apply_round(),
preserving the existing use of context.gpu_heur_loop and
max_time_for_bounds_prop but feeding timer (the caller's sub-budget) into
work_limit_timer_t so check_brute_force_rounding/FP path respects the caller
limit after sol.compute_feasibility().
cpp/src/mip_heuristics/diversity/diversity_manager.cu-359-362 (1)

359-362: ⚠️ Potential issue | 🟠 Major

Don't move from the object that fj_only_run returns.

run_solver() still returns the same sol after calling run_fj_alone(sol). When this branch finds a feasible incumbent, moving solution into the population leaves the returned object in a moved-from state.

Minimal fix
-    population.add_solution(std::move(solution),
-                            internals::mip_solution_origin_t::FEASIBILITY_JUMP);
+    population.add_solution(solution_t<i_t, f_t>(solution),
+                            internals::mip_solution_origin_t::FEASIBILITY_JUMP);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 359 -
362, The code moves `solution` into `population` but `run_solver()` later
returns the same `solution` after calling `run_fj_alone(sol)`, leaving the
return value in a moved-from state; instead avoid moving the returned object:
change the `population.add_solution(std::move(solution),
internals::mip_solution_origin_t::FEASIBILITY_JUMP)` call to use a non-moving
overload or a copy (e.g., `population.add_solution(solution,
internals::mip_solution_origin_t::FEASIBILITY_JUMP)`), or create a local copy
`auto s = solution;` and move that into `add_solution`; adjust
`population.add_solution` signature if needed so the original `solution` remains
valid for `run_solver()` to return.
cpp/src/branch_and_bound/branch_and_bound.cpp-926-949 (1)

926-949: ⚠️ Potential issue | 🟠 Major

Replay late deterministic heuristic solutions in deterministic order.

pending is iterated in raw queue order. That order depends on producer arrival, so if multiple pending solutions are at or before current_work, the incumbent/callback sequence can vary run-to-run even with identical timestamps.

💡 Suggested fix
   if (settings_.deterministic) {
     const double current_work = work_unit_context_.current_work();
     mutex_heuristic_queue_.lock();
     std::vector<queued_external_solution_t> pending;
     pending.swap(heuristic_solution_queue_);
     mutex_heuristic_queue_.unlock();
+
+    std::sort(pending.begin(), pending.end(), [](const auto& a, const auto& b) {
+      if (a.work_timestamp != b.work_timestamp) return a.work_timestamp < b.work_timestamp;
+      if (a.user_objective != b.user_objective) return a.user_objective < b.user_objective;
+      if (a.origin != b.origin) return a.origin < b.origin;
+      return a.solution < b.solution;
+    });
 
     for (const auto& queued_solution : pending) {
       if (queued_solution.work_timestamp > current_work) { continue; }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 926 - 949, The
pending vector is processed in producer arrival order causing nondeterminism;
before the for-loop over pending, sort pending deterministically (e.g., stable
sort by queued_external_solution_t.work_timestamp ascending and then a stable
tie-breaker such as queued_solution.origin or another immutable field) so
retire_queued_solution, incumbent_.set_incumbent_solution and
emit_solution_callback_from_crushed are invoked in a reproducible order when
work_timestamp <= current_work.
cpp/src/mip_heuristics/diversity/population.cu-216-222 (1)

216-222: ⚠️ Potential issue | 🟠 Major

Tighten this threshold instead of widening it.

new_best_feasible_objective is only updated when a non-CPUFJ entry is worse (>), so later CPUFJ solutions can survive even when they are not better than the current best feasible.

💡 Suggested fix
-      } else if (h_entry.origin != internals::mip_solution_origin_t::CPU_FEASIBILITY_JUMP &&
-                 h_entry.objective > new_best_feasible_objective) {
+      } else if (h_entry.origin != internals::mip_solution_origin_t::CPU_FEASIBILITY_JUMP &&
+                 h_entry.objective < new_best_feasible_objective) {
         new_best_feasible_objective = h_entry.objective;
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/diversity/population.cu` around lines 216 - 222, The
comparisons allow CPU_FEASIBILITY_JUMP entries with equal objective to slip
through; tighten the threshold by changing the strict '>' checks to '>=' so
equal objectives are treated as not better. Update the conditional using
h_entry.origin, h_entry.objective and new_best_feasible_objective (and the enum
internals::mip_solution_origin_t::CPU_FEASIBILITY_JUMP) so that
CPU_FEASIBILITY_JUMP entries are skipped when h_entry.objective >=
new_best_feasible_objective and non-CPU_FEASIBILITY_JUMP entries update
new_best_feasible_objective when h_entry.objective >=
new_best_feasible_objective.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ff4c389a-f194-4d50-99e4-33c63a2b9f85

📥 Commits

Reviewing files that changed from the base of the PR and between 4d5f5e5 and 1cad039.

📒 Files selected for processing (107)
  • benchmarks/linear_programming/cuopt/run_mip.cpp
  • ci/compute-sanitizer-suppressions.xml
  • cpp/CMakeLists.txt
  • cpp/include/cuopt/linear_programming/constants.h
  • cpp/include/cuopt/linear_programming/cuopt_c.h
  • cpp/include/cuopt/linear_programming/mip/solver_settings.hpp
  • cpp/include/cuopt/linear_programming/utilities/internals.hpp
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/branch_and_bound/deterministic_workers.hpp
  • cpp/src/branch_and_bound/pseudo_costs.cpp
  • cpp/src/branch_and_bound/pseudo_costs.hpp
  • cpp/src/dual_simplex/basis_updates.cpp
  • cpp/src/dual_simplex/bound_flipping_ratio_test.cpp
  • cpp/src/dual_simplex/bound_flipping_ratio_test.hpp
  • cpp/src/dual_simplex/phase2.cpp
  • cpp/src/dual_simplex/simplex_solver_settings.hpp
  • cpp/src/math_optimization/solver_settings.cu
  • cpp/src/mip_heuristics/diversity/diversity_config.hpp
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/diversity/diversity_manager.cuh
  • cpp/src/mip_heuristics/diversity/lns/rins.cu
  • cpp/src/mip_heuristics/diversity/multi_armed_bandit.cuh
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/diversity/population.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh
  • cpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hpp
  • cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh
  • cpp/src/mip_heuristics/diversity/weights.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh
  • cpp/src/mip_heuristics/feasibility_jump/utils.cuh
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cu
  • cpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuh
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu
  • cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cuh
  • cpp/src/mip_heuristics/local_search/local_search.cu
  • cpp/src/mip_heuristics/local_search/local_search.cuh
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
  • cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh
  • cpp/src/mip_heuristics/local_search/rounding/simple_rounding.cu
  • cpp/src/mip_heuristics/local_search/rounding/simple_rounding_kernels.cuh
  • cpp/src/mip_heuristics/mip_constants.hpp
  • cpp/src/mip_heuristics/presolve/bounds_presolve.cu
  • cpp/src/mip_heuristics/presolve/bounds_update_data.cu
  • cpp/src/mip_heuristics/presolve/lb_probing_cache.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cu
  • cpp/src/mip_heuristics/presolve/probing_cache.cuh
  • cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
  • cpp/src/mip_heuristics/presolve/third_party_presolve.hpp
  • cpp/src/mip_heuristics/presolve/utils.cuh
  • cpp/src/mip_heuristics/problem/presolve_data.cu
  • cpp/src/mip_heuristics/problem/presolve_data.cuh
  • cpp/src/mip_heuristics/problem/problem.cu
  • cpp/src/mip_heuristics/problem/problem.cuh
  • cpp/src/mip_heuristics/problem/problem_fixing.cuh
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu
  • cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cuh
  • cpp/src/mip_heuristics/solution/solution.cu
  • cpp/src/mip_heuristics/solution/solution.cuh
  • cpp/src/mip_heuristics/solution_callbacks.cuh
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver.cuh
  • cpp/src/mip_heuristics/solver_context.cuh
  • cpp/src/pdlp/cuopt_c.cpp
  • cpp/src/pdlp/pdlp.cu
  • cpp/src/utilities/copy_helpers.hpp
  • cpp/src/utilities/cuda_helpers.cuh
  • cpp/src/utilities/determinism_log.hpp
  • cpp/src/utilities/seed_generator.cu
  • cpp/src/utilities/seed_generator.cuh
  • cpp/src/utilities/termination_checker.hpp
  • cpp/src/utilities/timer.hpp
  • cpp/src/utilities/version_info.cpp
  • cpp/src/utilities/version_info.hpp
  • cpp/src/utilities/work_limit_context.hpp
  • cpp/src/utilities/work_limit_timer.hpp
  • cpp/src/utilities/work_unit_scheduler.cpp
  • cpp/tests/CMakeLists.txt
  • cpp/tests/mip/CMakeLists.txt
  • cpp/tests/mip/determinism_test.cu
  • cpp/tests/mip/determinism_utils.cuh
  • cpp/tests/mip/diversity_test.cu
  • cpp/tests/mip/feasibility_jump_tests.cu
  • cpp/tests/mip/load_balancing_test.cu
  • cpp/tests/mip/local_search_test.cu
  • cpp/tests/mip/mip_utils.cuh
  • cpp/tests/mip/miplib_test.cu
  • cpp/tests/mip/multi_probe_test.cu
  • cpp/tests/mip/presolve_test.cu
  • cpp/tests/mip/unit_test.cu
  • datasets/mip/download_miplib_test_dataset.sh

*/
/* clang-format on */

#include <utilities/determinism_log.hpp>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix the missing determinism-log header include.

Clang is already reporting utilities/determinism_log.hpp as missing here, so this translation unit will not build until the header is added to the include path or the include target is corrected.

🧰 Tools
🪛 Clang (14.0.6)

[error] 8-8: 'utilities/determinism_log.hpp' file not found

(clang-diagnostic-error)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` at line 8, The translation
unit is failing because the header utilities/determinism_log.hpp is missing from
the include path; update the build configuration or the include directive so the
header is resolvable: either add the directory containing
utilities/determinism_log.hpp to the compiler include paths (e.g.,
target_include_directories for the target that builds branch_and_bound.cpp) or
correct the include target/name so that utilities/determinism_log.hpp can be
found and included by branch_and_bound.cpp.

Comment on lines 58 to 62
bool repair_problem(load_balanced_problem_t<i_t, f_t>* problem,
load_balanced_bounds_presolve_t<i_t, f_t>& lb_bound_presolve,
problem_t<i_t, f_t>& original_problem,
timer_t timer_,
work_limit_timer_t timer_,
const raft::handle_t* handle_ptr_);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n -C2 'repair_problem\(' \
  cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh \
  cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 1271


🏁 Script executed:

sed -n '58,62p' cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh

Repository: aliceb-nv/cuopt-fork

Length of output: 381


🏁 Script executed:

sed -n '406,411p' cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 320


Update the .cu definition to work_limit_timer_t too.

The declaration uses work_limit_timer_t timer_ but cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu defines lb_bounds_repair_t::repair_problem(..., timer_t timer_, ...). This template definition/declaration mismatch will cause a compilation failure.

Minimal fix
// cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu
-  timer_t timer_,
+  work_limit_timer_t timer_,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuh` around
lines 58 - 62, The function declaration for repair_problem expects a parameter
of type work_limit_timer_t named timer_, but the template definition in
lb_bounds_repair_t::repair_problem currently uses timer_t; update the definition
signature in cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu to
use work_limit_timer_t timer_ (and match the template parameters i_t, f_t, and
other parameters: load_balanced_problem_t, load_balanced_bounds_presolve_t,
problem_t, const raft::handle_t* handle_ptr_) so the declaration and definition
types match and compile.

Comment on lines 701 to 705
bool lb_constraint_prop_t<i_t, f_t>::apply_round(
solution_t<i_t, f_t>& sol,
f_t lp_run_time_after_feasible,
timer_t& timer,
work_limit_timer_t& timer,
std::optional<std::vector<thrust::pair<f_t, f_t>>> probing_candidates)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Timer type definitions:"
rg -n -C2 "using\s+work_limit_timer_t|class\s+work_limit_timer_t|struct\s+work_limit_timer_t|using\s+timer_t|class\s+timer_t|struct\s+timer_t" cpp/src cpp/include

echo
echo "lb_constraint_prop timer plumbing:"
rg -n -C2 "apply_round\(|find_integer\(|run_repair_procedure\(" cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.*

Repository: aliceb-nv/cuopt-fork

Length of output: 5559


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Full signatures in header file ==="
sed -n '21,90p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuh

echo
echo "=== apply_round implementation (first 50 lines) ==="
sed -n '701,750p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu

echo
echo "=== find_integer signature area ==="
sed -n '753,800p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu

echo
echo "=== run_repair_procedure signature area ==="
sed -n '453,500p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 10572


Fix timer type mismatch between declarations and implementations.

The header file declares find_integer() and run_repair_procedure() with work_limit_timer_t& timer, but their implementations in the .cu file use timer_t& timer instead. These are incompatible types (work_limit_timer_t = termination_checker_t; timer_t is a separate class). Update the implementations to match the header declarations.

Affected function signatures
  • Line 753: find_integer() implementation uses timer_t& timer
  • Line 453: run_repair_procedure() implementation uses timer_t& timer

Both should be work_limit_timer_t& timer to match the header declarations and the caller at line 724-726.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu` around
lines 701 - 705, The implementations of lb_constraint_prop_t's helper methods
use the wrong timer type; change the function signatures for find_integer(...)
and run_repair_procedure(...) in the .cu file from timer_t& timer to
work_limit_timer_t& timer so they match the header declarations and the caller
in apply_round; update any local references/parameter names accordingly to the
work_limit_timer_t type to ensure compilation and correct linkage for
lb_constraint_prop_t<i_t, f_t>::find_integer and lb_constraint_prop_t<i_t,
f_t>::run_repair_procedure.

@aliceb-nv aliceb-nv closed this Mar 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant