Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#7
Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#7aliceb-nv wants to merge 292 commits into
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (31)
📝 WalkthroughWalkthroughExtended MIP solver with deterministic execution support: added determinism bitflags and solution-origin metadata, unified termination/work-limit infrastructure, rewired callback/publication/injection paths, propagated work-aware timers across heuristics and presolve, and introduced deterministic logging, hashing, and extensive test coverage. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
cpp/tests/mip/multi_probe_test.cu (1)
152-165:⚠️ Potential issue | 🟠 MajorThis regression helper never enables the deterministic code path.
test_multi_probe()still buildssolverfrom default settings, somulti_probe_t::solve()is not running under the determinism mode this PR is adding. That makesmulti_probe_deterministica regression test for the old path, not the new one. Pass a determinism flag into this helper or construct the solver with deterministic settings in the deterministic test case.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/multi_probe_test.cu` around lines 152 - 165, The regression helper constructs detail::mip_solver_t solver using default mip_solver_settings_t (so multi_probe_t::solve() runs non-deterministic code); update the helper so deterministic tests actually enable determinism—either add a parameter to test_multi_probe() to pass a determinism flag and use it when creating mip_solver_settings_t default_settings (set the deterministic mode) or, in the multi_probe_deterministic test case, explicitly modify default_settings before constructing detail::mip_solver_t solver so the solver is created with deterministic settings and the deterministic code path is exercised.cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh (1)
541-567:⚠️ Potential issue | 🔴 CriticalType change from
f_ttoint32_tfor score accumulators is a critical bug.The accumulators
base_feasandbonus_robustchanged fromf_ttoint32_t(lines 541-542), butfeas_score_constraintreturnsstd::pair<f_t, f_t>. This causes implicit truncation of floating-point scores to integers at lines 562-563 before they are reduced and passed towarp_update_move_score, which expectsf_tparameters (lines 271-272). The binary variable kernel (lines 328-359) correctly usesf_taccumulators throughout, making this inconsistency a regression that corrupts score calculations in the MTM kernel.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh` around lines 541 - 567, The accumulators base_feas and bonus_robust in the MTM kernel must be f_t, not int32_t, because feas_score_constraint<i_t,f_t> returns std::pair<f_t,f_t> and warp_update_move_score expects f_t; change the declarations of base_feas and bonus_robust back to f_t, ensure the values added from feas_score_constraint are stored as f_t, and keep the raft::warpReduce calls operating on f_t so the reduced values passed to warp_update_move_score retain their floating precision.cpp/src/mip_heuristics/local_search/local_search.cu (1)
209-245:⚠️ Potential issue | 🟠 MajorDeterministic
do_fj_solve()overcommits the local work budget.The GPU climb gets the full
time_limitthroughtimer, and everyls_cpu_fjclimber also getswork_budget = time_limit. Unlikestart_cpufj_deterministic()above, these workers are never registered withproducer_sync, so this helper can spend more than one budgeted unit of work per local-search call. Either split the budget across workers or keep CPUFJ disabled in deterministic mode here until it participates in the same producer/work accounting.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/local_search.cu` around lines 209 - 245, The deterministic path in do_fj_solve assigns each cpu_fj worker the full time_limit (cpu_fj.fj_cpu->work_budget = time_limit) without registering them with producer_sync, allowing total CPU work to exceed the intended budget; change this by disabling per-worker CPUFJ in deterministic mode here (skip creating/assigning work_budget and work_units_elapsed for ls_cpu_fj) or instead divide the time_limit across ls_cpu_fj (e.g., time_limit / ls_cpu_fj.size()) and ensure any created cpu_fj workers are registered with the same producer/work accounting used by start_cpufj_deterministic so their budgets are enforced consistently.cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu (1)
1166-1180:⚠️ Potential issue | 🟠 MajorClamp the nested LP to the parent timer.
In deterministic mode this LP gets its own
work_limit, but it is taken fromlp_run_time_after_feasibleinstead oftimer.remaining_time(). Near exhaustion, this path can spend more work than the caller budgeted.Suggested fix
relaxed_lp_settings_t lp_settings; lp_settings.time_limit = lp_run_time_after_feasible; if (timer.deterministic) { - lp_settings.work_limit = lp_settings.time_limit; + lp_settings.work_limit = + std::min(lp_settings.time_limit, std::max((f_t)0.0, timer.remaining_time())); + lp_settings.time_limit = lp_settings.work_limit; lp_settings.work_context = timer.work_context; cuopt_assert(lp_settings.work_context != nullptr, "Missing deterministic work context"); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu` around lines 1166 - 1180, The deterministic LP uses lp_settings.work_limit derived from lp_run_time_after_feasible which can exceed the caller's remaining budget; update the deterministic branch in constraint_prop.cu to clamp lp_settings.work_limit to the parent timer by setting lp_settings.work_limit = std::min(lp_run_time_after_feasible, timer.remaining_time()) (and keep lp_settings.work_context = timer.work_context as before) before calling run_lp_with_vars_fixed; ensure you reference lp_settings, timer, lp_run_time_after_feasible and the run_lp_with_vars_fixed call when making the change.cpp/src/mip_heuristics/diversity/diversity_manager.cu (1)
126-144:⚠️ Potential issue | 🔴 CriticalMove the deterministic-flag initialization above the env-config early returns.
Line 127 returns in the common
CUOPT_CONFIG_ID == nullptrpath, so Lines 142-143 never run. That leavescontext.gpu_heur_loop.deterministicfalse and turns laterwork_limit_timer_t(context.gpu_heur_loop, ...)instances back into wall-clock timers even when GPU determinism was requested.🛠️ Proposed fix
const char* env_config_id_raw = std::getenv("CUOPT_CONFIG_ID"); + context.gpu_heur_loop.deterministic = + (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS); if (env_config_id_raw == nullptr) { return; } @@ - context.gpu_heur_loop.deterministic = - (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 126 - 144, Initialize context.gpu_heur_loop.deterministic before any early returns so GPU determinism is always set regardless of CUOPT_CONFIG_ID parsing/validation; specifically, move the assignment context.gpu_heur_loop.deterministic = (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS) to occur immediately after reading settings (and before checking env_config_id_raw, std::stoi, or the max_config check), leaving the existing CUOPT_CONFIG_ID parsing, env_config_id, and max_config validation logic intact so work_limit_timer_t and other consumers see the correct deterministic flag even when CUOPT_CONFIG_ID is missing or invalid.
🟠 Major comments (20)
cpp/src/dual_simplex/basis_updates.cpp-2205-2205 (1)
2205-2205:⚠️ Potential issue | 🟠 MajorSort work estimate is computed with the wrong size term.
std::sorthere operates onnzelements, but the estimate usesm + nz. This overcounts work (including whennz == 0) and can skew work-based scheduling/determinism decisions.Suggested fix
- if ((m + nz) > 1) { work_estimate_ += (m + nz) * std::log2((f_t)(m + nz)); } + if (nz > 1) { + const f_t nz_f = static_cast<f_t>(nz); + work_estimate_ += nz_f * std::log2(nz_f); + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/basis_updates.cpp` at line 2205, The work estimate uses the wrong size: change the condition and log term that update work_estimate_ in the block that precedes the std::sort over nz elements so it uses nz (not m + nz); specifically, replace the (m + nz) check and std::log2((f_t)(m + nz)) with an nz > 1 guard and std::log2((f_t)nz) so work_estimate_ reflects sorting nz elements (reference symbols: work_estimate_, m, nz, std::sort).cpp/include/cuopt/linear_programming/constants.h-80-90 (1)
80-90:⚠️ Potential issue | 🟠 MajorPotential breaking semantic change in
CUOPT_MODE_DETERMINISTIC.
CUOPT_MODE_DETERMINISTICnow aliasesCUOPT_DETERMINISM_FULL(0x3), while the comment indicates prior behavior/value aligned with B&B-only determinism (0x1). Recompiled clients using the legacy macro will silently get stricter behavior/perf characteristics unless this is explicitly documented and migration-guided.As per coding guidelines, "For breaking changes, recommend updating docs and migration guides."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/constants.h` around lines 80 - 90, The macro CUOPT_MODE_DETERMINISTIC was changed to alias CUOPT_DETERMINISM_FULL (0x3) which breaks prior semantics that mapped deterministic mode to B&B-only (CUOPT_DETERMINISM_BB, 0x1); restore the original behavior by remapping CUOPT_MODE_DETERMINISTIC back to CUOPT_DETERMINISM_BB (or introduce a new CUOPT_MODE_DETERMINISTIC_FULL and keep CUOPT_MODE_DETERMINISTIC as before) and add a note in the changelog/docs/migration guide explaining the new FULL mode and the intentional choice if you keep the new mapping; locate and change the definition for CUOPT_MODE_DETERMINISTIC in the constants.h file (symbols: CUOPT_MODE_DETERMINISTIC, CUOPT_DETERMINISM_FULL, CUOPT_DETERMINISM_BB).cpp/src/mip_heuristics/local_search/local_search.cuh-55-83 (1)
55-83:⚠️ Potential issue | 🟠 MajorPass
work_limit_timer_tby reference for consistency and state propagation.Methods in this header accept
work_limit_timer_tby value, but incpp/src/mip_heuristics/presolve/multi_probe.cuh(line 58) andcpp/src/mip_heuristics/presolve/bounds_presolve.cuh(line 64), the same type is passed by reference. Sincetermination_checker_t(the underlying type) tracks mutable state and provides non-const methods likerecord_work(), passing by value creates a copy and prevents state changes from propagating back to the caller. For consistency and correctness, consider changing all parameters towork_limit_timer_t&.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/local_search.cuh` around lines 55 - 83, All declarations that currently take work_limit_timer_t by value should take it by reference so mutation of the underlying termination_checker_t propagates to the caller; update the signatures of generate_fast_solution, generate_solution, run_fj_until_timer, run_local_search, run_fj_annealing, run_fj_line_segment, run_fj_on_zero, check_fj_on_lp_optimal, run_staged_fp and run_fp to accept work_limit_timer_t& (and adjust any callers/definitions accordingly) to match how work_limit_timer_t is used elsewhere (e.g., in multi_probe.cuh and bounds_presolve.cuh).cpp/src/utilities/seed_generator.cuh-16-16 (1)
16-16:⚠️ Potential issue | 🟠 MajorData race on
base_seed_read inlocal_state().
base_seed_is a non-atomicint64_tthat is written inset_seed()(lines 43, 45) and read inlocal_state()(line 32). Whileepoch_uses proper acquire/release semantics, the read ofbase_seed_inlocal_state()races with the write inset_seed()if called concurrently from different threads.Consider making
base_seed_atomic or protecting both fields under a common synchronization mechanism.Proposed fix
class seed_generator { - static int64_t base_seed_; + static std::atomic<int64_t> base_seed_; // Monotonically increasing epoch; incremented on every set_seed() call. // ... static std::atomic<int64_t> epoch_; // In local_state(): if (state.last_epoch != current_epoch) { - state.counter = base_seed_; + state.counter = base_seed_.load(std::memory_order_acquire); state.last_epoch = current_epoch; } // In set_seed(): `#ifdef` BENCHMARK - base_seed_ = std::random_device{}(); + base_seed_.store(std::random_device{}(), std::memory_order_release); `#else` - base_seed_ = static_cast<int64_t>(seed); + base_seed_.store(static_cast<int64_t>(seed), std::memory_order_release); `#endif`Also applies to: 27-36, 40-48
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/seed_generator.cuh` at line 16, The read of base_seed_ in local_state() races with writes in set_seed(); change base_seed_ to std::atomic<int64_t> (or otherwise protect both base_seed_ and epoch_ with a common mutex) and update all accesses: replace raw reads in local_state() with base_seed_.load(std::memory_order_acquire) and writes in set_seed() with base_seed_.store(..., std::memory_order_release) so the ordering matches epoch_’s acquire/release semantics (alternatively lock the same mutex used for epoch_ around both set_seed() and local_state()).cpp/src/mip_heuristics/solution/solution.cu-58-72 (1)
58-72:⚠️ Potential issue | 🟠 MajorThe handle-override constructor is unsafe as written.
This ctor dereferences
handle_overrideimmediately, and the type still sometimes syncs viaproblem_ptr->handle_ptrinstead ofsolution_t::handle_ptr(for example, Line 263). A real override can therefore crash on null or enqueue work on one stream and synchronize another.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solution/solution.cu` around lines 58 - 72, The constructor dereferences handle_override and mixes problem_ptr->handle_ptr vs solution_t::handle_ptr leading to potential null deref and stream mismatches; fix by selecting a single chosen_handle = handle_override ? handle_override : problem_.handle_ptr before using it, initialize solution_t::handle_ptr with that chosen_handle, and use chosen_handle (not problem_.handle_ptr) when calling get_lower_bounds, constructing member tensors (their get_stream()), initializing lp_state, and calling clamp_within_var_bounds so all operations use the same non-null handle consistently.cpp/src/mip_heuristics/solution/solution.cu-689-693 (1)
689-693:⚠️ Potential issue | 🟠 MajorHash only the active variable prefix.
assignment.size()is not guaranteed to matchproblem_ptr->n_variableshere—get_host_assignment()already documents that. Hashing the full buffer pulls stale tail values into the new determinism fingerprint after resize/fix/unfix flows.Suggested fix
template <typename i_t, typename f_t> uint32_t solution_t<i_t, f_t>::get_hash() const { - auto h_assignment = host_copy(assignment, handle_ptr->get_stream()); + auto h_assignment = + cuopt::host_copy(assignment.data(), problem_ptr->n_variables, handle_ptr->get_stream()); return compute_hash(h_assignment); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solution/solution.cu` around lines 689 - 693, get_hash currently copies and hashes the entire device buffer (assignment) which can include stale tail values; change it to compute the hash only over the active variable prefix. Specifically, obtain the host assignment and its active length (use get_host_assignment() or problem_ptr->n_variables / the documented active size from the host vector) and call compute_hash only on that prefix (not the full buffer returned by host_copy). Update solution_t<i_t,f_t>::get_hash to copy and hash only the active-length portion (keep references to assignment, handle_ptr->get_stream(), compute_hash, and problem_ptr->n_variables/get_host_assignment to locate the code).cpp/tests/mip/determinism_test.cu-70-95 (1)
70-95:⚠️ Potential issue | 🟠 MajorDon't use the capped prefix buffer to prove GPU callbacks happened.
Once Line 87 starts returning early, later callbacks are dropped from
solutions. The downstreamcount_gpu_callbacks(callbacks_run*)checks then only inspect the first 32 callbacks, so this test can fail even when GPU-origin incumbents appear later in the run. Keep the prefix cap for stored assignments, but track callback-origin counts across all invocations.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/determinism_test.cu` around lines 70 - 95, In get_solution, we should not gate counting of callback origins on the capped solutions buffer; move or add logic that records/updates callback-origin counts (using callback_info->origin or a dedicated counters map/array like callbacks_run) before the early return that checks solutions.size() >= max_solutions so every invocation is accounted for, while still preserving the existing prefix-cap behavior for storing assignment data in solutions; update get_solution to increment the origin counter (or call the existing count/update helper) immediately after ASSERT/EXPECT checks and before the if (solutions.size() >= max_solutions) return.cpp/tests/mip/local_search_test.cu-157-170 (1)
157-170:⚠️ Potential issue | 🟠 MajorStop the FP loop when the work budget is exhausted.
run_single_fp_descent()/restart_fp()consumelocal_search.fp.timer, but the outerwhile (true)never checks it. Once the timer expires, this can spin forever returningfalseand hang the test runner.Suggested fix
if (mode == local_search_mode_t::FP) { bool is_feasible = false; int iterations = 0; - while (true) { + while (!local_search.fp.timer.check_time_limit()) { is_feasible = local_search.fp.run_single_fp_descent(solution); printf("fp_loop it %d, is_feasible %d\n", iterations, is_feasible); if (is_feasible) { break; } else { is_feasible = local_search.fp.restart_fp(solution); if (is_feasible) { break; } } iterations++; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/local_search_test.cu` around lines 157 - 170, The FP loop can spin after the work budget expires because run_single_fp_descent() and restart_fp() consume local_search.fp.timer but the while(true) never checks it; update the loop in local_search_test.cu (around local_search.fp.run_single_fp_descent and local_search.fp.restart_fp) to check the FP work budget/timer after each call and break when expired (e.g., inspect local_search.fp.timer or call its expired/remaining API) so the loop exits when the timer indicates work is exhausted rather than looping forever.cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh-249-256 (1)
249-256:⚠️ Potential issue | 🟠 MajorNon-deterministic recombiner order due to
unordered_setiteration.The
enabled_recombinersvector is constructed fromstd::unordered_setiteration, which has implementation-defined order. This meansorder_strin the log and the actual recombiner execution order may vary between runs, potentially affecting determinism.🐛 Fix: Sort the enabled_recombiners vector after construction
recombiner_t::enabled_recombiners = std::vector<recombiner_enum_t>(enabled_recombiners.begin(), enabled_recombiners.end()); + // Sort to ensure deterministic iteration order + std::sort(recombiner_t::enabled_recombiners.begin(), + recombiner_t::enabled_recombiners.end()); cuopt_assert(!recombiner_t::enabled_recombiners.empty(), "No recombiners enabled after init");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh` around lines 249 - 256, The enabled_recombiners vector is built from an unordered_set and therefore can have non-deterministic iteration order; after constructing recombiner_t::enabled_recombiners, sort it to enforce a stable order (e.g., call std::sort on recombiner_t::enabled_recombiners using a comparator that orders by the enum numeric value or by recombiner_name(recombiner) for a deterministic lexical order) so order_str and actual execution order are deterministic.cpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuh-193-195 (1)
193-195:⚠️ Potential issue | 🟠 MajorDon't ship a placeholder work estimate into deterministic accounting.
n_vars_from_other / 1e8ignores the actual branch taken, reduced-problem size, probing load, and the bounds-prop/LP effort. Two recombinations with the samen_vars_from_othercan therefore consume very different work while being charged identically.Do you want me to sketch an estimator based on reduced-problem size and the actual bounds-prop rounds?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuh` around lines 193 - 195, The current deterministic accounting uses a placeholder estimate double work = static_cast<double>(n_vars_from_other) / 1e8; which mischarges recombinations; replace this with a real estimator that accounts for the actual branch taken, reduced-problem size, bounds-prop rounds, probing effort and any LP solves: compute work from inputs like n_vars_from_other, n_constraints_from_other (or reduced problem variable/constraint counts), the number of bounds-prop iterations/rounds (e.g., bounds_prop_rounds), and an LP/solver cost factor (e.g., lp_solves or lp_iterations), and combine them into a weighted sum (or calibrated formula) instead of the fixed division; update the usage of work where used in BoundPropRecombiner/recombine (or the function containing work) so charges reflect these concrete components rather than the 1e8 placeholder.cpp/src/mip_heuristics/solve.cu-176-179 (1)
176-179:⚠️ Potential issue | 🟠 MajorDon't force-enable solution logging.
Line 178 turns
CUOPT_MIP_HIDE_SOLUTIONinto a no-op, so callers who explicitly asked to suppress full solution output will still get it in logs.🛠️ Suggested fix
int hidesol = std::getenv("CUOPT_MIP_HIDE_SOLUTION") ? atoi(std::getenv("CUOPT_MIP_HIDE_SOLUTION")) : 0; - hidesol = 0; if (!hidesol) { detail::print_solution(scaled_problem.handle_ptr, sol.get_solution()); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solve.cu` around lines 176 - 179, The code forcibly enables solution logging by overwriting the hidesol flag (int hidesol ...) with hidesol = 0; remove that overwrite so the hidesol value derived from the environment variable CUOPT_MIP_HIDE_SOLUTION is respected; then ensure the conditional if (!hidesol) { detail::print_solution(scaled_problem.handle_ptr, sol.get_solution()); } remains unchanged so callers can suppress printing by setting the env var, referencing the hidesol variable, detail::print_solution, scaled_problem.handle_ptr, and sol.get_solution() to locate the change.cpp/src/mip_heuristics/solve.cu-244-246 (1)
244-246:⚠️ Potential issue | 🟠 MajorThe root checker still ignores
settings.work_limit.
deterministic_runis computed after the checker is created, and the checker is always initialized fromtime_limit. That means the new deterministicremaining_time()/work_limit_timer_tplumbing below never sees the configured top-level work budget.🛠️ Suggested fix
- auto timer = - cuopt::termination_checker_t(time_limit, cuopt::termination_checker_t::root_tag_t{}); - const bool deterministic_run = (settings.determinism_mode != CUOPT_DETERMINISM_NONE); + const bool deterministic_run = (settings.determinism_mode != CUOPT_DETERMINISM_NONE); + const f_t root_work_limit = + (settings.work_limit == 0 || settings.work_limit == std::numeric_limits<f_t>::infinity() || + settings.work_limit == std::numeric_limits<f_t>::max()) + ? max_time_limit + : settings.work_limit; + auto timer = cuopt::termination_checker_t( + deterministic_run ? root_work_limit : time_limit, + cuopt::termination_checker_t::root_tag_t{});🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solve.cu` around lines 244 - 246, Compute deterministic_run before constructing the termination checker and initialize cuopt::termination_checker_t with the correct top-level budget (use settings.work_limit via the work_limit timer path when determinism is enabled) instead of always passing time_limit; in practice move the line computing deterministic_run above the creation of timer and, if deterministic_run (or settings.work_limit is set), construct the checker using the work-limit-aware mechanism (work_limit_timer_t / remaining_time plumbing) so the root termination_checker_t (root_tag_t) sees settings.work_limit rather than always using time_limit.cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu-304-307 (1)
304-307:⚠️ Potential issue | 🟠 MajorThe pre-sort does not survive the later
thrust::sort_by_keyincompute_damages().The
compute_damages()function resorts candidates usingthrust::sort_by_keyon(cstr_delta, damage)with a custom comparator. Sincesort_by_keyis unstable, candidates with equal(cstr_delta, damage)pairs can be reordered, destroying the deterministic(variable_index, bound_shift)ordering established by the pre-sort at lines 304-307.To fix: either replace
sort_by_keywithstable_sort_by_key, or add(variable_index, bound_shift)as tie-breakers in the custom comparator.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu` around lines 304 - 307, The current pre-sort of candidates by (variable_index, bound_shift) is being undone inside compute_damages() because it uses thrust::sort_by_key with a custom comparator (on cstr_delta and damage) which is unstable; either switch that call to thrust::stable_sort_by_key so equal (cstr_delta, damage) preserve the pre-sort, or update the custom comparator used in compute_damages() to include candidates.variable_index and candidates.bound_shift as deterministic tie-breakers (compare cstr_delta, then damage, then variable_index, then bound_shift) to maintain deterministic ordering.cpp/src/utilities/work_limit_context.hpp-60-62 (1)
60-62:⚠️ Potential issue | 🟠 Major
CUOPT_GPU_HEUR_WORK_UNIT_SCALEdoes not change the actual budget yet.The env value is stored in
producer_progress_scale, but recorded work is multiplied bywork_unit_scale, which stays1.0. That makes producer notifications and real scheduling drift apart.Suggested fix
- double producer_progress_scale{ - read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0)}; - double work_unit_scale{1.0}; + double work_unit_scale{ + read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0)}; + double producer_progress_scale{work_unit_scale};Also applies to: 123-124
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/work_limit_context.hpp` around lines 60 - 62, producer_progress_scale is being initialized from CUOPT_GPU_HEUR_WORK_UNIT_SCALE but work_unit_scale remains 1.0 so recorded work never uses the env override; update the initialization so work_unit_scale is set from read_work_unit_scale_env_or_default (or assigned from producer_progress_scale) so both producer notifications and recorded work use the same scale (apply the same change where work_unit_scale is declared/initialized at the other occurrence referenced by symbols producer_progress_scale and work_unit_scale).cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu-29-58 (1)
29-58:⚠️ Potential issue | 🟠 MajorGate
log_unset_var_ordering()before the host syncs.This helper copies to host, synchronizes the stream, and reads device values before it reaches
CUOPT_DETERMINISM_LOG. Because the call sites are unconditional, normal runs pay that stall too.Suggested fix
template <typename i_t, typename f_t> void log_unset_var_ordering(const char* label, const rmm::device_uvector<i_t>& vars, const rmm::device_uvector<f_t>& assignment, - const raft::handle_t* handle_ptr) + const raft::handle_t* handle_ptr, + bool enabled) { + if (!enabled) return; const size_t n = vars.size(); const size_t cap = std::min(n, (size_t)16); std::vector<i_t> h_vars(cap); raft::copy(h_vars.data(), vars.data(), cap, handle_ptr->get_stream()); handle_ptr->sync_stream(); @@ - log_unset_var_ordering("post-sort_by_frac", unset_integer_vars, sol.assignment, sol.handle_ptr); + log_unset_var_ordering( + "post-sort_by_frac", unset_integer_vars, sol.assignment, sol.handle_ptr, timer.deterministic);Apply the same guard to the other two call sites in
find_integer().Also applies to: 945-946, 963-964, 998-999
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu` around lines 29 - 58, The function log_unset_var_ordering performs host copies and stream syncs unconditionally; wrap its expensive host-side work behind the determinism logging guard so it only runs when CUOPT_DETERMINISM_LOG is enabled (e.g., check the same macro/flag used by CUOPT_DETERMINISM_LOG and early-return if logging is disabled), and also wrap the three unconditional calls from find_integer() with the same guard so those call sites no longer trigger the host copy/sync when logging is off; locate log_unset_var_ordering and the call sites inside find_integer() and implement the conditional/early-return around the host copy, raft::copy/handle_ptr->sync_stream, and assignment reads (or wrap the entire function body) and add the same macro check at the three call sites.cpp/src/mip_heuristics/solution_callbacks.cuh-87-100 (1)
87-100:⚠️ Potential issue | 🟠 MajorMove callback dispatch out of
solution_callback_mutex_.
invoke_get_solution_callbacks()runs user code. Holding the mutex across that call can deadlock on re-entrant publication paths and blocks solver progress behind slow callbacks.Suggested fix
bool publish_new_best_feasible(const solution_callback_payload_t<i_t, f_t>& payload, double elapsed_time = -1.0) { - std::lock_guard<std::mutex> lock(solution_callback_mutex_); - cuopt_assert(std::isfinite(payload.solver_objective), - "Feasible incumbent objective must be finite"); - if (!(payload.solver_objective < best_callback_feasible_objective_)) { return false; } - - best_callback_feasible_objective_ = payload.solver_objective; - if (settings.benchmark_info_ptr != nullptr && elapsed_time >= 0.0) { - settings.benchmark_info_ptr->last_improvement_of_best_feasible = elapsed_time; - } - invoke_get_solution_callbacks(payload); - return true; + { + std::lock_guard<std::mutex> lock(solution_callback_mutex_); + cuopt_assert(std::isfinite(payload.solver_objective), + "Feasible incumbent objective must be finite"); + if (!(payload.solver_objective < best_callback_feasible_objective_)) { return false; } + + best_callback_feasible_objective_ = payload.solver_objective; + if (settings.benchmark_info_ptr != nullptr && elapsed_time >= 0.0) { + settings.benchmark_info_ptr->last_improvement_of_best_feasible = elapsed_time; + } + } + invoke_get_solution_callbacks(payload); + return true; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 87 - 100, The function publish_new_best_feasible holds solution_callback_mutex_ across invoke_get_solution_callbacks, which can run user code and deadlock; limit the mutex scope to only the critical section that checks/updates best_callback_feasible_objective_ and benchmark info (use a scoped block or a std::unique_lock and unlock before calling invoke_get_solution_callbacks), then call invoke_get_solution_callbacks(payload) after releasing solution_callback_mutex_; keep checks using solution_callback_mutex_, update best_callback_feasible_objective_ and settings.benchmark_info_ptr while locked, then release the lock and finally invoke_get_solution_callbacks.cpp/src/mip_heuristics/solution_callbacks.cuh-205-206 (1)
205-206:⚠️ Potential issue | 🟠 MajorUse model-scaled tolerances for injected-objective validation.
A fixed
1e-6absolute check is too tight for templatedf_tand large objectives. Valid injected solutions can trip this assert after preprocessing/scaling due to normal roundoff.Suggested fix
- cuopt_assert(std::abs(outside_sol.get_user_objective() - outside_sol_objective) <= 1e-6, - "External solution objective mismatch"); + const f_t computed_obj = outside_sol.get_user_objective(); + const f_t abs_tol = current_incumbent.problem_ptr->tolerances.absolute_tolerance; + const f_t rel_tol = f_t(1e-6); + cuopt_assert(std::abs(computed_obj - outside_sol_objective) <= + abs_tol + rel_tol * + std::max(std::abs(computed_obj), + std::abs(outside_sol_objective)), + "External solution objective mismatch");🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 205 - 206, The absolute tolerance 1e-6 in the cuopt_assert comparing outside_sol.get_user_objective() and outside_sol_objective is too strict for templated f_t and scaled models; replace it with a model-scaled tolerance (e.g., compute tol = max(min_tol, rel_tol * max(1.0, abs(outside_sol.get_user_objective()), abs(outside_sol_objective))) or use the model/objective scaling factor if available) and use cuopt_assert(std::abs(...) <= tol, ...) instead; update the assertion in solution_callbacks.cuh where outside_sol.get_user_objective() and outside_sol_objective are compared to use this computed tolerance.cpp/src/utilities/termination_checker.hpp-10-13 (1)
10-13:⚠️ Potential issue | 🟠 MajorMake
finishedatomic.
check()can run concurrently from multiple worker threads. The mutablebool finishedis read and written without synchronization in the deterministic path, which is a data race.🔒 Proposed fix
`#include` <algorithm> +#include <atomic> `#include` <cmath> `#include` <string> @@ - if (finished_now && !finished) { - finished = true; - double actual_elapsed_time = timer.elapsed_time(); - - if (work_limit > 0 && std::abs(actual_elapsed_time - work_limit) / work_limit > 0.10) { - CUOPT_LOG_TRACE( - "%s:%d: %s(): Work limit timer finished with a large discrepancy: %fs for %fwu " - "(global: %g, start: %g)", - file, - line, - caller, - actual_elapsed_time, - work_limit, - work_context->current_work(), - work_units_at_start); + if (finished_now) { + bool expected = false; + if (finished.compare_exchange_strong(expected, true, std::memory_order_relaxed)) { + double actual_elapsed_time = timer.elapsed_time(); + + if (work_limit > 0 && std::abs(actual_elapsed_time - work_limit) / work_limit > 0.10) { + CUOPT_LOG_TRACE( + "%s:%d: %s(): Work limit timer finished with a large discrepancy: %fs for %fwu " + "(global: %g, start: %g)", + file, + line, + caller, + actual_elapsed_time, + work_limit, + work_context->current_work(), + work_units_at_start); + } } } - return finished; + return finished.load(std::memory_order_relaxed); @@ - mutable bool finished{false}; + mutable std::atomic<bool> finished{false};Also applies to: 88-109, 223-225
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/termination_checker.hpp` around lines 10 - 13, The mutable bool finished in the termination checker is a data race because check() is called concurrently; change it to a thread-safe atomic by adding `#include` <atomic> and replacing the declaration with mutable std::atomic<bool> finished{false}; then update all reads/writes to use finished.load(...) and finished.store(...), using memory_order_acquire for reads in check() and memory_order_release for writes when setting finished (and/or memory_order_relaxed for simple clear/set if appropriate), and apply the same changes to any other methods that read/write finished (e.g., the reset/marking code referenced around the other occurrences).cpp/src/mip_heuristics/diversity/diversity_manager.cu-499-503 (1)
499-503:⚠️ Potential issue | 🟠 MajorDerive deterministic phase budgets from remaining work.
timer.get_time_limit()ignores work already consumed beforerun_solver()(presolve, probing, etc.). In deterministic mode the initial LP budget can therefore exceed the remaining global budget.🛠️ Proposed fix
population.timer = timer; - const f_t time_limit = timer.deterministic ? timer.get_time_limit() : timer.remaining_time(); + const f_t time_limit = timer.remaining_time(); const f_t lp_time_limit = std::min(diversity_config.max_time_on_lp, time_limit * diversity_config.time_ratio_on_init_lp);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 499 - 503, The deterministic LP-phase budget is being computed from timer.get_time_limit(), which can ignore work already consumed before run_solver(); instead derive phase budgets from remaining time. Change the computation so time_limit is obtained from timer.remaining_time() for both deterministic and non-deterministic modes (i.e., replace the conditional using timer.get_time_limit() with a call to timer.remaining_time()), then compute lp_time_limit as before using diversity_config.max_time_on_lp and diversity_config.time_ratio_on_init_lp; update any comments referencing deterministic behavior to reflect the unified remaining-time approach (references: population.timer, timer.get_time_limit(), timer.remaining_time(), time_limit, lp_time_limit, diversity_config.max_time_on_lp, diversity_config.time_ratio_on_init_lp).cpp/tests/mip/diversity_test.cu-285-295 (1)
285-295:⚠️ Potential issue | 🟠 MajorDon't call
exit(1)from a test helper.This aborts the whole test binary and skips GTest failure reporting and teardown for the rest of the suite. Fail the current test instead.
🧪 Proposed fix
if (hash_map.find(std::make_tuple(path, i, j, recombiner)) == hash_map.end()) { hash_map[std::make_tuple(path, i, j, recombiner)] = offspring_hash; } else { if (hash_map[std::make_tuple(path, i, j, recombiner)] != offspring_hash) { - printf("%s: hash mismatch for %d,%d: %d != %d\n", - path.c_str(), - i, - j, - hash_map[std::make_tuple(path, i, j, recombiner)], - offspring_hash); - exit(1); + ADD_FAILURE() << path << ": hash mismatch for " << i << "," << j << ": expected " + << hash_map[std::make_tuple(path, i, j, recombiner)] + << " got " << offspring_hash; + return 0; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/diversity_test.cu` around lines 285 - 295, The test helper currently prints an error and calls exit(1) on a hash mismatch (the branch that compares hash_map[std::make_tuple(path, i, j, recombiner)] to offspring_hash); replace the printf + exit(1) with a GTest failure so the current test fails cleanly (e.g., use FAIL() or ADD_FAILURE() with a formatted message including path, i, j, the stored hash and offspring_hash) and remove the exit call so teardown and other tests continue to run.
🟡 Minor comments (11)
ci/compute-sanitizer-suppressions.xml-80-106 (1)
80-106:⚠️ Potential issue | 🟡 MinorRecord 5 (lines 80-108) has a brittle host stack—consolidate into explicit function anchors
This record has 4 module-only frames paired with only 1 function anchor (
ThreadLoad), making it vulnerable to silent suppression failures across CUDA/runtime updates. Tighten the stack by using function-anchored callsites (cudaLaunchKernel*,cudaMemcpyAsync, specificrmm::*functions) instead of repeated module patterns.Note: Records at lines 124-155, 177-209, and 223-249 already have multiple function anchors alongside their module-only frames (2–3 anchors each), so they are not at risk.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci/compute-sanitizer-suppressions.xml` around lines 80 - 106, The suppression record for Initcheck/ThreadLoad is brittle because it uses multiple module-only frames; replace those module-only anchors with concrete function callsites to stabilize matching — e.g., add/replace frames referencing cudaLaunchKernel* and/or cudaMemcpyAsync and specific rmm::* functions (or other explicit host functions that call into libcuda/libcudart/libcuopt) alongside the existing ThreadLoad and Device(Reduce|Scan)Kernel anchors so the hostStack contains multiple function-anchored frames instead of only module patterns.cpp/src/utilities/determinism_log.hpp-19-35 (1)
19-35:⚠️ Potential issue | 🟡 MinorRemove duplicate macro blocks; only the first definition is active.
After Line 20 defines
CUOPT_DETERMINISM_LOG, the#ifndefblocks at Line 25-35 never execute. Keep one no-op definition to avoid confusion.Proposed cleanup
`#ifndef` CUOPT_DETERMINISM_LOG `#define` CUOPT_DETERMINISM_LOG(logger, ...) \ do { \ } while (0) `#endif` - -#ifndef CUOPT_DETERMINISM_LOG -#define CUOPT_DETERMINISM_LOG(...) \ - do { \ - } while (0) -#endif - -#ifndef CUOPT_DETERMINISM_LOG -#define CUOPT_DETERMINISM_LOG(...) \ - do { \ - } while (0) -#endif🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/determinism_log.hpp` around lines 19 - 35, The file contains three redundant no-op macro definitions for CUOPT_DETERMINISM_LOG; keep a single no-op macro definition and remove the duplicate blocks so the macro is only defined once. Locate the repeated CUOPT_DETERMINISM_LOG macro definitions (the `#ifndef` / `#define` / do { } while (0) blocks) and delete the extra two, leaving one clear no-op definition to avoid confusion.cpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hpp-80-84 (1)
80-84:⚠️ Potential issue | 🟡 MinorInitialize
last_recombiner_workand consider updatingreset().The new field
last_recombiner_workis uninitialized, which could lead to undefined behavior ifget_last_recombiner_work()is called beforeset_recombiner_work(). Similarly, thereset()method (lines 99-105) does not clearlast_recombiner_work(orlast_recombiner_time), so these values persist across resets.Consider adding default initialization:
Proposed fix
double last_recombiner_time; std::chrono::high_resolution_clock::time_point last_recombiner_start_time; - double last_recombiner_work; + double last_recombiner_work{0.0};🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hpp` around lines 80 - 84, Initialize last_recombiner_work (and last_recombiner_time) when declared and clear them in the reset() method: add a default value for last_recombiner_work (e.g., 0.0) in its declaration, and update reset() to explicitly set last_recombiner_work and last_recombiner_time back to the same default/sentinel so get_last_recombiner_work() cannot return uninitialized data; modify the set_recombiner_work(), get_last_recombiner_work(), and reset() logic accordingly to rely on these initialized fields.cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh-121-124 (1)
121-124:⚠️ Potential issue | 🟡 MinorPass the work-limit timer by reference for consistency.
Line 123 passes
work_limit_timer_t timer_by value, which is inconsistent with other timer-accepting functions in the codebase (bound_update_loopin bounds_presolve.cuh and multi_probe.cuh acceptwork_limit_timer_t&). While the current implementation functions correctly due to the sharedwork_contextpointer, accepting by reference aligns with the established pattern for mutable state throughout the codebase.Suggested header fix
- bool repair_problem(problem_t<i_t, f_t>& problem, - problem_t<i_t, f_t>& original_problem, - work_limit_timer_t timer_, - const raft::handle_t* handle_ptr_); + bool repair_problem(problem_t<i_t, f_t>& problem, + problem_t<i_t, f_t>& original_problem, + work_limit_timer_t& timer_, + const raft::handle_t* handle_ptr_);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh` around lines 121 - 124, The declaration of repair_problem currently takes work_limit_timer_t timer_ by value; change it to take work_limit_timer_t& timer_ for consistency with other timer-accepting functions (e.g., bound_update_loop) and to avoid copying mutable timer state. Update the function prototype in bounds_repair.cuh (repair_problem<...> signature), the corresponding implementation/definition, and all call sites so they pass the timer by reference (and adjust any constness if needed); ensure the parameter name and the raft::handle_t* handle_ptr_ symbol remain unchanged.cpp/include/cuopt/linear_programming/cuopt_c.h-740-740 (1)
740-740:⚠️ Potential issue | 🟡 MinorDocumentation references non-existent
struct_sizefield.The documentation states "
struct_sizeis always set and can be used to detect future extensions safely," butcuOptMIPSolutionCallbackInfodoes not contain astruct_sizefield. This appears to be a copy-paste error.Suggested fix
- * `@param`[in] callback_info - Pointer to callback metadata. `struct_size` is always set and can be - * used to detect future extensions safely. + * `@param`[in] callback_info - Pointer to callback metadata containing solution origin and work + * timestamp.As per coding guidelines: For public header files, verify parameter descriptions match actual types/behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/cuopt_c.h` at line 740, The comment incorrectly references a non-existent struct_size field in the documentation for cuOptMIPSolutionCallbackInfo; update the parameter description for callback_info in cuopt_c.h to remove the mention of struct_size (or replace it with the correct field if one exists) so the doc matches the actual cuOptMIPSolutionCallbackInfo definition and follows public-header guidelines.cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh-79-80 (1)
79-80:⚠️ Potential issue | 🟡 MinorAdd a null check before dereferencing
context.termination.The
work_limit_timer_tconstructor (which is an alias fortermination_checker_t) dereferences the third parameter without null validation. Thecontext.terminationmember is initialized tonullptrand documented as being "set by mip_solver_t after construction," but there is no runtime check to ensure it is non-null whenrecombine()is called. This pattern appears elsewhere in the codebase (e.g., in diversity_manager.cu) and represents a systematic unsafe pointer dereference. Add an assertion or null check to enforce the precondition, or refactor to avoid the dereference if termination may not be set.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh` around lines 79 - 80, Before constructing work_limit_timer_t line_segment_timer in recombine(), verify context.termination is non-null; add an assertion or explicit null-check (e.g., assert(context.termination) or if (!context.termination) handle error/return) prior to the line that creates line_segment_timer so you never dereference a nullptr. Reference the symbols: work_limit_timer_t, line_segment_timer, context.termination and the recombine() method when applying the check.cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh-72-75 (1)
72-75:⚠️ Potential issue | 🟡 MinorThis fingerprint hashes
vars_to_fixbefore it is populated.
vars_to_fixis a reusable scratch buffer, so the hash at this point reflects stale or uninitialized contents from a previous recombination. That makes the new determinism trace misleading.Suggested fix
CUOPT_DETERMINISM_LOG( - "FP rec: offspring hash 0x%x, vars to fix 0x%x", - offspring.get_hash(), - detail::compute_hash(make_span(vars_to_fix), offspring.handle_ptr->get_stream())); + "FP rec: offspring hash 0x%x", offspring.get_hash());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines 72 - 75, The determinism log is hashing vars_to_fix before it is populated, causing stale/uninitialized data to be logged; update the FP rec logging so that detail::compute_hash(make_span(vars_to_fix), offspring.handle_ptr->get_stream()) is called only after vars_to_fix has been filled with the current recombination data (or compute the hash of the correct populated span), i.e., move the compute_hash call to after the code that populates vars_to_fix (or ensure vars_to_fix is explicitly zeroed/populated beforehand) and then pass that populated span into CUOPT_DETERMINISM_LOG alongside offspring.get_hash().benchmarks/linear_programming/cuopt/run_mip.cpp-299-299 (1)
299-299:⚠️ Potential issue | 🟡 MinorTypo in commented line: "1solution" should be "solution".
This appears to be a debugging artifact where a character was accidentally prepended to comment out the line.
🔧 Fix the comment
- // 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream()); + // solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmarks/linear_programming/cuopt/run_mip.cpp` at line 299, The commented-out line contains a typo: "1solution.write_to_sol_file(base_filename + \".sol\", handle_.get_stream());" — remove the leading '1' so the comment reads "solution.write_to_sol_file(base_filename + \".sol\", handle_.get_stream());" (or uncomment it if intended to be active). Update the comment around the symbol solution.write_to_sol_file and keep the references to base_filename and handle_.get_stream() unchanged.cpp/tests/mip/feasibility_jump_tests.cu-192-197 (1)
192-197:⚠️ Potential issue | 🟡 MinorStatic map for determinism tracking has thread-safety concern.
The
static std::unordered_mapis not thread-safe. If Google Test runs multiple test cases in parallel (via--gtest_parallel), concurrent access could cause data races.Consider using
thread_localor ensuring tests run sequentially.Also,
EXPECT_NEAR(..., 1.0)is a relatively loose tolerance for a determinism check — if the solver is truly deterministic, the values should match exactly (within floating-point representation).🔧 Suggested fix for thread safety
- static std::unordered_map<std::string, double> first_val_map; + // Thread-local to avoid races if tests run in parallel + static thread_local std::unordered_map<std::string, double> first_val_map;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/feasibility_jump_tests.cu` around lines 192 - 197, The static unordered_map first_val_map used to record determinism is not thread-safe; replace it with a thread-safe alternative and tighten the comparison: either make first_val_map thread_local or protect access with a mutex around reads/writes to first_val_map (referencing first_val_map and test_instance), and change the assertion from EXPECT_NEAR(solution.get_user_objective(), first_val_map[test_instance], 1.0) to a stricter equality check such as EXPECT_DOUBLE_EQ(solution.get_user_objective(), first_val_map[test_instance]) or EXPECT_NEAR(..., epsilon) with a very small epsilon (e.g., 1e-12) to reliably detect non-determinism while avoiding data races when reading solution.get_user_objective().cpp/tests/mip/presolve_test.cu-151-157 (1)
151-157:⚠️ Potential issue | 🟡 MinorHash both cached-bound branches.
The composite hash only includes
a.second[0].var_to_cached_bound_map. Ifa.second[1]diverges, this test still reports deterministic behavior.Suggested fix
- auto sorted_map = std::map<int, detail::cached_bound_t<double>>( - a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end()); - for (const auto& [var_id, cached_bound] : sorted_map) { - var_to_cached_bound_keys.push_back(var_id); - var_to_cached_bound_lb.push_back(cached_bound.lb); - var_to_cached_bound_ub.push_back(cached_bound.ub); - } + for (const auto& cache_entry : a.second) { + auto sorted_map = std::map<int, detail::cached_bound_t<double>>( + cache_entry.var_to_cached_bound_map.begin(), cache_entry.var_to_cached_bound_map.end()); + for (const auto& [var_id, cached_bound] : sorted_map) { + var_to_cached_bound_keys.push_back(var_id); + var_to_cached_bound_lb.push_back(cached_bound.lb); + var_to_cached_bound_ub.push_back(cached_bound.ub); + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/presolve_test.cu` around lines 151 - 157, The test only hashes a.second[0].var_to_cached_bound_map, so divergence in a.second[1] is ignored; update the logic that builds sorted_map and the subsequent pushes (var_to_cached_bound_keys, var_to_cached_bound_lb, var_to_cached_bound_ub) to include both branches: iterate/merge a.second[0].var_to_cached_bound_map and a.second[1].var_to_cached_bound_map (or otherwise combine their entries) in a deterministic order (e.g., insert both into a std::map keyed by var_id) and then push the keys and cached_bound_t (lb/ub) from that combined sorted_map so both branches affect the composite hash.cpp/src/branch_and_bound/pseudo_costs.cpp-223-260 (1)
223-260:⚠️ Potential issue | 🟡 MinorForward
work_ctxthrough this overload.This wrapper accepts
work_ctxbut drops it when it callstrial_branching(), so callers using the atomic-counter overload silently bypass the new work-accounting path.🩹 Proposed fix
return trial_branching(original_lp, settings, var_types, vstatus, edge_norms, basis_factors, basic_list, nonbasic_list, branch_var, branch_var_lower, branch_var_upper, upper_bound, bnb_lp_iter_per_node, start_time, upper_max_lp_iter, lower_max_lp_iter, - total_lp_iter); + total_lp_iter, + work_ctx);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 223 - 260, The wrapper trial_branching_generic currently accepts a cuopt::work_limit_context_t* work_ctx parameter but calls trial_branching without forwarding it, which causes callers using the work-accounting path to be bypassed; update the call to trial_branching inside trial_branching_generic to pass the work_ctx argument through (i.e., include the work_ctx when invoking trial_branching) so the work-limit context is honored, keeping all other parameters unchanged; reference trial_branching_generic and trial_branching to locate the change.
| solution_t<i_t, f_t>& sol, | ||
| f_t lp_run_time_after_feasible, | ||
| timer_t& timer, | ||
| work_limit_timer_t& timer, |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the relationship between work_limit_timer_t and timer_t
rg -n "work_limit_timer_t|class.*timer" cpp/src/utilities/timer.hpp cpp/src/utilities/termination_checker.hpp -A 5
# Check if work_limit_timer_t is a typedef or inherits from timer_t
ast-grep --pattern $'using work_limit_timer_t = $_'Repository: aliceb-nv/cuopt-fork
Length of output: 2319
🏁 Script executed:
#!/bin/bash
# Extract function signatures from lb_constraint_prop.cu
echo "=== Checking apply_round signature around line 704 ==="
sed -n '700,710p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
echo -e "\n=== Checking find_integer signature around line 759 ==="
sed -n '755,765p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
echo -e "\n=== Checking run_repair_procedure signature around line 457 ==="
sed -n '453,463p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu
echo -e "\n=== Checking find_integer call in apply_round around line 729 ==="
sed -n '725,735p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuRepository: aliceb-nv/cuopt-fork
Length of output: 2200
🏁 Script executed:
#!/bin/bash
# Check timer_t and termination_checker_t class definitions and any inheritance
echo "=== timer_t class definition ==="
sed -n '16,60p' cpp/src/utilities/timer.hpp
echo -e "\n=== termination_checker_t class definition (first 50 lines) ==="
sed -n '30,80p' cpp/src/utilities/termination_checker.hpp
echo -e "\n=== Check for inheritance and compatibility methods ==="
rg -A 2 "class termination_checker_t|struct termination_checker_t" cpp/src/utilities/termination_checker.hpp
echo -e "\n=== Check if termination_checker_t inherits from timer_t ==="
rg "termination_checker_t\s*:" cpp/src/utilities/termination_checker.hppRepository: aliceb-nv/cuopt-fork
Length of output: 3214
🏁 Script executed:
#!/bin/bash
# Check if termination_checker_t has methods compatible with timer_t interface
echo "=== Check termination_checker_t methods ==="
rg -A 3 "remaining_time|check_time_limit|elapsed_time" cpp/src/utilities/termination_checker.hpp | head -40
echo -e "\n=== Full termination_checker_t class (up to line 237) ==="
wc -l cpp/src/utilities/termination_checker.hpp
echo -e "\n=== Check for conversion operator or implicit conversion ==="
rg "operator timer_t|explicit|implicit" cpp/src/utilities/termination_checker.hppRepository: aliceb-nv/cuopt-fork
Length of output: 1890
🏁 Script executed:
#!/bin/bash
# Check if find_integer is a template function
echo "=== Search for find_integer definition ==="
rg -B 5 "find_integer\s*\(" cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu | head -30
echo -e "\n=== Check function signature and template status ==="
sed -n '745,765p' cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuRepository: aliceb-nv/cuopt-fork
Length of output: 1290
Fix timer type incompatibility in apply_round and find_integer calls.
apply_round (line 704) accepts work_limit_timer_t& timer, but calls find_integer (line 729) which expects timer_t& timer (line 759). These are incompatible types—work_limit_timer_t is a typedef for termination_checker_t, which is unrelated to timer_t. Similarly, run_repair_procedure (line 457) also expects timer_t&. Update function signatures to use a consistent timer type across the call chain.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cu` at line
704, The timer parameter types are inconsistent: apply_round currently takes
work_limit_timer_t& but calls find_integer and run_repair_procedure which expect
timer_t&; change the signatures of apply_round and any intermediate callers
(e.g., run_repair_procedure) to accept timer_t& (or change find_integer to
accept work_limit_timer_t& if you prefer that canonical type) so the same timer
type is used across the call chain; update all call sites to pass the matching
timer_t instance and adjust any typedef/using declarations if needed to make
timer_t the shared, compatible timer type.
Test PR to trigger coderabbit.
Description
Issue
Checklist