Conversation
|
@CodeRabbit full review |
✅ Actions performedFull review triggered. |
|
Warning Rate limit exceeded
Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 13 minutes and 42 seconds. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (116)
📝 WalkthroughWalkthroughThis PR introduces comprehensive determinism support for GPU heuristics with solution origin tracking. Key changes include a new Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 19
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/src/mip_heuristics/solver.cu (1)
445-460:⚠️ Potential issue | 🟠 MajorDeterministic B&B currently drops heuristic incumbents.
In the deterministic branch,
context.problem_ptr->branch_and_bound_callbackis left unset, so GPU-heuristic solutions never reach B&B. Later, Lines 518-528 explicitly discardsolunless B&B itself retired a finite-objective incumbent. That means a feasible heuristic solution found during a deterministic run can disappear from the final result.Please wire the deterministic path to
queue_external_solution_deterministic(...)(or the equivalent origin/timestamp-aware ingress) before keeping the “discard GPU heuristic incumbent” logic enabled.Also applies to: 518-528
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solver.cu` around lines 445 - 460, The deterministic branch leaves context.problem_ptr->branch_and_bound_callback unset so GPU-heuristic incumbents are dropped; set the callback in the else branch to call branch_and_bound->queue_external_solution_deterministic(...) (or an equivalent ingress that includes origin/timestamp) similar to the commented lambda so heuristic solutions reach dual_simplex::branch_and_bound_t::queue_external_solution_deterministic; ensure the callback signature matches the non-deterministic std::bind (or adapts the solution vector to the deterministic queue) and that subsequent logic that discards non-BB incumbents (the code around set_new_solution and the discard at lines ~518-528) still sees these queued deterministic incumbents as valid.cpp/src/mip_heuristics/diversity/diversity_manager.cu (1)
127-145:⚠️ Potential issue | 🔴 CriticalInitialize
gpu_heur_loop.deterministicbefore theCUOPT_CONFIG_IDearly returns.When
CUOPT_CONFIG_IDis unset, Line 128 returns before Lines 143-144 run, so the normal constructor path never propagatessettings.determinism_modeintocontext.gpu_heur_loop. Every latertermination_checker_t(context.gpu_heur_loop, ...)child then behaves as non-deterministic unless that unrelated env var happens to be present.🐛 Proposed fix
+ context.gpu_heur_loop.deterministic = + (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS); + const char* env_config_id_raw = std::getenv("CUOPT_CONFIG_ID"); if (env_config_id_raw == nullptr) { return; } @@ - context.gpu_heur_loop.deterministic = - (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 127 - 145, The code currently returns early when CUOPT_CONFIG_ID is unset before initializing context.gpu_heur_loop.deterministic, so ensure you set context.gpu_heur_loop.deterministic = (context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS) before any early returns related to CUOPT_CONFIG_ID; move that assignment above the getenv/parse logic (or add a pre-check initialization) so deterministic is always propagated regardless of the env var handling in this function.
🧹 Nitpick comments (13)
cpp/include/cuopt/linear_programming/constants.h (1)
114-139: Consider adding Doxygen documentation for new public constants.These new determinism flags and solution origin constants are part of the public C API. While the naming is reasonably self-documenting, adding Doxygen-style documentation would help users understand the semantics (especially for the bitmask usage pattern with
|for combining flags).As per coding guidelines: "For public header files (C++ API): Check if new public functions/classes have documentation comments (Doxygen format)".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/constants.h` around lines 114 - 139, Add Doxygen-style documentation above the new public constants to explain their purpose and usage: document the determinism bitmask block (CUOPT_DETERMINISM_NONE, CUOPT_DETERMINISM_BB, CUOPT_DETERMINISM_GPU_HEURISTICS, CUOPT_DETERMINISM_FULL and the alias modes CUOPT_MODE_OPPORTUNISTIC, CUOPT_MODE_DETERMINISTIC, CUOPT_MODE_DETERMINISTIC_BB, CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS) with a brief description and note that flags can be combined with '|' and what each flag enables; also document the MIP solution origin constants (CUOPT_MIP_SOLUTION_ORIGIN_UNKNOWN through CUOPT_MIP_SOLUTION_ORIGIN_PRESOLVE) with short descriptions of each origin value and intended consumers, using Doxygen comment style (/** ... */) so these appear in generated API docs.cpp/tests/mip/mip_utils.cuh (1)
16-16: Remove unused include.The
<pdlp/initial_scaling_strategy/initial_scaling.cuh>include is not used in this file. Test files that require this header include it directly themselves, so removing this include will not affect their compilation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/mip_utils.cuh` at line 16, Remove the unused include directive for <pdlp/initial_scaling_strategy/initial_scaling.cuh> from the top of mip_utils.cuh; this header is not referenced in this file and tests that need it include it themselves, so simply delete the `#include` line to eliminate the unused dependency.cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu (1)
206-212: Clamp the nested FJ time budget before callingfj.solve().This now forwards
timer.remaining_time()into the child solve before checkingtimer.check_time_limit(). Ifremaining_time()can go negative once the parent budget is exhausted, the nested FJ gets an invalid limit instead of bailing out immediately.Proposed change
- fj.settings.time_limit = std::min(1., timer.remaining_time()); - is_feasible = fj.solve(solution); + if (timer.check_time_limit()) { break; } + fj.settings.time_limit = + std::max<f_t>(0, std::min<f_t>(1., timer.remaining_time())); + is_feasible = fj.solve(solution);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu` around lines 206 - 212, Clamp or bail out on the parent timer before invoking the nested FJ: first call timer.check_time_limit() and if it indicates the budget is exhausted, skip or return instead of calling fj.solve(); otherwise compute fj.settings.time_limit by clamping timer.remaining_time() to a non-negative value (e.g., max(0., min(1., timer.remaining_time()))) and then call fj.solve(solution). Update the logic around fj.settings.time_limit, timer.remaining_time(), timer.check_time_limit(), and fj.solve() accordingly.cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh (1)
87-88: Address the TODO: work calculation uses unexplained magic constant.The work value is calculated as
n_different_vars / 1e8with aTODO: CHANGEcomment. This suggests the formula is provisional. Consider documenting the rationale or implementing the intended calculation before merging.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh` around lines 87 - 88, The work calculation currently uses a magic constant ("double work = static_cast<double>(n_different_vars) / 1e8;") and has a TODO; replace this provisional formula with a deterministic, documented computation or a configurable scale factor: compute work relative to a meaningful metric (e.g., normalize by the total number of variables or problem size, or multiply n_different_vars by a clearly named constant like WORK_SCALE read from configuration), update the variable name if needed, and add a short comment explaining the rationale; search for the symbols n_different_vars and work in line_segment_recombiner.cuh (around the existing TODO) to locate the change.cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu (1)
74-97: Technical debt: Iteration estimation uses hard-coded magic constants.The predictor model at lines 82-85 uses unexplained coefficients (
313,200,400,600,7100). The multiple TODO comments acknowledge this is a temporary hack. Consider tracking this as technical debt to replace with an actual predictor model.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu` around lines 74 - 97, The iteration estimator in the determinism_mode block uses hard-coded magic numbers (313, 200, 400, 600, 7100) inside the loop that computes estim_ms; refactor by extracting these coefficients into clearly named constants or a small config struct (e.g., kBaseMs, kPerVariableMs, kPerConstraintMs, kPerCoefficientMs, kPerIterMs) and replace the literals in the computation, add a TODO with a task/issue reference to replace the heuristic with a real predictor, and ensure the constants are documented so future work on pdlp_settings.iteration_limit (and the use of op_problem, estim_iters, and settings.work_limit) can find and update the model easily.cpp/src/mip_heuristics/solver_context.cuh (1)
72-74: Document the lifecycle requirement for the termination pointer.The
terminationpointer is initialized tonullptrand must be set bymip_solver_tafter construction. Consider adding a comment about when it's valid to access, or add a getter that asserts non-null.📝 Suggested documentation
- // Root termination checker — set by mip_solver_t after construction. - // All sub-timers should use this as parent for wall-clock safety. - cuopt::termination_checker_t* termination{nullptr}; + // Root termination checker — set by mip_solver_t after construction. + // All sub-timers should use this as parent for wall-clock safety. + // IMPORTANT: Must not be dereferenced until set by the solver. + cuopt::termination_checker_t* termination{nullptr};🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solver_context.cuh` around lines 72 - 74, Add a short lifecycle note for the cuopt::termination_checker_t* termination member in solver_context.cuh stating it is initialized to nullptr and must be set by mip_solver_t immediately after construction before any access; alternatively implement and use a getter (e.g., termination_checked() or get_termination()) that asserts or throws if termination is null to enforce validity at call sites (reference the termination member and mip_solver_t initialization responsibility).cpp/tests/mip/feasibility_jump_tests.cu (1)
193-198: Static map may retain stale values across parameterized test invocations.The
static std::unordered_map<std::string, double> first_val_mappersists for the lifetime of the process. If the test suite runs multiple times or if instances are reused with different configurations, stale values could cause false positives/negatives. Consider clearing the map at the start ofTEST(mip_solve, feasibility_jump_determinism)or using a non-static approach.♻️ Suggested approach
TEST(mip_solve, feasibility_jump_determinism) { cuopt::init_logger_t log("", true); + // Clear any stale values from previous test runs + // (Move first_val_map to be local to the test or pass expected values explicitly) int seed = std::getenv("CUOPT_SEED") ? std::stoi(std::getenv("CUOPT_SEED")) : std::random_device{}();Alternatively, pass the expected "gold" objective as a parameter to
run_fj_check_determinisminstead of using a static map.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/feasibility_jump_tests.cu` around lines 193 - 198, The static first_val_map in TEST(mip_solve, feasibility_jump_determinism) can retain values between test runs and cause stale results; change it to not be static (make it a local variable) or explicitly clear it at the start of the TEST, or better yet refactor run_fj_check_determinism to accept the expected "gold" objective as a parameter and remove reliance on first_val_map so determinism checks use per-invocation state rather than a process-global static map.cpp/tests/mip/local_search_test.cu (2)
59-66: Consider extractinginit_handlerto a shared test utility.This function is duplicated between
local_search_test.cuandfeasibility_jump_tests.cu. Consider moving it to a shared header likemip_utils.cuhto reduce duplication.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/local_search_test.cu` around lines 59 - 66, The function init_handler is duplicated across tests; extract it into a shared test utility header to remove duplication by moving the implementation of init_handler (the RAFT_CUBLAS_TRY/RAFT_CUSPARSE_TRY calls that set pointer modes) into a new header (e.g., mip_utils.cuh) and include that header from both local_search_test.cu and feasibility_jump_tests.cu, updating those tests to use the shared init_handler declaration; ensure the header declares the same signature: void init_handler(const raft::handle_t* handle_ptr) and that any needed RAFT includes are present so the RAFT_CUBLAS_TRY/RAFT_CUSPARSE_TRY macros and raft::handle_t are available.
212-214: Seed conversion may truncate on platforms whereunsigned longdiffers fromint.
std::stoireturnsint, which is then cast tounsigned long. On platforms whereintis 32-bit andunsigned longis 64-bit, this is fine, but the explicit cast makes the conversion clearer. Consider usingstd::stouldirectly for consistency.♻️ Suggested fix
- unsigned long seed = std::getenv("CUOPT_SEED") - ? (unsigned long)std::stoi(std::getenv("CUOPT_SEED")) - : (unsigned long)std::random_device{}(); + unsigned long seed = std::getenv("CUOPT_SEED") + ? std::stoul(std::getenv("CUOPT_SEED")) + : static_cast<unsigned long>(std::random_device{}());🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/local_search_test.cu` around lines 212 - 214, The seed assignment uses std::stoi (returns int) then casts to unsigned long, risking truncation on platforms where unsigned long is wider; change the getenv conversion to use std::stoul (or std::stoull if you prefer unsigned long long) when initializing variable seed (the expression assigning unsigned long seed) so the environment string is parsed directly as an unsigned type and avoid narrowing casts.cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh (1)
71-72: Same TODO comment as fp_recombiner - consider addressing consistently.This is the same placeholder work calculation (
n_vars_from_other / 1e8) with the same TODO comment. If this formula is finalized, remove the TODO comments across all recombiners for consistency.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh` around lines 71 - 72, The placeholder work calculation "double work = static_cast<double>(n_vars_from_other) / 1e8" with the TODO comment must be finalized and made consistent with the other recombiners: replace the magic constant with a named scaling constant (e.g., WORK_SCALE) or use the finalized formula used by fp_recombiner, update the calculation in sub_mip (symbol: work / variable n_vars_from_other) accordingly, and remove the TODO comment so all recombiners share the same clear implementation and constant.cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh (1)
68-69: Address or remove the TODO comment.The
// TODO: CHANGEcomment suggests this work calculation (n_vars_from_other / 1e8) is a placeholder. If this is the intended formula, remove the TODO; otherwise, clarify what needs to change.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines 68 - 69, The TODO comment above the work calculation is a placeholder and must be resolved: either remove the "// TODO: CHANGE" if the intended formula is work = static_cast<double>(n_vars_from_other) / 1e8; or replace the expression with the correct scaling/normalization (e.g., use an appropriately named constant or function) and update the comment to explain the rationale; locate the code that defines the variable work and the input n_vars_from_other in fp_recombiner.cuh (the work variable assignment) and modify it accordingly so the code has no TODO and the formula is explicit and documented.cpp/src/mip_heuristics/diversity/population.cuh (1)
106-117: Add direct include oftermination_checker.hppto this header.
population_tnow exposescuopt::termination_checker_tin its public API (line 127) and stores it as a member (line 216), but this header does not directly include<utilities/termination_checker.hpp>. Instead, it relies on a transitive include from<mip_heuristics/solver.cuh>, which is fragile and can break on include-order changes.Proposed fix
`#include` "assignment_hash_map.cuh" `#include` "population.cuh" `#include` <mip_heuristics/solution/solution.cuh> `#include` <mip_heuristics/solver.cuh> +#include <utilities/termination_checker.hpp> `#include` <utilities/timer.hpp>🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/population.cuh` around lines 106 - 117, population_t exposes and stores cuopt::termination_checker_t but doesn't directly include termination_checker.hpp; add a direct include of "utilities/termination_checker.hpp" to the top of this header so the type is always available regardless of transitive includes. Update the include list in cpp/src/mip_heuristics/diversity/population.cuh to `#include` "utilities/termination_checker.hpp" (ensuring quotes/angle style consistent with project conventions) so symbols like population_t and cuopt::termination_checker_t resolve without relying on mip_heuristics/solver.cuh.cpp/src/branch_and_bound/branch_and_bound.cpp (1)
2404-2406: Consider adding tracking issue for clique table determinism.The clique table is explicitly disabled in deterministic mode with a TODO comment. This is a safe approach, but consider creating a tracking issue to ensure this gets addressed. Clique cuts can be valuable for MIP performance.
Would you like me to open an issue to track enabling clique tables in deterministic mode?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2404 - 2406, Add a tracking issue for enabling clique tables in deterministic mode and reference it from the TODO: create an issue in the repo that describes enabling clique tables (affecting settings_.clique_cuts, clique_table_, settings_.deterministic, and behavior around signal_extend_cliques_.store), then update the TODO comment next to the block that disables clique tables to include the issue number/URL and brief acceptance criteria (e.g., deterministic behavior preserved and tests added) so future work is discoverable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benchmarks/linear_programming/cuopt/run_mip.cpp`:
- Around line 150-166: The code uses std::chrono::high_resolution_clock for
elapsed-time measurements which can be non-monotonic; update all timing code to
use std::chrono::steady_clock instead: change the constructor parameter type and
the stored member start_time_ in incumbent_tracker_t to
steady_clock::time_point, update the get_solution method where now is obtained
to use steady_clock::now() and the wall_s calculation to use steady_clock
durations, and also replace any uses of start_run_solver assignment and final
end calculation to use steady_clock (e.g., start_run_solver =
std::chrono::steady_clock::now() and compute end with steady_clock::now()) so
all time capture is monotonic and consistent.
In `@ci/compute-sanitizer-suppressions.xml`:
- Around line 70-79: The Initcheck suppression for DeviceSegmentedReduceKernel
is too broad—add a hostStack constraint to the same <record> to limit the
suppression to the specific host call path that launches this kernel. Edit the
<record> for DeviceSegmentedReduceKernel and add a <hostStack> element
containing the precise <frame> entries (function names/frames) that correspond
to the host-side launcher (e.g., the function that invokes the kernel or its
caller chain) so the suppression only applies when that stack trace is present.
- Around line 210-222: The suppression is too broad because it matches any
cuMemcpyDtoDAsync calls in libcuda.so; narrow it by constraining the hostStack
to the known benign call path — add one or more additional <frame> entries after
the cuMemcpyDtoDAsync frame that match your project's wrapper/function and
module (e.g., the host function that invokes the device-to-device copy and your
binary's module name) so the record only suppresses cuMemcpyDtoDAsync when
called through that documented, stable path instead of via a blanket regex.
In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp`:
- Around line 26-68: The new public API additions (mip_solution_origin_t,
mip_solution_callback_info_t, base_solution_callback_type including
GET_SOLUTION_EXT, and the new get_solution_callback_ext_t) lack Doxygen comments
describing semantics, lifetime/ownership, threading/GPU expectations, and
migration from GET_SOLUTION; add Doxygen blocks above each public symbol (enum
mip_solution_origin_t, typedef/using mip_solution_callback_info_t, enum
base_solution_callback_type, and the get_solution_callback_ext_t callback type)
that clearly state what each value/field means, who owns referenced data,
whether callbacks may be invoked on GPU/worker threads or require
host-synchronization, reentrancy expectations, required numerical ranges/units,
and a short migration note explaining differences from GET_SOLUTION and how to
adopt GET_SOLUTION_EXT safely.
In `@cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh`:
- Around line 113-116: The determinism log is hashing the whole device buffer
(this->remaining_indices) including stale tail entries; change the compute_hash
call to hash only the live prefix (the first remaining_variables entries) by
passing a span/slice limited to remaining_variables instead of the full buffer;
apply the same fix for the other log that should hash only n_vars_from_other
entries (lines ~198-203). Locate the CUOPT_DETERMINISM_LOG calls that use
detail::compute_hash(make_span(this->remaining_indices),
a.handle_ptr->get_stream()) and replace the span/input to compute_hash so it
covers only the valid element count (remaining_variables or n_vars_from_other)
while preserving the stream argument.
- Around line 257-269: The log references disable_submip_for_determinism which
is not defined; add a local boolean named disable_submip_for_determinism
(similar to disable_submip_for_continuous_limit and
disable_fp_and_submip_for_expensive_fix) and initialize it from the appropriate
config/condition used for deterministic sub-MIP disabling (for example from
sub_mip_recombiner_config_t or the same predicate used elsewhere in
recombiner.cuh), then use that variable in the CUOPT_DETERMINISM_LOG call so the
identifier is defined and accurately reflects the configured/derived state.
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 1319-1325: The stage-two budget scaling incorrectly multiplies
already-reduced settings (time_limit, work_limit, iteration_limit) by
rounding_second_stage_split a second time; change the logic so stage-two budgets
are computed from the original totals (e.g., capture
orig_time_limit/orig_work_limit/orig_iteration_limit before the first ROUNDING
rewrite) and set stage-two values to orig_* * rounding_second_stage_split (and
for deterministic work_limit use orig_work_limit), rather than multiplying the
already-reduced settings; apply the same fix to the other occurrence around
lines 1364-1369.
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 47-58: The code currently only uses cuopt_assert in
read_positive_work_unit_scale (and the similar block at 1333-1341) which is
compiled out in release builds, allowing zero/negative/NaN scales to propagate;
change read_positive_work_unit_scale to perform a runtime validation (check
errno, end_ptr, isfinite(parsed_value) and parsed_value > 0.0) and when invalid
log an error/warning and return a safe default (e.g., 1.0) instead of asserting;
also add the same runtime validation where
context.settings.cpufj_work_unit_scale is set (so it does not bypass the
helper), ensuring any externally-sourced scale is checked and replaced with the
default or rejected at runtime in release builds.
In `@cpp/src/mip_heuristics/presolve/third_party_presolve.cpp`:
- Around line 544-565: Update the placeholder commit reference in the comment
above the SequentialPresolveWrapper class: replace the `<>` in the line "// This
is necessary due to a bug in commit <>" with the actual commit hash (or remove
the commit reference entirely) so the rationale for forcing TBB sequential mode
is traceable; the comment to edit is immediately above the template<class
Presolver, typename REAL> class SequentialPresolveWrapper and the execute(...)
override that wraps Presolver<REAL>::execute.
In `@cpp/src/mip_heuristics/problem/presolve_data.cu`:
- Around line 236-245: The code may read stale data when handle_override !=
problem.handle_ptr because you only call problem.handle_ptr->sync_stream() but
then perform cuopt::host_copy(...) into h->get_stream(); update the sequence so
the stream that last wrote assignment and the target stream h->get_stream() are
synchronized before host_copy (e.g., ensure both
problem.handle_ptr->sync_stream() and h->sync_stream() or insert proper
cross-stream synchronization/ CUDA event-based sync) so h_assignment sees
up-to-date data before calling papilo_presolve_ptr->uncrush_primal_solution;
alternatively document that callers must guarantee assignment is ready on
h->get_stream().
In `@cpp/src/mip_heuristics/problem/problem.cu`:
- Around line 1433-1438: The code performs a duplicate check on var_indices but
doesn't handle the case when var_indices is empty before subsequent code indexes
var_indices[0]; add an early no-op return or an assertion for non-emptiness
immediately before this block (e.g., if (var_indices.empty()) return;) so that
later code that accesses var_indices[0] is safe; reference the existing symbols
var_indices and sorted_vi (and the duplicate check using std::adjacent_find)
when applying the change.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh`:
- Around line 15-17: This header is missing <cmath> causing use of std::isfinite
and std::abs in solution_callbacks.cuh to rely on transitive includes; make the
header self-contained by adding `#include` <cmath> to the top-of-file includes so
functions/methods that call std::isfinite and std::abs (and any other math
helpers used later in this file) are declared without depending on upstream
headers.
In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 167-190: The early_incumbent_callback_t lambda currently calls
detail::dispatch_get_solution_callbacks directly from the early CPUFJ thread
(bypassing solver.run_solver()'s publication path), so change the lambda to
route new incumbents through the existing solution publication mechanism:
instead of calling dispatch_get_solution_callbacks from inside the lambda,
invoke the solution publication API (e.g.,
solution_publication_t::publish_new_best_feasible or the solver's publish
wrapper) passing the uncrushed user_assignment, user_obj and origin so that
publication acquires solution_callback_mutex_ and performs the shared incumbent
handoff under that same lock; also move the updates to
ctx_ptr->initial_incumbent_assignment and ctx_ptr->initial_upper_bound to occur
under the publication/mutex path so the shared state is synchronized
consistently with published callbacks.
In `@cpp/src/pdlp/initial_scaling_strategy/initial_scaling.cu`:
- Around line 789-795: The new overloads accept a stream_override but the
single-arg unscale_solutions(solution_t&) still calls the old delegate path
without passing through the stream override, causing cross-stream ordering
issues; update the unscale_solutions overload that takes solution_t (and any
other overload that constructs a temporary dummy buffer) to forward the
effective stream (use the provided stream_override when non-null or the
solution.stream()/stream_view_) into the lower-level
unscale_solutions(rmm::device_uvector<f_t>& solution, rmm::device_uvector<f_t>&
s, rmm::device_uvector<f_t>& dummy, cudaStream_t stream_override) so all paths
use the same stream.
- Around line 693-695: The ternary using stream_override conflates nullptr and
the CUDA default stream value 0; update the API and call site to make the
override explicit (e.g., change the parameter type from raw cudaStream_t to
std::optional<cudaStream_t> or use a dedicated sentinel) and then construct
rmm::cuda_stream_view accordingly in the initialization of stream (replace the
expression using stream_override with a branch that checks optional.has_value()
or sentinel presence). Modify the function signature and usages that pass
stream_override and ensure stream_view_ remains the fallback when no explicit
override is provided.
In `@cpp/src/utilities/termination_checker.hpp`:
- Around line 185-193: Revert the change to elapsed_time(): keep it returning
wall-clock seconds by using timer.elapsed_time() regardless of deterministic
mode, and add a new accessor (e.g., elapsed_work_units() or
work_units_elapsed()) that returns work_context->current_work() -
work_units_at_start when deterministic (and 0.0 if no work_context); update only
this header by restoring elapsed_time() to call timer.elapsed_time() and adding
the new work-based method that uses deterministic, work_context, and
work_units_at_start so callers that need work units can opt-in without changing
existing time semantics.
In `@cpp/src/utilities/work_limit_context.hpp`:
- Around line 60-62: The environment scale is being stored in
producer_progress_scale but record_work_sync_on_horizon() (and the
scheduler/termination accounting) uses work_unit_scale which remains 1.0 and
attach_producer_sync() logs the wrong field; fix by making the env knob affect
the actual work-budget scale: either assign work_unit_scale =
read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0) (and
keep producer_progress_scale only if you intentionally need a separate
producer-only factor), or update record_work_sync_on_horizon() to multiply by
producer_progress_scale if the env is intended to be producer-only; also update
attach_producer_sync() logging to print the correct variable
(producer_progress_scale vs work_unit_scale) and apply the same change where
similar code appears (the other occurrences around the other blocks noted).
In `@cpp/tests/mip/presolve_test.cu`:
- Around line 139-145: The test currently only serializes
a.second[0].var_to_cached_bound_map into sorted_map and hashes that, missing any
nondeterminism in other probing-cache branches; update the logic to iterate over
all entries in a.second (not just index 0) and include each entry's
var_to_cached_bound_map contents in the deterministic serialization — e.g., for
each element in a.second, extract its var_to_cached_bound_map, insert its
key/value pairs into the combined sorted_map (or process each map in turn while
appending to var_to_cached_bound_keys, var_to_cached_bound_lb,
var_to_cached_bound_ub) so both probing-cache branches are included in the hash
(references: a.second, var_to_cached_bound_map, sorted_map,
var_to_cached_bound_keys, var_to_cached_bound_lb, var_to_cached_bound_ub).
- Around line 98-118: The new determinism helpers require calling init_handler()
to initialize cuBLAS/cuSparse pointer-mode; update
test_probing_cache_determinism (and similarly test_scaling_determinism) to
invoke init_handler() immediately after constructing the raft::handle_t handle_
so the tests do not rely on prior global library state — locate the handle_
construction in test_probing_cache_determinism and call init_handler() before
any work with the handle (e.g., before handle_.sync_stream() and before creating
solver/context) to ensure deterministic, order-independent behavior.
---
Outside diff comments:
In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu`:
- Around line 127-145: The code currently returns early when CUOPT_CONFIG_ID is
unset before initializing context.gpu_heur_loop.deterministic, so ensure you set
context.gpu_heur_loop.deterministic = (context.settings.determinism_mode &
CUOPT_DETERMINISM_GPU_HEURISTICS) before any early returns related to
CUOPT_CONFIG_ID; move that assignment above the getenv/parse logic (or add a
pre-check initialization) so deterministic is always propagated regardless of
the env var handling in this function.
In `@cpp/src/mip_heuristics/solver.cu`:
- Around line 445-460: The deterministic branch leaves
context.problem_ptr->branch_and_bound_callback unset so GPU-heuristic incumbents
are dropped; set the callback in the else branch to call
branch_and_bound->queue_external_solution_deterministic(...) (or an equivalent
ingress that includes origin/timestamp) similar to the commented lambda so
heuristic solutions reach
dual_simplex::branch_and_bound_t::queue_external_solution_deterministic; ensure
the callback signature matches the non-deterministic std::bind (or adapts the
solution vector to the deterministic queue) and that subsequent logic that
discards non-BB incumbents (the code around set_new_solution and the discard at
lines ~518-528) still sees these queued deterministic incumbents as valid.
---
Nitpick comments:
In `@cpp/include/cuopt/linear_programming/constants.h`:
- Around line 114-139: Add Doxygen-style documentation above the new public
constants to explain their purpose and usage: document the determinism bitmask
block (CUOPT_DETERMINISM_NONE, CUOPT_DETERMINISM_BB,
CUOPT_DETERMINISM_GPU_HEURISTICS, CUOPT_DETERMINISM_FULL and the alias modes
CUOPT_MODE_OPPORTUNISTIC, CUOPT_MODE_DETERMINISTIC, CUOPT_MODE_DETERMINISTIC_BB,
CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS) with a brief description and note that
flags can be combined with '|' and what each flag enables; also document the MIP
solution origin constants (CUOPT_MIP_SOLUTION_ORIGIN_UNKNOWN through
CUOPT_MIP_SOLUTION_ORIGIN_PRESOLVE) with short descriptions of each origin value
and intended consumers, using Doxygen comment style (/** ... */) so these appear
in generated API docs.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2404-2406: Add a tracking issue for enabling clique tables in
deterministic mode and reference it from the TODO: create an issue in the repo
that describes enabling clique tables (affecting settings_.clique_cuts,
clique_table_, settings_.deterministic, and behavior around
signal_extend_cliques_.store), then update the TODO comment next to the block
that disables clique tables to include the issue number/URL and brief acceptance
criteria (e.g., deterministic behavior preserved and tests added) so future work
is discoverable.
In `@cpp/src/mip_heuristics/diversity/population.cuh`:
- Around line 106-117: population_t exposes and stores
cuopt::termination_checker_t but doesn't directly include
termination_checker.hpp; add a direct include of
"utilities/termination_checker.hpp" to the top of this header so the type is
always available regardless of transitive includes. Update the include list in
cpp/src/mip_heuristics/diversity/population.cuh to `#include`
"utilities/termination_checker.hpp" (ensuring quotes/angle style consistent with
project conventions) so symbols like population_t and
cuopt::termination_checker_t resolve without relying on
mip_heuristics/solver.cuh.
In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh`:
- Around line 68-69: The TODO comment above the work calculation is a
placeholder and must be resolved: either remove the "// TODO: CHANGE" if the
intended formula is work = static_cast<double>(n_vars_from_other) / 1e8; or
replace the expression with the correct scaling/normalization (e.g., use an
appropriately named constant or function) and update the comment to explain the
rationale; locate the code that defines the variable work and the input
n_vars_from_other in fp_recombiner.cuh (the work variable assignment) and modify
it accordingly so the code has no TODO and the formula is explicit and
documented.
In `@cpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuh`:
- Around line 87-88: The work calculation currently uses a magic constant
("double work = static_cast<double>(n_different_vars) / 1e8;") and has a TODO;
replace this provisional formula with a deterministic, documented computation or
a configurable scale factor: compute work relative to a meaningful metric (e.g.,
normalize by the total number of variables or problem size, or multiply
n_different_vars by a clearly named constant like WORK_SCALE read from
configuration), update the variable name if needed, and add a short comment
explaining the rationale; search for the symbols n_different_vars and work in
line_segment_recombiner.cuh (around the existing TODO) to locate the change.
In `@cpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuh`:
- Around line 71-72: The placeholder work calculation "double work =
static_cast<double>(n_vars_from_other) / 1e8" with the TODO comment must be
finalized and made consistent with the other recombiners: replace the magic
constant with a named scaling constant (e.g., WORK_SCALE) or use the finalized
formula used by fp_recombiner, update the calculation in sub_mip (symbol: work /
variable n_vars_from_other) accordingly, and remove the TODO comment so all
recombiners share the same clear implementation and constant.
In
`@cpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cu`:
- Around line 206-212: Clamp or bail out on the parent timer before invoking the
nested FJ: first call timer.check_time_limit() and if it indicates the budget is
exhausted, skip or return instead of calling fj.solve(); otherwise compute
fj.settings.time_limit by clamping timer.remaining_time() to a non-negative
value (e.g., max(0., min(1., timer.remaining_time()))) and then call
fj.solve(solution). Update the logic around fj.settings.time_limit,
timer.remaining_time(), timer.check_time_limit(), and fj.solve() accordingly.
In `@cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu`:
- Around line 74-97: The iteration estimator in the determinism_mode block uses
hard-coded magic numbers (313, 200, 400, 600, 7100) inside the loop that
computes estim_ms; refactor by extracting these coefficients into clearly named
constants or a small config struct (e.g., kBaseMs, kPerVariableMs,
kPerConstraintMs, kPerCoefficientMs, kPerIterMs) and replace the literals in the
computation, add a TODO with a task/issue reference to replace the heuristic
with a real predictor, and ensure the constants are documented so future work on
pdlp_settings.iteration_limit (and the use of op_problem, estim_iters, and
settings.work_limit) can find and update the model easily.
In `@cpp/src/mip_heuristics/solver_context.cuh`:
- Around line 72-74: Add a short lifecycle note for the
cuopt::termination_checker_t* termination member in solver_context.cuh stating
it is initialized to nullptr and must be set by mip_solver_t immediately after
construction before any access; alternatively implement and use a getter (e.g.,
termination_checked() or get_termination()) that asserts or throws if
termination is null to enforce validity at call sites (reference the termination
member and mip_solver_t initialization responsibility).
In `@cpp/tests/mip/feasibility_jump_tests.cu`:
- Around line 193-198: The static first_val_map in TEST(mip_solve,
feasibility_jump_determinism) can retain values between test runs and cause
stale results; change it to not be static (make it a local variable) or
explicitly clear it at the start of the TEST, or better yet refactor
run_fj_check_determinism to accept the expected "gold" objective as a parameter
and remove reliance on first_val_map so determinism checks use per-invocation
state rather than a process-global static map.
In `@cpp/tests/mip/local_search_test.cu`:
- Around line 59-66: The function init_handler is duplicated across tests;
extract it into a shared test utility header to remove duplication by moving the
implementation of init_handler (the RAFT_CUBLAS_TRY/RAFT_CUSPARSE_TRY calls that
set pointer modes) into a new header (e.g., mip_utils.cuh) and include that
header from both local_search_test.cu and feasibility_jump_tests.cu, updating
those tests to use the shared init_handler declaration; ensure the header
declares the same signature: void init_handler(const raft::handle_t* handle_ptr)
and that any needed RAFT includes are present so the
RAFT_CUBLAS_TRY/RAFT_CUSPARSE_TRY macros and raft::handle_t are available.
- Around line 212-214: The seed assignment uses std::stoi (returns int) then
casts to unsigned long, risking truncation on platforms where unsigned long is
wider; change the getenv conversion to use std::stoul (or std::stoull if you
prefer unsigned long long) when initializing variable seed (the expression
assigning unsigned long seed) so the environment string is parsed directly as an
unsigned type and avoid narrowing casts.
In `@cpp/tests/mip/mip_utils.cuh`:
- Line 16: Remove the unused include directive for
<pdlp/initial_scaling_strategy/initial_scaling.cuh> from the top of
mip_utils.cuh; this header is not referenced in this file and tests that need it
include it themselves, so simply delete the `#include` line to eliminate the
unused dependency.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a5f87b60-b5ab-4df7-8bfc-b0664aaff097
📒 Files selected for processing (116)
benchmarks/linear_programming/cuopt/run_mip.cppci/compute-sanitizer-suppressions.xmlcpp/include/cuopt/linear_programming/constants.hcpp/include/cuopt/linear_programming/cuopt_c.hcpp/include/cuopt/linear_programming/mip/solver_settings.hppcpp/include/cuopt/linear_programming/utilities/internals.hppcpp/src/branch_and_bound/branch_and_bound.cppcpp/src/branch_and_bound/branch_and_bound.hppcpp/src/branch_and_bound/deterministic_workers.hppcpp/src/branch_and_bound/pseudo_costs.cppcpp/src/branch_and_bound/pseudo_costs.hppcpp/src/dual_simplex/basis_updates.cppcpp/src/dual_simplex/bound_flipping_ratio_test.cppcpp/src/dual_simplex/bound_flipping_ratio_test.hppcpp/src/dual_simplex/phase2.cppcpp/src/dual_simplex/simplex_solver_settings.hppcpp/src/math_optimization/solver_settings.cucpp/src/mip_heuristics/diversity/diversity_config.hppcpp/src/mip_heuristics/diversity/diversity_manager.cucpp/src/mip_heuristics/diversity/diversity_manager.cuhcpp/src/mip_heuristics/diversity/lns/rins.cucpp/src/mip_heuristics/diversity/multi_armed_bandit.cuhcpp/src/mip_heuristics/diversity/population.cucpp/src/mip_heuristics/diversity/population.cuhcpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hppcpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuhcpp/src/mip_heuristics/diversity/weights.cuhcpp/src/mip_heuristics/early_heuristic.cuhcpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuhcpp/src/mip_heuristics/feasibility_jump/early_gpufj.cucpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cucpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cucpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cuhcpp/src/mip_heuristics/feasibility_jump/fj_cpu.cucpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuhcpp/src/mip_heuristics/feasibility_jump/load_balancing.cuhcpp/src/mip_heuristics/feasibility_jump/utils.cuhcpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cucpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuhcpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cucpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cuhcpp/src/mip_heuristics/local_search/local_search.cucpp/src/mip_heuristics/local_search/local_search.cuhcpp/src/mip_heuristics/local_search/rounding/bounds_repair.cucpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuhcpp/src/mip_heuristics/local_search/rounding/constraint_prop.cucpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuhcpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cucpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuhcpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cucpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuhcpp/src/mip_heuristics/local_search/rounding/simple_rounding.cucpp/src/mip_heuristics/local_search/rounding/simple_rounding_kernels.cuhcpp/src/mip_heuristics/mip_constants.hppcpp/src/mip_heuristics/presolve/bounds_presolve.cucpp/src/mip_heuristics/presolve/bounds_presolve.cuhcpp/src/mip_heuristics/presolve/bounds_update_data.cucpp/src/mip_heuristics/presolve/lb_probing_cache.cucpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cucpp/src/mip_heuristics/presolve/load_balanced_bounds_presolve.cuhcpp/src/mip_heuristics/presolve/multi_probe.cucpp/src/mip_heuristics/presolve/multi_probe.cuhcpp/src/mip_heuristics/presolve/probing_cache.cucpp/src/mip_heuristics/presolve/probing_cache.cuhcpp/src/mip_heuristics/presolve/third_party_presolve.cppcpp/src/mip_heuristics/presolve/third_party_presolve.hppcpp/src/mip_heuristics/presolve/utils.cuhcpp/src/mip_heuristics/problem/presolve_data.cucpp/src/mip_heuristics/problem/presolve_data.cuhcpp/src/mip_heuristics/problem/problem.cucpp/src/mip_heuristics/problem/problem.cuhcpp/src/mip_heuristics/problem/problem_fixing.cuhcpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cucpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cuhcpp/src/mip_heuristics/solution/solution.cucpp/src/mip_heuristics/solution/solution.cuhcpp/src/mip_heuristics/solution_callbacks.cuhcpp/src/mip_heuristics/solve.cucpp/src/mip_heuristics/solver.cucpp/src/mip_heuristics/solver.cuhcpp/src/mip_heuristics/solver_context.cuhcpp/src/mip_heuristics/solver_solution.cucpp/src/pdlp/cuopt_c.cppcpp/src/pdlp/initial_scaling_strategy/initial_scaling.cucpp/src/pdlp/initial_scaling_strategy/initial_scaling.cuhcpp/src/utilities/copy_helpers.hppcpp/src/utilities/cuda_helpers.cuhcpp/src/utilities/determinism_log.hppcpp/src/utilities/seed_generator.cucpp/src/utilities/seed_generator.cuhcpp/src/utilities/termination_checker.hppcpp/src/utilities/timer.hppcpp/src/utilities/work_limit_context.hppcpp/src/utilities/work_unit_scheduler.cppcpp/src/utilities/work_unit_scheduler.hppcpp/tests/CMakeLists.txtcpp/tests/mip/CMakeLists.txtcpp/tests/mip/determinism_test.cucpp/tests/mip/determinism_utils.cuhcpp/tests/mip/diversity_test.cucpp/tests/mip/feasibility_jump_tests.cucpp/tests/mip/incumbent_callback_test.cucpp/tests/mip/load_balancing_test.cucpp/tests/mip/local_search_test.cucpp/tests/mip/mip_utils.cuhcpp/tests/mip/multi_probe_test.cucpp/tests/mip/presolve_test.cudatasets/mip/download_miplib_test_dataset.shskills/cuopt-developer/SKILL.md
| incumbent_tracker_t(std::chrono::high_resolution_clock::time_point start_time) | ||
| : start_time_(start_time) | ||
| { | ||
| } | ||
|
|
||
| void get_solution(void* data, | ||
| void* cost, | ||
| void* solution_bound, | ||
| const cuopt::internals::mip_solution_callback_info_t* info, | ||
| void* user_data) override | ||
| { | ||
| double obj = *static_cast<double*>(cost); | ||
| double wt = (info != nullptr) ? info->work_timestamp : -1.0; | ||
| auto origin = (info != nullptr) ? (cuopt::internals::mip_solution_origin_t)info->origin | ||
| : cuopt::internals::mip_solution_origin_t::UNKNOWN; | ||
| auto now = std::chrono::high_resolution_clock::now(); | ||
| double wall_s = std::chrono::duration<double>(now - start_time_).count(); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find the file mentioned in the review
find . -name "run_mip.cpp" -path "*/cuopt/*" 2>/dev/nullRepository: aliceb-nv/cuopt-fork
Length of output: 113
🏁 Script executed:
# Search for all occurrences of high_resolution_clock in the repository
rg "high_resolution_clock" --type cpp 2>/dev/null || trueRepository: aliceb-nv/cuopt-fork
Length of output: 1762
🏁 Script executed:
# Search for start_run_solver
rg "start_run_solver" --type cpp 2>/dev/null || trueRepository: aliceb-nv/cuopt-fork
Length of output: 433
🏁 Script executed:
# Get line count to understand file size
wc -l benchmarks/linear_programming/cuopt/run_mip.cppRepository: aliceb-nv/cuopt-fork
Length of output: 115
🏁 Script executed:
# Read the relevant sections mentioned in the review (150-190)
sed -n '145,195p' benchmarks/linear_programming/cuopt/run_mip.cppRepository: aliceb-nv/cuopt-fork
Length of output: 1964
🏁 Script executed:
# Find and show context around start_run_solver
rg -B 3 -A 5 "start_run_solver" benchmarks/linear_programming/cuopt/run_mip.cppRepository: aliceb-nv/cuopt-fork
Length of output: 1228
Use steady_clock for elapsed-time capture.
high_resolution_clock is not guaranteed to be monotonic and may alias system_clock on some implementations, causing the incumbent trace timestamps to jump backwards if system time is adjusted. steady_clock is guaranteed monotonic and is the correct choice for benchmark elapsed-time measurement. Apply this change to the constructor parameter, member variable, callback implementation, and the start_run_solver assignment and final end calculation.
Proposed changes
- incumbent_tracker_t(std::chrono::high_resolution_clock::time_point start_time)
+ incumbent_tracker_t(std::chrono::steady_clock::time_point start_time)
: start_time_(start_time)
{
}
@@
- auto now = std::chrono::high_resolution_clock::now();
+ auto now = std::chrono::steady_clock::now();
double wall_s = std::chrono::duration<double>(now - start_time_).count();
@@
- std::chrono::high_resolution_clock::time_point start_time_;
+ std::chrono::steady_clock::time_point start_time_;
@@
- auto start_run_solver = std::chrono::high_resolution_clock::now();
+ auto start_run_solver = std::chrono::steady_clock::now();
@@
- auto end = std::chrono::high_resolution_clock::now();
+ auto end = std::chrono::steady_clock::now();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmarks/linear_programming/cuopt/run_mip.cpp` around lines 150 - 166, The
code uses std::chrono::high_resolution_clock for elapsed-time measurements which
can be non-monotonic; update all timing code to use std::chrono::steady_clock
instead: change the constructor parameter type and the stored member start_time_
in incumbent_tracker_t to steady_clock::time_point, update the get_solution
method where now is obtained to use steady_clock::now() and the wall_s
calculation to use steady_clock durations, and also replace any uses of
start_run_solver assignment and final end calculation to use steady_clock (e.g.,
start_run_solver = std::chrono::steady_clock::now() and compute end with
steady_clock::now()) so all time capture is monotonic and consistent.
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 8 bytes</text> | ||
| <size>8</size> | ||
| </what> | ||
| <where> | ||
| <func>DeviceSegmentedReduceKernel</func> | ||
| </where> | ||
| </record> |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import xml.etree.ElementTree as ET
p = "ci/compute-sanitizer-suppressions.xml"
root = ET.parse(p).getroot()
for i, rec in enumerate(root.findall("record"), 1):
kind = (rec.findtext("kind") or "").strip()
where_func = (rec.findtext("./where/func") or "").strip()
has_host_stack = rec.find("hostStack") is not None
if kind == "Initcheck" and (not has_host_stack):
print(f"record {i}: Initcheck without hostStack, where/func={where_func or '<none>'}")
PYRepository: aliceb-nv/cuopt-fork
Length of output: 141
Add a hostStack constraint to scope this Initcheck suppression more narrowly.
The suppression for DeviceSegmentedReduceKernel in record 4 lacks a hostStack constraint, making it broad enough to mask unrelated regressions. Narrow the scope by adding the appropriate constraint.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ci/compute-sanitizer-suppressions.xml` around lines 70 - 79, The Initcheck
suppression for DeviceSegmentedReduceKernel is too broad—add a hostStack
constraint to the same <record> to limit the suppression to the specific host
call path that launches this kernel. Edit the <record> for
DeviceSegmentedReduceKernel and add a <hostStack> element containing the precise
<frame> entries (function names/frames) that correspond to the host-side
launcher (e.g., the function that invokes the kernel or its caller chain) so the
suppression only applies when that stack trace is present.
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyDtoDAsync.*</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
python - <<'PY'
import xml.etree.ElementTree as ET, re
p = "ci/compute-sanitizer-suppressions.xml"
root = ET.parse(p).getroot()
for i, rec in enumerate(root.findall("record"), 1):
if (rec.findtext("kind") or "").strip() != "InitcheckApiError":
continue
frames = rec.findall("./hostStack/frame")
funcs = [(f.findtext("func") or "").strip() for f in frames]
if any(re.search(r"cuMemcpyDtoDAsync", fn) for fn in funcs):
print(f"record {i}: cuMemcpyDtoDAsync suppression has {len(frames)} host frame(s)")
for n, fn in enumerate(funcs, 1):
print(f" frame {n}: {fn or '<module-only frame>'}")
PYRepository: aliceb-nv/cuopt-fork
Length of output: 155
🏁 Script executed:
# First, let's look at the full context of the suppression file
head -250 ci/compute-sanitizer-suppressions.xml | tail -50Repository: aliceb-nv/cuopt-fork
Length of output: 1363
🏁 Script executed:
# Check if there are any comments in the XML that explain this suppression
grep -B5 -A5 "cuMemcpyDtoDAsync" ci/compute-sanitizer-suppressions.xmlRepository: aliceb-nv/cuopt-fork
Length of output: 308
🏁 Script executed:
# Look at the git history for when this suppression was added and why
git log --oneline -n 20 -- ci/compute-sanitizer-suppressions.xmlRepository: aliceb-nv/cuopt-fork
Length of output: 84
🏁 Script executed:
# Check if this suppression is mentioned in any documentation or comments in the codebase
rg "cuMemcpyDtoDAsync" --type txt --type md -A3 -B3Repository: aliceb-nv/cuopt-fork
Length of output: 46
Constrain cuMemcpyDtoDAsync suppression to known benign call path.
This rule broadly suppresses all cuMemcpyDtoDAsync.* calls with a regex pattern, and while a comment notes that device-to-device copies are "usually harmless," the blanket regex can still hide real dataflow bugs. Narrow the suppression to a documented, stable call path in your codebase (e.g., constraint the hostStack to additional frames that uniquely identify the safe context) rather than matching any call to cuMemcpyDtoDAsync.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@ci/compute-sanitizer-suppressions.xml` around lines 210 - 222, The
suppression is too broad because it matches any cuMemcpyDtoDAsync calls in
libcuda.so; narrow it by constraining the hostStack to the known benign call
path — add one or more additional <frame> entries after the cuMemcpyDtoDAsync
frame that match your project's wrapper/function and module (e.g., the host
function that invokes the device-to-device copy and your binary's module name)
so the record only suppresses cuMemcpyDtoDAsync when called through that
documented, stable path instead of via a blanket regex.
| enum class mip_solution_origin_t : uint32_t { | ||
| UNKNOWN = CUOPT_MIP_SOLUTION_ORIGIN_UNKNOWN, | ||
| BRANCH_AND_BOUND_NODE = CUOPT_MIP_SOLUTION_ORIGIN_BRANCH_AND_BOUND, | ||
| BRANCH_AND_BOUND_DIVING = CUOPT_MIP_SOLUTION_ORIGIN_BRANCH_AND_BOUND_DIVING, | ||
| FEASIBILITY_JUMP = CUOPT_MIP_SOLUTION_ORIGIN_FEASIBILITY_JUMP, | ||
| CPU_FEASIBILITY_JUMP = CUOPT_MIP_SOLUTION_ORIGIN_CPU_FEASIBILITY_JUMP, | ||
| LOCAL_SEARCH = CUOPT_MIP_SOLUTION_ORIGIN_LOCAL_SEARCH, | ||
| LP_ROUNDING = CUOPT_MIP_SOLUTION_ORIGIN_LP_ROUNDING, | ||
| RECOMBINATION = CUOPT_MIP_SOLUTION_ORIGIN_RECOMBINATION, | ||
| SUB_MIP = CUOPT_MIP_SOLUTION_ORIGIN_SUB_MIP, | ||
| USER_INITIAL = CUOPT_MIP_SOLUTION_ORIGIN_USER_INITIAL, | ||
| USER_INJECTED = CUOPT_MIP_SOLUTION_ORIGIN_USER_INJECTED, | ||
| RINS = CUOPT_MIP_SOLUTION_ORIGIN_RINS, | ||
| PRESOLVE = CUOPT_MIP_SOLUTION_ORIGIN_PRESOLVE, | ||
| }; | ||
|
|
||
| constexpr const char* mip_solution_origin_to_string(mip_solution_origin_t origin) | ||
| { | ||
| switch (origin) { | ||
| case mip_solution_origin_t::UNKNOWN: return "unknown"; | ||
| case mip_solution_origin_t::BRANCH_AND_BOUND_NODE: return "branch_and_bound_node"; | ||
| case mip_solution_origin_t::BRANCH_AND_BOUND_DIVING: return "branch_and_bound_diving"; | ||
| case mip_solution_origin_t::FEASIBILITY_JUMP: return "feasibility_jump"; | ||
| case mip_solution_origin_t::CPU_FEASIBILITY_JUMP: return "cpu_feasibility_jump"; | ||
| case mip_solution_origin_t::LOCAL_SEARCH: return "local_search"; | ||
| case mip_solution_origin_t::LP_ROUNDING: return "lp_rounding"; | ||
| case mip_solution_origin_t::RECOMBINATION: return "recombination"; | ||
| case mip_solution_origin_t::SUB_MIP: return "sub_mip"; | ||
| case mip_solution_origin_t::USER_INITIAL: return "user_initial"; | ||
| case mip_solution_origin_t::USER_INJECTED: return "user_injected"; | ||
| case mip_solution_origin_t::RINS: return "rins"; | ||
| case mip_solution_origin_t::PRESOLVE: | ||
| return "presolve"; | ||
| // no default to trigger compiler -Werror | ||
| } | ||
| return "unknown"; | ||
| } | ||
|
|
||
| using mip_solution_callback_info_t = cuOptMIPSolutionCallbackInfo; | ||
|
|
||
| // get_solution_ext was added to support passing additional information to the get_solution callback | ||
| // without inducing a breaking ABI change | ||
| enum class base_solution_callback_type { GET_SOLUTION, GET_SOLUTION_EXT, SET_SOLUTION }; |
There was a problem hiding this comment.
Document the new public callback/origin API before release.
mip_solution_origin_t, mip_solution_callback_info_t, GET_SOLUTION_EXT, and get_solution_callback_ext_t extend the public C++ API, but they currently ship without Doxygen describing callback-info semantics, lifetime/ownership, threading expectations, or migration from GET_SOLUTION. That makes the new surface hard to consume safely from downstream code.
As per coding guidelines, cpp/include/cuopt/**/*: "For public header files (C++ API): Check if new public functions/classes have documentation comments (Doxygen format); Flag API changes that may need corresponding docs/ updates; Verify parameter descriptions match actual types/behavior; Suggest documenting thread-safety, GPU requirements, and numerical behavior."
Also applies to: 102-113
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 26
- 68, The new public API additions (mip_solution_origin_t,
mip_solution_callback_info_t, base_solution_callback_type including
GET_SOLUTION_EXT, and the new get_solution_callback_ext_t) lack Doxygen comments
describing semantics, lifetime/ownership, threading/GPU expectations, and
migration from GET_SOLUTION; add Doxygen blocks above each public symbol (enum
mip_solution_origin_t, typedef/using mip_solution_callback_info_t, enum
base_solution_callback_type, and the get_solution_callback_ext_t callback type)
that clearly state what each value/field means, who owns referenced data,
whether callbacks may be invoked on GPU/worker threads or require
host-synchronization, reentrancy expectations, required numerical ranges/units,
and a short migration note explaining differences from GET_SOLUTION and how to
adopt GET_SOLUTION_EXT safely.
| CUOPT_DETERMINISM_LOG( | ||
| "remaining indices hash 0x%x, size %d", | ||
| detail::compute_hash(make_span(this->remaining_indices), a.handle_ptr->get_stream()), | ||
| remaining_variables); |
There was a problem hiding this comment.
Hash only the live prefix of remaining_indices.
Both determinism logs hash the full device buffer, but only the first remaining_variables / n_vars_from_other entries are valid for the current call. The tail can still hold stale data from earlier runs, so these logs can report false divergences.
Also applies to: 198-203
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh` around lines 113
- 116, The determinism log is hashing the whole device buffer
(this->remaining_indices) including stale tail entries; change the compute_hash
call to hash only the live prefix (the first remaining_variables entries) by
passing a span/slice limited to remaining_variables instead of the full buffer;
apply the same fix for the other log that should hash only n_vars_from_other
entries (lines ~198-203). Locate the CUOPT_DETERMINISM_LOG calls that use
detail::compute_hash(make_span(this->remaining_indices),
a.handle_ptr->get_stream()) and replace the span/input to compute_hash so it
covers only the valid element count (remaining_variables or n_vars_from_other)
while preserving the stream argument.
There was a problem hiding this comment.
But that earlier data would still be deterministic right.
There was a problem hiding this comment.
@aliceb-nv — mostly, but with one important exception.
On subsequent calls to assign_same_integer_values within the same run, the tail beyond remaining_variables does contain the leftover sorted data from the previous call, which was itself produced deterministically. So in that steady-state scenario, you're right: the stale tail is deterministic.
However, on the very first call, remaining_indices is a freshly allocated rmm::device_uvector. GPU memory allocators do not guarantee zero-initialization, so the tail contains uninitialised device memory that can differ between process runs — making the hash non-deterministic exactly when you'd first use it.
There's also a subtler issue even in the deterministic-tail case: if remaining_variables changes from one call to the next (e.g., a different pair of parents is recombined), the stale footprint from the previous call will be a different size, so the hash will change even if the live data is identical. That can create spurious-divergence noise in the logs, making it harder to pinpoint real issues.
So the fix is still worth doing — limit the hash to make_span(this->remaining_indices.data(), remaining_variables) and similarly make_span(this->remaining_indices.data(), n_vars_from_other) in the compute_vars_to_fix log.
✏️ Learnings added
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 10
File: cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh:113-116
Timestamp: 2026-04-10T12:20:37.856Z
Learning: In `cpp/src/mip_heuristics/diversity/recombiners/recombiner.cuh`, the `CUOPT_DETERMINISM_LOG` calls inside `assign_same_integer_values` and `compute_vars_to_fix` that hash `this->remaining_indices` should only hash the live prefix (first `remaining_variables` or `n_vars_from_other` entries respectively). The full buffer tail is non-deterministic on the first call (uninitialized GPU memory from `rmm::device_uvector`) and can cause spurious hash changes on subsequent calls when the live prefix size varies between calls.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 9
File: cpp/tests/mip/diversity_test.cu:240-242
Timestamp: 2026-04-09T14:29:21.052Z
Learning: In `cpp/tests/mip/diversity_test.cu` (aliceb-nv/cuopt-fork), the `recombine()` call in `test_recombiners_determinism` can legitimately return `success=false` — this means the recombiner did not find a valid/better offspring, which is a valid outcome. The offspring hash is still computed and checked for determinism even when success=false. Review comments suggesting to treat success=false as a test failure are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 9
File: cpp/src/mip_heuristics/diversity/diversity_manager.cu:90-100
Timestamp: 2026-04-09T13:47:57.295Z
Learning: In aliceb-nv/cuopt-fork, two MIP solves never run concurrently in the same process. Therefore, resetting process-wide static state (e.g., `fp_recombiner_config_t::max_n_of_vars_from_other`, `ls_recombiner_config_t::max_n_of_vars_from_other`, `bp_recombiner_config_t::max_n_of_vars_from_other`, `sub_mip_recombiner_config_t::max_n_of_vars_from_other`, `mab_ls_config_t::last_lm_config`, `mab_ls_config_t::last_ls_mab_option`) from the `diversity_manager_t` constructor (in `cpp/src/mip_heuristics/diversity/diversity_manager.cu`) is safe and intentional — it ensures a clean slate for sequential test runs. Review comments about concurrent-construction hazards on these statics are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/branch_and_bound/pseudo_costs.cpp:1557-1563
Timestamp: 2026-04-08T14:23:31.627Z
Learning: In `cpp/src/branch_and_bound/pseudo_costs.cpp`, the `reliable_variable_selection_core` function requires `rng` to always be non-null as a design invariant. The unconditional `cuopt_assert(rng != nullptr, ...)` before the shuffle is intentional — callers are expected to always provide a valid RNG, keeping the call contract simple and avoiding a split call state.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cu:304-309
Timestamp: 2026-04-08T14:52:04.371Z
Learning: In the aliceb-nv/cuopt-fork repository, "determinism" is defined strictly as "run-to-run, same env, same GPU, same stdlib, same binary". This means thrust::sort / sort_by_key and other GPU sorting primitives are considered deterministic because, given the same input and the same hardware/binary/stdlib, they always produce the same output. Concerns about non-stable sort tie-breaking are only relevant for cross-environment reproducibility, NOT for this project's definition of determinism.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/problem/problem.cu:957-961
Timestamp: 2026-04-08T14:55:10.501Z
Learning: In cpp/src/mip_heuristics/problem/problem.cu, compute_related_variables() returns empty related_variables and related_variables_offsets in deterministic mode, and downstream consumers check pb.related_variables.size() before accessing the CSR offsets/indices; thus zero-sized offsets are acceptable by design.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 9
File: cpp/src/mip_heuristics/presolve/bounds_presolve.cu:175-178
Timestamp: 2026-04-09T14:25:25.098Z
Learning: In aliceb-nv/cuopt-fork, the `determinism_mode` in `mip_solver_settings_t` (and reflected in `mip_solver_context_t`) is fixed and does not change during a single MIP solve. Therefore, mutating instance-level settings (e.g., `settings.iteration_limit`) inside methods like `bound_presolve_t::bound_update_loop` based on a determinism-mode check (e.g., `context.settings.determinism_mode & CUOPT_DETERMINISM_GPU_HEURISTICS`) is safe — subsequent calls within the same solve will see the same determinism state. Review comments about permanent mutation of such settings being dangerous across repeated calls within one solve are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/local_search/rounding/constraint_prop.cu:1129-1135
Timestamp: 2026-04-09T09:43:21.906Z
Learning: In aliceb-nv/cuopt-fork (cpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cu), `get_relaxed_lp_solution` detects deterministic mode via `std::isfinite(settings.work_limit)` and internally replaces the PDLP time limit with `infinity()`. Therefore, callers setting `lp_settings.work_limit` for deterministic mode do NOT also need to set `lp_settings.time_limit = infinity()` at the call site — the internal implementation already handles this. Review comments suggesting to clear the wall-clock time limit at the call site in deterministic mode are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 9
File: cpp/src/mip_heuristics/diversity/weights.cuh:30-33
Timestamp: 2026-04-09T13:49:12.833Z
Learning: In `cpp/src/mip_heuristics/diversity/weights.cuh`, `weight_t::get_hash(rmm::cuda_stream_view stream = rmm::cuda_stream_default)` uses an optional default stream intentionally. Callers that need explicit stream control pass their own stream; callers that don't mind default-stream synchronization use the default. Review comments asking to remove the default or require an explicit stream are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/branch_and_bound/pseudo_costs.cpp:571-609
Timestamp: 2026-04-09T13:03:56.533Z
Learning: In `cpp/src/branch_and_bound/pseudo_costs.cpp`, the two overloads of `trial_branching_generic` (one taking `omp_atomic_t<int64_t>&` and one taking `int64_t&`) cleanly partition concurrent vs. sequential call sites. The `int64_t&` overload is only invoked from `reliable_variable_selection_core` instantiated with `SBIterT = int64_t`, which is used exclusively in deterministic/single-task contexts where `num_tasks <= 1` (guarded by `if (num_tasks > 1)` on the `#pragma omp taskloop`). There is no data race on the `int64_t&` accumulator. Review comments about a race on `total_lp_iter` in this overload are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh:541-542
Timestamp: 2026-04-09T09:30:39.975Z
Learning: In aliceb-nv/cuopt-fork, the `base_feas` and `bonus_robust` accumulators in `load_balancing_mtm_compute_scores` (in `cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh`) are intentionally `int32_t` even though `feas_score_constraint<i_t,f_t>` returns `std::pair<f_t,f_t>`. The precision loss from implicit narrowing is acceptable because these values are used for heuristic move scoring only, and integer overflow is considered unlikely in practice. Review comments about changing these accumulators to floating-point types are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 9
File: cpp/tests/mip/load_balancing_test.cu:132-133
Timestamp: 2026-04-09T14:38:52.402Z
Learning: In `cpp/tests/mip/load_balancing_test.cu` (aliceb-nv/cuopt-fork), `detail::mip_solver_t` is constructed with a `cuopt::termination_checker_t(0.0, cuopt::termination_checker_t::root_tag_t{})` (zero time limit) intentionally in `test_multi_probe`. The solver is only used to obtain `solver.context` for constructing `detail::load_balanced_bounds_presolve_t` and `detail::bound_presolve_t`; the actual presolve calls (`bnd_prb.solve(...)`, `lb_prs.solve(...)`) are made directly and are unaffected by the zero root time limit. Review comments about this causing premature termination are not applicable.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cu:412-416
Timestamp: 2026-04-08T14:52:32.556Z
Learning: In the aliceb-nv/cuopt-fork repository, files prefixed with `lb_` under `cpp/src/mip_heuristics/local_search/rounding/` (e.g., `lb_bounds_repair.cu`, `lb_constraint_prop.cu`, `lb_bounds_repair.cuh`, `lb_constraint_prop.cuh`) are not built/compiled in the current state of the code. Review comments about compilation errors, missing includes, or other build issues in these files should be ignored.
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 8
File: cpp/src/mip_heuristics/solution_callbacks.cuh:204-214
Timestamp: 2026-04-08T14:56:14.641Z
Learning: In aliceb-nv/cuopt-fork, the `cuopt_assert` in `solution_callbacks.cuh` for injected-solution objective validation uses a hard-coded `1e-6` absolute tolerance intentionally. Model tolerances (in `problem_ptr->tolerances`) are for feasibility and integrality only, not for objective differences. `cuopt_assert` is debug-only and does not affect user-facing builds, so review comments about tightening or relaxing this threshold are not applicable.
| double elapsed_time() const noexcept | ||
| { | ||
| if (deterministic) { | ||
| if (!work_context) { return 0.0; } | ||
| return work_context->current_work() - work_units_at_start; | ||
| } else { | ||
| return timer.elapsed_time(); | ||
| } | ||
| } |
There was a problem hiding this comment.
Keep elapsed_time() as wall-clock time.
This changes elapsed_time() from seconds to work units in deterministic mode, but callers in this PR still use it for wall-clock reporting (stats.total_solve_time, callback timestamps, and %f seconds logs). That will leak work units into user-visible time fields. Add a separate work-based accessor instead, or audit every existing elapsed_time() caller before changing the semantics.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/utilities/termination_checker.hpp` around lines 185 - 193, Revert the
change to elapsed_time(): keep it returning wall-clock seconds by using
timer.elapsed_time() regardless of deterministic mode, and add a new accessor
(e.g., elapsed_work_units() or work_units_elapsed()) that returns
work_context->current_work() - work_units_at_start when deterministic (and 0.0
if no work_context); update only this header by restoring elapsed_time() to call
timer.elapsed_time() and adding the new work-based method that uses
deterministic, work_context, and work_units_at_start so callers that need work
units can opt-in without changing existing time semantics.
| double producer_progress_scale{ | ||
| read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0)}; | ||
| double work_unit_scale{1.0}; |
There was a problem hiding this comment.
The env scale never reaches actual work accounting.
CUOPT_GPU_HEUR_WORK_UNIT_SCALE is loaded into producer_progress_scale, but record_work_sync_on_horizon() multiplies by work_unit_scale, which stays 1.0. So the env knob changes the producer-side progress number, not the scheduler/termination work budget. The log in attach_producer_sync() also prints the wrong field.
🔧 Minimal fix
- double producer_progress_scale{
- read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0)};
- double work_unit_scale{1.0};
+ double producer_progress_scale{1.0};
+ double work_unit_scale{
+ read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0)};- if (work_unit_scale != 1.0) {
- CUOPT_DETERMINISM_LOG("[%s] Using work-unit scale %f", name.c_str(), work_unit_scale);
+ if (work_unit_scale != 1.0) {
+ CUOPT_DETERMINISM_LOG("[%s] Using work-unit scale %f", name.c_str(), work_unit_scale);
}If producer progress needs a separate scaling factor, it should use a distinct env var/name instead of sharing the work-budget knob.
Also applies to: 95-97, 123-126
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/utilities/work_limit_context.hpp` around lines 60 - 62, The
environment scale is being stored in producer_progress_scale but
record_work_sync_on_horizon() (and the scheduler/termination accounting) uses
work_unit_scale which remains 1.0 and attach_producer_sync() logs the wrong
field; fix by making the env knob affect the actual work-budget scale: either
assign work_unit_scale =
read_work_unit_scale_env_or_default("CUOPT_GPU_HEUR_WORK_UNIT_SCALE", 1.0) (and
keep producer_progress_scale only if you intentionally need a separate
producer-only factor), or update record_work_sync_on_horizon() to multiply by
producer_progress_scale if the env is intended to be producer-only; also update
attach_producer_sync() logging to print the correct variable
(producer_progress_scale vs work_unit_scale) and apply the same change where
similar code appears (the other occurrences around the other blocks noted).
| uint32_t test_probing_cache_determinism(std::string path, | ||
| unsigned long seed = std::random_device{}()) | ||
| { | ||
| const raft::handle_t handle_{}; | ||
| cuopt::mps_parser::mps_data_model_t<int, double> mps_problem = | ||
| cuopt::mps_parser::parse_mps<int, double>(path, false); | ||
| handle_.sync_stream(); | ||
| auto op_problem = mps_data_model_to_optimization_problem(&handle_, mps_problem); | ||
| problem_checking_t<int, double>::check_problem_representation(op_problem); | ||
| detail::problem_t<int, double> problem(op_problem); | ||
| mip_solver_settings_t<int, double> default_settings{}; | ||
| default_settings.mip_scaling = false; // we're not checking scaling determinism here | ||
| auto timer = cuopt::termination_checker_t(0.0, cuopt::termination_checker_t::root_tag_t{}); | ||
| detail::mip_solver_t<int, double> solver(problem, default_settings, timer); | ||
| detail::bound_presolve_t<int, double> bnd_prb(solver.context); | ||
|
|
||
| work_limit_context_t work_limit_context("ProbingCache"); | ||
| // rely on the iteration limit | ||
| termination_checker_t probing_timer( | ||
| work_limit_context, std::numeric_limits<double>::max(), timer); | ||
| compute_probing_cache(bnd_prb, problem, probing_timer); |
There was a problem hiding this comment.
Call init_handler() in the new determinism helpers.
You added the cuBLAS/cuSparse pointer-mode initializer, but neither test_probing_cache_determinism() nor test_scaling_determinism() invokes it after constructing handle_. That makes these tests depend on prior global library state and can turn determinism checks into order-dependent flakes.
🔧 Minimal fix
uint32_t test_probing_cache_determinism(std::string path,
unsigned long seed = std::random_device{}())
{
const raft::handle_t handle_{};
+ init_handler(&handle_);
cuopt::mps_parser::mps_data_model_t<int, double> mps_problem =
cuopt::mps_parser::parse_mps<int, double>(path, false); uint32_t test_scaling_determinism(std::string path, unsigned long seed = std::random_device{}())
{
const raft::handle_t handle_{};
+ init_handler(&handle_);
cuopt::mps_parser::mps_data_model_t<int, double> mps_problem =
cuopt::mps_parser::parse_mps<int, double>(path, false);Also applies to: 160-166
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/tests/mip/presolve_test.cu` around lines 98 - 118, The new determinism
helpers require calling init_handler() to initialize cuBLAS/cuSparse
pointer-mode; update test_probing_cache_determinism (and similarly
test_scaling_determinism) to invoke init_handler() immediately after
constructing the raft::handle_t handle_ so the tests do not rely on prior global
library state — locate the handle_ construction in
test_probing_cache_determinism and call init_handler() before any work with the
handle (e.g., before handle_.sync_stream() and before creating solver/context)
to ensure deterministic, order-independent behavior.
| auto sorted_map = std::map<int, detail::cached_bound_t<double>>( | ||
| a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end()); | ||
| for (const auto& [var_id, cached_bound] : sorted_map) { | ||
| var_to_cached_bound_keys.push_back(var_id); | ||
| var_to_cached_bound_lb.push_back(cached_bound.lb); | ||
| var_to_cached_bound_ub.push_back(cached_bound.ub); | ||
| } |
There was a problem hiding this comment.
Hash both probing-cache branches, not just entry [0].
This composite hash only serializes a.second[0].var_to_cached_bound_map. If entry [1] becomes nondeterministic, the new test still passes, so it misses half of the cache state it is supposed to validate.
🔧 Minimal fix
- auto sorted_map = std::map<int, detail::cached_bound_t<double>>(
- a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end());
- for (const auto& [var_id, cached_bound] : sorted_map) {
- var_to_cached_bound_keys.push_back(var_id);
- var_to_cached_bound_lb.push_back(cached_bound.lb);
- var_to_cached_bound_ub.push_back(cached_bound.ub);
+ for (int branch = 0; branch < 2; ++branch) {
+ auto sorted_map = std::map<int, detail::cached_bound_t<double>>(
+ a.second[branch].var_to_cached_bound_map.begin(),
+ a.second[branch].var_to_cached_bound_map.end());
+ for (const auto& [var_id, cached_bound] : sorted_map) {
+ var_to_cached_bound_keys.push_back(var_id);
+ var_to_cached_bound_lb.push_back(cached_bound.lb);
+ var_to_cached_bound_ub.push_back(cached_bound.ub);
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| auto sorted_map = std::map<int, detail::cached_bound_t<double>>( | |
| a.second[0].var_to_cached_bound_map.begin(), a.second[0].var_to_cached_bound_map.end()); | |
| for (const auto& [var_id, cached_bound] : sorted_map) { | |
| var_to_cached_bound_keys.push_back(var_id); | |
| var_to_cached_bound_lb.push_back(cached_bound.lb); | |
| var_to_cached_bound_ub.push_back(cached_bound.ub); | |
| } | |
| for (int branch = 0; branch < 2; ++branch) { | |
| auto sorted_map = std::map<int, detail::cached_bound_t<double>>( | |
| a.second[branch].var_to_cached_bound_map.begin(), | |
| a.second[branch].var_to_cached_bound_map.end()); | |
| for (const auto& [var_id, cached_bound] : sorted_map) { | |
| var_to_cached_bound_keys.push_back(var_id); | |
| var_to_cached_bound_lb.push_back(cached_bound.lb); | |
| var_to_cached_bound_ub.push_back(cached_bound.ub); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/tests/mip/presolve_test.cu` around lines 139 - 145, The test currently
only serializes a.second[0].var_to_cached_bound_map into sorted_map and hashes
that, missing any nondeterminism in other probing-cache branches; update the
logic to iterate over all entries in a.second (not just index 0) and include
each entry's var_to_cached_bound_map contents in the deterministic serialization
— e.g., for each element in a.second, extract its var_to_cached_bound_map,
insert its key/value pairs into the combined sorted_map (or process each map in
turn while appending to var_to_cached_bound_keys, var_to_cached_bound_lb,
var_to_cached_bound_ub) so both probing-cache branches are included in the hash
(references: a.second, var_to_cached_bound_map, sorted_map,
var_to_cached_bound_keys, var_to_cached_bound_lb, var_to_cached_bound_ub).
Dummy PR to trigger coderabbit review
Description
Issue
Checklist