Incorporate determinism into GPU heuristics + parallel determinsitic B&B+GPU exploration#4
Conversation
📝 WalkthroughWalkthroughAdds deterministic/work-unit-based execution and incumbent tracking across the MIP stack: new termination/work-limit abstractions, solution-origin metadata and extended callbacks, deterministic B&B & heuristics changes, recombiner/work accounting updates, many timer→work_limit_timer API adjustments, work-unit prediction, and extensive determinism-focused tests and sanitizer suppressions. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
There was a problem hiding this comment.
Actionable comments posted: 7
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
cpp/src/mip_heuristics/solution/solution.cu (1)
92-108:⚠️ Potential issue | 🔴 CriticalAvoid copying potentially uninitialized derived buffers in
copy_from.Lines 92-99 mark source buffers/scalars as initialized, and Lines 101-108 immediately copy them. If source state is actually uninitialized, this propagates indeterminate values and hides a real initialization bug.
🛠️ Safer direction (don’t copy unknown-derived state; force recompute)
- // slack, excess, and constraint value may be uninitialized (and computed later). Mark them as - // such - cuopt::mark_span_as_initialized(make_span(other_sol.lower_excess), handle_ptr->get_stream()); - cuopt::mark_span_as_initialized(make_span(other_sol.upper_excess), handle_ptr->get_stream()); - cuopt::mark_span_as_initialized(make_span(other_sol.constraint_value), handle_ptr->get_stream()); - cuopt::mark_span_as_initialized(make_span(other_sol.obj_val), handle_ptr->get_stream()); - cuopt::mark_span_as_initialized(make_span(other_sol.n_feasible_constraints), - handle_ptr->get_stream()); - - expand_device_copy(lower_excess, other_sol.lower_excess, handle_ptr->get_stream()); - expand_device_copy(upper_excess, other_sol.upper_excess, handle_ptr->get_stream()); - expand_device_copy(constraint_value, other_sol.constraint_value, handle_ptr->get_stream()); - raft::copy(obj_val.data(), other_sol.obj_val.data(), 1, handle_ptr->get_stream()); - raft::copy(n_feasible_constraints.data(), - other_sol.n_feasible_constraints.data(), - 1, - handle_ptr->get_stream()); + // Derived feasibility/objective buffers can be stale or unset in the source. + // Keep deterministic state by resetting and recomputing on demand. + lower_excess.resize(problem_ptr->n_constraints, handle_ptr->get_stream()); + upper_excess.resize(problem_ptr->n_constraints, handle_ptr->get_stream()); + constraint_value.resize(problem_ptr->n_constraints, handle_ptr->get_stream()); + thrust::fill(handle_ptr->get_thrust_policy(), lower_excess.begin(), lower_excess.end(), f_t{0}); + thrust::fill(handle_ptr->get_thrust_policy(), upper_excess.begin(), upper_excess.end(), f_t{0}); + thrust::fill( + handle_ptr->get_thrust_policy(), constraint_value.begin(), constraint_value.end(), f_t{0}); + obj_val.set_value_to_zero_async(handle_ptr->get_stream()); + n_feasible_constraints.set_value_to_zero_async(handle_ptr->get_stream()); + is_feasible = false;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solution/solution.cu` around lines 92 - 108, The current copy_from implementation marks and then blindly copies derived fields (other_sol.lower_excess, other_sol.upper_excess, other_sol.constraint_value, other_sol.obj_val, other_sol.n_feasible_constraints) which can propagate uninitialized data; remove the cuopt::mark_span_as_initialized calls for those derived buffers and stop calling expand_device_copy/raft::copy on them in copy_from — instead leave the destination fields (lower_excess, upper_excess, constraint_value, obj_val, n_feasible_constraints) as uninitialized or explicitly set them to safe defaults (e.g., zero) and set/clear the appropriate "needs recompute" flag so they will be recomputed later; adjust copy_from logic accordingly (refer to functions/objects: copy_from, mark_span_as_initialized, expand_device_copy, raft::copy, other_sol.* and destination members) so no unknown-derived state is silently copied.cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)
1491-1502:⚠️ Potential issue | 🟠 MajorThis work-budget check is too coarse for deterministic scheduling.
work_units_elapsedis only updated and compared every 100 iterations. Any budget smaller than one batch is effectively rounded up to the next checkpoint, so CPUFJ can overshoot the requested budget by a large amount before Line 1500 ever fires.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1491 - 1502, The work-budget check currently only runs at the batch checkpoint, so fj_cpu.work_units_elapsed (updated by biased_work) can overshoot for budgets smaller than a batch; modify the loop that accumulates biased_work so you update and compare fj_cpu.work_units_elapsed against fj_cpu.work_budget on each iteration (or immediately after adding biased_work) and break out as soon as the budget is exceeded; ensure you use the same atomic load/store semantics as other sites (work_units_elapsed.load/store with std::memory_order_relaxed or the existing pattern) and preserve the producer_sync/notify_progress and CUOPT_LOG_TRACE calls so they still run when exiting early.cpp/include/cuopt/linear_programming/utilities/internals.hpp (1)
27-116: 🛠️ Refactor suggestion | 🟠 MajorDocument the new callback ABI in this public header.
The new origin enum,
mip_solution_callback_info_t, andGET_SOLUTION_EXTinterface need Doxygen here: pointer types,struct_size/lifetime semantics, and callback threading expectations are not obvious from the signatures alone.As per coding guidelines,
cpp/include/cuopt/**/*: "Check if new public functions/classes have documentation comments (Doxygen format)", "Flag API changes that may need corresponding docs/ updates", and "Suggest documenting thread-safety, GPU requirements, and numerical behavior".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 27 - 116, Add Doxygen comments to the public ABI symbols introduced: document mip_solution_origin_t (enumerators and their semantic meanings), mip_solution_callback_info_t (explain each field including struct_size, valid lifetime, ownership, alignment and versioning expectations), base_solution_callback_type (describe GET_SOLUTION_EXT purpose), and get_solution_callback_ext_t::get_solution (document pointer parameter types, which may be null, whether objective_value/solution_bound are floats or doubles depending on isFloat, threading/callback concurrency guarantees, GPU vs CPU context restrictions, and numerical behavior/precision expectations). Also annotate base_solution_callback_t::set_user_data/get_user_data to clarify ownership of user_data. Keep comments in Doxygen format in this header so API docs and maintainers can find pointer semantics, thread-safety, and ABI/versioning rules.cpp/src/mip_heuristics/solver.cu (1)
304-310:⚠️ Potential issue | 🟠 MajorDeterministic B&B still never ingests GPU-heuristic incumbents.
In this branch
branch_and_bound_callbackremains unset, so heuristic incumbents cannot tighten the B&B bound or affect pruning. That leaves the new deterministic “parallel B&B + GPU exploration” mode only half-wired.If you want, I can turn that TODO into a concrete follow-up issue with the queueing and retirement steps spelled out.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solver.cu` around lines 304 - 310, In the deterministic B&B branch ensure GPU-heuristic incumbents are forwarded by setting context.problem_ptr->branch_and_bound_callback: capture the BranchAndBound instance (branch_and_bound.get()) and assign a lambda that calls bb->queue_external_solution_deterministic(solution, 0.0) so GPU heuristics can supply deterministic incumbents; keep the existing set_concurrent_lp_root_solve(false) and restore the commented TODO into an active callback assignment using the exact symbol branch_and_bound_callback and the method queue_external_solution_deterministic to enable pruning and bound tightening.
🟡 Minor comments (13)
cpp/tests/mip/unit_test.cu-286-305 (1)
286-305:⚠️ Potential issue | 🟡 MinorCommented-out test has issues that must be fixed before uncommenting.
Three problems need attention before this test can be used:
Line 299 member name:
settings.presolve = true;should besettings.presolver = cuopt::linear_programming::presolver_t::None;to match all other tests in the file (lines 198, 213, 245, 266). The current name is incompatible with the settings struct.Unused extracted parameters:
scalingandheuristics_onlyare extracted fromGetParam()on lines 289-290 but immediately overwritten with hardcodedtruevalues on lines 297-298. Either use the parameters or remove the extractions.Missing time limit: Unlike all other tests in the file, this test does not set
settings.time_limit = 5;. Add it to prevent indefinite test runs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/unit_test.cu` around lines 286 - 305, The commented-out test has three fixes: replace the invalid member assignment settings.presolve = true with settings.presolver = cuopt::linear_programming::presolver_t::None to match the settings struct and other tests; stop overwriting the parameters extracted from GetParam() by either removing the unused local variables scaling and heuristics_only or by using them when setting settings.mip_scaling and settings.heuristics_only (i.e., set settings.mip_scaling = scaling and settings.heuristics_only = heuristics_only); and add a time limit like settings.time_limit = 5 to avoid indefinite runs.ci/compute-sanitizer-suppressions.xml-209-222 (1)
209-222:⚠️ Potential issue | 🟡 MinorTypo in comment and broad suppression scope.
Line 209 has a typo: "actualy" → "actually".
This suppression is notably broad—it suppresses all
cuMemcpyDtoDAsyncuninitialized memory access errors regardless of the calling context. While the comment acknowledges the risk, relying on "errors may be caught later on" is optimistic and could mask genuine issues in device-to-device copies.Consider whether a tighter hostStack constraint is feasible, or document specific known call sites that trigger this.
🔧 Proposed fix for the typo
- <!-- Uninitialized device-to-device copies are usually harmless - if actualy bogus, errors may be caught later on --> + <!-- Uninitialized device-to-device copies are usually harmless - if actually bogus, errors may be caught later on -->🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ci/compute-sanitizer-suppressions.xml` around lines 209 - 222, The suppression comment in the <record> for InitcheckApiError has a typo ("actualy" → "actually") and the hostStack pattern matching cuMemcpyDtoDAsync is too broad; fix the typo in the comment text and tighten the suppression by narrowing the <hostStack>/<frame>/<func> or adding additional <frame> entries to match known safe call sites (e.g., include specific wrapper function names or modules instead of ".*libcuda.so.*"), or else document the exact safe call sites in a new comment above the <record>; locate the record that references cuMemcpyDtoDAsync and update the comment text and hostStack constraints (and optionally add a note listing the verified call sites) so the suppression only covers intended, reviewed contexts.cpp/src/dual_simplex/basis_updates.cpp-2204-2206 (1)
2204-2206:⚠️ Potential issue | 🟡 MinorWork estimate formula uses incorrect variable for sort operation.
The
std::sortat line 2204 operates onnzelements (range frommtom + nz), but the work estimate uses(m + nz) * log2(m + nz)instead ofnz * log2(nz). This inflates the work estimate, especially whenmis large. Additionally, the guard(m + nz) > 1can pass whennz == 0(ifm >= 2), adding spurious work for an empty sort operation.Contrast with line 2217 in the same file, which correctly uses
etilde.i.size()for both the guard and the work formula when estimating sort work.Suggested fix
std::sort(xi_workspace_.begin() + m, xi_workspace_.begin() + m + nz, std::less<i_t>()); - if ((m + nz) > 1) { work_estimate_ += (m + nz) * std::log2((f_t)(m + nz)); } + if (nz > 1) { work_estimate_ += nz * std::log2((f_t)nz); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/dual_simplex/basis_updates.cpp` around lines 2204 - 2206, The work-estimate after sorting xi_workspace_ is using the wrong size and guard: change the guard from (m + nz) > 1 to nz > 1 and compute the cost as nz * log2(nz) (with the same cast style used elsewhere, e.g. (f_t)nz) and add that to work_estimate_ instead of using (m + nz) * log2((f_t)(m + nz)); update the block around the std::sort call that references xi_workspace_, m, nz and work_estimate_ accordingly.cpp/src/utilities/timer.hpp-37-50 (1)
37-50:⚠️ Potential issue | 🟡 MinorAdd
[[maybe_unused]]to suppress compiler warnings on unused parameters.Parameters
caller,file, andlineare not used in the active code path (lines 42–49 are commented out). With-Werrorenabled in the build (cpp/CMakeLists.txt:78, 144), this will cause compilation failures. Annotate these parameters with[[maybe_unused]]to silence the warning, matching the existing convention used elsewhere in the codebase.Suggested fix
- bool check_time_limit(const char* caller = __builtin_FUNCTION(), - const char* file = __builtin_FILE(), - int line = __builtin_LINE()) const noexcept + bool check_time_limit([[maybe_unused]] const char* caller = __builtin_FUNCTION(), + [[maybe_unused]] const char* file = __builtin_FILE(), + [[maybe_unused]] int line = __builtin_LINE()) const noexcept🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/timer.hpp` around lines 37 - 50, The parameters caller, file, and line in the method check_time_limit(...) are currently unused (the logging block is commented out) and trigger -Werror warnings; annotate those parameters with [[maybe_unused]] (e.g., change parameter declarations for caller, file, and line in check_time_limit) to suppress unused-parameter warnings while preserving the current defaults and behavior.cpp/tests/mip/diversity_test.cu-71-73 (1)
71-73:⚠️ Potential issue | 🟡 MinorRemove or use the
seedparameter.The
seedparameter with default valuestd::random_device{}()is declared intest_full_run_determinism,test_initial_solution_determinism, andtest_recombiners_determinism, but it is never used within the function bodies. Either remove the parameter or use it to seed the random number generation.Also applies to: 132-134, 192-194
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/diversity_test.cu` around lines 71 - 73, The three functions test_full_run_determinism, test_initial_solution_determinism, and test_recombiners_determinism declare an unused seed parameter; either remove the seed parameter from their signatures or actually apply it to the RNG used in each test (e.g., seed whatever std::mt19937 / std::default_random_engine or CUDA host RNG the tests use) so the tests become deterministic; update all overloads/call sites accordingly and ensure the default behavior remains unchanged if you keep the default argument.cpp/tests/mip/diversity_test.cu-285-293 (1)
285-293:⚠️ Potential issue | 🟡 MinorAvoid
exit(1)in test code; use GTest assertions instead.Calling
exit(1)bypasses the test framework's cleanup and reporting. UseFAIL()orASSERT_EQto properly fail the test.🧪 Suggested fix
if (hash_map[std::make_tuple(path, i, j, recombiner)] != offspring_hash) { printf("%s: hash mismatch for %d,%d: %d != %d\n", path.c_str(), i, j, hash_map[std::make_tuple(path, i, j, recombiner)], offspring_hash); - exit(1); + FAIL() << "Hash mismatch for " << path << " at (" << i << "," << j << ")"; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/diversity_test.cu` around lines 285 - 293, Replace the direct process exit with a GTest assertion: where the code compares hash_map[std::make_tuple(path, i, j, recombiner)] to offspring_hash, remove exit(1) and use ASSERT_EQ or EXPECT_EQ to report the mismatch (for example ASSERT_EQ(hash_map[std::make_tuple(path, i, j, recombiner)], offspring_hash) << "hash mismatch for " << path << ":" << i << "," << j;), or use FAIL() << formatted message if you need to include both values; this ensures the test framework records the failure and performs proper cleanup instead of calling exit.cpp/src/utilities/seed_generator.cuh-55-65 (1)
55-65:⚠️ Potential issue | 🟡 MinorPotential issue with
SEED_GENERATOR_DEBUGmacro check.Using
#if SEED_GENERATOR_DEBUGwithout first checking if the macro is defined will cause a compilation warning or treat an undefined macro as 0. Consider using#ifdef SEED_GENERATOR_DEBUGor#if defined(SEED_GENERATOR_DEBUG) && SEED_GENERATOR_DEBUG.🛠️ Suggested fix
-#if SEED_GENERATOR_DEBUG +#ifdef SEED_GENERATOR_DEBUG static int64_t get_seed(const char* caller = __builtin_FUNCTION(), const char* file = __builtin_FILE(), int line = __builtin_LINE())🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/seed_generator.cuh` around lines 55 - 65, The preprocessor check for SEED_GENERATOR_DEBUG should guard against the macro being undefined; update the existing conditional around the get_seed overloads to use either `#ifdef` SEED_GENERATOR_DEBUG or `#if` defined(SEED_GENERATOR_DEBUG) && SEED_GENERATOR_DEBUG so the compiler won't warn or treat an undefined macro as 0, keeping the debug variant of static int64_t get_seed(const char* caller = __builtin_FUNCTION(), const char* file = __builtin_FILE(), int line = __builtin_LINE()) and the non-debug static int64_t get_seed() { return local_state().counter++; } behavior unchanged; ensure you modify the opening conditional only (and keep the corresponding `#else/`#endif) so get_seed is selected correctly based on the macro definition.cpp/tests/mip/determinism_test.cu-36-62 (1)
36-62:⚠️ Potential issue | 🟡 MinorConsider safer error handling in
scoped_env_var_t.Two concerns with this RAII helper:
Line 59:
name_stores a raw pointer from the constructor argument. If the caller passes a temporary or short-lived string, this will become a dangling pointer. Consider storing astd::stringcopy.Lines 46, 51-52: Using
assert()forsetenv/unsetenvreturn values means errors are silently ignored in release builds. Consider logging failures or using a different assertion mechanism.🛠️ Suggested fix
class scoped_env_var_t { public: scoped_env_var_t(const char* name, const char* value) : name_(name) { cuopt_assert(name != nullptr, "Environment variable name must be non-null"); cuopt_assert(value != nullptr, "Environment variable value must be non-null"); const char* original_value = std::getenv(name_); was_set_ = (original_value != nullptr); if (was_set_) { original_value_ = original_value; } const int status = setenv(name_, value, 1); - assert(status == 0); + cuopt_assert(status == 0, "setenv failed"); } ~scoped_env_var_t() { const int status = was_set_ ? setenv(name_, original_value_.c_str(), 1) : unsetenv(name_); - assert(status == 0); + // Note: Cannot throw or assert in destructor; log on failure + if (status != 0) { CUOPT_LOG_WARN("Failed to restore env var %s", name_); } } scoped_env_var_t(const scoped_env_var_t&) = delete; scoped_env_var_t& operator=(const scoped_env_var_t&) = delete; private: - const char* name_; + std::string name_; // Own the string to avoid dangling pointer std::string original_value_; bool was_set_{false}; };🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/determinism_test.cu` around lines 36 - 62, scoped_env_var_t currently stores a raw const char* (name_) and uses assert() on setenv/unsetenv which is unsafe; update the constructor to copy the environment variable name into a std::string member (e.g., change name_ to std::string name_) to avoid dangling pointers, and replace the assert(status == 0) checks in the constructor and destructor with proper error handling: check the return value of setenv/unsetenv and either log the error (with errno) or throw a runtime_error from the constructor/destructor context as appropriate so failures are not ignored in release builds; ensure you update references to name_ in the constructor/destructor to use the std::string.cpp/tests/mip/miplib_test.cu-76-83 (1)
76-83:⚠️ Potential issue | 🟡 MinorCommented-out test has incorrect function call.
If the commented test is uncommented, line 81 would fail to compile. The call
test_miplib_file(test_instance, true)passestrueas the second argument, but the function signature expectsmip_solver_settings_t<int, double>as the second parameter.Should likely be:
mip_solver_settings_t<int, double> settings; // configure determinism settings... test_miplib_file(test_instance, settings, true);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/miplib_test.cu` around lines 76 - 83, The commented-out test calls test_miplib_file(test_instance, true) but the function signature requires a mip_solver_settings_t<int, double> settings object as the second parameter; update the test to construct a mip_solver_settings_t<int, double> (e.g., mip_solver_settings_t<int,double> settings), set any determinism-related fields on settings as needed, and then call test_miplib_file(test_instance, settings, true) so the correct overload is invoked.cpp/src/utilities/version_info.cpp-188-196 (1)
188-196:⚠️ Potential issue | 🟡 MinorPotential uncaught exception from
std::stod.If
/proc/cpuinfocontains a malformed "cpu MHz" line,std::stodat line 192 can throwstd::invalid_argumentorstd::out_of_range. Since this is called from a lambda initializing athread_local static, an uncaught exception could terminate the program.🛡️ Proposed fix to add exception handling
if (line.find("cpu MHz") != std::string::npos) { std::size_t colon = line.find(':'); if (colon != std::string::npos) { - double mhz = std::stod(line.substr(colon + 1)); - if (mhz > max_mhz) { max_mhz = mhz; } + try { + double mhz = std::stod(line.substr(colon + 1)); + if (mhz > max_mhz) { max_mhz = mhz; } + } catch (const std::exception&) { + // Skip malformed entries + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/version_info.cpp` around lines 188 - 196, The loop that parses "/proc/cpuinfo" calls std::stod on line substrings and can throw std::invalid_argument or std::out_of_range (risking terminate since this runs in a thread_local static initializer). Wrap the std::stod call inside a try-catch that catches std::invalid_argument and std::out_of_range (or std::exception) and simply skip the malformed "cpu MHz" line when an exception is thrown, leaving max_mhz unchanged; update references in this block around std::stod, max_mhz, cpuinfo and line to implement the handling and optionally log a debug message instead of letting the exception propagate.cpp/src/utilities/determinism_log.hpp-25-35 (1)
25-35:⚠️ Potential issue | 🟡 MinorRedundant macro definitions are dead code.
The
#ifndef CUOPT_DETERMINISM_LOGguard on line 19 ensures that subsequent definitions on lines 25-35 are never evaluated. Additionally, line 26 has a different signature (missing theloggerparameter) compared to line 20, suggesting either incomplete refactoring or a copy-paste oversight.Consider removing the redundant blocks or clarifying the intended macro signature.
🧹 Proposed fix to remove dead code
`#ifndef` CUOPT_DETERMINISM_LOG `#define` CUOPT_DETERMINISM_LOG(logger, ...) \ do { \ } while (0) `#endif` - -#ifndef CUOPT_DETERMINISM_LOG -#define CUOPT_DETERMINISM_LOG(...) \ - do { \ - } while (0) -#endif - -#ifndef CUOPT_DETERMINISM_LOG -#define CUOPT_DETERMINISM_LOG(...) \ - do { \ - } while (0) -#endif🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/utilities/determinism_log.hpp` around lines 25 - 35, The file defines duplicate dead-code macro blocks for CUOPT_DETERMINISM_LOG (one without the expected logger parameter) — remove the redundant second block and ensure only a single macro definition exists with the intended signature (the one that takes logger and variadic args), or if the no-logger variant is intended, reconcile both definitions to a consistent, documented signature; update/remove the extra `#ifndef/`#define/#endif pair so CUOPT_DETERMINISM_LOG is defined exactly once with the correct parameter list.cpp/tests/mip/CMakeLists.txt-42-45 (1)
42-45:⚠️ Potential issue | 🟡 MinorStale comment conflicts with active test configuration.
Line 42 says “Disable for now,” but Line 43-Line 45 actively registers
FEASIBILITY_JUMP_TEST. Please update/remove the comment to avoid confusion.🧹 Proposed cleanup
-# Disable for now ConfigureTest(FEASIBILITY_JUMP_TEST ${CMAKE_CURRENT_SOURCE_DIR}/feasibility_jump_tests.cu )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/CMakeLists.txt` around lines 42 - 45, The comment "Disable for now" is stale and conflicts with the active ConfigureTest registration for FEASIBILITY_JUMP_TEST; remove or update that comment so it accurately reflects the current state (either delete the line or change it to indicate the test is enabled), and keep the ConfigureTest(FEASIBILITY_JUMP_TEST ...) block intact so the test registration remains correct.cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh-91-97 (1)
91-97:⚠️ Potential issue | 🟡 MinorPotential null pointer issue if work_context is not set.
When
CUOPT_DETERMINISM_GPU_HEURISTICSis enabled, the code asserts thatwork_context != nullptr. However, if the determinism flag is set butgpu_heur_loopwasn't properly initialized, this will crash at runtime. Consider adding a more graceful fallback or earlier validation when the settings are configured.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuh` around lines 91 - 97, The code asserts that lp_settings.work_context is non-null when context.settings.determinism_mode includes CUOPT_DETERMINISM_GPU_HEURISTICS, which can crash if context.gpu_heur_loop wasn't initialized; update the block around context.settings.determinism_mode and lp_settings.work_context to first check if this->context.gpu_heur_loop is non-null and, if null, either set a safe fallback (e.g., a default work context or disable the deterministic GPU heuristics by clearing the CUOPT_DETERMINISM_GPU_HEURISTICS flag) and log a clear warning, or return/propagate an error earlier during settings validation so gpu_heur_loop is guaranteed initialized; ensure modifications touch the same scope where lp_settings.time_limit, lp_settings.work_limit, lp_settings.work_context, this->context.gpu_heur_loop and cuopt_assert are used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9a2b84b7-6e55-477e-bac8-6ef8b641f650
📒 Files selected for processing (107)
benchmarks/linear_programming/cuopt/run_mip.cppci/compute-sanitizer-suppressions.xmlcpp/CMakeLists.txtcpp/include/cuopt/linear_programming/constants.hcpp/include/cuopt/linear_programming/cuopt_c.hcpp/include/cuopt/linear_programming/mip/solver_settings.hppcpp/include/cuopt/linear_programming/utilities/internals.hppcpp/src/CMakeLists.txtcpp/src/branch_and_bound/branch_and_bound.cppcpp/src/branch_and_bound/branch_and_bound.hppcpp/src/branch_and_bound/deterministic_workers.hppcpp/src/branch_and_bound/pseudo_costs.cppcpp/src/branch_and_bound/pseudo_costs.hppcpp/src/dual_simplex/basis_updates.cppcpp/src/dual_simplex/bound_flipping_ratio_test.cppcpp/src/dual_simplex/bound_flipping_ratio_test.hppcpp/src/dual_simplex/phase2.cppcpp/src/dual_simplex/simplex_solver_settings.hppcpp/src/math_optimization/solver_settings.cucpp/src/mip_heuristics/diversity/diversity_config.hppcpp/src/mip_heuristics/diversity/diversity_manager.cucpp/src/mip_heuristics/diversity/diversity_manager.cuhcpp/src/mip_heuristics/diversity/lns/rins.cucpp/src/mip_heuristics/diversity/multi_armed_bandit.cuhcpp/src/mip_heuristics/diversity/population.cucpp/src/mip_heuristics/diversity/population.cuhcpp/src/mip_heuristics/diversity/recombiners/bound_prop_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/fp_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/line_segment_recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/recombiner.cuhcpp/src/mip_heuristics/diversity/recombiners/recombiner_stats.hppcpp/src/mip_heuristics/diversity/recombiners/sub_mip.cuhcpp/src/mip_heuristics/diversity/weights.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cucpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump_impl_common.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump_kernels.cucpp/src/mip_heuristics/feasibility_jump/fj_cpu.cucpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuhcpp/src/mip_heuristics/feasibility_jump/load_balancing.cuhcpp/src/mip_heuristics/feasibility_jump/utils.cuhcpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cucpp/src/mip_heuristics/local_search/feasibility_pump/feasibility_pump.cuhcpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cucpp/src/mip_heuristics/local_search/line_segment_search/line_segment_search.cuhcpp/src/mip_heuristics/local_search/local_search.cucpp/src/mip_heuristics/local_search/local_search.cuhcpp/src/mip_heuristics/local_search/rounding/bounds_repair.cucpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuhcpp/src/mip_heuristics/local_search/rounding/constraint_prop.cucpp/src/mip_heuristics/local_search/rounding/constraint_prop.cuhcpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cucpp/src/mip_heuristics/local_search/rounding/lb_bounds_repair.cuhcpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cucpp/src/mip_heuristics/local_search/rounding/lb_constraint_prop.cuhcpp/src/mip_heuristics/local_search/rounding/simple_rounding_kernels.cuhcpp/src/mip_heuristics/presolve/bounds_presolve.cucpp/src/mip_heuristics/presolve/bounds_update_data.cucpp/src/mip_heuristics/presolve/lb_probing_cache.cucpp/src/mip_heuristics/presolve/probing_cache.cucpp/src/mip_heuristics/presolve/probing_cache.cuhcpp/src/mip_heuristics/presolve/third_party_presolve.cppcpp/src/mip_heuristics/presolve/third_party_presolve.hppcpp/src/mip_heuristics/presolve/utils.cuhcpp/src/mip_heuristics/problem/problem.cucpp/src/mip_heuristics/problem/problem_fixing.cuhcpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cucpp/src/mip_heuristics/relaxed_lp/relaxed_lp.cuhcpp/src/mip_heuristics/solution/solution.cucpp/src/mip_heuristics/solution/solution.cuhcpp/src/mip_heuristics/solve.cucpp/src/mip_heuristics/solver.cucpp/src/mip_heuristics/solver.cuhcpp/src/mip_heuristics/solver_context.cuhcpp/src/pdlp/cuopt_c.cppcpp/src/pdlp/pdlp.cucpp/src/utilities/copy_helpers.hppcpp/src/utilities/cuda_helpers.cuhcpp/src/utilities/determinism_log.hppcpp/src/utilities/models/fj_predictor/header.hcpp/src/utilities/models/fj_predictor/main.cppcpp/src/utilities/models/fj_predictor/quantize.cppcpp/src/utilities/seed_generator.cucpp/src/utilities/seed_generator.cuhcpp/src/utilities/termination_checker.hppcpp/src/utilities/timer.hppcpp/src/utilities/version_info.cppcpp/src/utilities/version_info.hppcpp/src/utilities/work_limit_context.hppcpp/src/utilities/work_limit_timer.hppcpp/src/utilities/work_unit_predictor.cppcpp/src/utilities/work_unit_predictor.hppcpp/src/utilities/work_unit_scheduler.cppcpp/tests/CMakeLists.txtcpp/tests/mip/CMakeLists.txtcpp/tests/mip/determinism_test.cucpp/tests/mip/determinism_utils.cuhcpp/tests/mip/diversity_test.cucpp/tests/mip/feasibility_jump_tests.cucpp/tests/mip/load_balancing_test.cucpp/tests/mip/local_search_test.cucpp/tests/mip/mip_utils.cuhcpp/tests/mip/miplib_test.cucpp/tests/mip/multi_probe_test.cucpp/tests/mip/presolve_test.cucpp/tests/mip/unit_test.cudatasets/mip/download_miplib_test_dataset.sh
| */ | ||
| /* clang-format on */ | ||
|
|
||
| #include <utilities/determinism_log.hpp> |
There was a problem hiding this comment.
Fix the missing determinism_log.hpp include.
Clang already fails to resolve this header, so the new deterministic logging path does not compile. The same include was added in multiple changed files, so this blocks the whole PR until the file is added or the include path is corrected.
🧰 Tools
🪛 Clang (14.0.6)
[error] 8-8: 'utilities/determinism_log.hpp' file not found
(clang-diagnostic-error)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp` at line 8, The build fails
because the new include <utilities/determinism_log.hpp> referenced in
branch_and_bound.cpp (and several other changed files) cannot be found; either
add the missing header file into the project at the expected include path (so
<utilities/determinism_log.hpp> resolves) or change the include to the correct
relative path used by the codebase and update the build include directories
(CMakeLists or equivalent) so the compiler can locate determinism_log.hpp;
locate occurrences of <utilities/determinism_log.hpp> in branch_and_bound.cpp
and the other modified files and apply the chosen fix consistently.
| repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN}); | ||
| mutex_repair_.unlock(); |
There was a problem hiding this comment.
Populate a non-negative repair timestamp here.
queued_repair_solution_t::work_timestamp defaults to -1.0, and repaired incumbents later flow into emit_solution_callback_*(), which now asserts that timestamps are non-negative. With new_incumbent_callback enabled, a repaired injected solution can abort the solve on this path.
🛠️ Suggested fix
if (attempt_repair) {
mutex_repair_.lock();
- repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN});
+ const double repair_work_timestamp =
+ settings_.deterministic ? work_unit_context_.current_work() : 0.0;
+ repair_queue_.push_back({solution,
+ cuopt::internals::mip_solution_origin_t::UNKNOWN,
+ repair_work_timestamp});
mutex_repair_.unlock();
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| repair_queue_.push_back({solution, cuopt::internals::mip_solution_origin_t::UNKNOWN}); | |
| mutex_repair_.unlock(); | |
| const double repair_work_timestamp = | |
| settings_.deterministic ? work_unit_context_.current_work() : 0.0; | |
| repair_queue_.push_back({solution, | |
| cuopt::internals::mip_solution_origin_t::UNKNOWN, | |
| repair_work_timestamp}); | |
| mutex_repair_.unlock(); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 589 - 590, The
queued repair entry is pushed with a default negative work_timestamp
(queued_repair_solution_t::work_timestamp == -1.0), which later triggers
assertions in emit_solution_callback_*() when new_incumbent_callback is enabled;
fix by populating a non-negative timestamp before pushing into repair_queue_
(replace the push_back call so the queued_repair_solution_t instance sets
work_timestamp to a valid value, e.g., current wall-clock time or 0.0), ensuring
the code path around repair_queue_.push_back and mutex_repair_.unlock uses the
updated queued_repair_solution_t with a non-negative work_timestamp.
| template <typename i_t, typename f_t> | ||
| f_t trial_branching_generic(const lp_problem_t<i_t, f_t>& original_lp, | ||
| const simplex_solver_settings_t<i_t, f_t>& settings, | ||
| const std::vector<variable_type_t>& var_types, | ||
| const std::vector<variable_status_t>& vstatus, | ||
| const std::vector<f_t>& edge_norms, | ||
| const basis_update_mpf_t<i_t, f_t>& basis_factors, | ||
| const std::vector<i_t>& basic_list, | ||
| const std::vector<i_t>& nonbasic_list, | ||
| i_t branch_var, | ||
| f_t branch_var_lower, | ||
| f_t branch_var_upper, | ||
| f_t upper_bound, | ||
| i_t bnb_lp_iter_per_node, | ||
| f_t start_time, | ||
| i_t upper_max_lp_iter, | ||
| i_t lower_max_lp_iter, | ||
| int64_t& total_lp_iter) | ||
| { | ||
| omp_atomic_t<int64_t> atomic_iter{0}; | ||
| f_t result = trial_branching(original_lp, | ||
| settings, | ||
| var_types, | ||
| vstatus, | ||
| edge_norms, | ||
| basis_factors, | ||
| basic_list, | ||
| nonbasic_list, | ||
| branch_var, | ||
| branch_var_lower, | ||
| branch_var_upper, | ||
| upper_bound, | ||
| bnb_lp_iter_per_node, | ||
| start_time, | ||
| upper_max_lp_iter, | ||
| lower_max_lp_iter, | ||
| atomic_iter); | ||
| total_lp_iter += atomic_iter.load(); | ||
| return result; |
There was a problem hiding this comment.
The deterministic strong-branching counter still has a data race.
This overload writes to a shared int64_t& total_lp_iter, but reliable_variable_selection_core() passes that same reference into an OpenMP taskloop. total_lp_iter += atomic_iter.load() is therefore unsynchronized and can drop iterations nondeterministically, which feeds back into the reliability threshold.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/pseudo_costs.cpp` around lines 259 - 297, The
function trial_branching_generic writes to a shared int64_t& total_lp_iter
unsafely (called from reliable_variable_selection_core inside an OpenMP
taskloop), causing a data race; change the update to perform an atomic addition
instead of a plain += by either (a) changing the parameter to accept an
omp_atomic_t<int64_t>& (or std::atomic<int64_t>&) and calling its
fetch_add(atomic_iter.load())/operator+=, or (b) keep int64_t& but replace the
plain total_lp_iter += atomic_iter.load() with an OpenMP atomic update (e.g. use
`#pragma` omp atomic or an atomic_fetch_add wrapper) so the increment is
synchronized; ensure references to atomic_iter, total_lp_iter,
trial_branching_generic, and reliable_variable_selection_core are updated
consistently.
| if (solution.get_feasible()) { | ||
| population.add_solution(std::move(solution), | ||
| internals::mip_solution_origin_t::FEASIBILITY_JUMP); | ||
| } |
There was a problem hiding this comment.
Don't move the caller-owned solution out of run_fj_alone().
run_solver() logs and returns the same sol after this call. Moving it into population here leaves the returned object unspecified on the fj_only_run path.
Suggested fix
ls.fj.solve(solution);
if (solution.get_feasible()) {
- population.add_solution(std::move(solution),
+ population.add_solution(solution_t<i_t, f_t>(solution),
internals::mip_solution_origin_t::FEASIBILITY_JUMP);
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/diversity/diversity_manager.cu` around lines 356 -
359, The code moves the caller-owned `solution` into `population` inside
`run_fj_alone()` which leaves the same `sol` returned by `run_solver()`
unspecified on the `fj_only_run` path; instead, avoid transferring ownership:
change the call to `population.add_solution` so it does not take `solution` via
move—either pass `solution` by const reference (if `add_solution` supports it)
or create an explicit copy/clone before moving (e.g. make a local copy and move
that) so the original `solution` remains valid for `run_solver()` return; update
the call site around `population.add_solution(std::move(solution),
internals::mip_solution_origin_t::FEASIBILITY_JUMP)` accordingly.
| if (!full_refresh && fj.pb.related_variables.size() > 0 && | ||
| fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >= | ||
| fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold) { | ||
| fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold && | ||
| fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) { |
There was a problem hiding this comment.
Guard the fallback ratio check against zero work IDs.
If fj.work_ids_for_related_vars[*fj.selected_var] is 0, this divides by zero inside the kernel. That's enough to blow up the fallback path for isolated/empty-related variables.
Possible fix
- if (!full_refresh && fj.pb.related_variables.size() > 0 &&
- fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >=
+ auto related_work_ids = fj.work_ids_for_related_vars[*fj.selected_var];
+ if (!full_refresh && fj.pb.related_variables.size() > 0 && related_work_ids > 0 &&
+ fj.pb.n_variables / related_work_ids >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (!full_refresh && fj.pb.related_variables.size() > 0 && | |
| fj.pb.n_variables / fj.work_ids_for_related_vars[*fj.selected_var] >= | |
| fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold) { | |
| fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold && | |
| fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) { | |
| auto related_work_ids = fj.work_ids_for_related_vars[*fj.selected_var]; | |
| if (!full_refresh && fj.pb.related_variables.size() > 0 && related_work_ids > 0 && | |
| fj.pb.n_variables / related_work_ids >= | |
| fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold && | |
| fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON) { |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/feasibility_jump/load_balancing.cuh` around lines 125
- 128, The division uses fj.work_ids_for_related_vars[*fj.selected_var] and can
divide by zero; update the condition to first fetch that value (e.g., auto
work_ids = fj.work_ids_for_related_vars[*fj.selected_var]) and guard the ratio
check with work_ids > 0 (or replace the division with a safe multiply
comparison) so the clause becomes: full_refresh is false && related_variables
non-empty && work_ids > 0 && fj.pb.n_variables / work_ids >=
fj.settings->parameters.old_codepath_total_var_to_relvar_ratio_threshold &&
fj.settings->load_balancing_mode != fj_load_balancing_mode_t::ALWAYS_ON. Ensure
you reference fj.selected_var and fj.work_ids_for_related_vars in the change.
| const raft::handle_t* handle_ptr; | ||
| std::mt19937 gen; | ||
| timer_t timer{0.}; | ||
| work_limit_timer_t timer; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for work_limit_timer_t definition to verify default constructor behavior
ast-grep --pattern 'class work_limit_timer_t {
$$$
}'
# Also check for struct definition
ast-grep --pattern 'struct work_limit_timer_t {
$$$
}'
# Search for using/typedef
rg -n 'using work_limit_timer_t' --type cpp
rg -n 'typedef.*work_limit_timer_t' --type cppRepository: aliceb-nv/cuopt-fork
Length of output: 159
🏁 Script executed:
#!/bin/bash
# Find the termination_checker_t definition in the file where the type alias is
cat cpp/src/utilities/termination_checker.hpp | head -250 | tail -100
# Also search for termination_checker_t class/struct definition more broadly
rg -n 'class termination_checker_t|struct termination_checker_t' --type cpp -A 20Repository: aliceb-nv/cuopt-fork
Length of output: 4597
🏁 Script executed:
#!/bin/bash
# Search for timer_t definition
rg -n 'using timer_t|typedef.*timer_t|class timer_t|struct timer_t' --type cpp -A 5Repository: aliceb-nv/cuopt-fork
Length of output: 398
🏁 Script executed:
#!/bin/bash
# Get the full timer_t class definition
sed -n '16,100p' cpp/src/utilities/timer.hppRepository: aliceb-nv/cuopt-fork
Length of output: 2852
🏁 Script executed:
#!/bin/bash
# Search for all constructors of termination_checker_t
rg -n 'termination_checker_t(' cpp/src/utilities/termination_checker.hpp | head -20Repository: aliceb-nv/cuopt-fork
Length of output: 158
🏁 Script executed:
#!/bin/bash
# Search for all constructors of termination_checker_t
rg -n 'termination_checker_t\(' cpp/src/utilities/termination_checker.hppRepository: aliceb-nv/cuopt-fork
Length of output: 272
This change will not compile: work_limit_timer_t cannot be default-constructed.
The type alias work_limit_timer_t refers to termination_checker_t, which has no explicit default constructor and contains a member timer_t timer without an initializer. Since timer_t's default constructor is explicitly deleted, the implicit default constructor of termination_checker_t cannot be generated. The statement work_limit_timer_t timer; at line 147 will fail to compile.
The original code timer_t timer{0.} explicitly constructed the timer with a value. The replacement requires either:
- An explicit default constructor for
termination_checker_tthat properly initializestimer_t - Or initializing
timer_twith a valid argument at the declaration point
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/local_search/rounding/bounds_repair.cuh` at line 147,
work_limit_timer_t cannot be default-constructed because termination_checker_t
contains a timer_t whose default ctor is deleted; replace the problematic
declaration by either (A) initialize the local instance with a valid timer value
(e.g. construct the alias with the same argument used previously, e.g. use
work_limit_timer_t timer{0.} or the appropriate float/duration), or (B) add an
explicit default constructor to termination_checker_t that initializes its timer
member (timer_t timer) with the required argument (e.g. timer{0.}) so that
termination_checker_t can be default-constructed; update the declaration of
work_limit_timer_t timer accordingly to match the chosen approach.
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (8)
cpp/src/mip_heuristics/solver.cu (1)
20-23: Consider documenting the macro override intent.The
#undeffollowed by#defineoverrides the determinism logging macro to redirect toCUOPT_LOG_INFO. While this is intentional for verbose logging during development/debugging, it may inadvertently flood production logs. Consider adding a comment explaining the intent or guarding with a build flag.💡 Suggested documentation
`#include` <utilities/determinism_log.hpp> +// Override determinism logging to emit INFO-level logs for debugging work-unit accounting. +// TODO: Consider making this conditional on a build flag for production builds. `#undef` CUOPT_DETERMINISM_LOG `#define` CUOPT_DETERMINISM_LOG(...) CUOPT_LOG_INFO(__VA_ARGS__)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solver.cu` around lines 20 - 23, Add documentation and a compilation guard around the macro override: explain why CUOPT_DETERMINISM_LOG is being undefined and redefined to CUOPT_LOG_INFO (e.g., temporary verbose logging for development/troubleshooting) and either wrap the `#undef/`#define in a conditional build flag (such as `#ifdef` DEBUG or a project-specific CUOPT_DETERMINISM_OVERRIDE) or add a clear comment referencing utilities/determinism_log.hpp and CUOPT_LOG_INFO so maintainers know this change is intentional and can disable it for production.cpp/include/cuopt/linear_programming/utilities/internals.hpp (3)
107-118: Add Doxygen documentation for extended callback class.This class extends the callback interface with origin metadata. Per coding guidelines, public classes should document their purpose and thread-safety characteristics.
📝 Suggested documentation
+/** + * `@brief` Extended solution callback that receives origin metadata alongside solution data. + * + * Subclass and implement get_solution() to receive solutions with provenance information. + * Register via solver settings to receive callbacks during the solve process. + * + * `@note` Callbacks may be invoked from worker threads; implementations should be thread-safe. + */ class get_solution_callback_ext_t : public base_solution_callback_t {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 107 - 118, Add Doxygen documentation to the public class get_solution_callback_ext_t describing its purpose as an extended solution callback that supplies origin/metadata alongside solution data, document the virtual method get_solution parameters and expected behavior, and state thread-safety guarantees (e.g., whether implementations must be thread-safe or called from a single thread). Place the comment above the class definition and mention the overridden get_type() behavior returning base_solution_callback_type::GET_SOLUTION_EXT so readers can locate this specialization.
27-42: Add Doxygen documentation for new public enum.As per coding guidelines for public headers under
cpp/include/cuopt/**/*, new public entities should have documentation comments. This enum introduces the solution origin taxonomy that external callbacks will receive.📝 Suggested documentation
+/** + * `@brief` Origin type for MIP solutions, indicating which solver component found the solution. + * + * Used in extended callbacks (get_solution_callback_ext_t) to provide provenance metadata. + */ enum class mip_solution_origin_t : uint32_t {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 27 - 42, Add Doxygen documentation to the new public enum mip_solution_origin_t and each enumerator so external users and generated docs understand the solution-origin taxonomy; update the enum definition in internals.hpp (mip_solution_origin_t) to include a brief summary comment above the enum and short `@enum/`@brief or per-enumerator `@brief` comments for values like UNKNOWN, BRANCH_AND_BOUND_NODE, BRANCH_AND_BOUND_DIVING, FEASIBILITY_JUMP, CPU_FEASIBILITY_JUMP, LOCAL_SEARCH, QUICK_FEASIBLE, LP_ROUNDING, RECOMBINATION, SUB_MIP, USER_INITIAL, USER_INJECTED, RINS, and PRESOLVE describing what each origin means and any callback implications.
65-69: Add Doxygen documentation for callback info struct.This struct is part of the public extended callback API and should document its fields for downstream consumers.
📝 Suggested documentation
+/** + * `@brief` Extended callback metadata for MIP solution events. + * + * `@param` struct_size Size of this struct for ABI versioning. + * `@param` origin The solver component that produced the solution. + * `@param` work_timestamp Deterministic work-unit timestamp (-1.0 if unavailable). + */ struct mip_solution_callback_info_t {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp` around lines 65 - 69, The public callback struct mip_solution_callback_info_t lacks Doxygen comments; add a brief Doxygen comment above the struct describing its role in the extended callback API and document each member: document struct_size as the size/version stamp (uint64_t) used for ABI compatibility, origin (mip_solution_origin_t) as the provenance of the solution with possible enum values, and work_timestamp (double) as the timestamp or monotonic time associated with the callback; keep descriptions concise and use `@brief` and `@param/`@var or `@field` tags so downstream consumers and generated docs see each field purpose.cpp/tests/mip/diversity_test.cu (1)
71-73: Remove unusedseedparameter from test helper functions.The
seedparameter is declared intest_full_run_determinism,test_initial_solution_determinism, andtest_recombiners_determinismbut never used within these functions. The seed is instead set viacuopt::seed_generator::set_seed(seed)in the test cases before calling these helpers.🧹 Proposed signature cleanup
-static uint32_t test_full_run_determinism(std::string path, - unsigned long seed = std::random_device{}(), - float work_limit = 10.0f) +static uint32_t test_full_run_determinism(std::string path, + float work_limit = 10.0f)Apply similar changes to
test_initial_solution_determinismandtest_recombiners_determinism.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/tests/mip/diversity_test.cu` around lines 71 - 73, Remove the unused seed parameter from the helper functions by deleting the seed parameter (and its default initializer) from the declarations and definitions of test_full_run_determinism, test_initial_solution_determinism, and test_recombiners_determinism; then update all call sites so they no longer pass a seed argument (the tests already set the seed via cuopt::seed_generator::set_seed(seed)), ensuring signatures only include the necessary parameters (e.g., path and work_limit or whatever else the helper truly needs).cpp/src/branch_and_bound/branch_and_bound.cpp (3)
48-54: Debug macro redefinition pattern is unusual but functional.The
#undeffollowed by#definepattern works, but having the macro unconditionally enabled (despite the "uncomment to enable" comment) will produce verbose logs in production builds. Consider using a compile-time flag (e.g.,#ifdef CUOPT_ENABLE_DETERMINISM_LOGGING) to conditionally enable this.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 48 - 54, The current code unconditionally undefines and defines the CUOPT_DETERMINISM_LOG macro (symbols: CUOPT_DETERMINISM_LOG, CUOPT_DETERMINISM_LOG(logger,...)) which forces verbose logging; change it to be enabled only when a compile-time flag is set by replacing the unconditional pattern with a conditional compile guard (e.g., wrap the `#define` of CUOPT_DETERMINISM_LOG inside an `#ifdef` CUOPT_ENABLE_DETERMINISM_LOGGING / `#endif` and remove the unconditional undef/#define), and provide a no-op fallback macro when the flag is not defined so production builds remain quiet.
591-593: Repair queue path is non-deterministic only—timestamp default is acceptable here.I've traced the flow:
set_new_solutionasserts non-deterministic mode at line 527, andrepair_heuristic_solutionsis only called from non-deterministic schedulers. Theemit_solution_callbackassertion (line 496-497) only fails whendeterministic=true, so the-1.0default timestamp won't trigger failures.However, for clarity and future-proofing, consider explicitly setting
work_timestamp = 0.0to document the intent:♻️ Suggested improvement
if (attempt_repair) { mutex_repair_.lock(); - repair_queue_.push_back({solution, origin}); + repair_queue_.push_back({solution, origin, 0.0}); // Non-deterministic path uses wall-clock time mutex_repair_.unlock(); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 591 - 593, The repair queue push currently uses a default timestamp (-1.0) which is acceptable but unclear; change the pushed tuple so the timestamp is explicitly 0.0 to document non-deterministic-only intent: in the block guarded by attempt_repair (around mutex_repair_.lock() and repair_queue_.push_back({solution, origin})), replace the implicit/default timestamp with an explicit work_timestamp = 0.0 when constructing the pushed entry; this clarifies behavior relative to set_new_solution, repair_heuristic_solutions, and emit_solution_callback without changing semantics.
4192-4238: Load balancing uses backlog counts, not total work estimates.The balancing logic at
deterministic_balance_worker_loadsdistributes based onbacklog.size()(node count) rather than estimated work. For heterogeneous nodes (varying LP difficulty), this could lead to uneven work distribution. Consider adding an objective-estimate-based metric if load imbalance becomes problematic in practice.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 4192 - 4238, The balancing currently in deterministic_balance_worker_loads uses backlog.size() (backlog_counts) which ignores per-node cost; change it to use an estimated-work metric by summing each node's work estimate instead of counts: for each mip_node_t<i_t,f_t>* in worker.backlog use an existing field or method (e.g., node->estimated_work or implement estimate_node_work(node)) to compute per-worker total_work_estimates vector and total_work, max_work, min_work, and use those to decide needs_balance. When redistributing, collect all nodes into all_backlog_nodes then perform a greedy largest-first assignment (sort nodes by estimated work descending and push each node to the worker with current smallest total_assigned_work) rather than simple round-robin so heavy nodes are balanced; update any references to backlog_counts/min/max to use work estimates. Ensure worker.backlog.clear()/push logic still uses mip_node_t* but maintains the new per-worker work totals during distribution.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benchmarks/linear_programming/cuopt/run_mip.cpp`:
- Around line 270-277: The log message passed to CUOPT_LOG_INFO contains a debug
artifact "1run_mip"; update the format string in the CUOPT_LOG_INFO call (the
call at/around function run_mip) to remove the leading "1" so it reads "run_mip
settings: heuristics_only=%d deterministic=%d determinism_mode=%d
time_limit=%.6f work_limit=%.6f" while leaving the argument list
((int)heuristics_only, (int)deterministic, settings.determinism_mode,
settings.time_limit, settings.work_limit) unchanged.
- Line 299: The commented-out debug line contains a stray debug prefix
"1solution"; update the comment to remove the prefix so the call reads as a
normal commented call (referencing write_to_sol_file, base_filename and
handle_.get_stream()) — e.g., change "//
1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());" to
"// solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());"
(or fully uncomment if intended to be active).
In `@cpp/include/cuopt/linear_programming/constants.h`:
- Around line 80-92: Typo in the comment for CUOPT_DETERMINISM_BB: change
"rleease" to "release" in the comment immediately following the
CUOPT_DETERMINISM_BB macro definition so the comment reads "...in the previous
release"; locate the comment next to the CUOPT_DETERMINISM_BB symbol in
constants.h and correct the spelling.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 3773-3781: The code unconditionally sets lp_settings.work_limit =
std::numeric_limits<f_t>::infinity(), which ignores any user-configured work
budget; change lp_settings.work_limit to respect settings_.work_limit by
computing a per-node budget (e.g., min(settings_.work_limit - used_work, a
reasonable per-node cap, or remaining_time-derived budget) before assigning to
lp_settings.work_limit), ensuring lp_settings.work_limit is finite when a user
limit exists and falls back to infinity only if settings_.work_limit is
infinite; update the assignment site (lp_settings.work_limit) and any logic that
computes remaining/horizon budgets so the node LP solve cannot exceed the
configured work budget.
- Around line 4446-4454: The diving branch sets lp_settings.work_limit to
infinity so individual LP solves ignore global work limits; change
lp_settings.work_limit to a finite remaining work value (mirror the BFS fix) by
assigning it to the available work budget (e.g., compute remaining_work =
worker.work_limit - worker.work_done or use the existing remaining_work
variable) and set lp_settings.work_limit = remaining_work instead of infinity;
update the code around lp_settings.work_limit in the same block that sets
cut_off and time_limit so diving LP respects the global work budget.
In `@cpp/src/mip_heuristics/problem/presolve_data.cu`:
- Around line 114-128: fixed_var_assignment is being used as shared device
scratch while callbacks may run on different override streams, causing races;
allocate a per-call device buffer (or obtain a per-stream scratch) instead of
referencing the shared member when launching the thrust kernel and when calling
expand_device_copy/cuopt::host_copy. Concretely, inside the presolve routine
where h (from handle_override or problem.handle_ptr), current_assignment,
variable_mapping, assgn, var_map and fixed_assgn are used, create a local device
buffer for the fixed assignment (or query a per-stream scratch tied to
h->get_stream()), use that local buffer in the thrust::for_each lambda and pass
it to expand_device_copy and cuopt::host_copy, and free/let it go out of scope;
apply the same change to the other occurrence around the 151-154 block.
In `@cpp/src/mip_heuristics/problem/problem.cu`:
- Around line 951-955: The deterministic early-return leaves
related_variables_offsets empty which breaks view_t::range_for_related_vars()
(it reads related_variables_offsets[v + 1]); instead of clearing to size 0,
resize related_variables_offsets to a zero-filled offsets array representing no
related vars (one zero per vertex plus a sentinel, e.g., set
related_variables_offsets.size = num_vertices + 1 and fill with 0) and ensure
related_variables is resized to 0 as before; update the deterministic branch
that currently calls related_variables.resize(0, ...) and
related_variables_offsets.resize(0, ...) to produce the zero-filled offsets so
callers get empty ranges safely.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh`:
- Around line 195-197: The callback currently records runtime SET_SOLUTION
injections with the wrong provenance by calling
on_injected(outside_sol.get_host_assignment(), outside_sol.get_objective(),
internals::mip_solution_origin_t::USER_INITIAL); — change the origin tag to
internals::mip_solution_origin_t::USER_INJECTED so runtime callback injections
are correctly classified; update the call site of on_injected(...) accordingly
(keep outside_sol.get_host_assignment() and outside_sol.get_objective() intact).
- Around line 193-194: The assertion uses a fixed absolute tolerance (1e-6)
which is brittle for large-magnitude objectives or float instantiations; modify
the cuopt_assert in the callback that checks outside_sol.get_user_objective() vs
outside_sol_objective to use a scaled/relative tolerance instead (e.g., compute
tol = rel_tol * std::max<f_t>(1,
std::max(std::abs(outside_sol.get_user_objective()),
std::abs(outside_sol_objective))) or include epsilon from
std::numeric_limits<f_t>::epsilon() when appropriate) and then assert that
std::abs(... - ...) <= tol; update the call site around cuopt_assert to use this
computed tol rather than the hard-coded 1e-6.
- Around line 157-177: The callback currently copies the host solution to
device, then applies scaling (scaling.scale_solutions) before calling
problem_ptr->pre_process_assignment, which can mis-scale when presolve changes
indices/sizes; change the order so that after set_sol_callback->set_solution
fills h_incumbent_assignment and raft::copy moves it into incumbent_assignment,
you call problem_ptr->pre_process_assignment(incumbent_assignment) first, and
only then, if settings.mip_scaling, call
scaling.scale_solutions(incumbent_assignment); keep the calls to
set_sol_callback, raft::copy, pre_process_assignment, and
scaling.scale_solutions in that sequence and preserve stream/synchronization
semantics.
- Around line 16-18: The header solution_callbacks.cuh uses std::isfinite and
std::abs but does not include <cmath>, relying on transitive includes; add
`#include` <cmath> to this header so it is self-contained and ensures
std::isfinite and std::abs are declared for the uses in solution_callbacks.cuh
(references: std::isfinite at the code using it and std::abs where used).
In `@cpp/tests/mip/diversity_test.cu`:
- Around line 300-312: Remove the unreachable dead code after the early return:
keep the single return statement that calls detail::compute_hash(hashes) and
delete the subsequent block that references
diversity_manager.get_population_pointer(), the loop over
pop->population_to_vector() calling sol.get_hash(), the secondary call to
detail::compute_hash, the printf using path.c_str(), and the later return of
final_hash so no code remains after the initial return.
---
Nitpick comments:
In `@cpp/include/cuopt/linear_programming/utilities/internals.hpp`:
- Around line 107-118: Add Doxygen documentation to the public class
get_solution_callback_ext_t describing its purpose as an extended solution
callback that supplies origin/metadata alongside solution data, document the
virtual method get_solution parameters and expected behavior, and state
thread-safety guarantees (e.g., whether implementations must be thread-safe or
called from a single thread). Place the comment above the class definition and
mention the overridden get_type() behavior returning
base_solution_callback_type::GET_SOLUTION_EXT so readers can locate this
specialization.
- Around line 27-42: Add Doxygen documentation to the new public enum
mip_solution_origin_t and each enumerator so external users and generated docs
understand the solution-origin taxonomy; update the enum definition in
internals.hpp (mip_solution_origin_t) to include a brief summary comment above
the enum and short `@enum/`@brief or per-enumerator `@brief` comments for values
like UNKNOWN, BRANCH_AND_BOUND_NODE, BRANCH_AND_BOUND_DIVING, FEASIBILITY_JUMP,
CPU_FEASIBILITY_JUMP, LOCAL_SEARCH, QUICK_FEASIBLE, LP_ROUNDING, RECOMBINATION,
SUB_MIP, USER_INITIAL, USER_INJECTED, RINS, and PRESOLVE describing what each
origin means and any callback implications.
- Around line 65-69: The public callback struct mip_solution_callback_info_t
lacks Doxygen comments; add a brief Doxygen comment above the struct describing
its role in the extended callback API and document each member: document
struct_size as the size/version stamp (uint64_t) used for ABI compatibility,
origin (mip_solution_origin_t) as the provenance of the solution with possible
enum values, and work_timestamp (double) as the timestamp or monotonic time
associated with the callback; keep descriptions concise and use `@brief` and
`@param/`@var or `@field` tags so downstream consumers and generated docs see each
field purpose.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 48-54: The current code unconditionally undefines and defines the
CUOPT_DETERMINISM_LOG macro (symbols: CUOPT_DETERMINISM_LOG,
CUOPT_DETERMINISM_LOG(logger,...)) which forces verbose logging; change it to be
enabled only when a compile-time flag is set by replacing the unconditional
pattern with a conditional compile guard (e.g., wrap the `#define` of
CUOPT_DETERMINISM_LOG inside an `#ifdef` CUOPT_ENABLE_DETERMINISM_LOGGING / `#endif`
and remove the unconditional undef/#define), and provide a no-op fallback macro
when the flag is not defined so production builds remain quiet.
- Around line 591-593: The repair queue push currently uses a default timestamp
(-1.0) which is acceptable but unclear; change the pushed tuple so the timestamp
is explicitly 0.0 to document non-deterministic-only intent: in the block
guarded by attempt_repair (around mutex_repair_.lock() and
repair_queue_.push_back({solution, origin})), replace the implicit/default
timestamp with an explicit work_timestamp = 0.0 when constructing the pushed
entry; this clarifies behavior relative to set_new_solution,
repair_heuristic_solutions, and emit_solution_callback without changing
semantics.
- Around line 4192-4238: The balancing currently in
deterministic_balance_worker_loads uses backlog.size() (backlog_counts) which
ignores per-node cost; change it to use an estimated-work metric by summing each
node's work estimate instead of counts: for each mip_node_t<i_t,f_t>* in
worker.backlog use an existing field or method (e.g., node->estimated_work or
implement estimate_node_work(node)) to compute per-worker total_work_estimates
vector and total_work, max_work, min_work, and use those to decide
needs_balance. When redistributing, collect all nodes into all_backlog_nodes
then perform a greedy largest-first assignment (sort nodes by estimated work
descending and push each node to the worker with current smallest
total_assigned_work) rather than simple round-robin so heavy nodes are balanced;
update any references to backlog_counts/min/max to use work estimates. Ensure
worker.backlog.clear()/push logic still uses mip_node_t* but maintains the new
per-worker work totals during distribution.
In `@cpp/src/mip_heuristics/solver.cu`:
- Around line 20-23: Add documentation and a compilation guard around the macro
override: explain why CUOPT_DETERMINISM_LOG is being undefined and redefined to
CUOPT_LOG_INFO (e.g., temporary verbose logging for development/troubleshooting)
and either wrap the `#undef/`#define in a conditional build flag (such as `#ifdef`
DEBUG or a project-specific CUOPT_DETERMINISM_OVERRIDE) or add a clear comment
referencing utilities/determinism_log.hpp and CUOPT_LOG_INFO so maintainers know
this change is intentional and can disable it for production.
In `@cpp/tests/mip/diversity_test.cu`:
- Around line 71-73: Remove the unused seed parameter from the helper functions
by deleting the seed parameter (and its default initializer) from the
declarations and definitions of test_full_run_determinism,
test_initial_solution_determinism, and test_recombiners_determinism; then update
all call sites so they no longer pass a seed argument (the tests already set the
seed via cuopt::seed_generator::set_seed(seed)), ensuring signatures only
include the necessary parameters (e.g., path and work_limit or whatever else the
helper truly needs).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c6ac0f60-5442-4c71-a41f-54397e72d31c
📒 Files selected for processing (18)
benchmarks/linear_programming/cuopt/run_mip.cppcpp/include/cuopt/linear_programming/constants.hcpp/include/cuopt/linear_programming/utilities/internals.hppcpp/src/branch_and_bound/branch_and_bound.cppcpp/src/branch_and_bound/branch_and_bound.hppcpp/src/mip_heuristics/diversity/population.cucpp/src/mip_heuristics/diversity/population.cuhcpp/src/mip_heuristics/local_search/rounding/simple_rounding.cucpp/src/mip_heuristics/problem/presolve_data.cucpp/src/mip_heuristics/problem/presolve_data.cuhcpp/src/mip_heuristics/problem/problem.cucpp/src/mip_heuristics/problem/problem.cuhcpp/src/mip_heuristics/solution/solution.cucpp/src/mip_heuristics/solution/solution.cuhcpp/src/mip_heuristics/solution_callbacks.cuhcpp/src/mip_heuristics/solver.cucpp/src/mip_heuristics/solver_context.cuhcpp/tests/mip/diversity_test.cu
| CUOPT_LOG_INFO( | ||
| "1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d " | ||
| "time_limit=%.6f work_limit=%.6f", | ||
| (int)heuristics_only, | ||
| (int)deterministic, | ||
| settings.determinism_mode, | ||
| settings.time_limit, | ||
| settings.work_limit); |
There was a problem hiding this comment.
Remove debug prefix from log message.
The log message contains "1run_mip" which appears to be a debug artifact. This should be "run_mip".
🔧 Proposed fix
CUOPT_LOG_INFO(
- "1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
+ "run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d "
"time_limit=%.6f work_limit=%.6f",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| CUOPT_LOG_INFO( | |
| "1run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d " | |
| "time_limit=%.6f work_limit=%.6f", | |
| (int)heuristics_only, | |
| (int)deterministic, | |
| settings.determinism_mode, | |
| settings.time_limit, | |
| settings.work_limit); | |
| CUOPT_LOG_INFO( | |
| "run_mip settings: heuristics_only=%d deterministic=%d determinism_mode=%d " | |
| "time_limit=%.6f work_limit=%.6f", | |
| (int)heuristics_only, | |
| (int)deterministic, | |
| settings.determinism_mode, | |
| settings.time_limit, | |
| settings.work_limit); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmarks/linear_programming/cuopt/run_mip.cpp` around lines 270 - 277, The
log message passed to CUOPT_LOG_INFO contains a debug artifact "1run_mip";
update the format string in the CUOPT_LOG_INFO call (the call at/around function
run_mip) to remove the leading "1" so it reads "run_mip settings:
heuristics_only=%d deterministic=%d determinism_mode=%d time_limit=%.6f
work_limit=%.6f" while leaving the argument list ((int)heuristics_only,
(int)deterministic, settings.determinism_mode, settings.time_limit,
settings.work_limit) unchanged.
| benchmark_info.last_improvement_of_best_feasible, | ||
| benchmark_info.last_improvement_after_recombination); | ||
| // solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream()); | ||
| // 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream()); |
There was a problem hiding this comment.
Remove debug prefix from comment.
The commented-out code has "1solution" which appears to be a debug marker. If this code should remain commented out, clean up the debug prefix.
🔧 Proposed fix
- // 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());
+ // solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream());📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // 1solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream()); | |
| // solution.write_to_sol_file(base_filename + ".sol", handle_.get_stream()); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@benchmarks/linear_programming/cuopt/run_mip.cpp` at line 299, The
commented-out debug line contains a stray debug prefix "1solution"; update the
comment to remove the prefix so the call reads as a normal commented call
(referencing write_to_sol_file, base_filename and handle_.get_stream()) — e.g.,
change "// 1solution.write_to_sol_file(base_filename + ".sol",
handle_.get_stream());" to "// solution.write_to_sol_file(base_filename +
".sol", handle_.get_stream());" (or fully uncomment if intended to be active).
| /* @brief MIP determinism mode flags (bitset) */ | ||
| #define CUOPT_DETERMINISM_NONE 0x0 | ||
| #define CUOPT_DETERMINISM_BB \ | ||
| 0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous | ||
| // rleease | ||
| #define CUOPT_DETERMINISM_GPU_HEURISTICS 0x2 | ||
| #define CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS) | ||
|
|
||
| /* Backward compatibility aliases */ | ||
| #define CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE | ||
| #define CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL | ||
| #define CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB | ||
| #define CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS |
There was a problem hiding this comment.
Fix typo in comment.
Line 84 has "rleease" which should be "release".
🔧 Proposed fix
`#define` CUOPT_DETERMINISM_BB \
- 0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous
- // rleease
+ 0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous
+ // release📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| /* @brief MIP determinism mode flags (bitset) */ | |
| #define CUOPT_DETERMINISM_NONE 0x0 | |
| #define CUOPT_DETERMINISM_BB \ | |
| 0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous | |
| // rleease | |
| #define CUOPT_DETERMINISM_GPU_HEURISTICS 0x2 | |
| #define CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS) | |
| /* Backward compatibility aliases */ | |
| #define CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE | |
| #define CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL | |
| #define CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB | |
| #define CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS | |
| /* `@brief` MIP determinism mode flags (bitset) */ | |
| `#define` CUOPT_DETERMINISM_NONE 0x0 | |
| `#define` CUOPT_DETERMINISM_BB \ | |
| 0x1 // matches the previous value of '1' which was for B&B-only determinism in the previous | |
| // release | |
| `#define` CUOPT_DETERMINISM_GPU_HEURISTICS 0x2 | |
| `#define` CUOPT_DETERMINISM_FULL (CUOPT_DETERMINISM_BB | CUOPT_DETERMINISM_GPU_HEURISTICS) | |
| /* Backward compatibility aliases */ | |
| `#define` CUOPT_MODE_OPPORTUNISTIC CUOPT_DETERMINISM_NONE | |
| `#define` CUOPT_MODE_DETERMINISTIC CUOPT_DETERMINISM_FULL | |
| `#define` CUOPT_MODE_DETERMINISTIC_BB CUOPT_DETERMINISM_BB | |
| `#define` CUOPT_MODE_DETERMINISTIC_GPU_HEURISTICS CUOPT_DETERMINISM_GPU_HEURISTICS |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/include/cuopt/linear_programming/constants.h` around lines 80 - 92, Typo
in the comment for CUOPT_DETERMINISM_BB: change "rleease" to "release" in the
comment immediately following the CUOPT_DETERMINISM_BB macro definition so the
comment reads "...in the previous release"; locate the comment next to the
CUOPT_DETERMINISM_BB symbol in constants.h and correct the spelling.
| if (original_lp_.objective_is_integral) { | ||
| lp_settings.cut_off = | ||
| std::ceil(worker.local_upper_bound - settings_.integer_tol) + settings_.dual_tol; | ||
| } else { | ||
| lp_settings.cut_off = worker.local_upper_bound + settings_.dual_tol; | ||
| } | ||
| lp_settings.inside_mip = 2; | ||
| lp_settings.time_limit = remaining_time; | ||
| lp_settings.work_limit = std::numeric_limits<f_t>::infinity(); |
There was a problem hiding this comment.
Node LP work_limit unconditionally set to infinity ignores user-configured work limits.
Setting lp_settings.work_limit = std::numeric_limits<f_t>::infinity() means individual node LP solves will never respect the user's work limit. The work limit check only happens at horizon boundaries (line 3703-3704), so a single long-running node LP could exceed the budget before the next sync.
Consider propagating a per-node work budget derived from settings_.work_limit or at minimum documenting this design decision.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 3773 - 3781, The
code unconditionally sets lp_settings.work_limit =
std::numeric_limits<f_t>::infinity(), which ignores any user-configured work
budget; change lp_settings.work_limit to respect settings_.work_limit by
computing a per-node budget (e.g., min(settings_.work_limit - used_work, a
reasonable per-node cap, or remaining_time-derived budget) before assigning to
lp_settings.work_limit), ensuring lp_settings.work_limit is finite when a user
limit exists and falls back to infinity only if settings_.work_limit is
infinite; update the assignment site (lp_settings.work_limit) and any logic that
computes remaining/horizon budgets so the node LP solve cannot exceed the
configured work budget.
| if (original_lp_.objective_is_integral) { | ||
| lp_settings.cut_off = | ||
| std::ceil(worker.local_upper_bound - settings_.integer_tol) + settings_.dual_tol; | ||
| } else { | ||
| lp_settings.cut_off = worker.local_upper_bound + settings_.dual_tol; | ||
| } | ||
| lp_settings.inside_mip = 2; | ||
| lp_settings.time_limit = remaining_time; | ||
| lp_settings.work_limit = std::numeric_limits<f_t>::infinity(); |
There was a problem hiding this comment.
Diving worker also sets work_limit to infinity—same concern as BFS worker.
Same issue as the deterministic BFS path at line 3781. Individual diving LP solves won't respect work limits.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 4446 - 4454, The
diving branch sets lp_settings.work_limit to infinity so individual LP solves
ignore global work limits; change lp_settings.work_limit to a finite remaining
work value (mirror the BFS fix) by assigning it to the available work budget
(e.g., compute remaining_work = worker.work_limit - worker.work_done or use the
existing remaining_work variable) and set lp_settings.work_limit =
remaining_work instead of infinity; update the code around
lp_settings.work_limit in the same block that sets cut_off and time_limit so
diving LP respects the global work budget.
| #include <limits> | ||
| #include <mutex> | ||
| #include <vector> |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '10,25p' cpp/src/mip_heuristics/solution_callbacks.cuh
rg -n '\bstd::(isfinite|abs)\b' cpp/src/mip_heuristics/solution_callbacks.cuhRepository: aliceb-nv/cuopt-fork
Length of output: 670
Include <cmath> in this header.
The header uses std::isfinite (line 75) and std::abs (line 193), both declared in <cmath>, but this header is not included. The code currently works only through transitive includes, making the header non-self-contained.
Proposed fix
+#include <cmath>
`#include` <limits>
`#include` <mutex>
`#include` <vector>📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| #include <limits> | |
| #include <mutex> | |
| #include <vector> | |
| `#include` <cmath> | |
| `#include` <limits> | |
| `#include` <mutex> | |
| `#include` <vector> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 16 - 18, The
header solution_callbacks.cuh uses std::isfinite and std::abs but does not
include <cmath>, relying on transitive includes; add `#include` <cmath> to this
header so it is self-contained and ensures std::isfinite and std::abs are
declared for the uses in solution_callbacks.cuh (references: std::isfinite at
the code using it and std::abs where used).
| auto callback_num_variables = problem_ptr->original_problem_ptr->get_n_variables(); | ||
| rmm::device_uvector<f_t> incumbent_assignment(callback_num_variables, | ||
| current_incumbent.handle_ptr->get_stream()); | ||
| auto inf = std::numeric_limits<f_t>::infinity(); | ||
| current_incumbent.handle_ptr->sync_stream(); | ||
| std::vector<f_t> h_incumbent_assignment(incumbent_assignment.size()); | ||
| std::vector<f_t> h_outside_sol_objective(1, inf); | ||
| std::vector<f_t> h_user_bound(1, user_bound); | ||
| set_sol_callback->set_solution(h_incumbent_assignment.data(), | ||
| h_outside_sol_objective.data(), | ||
| h_user_bound.data(), | ||
| set_sol_callback->get_user_data()); | ||
| f_t outside_sol_objective = h_outside_sol_objective[0]; | ||
| if (outside_sol_objective == inf) { continue; } | ||
|
|
||
| raft::copy(incumbent_assignment.data(), | ||
| h_incumbent_assignment.data(), | ||
| incumbent_assignment.size(), | ||
| current_incumbent.handle_ptr->get_stream()); | ||
| if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); } | ||
| bool is_valid = problem_ptr->pre_process_assignment(incumbent_assignment); |
There was a problem hiding this comment.
Preprocess the injected assignment before scaling it.
Line 157 builds the callback buffer in original problem space, but Line 176 applies solver-space scaling before Line 177 crushes/preprocesses it. With presolve enabled, that can mis-scale the wrong indices or sizes. The inbound path should preprocess first, then scale.
🔁 Proposed fix
raft::copy(incumbent_assignment.data(),
h_incumbent_assignment.data(),
incumbent_assignment.size(),
current_incumbent.handle_ptr->get_stream());
- if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); }
bool is_valid = problem_ptr->pre_process_assignment(incumbent_assignment);
if (!is_valid) { continue; }
+ if (settings.mip_scaling) { scaling.scale_solutions(incumbent_assignment); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 157 - 177, The
callback currently copies the host solution to device, then applies scaling
(scaling.scale_solutions) before calling problem_ptr->pre_process_assignment,
which can mis-scale when presolve changes indices/sizes; change the order so
that after set_sol_callback->set_solution fills h_incumbent_assignment and
raft::copy moves it into incumbent_assignment, you call
problem_ptr->pre_process_assignment(incumbent_assignment) first, and only then,
if settings.mip_scaling, call scaling.scale_solutions(incumbent_assignment);
keep the calls to set_sol_callback, raft::copy, pre_process_assignment, and
scaling.scale_solutions in that sequence and preserve stream/synchronization
semantics.
| cuopt_assert(std::abs(outside_sol.get_user_objective() - outside_sol_objective) <= 1e-6, | ||
| "External solution objective mismatch"); |
There was a problem hiding this comment.
Use a scaled tolerance for the callback objective check.
The recomputed objective and the callback-reported objective come from different numeric paths, so a fixed 1e-6 absolute threshold here is brittle. Large-magnitude objectives, or float instantiations of f_t, can trip this assert from normal roundoff. Please compare with a relative/scaled tolerance instead of a hard-coded absolute one.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 193 - 194, The
assertion uses a fixed absolute tolerance (1e-6) which is brittle for
large-magnitude objectives or float instantiations; modify the cuopt_assert in
the callback that checks outside_sol.get_user_objective() vs
outside_sol_objective to use a scaled/relative tolerance instead (e.g., compute
tol = rel_tol * std::max<f_t>(1,
std::max(std::abs(outside_sol.get_user_objective()),
std::abs(outside_sol_objective))) or include epsilon from
std::numeric_limits<f_t>::epsilon() when appropriate) and then assert that
std::abs(... - ...) <= tol; update the call site around cuopt_assert to use this
computed tol rather than the hard-coded 1e-6.
| on_injected(outside_sol.get_host_assignment(), | ||
| outside_sol.get_objective(), | ||
| internals::mip_solution_origin_t::USER_INITIAL); |
There was a problem hiding this comment.
Mark runtime callback injections as USER_INJECTED.
This helper is handling a SET_SOLUTION callback, so tagging the accepted candidate as USER_INITIAL rewrites its provenance. Any origin-based logging, stats, or callback metadata will misclassify a runtime injection as startup seeding.
🏷️ Proposed fix
on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(),
- internals::mip_solution_origin_t::USER_INITIAL);
+ internals::mip_solution_origin_t::USER_INJECTED);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| on_injected(outside_sol.get_host_assignment(), | |
| outside_sol.get_objective(), | |
| internals::mip_solution_origin_t::USER_INITIAL); | |
| on_injected(outside_sol.get_host_assignment(), | |
| outside_sol.get_objective(), | |
| internals::mip_solution_origin_t::USER_INJECTED); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solution_callbacks.cuh` around lines 195 - 197, The
callback currently records runtime SET_SOLUTION injections with the wrong
provenance by calling on_injected(outside_sol.get_host_assignment(),
outside_sol.get_objective(), internals::mip_solution_origin_t::USER_INITIAL); —
change the origin tag to internals::mip_solution_origin_t::USER_INJECTED so
runtime callback injections are correctly classified; update the call site of
on_injected(...) accordingly (keep outside_sol.get_host_assignment() and
outside_sol.get_objective() intact).
| return detail::compute_hash(hashes); | ||
|
|
||
| auto pop = diversity_manager.get_population_pointer(); | ||
| for (const auto& sol : pop->population_to_vector()) { | ||
| hashes.push_back(sol.get_hash()); | ||
| } | ||
|
|
||
| uint32_t final_hash = detail::compute_hash(hashes); | ||
| printf("%s: final hash: 0x%x, pop size %d\n", | ||
| path.c_str(), | ||
| final_hash, | ||
| (int)pop->population_to_vector().size()); | ||
| return final_hash; |
There was a problem hiding this comment.
Remove unreachable dead code after early return.
Line 300 returns before the code at lines 302-312 can execute. This appears to be leftover code from development that should be removed.
🧹 Proposed fix to remove dead code
}
return detail::compute_hash(hashes);
-
- auto pop = diversity_manager.get_population_pointer();
- for (const auto& sol : pop->population_to_vector()) {
- hashes.push_back(sol.get_hash());
- }
-
- uint32_t final_hash = detail::compute_hash(hashes);
- printf("%s: final hash: 0x%x, pop size %d\n",
- path.c_str(),
- final_hash,
- (int)pop->population_to_vector().size());
- return final_hash;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| return detail::compute_hash(hashes); | |
| auto pop = diversity_manager.get_population_pointer(); | |
| for (const auto& sol : pop->population_to_vector()) { | |
| hashes.push_back(sol.get_hash()); | |
| } | |
| uint32_t final_hash = detail::compute_hash(hashes); | |
| printf("%s: final hash: 0x%x, pop size %d\n", | |
| path.c_str(), | |
| final_hash, | |
| (int)pop->population_to_vector().size()); | |
| return final_hash; | |
| return detail::compute_hash(hashes); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/tests/mip/diversity_test.cu` around lines 300 - 312, Remove the
unreachable dead code after the early return: keep the single return statement
that calls detail::compute_hash(hashes) and delete the subsequent block that
references diversity_manager.get_population_pointer(), the loop over
pop->population_to_vector() calling sol.get_hash(), the secondary call to
detail::compute_hash, the printf using path.c_str(), and the later return of
final_hash so no code remains after the initial return.
local test PR to trigger coderabbit review
Description
Issue
Checklist