Expose diving hyper parameters + Vector length/Farkas diving#1298
Expose diving hyper parameters + Vector length/Farkas diving#1298nguidotti wants to merge 109 commits into
Conversation
Remove dependency on rmm::mr::device_memory_resource base class. Resources now satisfy the cuda::mr::resource concept directly. - Replace shared_ptr<device_memory_resource> with value types and cuda::mr::any_resource<cuda::mr::device_accessible> for type-erased storage - Replace set_current_device_resource(ptr) with set_current_device_resource_ref - Replace set_per_device_resource(id, ptr) with set_per_device_resource_ref - Remove make_owning_wrapper usage - Remove dynamic_cast on memory resources (no common base class) - Remove owning_wrapper.hpp and device_memory_resource.hpp includes - Add missing thrust/iterator/transform_output_iterator.h include (no longer transitively included via CCCL)
…nd deterministic mode. Signed-off-by: Nicolas Guidotti <224634272+nguidotti@users.noreply.github.com>
Signed-off-by: Nicolas Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas Guidotti <nguidotti@nvidia.com>
… shared_ptr to avoid unnecessary copy. Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
…l crash in work-stealing Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
…queue for now. refactoring. Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
… are present Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
# Conflicts: # cpp/src/utilities/cuda_helpers.cuh
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
# Conflicts: # ci/validate_wheel.sh
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
# Conflicts: # cpp/src/branch_and_bound/branch_and_bound.cpp
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
…rkers Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
# Conflicts: # cpp/src/branch_and_bound/mip_node.hpp # cpp/src/branch_and_bound/pseudo_costs.cpp
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
Signed-off-by: Nicolas L. Guidotti <nguidotti@nvidia.com>
# Conflicts: # cpp/src/mip_heuristics/mip_constants.hpp
# Conflicts: # cpp/src/math_optimization/solver_settings.cu
📝 WalkthroughWalkthroughThis PR introduces new diving heuristic strategies (Farkas and vector-length distance-based variable selection) for MIP branch-and-bound, refactors the parallel search architecture from a generic worker pool to specialized BFS and diving worker types, compresses LP bases into packed bytes within nodes, and adds consistent OpenMP task prioritization across the solver infrastructure. ChangesDiving Heuristics and Search Architecture
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/branch_and_bound/branch_and_bound.cpp (1)
1123-1150:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winDeterministic diving dispatch is incomplete for newly added strategies.
Line 1123’s switch omits
FARKAS_DIVINGandVECTOR_LENGTH_DIVING. Since deterministic workers are initialized from all diving strategies (Lines 3065-3066), this can fall into default (Line 1149), returnbranch_var = -1, and trip the assert at Line 1255.🛠️ Minimal fix for deterministic strategy parity
switch (this->worker.diving_type) { case search_strategy_t::PSEUDOCOST_DIVING: return pseudocost_diving( this->worker.pc_snapshot, fractional, x, *this->worker.root_solution, log); case search_strategy_t::LINE_SEARCH_DIVING: return line_search_diving<i_t, f_t>(fractional, x, *this->worker.root_solution, log); case search_strategy_t::GUIDED_DIVING: if (this->worker.incumbent_snapshot.empty()) { return pseudocost_diving( this->worker.pc_snapshot, fractional, x, *this->worker.root_solution, log); } else { return guided_diving( this->worker.pc_snapshot, fractional, x, this->worker.incumbent_snapshot, log); } case search_strategy_t::COEFFICIENT_DIVING: { return coefficient_diving<i_t, f_t>(this->worker.leaf_problem, fractional, x, this->bnb.var_up_locks_, this->bnb.var_down_locks_, log); } + case search_strategy_t::FARKAS_DIVING: + return farkas_diving( + this->worker.leaf_problem, fractional, x, this->bnb.settings_.zero_tol, log); + + case search_strategy_t::VECTOR_LENGTH_DIVING: + return vector_length_diving(this->worker.leaf_problem, fractional, x, log); default: CUOPT_LOG_ERROR("Invalid diving method!"); return {-1, branch_direction_t::NONE}; }Please add deterministic-mode coverage for both strategies so this path is guarded by tests.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 1123 - 1150, The switch over this->worker.diving_type lacks cases for FARKAS_DIVING and VECTOR_LENGTH_DIVING which causes deterministic workers to hit the default and return branch_var = -1; add explicit cases for search_strategy_t::FARKAS_DIVING and search_strategy_t::VECTOR_LENGTH_DIVING and dispatch them to their deterministic implementations (e.g., call the corresponding deterministic functions similar to pseudocost_diving, line_search_diving, guided_diving, coefficient_diving) so the deterministic path used by worker.diving_type is covered; ensure you reference the same parameters used in nearby cases (fractional, x, snapshots, log, and this->worker.* fields) so the assert at Line 1255 will not trip.
🧹 Nitpick comments (1)
cpp/src/dual_simplex/initial_basis.cpp (1)
47-102: ⚡ Quick winAdd round-trip tests for the new basis serialization path.
This code now sits on the node save/restore path and has edge cases around tail packing and unsupported statuses. A small gtest covering lengths 0..7 and all supported
variable_status_tvalues would make regressions much easier to catch.As per coding guidelines,
**/*.{cpp,cc,cxx,c,h,hpp,py}: Add unit tests for code changes; refer tocpp/src/testsfor C/C++ gtest examples andpython/cuopt/cuopt/testsfor Python pytest examples.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/src/dual_simplex/initial_basis.cpp` around lines 47 - 102, Add gtests that perform round-trip serialization using compress_vstatus and decompress_vstatus: for vstatus lengths 0..7 generate all combinations (or iterate over all supported variable_status_t values per position) and assert decompress_vstatus(compress_vstatus(v)) == v; include cases exercising the tail path and any unsupported/invalid status behavior if applicable. Place tests alongside other C++ tests in cpp/src/tests, use the functions compress_vstatus and decompress_vstatus to locate the code, and ensure assertions cover equality and packed size expectations (packed_vstatus.size() == expected n + has_tail) to catch regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cpp/src/branch_and_bound/diving_heuristics.cpp`:
- Around line 262-362: Add unit tests covering the two new heuristics
farkas_diving and vector_length_diving plus the branching-strategy dispatch path
(the selector that chooses which diving heuristic) to ensure deterministic
variable/direction selection, correct behavior when fractional is empty, and
edge cases around zero_tol/tolerance boundaries; create gtest cases in
cpp/src/tests that (1) pass a controlled lp_problem_t and solution to
farkas_diving to assert the chosen branch_var and branch_direction_t for
positive/negative/zero objective coefficients and when fractional is empty, (2)
do the same for vector_length_diving including scenarios that exercise
column_length and eps effects and tie-breaking, and (3) test the dispatcher (the
function that routes to these heuristics) for correct heuristic selection and
propagation of empty-fractional and tolerance behavior.
In `@cpp/src/branch_and_bound/diving_heuristics.hpp`:
- Around line 81-86: The declaration of farkas_diving(...) lacks documentation
for the numerical tolerance parameter zero_tol; update the declaration comment
for branch_variable_t<i_t> farkas_diving(...) to state the contract for zero_tol
(e.g., that it is an absolute tolerance used to treat values |x| <= zero_tol as
zero when deciding integrality/feasibility, expected typical range/units, valid
values and behavior on edge cases such as zero or negative inputs, and
recommended example values), so callers know how to tune and what semantic role
zero_tol plays in the algorithm.
In `@cpp/src/branch_and_bound/node_queue.hpp`:
- Around line 80-82: The node_queue currently requires manual lock()/unlock()
which risks deadlocks on exceptions; add an RAII-style guard by implementing
node_queue::scoped_lock() that returns a movable/non-copyable guard which
acquires the internal mutex in its constructor and releases it in its
destructor, then update all internal mutable operations (the methods exposed
around lock()/unlock() usage between lines 86–132) and call sites to use auto lk
= node_queue.scoped_lock(); instead of raw lock()/unlock(); keep lock()/unlock()
for backward compatibility but document them as deprecated and ensure all
new/modified methods use the scoped_lock to guarantee exception-safe unlocking.
In `@cpp/src/dual_simplex/initial_basis.cpp`:
- Around line 21-45: The packed encoding currently maps
variable_status_t::NONBASIC_FIXED to 0b01 which decodes back to NONBASIC_LOWER,
losing FIXED; fix encode/decode in encode(uint8_t) and decode(uint8_t) so
NONBASIC_FIXED gets a unique 2-bit pattern (for example set case
variable_status_t::NONBASIC_FIXED: val = 0b11) and update decode to return
variable_status_t::NONBASIC_FIXED when val == 0b11; alternatively, if you prefer
to reject FIXED like SUPERBASIC, add an assert/guard in encode for
variable_status_t::NONBASIC_FIXED and do not provide a mapping — but be explicit
in either encode, decode, and call sites (compress_vstatus/decompress_vstatus
and places that reconstruct worker->leaf_vstatus) so the representation is not
lossy.
In `@cpp/src/dual_simplex/presolve.hpp`:
- Around line 53-56: The min_abs_obj_coeff field is incorrectly initialized to 0
which can cause division-by-zero/NaN when computing
log10(original_lp_.max_abs_obj_coeff / original_lp_.min_abs_obj_coeff); change
its initializer from 0 to std::numeric_limits<f_t>::infinity() and update the
comment for max_abs_obj_coeff/min_abs_obj_coeff to state that min_abs_obj_coeff
starts at +infinity and is only lowered by positive coefficient magnitudes (so
the default fails closed), ensuring downstream calls using
original_lp_.min_abs_obj_coeff are safe.
In `@cpp/src/math_optimization/solver_settings.cu`:
- Around line 118-120: The unified parameter registry is missing registration
for mip_diving_hyper_params_t::farkas_obj_dynamism_tol so that
set_parameter*/config files can control it; add an entry for this field to the
float_parameters array alongside the other diving hyper-parameters (e.g., create
a CUOPT_MIP_HYPER_* enum constant like CUOPT_MIP_HYPER_FARKAS_OBJ_DYNAMISM_TOL
and add a mapping {CUOPT_MIP_HYPER_FARKAS_OBJ_DYNAMISM_TOL,
&mip_settings.diving_params.farkas_obj_dynamism_tol, <min>, <max>, <default>,
"description"}), and repeat the same insertion for the other float_parameters
blocks noted (lines analogous to 164-174 and 192-194) so the parameter is
consistently exposed.
In `@cpp/src/mip_heuristics/presolve/probing_cache.cu`:
- Around line 899-902: Clamp num_tasks to >=1 before constructing the OpenMP
taskloop to avoid passing zero into the pragma or into calculate_index_range:
compute effective_num_tasks = max<size_t>(1, bound_presolve.settings.num_tasks ?
bound_presolve.settings.num_tasks : (omp_get_num_threads() ?
omp_get_num_threads() - 1 : 1)) (or equivalent) and use effective_num_tasks in
the pragma and the loop and in the call to calculate_index_range; also add a
regression test covering the single-thread / num_tasks==0 path (use
cpp/src/tests or python/cuopt/cuopt/tests per project conventions) to ensure
probing-cache still runs when bound_presolve.settings.num_tasks==0 or when
omp_get_num_threads()==1.
In `@cpp/src/utilities/circular_deque.hpp`:
- Around line 23-112: Add gtest unit tests for circular_deque_t to verify
default construction, sized construction, push_back/push_front,
pop_front/pop_back, size()/capacity(), empty()/full(), wrap-around behavior and
clear_resize; specifically create tests that (1) default-construct
circular_deque_t<T> and assert empty() and capacity()==0/1 expectations, (2)
construct with explicit capacity and push until full then assert full() and
correct size(), (3) alternate push_front/push_back and pop_front/pop_back to
force index wrap-around and assert invariants (size(), front(), back()), (4)
call clear_resize(new_capacity) and verify buffer semantics and that further
pushes/pops behave correctly, and (5) exercise boundary cases like popping from
single-element deque to become empty; reference circular_deque_t, push_back,
push_front, pop_front, pop_back, size, capacity, empty, full, and clear_resize
and place tests under cpp/src/tests following existing gtest examples.
- Around line 26-27: The default constructor creates a buffer of size 1 so
full() immediately returns true; change the constructor to allocate at least two
slots (e.g., buffer_(2)) and set capacity_ to buffer_.size() (capacity_ = 2) so
the circular buffer can represent an empty state correctly (keep the existing
full() check (tail_ + 1) % capacity_ == head_). Update the circular_deque_t()
constructor to allocate buffer_ with size >= 2 and initialize capacity_, head_,
tail_ accordingly so push_front/push_back become usable.
In `@cpp/src/utilities/omp_helpers.hpp`:
- Around line 18-23: The function calculate_index_range performs division by n
without validation; add a precondition/assert at the start of
calculate_index_range to ensure n > 0 (and optionally that total >= 0 if you
want stricter checks) so the function fails fast on invalid input instead of
invoking undefined behavior; update the implementation of calculate_index_range
to check n and document the invariant above the function (use assert or
equivalent in your project).
---
Outside diff comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 1123-1150: The switch over this->worker.diving_type lacks cases
for FARKAS_DIVING and VECTOR_LENGTH_DIVING which causes deterministic workers to
hit the default and return branch_var = -1; add explicit cases for
search_strategy_t::FARKAS_DIVING and search_strategy_t::VECTOR_LENGTH_DIVING and
dispatch them to their deterministic implementations (e.g., call the
corresponding deterministic functions similar to pseudocost_diving,
line_search_diving, guided_diving, coefficient_diving) so the deterministic path
used by worker.diving_type is covered; ensure you reference the same parameters
used in nearby cases (fractional, x, snapshots, log, and this->worker.* fields)
so the assert at Line 1255 will not trip.
---
Nitpick comments:
In `@cpp/src/dual_simplex/initial_basis.cpp`:
- Around line 47-102: Add gtests that perform round-trip serialization using
compress_vstatus and decompress_vstatus: for vstatus lengths 0..7 generate all
combinations (or iterate over all supported variable_status_t values per
position) and assert decompress_vstatus(compress_vstatus(v)) == v; include cases
exercising the tail path and any unsupported/invalid status behavior if
applicable. Place tests alongside other C++ tests in cpp/src/tests, use the
functions compress_vstatus and decompress_vstatus to locate the code, and ensure
assertions cover equality and packed size expectations (packed_vstatus.size() ==
expected n + has_tail) to catch regressions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 5052667f-6bf1-4cea-b500-2f935b793979
📒 Files selected for processing (32)
cpp/include/cuopt/linear_programming/constants.hcpp/include/cuopt/linear_programming/mip/diving_hyper_params.hppcpp/include/cuopt/linear_programming/mip/solver_settings.hppcpp/src/branch_and_bound/branch_and_bound.cppcpp/src/branch_and_bound/branch_and_bound.hppcpp/src/branch_and_bound/constants.hppcpp/src/branch_and_bound/deterministic_workers.hppcpp/src/branch_and_bound/diving_heuristics.cppcpp/src/branch_and_bound/diving_heuristics.hppcpp/src/branch_and_bound/mip_node.hppcpp/src/branch_and_bound/node_queue.hppcpp/src/branch_and_bound/pseudo_costs.cppcpp/src/branch_and_bound/pseudo_costs.hppcpp/src/branch_and_bound/worker.hppcpp/src/branch_and_bound/worker_pool.hppcpp/src/dual_simplex/initial_basis.cppcpp/src/dual_simplex/initial_basis.hppcpp/src/dual_simplex/presolve.hppcpp/src/dual_simplex/simplex_solver_settings.hppcpp/src/math_optimization/solver_settings.cucpp/src/mip_heuristics/diversity/lns/rins.cucpp/src/mip_heuristics/feasibility_jump/early_cpufj.cucpp/src/mip_heuristics/feasibility_jump/early_gpufj.cucpp/src/mip_heuristics/local_search/local_search.cucpp/src/mip_heuristics/mip_constants.hppcpp/src/mip_heuristics/presolve/conditional_bound_strengthening.cucpp/src/mip_heuristics/presolve/probing_cache.cucpp/src/mip_heuristics/solve.cucpp/src/mip_heuristics/solver.cucpp/src/utilities/circular_deque.hppcpp/src/utilities/omp_helpers.hppcpp/src/utilities/pcgenerator.hpp
💤 Files with no reviewable changes (1)
- cpp/src/branch_and_bound/pseudo_costs.hpp
| template <typename i_t, typename f_t> | ||
| branch_variable_t<i_t> farkas_diving(const lp_problem_t<i_t, f_t>& lp, | ||
| const std::vector<i_t>& fractional, | ||
| const std::vector<f_t>& solution, | ||
| f_t zero_tol, | ||
| logger_t& log); |
There was a problem hiding this comment.
Document the zero_tol contract in the declaration.
Line 85 introduces a numerical tolerance parameter, but the declaration does not explain expected scale/role (e.g., absolute tolerance vs. feasibility threshold), which makes tuning and call-site use ambiguous.
✏️ Suggested header-level clarification
template <typename i_t, typename f_t>
+// `zero_tol` is the absolute near-zero threshold used by Farkas score computations.
+// Values with |x| <= zero_tol are treated as zero.
branch_variable_t<i_t> farkas_diving(const lp_problem_t<i_t, f_t>& lp,
const std::vector<i_t>& fractional,
const std::vector<f_t>& solution,
f_t zero_tol,
logger_t& log);As per coding guidelines: "Flag missing documentation for numerical tolerances or algorithm parameters".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| template <typename i_t, typename f_t> | |
| branch_variable_t<i_t> farkas_diving(const lp_problem_t<i_t, f_t>& lp, | |
| const std::vector<i_t>& fractional, | |
| const std::vector<f_t>& solution, | |
| f_t zero_tol, | |
| logger_t& log); | |
| template <typename i_t, typename f_t> | |
| // `zero_tol` is the absolute near-zero threshold used by Farkas score computations. | |
| // Values with |x| <= zero_tol are treated as zero. | |
| branch_variable_t<i_t> farkas_diving(const lp_problem_t<i_t, f_t>& lp, | |
| const std::vector<i_t>& fractional, | |
| const std::vector<f_t>& solution, | |
| f_t zero_tol, | |
| logger_t& log); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/branch_and_bound/diving_heuristics.hpp` around lines 81 - 86, The
declaration of farkas_diving(...) lacks documentation for the numerical
tolerance parameter zero_tol; update the declaration comment for
branch_variable_t<i_t> farkas_diving(...) to state the contract for zero_tol
(e.g., that it is an absolute tolerance used to treat values |x| <= zero_tol as
zero when deciding integrality/feasibility, expected typical range/units, valid
values and behavior on edge cases such as zero or negative inputs, and
recommended example values), so callers know how to tune and what semantic role
zero_tol plays in the algorithm.
| // Maximum and minimum value of the coefficients in the objective function. This is used | ||
| // for determine the "objective dynamism" in Farkas diving. | ||
| f_t max_abs_obj_coeff = 0; | ||
| f_t min_abs_obj_coeff = 0; |
There was a problem hiding this comment.
Initialize min_abs_obj_coeff to a non-zero-safe sentinel.
These fields feed std::log10(original_lp_.max_abs_obj_coeff / original_lp_.min_abs_obj_coeff) downstream. With the new default of 0, an unset or zero minimum can turn the Farkas-diving gate into 0/0, inf, or NaN. Initializing min_abs_obj_coeff to std::numeric_limits<f_t>::infinity() and documenting that only positive coefficients should lower it makes the default fail closed.
Proposed fix
- f_t max_abs_obj_coeff = 0;
- f_t min_abs_obj_coeff = 0;
+ f_t max_abs_obj_coeff = 0;
+ f_t min_abs_obj_coeff = std::numeric_limits<f_t>::infinity();📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Maximum and minimum value of the coefficients in the objective function. This is used | |
| // for determine the "objective dynamism" in Farkas diving. | |
| f_t max_abs_obj_coeff = 0; | |
| f_t min_abs_obj_coeff = 0; | |
| // Maximum and minimum value of the coefficients in the objective function. This is used | |
| // for determine the "objective dynamism" in Farkas diving. | |
| f_t max_abs_obj_coeff = 0; | |
| f_t min_abs_obj_coeff = std::numeric_limits<f_t>::infinity(); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/dual_simplex/presolve.hpp` around lines 53 - 56, The
min_abs_obj_coeff field is incorrectly initialized to 0 which can cause
division-by-zero/NaN when computing log10(original_lp_.max_abs_obj_coeff /
original_lp_.min_abs_obj_coeff); change its initializer from 0 to
std::numeric_limits<f_t>::infinity() and update the comment for
max_abs_obj_coeff/min_abs_obj_coeff to state that min_abs_obj_coeff starts at
+infinity and is only lowered by positive coefficient magnitudes (so the default
fails closed), ensuring downstream calls using original_lp_.min_abs_obj_coeff
are safe.
| // Diving heuristic hyper-parameters (hidden from default --help: name contains "hyper_") | ||
| {CUOPT_MIP_HYPER_DIVING_ITERATION_LIMIT_FACTOR, &mip_settings.diving_params.iteration_limit_factor, f_t(0.0), f_t(1.0), f_t(0.05), "fraction of best-first iterations allowed per dive"} | ||
| }; |
There was a problem hiding this comment.
Expose farkas_obj_dynamism_tol in the unified parameter registry.
mip_diving_hyper_params_t includes farkas_obj_dynamism_tol, but this file does not register it in float_parameters, so it cannot be configured via set_parameter*() or config files like the rest of diving knobs.
Proposed fix
--- a/cpp/include/cuopt/linear_programming/constants.h
+++ b/cpp/include/cuopt/linear_programming/constants.h
@@
`#define` CUOPT_MIP_HYPER_DIVING_BACKTRACK_LIMIT "mip_hyper_diving_backtrack_limit"
+#define CUOPT_MIP_HYPER_DIVING_FARKAS_OBJ_DYNAMISM_TOL "mip_hyper_diving_farkas_obj_dynamism_tol"--- a/cpp/src/math_optimization/solver_settings.cu
+++ b/cpp/src/math_optimization/solver_settings.cu
@@
{CUOPT_MIP_HYPER_DIVING_ITERATION_LIMIT_FACTOR, &mip_settings.diving_params.iteration_limit_factor, f_t(0.0), f_t(1.0), f_t(0.05), "fraction of best-first iterations allowed per dive"}
+ ,{CUOPT_MIP_HYPER_DIVING_FARKAS_OBJ_DYNAMISM_TOL, &mip_settings.diving_params.farkas_obj_dynamism_tol, f_t(0.0), f_t(1.0), f_t(1e-4), "minimum objective-coefficient spread to keep Farkas diving enabled"}Also applies to: 164-174, 192-194
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/math_optimization/solver_settings.cu` around lines 118 - 120, The
unified parameter registry is missing registration for
mip_diving_hyper_params_t::farkas_obj_dynamism_tol so that set_parameter*/config
files can control it; add an entry for this field to the float_parameters array
alongside the other diving hyper-parameters (e.g., create a CUOPT_MIP_HYPER_*
enum constant like CUOPT_MIP_HYPER_FARKAS_OBJ_DYNAMISM_TOL and add a mapping
{CUOPT_MIP_HYPER_FARKAS_OBJ_DYNAMISM_TOL,
&mip_settings.diving_params.farkas_obj_dynamism_tol, <min>, <max>, <default>,
"description"}), and repeat the same insertion for the other float_parameters
blocks noted (lines analogous to 164-174 and 192-194) so the parameter is
consistently exposed.
| #pragma omp taskloop num_tasks(num_tasks) default(shared) priority(CUOPT_DEFAULT_TASK_PRIORITY) | ||
| for (size_t task_id = 0; task_id < num_tasks; ++task_id) { | ||
| size_t n = step_end - step_start; | ||
| size_t begin = step_start + std::floor(static_cast<f_t>(n) * task_id / num_tasks); | ||
| size_t end = step_start + std::floor(static_cast<f_t>(n) * (task_id + 1) / num_tasks); | ||
| size_t n = step_end - step_start; | ||
| auto [begin, end] = calculate_index_range(task_id, n, num_tasks); |
There was a problem hiding this comment.
Clamp num_tasks before building the taskloop.
num_tasks can be 0 here when bound_presolve.settings.num_tasks == 0 or when this path runs with a single OpenMP thread and falls back to omp_get_num_threads() - 1. That makes num_tasks(num_tasks) invalid and sends a zero divisor into calculate_index_range(...), so probing-cache work is skipped or the runtime trips over the pragma.
Suggested fix
- size_t num_tasks = bound_presolve.settings.num_tasks < 0 ? omp_get_num_threads() - 1
- : bound_presolve.settings.num_tasks;
+ size_t num_tasks = bound_presolve.settings.num_tasks < 0 ? omp_get_num_threads() - 1
+ : bound_presolve.settings.num_tasks;
+ num_tasks = std::max<size_t>(num_tasks, 1);Please add a regression test for the single-thread / num_tasks == 0 path. As per coding guidelines **/*.{cpp,cc,cxx,c,h,hpp,py}: Add unit tests for code changes; refer to cpp/src/tests for C/C++ gtest examples and python/cuopt/cuopt/tests for Python pytest examples.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/mip_heuristics/presolve/probing_cache.cu` around lines 899 - 902,
Clamp num_tasks to >=1 before constructing the OpenMP taskloop to avoid passing
zero into the pragma or into calculate_index_range: compute effective_num_tasks
= max<size_t>(1, bound_presolve.settings.num_tasks ?
bound_presolve.settings.num_tasks : (omp_get_num_threads() ?
omp_get_num_threads() - 1 : 1)) (or equivalent) and use effective_num_tasks in
the pragma and the loop and in the call to calculate_index_range; also add a
regression test covering the single-thread / num_tasks==0 path (use
cpp/src/tests or python/cuopt/cuopt/tests per project conventions) to ensure
probing-cache still runs when bound_presolve.settings.num_tasks==0 or when
omp_get_num_threads()==1.
| template <typename T> | ||
| class circular_deque_t { | ||
| public: | ||
| circular_deque_t() : buffer_(1), capacity_(1), head_(0), tail_(0) {} | ||
|
|
||
| // Allocates storage for exactly `capacity` elements up front. | ||
| explicit circular_deque_t(size_t capacity) | ||
| : buffer_(capacity + 1), // +1 to distinguish full (next(tail)==head) from empty (head==tail) | ||
| capacity_(capacity + 1), | ||
| head_(0), | ||
| tail_(0) | ||
| { | ||
| assert(capacity > 0); | ||
| } | ||
|
|
||
| bool empty() const { return head_ == tail_; } | ||
| bool full() const { return next(tail_) == head_; } | ||
|
|
||
| size_t size() const { return (tail_ - head_ + capacity_) % capacity_; } | ||
| size_t capacity() const { return capacity_ - 1; } | ||
|
|
||
| void clear_resize(size_t new_capacity) | ||
| { | ||
| assert(new_capacity > 0); | ||
| head_ = 0; | ||
| tail_ = 0; | ||
| capacity_ = new_capacity + 1; | ||
| buffer_.resize(capacity_); | ||
| } | ||
|
|
||
| void push_back(T val) | ||
| { | ||
| assert(!full()); | ||
| buffer_[tail_] = std::move(val); | ||
| tail_ = next(tail_); | ||
| } | ||
|
|
||
| void push_front(T val) | ||
| { | ||
| assert(!full()); | ||
| head_ = prev(head_); | ||
| buffer_[head_] = std::move(val); | ||
| } | ||
|
|
||
| T pop_front() | ||
| { | ||
| assert(!empty()); | ||
| T val = std::move(buffer_[head_]); | ||
| head_ = next(head_); | ||
| return val; | ||
| } | ||
|
|
||
| T pop_back() | ||
| { | ||
| assert(!empty()); | ||
| tail_ = prev(tail_); | ||
| return std::move(buffer_[tail_]); | ||
| } | ||
|
|
||
| T& front() | ||
| { | ||
| assert(!empty()); | ||
| return buffer_[head_]; | ||
| } | ||
| const T& front() const | ||
| { | ||
| assert(!empty()); | ||
| return buffer_[head_]; | ||
| } | ||
|
|
||
| T& back() | ||
| { | ||
| assert(!empty()); | ||
| return buffer_[prev(tail_)]; | ||
| } | ||
| const T& back() const | ||
| { | ||
| assert(!empty()); | ||
| return buffer_[prev(tail_)]; | ||
| } | ||
|
|
||
| private: | ||
| size_t next(size_t idx) const { return (idx + 1) % capacity_; } | ||
| size_t prev(size_t idx) const { return (idx + capacity_ - 1) % capacity_; } | ||
|
|
||
| std::vector<T> buffer_; | ||
| size_t capacity_; | ||
| size_t head_; | ||
| size_t tail_; | ||
| }; |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift
Add unit tests for wrap-around/full-empty invariants of the new deque.
This new core container needs tests for default construction, push_front/push_back, pop_front/pop_back, wrap-around behavior, and full/empty boundaries.
As per coding guidelines, "**/*.{cpp,cc,cxx,c,h,hpp,py}: Add unit tests for code changes; refer to cpp/src/tests for C/C++ gtest examples".
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/utilities/circular_deque.hpp` around lines 23 - 112, Add gtest unit
tests for circular_deque_t to verify default construction, sized construction,
push_back/push_front, pop_front/pop_back, size()/capacity(), empty()/full(),
wrap-around behavior and clear_resize; specifically create tests that (1)
default-construct circular_deque_t<T> and assert empty() and capacity()==0/1
expectations, (2) construct with explicit capacity and push until full then
assert full() and correct size(), (3) alternate push_front/push_back and
pop_front/pop_back to force index wrap-around and assert invariants (size(),
front(), back()), (4) call clear_resize(new_capacity) and verify buffer
semantics and that further pushes/pops behave correctly, and (5) exercise
boundary cases like popping from single-element deque to become empty; reference
circular_deque_t, push_back, push_front, pop_front, pop_back, size, capacity,
empty, full, and clear_resize and place tests under cpp/src/tests following
existing gtest examples.
| std::pair<i_t, i_t> calculate_index_range(i_t k, double total, double n) | ||
| { | ||
| i_t start = std::floor(k * total / n); | ||
| i_t end = std::floor((k + 1) * total / n); | ||
| return {start, end}; | ||
| } |
There was a problem hiding this comment.
Guard against invalid partition divisor in calculate_index_range.
Line 20-Line 21 divides by n without checking n > 0. Add a precondition assert to fail fast on invalid inputs.
Fix
template <typename i_t>
std::pair<i_t, i_t> calculate_index_range(i_t k, double total, double n)
{
+ assert(n > 0);
i_t start = std::floor(k * total / n);
i_t end = std::floor((k + 1) * total / n);
return {start, end};
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cpp/src/utilities/omp_helpers.hpp` around lines 18 - 23, The function
calculate_index_range performs division by n without validation; add a
precondition/assert at the start of calculate_index_range to ensure n > 0 (and
optionally that total >= 0 if you want stricter checks) so the function fails
fast on invalid input instead of invoking undefined behavior; update the
implementation of calculate_index_range to check n and document the invariant
above the function (use assert or equivalent in your project).
|
/ok to test 2c545ea |
|
/ok to test 883a04e |
This PR implements the following diving heuristics:
This also expose the diving hyper parameters similarly to heuristics (#993). Note that this PR includes #1149.
Benchmark Results
Time limit is set to 10min. Ran on GH200.
All
In particular,
cuoptis able prove optimality totimtab1andran14x18-disj-8.Farkas diving only
(Farkas diving found an incumbent in 25/240 instances)
Note that Farkas divings is disabled when the coefficients in the objective function is too close.
Vector length only
(Vector length diving found an incumbent in 80/240 instances)
References
[1] T. Achterberg, “Constraint Integer Programming,” PhD, Technischen Universität Berlin, Berlin, 2007. doi: [10.14279/depositonce-1634](https://doi.org/10.14279/depositonce-1634).
[2] J. Witzig and A. Gleixner, “Conflict-Driven Heuristics for Mixed Integer Programming,” Feb. 07, 2019, arXiv: arXiv:1902.02615. doi: [10.48550/arXiv.1902.02615](https://doi.org/10.48550/arXiv.1902.02615).
Checklist