Pre presolve heuristics - review test#1
Conversation
📝 WalkthroughWalkthroughAdds early feasibility-jump heuristics (CPU and GPU) that run during presolve and propagate incumbents via callbacks; introduces an external initial cutoff stored in solver context with public setters/getters and refactors branch-and-bound to use this centralized cutoff for pruning. Changes
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)
1405-1554:⚠️ Potential issue | 🟠 MajorFix time-limit duration creation for infinite limits (overflow/UB).
in_time_limitcan beinf(default incpu_fj_thread_t), but it is cast tointto buildstd::chrono::milliseconds, which can overflow/UB. Also guard the average-time division if the loop exits before the first iteration.🛠️ Proposed fix
- auto time_limit = std::chrono::milliseconds((int)(in_time_limit * 1000)); + auto time_limit = std::chrono::duration<double>::max(); + if (in_time_limit < std::numeric_limits<f_t>::infinity()) { + time_limit = std::chrono::duration<double>(in_time_limit); + } @@ - if (in_time_limit < std::numeric_limits<f_t>::infinity() && - now - loop_time_start > time_limit) { + if (in_time_limit < std::numeric_limits<f_t>::infinity() && + now - loop_time_start > time_limit) { CUOPT_LOG_TRACE("%sTime limit of %.4f seconds reached, breaking loop at iteration %d", fj_cpu.log_prefix.c_str(), - time_limit.count() / 1000.f, + time_limit.count(), fj_cpu.iterations); break; } @@ - double avg_time_per_iter = total_time / fj_cpu.iterations; + double avg_time_per_iter = + fj_cpu.iterations > 0 ? total_time / fj_cpu.iterations : 0.0;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1405 - 1554, The time-limit construction in cpufj_solve_loop currently casts in_time_limit to int and into std::chrono::milliseconds which is UB when in_time_limit is infinity; update the logic to only construct a chrono duration when std::isfinite(in_time_limit) (e.g., keep a bool has_time_limit and a std::chrono::duration<double> or milliseconds variable set conditionally) and use that guard in the time-check comparing now - loop_time_start; also protect the average-time computation at the end by checking fj_cpu.iterations > 0 before dividing (or set avg_time_per_iter = 0 when iterations == 0) to avoid a divide-by-zero. Ensure changes touch cpufj_solve_loop, the time_limit/loop_time_start variables and the avg_time_per_iter calculation.
🧹 Nitpick comments (2)
cpp/src/mip_heuristics/solver_context.cuh (1)
67-69: Consider documenting ownership for early_cpufj_ptr.The
early_cpufj_ptris a raw non-owning pointer. Consider adding a comment similar to line 59 to clarify this is a non-owning reference and who owns the pointed-to object.📝 Suggested documentation
- early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr}; + early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr}; // Non-owning; managed by caller🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solver_context.cuh` around lines 67 - 69, early_cpufj_ptr is a raw pointer that appears to be non-owning; add a short comment above the declaration of early_cpufj_ptr in solver_context.cuh (next to the declaration of early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr}) stating that it is a non-owning/unowned reference and indicating who is responsible for the lifetime (e.g., owned elsewhere by X or managed by Y), similar to the existing comment style used on line 59 for other fields.cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu (1)
62-67: Consider wrappingsolve()in exception handling.If
fj_ptr_->solve()throws an exception, the worker thread will terminate abruptly. Whilestop()will still join successfully, any exception information will be lost. Consider catching and storing exceptions for later inspection, or at minimum logging them.♻️ Suggested improvement
template <typename i_t, typename f_t> void early_gpufj_t<i_t, f_t>::run_worker() { - RAFT_CUDA_TRY(cudaSetDevice(this->problem_ptr_->handle_ptr->get_device())); - fj_ptr_->solve(*this->solution_ptr_); + try { + RAFT_CUDA_TRY(cudaSetDevice(this->problem_ptr_->handle_ptr->get_device())); + fj_ptr_->solve(*this->solution_ptr_); + } catch (const std::exception& e) { + CUOPT_LOG_ERROR("[Early GPU FJ] Worker exception: %s", e.what()); + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu` around lines 62 - 67, Wrap the call to fj_ptr_->solve(*this->solution_ptr_) inside a try/catch in early_gpufj_t<i_t,f_t>::run_worker so any thrown exceptions are captured instead of letting the worker thread terminate silently; catch std::exception (and a catch-all) and either store std::current_exception() into a member std::exception_ptr (e.g., worker_exception_ or similar) for later inspection or call the appropriate logger on this object to record the error, and ensure the catch does not rethrow so stop()/join can complete normally.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/early_heuristic.cuh`:
- Around line 60-81: The method try_update_best mutates shared state
(best_objective_, best_assignment_, solution_found_) without synchronization,
causing potential data races if called concurrently; either document that
try_update_best (and the base class) requires single-threaded access or add a
mutex (e.g., a std::mutex member like best_update_mutex_) and lock it at the
start of try_update_best to protect the entire update sequence (including
checking and setting best_objective_, copying/processing assignment via
problem_ptr_->post_process_assignment and assigning best_assignment_, setting
solution_found_, and invoking incumbent_callback_), ensuring the lock is held
while reading/writing those members and released before logging or long-running
host/device operations where appropriate.
In `@cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu`:
- Around line 32-53: The start() method does not reset the preemption_flag_, so
if stop() previously set preemption_flag_ to true the new cpu solver will
immediately preempt; update early_cpufj_t<i_t,f_t>::start() to explicitly reset
preemption_flag_ (e.g., set preemption_flag_ = false) at the top of the method
before creating cpu_fj_thread_, ensuring subsequent calls to start() create a
fresh solver; reference start(), stop(), preemption_flag_, and cpu_fj_thread_
when making this change.
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 936-950: Reset last_reported_objective_ at the start of each
solve() so callbacks aren’t suppressed by previous runs: inside fj_t::solve()
set last_reported_objective_ to positive infinity (use
std::numeric_limits<f_t>::infinity() or
std::numeric_limits<decltype(last_reported_objective_)>::infinity() as
appropriate) before any solution processing so the comparison in the improvement
callback block (user_obj < last_reported_objective_) will work correctly for
each independent solve invocation.
In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 74-79: The solver.context.initial_cutoff is being assigned after
run_solver() returns, so move the setup of solver.context.initial_cutoff (the
initial_cutoff parameter passed into run_mip) to before the call to run_solver()
inside run_mip, so the branch-and-bound initialization in run_solver() can
consume it; update references to solver.context.initial_cutoff in run_mip and
ensure the value passed into run_solver() is the already-set context value. Also
correct or clarify the comment around the early CPUFJ block (the code following
run_solver() that currently reads like it runs during presolve) to accurately
state that the CPUFJ path executes post-Papilo/post-solve if that’s intended, or
move the block if it was supposed to run earlier. Ensure changes reference
run_mip, solver.context.initial_cutoff, run_solver, and the early CPUFJ
comment/block.
---
Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1405-1554: The time-limit construction in cpufj_solve_loop
currently casts in_time_limit to int and into std::chrono::milliseconds which is
UB when in_time_limit is infinity; update the logic to only construct a chrono
duration when std::isfinite(in_time_limit) (e.g., keep a bool has_time_limit and
a std::chrono::duration<double> or milliseconds variable set conditionally) and
use that guard in the time-check comparing now - loop_time_start; also protect
the average-time computation at the end by checking fj_cpu.iterations > 0 before
dividing (or set avg_time_per_iter = 0 when iterations == 0) to avoid a
divide-by-zero. Ensure changes touch cpufj_solve_loop, the
time_limit/loop_time_start variables and the avg_time_per_iter calculation.
---
Nitpick comments:
In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu`:
- Around line 62-67: Wrap the call to fj_ptr_->solve(*this->solution_ptr_)
inside a try/catch in early_gpufj_t<i_t,f_t>::run_worker so any thrown
exceptions are captured instead of letting the worker thread terminate silently;
catch std::exception (and a catch-all) and either store std::current_exception()
into a member std::exception_ptr (e.g., worker_exception_ or similar) for later
inspection or call the appropriate logger on this object to record the error,
and ensure the catch does not rethrow so stop()/join can complete normally.
In `@cpp/src/mip_heuristics/solver_context.cuh`:
- Around line 67-69: early_cpufj_ptr is a raw pointer that appears to be
non-owning; add a short comment above the declaration of early_cpufj_ptr in
solver_context.cuh (next to the declaration of early_cpufj_t<i_t, f_t>*
early_cpufj_ptr{nullptr}) stating that it is a non-owning/unowned reference and
indicating who is responsible for the lifetime (e.g., owned elsewhere by X or
managed by Y), similar to the existing comment style used on line 59 for other
fields.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
cpp/src/branch_and_bound/branch_and_bound.cppcpp/src/branch_and_bound/branch_and_bound.hppcpp/src/mip_heuristics/CMakeLists.txtcpp/src/mip_heuristics/diversity/population.cucpp/src/mip_heuristics/early_heuristic.cuhcpp/src/mip_heuristics/feasibility_jump/early_cpufj.cucpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuhcpp/src/mip_heuristics/feasibility_jump/early_gpufj.cucpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuhcpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cucpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuhcpp/src/mip_heuristics/feasibility_jump/fj_cpu.cucpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuhcpp/src/mip_heuristics/solve.cucpp/src/mip_heuristics/solver.cucpp/src/mip_heuristics/solver_context.cuh
| // Invoke improvement callback if we have a better feasible solution | ||
| if (is_feasible && improvement_callback) { | ||
| f_t user_obj = solution.get_user_objective(); | ||
| if (user_obj < last_reported_objective_) { | ||
| last_reported_objective_ = user_obj; | ||
| // Copy assignment to host for callback | ||
| std::vector<f_t> h_assignment(solution.assignment.size()); | ||
| raft::copy(h_assignment.data(), | ||
| solution.assignment.data(), | ||
| solution.assignment.size(), | ||
| climber_stream); | ||
| climber_stream.synchronize(); | ||
| improvement_callback(user_obj, h_assignment); | ||
| } | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
rg -n "last_reported_objective_" --type cppRepository: aliceb-nv/cuopt-fork
Length of output: 46
🏁 Script executed:
fd feasibility_jump.cuRepository: aliceb-nv/cuopt-fork
Length of output: 184
🏁 Script executed:
rg "last_reported_objective_" cpp/src/mip_heuristics/feasibility_jump/Repository: aliceb-nv/cuopt-fork
Length of output: 412
🏁 Script executed:
cat -n cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh | head -100Repository: aliceb-nv/cuopt-fork
Length of output: 4539
🏁 Script executed:
grep -n "class\|struct" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh | head -20Repository: aliceb-nv/cuopt-fork
Length of output: 795
🏁 Script executed:
sed -n '201,450p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuhRepository: aliceb-nv/cuopt-fork
Length of output: 13071
🏁 Script executed:
sed -n '900,1000p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuRepository: aliceb-nv/cuopt-fork
Length of output: 4361
🏁 Script executed:
sed -n '500,700p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuhRepository: aliceb-nv/cuopt-fork
Length of output: 5705
🏁 Script executed:
grep -n "improvement_callback =" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuRepository: aliceb-nv/cuopt-fork
Length of output: 46
🏁 Script executed:
rg "improvement_callback" cpp/src/mip_heuristics/feasibility_jump/Repository: aliceb-nv/cuopt-fork
Length of output: 1073
🏁 Script executed:
cat -n cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu | grep -A 10 -B 5 "improvement_callback"Repository: aliceb-nv/cuopt-fork
Length of output: 742
🏁 Script executed:
rg -n "fj_t<.*>::fj_t|fj_t<.*>::solve" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu | head -20Repository: aliceb-nv/cuopt-fork
Length of output: 219
🏁 Script executed:
sed -n '1074,1150p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuRepository: aliceb-nv/cuopt-fork
Length of output: 3511
🏁 Script executed:
sed -n '46,150p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuRepository: aliceb-nv/cuopt-fork
Length of output: 5660
🏁 Script executed:
web_search
C++ std::function contextual conversion to bool operatorRepository: aliceb-nv/cuopt-fork
Length of output: 156
🌐 Web query:
C++ std::function empty check contextually convertible to bool
💡 Result:
std::function is contextually convertible to bool because it has an explicit conversion operator:
explicit operator bool() const noexcept;— returnstrueif it stores a callable target,falseif it’s empty. [1]
So these are the idiomatic “empty checks”:
std::function<void()> f;
if (f) { // has target
f();
}
if (!f) { // empty
}But because the conversion is explicit, this does not compile:
bool b = f; // error (no implicit conversion)Use either contextual conversion (e.g., if (f), !f, f && ..., f ? ... : ...) or an explicit cast:
bool b = static_cast<bool>(f);The “contextually convertible to bool” cases include conditions of if/while/for, logical ! && ||, and ?: (among others), which is why if (f) works even though operator bool is explicit. [2]
Reset to empty (if needed):
f = nullptr; // or: f = {};[3]
Sources: [1] cppreference std::function::operator bool • [2] contextual conversion explanation (Stack Overflow) • [3] resetting an std::function (Stack Overflow)
Reset last_reported_objective_ at the start of each solve.
The callback check on line 937 is correct—std::function is contextually convertible to bool and safely checks if the callback is set. However, last_reported_objective_ persists across multiple solve() calls. When solve() is invoked again on the same fj_t instance, callbacks will be suppressed for any solution worse than the best found in a previous solve, since the comparison user_obj < last_reported_objective_ will fail. This should be reset to +infinity at the beginning of solve() to ensure callbacks are properly invoked for each independent solve.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines 936
- 950, Reset last_reported_objective_ at the start of each solve() so callbacks
aren’t suppressed by previous runs: inside fj_t::solve() set
last_reported_objective_ to positive infinity (use
std::numeric_limits<f_t>::infinity() or
std::numeric_limits<decltype(last_reported_objective_)>::infinity() as
appropriate) before any solution processing so the comparison in the improvement
callback block (user_obj < last_reported_objective_) will work correctly for
each independent solve invocation.
There was a problem hiding this comment.
The finding doesn't apply. fj_t::improvement_callback is assigned in only one place — early_gpufj.cu on fj_ptr_. That object is created fresh in start(), solve() is called exactly once on it in run_worker(), then it's destroyed in stop().
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check the lifecycle of fj_t in early_gpufj.cu
cat cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu
Length of output: 2872
🏁 Script executed:
#!/bin/bash
# Find all places where fj_t::solve is called
rg -n "->solve\(|\.solve\(" --type cpp -g "**/mip_heuristics/**"
Length of output: 404
🏁 Script executed:
#!/bin/bash
# Check if fj_t instances might be reused by looking at how they're created and stored
rg -n "fj_t<.*>|fj_ptr" --type cpp -g "**/mip_heuristics/**" -A 2 -B 2
Length of output: 94165
@aliceb-nv, you're absolutely right! Thank you for the clarification. The fj_t instance created in early_gpufj_t::start() has a single-use lifecycle: it's constructed fresh, solve() is called exactly once in the worker thread, and then the object is destroyed in stop(). Since last_reported_objective_ doesn't persist across this lifecycle, resetting it at the start of solve() would be unnecessary.
The finding doesn't apply to this usage pattern.
✏️ Learnings added
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 1
File: cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu:936-950
Timestamp: 2026-02-23T18:28:34.786Z
Learning: In early_gpufj_t and early_cpufj_t, fj_t instances follow a single-use lifecycle: created fresh in start(), solve() called once in the worker thread, then destroyed in stop(). The improvement_callback is only set in early_gpufj.cu (line 55), and last_reported_objective_ does not persist across solve() invocations because the entire fj_t object is destroyed after each use.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/early_heuristic.cuh`:
- Line 92: The member start_time_ is not initialized before use in
try_update_best, causing meaningless elapsed time; initialize start_time_ to a
safe default (e.g., std::chrono::steady_clock::now()) either at declaration or
in the class constructor (the EarlyHeuristic/constructor that defines
start_time_) so any calls to try_update_best compute a valid elapsed time;
update the declaration of start_time_ or the EarlyHeuristic constructor to set
start_time_ and keep existing uses in try_update_best unchanged.
In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 179-193: The papilo-presolved early CPUFJ path registers an
incumbent_callback that calls presolver_ptr->uncrush_primal_solution and
invoke_solution_callbacks without thread serialization; wrap the body of
incumbent_callback with the same mutex used elsewhere (early_callback_mutex) so
uncrush_primal_solution and invoke_solution_callbacks(mip_callbacks, ...) are
executed under the lock, ensuring thread-safe callback invocation for the
early_cpufj path that uses problem.presolve_data.papilo_presolve_ptr and
settings.get_mip_callbacks().
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cpp/src/mip_heuristics/early_heuristic.cuhcpp/src/mip_heuristics/feasibility_jump/early_cpufj.cucpp/src/mip_heuristics/solve.cu
🚧 Files skipped from review as they are similar to previous changes (1)
- cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
| std::vector<f_t> best_assignment_; | ||
|
|
||
| early_incumbent_callback_t<f_t> incumbent_callback_; | ||
| std::chrono::steady_clock::time_point start_time_; |
There was a problem hiding this comment.
start_time_ is not initialized.
The member start_time_ is used in try_update_best (line 76) to compute elapsed time for logging, but it has no default initialization in the class definition or constructor. If try_update_best is called before derived classes set start_time_, the elapsed time will be meaningless.
Consider initializing it to steady_clock::now() in the constructor or at declaration to ensure safe fallback behavior.
🛠️ Proposed fix
- std::chrono::steady_clock::time_point start_time_;
+ std::chrono::steady_clock::time_point start_time_{std::chrono::steady_clock::now()};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| std::chrono::steady_clock::time_point start_time_; | |
| std::chrono::steady_clock::time_point start_time_{std::chrono::steady_clock::now()}; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/early_heuristic.cuh` at line 92, The member
start_time_ is not initialized before use in try_update_best, causing
meaningless elapsed time; initialize start_time_ to a safe default (e.g.,
std::chrono::steady_clock::now()) either at declaration or in the class
constructor (the EarlyHeuristic/constructor that defines start_time_) so any
calls to try_update_best compute a valid elapsed time; update the declaration of
start_time_ or the EarlyHeuristic constructor to set start_time_ and keep
existing uses in try_update_best unchanged.
| // Run early CPUFJ on papilo-presolved problem during cuOpt presolve (probing cache). | ||
| // Stopped by run_solver after presolve completes; its best objective feeds into initial_cutoff. | ||
| std::unique_ptr<detail::early_cpufj_t<i_t, f_t>> early_cpufj; | ||
| bool run_early_cpufj = problem.has_papilo_presolve_data() && | ||
| settings.determinism_mode != CUOPT_MODE_DETERMINISTIC && | ||
| problem.original_problem_ptr->get_n_integers() > 0; | ||
| if (run_early_cpufj) { | ||
| auto* presolver_ptr = problem.presolve_data.papilo_presolve_ptr; | ||
| auto mip_callbacks = settings.get_mip_callbacks(); | ||
| auto incumbent_callback = [presolver_ptr, mip_callbacks](f_t objective, | ||
| const std::vector<f_t>& assignment) { | ||
| std::vector<f_t> user_assignment; | ||
| presolver_ptr->uncrush_primal_solution(assignment, user_assignment); | ||
| invoke_solution_callbacks(mip_callbacks, objective, user_assignment); | ||
| }; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Search for callback invocation patterns and thread safety mechanisms
rg -n "invoke_solution_callbacks" --type-add 'cuda:*.{cu,cuh}' -t cuda -B 2 -A 5Repository: aliceb-nv/cuopt-fork
Length of output: 2001
🏁 Script executed:
# Look for mutex/synchronization in solve.cu and related files
rg -n "mutex|lock|synchronize" cpp/src/mip_heuristics/solve.cu -C 3Repository: aliceb-nv/cuopt-fork
Length of output: 1001
🏁 Script executed:
# Search for GET_SOLUTION callback definition and usage
rg -n "GET_SOLUTION|get_solution" --type-add 'cuda:*.{cu,cuh}' -t cuda -C 2Repository: aliceb-nv/cuopt-fork
Length of output: 25704
🏁 Script executed:
# Examine the early_cpufj_t implementation and how it invokes callbacks
fd "early_cpufj" --type f --extension cu --extension cuh -x cat {}Repository: aliceb-nv/cuopt-fork
Length of output: 3484
🏁 Script executed:
# Check the worker thread model for early_cpufj
rg -n "early_cpufj|worker.*thread|std::thread" cpp/src/mip_heuristics/ -C 3 --type-add 'cuda:*.{cu,cuh}' -t cudaRepository: aliceb-nv/cuopt-fork
Length of output: 23553
Callback serialization missing in papilo-presolved early CPUFJ path.
The incumbent_callback at lines 188–193 invokes user GET_SOLUTION callbacks from the early CPUFJ worker thread without serialization. While the main early FJ path (line 318) protects similar callback invocation with early_callback_mutex, the papilo-presolved case has no equivalent lock. This creates a potential race condition if both the papilo early CPUFJ worker and the main solver thread invoke callbacks concurrently. Add mutex protection to the papilo incumbent_callback or document that callbacks must be thread-safe.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solve.cu` around lines 179 - 193, The papilo-presolved
early CPUFJ path registers an incumbent_callback that calls
presolver_ptr->uncrush_primal_solution and invoke_solution_callbacks without
thread serialization; wrap the body of incumbent_callback with the same mutex
used elsewhere (early_callback_mutex) so uncrush_primal_solution and
invoke_solution_callbacks(mip_callbacks, ...) are executed under the lock,
ensuring thread-safe callback invocation for the early_cpufj path that uses
problem.presolve_data.papilo_presolve_ptr and settings.get_mip_callbacks().
Description
Issue
Checklist
Summary by CodeRabbit
New Features
Improvements
Bug Fixes