Skip to content

Pre presolve heuristics - review test#1

Closed
aliceb-nv wants to merge 3 commits into
mainfrom
pre-presolve-heuristics
Closed

Pre presolve heuristics - review test#1
aliceb-nv wants to merge 3 commits into
mainfrom
pre-presolve-heuristics

Conversation

@aliceb-nv
Copy link
Copy Markdown
Owner

@aliceb-nv aliceb-nv commented Feb 23, 2026

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Summary by CodeRabbit

  • New Features

    • External cutoff mechanism to steer pruning decisions.
    • Early CPU and GPU feasibility-jump heuristics that run during presolve to find incumbents.
  • Improvements

    • Centralized cutoff-driven pruning integrated across the solver, and early-heuristic incumbents propagated into the main solve.
    • Improved early-heuristic reporting via objective+assignment callbacks.
  • Bug Fixes

    • Null-pointer safety added for scaling; guards when adding external solutions to populations.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Feb 23, 2026

📝 Walkthrough

Walkthrough

Adds early feasibility-jump heuristics (CPU and GPU) that run during presolve and propagate incumbents via callbacks; introduces an external initial cutoff stored in solver context with public setters/getters and refactors branch-and-bound to use this centralized cutoff for pruning.

Changes

Cohort / File(s) Summary
Branch-and-Bound Cutoff Mechanism
cpp/src/branch_and_bound/branch_and_bound.hpp, cpp/src/branch_and_bound/branch_and_bound.cpp
Added initial_cutoff_, set_initial_cutoff(f_t), and get_cutoff(); refactored numerous pruning/fathoming checks to use get_cutoff() instead of direct upper_bound_ reads.
Early Heuristic Base
cpp/src/mip_heuristics/early_heuristic.cuh
Added CRTP template early_heuristic_t<i_t,f_t,Derived> providing problem/solution setup, incumbent tracking, device-to-host assignment handling, and incumbent callbacks.
Early CPU Feasibility Jump
cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuh, .../early_cpufj.cu
New early_cpufj_t class derived from early_heuristic_t; manages a CPU FJ thread, preemption flag, start/stop lifecycle, and wires improvements to the base incumbent updater.
Early GPU Feasibility Jump
cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuh, .../early_gpufj.cu
New early_gpufj_t class derived from early_heuristic_t; creates GPU solver context, launches a worker thread, and forwards improvements via callbacks.
Feasibility Jump Core & CPUFJ Helpers
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh, .../fj_cpu.cu, .../feasibility_jump.cuh, .../feasibility_jump.cu
Added init_fj_cpu_standalone() declaration/definition, refactored CPU solve loop into cpufj_solve_loop, introduced fj_improvement_callback_t and improvement_callback/last_reported_objective_ in fj_t, and added improvement-callback invocation and diversity-manager null-checks.
MIP Solver Integration & Context
cpp/src/mip_heuristics/solve.cu, cpp/src/mip_heuristics/solver.cu, cpp/src/mip_heuristics/solver_context.cuh
Extended run_mip() to accept initial_cutoff, propagate it into mip_solver_context_t.initial_cutoff, started/stopped early CPUFJ/GPUFJ during presolve with incumbent propagation, changed scaling strategy member to pointer, and added early_cpufj_ptr to context.
Build & Safety Fixes
cpp/src/mip_heuristics/CMakeLists.txt, cpp/src/mip_heuristics/diversity/population.cu
Added feasibility_jump/early_cpufj.cu and .../early_gpufj.cu to non-LP sources; added null-checks before using context.scaling in solution (un)scaling calls.
Miscellaneous
(instantiations, logging) cpp/src/mip_heuristics/...
Multiple files add explicit template instantiations for float/double variants and minor logging/formatting adjustments across new heuristics and FJ integration.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Pre presolve heuristics - review test' is vague and generic. It uses non-descriptive phrasing ('review test') and doesn't clearly convey the main technical changes (adding early CPU/GPU feasibility jump heuristics, introducing cutoff mechanism, etc.). Consider using a more descriptive title such as 'Add early CPU/GPU feasibility jump heuristics with cutoff mechanism' to better communicate the key changes to reviewers scanning the history.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pre-presolve-heuristics

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1405-1554: ⚠️ Potential issue | 🟠 Major

Fix time-limit duration creation for infinite limits (overflow/UB).
in_time_limit can be inf (default in cpu_fj_thread_t), but it is cast to int to build std::chrono::milliseconds, which can overflow/UB. Also guard the average-time division if the loop exits before the first iteration.

🛠️ Proposed fix
-  auto time_limit      = std::chrono::milliseconds((int)(in_time_limit * 1000));
+  auto time_limit = std::chrono::duration<double>::max();
+  if (in_time_limit < std::numeric_limits<f_t>::infinity()) {
+    time_limit = std::chrono::duration<double>(in_time_limit);
+  }
@@
-    if (in_time_limit < std::numeric_limits<f_t>::infinity() &&
-        now - loop_time_start > time_limit) {
+    if (in_time_limit < std::numeric_limits<f_t>::infinity() &&
+        now - loop_time_start > time_limit) {
       CUOPT_LOG_TRACE("%sTime limit of %.4f seconds reached, breaking loop at iteration %d",
                       fj_cpu.log_prefix.c_str(),
-                      time_limit.count() / 1000.f,
+                      time_limit.count(),
                       fj_cpu.iterations);
       break;
     }
@@
-  double avg_time_per_iter = total_time / fj_cpu.iterations;
+  double avg_time_per_iter =
+    fj_cpu.iterations > 0 ? total_time / fj_cpu.iterations : 0.0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1405 - 1554,
The time-limit construction in cpufj_solve_loop currently casts in_time_limit to
int and into std::chrono::milliseconds which is UB when in_time_limit is
infinity; update the logic to only construct a chrono duration when
std::isfinite(in_time_limit) (e.g., keep a bool has_time_limit and a
std::chrono::duration<double> or milliseconds variable set conditionally) and
use that guard in the time-check comparing now - loop_time_start; also protect
the average-time computation at the end by checking fj_cpu.iterations > 0 before
dividing (or set avg_time_per_iter = 0 when iterations == 0) to avoid a
divide-by-zero. Ensure changes touch cpufj_solve_loop, the
time_limit/loop_time_start variables and the avg_time_per_iter calculation.
🧹 Nitpick comments (2)
cpp/src/mip_heuristics/solver_context.cuh (1)

67-69: Consider documenting ownership for early_cpufj_ptr.

The early_cpufj_ptr is a raw non-owning pointer. Consider adding a comment similar to line 59 to clarify this is a non-owning reference and who owns the pointed-to object.

📝 Suggested documentation
-  early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr};
+  early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr};  // Non-owning; managed by caller
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solver_context.cuh` around lines 67 - 69,
early_cpufj_ptr is a raw pointer that appears to be non-owning; add a short
comment above the declaration of early_cpufj_ptr in solver_context.cuh (next to
the declaration of early_cpufj_t<i_t, f_t>* early_cpufj_ptr{nullptr}) stating
that it is a non-owning/unowned reference and indicating who is responsible for
the lifetime (e.g., owned elsewhere by X or managed by Y), similar to the
existing comment style used on line 59 for other fields.
cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu (1)

62-67: Consider wrapping solve() in exception handling.

If fj_ptr_->solve() throws an exception, the worker thread will terminate abruptly. While stop() will still join successfully, any exception information will be lost. Consider catching and storing exceptions for later inspection, or at minimum logging them.

♻️ Suggested improvement
 template <typename i_t, typename f_t>
 void early_gpufj_t<i_t, f_t>::run_worker()
 {
-  RAFT_CUDA_TRY(cudaSetDevice(this->problem_ptr_->handle_ptr->get_device()));
-  fj_ptr_->solve(*this->solution_ptr_);
+  try {
+    RAFT_CUDA_TRY(cudaSetDevice(this->problem_ptr_->handle_ptr->get_device()));
+    fj_ptr_->solve(*this->solution_ptr_);
+  } catch (const std::exception& e) {
+    CUOPT_LOG_ERROR("[Early GPU FJ] Worker exception: %s", e.what());
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu` around lines 62 - 67,
Wrap the call to fj_ptr_->solve(*this->solution_ptr_) inside a try/catch in
early_gpufj_t<i_t,f_t>::run_worker so any thrown exceptions are captured instead
of letting the worker thread terminate silently; catch std::exception (and a
catch-all) and either store std::current_exception() into a member
std::exception_ptr (e.g., worker_exception_ or similar) for later inspection or
call the appropriate logger on this object to record the error, and ensure the
catch does not rethrow so stop()/join can complete normally.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/mip_heuristics/early_heuristic.cuh`:
- Around line 60-81: The method try_update_best mutates shared state
(best_objective_, best_assignment_, solution_found_) without synchronization,
causing potential data races if called concurrently; either document that
try_update_best (and the base class) requires single-threaded access or add a
mutex (e.g., a std::mutex member like best_update_mutex_) and lock it at the
start of try_update_best to protect the entire update sequence (including
checking and setting best_objective_, copying/processing assignment via
problem_ptr_->post_process_assignment and assigning best_assignment_, setting
solution_found_, and invoking incumbent_callback_), ensuring the lock is held
while reading/writing those members and released before logging or long-running
host/device operations where appropriate.

In `@cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu`:
- Around line 32-53: The start() method does not reset the preemption_flag_, so
if stop() previously set preemption_flag_ to true the new cpu solver will
immediately preempt; update early_cpufj_t<i_t,f_t>::start() to explicitly reset
preemption_flag_ (e.g., set preemption_flag_ = false) at the top of the method
before creating cpu_fj_thread_, ensuring subsequent calls to start() create a
fresh solver; reference start(), stop(), preemption_flag_, and cpu_fj_thread_
when making this change.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 936-950: Reset last_reported_objective_ at the start of each
solve() so callbacks aren’t suppressed by previous runs: inside fj_t::solve()
set last_reported_objective_ to positive infinity (use
std::numeric_limits<f_t>::infinity() or
std::numeric_limits<decltype(last_reported_objective_)>::infinity() as
appropriate) before any solution processing so the comparison in the improvement
callback block (user_obj < last_reported_objective_) will work correctly for
each independent solve invocation.

In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 74-79: The solver.context.initial_cutoff is being assigned after
run_solver() returns, so move the setup of solver.context.initial_cutoff (the
initial_cutoff parameter passed into run_mip) to before the call to run_solver()
inside run_mip, so the branch-and-bound initialization in run_solver() can
consume it; update references to solver.context.initial_cutoff in run_mip and
ensure the value passed into run_solver() is the already-set context value. Also
correct or clarify the comment around the early CPUFJ block (the code following
run_solver() that currently reads like it runs during presolve) to accurately
state that the CPUFJ path executes post-Papilo/post-solve if that’s intended, or
move the block if it was supposed to run earlier. Ensure changes reference
run_mip, solver.context.initial_cutoff, run_solver, and the early CPUFJ
comment/block.

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1405-1554: The time-limit construction in cpufj_solve_loop
currently casts in_time_limit to int and into std::chrono::milliseconds which is
UB when in_time_limit is infinity; update the logic to only construct a chrono
duration when std::isfinite(in_time_limit) (e.g., keep a bool has_time_limit and
a std::chrono::duration<double> or milliseconds variable set conditionally) and
use that guard in the time-check comparing now - loop_time_start; also protect
the average-time computation at the end by checking fj_cpu.iterations > 0 before
dividing (or set avg_time_per_iter = 0 when iterations == 0) to avoid a
divide-by-zero. Ensure changes touch cpufj_solve_loop, the
time_limit/loop_time_start variables and the avg_time_per_iter calculation.

---

Nitpick comments:
In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu`:
- Around line 62-67: Wrap the call to fj_ptr_->solve(*this->solution_ptr_)
inside a try/catch in early_gpufj_t<i_t,f_t>::run_worker so any thrown
exceptions are captured instead of letting the worker thread terminate silently;
catch std::exception (and a catch-all) and either store std::current_exception()
into a member std::exception_ptr (e.g., worker_exception_ or similar) for later
inspection or call the appropriate logger on this object to record the error,
and ensure the catch does not rethrow so stop()/join can complete normally.

In `@cpp/src/mip_heuristics/solver_context.cuh`:
- Around line 67-69: early_cpufj_ptr is a raw pointer that appears to be
non-owning; add a short comment above the declaration of early_cpufj_ptr in
solver_context.cuh (next to the declaration of early_cpufj_t<i_t, f_t>*
early_cpufj_ptr{nullptr}) stating that it is a non-owning/unowned reference and
indicating who is responsible for the lifetime (e.g., owned elsewhere by X or
managed by Y), similar to the existing comment style used on line 59 for other
fields.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fcfd4f0 and 5881d35.

📒 Files selected for processing (16)
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/branch_and_bound/branch_and_bound.hpp
  • cpp/src/mip_heuristics/CMakeLists.txt
  • cpp/src/mip_heuristics/diversity/population.cu
  • cpp/src/mip_heuristics/early_heuristic.cuh
  • cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
  • cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuh
  • cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu
  • cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuh
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
  • cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/mip_heuristics/solver.cu
  • cpp/src/mip_heuristics/solver_context.cuh

Comment thread cpp/src/mip_heuristics/early_heuristic.cuh
Comment thread cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
Comment on lines +936 to +950
// Invoke improvement callback if we have a better feasible solution
if (is_feasible && improvement_callback) {
f_t user_obj = solution.get_user_objective();
if (user_obj < last_reported_objective_) {
last_reported_objective_ = user_obj;
// Copy assignment to host for callback
std::vector<f_t> h_assignment(solution.assignment.size());
raft::copy(h_assignment.data(),
solution.assignment.data(),
solution.assignment.size(),
climber_stream);
climber_stream.synchronize();
improvement_callback(user_obj, h_assignment);
}
}
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

rg -n "last_reported_objective_" --type cpp

Repository: aliceb-nv/cuopt-fork

Length of output: 46


🏁 Script executed:

fd feasibility_jump.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 184


🏁 Script executed:

rg "last_reported_objective_" cpp/src/mip_heuristics/feasibility_jump/

Repository: aliceb-nv/cuopt-fork

Length of output: 412


🏁 Script executed:

cat -n cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh | head -100

Repository: aliceb-nv/cuopt-fork

Length of output: 4539


🏁 Script executed:

grep -n "class\|struct" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh | head -20

Repository: aliceb-nv/cuopt-fork

Length of output: 795


🏁 Script executed:

sed -n '201,450p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh

Repository: aliceb-nv/cuopt-fork

Length of output: 13071


🏁 Script executed:

sed -n '900,1000p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 4361


🏁 Script executed:

sed -n '500,700p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh

Repository: aliceb-nv/cuopt-fork

Length of output: 5705


🏁 Script executed:

grep -n "improvement_callback =" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 46


🏁 Script executed:

rg "improvement_callback" cpp/src/mip_heuristics/feasibility_jump/

Repository: aliceb-nv/cuopt-fork

Length of output: 1073


🏁 Script executed:

cat -n cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu | grep -A 10 -B 5 "improvement_callback"

Repository: aliceb-nv/cuopt-fork

Length of output: 742


🏁 Script executed:

rg -n "fj_t<.*>::fj_t|fj_t<.*>::solve" cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu | head -20

Repository: aliceb-nv/cuopt-fork

Length of output: 219


🏁 Script executed:

sed -n '1074,1150p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 3511


🏁 Script executed:

sed -n '46,150p' cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu

Repository: aliceb-nv/cuopt-fork

Length of output: 5660


🏁 Script executed:

web_search
C++ std::function contextual conversion to bool operator

Repository: aliceb-nv/cuopt-fork

Length of output: 156


🌐 Web query:

C++ std::function empty check contextually convertible to bool

💡 Result:

std::function is contextually convertible to bool because it has an explicit conversion operator:

  • explicit operator bool() const noexcept; — returns true if it stores a callable target, false if it’s empty. [1]

So these are the idiomatic “empty checks”:

std::function<void()> f;

if (f) {            // has target
    f();
}

if (!f) {           // empty
}

But because the conversion is explicit, this does not compile:

bool b = f;         // error (no implicit conversion)

Use either contextual conversion (e.g., if (f), !f, f && ..., f ? ... : ...) or an explicit cast:

bool b = static_cast<bool>(f);

The “contextually convertible to bool” cases include conditions of if/while/for, logical ! && ||, and ?: (among others), which is why if (f) works even though operator bool is explicit. [2]

Reset to empty (if needed):

f = nullptr;  // or: f = {};

[3]

Sources: [1] cppreference std::function::operator bool • [2] contextual conversion explanation (Stack Overflow) • [3] resetting an std::function (Stack Overflow)


Reset last_reported_objective_ at the start of each solve.

The callback check on line 937 is correct—std::function is contextually convertible to bool and safely checks if the callback is set. However, last_reported_objective_ persists across multiple solve() calls. When solve() is invoked again on the same fj_t instance, callbacks will be suppressed for any solution worse than the best found in a previous solve, since the comparison user_obj < last_reported_objective_ will fail. This should be reset to +infinity at the beginning of solve() to ensure callbacks are properly invoked for each independent solve.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines 936
- 950, Reset last_reported_objective_ at the start of each solve() so callbacks
aren’t suppressed by previous runs: inside fj_t::solve() set
last_reported_objective_ to positive infinity (use
std::numeric_limits<f_t>::infinity() or
std::numeric_limits<decltype(last_reported_objective_)>::infinity() as
appropriate) before any solution processing so the comparison in the improvement
callback block (user_obj < last_reported_objective_) will work correctly for
each independent solve invocation.

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The finding doesn't apply. fj_t::improvement_callback is assigned in only one place — early_gpufj.cu on fj_ptr_. That object is created fresh in start(), solve() is called exactly once on it in run_worker(), then it's destroyed in stop().

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the lifecycle of fj_t in early_gpufj.cu
cat cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu

Length of output: 2872


🏁 Script executed:

#!/bin/bash
# Find all places where fj_t::solve is called
rg -n "->solve\(|\.solve\(" --type cpp -g "**/mip_heuristics/**"

Length of output: 404


🏁 Script executed:

#!/bin/bash
# Check if fj_t instances might be reused by looking at how they're created and stored
rg -n "fj_t<.*>|fj_ptr" --type cpp -g "**/mip_heuristics/**" -A 2 -B 2

Length of output: 94165


@aliceb-nv, you're absolutely right! Thank you for the clarification. The fj_t instance created in early_gpufj_t::start() has a single-use lifecycle: it's constructed fresh, solve() is called exactly once in the worker thread, and then the object is destroyed in stop(). Since last_reported_objective_ doesn't persist across this lifecycle, resetting it at the start of solve() would be unnecessary.

The finding doesn't apply to this usage pattern.


✏️ Learnings added
Learnt from: aliceb-nv
Repo: aliceb-nv/cuopt-fork PR: 1
File: cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu:936-950
Timestamp: 2026-02-23T18:28:34.786Z
Learning: In early_gpufj_t and early_cpufj_t, fj_t instances follow a single-use lifecycle: created fresh in start(), solve() called once in the worker thread, then destroyed in stop(). The improvement_callback is only set in early_gpufj.cu (line 55), and last_reported_objective_ does not persist across solve() invocations because the entire fj_t object is destroyed after each use.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment thread cpp/src/mip_heuristics/solve.cu
@aliceb-nv aliceb-nv closed this Feb 23, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/mip_heuristics/early_heuristic.cuh`:
- Line 92: The member start_time_ is not initialized before use in
try_update_best, causing meaningless elapsed time; initialize start_time_ to a
safe default (e.g., std::chrono::steady_clock::now()) either at declaration or
in the class constructor (the EarlyHeuristic/constructor that defines
start_time_) so any calls to try_update_best compute a valid elapsed time;
update the declaration of start_time_ or the EarlyHeuristic constructor to set
start_time_ and keep existing uses in try_update_best unchanged.

In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 179-193: The papilo-presolved early CPUFJ path registers an
incumbent_callback that calls presolver_ptr->uncrush_primal_solution and
invoke_solution_callbacks without thread serialization; wrap the body of
incumbent_callback with the same mutex used elsewhere (early_callback_mutex) so
uncrush_primal_solution and invoke_solution_callbacks(mip_callbacks, ...) are
executed under the lock, ensuring thread-safe callback invocation for the
early_cpufj path that uses problem.presolve_data.papilo_presolve_ptr and
settings.get_mip_callbacks().

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5881d35 and 8a3684c.

📒 Files selected for processing (3)
  • cpp/src/mip_heuristics/early_heuristic.cuh
  • cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
  • cpp/src/mip_heuristics/solve.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu

std::vector<f_t> best_assignment_;

early_incumbent_callback_t<f_t> incumbent_callback_;
std::chrono::steady_clock::time_point start_time_;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

start_time_ is not initialized.

The member start_time_ is used in try_update_best (line 76) to compute elapsed time for logging, but it has no default initialization in the class definition or constructor. If try_update_best is called before derived classes set start_time_, the elapsed time will be meaningless.

Consider initializing it to steady_clock::now() in the constructor or at declaration to ensure safe fallback behavior.

🛠️ Proposed fix
-  std::chrono::steady_clock::time_point start_time_;
+  std::chrono::steady_clock::time_point start_time_{std::chrono::steady_clock::now()};
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
std::chrono::steady_clock::time_point start_time_;
std::chrono::steady_clock::time_point start_time_{std::chrono::steady_clock::now()};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/early_heuristic.cuh` at line 92, The member
start_time_ is not initialized before use in try_update_best, causing
meaningless elapsed time; initialize start_time_ to a safe default (e.g.,
std::chrono::steady_clock::now()) either at declaration or in the class
constructor (the EarlyHeuristic/constructor that defines start_time_) so any
calls to try_update_best compute a valid elapsed time; update the declaration of
start_time_ or the EarlyHeuristic constructor to set start_time_ and keep
existing uses in try_update_best unchanged.

Comment on lines +179 to +193
// Run early CPUFJ on papilo-presolved problem during cuOpt presolve (probing cache).
// Stopped by run_solver after presolve completes; its best objective feeds into initial_cutoff.
std::unique_ptr<detail::early_cpufj_t<i_t, f_t>> early_cpufj;
bool run_early_cpufj = problem.has_papilo_presolve_data() &&
settings.determinism_mode != CUOPT_MODE_DETERMINISTIC &&
problem.original_problem_ptr->get_n_integers() > 0;
if (run_early_cpufj) {
auto* presolver_ptr = problem.presolve_data.papilo_presolve_ptr;
auto mip_callbacks = settings.get_mip_callbacks();
auto incumbent_callback = [presolver_ptr, mip_callbacks](f_t objective,
const std::vector<f_t>& assignment) {
std::vector<f_t> user_assignment;
presolver_ptr->uncrush_primal_solution(assignment, user_assignment);
invoke_solution_callbacks(mip_callbacks, objective, user_assignment);
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Search for callback invocation patterns and thread safety mechanisms
rg -n "invoke_solution_callbacks" --type-add 'cuda:*.{cu,cuh}' -t cuda -B 2 -A 5

Repository: aliceb-nv/cuopt-fork

Length of output: 2001


🏁 Script executed:

# Look for mutex/synchronization in solve.cu and related files
rg -n "mutex|lock|synchronize" cpp/src/mip_heuristics/solve.cu -C 3

Repository: aliceb-nv/cuopt-fork

Length of output: 1001


🏁 Script executed:

# Search for GET_SOLUTION callback definition and usage
rg -n "GET_SOLUTION|get_solution" --type-add 'cuda:*.{cu,cuh}' -t cuda -C 2

Repository: aliceb-nv/cuopt-fork

Length of output: 25704


🏁 Script executed:

# Examine the early_cpufj_t implementation and how it invokes callbacks
fd "early_cpufj" --type f --extension cu --extension cuh -x cat {}

Repository: aliceb-nv/cuopt-fork

Length of output: 3484


🏁 Script executed:

# Check the worker thread model for early_cpufj
rg -n "early_cpufj|worker.*thread|std::thread" cpp/src/mip_heuristics/ -C 3 --type-add 'cuda:*.{cu,cuh}' -t cuda

Repository: aliceb-nv/cuopt-fork

Length of output: 23553


Callback serialization missing in papilo-presolved early CPUFJ path.

The incumbent_callback at lines 188–193 invokes user GET_SOLUTION callbacks from the early CPUFJ worker thread without serialization. While the main early FJ path (line 318) protects similar callback invocation with early_callback_mutex, the papilo-presolved case has no equivalent lock. This creates a potential race condition if both the papilo early CPUFJ worker and the main solver thread invoke callbacks concurrently. Add mutex protection to the papilo incumbent_callback or document that callbacks must be thread-safe.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/solve.cu` around lines 179 - 193, The papilo-presolved
early CPUFJ path registers an incumbent_callback that calls
presolver_ptr->uncrush_primal_solution and invoke_solution_callbacks without
thread serialization; wrap the body of incumbent_callback with the same mutex
used elsewhere (early_callback_mutex) so uncrush_primal_solution and
invoke_solution_callbacks(mip_callbacks, ...) are executed under the lock,
ensuring thread-safe callback invocation for the early_cpufj path that uses
problem.presolve_data.papilo_presolve_ptr and settings.get_mip_callbacks().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant