Skip to content

Primal and dual crushing for Papilo#11

Closed
aliceb-nv wants to merge 7 commits into
mainfrom
papilo-crush
Closed

Primal and dual crushing for Papilo#11
aliceb-nv wants to merge 7 commits into
mainfrom
papilo-crush

Conversation

@aliceb-nv
Copy link
Copy Markdown
Owner

Dummy PR to trigger coderabbit review.

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 14, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: afc5140f-ce9d-4488-be10-fe67af9df872

📥 Commits

Reviewing files that changed from the base of the PR and between 2b69c95 and d47d709.

📒 Files selected for processing (6)
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
  • cpp/src/mip_heuristics/solve.cu
  • cpp/tests/linear_programming/unit_tests/presolve_test.cu
  • cpp/tests/mip/incumbent_callback_test.cu
  • datasets/mip/download_miplib_test_dataset.sh

📝 Walkthrough

Walkthrough

This pull request implements Papilo presolve solution crushing functionality to reduce original-space solutions to compressed reduced-space representations. It integrates early-heuristic incumbents as initial solutions, updates presolve configuration, adds comprehensive KKT-based testing, and extends the test dataset with new MIPLIB instances.

Changes

Cohort / File(s) Summary
Papilo Crushing Implementation
cpp/src/mip_heuristics/presolve/third_party_presolve.hpp, cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
Added two new public methods: crush_primal_solution to map original primal vectors to reduced space, and crush_primal_dual_solution to compress both primal and dual solution components (x, y, z) while handling coefficient changes and survivor projections. Modified variable integrality application to only occur for MIP category problems and repositioned SingletonStuffing in presolve method configuration with documented known failure cases.
Solution Integration and Heuristics
cpp/src/mip_heuristics/diversity/diversity_manager.cu, cpp/src/mip_heuristics/solve.cu
Integrated Papilo crushing into user-provided initial solution handling via new host-device copy and crush-expand workflow. Converted early-heuristic FJ callback incumbents into initial solutions pool. Removed presolve gating condition tied to initial solutions presence. Added OMP resource pause guard at solver entry. Changed successful initial solution logging from INFO to DEBUG level.
Presolve and KKT Verification Testing
cpp/tests/linear_programming/unit_tests/presolve_test.cu
Added comprehensive parameterized GTest suites: dual_crush_round_trip.kkt_check validates KKT stationarity, primal/dual feasibility, reduced-cost consistency, and crushed-solution validity across presolve→solve→postsolve→crush pipeline. crush_warmstart.round_trip verifies warmstart solutions after crushing maintain optimality and reduce iteration counts. Introduced CSR transposition and dual-consistency verification helpers with finite/infinite bound tolerance logic.
MIP Integration Testing
cpp/tests/mip/incumbent_callback_test.cu
Added RAII helper scoped_env_restore_t for exception-safe environment variable management. Introduced new test initial_solution_survives_papilo_crush that validates initial solutions maintain objective value after Papilo presolve crushing across two separate solver invocations.
Test Dataset Expansion
datasets/mip/download_miplib_test_dataset.sh
Extended INSTANCES array by 31 MIPLIB instance IDs (e.g., app1-1, bnatt400, brazil3, net12, seymour, air05) for both S3 and HTTP fallback download mechanisms.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 56.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check ❓ Inconclusive The description is a generic placeholder ('Dummy PR to trigger coderabbit review') with empty template sections, providing no meaningful information about the actual changeset implementation or objectives. Replace the placeholder description with actual details about the primal/dual crushing implementation, affected components, and testing approach.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Primal and dual crushing for Papilo' directly and clearly describes the main feature additions visible in the changeset: new crush_primal_solution and crush_primal_dual_solution methods, integration into diversity_manager and solve workflows, and comprehensive test coverage.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch papilo-crush

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly.yaml:
- Line 20: The workflow fails because the matrix includes the non-existent
branch value "release/26.06" which is later used by the gh API call via the
CUOPT_BRANCH variable; fix it by either removing "release/26.06" from the branch
matrix in .github/workflows/nightly.yaml or create the corresponding branch
upstream in the NVIDIA/cuopt repo so the gh api call (gh api -q '.commit.sha'
"repos/nvidia/cuopt/branches/${CUOPT_BRANCH}") can resolve; ensure the matrix
entry and the CUOPT_BRANCH variable remain consistent.

In `@cpp/CMakeLists.txt`:
- Around line 297-302: The CMake snippet redundantly checks
OpenSSL_FOUND/OPENSSL_FOUND instead of using the target; change the fallback
logic to rely on the presence of the target OpenSSL::SSL: if OpenSSL::SSL target
is not available, call find_package(OpenSSL CONFIG QUIET) and then, only if the
OpenSSL::SSL target still does not exist, call find_package(OpenSSL REQUIRED).
Update the conditional test to use NOT TARGET OpenSSL::SSL in both checks and
remove references to OpenSSL_FOUND and OPENSSL_FOUND to follow the target-based
pattern used for gRPC/protobuf.

In `@cpp/src/mip_heuristics/presolve/third_party_presolve.cpp`:
- Around line 944-948: The code currently allows crush_rc when z_original is
provided but y_original is empty, which later causes out-of-bounds reads of
y[row]; change the logic so crush_rc is only true when both z_original and
y_original are non-empty (e.g., set crush_rc = !z_original.empty() &&
!y_original.empty()), and add a clear assertion or error (using cuopt_assert)
that rejects callers which pass z_original without y_original; apply the same
guard/update to the other similar reduction-handling block that uses
crush_rc/crush_dual and accesses y (the symmetric block around the second
occurrence of these variables).

In `@cpp/src/utilities/cuda_helpers.cuh`:
- Around line 199-218: The shared-memory cache currently keys only on Function*
(shmem_sizes) so the fast-path shared_lock can skip setting
cudaFuncAttributeMaxDynamicSharedMemorySize on a different GPU; change the cache
key to include the current CUDA device (e.g., get device via cudaGetDevice() or
cudaGetDeviceId()) and use a composite key (Function* plus device id) for all
lookups/updates (both the shared_lock fast path and the unique_lock write path)
while keeping mtx for synchronization; ensure every reference to shmem_sizes
(find, count, operator[]) is updated to use the composite key before calling
cudaFuncSetAttribute or storing the dynamic_request_size.

In `@cpp/tests/mip/incumbent_callback_test.cu`:
- Around line 224-249: The test must ensure early feasibility-jump heuristics
cannot produce the incumbent: before calling solve_mip with settings2, disable
the early CPU/GPU FJ heuristics so only the Papilo + add_initial_solution path
can create the incumbent; update the mip_solver_settings_t instance (settings2)
to set the appropriate flags (e.g., disable_cpu_feasibility_jump /
disable_gpu_feasibility_jump or a single disable_feasibility_jump flag if
available) or alternatively set the CUOPT_DISABLE_GPU_HEURISTICS and
corresponding CPU heuristic env var prior to calling solve_mip(op_problem,
settings2).
- Around line 201-241: The test sets the process env var via
setenv("CUOPT_DISABLE_GPU_HEURISTICS", "1", 1) but restores it with a trailing
unsetenv(), which leaks the changed state if an ASSERT_* or exception occurs;
wrap the setenv/unset in an RAII guard (e.g., a small ScopedEnvRestore class or
lambda-based scope guard) created immediately after the setenv call in
incumbent_callback_test (around the block that constructs handle_, op_problem
and settings1/settings2) so the guard captures the previous value (via getenv)
and restores it in its destructor, ensuring the env is restored even if
solve_mip, ASSERTs, or exceptions abort the test. Ensure the guard is used
before any ASSERT_* or calls that may throw and remove the bare unsetenv() at
the end.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 875195e0-088c-4949-be63-9355816ccd78

📥 Commits

Reviewing files that changed from the base of the PR and between 378add0 and 2b69c95.

📒 Files selected for processing (17)
  • .github/workflows/nightly.yaml
  • .gitignore
  • .pre-commit-config.yaml
  • CONTRIBUTING.md
  • ci/validate_wheel.sh
  • cpp/CMakeLists.txt
  • cpp/libmps_parser/src/mps_parser.cpp
  • cpp/src/mip_heuristics/diversity/diversity_manager.cu
  • cpp/src/mip_heuristics/presolve/third_party_presolve.cpp
  • cpp/src/mip_heuristics/presolve/third_party_presolve.hpp
  • cpp/src/mip_heuristics/solve.cu
  • cpp/src/utilities/cuda_helpers.cuh
  • cpp/tests/linear_programming/unit_tests/presolve_test.cu
  • cpp/tests/mip/incumbent_callback_test.cu
  • cpp/tests/routing/CMakeLists.txt
  • cpp/tests/routing/unit_tests/set_shmem_of_kernel.cu
  • datasets/mip/download_miplib_test_dataset.sh

cuopt_branch:
- "main"
- "release/26.04"
- "release/26.06"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify branch refs used by this workflow exist in NVIDIA/cuopt.
gh api "repos/NVIDIA/cuopt/branches/main" >/dev/null
gh api "repos/NVIDIA/cuopt/branches/release/26.06" >/dev/null

echo "Both branches are resolvable."

Repository: aliceb-nv/cuopt-fork

Length of output: 95


🏁 Script executed:

cat -n .github/workflows/nightly.yaml | head -30

Repository: aliceb-nv/cuopt-fork

Length of output: 1039


The branch release/26.06 does not exist in NVIDIA/cuopt.

The workflow will fail when attempting to resolve the SHA on line 29 (gh api -q '.commit.sha' "repos/nvidia/cuopt/branches/${CUOPT_BRANCH}"). Either create the branch upstream or remove it from the matrix.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/nightly.yaml at line 20, The workflow fails because the
matrix includes the non-existent branch value "release/26.06" which is later
used by the gh API call via the CUOPT_BRANCH variable; fix it by either removing
"release/26.06" from the branch matrix in .github/workflows/nightly.yaml or
create the corresponding branch upstream in the NVIDIA/cuopt repo so the gh api
call (gh api -q '.commit.sha' "repos/nvidia/cuopt/branches/${CUOPT_BRANCH}") can
resolve; ensure the matrix entry and the CUOPT_BRANCH variable remain
consistent.

Comment thread cpp/CMakeLists.txt
Comment on lines +297 to +302
if(NOT TARGET OpenSSL::SSL)
find_package(OpenSSL CONFIG QUIET)
if(NOT OpenSSL_FOUND AND NOT OPENSSL_FOUND)
find_package(OpenSSL REQUIRED)
endif()
endif()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify how OpenSSL discovery is handled across CMake files and whether mixed *_FOUND
# checks are used instead of target existence checks.
rg -n --type=cmake -C2 'find_package\(OpenSSL|OpenSSL_FOUND|OPENSSL_FOUND|TARGET OpenSSL::SSL'

Repository: aliceb-nv/cuopt-fork

Length of output: 509


🏁 Script executed:

rg -n 'OpenSSL::SSL' cpp/

Repository: aliceb-nv/cuopt-fork

Length of output: 114


🏁 Script executed:

rg -n 'target_link_libraries.*OpenSSL|OpenSSL.*target_link_libraries' cpp/CMakeLists.txt

Repository: aliceb-nv/cuopt-fork

Length of output: 46


🏁 Script executed:

sed -n '290,310p' cpp/CMakeLists.txt

Repository: aliceb-nv/cuopt-fork

Length of output: 789


🏁 Script executed:

rg -n 'cmake_minimum_required|CMAKE_VERSION' cpp/CMakeLists.txt | head -5

Repository: aliceb-nv/cuopt-fork

Length of output: 116


Use target-based fallback for OpenSSL discovery.

The condition checking both OpenSSL_FOUND and OPENSSL_FOUND is unnecessary; find_package(OpenSSL CONFIG QUIET) reliably creates the OpenSSL::SSL target on success in CMake 3.30.4+. Checking for the target directly is cleaner, more maintainable, and aligns with the gRPC/protobuf patterns used elsewhere in this file.

♻️ Proposed change
 if(NOT TARGET OpenSSL::SSL)
   find_package(OpenSSL CONFIG QUIET)
-  if(NOT OpenSSL_FOUND AND NOT OPENSSL_FOUND)
+  if(NOT TARGET OpenSSL::SSL)
     find_package(OpenSSL REQUIRED)
   endif()
 endif()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/CMakeLists.txt` around lines 297 - 302, The CMake snippet redundantly
checks OpenSSL_FOUND/OPENSSL_FOUND instead of using the target; change the
fallback logic to rely on the presence of the target OpenSSL::SSL: if
OpenSSL::SSL target is not available, call find_package(OpenSSL CONFIG QUIET)
and then, only if the OpenSSL::SSL target still does not exist, call
find_package(OpenSSL REQUIRED). Update the conditional test to use NOT TARGET
OpenSSL::SSL in both checks and remove references to OpenSSL_FOUND and
OPENSSL_FOUND to follow the target-based pattern used for gRPC/protobuf.

Comment on lines +944 to +948
const bool crush_dual = !y_original.empty();
if (crush_dual) { cuopt_assert((int)y_original.size() == (int)storage.nRowsOriginal, ""); }

const bool crush_rc = !z_original.empty();
if (crush_rc) { cuopt_assert((int)z_original.size() == (int)storage.nColsOriginal, ""); }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Reject reduced-cost crushing without a dual vector.

crush_rc is allowed when y_original is empty, but the reduced-cost path later reads y[row] in multiple reduction handlers. A caller that provides z_original without y_original will hit out-of-bounds access on the new public API.

♻️ Proposed fix
   const bool crush_dual = !y_original.empty();
   if (crush_dual) { cuopt_assert((int)y_original.size() == (int)storage.nRowsOriginal, ""); }

   const bool crush_rc = !z_original.empty();
+  cuopt_expects(!crush_rc || crush_dual,
+                error_type_t::ValidationError,
+                "Crushing reduced costs requires the original dual vector");
   if (crush_rc) { cuopt_assert((int)z_original.size() == (int)storage.nColsOriginal, ""); }

Also applies to: 1028-1061

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mip_heuristics/presolve/third_party_presolve.cpp` around lines 944 -
948, The code currently allows crush_rc when z_original is provided but
y_original is empty, which later causes out-of-bounds reads of y[row]; change
the logic so crush_rc is only true when both z_original and y_original are
non-empty (e.g., set crush_rc = !z_original.empty() && !y_original.empty()), and
add a clear assertion or error (using cuopt_assert) that rejects callers which
pass z_original without y_original; apply the same guard/update to the other
similar reduction-handling block that uses crush_rc/crush_dual and accesses y
(the symmetric block around the second occurrence of these variables).

Comment on lines +199 to +218
static std::shared_mutex mtx;
static std::unordered_map<Function*, size_t> shmem_sizes;

if (dynamic_request_size != 0) {
dynamic_request_size = raft::alignTo(dynamic_request_size, size_t(1024));
size_t current_size = shmem_sizes[function];
if (dynamic_request_size > current_size) {
std::lock_guard<std::mutex> lock(mtx);
current_size = shmem_sizes[function];

if (dynamic_request_size > current_size) {
cudaFuncSetAttribute(
function, cudaFuncAttributeMaxDynamicSharedMemorySize, dynamic_request_size);
{
std::shared_lock<std::shared_mutex> rlock(mtx);
auto it = shmem_sizes.find(function);
if (it != shmem_sizes.end() && dynamic_request_size <= it->second) { return true; }
}

std::unique_lock<std::shared_mutex> wlock(mtx);
size_t current_size = shmem_sizes.count(function) ? shmem_sizes[function] : 0;
if (dynamic_request_size > current_size) {
auto err = cudaFuncSetAttribute(
function, cudaFuncAttributeMaxDynamicSharedMemorySize, dynamic_request_size);
if (err == cudaSuccess) {
shmem_sizes[function] = dynamic_request_size;
return (cudaSuccess == cudaGetLastError());
return true;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Does cudaFuncSetAttribute(cudaFuncAttributeMaxDynamicSharedMemorySize, ...) apply per current CUDA device/context, or is it process-wide for a kernel function pointer?

💡 Result:

cudaFuncSetAttribute with cudaFuncAttributeMaxDynamicSharedMemorySize applies per current CUDA device/context for a kernel function pointer, not process-wide across all devices.

Citations:


Include current CUDA device in the shared-memory cache key.

The cache uses only Function* as the key, but cudaFuncSetAttribute(cudaFuncAttributeMaxDynamicSharedMemorySize, ...) applies per device. On multi-GPU systems, the shared-lock fast path (lines 206–208) can incorrectly skip setting the kernel attribute on a new device after it was cached on a different device, breaking correctness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/utilities/cuda_helpers.cuh` around lines 199 - 218, The shared-memory
cache currently keys only on Function* (shmem_sizes) so the fast-path
shared_lock can skip setting cudaFuncAttributeMaxDynamicSharedMemorySize on a
different GPU; change the cache key to include the current CUDA device (e.g.,
get device via cudaGetDevice() or cudaGetDeviceId()) and use a composite key
(Function* plus device id) for all lookups/updates (both the shared_lock fast
path and the unique_lock write path) while keeping mtx for synchronization;
ensure every reference to shmem_sizes (find, count, operator[]) is updated to
use the composite key before calling cudaFuncSetAttribute or storing the
dynamic_request_size.

Comment thread cpp/tests/mip/incumbent_callback_test.cu Outdated
Comment thread cpp/tests/mip/incumbent_callback_test.cu
@aliceb-nv
Copy link
Copy Markdown
Owner Author

@CodeRabbit full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 14, 2026

✅ Actions performed

Full review triggered.

@aliceb-nv aliceb-nv closed this Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant