forked from NVIDIA/cuopt
GPU heuristics determinism integration #9
Closed
307 commits
09a484a
remove unecessary grid sync
aliceb-nv f55e648
bump1
aliceb-nv 554d24c
bump2
aliceb-nv db73e00
FJCPU in deterministic mode
aliceb-nv 0b8b6e1
bump1
aliceb-nv 122bfbc
bump2
aliceb-nv 45ebd57
fix oversight
aliceb-nv a209096
bump1
aliceb-nv 718f3e7
bump2
aliceb-nv 1df9313
logger
aliceb-nv b45982e
fix cpufj
aliceb-nv d272180
bump1
aliceb-nv 0987b5c
bump2
aliceb-nv f378246
tentative LB determinism fix
aliceb-nv 4b77a59
extra logging CPUFJ, PAPI
aliceb-nv bcea60d
branch n bound nvtx ranges
aliceb-nv d7ff462
more cpufj logging
aliceb-nv 37ca39e
BnB features logging
aliceb-nv 5c9e59a
add memory instrumentation wrappers
aliceb-nv 22c95a7
Merge branch 'main' into determinism
aliceb-nv 0fc30c5
don't clog w/ improvements
aliceb-nv 8bd70e3
fixed iter count
aliceb-nv 7bcf3ba
bugfix
aliceb-nv 4bfb96f
more memops logging
aliceb-nv f1afa59
fewer
aliceb-nv cd2595e
log-transform for regressor
aliceb-nv 6a1192f
Merge branch 'release/25.12' into determinism
aliceb-nv 8dfdfd8
fixes
aliceb-nv 8f26067
restore fj scratch
aliceb-nv 24fdc75
Merge branch 'release/25.12' into determinism
aliceb-nv 759b896
fix build
aliceb-nv dc30219
tmp dual simplex instrument
aliceb-nv 1ca01ba
dual simplex feature logging
aliceb-nv 5894ffe
add support for building with clang
aliceb-nv 9911c3a
remove debug calls
aliceb-nv 24b828e
fix cmakelists
aliceb-nv d52bcb0
move suppressiosn
aliceb-nv ef69eb7
PDLP features
aliceb-nv 7ac230f
script tweaks
aliceb-nv 24d7ccb
more feature loggign
aliceb-nv d35bed2
fix logging
aliceb-nv 1913946
basic dual simplex and pdlp predictors
aliceb-nv ecf9f25
improved dualsimplex instrumentation
aliceb-nv fb9c782
integrate work unit limiting with branch_and_bound
aliceb-nv 4a51ca7
Merge branch 'release/25.12' into determinism
aliceb-nv 1a4ac76
add work unit scheduler for parallel threads
aliceb-nv ce44aff
cleanup
aliceb-nv 9b881d8
fix build
aliceb-nv b293e94
cleanup
aliceb-nv 88adfcd
Merge branch 'release/25.12' into determinism
aliceb-nv ee7037f
cleanup
aliceb-nv fb0c072
add support for building with clang
aliceb-nv e3f2a79
remove debug calls
aliceb-nv 057ecc7
fix cmakelists
aliceb-nv 2463de6
move suppressiosn
aliceb-nv 8356f1d
address warnings, add msan setting
aliceb-nv 192ce53
Merge branch 'release/25.12' into clang-tsan
aliceb-nv 8b31229
Merge branch 'main' into clang-tsan
aliceb-nv 3cced01
clang compiler bug workaround
aliceb-nv 63c917a
Merge remote-tracking branch 'fork-repo/clang-tsan' into clang-tsan
aliceb-nv d33939b
Merge branch 'main' into determinism
aliceb-nv 47ae4cf
Merge branch 'main' into determinism
aliceb-nv ed224fb
Merge branch 'main' into determinism
aliceb-nv 741f373
initial bb debug impl
aliceb-nv 63b9ed1
Merge branch 'main' into clang-tsan
aliceb-nv 0ea2c1f
some debug tweaks
aliceb-nv 00326e2
Merge branch 'clang-tsan' into determinism
aliceb-nv c374f69
fix copyright
aliceb-nv 2ae3d3e
Merge branch 'clang-tsan' into determinism
aliceb-nv d89c50a
flag to disable gpu heuristics
aliceb-nv d67f230
BSP b&b progress
aliceb-nv eb6bc28
initial working impl on small problens
aliceb-nv d80667c
more instrumentation
aliceb-nv 8ed172a
keep basis when possible
aliceb-nv 2a16ee6
merge and fix nondeterministic node compare
aliceb-nv ef777c6
more logs, cache cpu clock
aliceb-nv 6dcb062
plunging for deterministic search
aliceb-nv 35e8177
move to work_unit_scheduler to allow for mid-node syncs
aliceb-nv e06af9f
better debug printouts
aliceb-nv 99e9ec2
improve ins_vector coverage; fix case where instrumeted mem accesses …
aliceb-nv 8fdbff8
fix sync bug on termination
aliceb-nv b6d7ecc
revert disable heuristics
aliceb-nv 668391e
no presovle when determinsitic
aliceb-nv 3b6d532
restore nondeterminsitc codepath behavior
aliceb-nv 81aaf20
Merge branch 'main' into determinism
aliceb-nv 07cea4d
cleanup work
aliceb-nv 9fb0edd
more cleanup to test for regressions
aliceb-nv e5e498d
ml script updates
aliceb-nv 525c7b5
fix postprocess in train_regressor.py
aliceb-nv bb5ed1c
Merge branch 'main' into gpudet
aliceb-nv 1dc32de
build fixes
aliceb-nv 79a6d54
determinism fixes
aliceb-nv ec248cd
progress on GPU heuristic work accounting
aliceb-nv 9579cd1
gpufj integration
aliceb-nv c8a9836
much in the way of fixes
aliceb-nv 5e13486
bump1
aliceb-nv ec5152a
bump2
aliceb-nv 0d70d96
refactor
aliceb-nv 2b0e5e4
bump1
aliceb-nv 21bf0a5
bump2
aliceb-nv 75c942a
refactor
aliceb-nv 9db457b
bump1
aliceb-nv c3e7ff6
bump2
aliceb-nv 3ed12cf
remove sigILL
aliceb-nv 1e5807b
bump
aliceb-nv 2322ad1
gpu scale tuning
aliceb-nv e532bcd
tuning 1
aliceb-nv f80c733
tuning 2
aliceb-nv e3e37ad
logging fixes
aliceb-nv 4c9f494
attempts at fixing failed repairs
aliceb-nv 8e9a2fc
B&B bugfixing
aliceb-nv c217922
reliability branching for determinism
aliceb-nv 90f184c
cpufj in local search
aliceb-nv ec4239b
presolve determinsitc, longer gpu heur wokr scale
aliceb-nv 014dc72
bugfixes, reject infeasible itneger incumbents in B&B , lower probing…
aliceb-nv 58c6c4b
switch to hierarchical termination checkers
aliceb-nv 5e5d285
scaled down
aliceb-nv 068c699
output csv in run_mip from user callback hooks
aliceb-nv 7d0fd31
fix false optimality claim
aliceb-nv e077cb4
setting for running gpu heuristics are pre-exploration
aliceb-nv 0619554
refactor and cleanup
aliceb-nv 4907783
fix reliability branching regression
aliceb-nv cc10766
Merge branch 'main' into gpudet
aliceb-nv 8cd14cc
run gpuheur immediately
aliceb-nv 6e1be5b
fix crash in cut loop
aliceb-nv bc93781
fix llm sloppiness
aliceb-nv e6b7457
fix heuristic solutions dropped when exploration doesnt get to start
aliceb-nv 0e9427f
csv even if no incumbent found
aliceb-nv 295fd7e
fix root PDLP bug
aliceb-nv 92ccbe9
fix build
aliceb-nv 59e0ad0
bump1
aliceb-nv 0a47fe0
bump2
aliceb-nv eaf45e6
fix reliability branching nondeterminism
aliceb-nv 02a5bf8
bump 1
aliceb-nv fff6a8e
bump 2
aliceb-nv d915f6a
fix tests and cpufj local search
aliceb-nv 0a0f12f
bump1
aliceb-nv 3a7b43f
bump2
aliceb-nv 6494275
bugfixes and renaming
aliceb-nv 588c77f
bump1
aliceb-nv 7613978
bump2
aliceb-nv d423c6f
bugfix
aliceb-nv 638d35f
bump1
aliceb-nv e8bdc7c
fix wrong work unit timestamp beign used
aliceb-nv 41ac7bb
bump1
aliceb-nv 1867f30
missing missing clock rebase
aliceb-nv a90233f
bump1
aliceb-nv 9598b37
task-level work unit accounting for strong branching instead of threa…
aliceb-nv 83b78ef
bump1
aliceb-nv 0227513
full determinsm logs
aliceb-nv 18a2d4d
bump1
aliceb-nv 3e214a8
thread local seed generator
aliceb-nv e8b1291
no presolve
aliceb-nv cc1e5c3
bump1
aliceb-nv 859975f
gate probing behind deterministm flag
aliceb-nv a63af82
bump1
aliceb-nv 894b200
extra logging
aliceb-nv 0a20329
bump1
aliceb-nv f7ebf56
fix unintiialized reads on problem copy
aliceb-nv 479b662
uninitialized access fixes
aliceb-nv 27fb24b
bump 1
aliceb-nv 9a4b1d0
fix build
aliceb-nv e87c98f
bump1
aliceb-nv 73dffbe
refactoring
aliceb-nv 472f2b6
bump2
aliceb-nv 8d4e34b
mssing compute_feasibilty
aliceb-nv cc33495
Merge branch 'main' into gpudet
aliceb-nv 1ef5fd9
refactoring ahead of PR
aliceb-nv e80e545
bump1
aliceb-nv 427b78c
bug fixes, perform solution uncrush on separate stream to prevent rac…
aliceb-nv b41dc3f
bump 1
aliceb-nv 1aeb045
further bug fixes
aliceb-nv 78d072b
bump1
aliceb-nv 1a4e270
fix missing cudaSetDevice
aliceb-nv 7ca398c
bump1
aliceb-nv 66992e4
fix missing guard in bb to heuristics solution transmission + cleanup
aliceb-nv 63a71f2
bump1
aliceb-nv 0a6528f
FJ nnz-based work estimation
aliceb-nv d82f2a8
bump1
aliceb-nv 1391f13
cleanup
aliceb-nv 78ee5bd
bump1
aliceb-nv 0107bc2
bump2
aliceb-nv da540f0
bump3
aliceb-nv d6c5046
further cleanup before push
aliceb-nv 2bbc1c4
fix race on subsitutions in probing cache
aliceb-nv bc5d939
gpu-accelerae determinism work estimates
aliceb-nv 7583f56
bump1
aliceb-nv 6ffd6fd
bump2
aliceb-nv 91ef5c8
bump3
aliceb-nv 1cad039
Merge branch 'release/26.04' into gpudet
aliceb-nv dc67a1f
ai review comments
aliceb-nv f551754
bump1
aliceb-nv 9b10fe2
bump2
aliceb-nv c9646ce
bump3
aliceb-nv 07cbfc1
mip nodes flat destruction
aliceb-nv 5c3863f
fix callback uncrush copy size
aliceb-nv a26ac85
bug fixes, cleanup
aliceb-nv 8c6eeb7
bump1
aliceb-nv 1994302
bump2
aliceb-nv 1dc2adc
bump3
aliceb-nv c0bb0b1
cleanup ahead of PR
aliceb-nv bcb1dba
pseudocost determinism fixes
aliceb-nv bbde0e7
cleanup ahead of PR, earlier reporting for early incumbents
aliceb-nv dbe90e8
further cleanup ahead of PR
aliceb-nv 5d473e7
fix FJ determinism check
aliceb-nv ba9ce2c
bounds repair logs
aliceb-nv 78e5e29
bump1
aliceb-nv eafa37e
bump2
aliceb-nv 69198ae
bump3
aliceb-nv 29ffb43
more cleanup
aliceb-nv 133fcf1
bugfixes on solution callbacks
aliceb-nv bc7400e
bugfix, cleanup, extra logs to track down remaining divergence
aliceb-nv b4018e3
bump1
aliceb-nv 2c9cc21
bump2
aliceb-nv 583fe65
bump3
aliceb-nv b295f48
adjust test runtime for CI
aliceb-nv 03492f5
disable extra determinism logs
aliceb-nv 957074c
bump1
aliceb-nv 0b1f6b1
bump2
aliceb-nv 3cfd9b1
bump3
aliceb-nv 6d26754
bugfixing
aliceb-nv ec7db3a
extra logs
aliceb-nv 2ffb465
bump1
aliceb-nv 4dc05e9
bump2
aliceb-nv 86ba346
bump3
aliceb-nv e0ef9a6
fix and simplify work accounting for idle wokrers
aliceb-nv 0f4bb86
bump1
aliceb-nv 6848dcc
bump2
aliceb-nv 3951105
bump3
aliceb-nv dd9b716
pass work limit timers by reference
aliceb-nv 6390814
ai review comments
aliceb-nv caedeb9
bump1
aliceb-nv 70382f4
bump 2
aliceb-nv b1db5ab
bump 3
aliceb-nv 1678640
Merge branch 'release/26.04' into gpudet
aliceb-nv dc8f073
precommit fix
aliceb-nv c3f8fc3
download missing test instance
aliceb-nv 7b843bd
Merge branch 'release/26.04' into gpudet
aliceb-nv 629f681
Merge branch 'main' into gpudet
aliceb-nv 75bbfb5
merge cleanup
aliceb-nv a5d830e
tentative crash fix
aliceb-nv a943845
skill details
aliceb-nv be7859f
post merge cleanup
aliceb-nv 40a72db
ai review comments
aliceb-nv 079a84e
ai review comments
aliceb-nv c34cf0b
fix incorrect restore
aliceb-nv 39d01a9
ai review comments
aliceb-nv adc08b0
bump
aliceb-nv 4f8c071
use probing presolver in sequential mode in determinism mode
aliceb-nv b25b34f
with submip
aliceb-nv
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,249 @@ | ||
| <?xml version="1.0" encoding="utf-8"?> | ||
| <ComputeSanitizerOutput> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 4 bytes</text> | ||
| <size>4</size> | ||
| </what> | ||
| <where> | ||
| <func>.*</func> | ||
| </where> | ||
| <hostStack> | ||
| <frame> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>cusparseCsr2cscEx2</func> | ||
| <module>.*libcusparse.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 4 bytes</text> | ||
| <size>4</size> | ||
| </what> | ||
| <where> | ||
| <func>ThreadLoad</func> | ||
| </where> | ||
| <hostStack> | ||
| <frame> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>libcudart.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>cudaLaunchKernel</func> | ||
| </frame> | ||
| <frame> | ||
| <func>.*cub::.*::Device(Segmented)?(Reduce|Scan)(SingleTile)?Kernel.*</func> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 2 bytes</text> | ||
| <size>2</size> | ||
| </what> | ||
| <where> | ||
| <func>ThreadLoad</func> | ||
| </where> | ||
| <hostStack> | ||
| <frame> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>libcudart.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>cudaLaunchKernel</func> | ||
| </frame> | ||
| <frame> | ||
| <func>.*cub::.*::Device(Segmented)?(Reduce|Scan)(SingleTile)?Kernel.*</func> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 8 bytes</text> | ||
| <size>8</size> | ||
| </what> | ||
| <where> | ||
| <func>DeviceSegmentedReduceKernel</func> | ||
| </where> | ||
| </record> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read of size 4 bytes</text> | ||
| <size>4</size> | ||
| </what> | ||
| <where> | ||
| <func>ThreadLoad</func> | ||
| </where> | ||
| <hostStack> | ||
| <frame> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>libcudart.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>libcudart.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcuopt.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>.*Device(Reduce|Scan)Kernel.*</func> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <!-- Rule matching cccl's pattern of copying tuples back to host after reduce_by_keys, which contain uninitialized padding --> | ||
| <!-- Because of aggressive inlining, thrust calls are elided out of the host stack, which prevents a more finely grained rule. In practice this is good enough --> | ||
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| <accessSize>16</accessSize> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyDtoHAsync.*</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <!-- Suppress uninit copies on rmm::device_vector copy constructor - often vector members are allocated but not filled --> | ||
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyAsync</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*librmm.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>rmm::device_buffer::device_buffer</func> | ||
| <module>.*librmm.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <!-- Suppress likely harmless Thrust/CUB tuple-buffer initcheck reads during sort_by_key | ||
| in trivial_presolve's COO->CSC reorder. Source and destination arrays are validated | ||
| immediately before the sort; the warning appears to come from internal tuple staging. --> | ||
| <record> | ||
| <kind>Initcheck</kind> | ||
| <what> | ||
| <text>Uninitialized __global__ memory read</text> | ||
| </what> | ||
| <where> | ||
| <func>transform_kernel</func> | ||
| </where> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuLaunchKernel_ptsz</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>cudaLaunchKernel_ptsz</func> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyAsync</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*librmm.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*librmm.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>rmm::device_uvector.*::device_uvector</func> | ||
| <module>.*libcuopt.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <!-- Uninitialized device-to-device copies are usually harmless - if actually bogus, errors may be caught later on --> | ||
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyDtoDAsync.*</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| <record> | ||
| <kind>InitcheckApiError</kind> | ||
| <level>Error</level> | ||
| <what> | ||
| <text>Host API uninitialized memory access</text> | ||
| </what> | ||
| <hostStack> | ||
| <frame> | ||
| <func>cuMemcpyAsync</func> | ||
| <module>.*libcuda.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <module>.*libcudart.so.*</module> | ||
| </frame> | ||
| <frame> | ||
| <func>cudaMemcpyAsync</func> | ||
| </frame> | ||
| <frame> | ||
| <func>rmm::device_buffer::resize</func> | ||
| <module>.*librmm.so.*</module> | ||
| </frame> | ||
| </hostStack> | ||
| </record> | ||
| </ComputeSanitizerOutput> | ||
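
When reviewing a suppression file like this one, it can help to list each rule's kind, message text, and the host-stack function patterns it matches. The following is a minimal sketch that assumes only the element layout visible in the diff above (`record`, `kind`, `what/text`, `hostStack/frame/func`). The names `SAMPLE` and `summarize` are hypothetical, and `SAMPLE` is a two-record excerpt copied from the diff, not the full file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Two records copied from the diff above; the real file contains more.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<ComputeSanitizerOutput>
  <record>
    <kind>Initcheck</kind>
    <what>
      <text>Uninitialized __global__ memory read of size 4 bytes</text>
      <size>4</size>
    </what>
    <where>
      <func>.*</func>
    </where>
    <hostStack>
      <frame>
        <module>.*libcuda.so.*</module>
      </frame>
      <frame>
        <func>cusparseCsr2cscEx2</func>
        <module>.*libcusparse.so.*</module>
      </frame>
    </hostStack>
  </record>
  <record>
    <kind>InitcheckApiError</kind>
    <level>Error</level>
    <what>
      <text>Host API uninitialized memory access</text>
    </what>
    <hostStack>
      <frame>
        <func>cuMemcpyDtoDAsync.*</func>
        <module>.*libcuda.so.*</module>
      </frame>
    </hostStack>
  </record>
</ComputeSanitizerOutput>
"""


def summarize(xml_text):
    """Return one dict per <record>: kind, message text, host-stack func patterns."""
    # Parse from bytes so the encoding declaration in the prolog is honored.
    root = ET.fromstring(xml_text.encode("utf-8"))
    rules = []
    for rec in root.iter("record"):
        rules.append({
            "kind": rec.findtext("kind"),
            "text": rec.findtext("what/text"),
            # Frames that only constrain <module> contribute no func pattern.
            "host_funcs": [f.text for f in rec.findall("hostStack/frame/func")],
        })
    return rules


rules = summarize(SAMPLE)
kind_counts = Counter(rule["kind"] for rule in rules)
for rule in rules:
    print(rule["kind"], "|", rule["text"], "|", rule["host_funcs"])
```

A summary like this makes overly broad rules easier to spot, for example a record whose only host-stack constraint is a `cuMemcpyDtoDAsync.*` frame.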