Migrate RMM usage to CCCL MR design by bdice · Pull Request #5483 · rapidsai/cugraph

bdice · 2026-04-03T23:25:54Z

Summary

Replace removed rmm::mr::device_memory_resource base class, owning_wrapper, shared_ptr-based resource management, and deprecated per-device resource APIs with CCCL-native memory resource types
Use cuda::mr::any_resource<cuda::mr::device_accessible> for owning type-erased storage, rmm::device_async_resource_ref for non-owning references, and value-typed resources (cuda_memory_resource, pinned_host_memory_resource)
Pass the memory resource to raft::handle_t as the workspace_resource (3rd) constructor argument, matching the new raft API (stream_view, stream_pool, std::optional<raft::mr::device_resource>)
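
The ownership split described above (an owning type-erased holder plus cheap non-owning references) can be sketched in plain C++. This is only an illustration of the design idea: `counting_resource`, `owning_resource`, and `resource_ref` are hypothetical stand-ins written for this sketch, not the CCCL or RMM types, and the real `cuda::mr::any_resource` and `rmm::device_async_resource_ref` interfaces may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical value-typed resource: tracks how many bytes are outstanding.
struct counting_resource {
  std::size_t bytes_in_use{0};
  void* allocate(std::size_t n) { bytes_in_use += n; return ::operator new(n); }
  void deallocate(void* p, std::size_t n) noexcept { bytes_in_use -= n; ::operator delete(p); }
};

// Hypothetical owning, type-erased holder (the role any_resource plays in the PR).
class owning_resource {
  struct iface {
    virtual ~iface() = default;
    virtual void* allocate(std::size_t) = 0;
    virtual void deallocate(void*, std::size_t) = 0;
    virtual std::size_t in_use() const = 0;
  };
  template <typename R>
  struct model final : iface {
    R res;  // the concrete resource is stored by value, not via shared_ptr
    explicit model(R r) : res(std::move(r)) {}
    void* allocate(std::size_t n) override { return res.allocate(n); }
    void deallocate(void* p, std::size_t n) override { res.deallocate(p, n); }
    std::size_t in_use() const override { return res.bytes_in_use; }
  };
  std::unique_ptr<iface> impl_;

 public:
  template <typename R>
  explicit owning_resource(R r) : impl_(std::make_unique<model<R>>(std::move(r))) {}
  void* allocate(std::size_t n) { return impl_->allocate(n); }
  void deallocate(void* p, std::size_t n) { impl_->deallocate(p, n); }
  std::size_t in_use() const { return impl_->in_use(); }
};

// Hypothetical non-owning reference: copyable, and valid only while the owner lives.
class resource_ref {
  owning_resource* target_;

 public:
  resource_ref(owning_resource& r) noexcept : target_(&r) {}
  void* allocate(std::size_t n) { return target_->allocate(n); }
  void deallocate(void* p, std::size_t n) { target_->deallocate(p, n); }
};

// Allocates through a copied reference and reports what the owner observed.
inline std::size_t bytes_seen_by_owner_during_allocation() {
  owning_resource owner{counting_resource{}};
  resource_ref ref{owner};
  resource_ref copy = ref;  // copying the reference does not copy the resource
  void* p = copy.allocate(256);
  std::size_t const during = owner.in_use();
  ref.deallocate(p, 256);
  assert(owner.in_use() == 0);
  return during;
}
```

In this sketch the owner is what a container such as a per-device map would store, while references are what gets handed to consumers; the owner has to outlive every reference taken from it.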

Depends on rapidsai/rmm#2361.
Depends on rapidsai/ucxx#636.
Depends on rapidsai/raft#2996.
Depends on rapidsai/cuvs#1990.

Files changed

Headers:

algorithms.hpp, dendrogram.hpp, legacy/graph.hpp, legacy/functions.hpp: get_current_device_resource() → get_current_device_resource_ref() in default argument expressions
host_staging_buffer_manager.hpp: Remove owning_wrapper, store pool_memory_resource by value in a std::optional, accept pinned_host_memory_resource by value in init()
large_buffer_manager.hpp: Store pinned_host_memory_resource by value (not shared_ptr), return device_async_resource_ref from get(), std::move the resource into storage
mtmg/resource_manager.hpp: Use cuda::mr::any_resource<device_accessible> instead of shared_ptr<device_memory_resource> for per_device_rmm_resources_, use non-deprecated set_per_device_resource, pass resource as workspace_resource to raft::handle_t

Tests:

base_fixture.hpp: Return any_resource<device_accessible> from create_memory_resource(), use value-typed MR factory helpers (make_cuda, make_managed, make_pool, make_binning), switch to non-deprecated set_current_device_resource / get_current_device_resource_ref
multi_node_threaded_test.cpp: Switch to non-deprecated set_current_device_resource(resource)
mg_graph500_bfs_test.cu, mg_graph500_sssp_test.cu: Store pinned_mr_ as optional<pinned_host_memory_resource> by value, prefer .value() over operator* for optional access

Examples:

All 4 example files (sg_graph_algorithms.cpp, mg_graph_algorithms.cpp, vertex_and_edge_partition.cu, graph_operations.cu): Use value-typed cuda_memory_resource, non-deprecated set_current_device_resource, pass the resource to raft::handle_t as the workspace_resource (3rd positional arg, with nullptr for the unused stream_pool)

Replace removed rmm::mr::device_memory_resource base class, owning_wrapper, shared_ptr-based resource management, and deprecated per-device resource APIs with CCCL-native memory resource types: value-typed resources, cuda::mr::any_resource for owning type-erased storage, and rmm::device_async_resource_ref for non-owning references.

copy-pr-bot · 2026-04-03T23:25:58Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

bdice

Self-review, leaving comments for an agent to address.

bdice · 2026-04-22T20:28:40Z


  std::unique_ptr<raft::handle_t> handle =
-    std::make_unique<raft::handle_t>(rmm::cuda_stream_per_thread, resource);
+    std::make_unique<raft::handle_t>(rmm::cuda_stream_per_thread);


TODO: I'm not sure if removing the resource argument here is right or wrong. Same applies in the other examples.

bdice · 2026-04-22T20:31:36Z

      rmm::align_down(std::min(free, total / 6), rmm::CUDA_ALLOCATION_ALIGNMENT);

-    auto per_device_it = per_device_rmm_resources_.insert(
+    auto upstream = rmm::mr::cuda_memory_resource();


Inline this in the pool_memory_resource constructor; we don't need the upstream variable.

bdice · 2026-04-22T20:31:52Z

      std::pair{global_rank,
-                rmm::mr::make_owning_wrapper<rmm::mr::pool_memory_resource>(
-                  std::make_shared<rmm::mr::cuda_memory_resource>(), min_alloc)});
+                cuda::mr::any_resource<cuda::mr::device_accessible>(


Do we really need this explicit cast to any_resource? Remove if possible.

bdice · 2026-04-22T20:35:31Z

 #endif

-    rmm::mr::set_per_device_resource_ref(local_device_id, per_device_it.first->second.get());
+    rmm::mr::set_per_device_resource_ref(local_device_id,


Deprecated. Use rmm::mr::set_per_device_resource instead.

bdice · 2026-04-22T20:36:10Z

+      handles.push_back(std::make_unique<raft::handle_t>(
+        rmm::cuda_stream_per_thread, std::make_shared<rmm::cuda_stream_pool>(n_streams)));


Again, why aren't we passing the resource anymore?

bdice · 2026-04-22T20:38:04Z

-  large_memory_buffer_resource_t(std::shared_ptr<rmm::mr::pinned_host_memory_resource> mr) : mr_(mr)
-  {
-  }
+  large_memory_buffer_resource_t(rmm::mr::pinned_host_memory_resource mr) : mr_(mr) {}


We should be able to std::move mr into mr_.

bdice · 2026-04-22T20:38:26Z

+    rmm::mr::pinned_host_memory_resource mr)
  {
-    return detail::large_memory_buffer_resource_t(std::move(mr));
+    return detail::large_memory_buffer_resource_t(mr);


Similar here, we should be able to std::move.
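
The two comments above ask for a by-value parameter to be moved into its destination instead of copied. A minimal sketch of the difference follows, assuming a hypothetical `tracked_resource` that only counts its own copies and moves; it stands in for the value-typed resource and is not an RMM type. The `holder_*` and `make_holder_*` names are likewise made up for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a value-typed memory resource.
// It records how many copies and moves produced this instance.
struct tracked_resource {
  int copies{0};
  int moves{0};
  tracked_resource() = default;
  tracked_resource(tracked_resource const& o) : copies(o.copies + 1), moves(o.moves) {}
  tracked_resource(tracked_resource&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// By-value parameter copied into the member (the form the review flags).
struct holder_copy {
  explicit holder_copy(tracked_resource mr) : mr_(mr) {}
  tracked_resource mr_;
};

// By-value parameter moved into the member (the form the review requests).
struct holder_move {
  explicit holder_move(tracked_resource mr) : mr_(std::move(mr)) {}
  tracked_resource mr_;
};

// Factory helpers mirroring the two variants of the factory function in the diff.
inline holder_copy make_holder_copy(tracked_resource mr) { return holder_copy(mr); }
inline holder_move make_holder_move(tracked_resource mr) { return holder_move(std::move(mr)); }
```

With these stand-ins, the copying path makes two copies on the way into the member and the moving path makes none. Whether that matters for the real resource types depends on how expensive they are to copy, which this sketch does not model.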

bdice · 2026-04-22T20:39:21Z

    cugraph::large_buffer_manager::init(
      *handle_,
-      cugraph::large_buffer_manager::create_memory_buffer_resource(pinned_mr_),
+      cugraph::large_buffer_manager::create_memory_buffer_resource(*pinned_mr_),


I prefer to call .value() instead of dereferencing optionals. I think that should be fixed everywhere in the diff of this PR.
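
One reason to prefer `.value()` is that it fails loudly: on an empty `std::optional` it throws `std::bad_optional_access`, whereas dereferencing an empty optional with `operator*` is undefined behavior. The sketch below demonstrates only the checked path; `value_throws_when_empty` is a hypothetical helper, and `std::string` stands in for the resource type.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical helper: reports whether .value() throws for the given optional.
inline bool value_throws_when_empty(std::optional<std::string> const& opt) {
  try {
    (void)opt.value();  // checked access: throws if opt holds no value
    return false;
  } catch (std::bad_optional_access const&) {
    return true;
  }
}
```

The unchecked `*opt` form is deliberately not exercised here, since calling it on an empty optional has no defined result to test.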

- resource_manager.hpp: inline pool upstream, drop explicit any_resource cast, use non-deprecated set_per_device_resource, restore per_device_it from insert, and pass workspace_resource to raft::handle_t.
- large_buffer_manager.hpp: std::move pinned_host_memory_resource in constructor and create_memory_buffer_resource.
- mg_graph500_{bfs,sssp}_test.cu: prefer std::optional::value() over operator*.
- examples: pass workspace_resource to raft::handle_t as the third positional argument.

bdice

Self-review: I'm now happy with these changes. Thanks @ChuckHastings @KyleFromNVIDIA @vyasr and others for all the support with build issues.

bdice · 2026-04-23T04:13:16Z

/merge

bdice · 2026-04-23T04:36:06Z

CUDA 12.2 tests are failing like this:

./../../..//bin/gtests/libcugraph/LOUVAIN_TEST: symbol lookup error: /opt/conda/envs/test/bin/gtests/libcugraph/../../../lib/libcugraph.so: undefined symbol: cudaLibraryGetKernel, version libcudart.so.12

This is happening because we reverted the changes that statically link libcudart. The cudaLibraryGetKernel function was introduced sometime after CUDA 12.2.

libcugraph now requires a CUDA >= 12.8 cudart at runtime (via cuVS's use of cudaLibraryGetKernel), so the 12.2.2 conda/wheel test jobs fail with:

undefined symbol: cudaLibraryGetKernel, version libcudart.so.12

Filter the test matrices in pr.yaml and test.yaml to CUDA >= 12.9 until a solution is implemented. Tracked in rapidsai#5498.

jakirkham · 2026-04-23T07:42:25Z

Thanks all! 🙏

The NO_CUDART_DEP property, added to rmm in rapidsai/rmm#2317 and used in cugraph in #5483, is a holdover from when rmm enforced linking of dynamic cudart. rapidsai/rmm#2375 switched to static cudart and stopped querying the property, so remove it.

Issue: rapidsai/build-planning#235

Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
- Bradley Dice (https://github.com/bdice)

URL: #5508

bdice mentioned this pull request Apr 4, 2026

[FEA] Support memory resources from CCCL 3.2 rapidsai/rmm#2011

Open

51 tasks

bdice and others added 5 commits April 14, 2026 19:17

Merge remote-tracking branch 'upstream/main' into rmm-cccl-migration

39a3296

Inline upstream memory resource variable in test fixture MR composition

e8e30a3

Some cmake changes to try and resolve latest JIT issue

c0332ec

add host dependency on cuda-nvrtc

9a3ba99

Replace deprecated set_current_device_resource_ref with set_current_device_resource

dd809dd

bdice force-pushed the rmm-cccl-migration branch 4 times, most recently from 70478b1 to bfe9d12 Compare April 17, 2026 07:09

bdice added the breaking (Breaking change) and improvement (Improvement / enhancement to an existing function) labels Apr 17, 2026

bdice force-pushed the rmm-cccl-migration branch from bfe9d12 to 45d182c Compare April 17, 2026 07:29

ChuckHastings added 5 commits April 17, 2026 07:46

add cudart to run recipe also

f586a14

ignore cudart in exports

5e24d0d

try tweaking the recipe a bit more

73564ef

try tweaking the recipe a bit more

1085cf2

try tweaking the recipe a bit more

9793fdf

bdice force-pushed the rmm-cccl-migration branch 5 times, most recently from e42e090 to 09ab98b Compare April 18, 2026 15:52

revert cmake changes to see if the recipe fixes without cmake will work better

04d9ab1

bdice marked this pull request as ready for review April 20, 2026 17:30

bdice requested review from a team as code owners April 20, 2026 17:30

bdice requested a review from gforsyth April 20, 2026 17:30

Add back the properties, removing them introduced the original error again

c04ffca

bdice mentioned this pull request Apr 21, 2026

CCCL Memory Resource Migration — Merge Train rapidsai/rmm#2364

Closed

bdice force-pushed the rmm-cccl-migration branch from 09ab98b to dd809dd Compare April 21, 2026 11:57

bdice commented Apr 22, 2026

ChuckHastings added 2 commits April 22, 2026 13:54

Missed a spot where we were statically linking cudart.

9488ae4

Merge branch 'cudart_issue' into rmm-cccl-migration

7c10fb2

ChuckHastings requested review from a team as code owners April 22, 2026 20:57

ChuckHastings mentioned this pull request Apr 22, 2026

Reduce binary size part2 #5486

Merged

bdice removed request for a team and gforsyth April 22, 2026 21:47

ChuckHastings approved these changes Apr 22, 2026

vyasr approved these changes Apr 22, 2026

bdice added 2 commits April 22, 2026 22:04

Merge branch 'rmm-cccl-migration' of https://github.com/bdice/cugraph into rmm-cccl-migration

ebdabdd

bdice self-assigned this Apr 23, 2026

bdice commented Apr 23, 2026

bdice mentioned this pull request Apr 23, 2026

Re-enable CUDA 12.2 CI matrix entries #5498

Open

rapids-bot Bot merged commit a125497 into rapidsai:main Apr 23, 2026
74 checks passed

bdice mentioned this pull request Apr 23, 2026

cugraph tests failing due to missing JIT function on CUDA 12 #5496

Closed

rlratzel mentioned this pull request May 1, 2026

Temporarily disables pre-CUDA 12.9 test runs due to libcugraph limitation rapidsai/nx-cugraph#257

Closed

KyleFromNVIDIA mentioned this pull request May 6, 2026

Remove NO_CUDART_DEP property #5508

Merged
