
Migrate RMM usage to CCCL MR design #5483

Merged
rapids-bot[bot] merged 18 commits into rapidsai:main from bdice:rmm-cccl-migration
Apr 23, 2026

Conversation

Contributor

@bdice bdice commented Apr 3, 2026

Summary

  • Replace removed rmm::mr::device_memory_resource base class, owning_wrapper, shared_ptr-based resource management, and deprecated per-device resource APIs with CCCL-native memory resource types
  • Use cuda::mr::any_resource<cuda::mr::device_accessible> for owning type-erased storage, rmm::device_async_resource_ref for non-owning references, and value-typed resources (cuda_memory_resource, pinned_host_memory_resource)
  • Pass the memory resource to raft::handle_t as the workspace_resource (3rd) constructor argument, matching the new raft API (stream_view, stream_pool, std::optional<raft::mr::device_resource>)
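
The ownership split described above (an owning, type-erased holder versus a non-owning reference) can be illustrated with a minimal standard-library sketch. Every type here (`counting_resource`, `resource_ref_sketch`, `any_resource_sketch`) is a hypothetical stand-in invented for illustration. None of it is the CCCL or RMM API, and the real types carry stream and property information that this sketch omits.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>

// Hypothetical value-typed resource (a stand-in, not an RMM type).
struct counting_resource {
  std::size_t live = 0;  // number of outstanding allocations
  void* allocate(std::size_t bytes) { ++live; return ::operator new(bytes); }
  void deallocate(void* p, std::size_t) noexcept { --live; ::operator delete(p); }
};

// Non-owning reference: stores a pointer, so the resource must outlive it.
class resource_ref_sketch {
 public:
  template <typename R>
  explicit resource_ref_sketch(R& r)
    : obj_(&r),
      alloc_([](void* o, std::size_t n) { return static_cast<R*>(o)->allocate(n); }),
      dealloc_([](void* o, void* p, std::size_t n) { static_cast<R*>(o)->deallocate(p, n); })
  {}
  void* allocate(std::size_t n) { return alloc_(obj_, n); }
  void deallocate(void* p, std::size_t n) { dealloc_(obj_, p, n); }

 private:
  void* obj_;
  void* (*alloc_)(void*, std::size_t);
  void (*dealloc_)(void*, void*, std::size_t);
};

// Owning type-erased holder: takes the resource by value and keeps it alive.
class any_resource_sketch {
 public:
  template <typename R>
  explicit any_resource_sketch(R r) : impl_(std::make_unique<model<R>>(std::move(r))) {}
  void* allocate(std::size_t n) { return impl_->allocate(n); }
  void deallocate(void* p, std::size_t n) { impl_->deallocate(p, n); }

 private:
  struct concept_t {
    virtual ~concept_t() = default;
    virtual void* allocate(std::size_t) = 0;
    virtual void deallocate(void*, std::size_t) = 0;
  };
  template <typename R>
  struct model final : concept_t {
    explicit model(R r) : res(std::move(r)) {}
    void* allocate(std::size_t n) override { return res.allocate(n); }
    void deallocate(void* p, std::size_t n) override { res.deallocate(p, n); }
    R res;
  };
  std::unique_ptr<concept_t> impl_;
};

// Allocates and frees once through each wrapper; returns true if both worked.
inline bool round_trip_sketch() {
  counting_resource local;
  resource_ref_sketch ref(local);  // refers to `local`, does not copy it
  void* a = ref.allocate(64);
  bool const a_ok = (a != nullptr) && (local.live == 1);
  ref.deallocate(a, 64);

  any_resource_sketch owner{counting_resource{}};  // owns its own copy
  void* b = owner.allocate(64);
  bool const b_ok = (b != nullptr);
  owner.deallocate(b, 64);
  return a_ok && b_ok && local.live == 0;
}
```

In this sketch the reference wrapper observes changes to `local` because it only points at it, while the owning holder can safely outlive the scope that created the resource.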

Depends on rapidsai/rmm#2361.
Depends on rapidsai/ucxx#636.
Depends on rapidsai/raft#2996.
Depends on rapidsai/cuvs#1990.

Files changed

Headers:

  • algorithms.hpp, dendrogram.hpp, legacy/graph.hpp, legacy/functions.hpp: get_current_device_resource() → get_current_device_resource_ref() in default argument expressions
  • host_staging_buffer_manager.hpp: Remove owning_wrapper, store pool_memory_resource by value in a std::optional, accept pinned_host_memory_resource by value in init()
  • large_buffer_manager.hpp: Store pinned_host_memory_resource by value (not shared_ptr), return device_async_resource_ref from get(), std::move the resource into storage
  • mtmg/resource_manager.hpp: Use cuda::mr::any_resource<device_accessible> instead of shared_ptr<device_memory_resource> for per_device_rmm_resources_, use non-deprecated set_per_device_resource, pass resource as workspace_resource to raft::handle_t
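
The "store by value in a std::optional, accept by value in init()" pattern from the host_staging_buffer_manager.hpp change can be sketched as follows. The types `pinned_resource_sketch`, `pool_sketch`, and `staging_manager_sketch` are hypothetical stand-ins, not the RMM or cuGraph classes, and the member names are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>

// Hypothetical stand-in for a value-typed upstream resource.
struct pinned_resource_sketch {
  int id = 0;
};

// Hypothetical stand-in for a pool that has no default constructor.
struct pool_sketch {
  pool_sketch(pinned_resource_sketch upstream, std::size_t initial_bytes)
    : upstream_(std::move(upstream)), initial_bytes_(initial_bytes) {}
  pinned_resource_sketch upstream_;
  std::size_t initial_bytes_;
};

class staging_manager_sketch {
 public:
  // Accept the upstream by value and construct the pool in place.
  void init(pinned_resource_sketch upstream, std::size_t initial_bytes) {
    pool_.emplace(std::move(upstream), initial_bytes);
  }
  bool is_initialized() const { return pool_.has_value(); }
  // Throws std::bad_optional_access if init() has not been called.
  pool_sketch& pool() { return pool_.value(); }

 private:
  std::optional<pool_sketch> pool_;  // empty until init()
};
```

The optional lets the manager exist before the pool does, without a heap allocation or a shared_ptr, even though the pool type is not default-constructible.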

Tests:

  • base_fixture.hpp: Return any_resource<device_accessible> from create_memory_resource(), use value-typed MR factory helpers (make_cuda, make_managed, make_pool, make_binning), switch to non-deprecated set_current_device_resource / get_current_device_resource_ref
  • multi_node_threaded_test.cpp: Switch to non-deprecated set_current_device_resource(resource)
  • mg_graph500_bfs_test.cu, mg_graph500_sssp_test.cu: Store pinned_mr_ as optional<pinned_host_memory_resource> by value, prefer .value() over operator* for optional access

Examples:

  • All 4 example files (sg_graph_algorithms.cpp, mg_graph_algorithms.cpp, vertex_and_edge_partition.cu, graph_operations.cu): Use value-typed cuda_memory_resource, non-deprecated set_current_device_resource, pass the resource to raft::handle_t as the workspace_resource (3rd positional arg, with nullptr for the unused stream_pool)
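
The positional call described above (resource as the third argument, nullptr for the unused stream pool) can be sketched with stand-in types. `handle_sketch` mirrors only the parameter order stated in the summary (stream_view, stream_pool, optional resource). It is a hypothetical illustration and not the raft::handle_t definition.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <utility>

// Hypothetical stand-ins for the stream, stream pool, and resource types.
struct stream_view_sketch {};
struct stream_pool_sketch {};
struct device_resource_sketch {
  int id = 0;
};

// Parameter order only: (stream_view, stream_pool, optional workspace resource).
struct handle_sketch {
  explicit handle_sketch(stream_view_sketch = {},
                         std::shared_ptr<stream_pool_sketch> pool = nullptr,
                         std::optional<device_resource_sketch> workspace = std::nullopt)
    : pool_(std::move(pool)), workspace_(std::move(workspace)) {}

  std::shared_ptr<stream_pool_sketch> pool_;
  std::optional<device_resource_sketch> workspace_;
};

// Builds a handle with no stream pool and an explicit workspace resource.
inline handle_sketch make_handle_sketch(device_resource_sketch res) {
  // nullptr fills the stream_pool slot so `res` lands in the third position.
  return handle_sketch(stream_view_sketch{}, nullptr, std::move(res));
}
```

Because the resource is the third positional parameter, a caller that does not use a stream pool still has to supply a placeholder for the second one.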

Replace removed rmm::mr::device_memory_resource base class, owning_wrapper,
shared_ptr-based resource management, and deprecated per-device resource APIs
with CCCL-native memory resource types: value-typed resources,
cuda::mr::any_resource for owning type-erased storage, and
rmm::device_async_resource_ref for non-owning references.

copy-pr-bot Bot commented Apr 3, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@bdice bdice force-pushed the rmm-cccl-migration branch 4 times, most recently from 70478b1 to bfe9d12 Compare April 17, 2026 07:09
@bdice bdice added the breaking (Breaking change) and improvement (Improvement / enhancement to an existing function) labels Apr 17, 2026
@bdice bdice force-pushed the rmm-cccl-migration branch from bfe9d12 to 45d182c Compare April 17, 2026 07:29
@bdice bdice force-pushed the rmm-cccl-migration branch 5 times, most recently from e42e090 to 09ab98b Compare April 18, 2026 15:52
@bdice bdice marked this pull request as ready for review April 20, 2026 17:30
@bdice bdice requested review from a team as code owners April 20, 2026 17:30
@bdice bdice requested a review from gforsyth April 20, 2026 17:30
Contributor Author

@bdice bdice left a comment

Self-review, leaving comments for an agent to address.


  std::unique_ptr<raft::handle_t> handle =
-   std::make_unique<raft::handle_t>(rmm::cuda_stream_per_thread, resource);
+   std::make_unique<raft::handle_t>(rmm::cuda_stream_per_thread);
Contributor Author

TODO: I'm not sure if removing the resource argument here is right or wrong. Same applies in the other examples.

rmm::align_down(std::min(free, total / 6), rmm::CUDA_ALLOCATION_ALIGNMENT);

auto per_device_it = per_device_rmm_resources_.insert(
auto upstream = rmm::mr::cuda_memory_resource();
Contributor Author

Inline this in the pool_memory_resource constructor, we don't need the upstream variable.

std::pair{global_rank,
- rmm::mr::make_owning_wrapper<rmm::mr::pool_memory_resource>(
-   std::make_shared<rmm::mr::cuda_memory_resource>(), min_alloc)});
+ cuda::mr::any_resource<cuda::mr::device_accessible>(
Contributor Author

Do we really need this explicit cast to any_resource? Remove if possible.

#endif

- rmm::mr::set_per_device_resource_ref(local_device_id, per_device_it.first->second.get());
+ rmm::mr::set_per_device_resource_ref(local_device_id,
Contributor Author

Deprecated. Use rmm::mr::set_per_device_resource instead.

Comment thread cpp/include/cugraph/mtmg/resource_manager.hpp
Comment on lines +189 to +190
handles.push_back(std::make_unique<raft::handle_t>(
rmm::cuda_stream_per_thread, std::make_shared<rmm::cuda_stream_pool>(n_streams)));
Contributor Author

Again, why aren't we passing the resource anymore?

- large_memory_buffer_resource_t(std::shared_ptr<rmm::mr::pinned_host_memory_resource> mr) : mr_(mr)
- {
- }
+ large_memory_buffer_resource_t(rmm::mr::pinned_host_memory_resource mr) : mr_(mr) {}
Contributor Author

We should be able to std::move mr into mr_.

rmm::mr::pinned_host_memory_resource mr)
{
- return detail::large_memory_buffer_resource_t(std::move(mr));
+ return detail::large_memory_buffer_resource_t(mr);
Contributor Author

Similar here, we should be able to std::move.
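
The effect of the two std::move suggestions can be seen in a small sketch that counts copies and moves. `tracked_resource`, `holder_copy`, and `holder_move` are hypothetical names used only here. The real pinned_host_memory_resource may be cheap to copy, in which case the move is mostly a matter of idiom.

```cpp
#include <cassert>
#include <utility>

// Hypothetical resource that records how it was transferred.
struct tracked_resource {
  int copies = 0;
  int moves = 0;
  tracked_resource() = default;
  tracked_resource(tracked_resource const& o) : copies(o.copies + 1), moves(o.moves) {}
  tracked_resource(tracked_resource&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// By-value parameter copied into the member: one extra copy.
struct holder_copy {
  explicit holder_copy(tracked_resource mr) : mr_(mr) {}
  tracked_resource mr_;
};

// By-value parameter moved into the member: no copy.
struct holder_move {
  explicit holder_move(tracked_resource mr) : mr_(std::move(mr)) {}
  tracked_resource mr_;
};
```

A by-value parameter is already a private object of the callee, so moving it into the member gives up nothing the caller can observe.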

cugraph::large_buffer_manager::init(
*handle_,
- cugraph::large_buffer_manager::create_memory_buffer_resource(pinned_mr_),
+ cugraph::large_buffer_manager::create_memory_buffer_resource(*pinned_mr_),
Contributor Author

I prefer to call .value() instead of dereferencing optionals. I think that should be fixed everywhere in the diff of this PR.
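
One reason to prefer .value() is that it throws std::bad_optional_access on an empty optional, whereas operator* on an empty optional is undefined behavior. A minimal sketch of the difference, using a hypothetical `host_resource_sketch` in place of the real resource type:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the value stored in the optional.
struct host_resource_sketch {
  int id = 0;
};

// Returns true if calling value() on `opt` throws std::bad_optional_access.
inline bool value_throws(std::optional<host_resource_sketch> const& opt) {
  try {
    (void)opt.value();  // checked access
    return false;
  } catch (std::bad_optional_access const&) {
    return true;
  }
}
```

With .value(), a test fixture that forgets to initialize the optional fails with a catchable exception instead of silently reading an object that does not exist.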

@ChuckHastings ChuckHastings requested review from a team as code owners April 22, 2026 20:57
@bdice bdice removed request for a team and gforsyth April 22, 2026 21:47
bdice added 2 commits April 22, 2026 22:04
- resource_manager.hpp: inline pool upstream, drop explicit any_resource
  cast, use non-deprecated set_per_device_resource, restore per_device_it
  from insert, and pass workspace_resource to raft::handle_t.
- large_buffer_manager.hpp: std::move pinned_host_memory_resource in
  constructor and create_memory_buffer_resource.
- mg_graph500_{bfs,sssp}_test.cu: prefer std::optional::value() over
  operator*.
- examples: pass workspace_resource to raft::handle_t as the third
  positional argument.
@bdice bdice self-assigned this Apr 23, 2026
Contributor Author

@bdice bdice left a comment

Self-review: I'm now happy with these changes. Thanks @ChuckHastings @KyleFromNVIDIA @vyasr and others for all the support with build issues.

Contributor Author

bdice commented Apr 23, 2026

/merge

Contributor Author

bdice commented Apr 23, 2026

CUDA 12.2 tests are failing like this:

./../../..//bin/gtests/libcugraph/LOUVAIN_TEST: symbol lookup error: /opt/conda/envs/test/bin/gtests/libcugraph/../../../lib/libcugraph.so: undefined symbol: cudaLibraryGetKernel, version libcudart.so.12

This is happening because we reverted the changes that statically link libcudart. The cudaLibraryGetKernel function was introduced sometime after CUDA 12.2.

libcugraph now requires a CUDA >= 12.8 cudart at runtime (via cuVS's
use of cudaLibraryGetKernel), so the 12.2.2 conda/wheel test jobs fail
with:

  undefined symbol: cudaLibraryGetKernel, version libcudart.so.12

Filter the test matrices in pr.yaml and test.yaml to CUDA >= 12.9 until
a solution is implemented.

Tracked in rapidsai#5498.
@rapids-bot rapids-bot Bot merged commit a125497 into rapidsai:main Apr 23, 2026
74 checks passed
@jakirkham
Member

Thanks all! 🙏

rapids-bot Bot pushed a commit that referenced this pull request May 9, 2026
This property, added to rmm in rapidsai/rmm#2317 and used in cugraph in #5483, is a holdover from when rmm enforced linking of dynamic cudart. rapidsai/rmm#2375 switched to static cudart and stopped querying the property, so remove it.

Issue: rapidsai/build-planning#235

Authors:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #5508

Labels

breaking (Breaking change), improvement (Improvement / enhancement to an existing function)

4 participants