feat: add multiple_blocks_allocation RAII handle for fixed_size_memory_resource by nirandaperera · Pull Request #2368 · rapidsai/rmm

nirandaperera · 2026-04-24T01:13:09Z

Summary

Adds multiple_blocks_allocation, an RAII class that allocates device memory spanning one or more fixed-size blocks from a fixed_size_memory_resource in a single stream-ordered call.
Exposes multiple_blocks_allocation::make_async(mr, size, stream) as the sole public factory, which acquires the required number of blocks under a single mutex lock and records a CUDA event for correct stream ordering.
On destruction the held blocks are returned to the resource's free list via a new internal deallocate_blocks_async_unsafe path (caller holds the mutex) — avoiding a second lock round-trip.
Promotes get_event / get_block to protected in stream_ordered_memory_resource so that fixed_size_memory_resource_impl can grant friend access to multiple_blocks_allocation.
Non-inline method bodies live in cpp/src/mr/fixed_size_memory_resource.cpp to keep the public header lean.
Adds fixed_size_mr_test.cpp which exercises allocation sizes of 0, sub-block, exact-block, and multi-block across cuda_memory_resource and cuda_async_memory_resource upstreams, using statistics_resource_adaptor to verify byte counts.
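To make the lifecycle described in the summary concrete, here is a minimal host-only sketch of the pattern: a factory acquires every block under one lock, and the destructor returns them under one lock. All names (`toy_block_pool`, `toy_multi_block_handle`, `get_block_unsafe`, `return_blocks_unsafe`) are hypothetical stand-ins and not RMM's API; CUDA streams and events are left out entirely, so this illustrates only the locking and ownership shape, not the stream ordering.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <new>
#include <utility>
#include <vector>

// Hypothetical host-only stand-in for a fixed-size block pool (not RMM's implementation).
class toy_block_pool {
 public:
  toy_block_pool(std::size_t block_size, std::size_t num_blocks)
    : block_size_(block_size), storage_(block_size * num_blocks)
  {
    free_list_.reserve(num_blocks);  // later returns never need to reallocate
    for (std::size_t i = 0; i < num_blocks; ++i) {
      free_list_.push_back(storage_.data() + i * block_size);
    }
  }

  std::size_t block_size() const noexcept { return block_size_; }
  std::mutex& mutex() noexcept { return mutex_; }

  std::size_t free_blocks()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return free_list_.size();
  }

  // "Unsafe" members: the caller must already hold mutex().
  std::byte* get_block_unsafe()
  {
    if (free_list_.empty()) { throw std::bad_alloc(); }
    auto* block = free_list_.back();
    free_list_.pop_back();
    return block;
  }

  void return_blocks_unsafe(std::vector<std::byte*>&& blocks)
  {
    free_list_.insert(free_list_.end(), blocks.begin(), blocks.end());
    blocks.clear();
  }

 private:
  std::size_t block_size_;
  std::vector<std::byte> storage_;
  std::vector<std::byte*> free_list_;
  std::mutex mutex_;
};

// Hypothetical RAII handle spanning one or more blocks of the toy pool.
class toy_multi_block_handle {
 public:
  static std::unique_ptr<toy_multi_block_handle> make(std::shared_ptr<toy_block_pool> pool,
                                                      std::size_t size)
  {
    toy_block_pool& self = *pool;
    // Ceiling division: 0 bytes needs 0 blocks, 1..block_size bytes needs 1 block.
    std::size_t const num_blocks = (size + self.block_size() - 1) / self.block_size();
    std::vector<std::byte*> blocks;
    blocks.reserve(num_blocks);
    std::lock_guard<std::mutex> lock(self.mutex());  // one lock for the whole batch
    try {
      for (std::size_t i = 0; i < num_blocks; ++i) { blocks.push_back(self.get_block_unsafe()); }
      return std::unique_ptr<toy_multi_block_handle>(
        new toy_multi_block_handle(size, std::move(blocks), std::move(pool)));
    } catch (...) {
      self.return_blocks_unsafe(std::move(blocks));  // roll back a partial acquisition
      throw;
    }
  }

  toy_multi_block_handle(toy_multi_block_handle const&)            = delete;
  toy_multi_block_handle& operator=(toy_multi_block_handle const&) = delete;

  ~toy_multi_block_handle()
  {
    if (blocks_.empty()) { return; }
    std::lock_guard<std::mutex> lock(pool_->mutex());
    pool_->return_blocks_unsafe(std::move(blocks_));
  }

  std::size_t size() const noexcept { return size_; }
  std::size_t num_blocks() const noexcept { return blocks_.size(); }
  std::size_t capacity() const noexcept { return blocks_.size() * pool_->block_size(); }

 private:
  toy_multi_block_handle(std::size_t size,
                         std::vector<std::byte*>&& blocks,
                         std::shared_ptr<toy_block_pool>&& pool) noexcept
    : size_(size), blocks_(std::move(blocks)), pool_(std::move(pool))
  {
  }

  std::size_t size_;
  std::vector<std::byte*> blocks_;
  std::shared_ptr<toy_block_pool> pool_;  // keeps the pool alive as long as the handle
};
```

With a 64-byte block size, a 100-byte request would occupy two blocks (capacity 128) and hand both back when the handle goes out of scope.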

Test plan

Build and run FIXED_SIZE_MR_TEST (all parameter combinations pass).
Verify no deadlocks or double-free under ASAN/TSAN.
Confirm existing fixed_size_mr tests still pass.

Made with Cursor

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai · 2026-04-24T01:20:08Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

📝 Walkthrough

Adds a public RAII handle multiple_blocks_allocation for multi-block async allocations, introduces a private async batch deallocator and friend/forward declarations in fixed-size MR internals, makes get_event protected in stream-ordered MR, and adds tests covering async multi-block allocation/deallocation and upstream accounting.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Public API: fixed-size MR `cpp/include/rmm/mr/fixed_size_memory_resource.hpp` | Adds exported `multiple_blocks_allocation` RAII type with `make_async`, accessors (`size`, `capacity`, `block_size`, `stream`), block views (`get_blocks`, `operator[]`, `at`), `clear()`, deleted copy, and move support. |
| Impl header plumbing `cpp/include/rmm/mr/detail/fixed_size_memory_resource_impl.hpp` | Adds `<memory>` include, forward declarations for `fixed_size_memory_resource` and `multiple_blocks_allocation`, grants friendship to `multiple_blocks_allocation`, and adds a private `deallocate_blocks_async_unsafe(std::vector<std::byte*>&&, cuda_stream_view)` helper (caller must hold MR mutex). |
| Stream-ordered MR visibility `cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp` | Changes `get_event(cuda_stream_view)` visibility from private to protected to allow derived classes to access stream event retrieval. |
| Implementation: allocation & async dealloc `cpp/src/mr/detail/fixed_size_memory_resource_impl.cpp`, `cpp/src/mr/fixed_size_memory_resource.cpp` | Implements `deallocate_blocks_async_unsafe` (converts freed pointers to free list, records CUDA event, inserts blocks stream-ordered) and implements `multiple_blocks_allocation` (factory `make_async`, constructors/move, `clear()`, destructor with async return via MR mutex and exception-to-log handling; ensures exception safety when partially allocating). |
| Tests & build `cpp/tests/CMakeLists.txt`, `cpp/tests/mr/fixed_size_mr_test.cpp` | Registers a new fixed-size MR test and adds a parameterized GTest that exercises async multi-block allocations/deallocations across streams/threads and validates upstream byte accounting. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Merge staging into main: CCCL memory resource migration #2361: Modifies the same fixed-size memory-resource internals and stream-ordered infrastructure; strongly related.

Suggested reviewers

ttnghia
bdice
harrism

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.91% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main addition: a new RAII handle class for multi-block allocation in fixed-size memory resource. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, detailing the RAII class functionality, API design, implementation strategy, and test coverage. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

cpp/tests/mr/fixed_size_mr_test.cpp (1)
58-115: Add a multithreaded test for the new allocation/destruction path.

This only exercises single-threaded accounting. The feature change is in the mutex/event path of make_async and destruction, so a test that allocates and destroys handles from multiple CPU threads would give much better coverage of the new synchronization logic. Based on learnings "Test concurrent allocations and deallocations from multiple threads; verify thread safety of pool implementations."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/tests/mr/fixed_size_mr_test.cpp` around lines 58 - 115, The test only
exercises single-threaded allocation/deallocation of fixed_size_memory_resource
and does not cover the new mutex/event synchronization in
rmm::mr::multiple_blocks_allocation::make_async and its destruction path; add a
multithreaded variant of TEST_P(FixedSizeMRTest,
AllocateBlocksAsyncUpstreamCountedDeallocateDoesNotReturnToUpstream) that spawns
several CPU threads which concurrently call
multiple_blocks_allocation::make_async on the same fixed_size_mr instance (using
the same stream_pool or distinct streams) and then clear/destroy the handles
concurrently, asserting the same invariants (handle->size(), handle->capacity(),
counting.get_bytes_counter().value behavior) both before and after fixed_size_mr
destruction to exercise and verify the mutex/event synchronization path in
make_async and destructor.
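As a rough illustration of the kind of multithreaded coverage requested above, the following host-only sketch hammers a mutex-protected pool from several CPU threads and checks two invariants: no block is ever owned by two threads at once, and every block is back in the pool at the end. `toy_pool` and `run_concurrent_stress` are hypothetical names; a real test would use `fixed_size_memory_resource`, `multiple_blocks_allocation::make_async`, and CUDA streams instead.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in pool: hands out block indices under a mutex.
class toy_pool {
 public:
  explicit toy_pool(std::size_t num_blocks)
  {
    for (std::size_t i = 0; i < num_blocks; ++i) { free_.push_back(i); }
  }

  // Acquires `count` blocks in one critical section, or none at all.
  bool try_acquire(std::size_t count, std::vector<std::size_t>& out)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    if (free_.size() < count) { return false; }
    for (std::size_t i = 0; i < count; ++i) {
      out.push_back(free_.back());
      free_.pop_back();
    }
    return true;
  }

  // Returns every block in `blocks` in one critical section.
  void release(std::vector<std::size_t>& blocks)
  {
    std::lock_guard<std::mutex> lock(mutex_);
    free_.insert(free_.end(), blocks.begin(), blocks.end());
    blocks.clear();
  }

  std::size_t free_count()
  {
    std::lock_guard<std::mutex> lock(mutex_);
    return free_.size();
  }

 private:
  std::vector<std::size_t> free_;
  std::mutex mutex_;
};

// Returns true if no block was owned by two threads at once and all blocks came back.
bool run_concurrent_stress(std::size_t num_threads, std::size_t iterations)
{
  constexpr std::size_t blocks_per_alloc = 2;
  std::size_t const total                = num_threads * blocks_per_alloc;
  toy_pool pool(total);

  std::vector<std::atomic<int>> owners(total);
  for (auto& owner : owners) { owner.store(0); }
  std::atomic<bool> overlap{false};

  std::vector<std::thread> threads;
  for (std::size_t t = 0; t < num_threads; ++t) {
    threads.emplace_back([&] {
      std::vector<std::size_t> mine;
      for (std::size_t i = 0; i < iterations; ++i) {
        if (!pool.try_acquire(blocks_per_alloc, mine)) { continue; }
        // A non-zero previous owner count means the pool handed out a block twice.
        for (auto block : mine) {
          if (owners[block].fetch_add(1) != 0) { overlap.store(true); }
        }
        for (auto block : mine) { owners[block].fetch_sub(1); }
        pool.release(mine);
      }
    });
  }
  for (auto& thread : threads) { thread.join(); }

  return !overlap.load() && pool.free_count() == total;
}
```

Running the same routine under TSAN, as the test plan suggests for the real code, would additionally flag unsynchronized access to the free list.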

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/src/mr/fixed_size_memory_resource.cpp`:
- Around line 73-85: After acquiring the blocks with self.get_block(...) and
before returning the new multiple_blocks_allocation, ensure that construction
failures do not leak the acquired blocks: wrap the construction of the
unique_ptr<multiple_blocks_allocation> (the new multiple_blocks_allocation(size,
std::move(blocks), stream, std::move(mr)) call) in a try/catch that on
catch(...) calls self.deallocate_blocks_async_unsafe(std::move(blocks), stream)
and rethrows; alternatively create the object via a factory that returns a smart
pointer or use a scoped guard that deallocates blocks on exception so
deallocate_blocks_async_unsafe is always invoked if the
multiple_blocks_allocation constructor throws.
- Around line 39-55: Reject per-thread default streams when creating an
asynchronous multiple_blocks_allocation: in the factory function make_async (the
caller that constructs multiple_blocks_allocation and passes cuda_stream_view
stream), validate stream.is_per_thread_default() and fail fast (throw or return
an error) instead of accepting PTDS; multiple_blocks_allocation stores stream_
and uses it in its destructor (deallocate_blocks_async_unsafe), so accepting
PTDS would cause the stored handle to resolve on a different thread and race
with GPU work—update make_async to check stream.is_per_thread_default() and
document the rejection, or alternatively require the allocation be destroyed on
the creating thread if PTDS is allowed (preferred: reject PTDS in make_async).
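The fail-fast check suggested in the second inline comment might look roughly like the sketch below. `toy_stream_view` is a hypothetical stand-in that only mimics the `is_per_thread_default()` query named in the comment, and `checked_block_count` is an invented helper; whether the real `make_async` should throw, and which exception type it should use, is a design decision for the PR.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a stream view; only the PTDS query matters here.
struct toy_stream_view {
  enum class kind { explicit_stream, legacy_default, per_thread_default };
  kind value{kind::explicit_stream};

  bool is_per_thread_default() const noexcept { return value == kind::per_thread_default; }
};

// Validates the factory arguments before any block is acquired and returns the
// number of blocks needed to hold `size` bytes.
inline std::size_t checked_block_count(std::size_t size,
                                       std::size_t block_size,
                                       toy_stream_view stream)
{
  if (block_size == 0) { throw std::invalid_argument("block_size must be non-zero"); }
  if (stream.is_per_thread_default()) {
    // The handle stores the stream and uses it again at destruction, possibly on
    // another thread, where a per-thread default stream would resolve differently.
    throw std::invalid_argument("per-thread default stream is not supported");
  }
  return (size + block_size - 1) / block_size;  // ceiling division
}
```

Rejecting the stream before touching the pool keeps the failure path free of any rollback work.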

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 65115bfb-9479-479a-b6c0-2aa3b96ca229

📥 Commits

Reviewing files that changed from the base of the PR and between 386f76d and 2ebec2b.

📒 Files selected for processing (7)

cpp/include/rmm/mr/detail/fixed_size_memory_resource_impl.hpp
cpp/include/rmm/mr/detail/stream_ordered_memory_resource.hpp
cpp/include/rmm/mr/fixed_size_memory_resource.hpp
cpp/src/mr/detail/fixed_size_memory_resource_impl.cpp
cpp/src/mr/fixed_size_memory_resource.cpp
cpp/tests/CMakeLists.txt
cpp/tests/mr/fixed_size_mr_test.cpp

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/include/rmm/mr/fixed_size_memory_resource.hpp`:
- Around line 91-204: Update the class-level doc for multiple_blocks_allocation
to reference the actual factory make_async (replace the stale
allocate_blocks_async mention) and add a brief thread-safety/lifetime note
clarifying that multiple_blocks_allocation is non-copyable/non-movable, holds a
refcounted fixed_size_memory_resource (mr_) so the pool outlives the handle, and
that allocation/deallocation are stream-ordered on the cuda_stream_view passed
to make_async (including that destructor records the deallocation on that
stream). Ensure references to symbols multiple_blocks_allocation, make_async,
mr_, and the destructor are included so the API contract matches the
implementation.

In `@cpp/src/mr/fixed_size_memory_resource.cpp`:
- Around line 51-57: The destructor
multiple_blocks_allocation::~multiple_blocks_allocation currently calls
mr_->deallocate_blocks_async_unsafe(...) which can surface CUDA errors from a
destructor; make the cleanup no-throw by wrapping the deallocation call in the
project's no-throw CUDA error macro/utility (e.g., RMM_CUDA_TRY_NOEXCEPT) or an
equivalent try/catch that logs errors without throwing. Ensure you still check
blocks_.empty(), acquire the mutex via std::lock_guard as before, and move
blocks_ into the deallocation call, but perform that call inside the no-throw
wrapper so any CUDA error is swallowed/logged rather than allowed to propagate
out of the destructor.
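A minimal sketch of the no-throw cleanup described above, using plain `try`/`catch` with logging rather than any RMM macro. `toy_handle` and its injected `release` callback are hypothetical; the callback stands in for the locked call to `deallocate_blocks_async_unsafe`.

```cpp
#include <exception>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <utility>

// Hypothetical handle whose cleanup step may fail.
class toy_handle {
 public:
  explicit toy_handle(std::function<void()> release) : release_(std::move(release)) {}

  toy_handle(toy_handle const&)            = delete;
  toy_handle& operator=(toy_handle const&) = delete;

  ~toy_handle() noexcept
  {
    if (!release_) { return; }
    try {
      release_();
    } catch (std::exception const& e) {
      // Log and swallow: letting this escape a destructor would call std::terminate.
      std::cerr << "toy_handle cleanup failed: " << e.what() << '\n';
    } catch (...) {
      std::cerr << "toy_handle cleanup failed with an unknown exception\n";
    }
  }

 private:
  std::function<void()> release_;
};

// Destroys a handle whose release step throws; returns true if control flow survives.
inline bool destroy_with_failing_release()
{
  {
    toy_handle handle([] { throw std::runtime_error("simulated CUDA failure"); });
  }
  return true;
}
```

The trade-off is that a failed release is only visible in the log, so the blocks involved are effectively lost to the pool until it is torn down.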

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4e775602-9c2b-4234-8235-8629762e0fa4

📥 Commits

Reviewing files that changed from the base of the PR and between 2ebec2b and 0e68a53.

📒 Files selected for processing (3)

cpp/include/rmm/mr/fixed_size_memory_resource.hpp
cpp/src/mr/fixed_size_memory_resource.cpp
cpp/tests/mr/fixed_size_mr_test.cpp

bdice · 2026-04-28T05:05:27Z

+   *        before the failure are returned to the pool on `stream` (same ordering as normal
+   *        deallocation).
+   */
+  [[nodiscard]] static std::unique_ptr<multiple_blocks_allocation> make_async(


Why do we need a factory instead of a normal constructor?

This was previously inside the fixed-size MR class, and at that point I thought a factory method was the best fit. When I pulled it out, I left it as is. I felt it's more idiomatic: we can verify the arguments and throw cleanly (I should remove the RMM_EXPECTS statements in the ctor). But I am fine either way. WDYT @bdice?

bdice · 2026-04-28T05:15:17Z

+   *        deallocation).
+   */
+  [[nodiscard]] static std::unique_ptr<multiple_blocks_allocation> make_async(
+    fixed_size_memory_resource mr, std::size_t size, cuda_stream_view stream);


All new APIs should accept cuda::stream_ref rather than rmm::cuda_stream_view. I am planning to deprecate rmm::cuda_stream_view at some point. Until then the two types are implicitly convertible and thus interchangeable.

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai

♻️ Duplicate comments (2)

cpp/src/mr/fixed_size_memory_resource.cpp (2)

108-121: ⚠️ Potential issue | 🟠 Major

Keep handle construction inside the rollback scope.

If new multiple_blocks_allocation(...) throws, control skips the current catch and the acquired blocks never get returned to the pool. The rollback needs to cover object construction as well, not just get_block().

🛠️ Proposed fix

   std::vector<std::byte*> blocks;
   blocks.reserve(num_blocks);
   try {
     for (std::size_t i = 0; i < num_blocks; ++i) {
       blocks.push_back(
         static_cast<std::byte*>(self.get_block(self.get_block_size(), stream_event).pointer()));
     }
-  } catch (...) {
-    RMM_CUDA_TRY(self.deallocate_blocks_async_unsafe(std::move(blocks), stream));
-    throw;
-  }
-
-  return std::unique_ptr<multiple_blocks_allocation>(
-    new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));
+    return std::unique_ptr<multiple_blocks_allocation>(
+      new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));
+  } catch (...) {
+    RMM_CUDA_TRY(self.deallocate_blocks_async_unsafe(std::move(blocks), stream));
+    throw;
+  }

#!/bin/bash
# Verify that the handle construction still sits outside the rollback scope.
sed -n '103,122p' cpp/src/mr/fixed_size_memory_resource.cpp

Based on learnings: "Detect and fix GPU memory leaks: ensure all device allocations (cudaMalloc, cudaMallocManaged, cudaHostAlloc) are properly deallocated, including error paths."

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mr/fixed_size_memory_resource.cpp` around lines 108 - 121, The
construction of the multiple_blocks_allocation handle can throw and currently
lives outside the rollback scope, so if its constructor throws the acquired
blocks in `blocks` won't be returned; move the creation of the unique_ptr/new
multiple_blocks_allocation(...) inside the same try that obtains blocks (or add
a nested try/catch around the `new multiple_blocks_allocation(...)` that calls
`self.deallocate_blocks_async_unsafe(std::move(blocks), stream)` and rethrows)
so that any exception during `multiple_blocks_allocation` construction triggers
the same rollback path; reference `get_block`, `blocks`,
`deallocate_blocks_async_unsafe`, and `multiple_blocks_allocation` when making
the change.
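The "scoped guard" alternative to widening the `try` block can be sketched as follows. `rollback_guard`, `toy_pool`, `toy_handle`, and `acquire` are hypothetical names, and the `fail_at_construction` flag merely simulates a constructor that throws after the blocks were acquired.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical guard: runs `rollback` on scope exit unless dismiss() was called.
template <typename F>
class rollback_guard {
 public:
  explicit rollback_guard(F rollback) : rollback_(std::move(rollback)) {}
  rollback_guard(rollback_guard const&)            = delete;
  rollback_guard& operator=(rollback_guard const&) = delete;
  ~rollback_guard()
  {
    if (armed_) { rollback_(); }
  }
  void dismiss() noexcept { armed_ = false; }

 private:
  F rollback_;
  bool armed_{true};
};

struct toy_pool {
  std::size_t free_blocks;
};

struct toy_handle {
  std::vector<int> blocks;
};

// Acquires `n` blocks and then builds the handle. The guard covers both steps,
// so a failure in either one returns the acquired blocks to the pool.
inline toy_handle acquire(toy_pool& pool, std::size_t n, bool fail_at_construction)
{
  std::vector<int> blocks;
  rollback_guard guard{[&] {
    pool.free_blocks += blocks.size();
    blocks.clear();
  }};

  for (std::size_t i = 0; i < n; ++i) {
    if (pool.free_blocks == 0) { throw std::runtime_error("pool exhausted"); }
    --pool.free_blocks;
    blocks.push_back(static_cast<int>(i));
  }

  if (fail_at_construction) { throw std::runtime_error("simulated constructor failure"); }

  toy_handle handle{std::move(blocks)};
  guard.dismiss();  // ownership has moved to the handle; no rollback needed
  return handle;
}
```

Compared with moving the `return` inside the `try`, the guard keeps the rollback next to the acquisition and cannot be bypassed by a later early exit.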

49-56: ⚠️ Potential issue | 🔴 Critical

Use std::exchange() to ensure moved-from state is safely emptied.

After move-assignment, a std::vector is left in a valid but unspecified state and is not guaranteed to be empty (only move-construction guarantees an empty source). This creates a double-release risk: if the moved-from object later calls clear() or its destructor, GPU blocks can be returned twice. Additionally, clear() hands blocks_ to deallocate_blocks_async_unsafe() via std::move, which likewise leaves blocks_ in an unspecified state.

Also, in make_async() (lines 115-121), the new multiple_blocks_allocation(...) call is outside the try/catch block. If that allocation throws, the acquired blocks are not deallocated, leaking GPU memory. Wrap the object construction in the rollback scope.

🔧 Proposed fix

 void multiple_blocks_allocation::clear()
 {
-  if (!blocks_.empty()) {
+  auto blocks = std::exchange(blocks_, {});
+  if (!blocks.empty()) {
     std::lock_guard<std::mutex> lock(mr_->get_mutex());
-    RMM_CUDA_TRY(mr_->deallocate_blocks_async_unsafe(std::move(blocks_), stream_));
+    RMM_CUDA_TRY(mr_->deallocate_blocks_async_unsafe(std::move(blocks), stream_));
   }
   size_ = 0;
 }
 
 multiple_blocks_allocation::multiple_blocks_allocation(multiple_blocks_allocation&& other) noexcept
-  : blocks_(std::move(other.blocks_)),
+  : blocks_(std::exchange(other.blocks_, {})),
     size_(other.size_),
     stream_(other.stream_),
     mr_(std::move(other.mr_))
 {
   other.size_ = 0;
 }
 
 multiple_blocks_allocation& multiple_blocks_allocation::operator=(
   multiple_blocks_allocation&& other)
 {
   if (this != &other) {
     clear();
-    blocks_     = std::move(other.blocks_);
+    blocks_     = std::exchange(other.blocks_, {});
     size_       = other.size_;
     stream_     = other.stream_;
     mr_         = std::move(other.mr_);
     other.size_ = 0;
   }
   return *this;
 }
 
   try {
     for (std::size_t i = 0; i < num_blocks; ++i) {
       blocks.push_back(...);
     }
-  } catch (...) {
+    return std::unique_ptr<multiple_blocks_allocation>(
+      new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));
+  } catch (...) {
     RMM_CUDA_TRY(self.deallocate_blocks_async_unsafe(std::move(blocks), stream));
     throw;
   }
-
-  return std::unique_ptr<multiple_blocks_allocation>(
-    new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mr/fixed_size_memory_resource.cpp` around lines 49 - 56, The move
constructor multiple_blocks_allocation(multiple_blocks_allocation&&) leaves
moved-from members in unspecified state which risks double-release; use
std::exchange to move blocks_ and mr_ and set size_ and stream_ to safe defaults
(e.g., std::exchange(other.size_, 0), std::exchange(other.stream_, nullptr),
std::exchange(other.blocks_, {}), std::exchange(other.mr_, nullptr)) so the
moved-from object is explicitly empty. Also, in make_async() wrap the new
multiple_blocks_allocation(...) construction inside the existing
try/catch/rollback scope (or add a try around it) so that if operator new throws
you call deallocate_blocks_async_unsafe() to release acquired GPU blocks before
rethrowing.
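The `std::exchange` idiom from this comment, reduced to a host-only sketch. `toy_handle` is hypothetical: it counts "released" blocks through an external counter instead of returning them to a pool, which makes a double release directly observable.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical handle: *released_ grows by the number of blocks given back.
class toy_handle {
 public:
  toy_handle(std::vector<int> blocks, std::size_t size, std::size_t* released)
    : blocks_(std::move(blocks)), size_(size), released_(released)
  {
  }

  toy_handle(toy_handle const&)            = delete;
  toy_handle& operator=(toy_handle const&) = delete;

  // std::exchange leaves `other` explicitly empty instead of "valid but unspecified".
  toy_handle(toy_handle&& other) noexcept
    : blocks_(std::exchange(other.blocks_, {})),
      size_(std::exchange(other.size_, 0)),
      released_(other.released_)
  {
  }

  toy_handle& operator=(toy_handle&& other) noexcept
  {
    if (this != &other) {
      clear();  // give back what this handle currently owns
      blocks_   = std::exchange(other.blocks_, {});
      size_     = std::exchange(other.size_, 0);
      released_ = other.released_;
    }
    return *this;
  }

  ~toy_handle() { clear(); }

  // Idempotent: blocks_ is emptied before the release is counted.
  void clear() noexcept
  {
    auto blocks = std::exchange(blocks_, {});
    *released_ += blocks.size();
    size_ = 0;
  }

  std::size_t num_blocks() const noexcept { return blocks_.size(); }
  std::size_t size() const noexcept { return size_; }

 private:
  std::vector<int> blocks_;
  std::size_t size_;
  std::size_t* released_;
};
```

Because `clear()` swaps the member out first, calling it on a moved-from or already-cleared handle releases nothing.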

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a6e7e9e1-54d3-4ffd-b1b9-0ae84ba5448e

📥 Commits

Reviewing files that changed from the base of the PR and between 0e68a53 and c754376.

📒 Files selected for processing (4)

cpp/include/rmm/mr/detail/fixed_size_memory_resource_impl.hpp
cpp/include/rmm/mr/fixed_size_memory_resource.hpp
cpp/src/mr/detail/fixed_size_memory_resource_impl.cpp
cpp/src/mr/fixed_size_memory_resource.cpp

🚧 Files skipped from review as they are similar to previous changes (2)

cpp/include/rmm/mr/fixed_size_memory_resource.hpp
cpp/src/mr/detail/fixed_size_memory_resource_impl.cpp

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (2)

cpp/src/mr/fixed_size_memory_resource.cpp (1)
111-122: ⚠️ Potential issue | 🟠 Major

Extend the rollback path to cover handle construction.

Line 121 is still outside the rollback try. If new multiple_blocks_allocation(...) throws after the blocks were acquired, those blocks never get returned to the pool and pool capacity leaks on the failure path.
🛠️ Proposed fix
   std::vector<std::byte*> blocks;
   blocks.reserve(num_blocks);
   try {
     for (std::size_t i = 0; i < num_blocks; ++i) {
       blocks.push_back(
         static_cast<std::byte*>(self.get_block(self.get_block_size(), stream_event).pointer()));
     }
-  } catch (...) {
-    RMM_CUDA_TRY(self.deallocate_blocks_async_unsafe(std::move(blocks), stream));
-    throw;
-  }
-
-  return std::unique_ptr<multiple_blocks_allocation>(
-    new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));
+    return std::unique_ptr<multiple_blocks_allocation>(
+      new multiple_blocks_allocation(size, std::move(blocks), stream, std::move(mr)));
+  } catch (...) {
+    RMM_CUDA_TRY(self.deallocate_blocks_async_unsafe(std::move(blocks), stream));
+    throw;
+  }
As per coding guidelines, "Detect and fix GPU memory leaks: ensure all device allocations (cudaMalloc, cudaMallocManaged, cudaHostAlloc) are properly deallocated, including error paths."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/mr/fixed_size_memory_resource.cpp` around lines 111 - 122, The
allocation of blocks is protected by a try/catch but the subsequent construction
of multiple_blocks_allocation(new multiple_blocks_allocation(...)) sits outside
that try, so if the constructor throws the acquired blocks are never returned;
move the creation of the multiple_blocks_allocation object inside the same try
(or wrap the constructor call in the same try) and, in the catch, call
self.deallocate_blocks_async_unsafe(std::move(blocks), stream) (using
RMM_CUDA_TRY as currently done) to ensure all blocks acquired via
get_block/get_block_size are returned on any failure; reference the blocks
vector, multiple_blocks_allocation constructor, get_block/get_block_size,
deallocate_blocks_async_unsafe, stream_event, stream and mr when making the
change.
cpp/include/rmm/mr/fixed_size_memory_resource.hpp (1)
91-99: ⚠️ Potential issue | 🟡 Minor

Document the thread-safety contract explicitly.

The public docs explain lifetime, but they still do not say whether distinct handles may be cleared/destroyed concurrently or whether a single handle requires external synchronization. Please make that contract explicit in the class/destructor docs.

As per coding guidelines, "Document thread-safety guarantees in memory resource class documentation (Doxygen); specify which operations are thread-safe."

Also applies to: 126-132
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/include/rmm/mr/fixed_size_memory_resource.hpp` around lines 91 - 99,
Update the Doxygen for the RAII handle and the fixed_size_memory_resource to
state the thread-safety contract: specify whether distinct handle instances (the
RAII allocation handle) may be destroyed/cleared concurrently without external
synchronization and whether operations on a single handle (including its
destructor and any clear/release method) are safe only with external
synchronization; also document that the underlying fixed_size_memory_resource
permits concurrent allocations/frees across different streams if applicable or
requires external synchronization. Reference the RAII handle class/destructor
and fixed_size_memory_resource types in the comments so readers can find the
guarantees quickly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cpp/tests/mr/fixed_size_mr_test.cpp`:
- Around line 141-151: The deallocation loop serializes teardown because
handles.pop_back() destroys the unique_ptr while holding handles_mutex; change
the loop inside the async task so it pops/moves the unique_ptr out while holding
handles_mutex (e.g., move handles.back() into a local variable and pop_back),
then release the lock and let the unique_ptr go out of scope (destroy) outside
the critical section; update the lambda used by dealloc_futs to use
handles_mutex, move the handle to a local variable before unlocking, and perform
destruction (letting multiple_blocks_allocation teardown run) outside the mutex
to allow true concurrent deallocation.
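The restructuring suggested for the test loop, sketched with host-only stand-ins: each worker moves one handle out of the shared vector while holding the mutex and destroys it only after the lock is released. `toy_handle` and `drain_concurrently` are hypothetical, and `std::thread` is used here in place of the futures (`dealloc_futs`) that the comment mentions.

```cpp
#include <atomic>
#include <cstddef>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical handle whose destructor stands in for the real teardown path.
struct toy_handle {
  explicit toy_handle(std::atomic<std::size_t>& destroyed) : destroyed_(destroyed) {}
  ~toy_handle() { destroyed_.fetch_add(1); }
  std::atomic<std::size_t>& destroyed_;
};

// Creates `num_handles` handles, destroys them from `num_threads` workers, and
// returns how many destructors ran.
inline std::size_t drain_concurrently(std::size_t num_handles, std::size_t num_threads)
{
  std::atomic<std::size_t> destroyed{0};
  std::vector<std::unique_ptr<toy_handle>> handles;
  for (std::size_t i = 0; i < num_handles; ++i) {
    handles.push_back(std::make_unique<toy_handle>(destroyed));
  }

  std::mutex handles_mutex;
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < num_threads; ++t) {
    workers.emplace_back([&] {
      while (true) {
        std::unique_ptr<toy_handle> local;
        {
          // Critical section: only take ownership, do not destroy.
          std::lock_guard<std::mutex> lock(handles_mutex);
          if (handles.empty()) { return; }
          local = std::move(handles.back());
          handles.pop_back();
        }
        local.reset();  // destructor runs here, outside handles_mutex
      }
    });
  }
  for (auto& worker : workers) { worker.join(); }
  return destroyed.load();
}
```

With the destruction outside the critical section, the only serialization left between workers is whatever the handle's own teardown imposes, which is the path the test is meant to exercise.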

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 359d9832-10c9-4ee9-950d-787128b95b9c

📥 Commits

Reviewing files that changed from the base of the PR and between a5e3f55 and 4333c99.

📒 Files selected for processing (3)

cpp/include/rmm/mr/fixed_size_memory_resource.hpp
cpp/src/mr/fixed_size_memory_resource.cpp
cpp/tests/mr/fixed_size_mr_test.cpp

adding multiblock allocations

2ebec2b

Signed-off-by: niranda perera <niranda.perera@gmail.com>

nirandaperera requested review from a team as code owners April 24, 2026 01:13

nirandaperera requested review from bdice and ttnghia April 24, 2026 01:13

github-project-automation Bot added this to RMM Project Board Apr 24, 2026

nirandaperera added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Apr 24, 2026

coderabbitai Bot reviewed Apr 24, 2026

Comment thread cpp/src/mr/fixed_size_memory_resource.cpp

Comment thread cpp/src/mr/fixed_size_memory_resource.cpp

addressing comments and extending the test suite

0e68a53

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai Bot reviewed Apr 28, 2026

Comment thread cpp/include/rmm/mr/fixed_size_memory_resource.hpp

Comment thread cpp/src/mr/fixed_size_memory_resource.cpp Outdated

bdice reviewed Apr 28, 2026

nirandaperera added 2 commits April 28, 2026 16:23

addressing comments

c754376

Signed-off-by: niranda perera <niranda.perera@gmail.com>

remove test ns

a5e3f55

Signed-off-by: niranda perera <niranda.perera@gmail.com>

coderabbitai Bot reviewed Apr 28, 2026

addressing comments

4333c99

Signed-off-by: niranda perera <niranda.perera@gmail.com>

nirandaperera requested review from bdice and wence- April 28, 2026 23:47

coderabbitai Bot reviewed Apr 28, 2026

Comment thread cpp/tests/mr/fixed_size_mr_test.cpp

coderabbit review

47472d6
