Honor requested aligned adaptor alignment by bdice · Pull Request #2396 · rapidsai/rmm

bdice · 2026-05-17T17:55:11Z

Description

aligned_resource_adaptor previously ignored the per-call alignment passed to allocate() and deallocate(), using only the adaptor's configured alignment. This meant an adaptor configured with the default alignment could fail to satisfy an over-aligned request from a caller.

This PR computes an effective alignment from the caller request, the configured adaptor alignment, and CUDA_ALLOCATION_ALIGNMENT, then uses it consistently for upstream allocation sizing, returned pointer alignment, and deallocation sizing. The allocation path also validates that the caller-requested alignment is supported before using it in alignment math.
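As a rough illustration of the computation described above, here is a minimal sketch. It is not the code from this PR: the four-argument signature, the `threshold` parameter, the local `kCudaAllocationAlignment` constant (a stand-in for `rmm::CUDA_ALLOCATION_ALIGNMENT`, assumed to be 256), and the use of `std::invalid_argument` in place of `rmm::logic_error` are all assumptions made so the example is self-contained.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>

// Assumption: stand-in for rmm::CUDA_ALLOCATION_ALIGNMENT.
constexpr std::size_t kCudaAllocationAlignment = 256;

// True if v is a non-zero power of two.
constexpr bool is_pow2(std::size_t v) noexcept { return v != 0 && (v & (v - 1)) == 0; }

// Hypothetical helper: picks the strictest of the caller-requested alignment,
// the adaptor's configured alignment (only applied at or above the threshold),
// and the baseline CUDA allocation alignment.
inline std::size_t effective_alignment(std::size_t bytes,
                                       std::size_t requested,
                                       std::size_t configured,
                                       std::size_t threshold)
{
  // Validate the caller's request before using it in any alignment math.
  if (!is_pow2(requested)) { throw std::invalid_argument("alignment must be a power of two"); }

  std::size_t const adaptor_alignment =
    (bytes >= threshold) ? configured : kCudaAllocationAlignment;
  return std::max({requested, adaptor_alignment, kCudaAllocationAlignment});
}
```

Under these assumptions, a caller request larger than the configured alignment wins, and a small request never lowers the result below the baseline.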

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

coderabbitai · 2026-05-17T17:58:57Z

📝 Walkthrough

Summary by CodeRabbit

Release Notes

Bug Fixes
- Memory allocation and deallocation now properly respect caller-requested alignment parameters, improving pointer alignment correctness.
Tests
- Added validation tests for invalid alignment requests.
- Added tests verifying proper alignment of returned pointers when custom alignment is requested.

Walkthrough

The aligned_resource_adaptor now respects caller-provided alignment parameters in both allocation and deallocation. Implementation helpers compute effective alignment based on requested alignment, configured alignment, and byte thresholds. Tests validate invalid alignment rejection and confirm caller-requested alignment takes precedence.

Changes

Caller-requested alignment support

Layer / File(s)	Summary
Alignment computation helpers and header cleanup `cpp/include/rmm/mr/detail/aligned_resource_adaptor_impl.hpp`, `cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp` (lines 17–36)	Removed the per-instance `upstream_allocation_size` method declaration from the header. Added two namespace-scoped helper functions: `compute_effective_alignment()` combines requested alignment, configured alignment, CUDA alignment, and threshold condition; `compute_upstream_allocation_size()` derives the allocation size for a target alignment.
Allocate with caller-requested alignment `cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp` (lines 62–76)	Updated `allocate()` to receive and use the `alignment` parameter. Computes effective alignment via helper, then branches: when adjustment needed, over-allocates at computed upstream size and returns a pointer aligned to effective alignment while tracking mappings.
Deallocate with caller-requested alignment `cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp` (lines 90–94, 105)	Updated `deallocate()` to receive and use the `alignment` parameter. Computes effective alignment in both fast and slow paths to determine correct upstream deallocation size, replacing previous per-instance alignment logic.
Alignment validation and behavior tests `cpp/tests/mr/aligned_mr_tests.cpp` (lines 91–99, 209–256)	Added three test cases: validation that invalid requested alignment throws `rmm::logic_error`; verification that allocate returns caller-aligned addresses; confirmation that explicit caller alignment overrides configured threshold alignment.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

rapidsai/rmm#2343: Updates CCCL adaptor and test call sites to pass explicit rmm::CUDA_ALLOCATION_ALIGNMENT into allocate*/deallocate*, which complements this PR's changes to actually honor the alignment parameter.
rapidsai/rmm#2330: Adds device_buffer(..., alignment, ...) constructors that pass alignment through to _mr.allocate(..., alignment), which directly depends on this PR's alignment parameter support.

Suggested labels

bug, non-breaking

Suggested reviewers

harrism

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Honor requested aligned adaptor alignment' directly describes the main change: making the aligned_resource_adaptor respect per-call alignment parameters instead of ignoring them.
Description check	✅ Passed	The description is well-related to the changeset, explaining the bug fix, the solution approach, and confirming tests and documentation are updated.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp (1)
92-105: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Track upstream allocation metadata instead of recomputing it in deallocate().

deallocate() now trusts the caller-provided alignment to decide whether the returned pointer was remapped. For example, allocate(..., 1024, 4096) records an adjusted pointer in pointers_, but deallocate(..., adjusted_ptr, 1024, rmm::CUDA_ALLOCATION_ALIGNMENT) will hit the Line 94 fast path and forward the adjusted pointer and size 1024 upstream without consulting that map. Before this PR the free-side alignment was ignored, so this change turns that mismatch into a bad upstream free. Please recover the original pointer and upstream size from tracked allocation metadata rather than recomputing both from the deallocation arguments, and add a regression test for that case.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp` around lines 92 - 105,
The deallocate path currently recomputes upstream pointer/size from the
caller-provided alignment and can forward wrong values when the caller passes a
different alignment than used at allocation; change deallocate in
aligned_resource_adaptor_impl.cpp to lookup tracked allocation metadata instead
of recomputing: when allocate stores remapped pointers_ (and the
upstream_allocation_size) store a small struct (orig_upstream_ptr and
upstream_size) keyed by the adjusted ptr, and in deallocate (function
deallocate) first check pointers_/metadata map to retrieve the original upstream
pointer and exact upstream_size to pass to upstream_mr_.deallocate rather than
calling upstream_allocation_size or relying on effective_alignment; ensure you
erase the metadata entry under mtx_ and add a regression test that calls
allocate(..., alignment=X, requested_alignment=Y) then deallocate with the
opposite alignment to exercise the lookup path.
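
The metadata-tracking approach suggested in this comment could look roughly like the sketch below. It is a simplified host-side illustration, not the adaptor's code: `tracking_aligned_allocator`, `upstream_block`, and the use of `std::malloc`/`std::free` as the "upstream" are all hypothetical, and the over-allocation rule (`bytes + alignment`) is chosen for simplicity. The point is only that `deallocate()` recovers the original pointer and size from the map instead of recomputing them from caller arguments.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <mutex>
#include <unordered_map>

// Hypothetical record of what was actually obtained from upstream.
struct upstream_block {
  void* ptr;         // pointer returned by the upstream allocator
  std::size_t size;  // size requested from the upstream allocator
};

class tracking_aligned_allocator {
 public:
  // alignment is assumed to be a power of two.
  void* allocate(std::size_t bytes, std::size_t alignment)
  {
    std::size_t const upstream_size = bytes + alignment;  // simple over-allocation
    void* const original            = std::malloc(upstream_size);
    if (original == nullptr) { return nullptr; }

    // Round the upstream address up to the next multiple of alignment.
    auto const addr    = reinterpret_cast<std::uintptr_t>(original);
    auto const mask    = static_cast<std::uintptr_t>(alignment) - 1;
    void* const result = reinterpret_cast<void*>((addr + mask) & ~mask);

    std::lock_guard<std::mutex> lock(mtx_);
    blocks_.emplace(result, upstream_block{original, upstream_size});
    return result;
  }

  // Returns the upstream size that was released, or 0 if ptr is unknown.
  // No alignment argument is needed: the map is the source of truth.
  std::size_t deallocate(void* ptr)
  {
    upstream_block block{};
    {
      std::lock_guard<std::mutex> lock(mtx_);
      auto const it = blocks_.find(ptr);
      if (it == blocks_.end()) { return 0; }
      block = it->second;
      blocks_.erase(it);
    }
    std::free(block.ptr);
    return block.size;
  }

  std::size_t live_allocations() const
  {
    std::lock_guard<std::mutex> lock(mtx_);
    return blocks_.size();
  }

 private:
  mutable std::mutex mtx_;
  std::unordered_map<void*, upstream_block> blocks_;
};
```

One trade-off of this design is that every allocation pays for a map entry, including ones whose upstream pointer was already suitably aligned.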

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 81037943-c66b-4f44-847b-6a67a5ed8a66

📥 Commits

Reviewing files that changed from the base of the PR and between b187020 and f996e36.

📒 Files selected for processing (3)

cpp/include/rmm/mr/detail/aligned_resource_adaptor_impl.hpp
cpp/src/mr/detail/aligned_resource_adaptor_impl.cpp
cpp/tests/mr/aligned_mr_tests.cpp

💤 Files with no reviewable changes (1)

cpp/include/rmm/mr/detail/aligned_resource_adaptor_impl.hpp

wence- · 2026-05-26T11:34:34Z

+  if (bytes == 0 || effective_align == rmm::CUDA_ALLOCATION_ALIGNMENT) {
    return upstream_mr_.allocate(stream, bytes, 1);


note: this implicitly encodes that every RMM memory resource must provide allocations that are CUDA_ALLOCATION_ALIGNMENT aligned. Should we (at least in debug mode) assert that?

Imagine I am writing my own resource, and I forget about this restriction.

I think we should also push (again) for the "related" request in this cccl issue NVIDIA/cccl#8157

We can at least implement said properties on all RMM resources today, I think.
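
A debug-mode check of the kind proposed here might be sketched as follows. This is only an illustration: `is_aligned_to` and `checked_upstream_pointer` are hypothetical names, and `kCudaAllocationAlignment` is a local stand-in for `rmm::CUDA_ALLOCATION_ALIGNMENT` (assumed to be 256).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: stand-in for rmm::CUDA_ALLOCATION_ALIGNMENT.
constexpr std::size_t kCudaAllocationAlignment = 256;

// True if ptr's address is a multiple of alignment.
inline bool is_aligned_to(void const* ptr, std::size_t alignment)
{
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Hypothetical wrapper for the fast path: passes the upstream pointer through,
// but in builds without NDEBUG it asserts the baseline alignment guarantee.
inline void* checked_upstream_pointer(void* ptr)
{
  assert(is_aligned_to(ptr, kCudaAllocationAlignment) &&
         "upstream resource returned a pointer below the baseline alignment");
  return ptr;
}
```

With a check like this, a custom upstream resource that forgets the restriction would fail loudly in debug builds rather than silently returning under-aligned memory.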

wence- · 2026-05-26T11:35:46Z

+  return aligned_size + (alignment - rmm::CUDA_ALLOCATION_ALIGNMENT);
+}
+
+[[nodiscard]] std::size_t effective_alignment(std::size_t bytes,


Suggested change

-[[nodiscard]] std::size_t effective_alignment(std::size_t bytes,
+[[nodiscard]] constexpr std::size_t effective_alignment(std::size_t bytes,

wence- · 2026-05-26T11:37:26Z

 namespace detail {
+namespace {
+
+[[nodiscard]] std::size_t upstream_allocation_size(std::size_t bytes, std::size_t alignment)


Suggested change

-[[nodiscard]] std::size_t upstream_allocation_size(std::size_t bytes, std::size_t alignment)
+[[nodiscard]] constexpr std::size_t upstream_allocation_size(std::size_t bytes, std::size_t alignment) noexcept

(Although align_up is not constexpr, because it asserts when NDEBUG is not defined, i.e. in debug builds.)
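
To show what the `constexpr` suggestion would enable, here is a small sketch that checks the sizing arithmetic at compile time. It uses a local, assertion-free `align_up_pow2` rather than RMM's `align_up`, and it assumes `aligned_size` is the byte count rounded up to the baseline alignment (the diff only shows the final `return` line); `kCudaAllocationAlignment` is again a stand-in for `rmm::CUDA_ALLOCATION_ALIGNMENT`, assumed to be 256.

```cpp
#include <cstddef>

// Assumption: stand-in for rmm::CUDA_ALLOCATION_ALIGNMENT.
constexpr std::size_t kCudaAllocationAlignment = 256;

// Local helper (not RMM's align_up): rounds value up to a power-of-two alignment.
constexpr std::size_t align_up_pow2(std::size_t value, std::size_t alignment) noexcept
{
  return (value + (alignment - 1)) & ~(alignment - 1);
}

// Sketch of the sizing rule: pad by the worst-case distance between a
// baseline-aligned upstream pointer and the next alignment boundary.
[[nodiscard]] constexpr std::size_t upstream_allocation_size(std::size_t bytes,
                                                             std::size_t alignment) noexcept
{
  auto const aligned_size = align_up_pow2(bytes, kCudaAllocationAlignment);
  return aligned_size + (alignment - kCudaAllocationAlignment);
}

// Because the helper is constexpr, the arithmetic can be verified at compile time.
static_assert(upstream_allocation_size(1024, 256) == 1024, "no padding at baseline alignment");
static_assert(upstream_allocation_size(1000, 4096) == 1024 + 3840, "pads by alignment - 256");
```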

wence- · 2026-05-26T11:46:04Z

+    void* const expected_pointer = int_to_address(4096);
+    auto const size{1024};
+    EXPECT_EQ(mr.allocate(stream, size, alignment), expected_pointer);
+    mr.deallocate(stream, expected_pointer, size, alignment);


I don't understand how gtest's mock methods work. But I would have thought that the thing to test here is:

void* ptr = mr.allocate(...);
EXPECT_EQ(reinterpret_cast<std::uintptr_t>(ptr) % alignment, 0);
...

wence- · 2026-05-26T11:46:41Z

+    void* const expected_pointer = int_to_address(8192);
+    auto const size{1024};
+    EXPECT_EQ(mr.allocate(stream, size, requested_alignment), expected_pointer);
+    mr.deallocate(stream, expected_pointer, size, requested_alignment);


again here.

Honor requested aligned adaptor alignment

f996e36

bdice requested a review from a team as a code owner May 17, 2026 17:55

bdice requested review from harrism and shrshi May 17, 2026 17:55

github-project-automation Bot added this to RMM Project Board May 17, 2026

bdice added the bug and non-breaking labels May 17, 2026

bdice self-assigned this May 17, 2026

bdice moved this to In Progress in RMM Project Board May 17, 2026

coderabbitai Bot reviewed May 17, 2026

Merge branch 'main' into fix-aligned-adaptor-request-alignment

aaf9cc9

wence- requested changes May 26, 2026

github-project-automation Bot moved this from In Progress to Review in RMM Project Board May 26, 2026

bdice wants to merge 2 commits into rapidsai:main from bdice:fix-aligned-adaptor-request-alignment