Share simulated memory resource state#2399
Conversation
|
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
📝 WalkthroughWalkthroughThe ChangesShared State Refactoring
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
cpp/benchmarks/utilities/simulated_memory_resource.hpp (2)
57-68:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSynchronize the shared allocation cursor.
Now that copies alias
state_, concurrentallocate()calls from different instances can read the samebeginvalue and hand out overlapping simulated ranges. Guard the cursor update insidestatewith a mutex or atomic reservation.As per coding guidelines, "Ensure thread safety in memory resources: use synchronization primitives (mutex, atomic) when accessing shared data structures like free lists from multiple threads".🔒 Proposed fix
+#include <mutex> + void* allocate([[maybe_unused]] cuda::stream_ref stream, std::size_t bytes, [[maybe_unused]] std::size_t alignment = rmm::CUDA_ALLOCATION_ALIGNMENT) { + std::lock_guard<std::mutex> lock{state_->mutex}; + // NOLINTNEXTLINE(cppcoreguidelines-pro-bounds-pointer-arithmetic) RMM_EXPECTS(state_->begin + bytes <= state_->end, "Simulated memory size exceeded (failed to allocate " + rmm::detail::format_bytes(bytes) + ")", rmm::bad_alloc); auto* ptr = static_cast<void*>(state_->begin); state_->begin += bytes; // NOLINT(cppcoreguidelines-pro-bounds-pointer-arithmetic) return ptr; } @@ struct state { state(char* begin, std::size_t memory_size_bytes) : begin{begin}, end{begin + memory_size_bytes} // NOLINT(cppcoreguidelines-pro-bounds-pointer-arithmetic) { } + std::mutex mutex{}; char* begin{}; char* end{}; };Also applies to: 120-131
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/benchmarks/utilities/simulated_memory_resource.hpp` around lines 57 - 68, The allocate() implementations are not thread-safe because multiple instances can alias state_ and concurrently read/advance state_->begin, causing overlapping allocations; modify both allocate(cuda::stream_ref, std::size_t, std::size_t) and the other allocate overload (the one around lines 120-131) to synchronize access to state_->begin by either (a) adding a mutex inside the shared state object and locking it around the check (state_->begin + bytes <= state_->end), the pointer capture (ptr = state_->begin) and the advance (state_->begin += bytes), or (b) implement an atomic reservation loop that uses an atomic<size_t> for state_->begin and does compare_exchange to reserve bytes; ensure the chosen sync primitive is part of the shared state structure so copies of the resource share the same lock/atomic and prevent overlapping ranges.
42-45:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFix move semantics to prevent dereferencing null
state_after move.When
simulated_memory_resourceis moved using the default move operations, the source object'sstate_becomes null (as guaranteed forstd::shared_ptr). If the moved-from instance is subsequently used, callingallocate()will dereference the null pointer at line 62 or 66, causing undefined behavior.Implement custom move constructor and assignment operator to share the state instead, ensuring moved-from instances remain valid if accidentally used:
↔️ Proposed fix- simulated_memory_resource(simulated_memory_resource&&) = default; - simulated_memory_resource& operator=(simulated_memory_resource&&) = default; + simulated_memory_resource(simulated_memory_resource&& other) noexcept : state_{other.state_} {} + simulated_memory_resource& operator=(simulated_memory_resource&& other) noexcept + { + state_ = other.state_; + return *this; + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cpp/benchmarks/utilities/simulated_memory_resource.hpp` around lines 42 - 45, The defaulted move ctor/operator for simulated_memory_resource lets the source object's std::shared_ptr state_ become null which causes allocate() (lines that dereference state_) to UB; replace the defaulted simulated_memory_resource(simulated_memory_resource&&) and operator=(simulated_memory_resource&&) with custom implementations that copy (share) the shared_ptr state_ instead of moving it, so both moved-from and moved-to instances keep a valid state_; ensure the move-assignment also handles self-assignment and preserves any existing state_ cleanup semantics used by the class.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@cpp/benchmarks/utilities/simulated_memory_resource.hpp`:
- Around line 57-68: The allocate() implementations are not thread-safe because
multiple instances can alias state_ and concurrently read/advance state_->begin,
causing overlapping allocations; modify both allocate(cuda::stream_ref,
std::size_t, std::size_t) and the other allocate overload (the one around lines
120-131) to synchronize access to state_->begin by either (a) adding a mutex
inside the shared state object and locking it around the check (state_->begin +
bytes <= state_->end), the pointer capture (ptr = state_->begin) and the advance
(state_->begin += bytes), or (b) implement an atomic reservation loop that uses
an atomic<size_t> for state_->begin and does compare_exchange to reserve bytes;
ensure the chosen sync primitive is part of the shared state structure so copies
of the resource share the same lock/atomic and prevent overlapping ranges.
- Around line 42-45: The defaulted move ctor/operator for
simulated_memory_resource lets the source object's std::shared_ptr state_ become
null which causes allocate() (lines that dereference state_) to UB; replace the
defaulted simulated_memory_resource(simulated_memory_resource&&) and
operator=(simulated_memory_resource&&) with custom implementations that copy
(share) the shared_ptr state_ instead of moving it, so both moved-from and
moved-to instances keep a valid state_; ensure the move-assignment also handles
self-assignment and preserves any existing state_ cleanup semantics used by the
class.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 4b8ca723-9e73-4194-bf84-da79170079b0
📒 Files selected for processing (1)
cpp/benchmarks/utilities/simulated_memory_resource.hpp
Description
Makes copied
simulated_memory_resourceinstances share their simulated address cursor and allocation range state. This preserves replay benchmark behavior when the resource is copied into factories or type-erased wrappers.Checklist