feat(core): Implement memory based backpressure mechanism#605
feat(core): Implement memory based backpressure mechanism#605iambriccardo wants to merge 8 commits intomainfrom
Conversation
📝 WalkthroughSummary by CodeRabbitRelease Notes
WalkthroughThis PR adds a MemoryMonitor that samples system memory and publishes a hysteresis-based blocked/unblocked signal. Two stream wrappers—BackpressureStream and BatchBackpressureStream—are introduced to observe the memory signal (and optional Postgres connection updates) and pause/resume or flush streams/batches accordingly. MemoryMonitor is instantiated at pipeline startup and threaded through ApplyWorker/ApplyLoop/TableSyncWorker and table_copy call paths so table-copy and streaming flows can react to memory pressure. Sequence Diagram(s)sequenceDiagram
participant Monitor as MemoryMonitor (bg task)
participant System as sysinfo::System
participant Watch as watch::Sender<bool>
participant Subscriber as MemoryMonitorSubscription
loop Every MEMORY_REFRESH_INTERVAL
Monitor->>System: sample memory stats
System-->>Monitor: MemorySnapshot
Monitor->>Monitor: compute_next_blocked(used_percent)
Monitor->>Watch: send blocked state (if changed)
Watch-->>Subscriber: broadcast update
end
Subscriber->>Subscriber: poll_update()/current_blocked()
sequenceDiagram
participant Consumer as Consumer
participant Stream as BackpressureStream
participant Inner as EventsStream
participant Memory as MemoryMonitorSubscription
Consumer->>Stream: poll_next()
Stream->>Memory: poll_update(cx)
alt memory blocked
Memory-->>Stream: Some(true)
Stream->>Consumer: Pending
else memory not blocked
Memory-->>Stream: None/false
Stream->>Inner: poll_next()
Inner-->>Stream: Item / Pending / Done
Stream-->>Consumer: Item / Pending / Done
end
Assessment against linked issues
Out-of-scope changes
Possibly related PRs
Comment |
Pull Request Test Coverage Report for Build 22102540804Details
💛 - Coveralls |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@etl/src/concurrency/stream.rs`:
- Around line 47-70: The task stalls because when memory_monitor.poll_update(cx)
yields Ready(...) we set this.paused_for_memory but do not register the waker
for future updates; before returning Poll::Pending you must continue polling
memory_monitor.poll_update(cx) (or loop until it returns Poll::Pending) so the
watch channel registers the current waker. Modify the poll logic around
memory_monitor.poll_update(cx) in the stream's poll method (the match that sets
*this.paused_for_memory and calls this.memory_monitor.current_blocked()) to
consume Ready variants and only stop when poll_update returns Poll::Pending,
updating *this.paused_for_memory on each Ready, and then return Poll::Pending
(so the waker is registered for the next change).
- Around line 190-202: In BatchBackpressureStream's poll implementation, when
*this.paused_for_memory is true and this.items is empty you currently return
Poll::Pending without registering the waker; change the branch so you capture
and store the current task waker (e.g. this.waker = Some(cx.waker().clone()) or
equivalent) before returning Poll::Pending so the stream can be woken when
memory state changes, keeping the existing behavior of flushing when items exist
(the symbols to modify are this.paused_for_memory, this.items, this.reset_timer
and the Poll::Pending return).
etl/src/concurrency/stream.rs
Outdated
| match this.memory_monitor.poll_update(cx) { | ||
| Poll::Ready(Some(blocked)) => { | ||
| *this.paused_for_memory = blocked; | ||
| } | ||
| Poll::Ready(None) => { | ||
| *this.paused_for_memory = false; | ||
| } | ||
| Poll::Pending => { | ||
| let currently_blocked = this.memory_monitor.current_blocked(); | ||
| if *this.paused_for_memory != currently_blocked { | ||
| *this.paused_for_memory = currently_blocked; | ||
| } | ||
| } | ||
| } | ||
|
|
||
| if !was_paused && *this.paused_for_memory { | ||
| info!("backpressure active, stream paused"); | ||
| } else if was_paused && !*this.paused_for_memory { | ||
| info!("backpressure released, stream resumed"); | ||
| } | ||
|
|
||
| if *this.paused_for_memory { | ||
| return Poll::Pending; | ||
| } |
There was a problem hiding this comment.
Critical: Missing waker registration causes indefinite task stall when backpressure activates.
When poll_update(cx) returns Ready(Some(true)), the waker is not registered with the watch channel because a Ready result was obtained. Returning Poll::Pending on line 69 without a registered waker means the task will never be woken when memory pressure is released.
After receiving a Ready from the watch stream, you must poll again to register for the next update before returning Pending.
Proposed fix
if *this.paused_for_memory {
+ // Ensure waker is registered for the next state change.
+ let _ = this.memory_monitor.poll_update(cx);
return Poll::Pending;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| match this.memory_monitor.poll_update(cx) { | |
| Poll::Ready(Some(blocked)) => { | |
| *this.paused_for_memory = blocked; | |
| } | |
| Poll::Ready(None) => { | |
| *this.paused_for_memory = false; | |
| } | |
| Poll::Pending => { | |
| let currently_blocked = this.memory_monitor.current_blocked(); | |
| if *this.paused_for_memory != currently_blocked { | |
| *this.paused_for_memory = currently_blocked; | |
| } | |
| } | |
| } | |
| if !was_paused && *this.paused_for_memory { | |
| info!("backpressure active, stream paused"); | |
| } else if was_paused && !*this.paused_for_memory { | |
| info!("backpressure released, stream resumed"); | |
| } | |
| if *this.paused_for_memory { | |
| return Poll::Pending; | |
| } | |
| match this.memory_monitor.poll_update(cx) { | |
| Poll::Ready(Some(blocked)) => { | |
| *this.paused_for_memory = blocked; | |
| } | |
| Poll::Ready(None) => { | |
| *this.paused_for_memory = false; | |
| } | |
| Poll::Pending => { | |
| let currently_blocked = this.memory_monitor.current_blocked(); | |
| if *this.paused_for_memory != currently_blocked { | |
| *this.paused_for_memory = currently_blocked; | |
| } | |
| } | |
| } | |
| if !was_paused && *this.paused_for_memory { | |
| info!("backpressure active, stream paused"); | |
| } else if was_paused && !*this.paused_for_memory { | |
| info!("backpressure released, stream resumed"); | |
| } | |
| if *this.paused_for_memory { | |
| // Ensure waker is registered for the next state change. | |
| let _ = this.memory_monitor.poll_update(cx); | |
| return Poll::Pending; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@etl/src/concurrency/stream.rs` around lines 47 - 70, The task stalls because
when memory_monitor.poll_update(cx) yields Ready(...) we set
this.paused_for_memory but do not register the waker for future updates; before
returning Poll::Pending you must continue polling memory_monitor.poll_update(cx)
(or loop until it returns Poll::Pending) so the watch channel registers the
current waker. Modify the poll logic around memory_monitor.poll_update(cx) in
the stream's poll method (the match that sets *this.paused_for_memory and calls
this.memory_monitor.current_blocked()) to consume Ready variants and only stop
when poll_update returns Poll::Pending, updating *this.paused_for_memory on each
Ready, and then return Poll::Pending (so the waker is registered for the next
change).
| if *this.paused_for_memory { | ||
| if !this.items.is_empty() { | ||
| info!( | ||
| buffered_items = this.items.len(), | ||
| "backpressure active, flushing buffered batch" | ||
| ); | ||
| *this.reset_timer = true; | ||
|
|
||
| return Poll::Ready(Some(ShutdownResult::Ok(std::mem::take(this.items)))); | ||
| } | ||
|
|
||
| return Poll::Pending; | ||
| } |
There was a problem hiding this comment.
Same waker registration issue in BatchBackpressureStream.
The same bug exists here: when returning Poll::Pending after receiving a Ready(Some(blocked)) update, no waker is registered for subsequent state changes.
Proposed fix
if *this.paused_for_memory {
if !this.items.is_empty() {
info!(
buffered_items = this.items.len(),
"backpressure active, flushing buffered batch"
);
*this.reset_timer = true;
return Poll::Ready(Some(ShutdownResult::Ok(std::mem::take(this.items))));
}
+ // Ensure waker is registered for the next state change.
+ let _ = this.memory_monitor.poll_update(cx);
return Poll::Pending;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if *this.paused_for_memory { | |
| if !this.items.is_empty() { | |
| info!( | |
| buffered_items = this.items.len(), | |
| "backpressure active, flushing buffered batch" | |
| ); | |
| *this.reset_timer = true; | |
| return Poll::Ready(Some(ShutdownResult::Ok(std::mem::take(this.items)))); | |
| } | |
| return Poll::Pending; | |
| } | |
| if *this.paused_for_memory { | |
| if !this.items.is_empty() { | |
| info!( | |
| buffered_items = this.items.len(), | |
| "backpressure active, flushing buffered batch" | |
| ); | |
| *this.reset_timer = true; | |
| return Poll::Ready(Some(ShutdownResult::Ok(std::mem::take(this.items)))); | |
| } | |
| // Ensure waker is registered for the next state change. | |
| let _ = this.memory_monitor.poll_update(cx); | |
| return Poll::Pending; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@etl/src/concurrency/stream.rs` around lines 190 - 202, In
BatchBackpressureStream's poll implementation, when *this.paused_for_memory is
true and this.items is empty you currently return Poll::Pending without
registering the waker; change the branch so you capture and store the current
task waker (e.g. this.waker = Some(cx.waker().clone()) or equivalent) before
returning Poll::Pending so the stream can be woken when memory state changes,
keeping the existing behavior of flushing when items exist (the symbols to
modify are this.paused_for_memory, this.items, this.reset_timer and the
Poll::Pending return).
There was a problem hiding this comment.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@etl/src/concurrency/stream.rs`:
- Around line 95-121: The code sets paused_for_memory on Ready from
memory_subscription.poll_update but doesn't register the current waker for
future updates, so returning Poll::Pending can sleep forever; fix by repeatedly
calling memory_subscription.poll_update(cx) in a loop (or otherwise re-invoking
it) until it returns Poll::Pending so the waker is registered for the next
change, updating *this.paused_for_memory on each Ready(Some/None) result; apply
this change to the same polling logic in both BackpressureStream and
BatchBackpressureStream (use the existing memory_subscription.poll_update,
paused_for_memory and was_paused symbols to locate and update the code).
No description provided.