Skip to content

Fix garbage messages on device de-enumeration#53

Open
Cristian Manca (CristianManca) wants to merge 1 commit into
qualcomm:mainfrom
CristianManca:fix/wdfserial-clear-ringbuffer-d3final
Open

Fix garbage messages on device de-enumeration#53
Cristian Manca (CristianManca) wants to merge 1 commit into
qualcomm:mainfrom
CristianManca:fix/wdfserial-clear-ringbuffer-d3final

Conversation

@CristianManca

@CristianManca Cristian Manca (CristianManca) commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

This patch fixes garbage messages emitted by qcwdfser during device de-enumeration/re-enumeration by hardening the read pipeline teardown path and preventing stale data reuse.

Root cause

Garbage AT replies were caused by a combination of issues during remove/power transitions:

  1. D0Exit race in read path: stale URB completions could still be present and later copied into the ring buffer.
  2. Surprise-removal retry storm: after STATUS_NO_SUCH_DEVICE, free-list URBs could be re-dispatched repeatedly.
  3. Completion length source: read completion used requested pipe length instead of actual transferred bytes in some cases.

What this patch changes

Read pipeline hardening (QCRD.c)

  • Adds bDeviceGone state tracking.
  • On URB completion with STATUS_NO_SUCH_DEVICE, marks device as gone.
  • In READ_THREAD_REQUEST_ARRIVE_EVENT, skips free-list URB re-dispatch when device is gone (prevents high-rate retry storm during removal).
  • In READ_THREAD_DEVICE_D0_EXIT_EVENT, clears read-side buffered/completion state before signaling D0Exit ready: - QCRD_ClearBuffer(...)
  • KeClearEvent(ReadRequestCompletionEvent)
  • reset state (bBufferOverflow, bDeviceGone, devErrCnt)
  • Resets state also on file-close path for clean restart after re-enumeration.

Read completion correctness (QCRD.c)

  • QCRD_EvtIoReadCompletionAsync now uses actual transferred length: - availableLength = (NT_SUCCESS(status) ? Params->IoStatus.Information : 0)
  • Failed URBs therefore do not contribute stale bytes to ring-buffer flow.
  • Diagnostic logging updated accordingly (requested/transferred/used behavior).

Result

The de-enumeration path no longer re-delivers stale AT response data, and surprise-removal handling no longer drives repeated URB re-dispatch while the device is gone.

@CristianManca Cristian Manca (CristianManca) force-pushed the fix/wdfserial-clear-ringbuffer-d3final branch from 6142c6b to b896853 Compare June 9, 2026 09:59
@5656hcx

Copy link
Copy Markdown
Contributor

Hi Cristian Manca (@CristianManca),

Is this issue introduced by interval timeout implementation only?
Can you explain more why interval timeout will re-deliver outdated data?
If interval timeout is disabled, will this issue happen?

Best,
Chenxi

@CristianManca

Cristian Manca (CristianManca) commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Hi Chenxi Han (@5656hcx) ,

I removed the changes in src/windows/wdfserial/QCRD.c regarding interval timeout.
That change was not related to garbage messages issue.

Regards,
Cristian

@hangzqcom

Copy link
Copy Markdown
Contributor

This patch fixes an issue where qcwdfser emitted copies of the last AT commands replies (garbage messages) when the device de-enumerated. Root cause: stale data remained in the serial ring buffer and ReadIntervalTimeout behavior could re-deliver buffered data during D3 transitions.

This patch prevents qcwdfser from emitting stale copies of the last AT replies on device de‑enumeration by clearing the serial ring buffer at D3Final. That change exposed a timing/regression where reads could still wait on the inter-byte timer (or re-deliver buffered data) and cause latency/throughput issues. To address that, read handling was updated to immediately drain available buffer data and cancel any pending interval timer, restoring correct behavior and timing.

Changes:

Ring buffer and power state management:

  • When the device transitions to the D3Final power state, the read ring buffer is now explicitly cleared and the in-queue byte count reset, ensuring no stale data remains when the device is powered down.

Read request handling and latency improvements:

  • The read handler now bypasses the inter-byte gap timer when data is already present in the ring buffer, delivering data immediately instead of waiting for the timer. This significantly reduces per-byte latency and prevents throughput loss. Additionally, any pending read interval timer is canceled before draining the buffer. [1] [2]
  • The logging in the read handler was corrected to report the actual number of available bytes, not the buffer capacity, for more accurate diagnostics.

Read completion and diagnostics:

  • The read completion routine now distinguishes between requested, transferred, and used byte counts for improved trace logs and ensures failed URBs do not contribute bytes to the ring buffer. Logging is updated to reflect these changes and to avoid dereferencing possibly-null pointers.

PR description needs to be updated to reflect final changes

@5656hcx Chenxi Han (5656hcx) left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the pull request description to reflect the latest changes.
Also, making the buffer cleanup outside read thread is unsafe.

Comment thread src/windows/wdfserial/QCPNP.c Outdated
("<%ws> QCPNP_EvtDeviceD0Exit D3Final: clearing ring buffer\n", pDevContext->PortName)
);
QCUTIL_RingBufferClear(&pDevContext->ReadRingBuffer);
pDevContext->AmountInInQueue = 0;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating the ReadRingBuffer outside read thread may experience race condition

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the patch and added changes in QCRD.c

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In QCPNP_EvtDeviceD0Exit (src/windows/wdfserial/QCPNP.c), I removed the if block (TargetState == WdfPowerDeviceD3Final) that directly cleared the ReadRingBuffer outside the read thread (QCUTIL_RingBufferClear + update AmountInInQueue + related logs).

Now, the RX buffer clearing is handled only in the read thread's path (QCRD_ReadRequestHandlerThread / READ_THREAD_DEVICE_D0_EXIT_EVENT), eliminating the issue reported by the reviewer.

@CristianManca Cristian Manca (CristianManca) marked this pull request as draft June 18, 2026 12:39
@CristianManca Cristian Manca (CristianManca) force-pushed the fix/wdfserial-clear-ringbuffer-d3final branch from 9caf3ff to 1533a42 Compare June 19, 2026 09:59
@CristianManca

Copy link
Copy Markdown
Contributor Author

I modified the patch and added changes in QCRD.c

@CristianManca Cristian Manca (CristianManca) marked this pull request as ready for review June 22, 2026 13:03
PREQUEST_CONTEXT pReqContext = QCReqGetContext(Request);
NTSTATUS status = WdfRequestGetStatus(Request);
size_t availableLength = Params->Parameters.Usb.Completion->Parameters.PipeRead.Length;
size_t availableLength = (NT_SUCCESS(status) ? Params->IoStatus.Information : 0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Windows Sample Drivers uses Params->Parameters.Usb.Completion->Parameters.PipeRead.Length

https://github.com/microsoft/Windows-driver-samples/blob/main/general/DCHU/osrfx2_DCHU_base/osrfx2_DCHU_base/bulkrwr.c#L199

Also we need to support partial transfer, can't just set this to 0.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback — agreed.

I see two valid options here:

1. Conservative/defensive path (avoid overcount if values diverge):

 size_t ioLen  = Params->IoStatus.Information;
 size_t usbLen = Params->Parameters.Usb.Completion->Parameters.PipeRead.Length;
 size_t availableLength = (usbLen == 0) ? ioLen : min(ioLen, usbLen);

This keeps compatibility with the USB completion field used in Microsoft samples, while preventing overcount if IoStatus.Information and PipeRead.Length differ.

2. Simpler path without forcing 0 on non-success status (preserve partial-transfer behavior):

size_t availableLength = Params->IoStatus.Information

This keeps partial bytes when present and avoids dropping valid data on mixed-status completions.

QCRD.c — D0Exit race:
Clear ring buffer and completion list inside the read thread's
D0_EXIT_EVENT handler before signaling D0ExitReadyEvent, removing
the race where stale URB completions wrote old AT responses into
the ring buffer after D0Exit, causing garbage messages on
re-enumeration. Reset bBufferOverflow/bDeviceGone/devErrCnt on
D0Exit and FileClose for clean pipeline state.

QCRD.c — surprise removal URB retry storm:
Add bDeviceGone flag (set on STATUS_NO_SUCH_DEVICE completion) and
guard free-list re-dispatch in REQUEST_ARRIVE_EVENT to skip when
the device is gone. Prevents the ~40000/sec URB retry loop during
surprise removal.

QCRD.c — fixes:
- EvtIoReadCompletionAsync: use IoStatus.Information for
availableLength instead of PipeRead.Length (requested vs actual).

Signed-off-by: Cristian Manca <Cristian.Manca@telit.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants