feat: Add request cancellation to C++ gRPC client #896

Open
yinggeh wants to merge 1 commit into main from yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request

@yinggeh yinggeh commented May 13, 2026

Summary

Adds per-request cancellation to the C++ gRPC client, mirroring the existing Python tritonclient.grpc cancellation interfaces.

  • AsyncInfer gains an optional trailing CallContext** ctx_out. On Error::Success, the caller receives a heap-allocated CallContext whose Cancel() method calls grpc::ClientContext::TryCancel(). Cancel() after natural completion is a safe no-op.
  • AsyncInferMulti gains an optional trailing std::vector<CallContext*>* ctxs_out — one handle per fanned-out request. Cancellation is per-request; the multi callback still fires exactly once after every leaf produces a result.
  • StopStream gains a bool cancel_requests = false. When true, the streaming RPC is TryCancel'd and the stream callback receives one final InferResult whose status contains "Locally cancelled by application!".
  • After a cancelled stream, StartStream can be called again — grpc_context_ is rebuilt in place because grpc::ClientContext is non-movable and a cancelled context cannot be reused.
  • All cancellation paths surface the same Python-parity message, "Locally cancelled by application!", matching tritonclient.grpc._utils.get_cancelled_error() in the Python client.
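The "callback fires exactly once" and "Cancel() after completion is a no-op" behaviour described above can be sketched without gRPC. The class below is a hypothetical mock, not the PR's CallContext: MockCallContext, Complete(), and the "OK" status string are invented for illustration, and the real handle is described as forwarding to grpc::ClientContext::TryCancel() rather than flipping a flag.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's CallContext. It only models the
// "completion callback runs exactly once" behaviour, not a real RPC.
class MockCallContext {
 public:
  using Callback = std::function<void(const std::string&)>;

  explicit MockCallContext(Callback on_done) : on_done_(std::move(on_done)) {}

  // Ask for cancellation. Does nothing if the request already finished.
  void Cancel() { Finish("Locally cancelled by application!"); }

  // Simulate the response arriving. Does nothing after a cancel (assumption).
  void Complete() { Finish("OK"); }

 private:
  void Finish(const std::string& status) {
    bool expected = false;
    // Only the first caller flips the flag, so the callback runs once.
    if (done_.compare_exchange_strong(expected, true)) {
      on_done_(status);
    }
  }

  Callback on_done_;
  std::atomic<bool> done_{false};
};

// Returns every status the callback saw when Cancel() follows completion.
std::vector<std::string> CancelAfterCompletion() {
  std::vector<std::string> seen;
  MockCallContext ctx([&seen](const std::string& s) { seen.push_back(s); });
  ctx.Complete();
  ctx.Cancel();  // safe no-op: the request is already done
  return seen;
}

// Returns every status the callback saw when Cancel() comes first.
std::vector<std::string> CancelBeforeCompletion() {
  std::vector<std::string> seen;
  MockCallContext ctx([&seen](const std::string& s) { seen.push_back(s); });
  ctx.Cancel();
  ctx.Complete();  // ignored in this mock: the callback already ran
  return seen;
}
```

In both helper functions the callback records a single status, which is the property the cancel-after-completion test case is described as checking.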

Backwards compatibility

Every new parameter defaults to nullptr / false. Existing call sites compile and behave exactly as before; cancellation is strictly opt-in.
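As a rough illustration of why defaulted trailing parameters keep old call sites compiling, here is a minimal sketch. MockHandle, MockAsyncInfer, and MockStopStream are hypothetical names; only the parameter shapes (a trailing pointer-to-pointer defaulting to nullptr and a bool defaulting to false) come from the description above. Ownership of the returned handle is an assumption of this sketch.

```cpp
#include <memory>
#include <string>

// Hypothetical handle type standing in for CallContext.
struct MockHandle {
  int request_id;
};

// Shaped like the new AsyncInfer: the trailing out-parameter defaults to
// nullptr, so callers that never pass it are unaffected.
bool MockAsyncInfer(int request_id, MockHandle** ctx_out = nullptr) {
  if (ctx_out != nullptr) {
    // Assumption of this sketch: the caller takes ownership of the handle.
    *ctx_out = new MockHandle{request_id};
  }
  return true;
}

// Shaped like the new StopStream: cancellation only happens on request.
// Returns the final status the stream callback would see (empty if none).
std::string MockStopStream(bool cancel_requests = false) {
  return cancel_requests ? "Locally cancelled by application!" : "";
}

// Pre-existing call site: compiles unchanged, no handle is created.
bool LegacyCallSite() { return MockAsyncInfer(1); }

// Opt-in call site: wraps the raw handle so it is freed on scope exit.
int OptInCallSite() {
  MockHandle* raw = nullptr;
  MockAsyncInfer(42, &raw);
  std::unique_ptr<MockHandle> handle(raw);
  return handle ? handle->request_id : -1;
}
```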

Testing

New src/c++/tests/grpc_cancellation_test.cc (gtest) with 7 cases:

| Test | Covers |
| --- | --- |
| TestGrpcAsyncInfer | Python test_grpc_async_infer parity (1:1) |
| TestGrpcAsyncInferCancelAfterCompletionIsNoOp | Cancel-after-finish is safe, no double callback |
| TestGrpcAsyncInferWithoutContextStillCompletes | Default arg path is unchanged |
| TestGrpcAsyncInferMulti | Cancels requests 0 and 2, lets request 1 complete; verifies per-request cancel + result-order preservation + single multi-callback fire |
| TestGrpcStreamInfer | Python test_grpc_stream_infer parity (1:1) |
| TestGrpcStreamCancelWithoutInfer | Cancel an empty stream still emits the cancel message |
| TestGrpcStreamCancelThenRestart | Cancelled stream → fresh stream → successful inference |
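The three properties the TestGrpcAsyncInferMulti row names (per-request cancel, result order, a single multi-callback fire) can be modelled with a small aggregator. This is a sketch under stated assumptions, not the PR's implementation: MockMultiCall, its methods, and the "OK" status are hypothetical, and the real client drives this from gRPC completion events.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical aggregator: stores one status per leaf request and invokes
// the multi callback once, after the last leaf has reported.
class MockMultiCall {
 public:
  using MultiCallback = std::function<void(const std::vector<std::string>&)>;

  MockMultiCall(std::size_t leaf_count, MultiCallback on_all_done)
      : results_(leaf_count),
        reported_(leaf_count, false),
        remaining_(leaf_count),
        on_all_done_(std::move(on_all_done)) {}

  void Cancel(std::size_t i) { Report(i, "Locally cancelled by application!"); }
  void Complete(std::size_t i) { Report(i, "OK"); }

 private:
  void Report(std::size_t i, const std::string& status) {
    bool fire = false;
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (i >= results_.size() || reported_[i]) return;  // ignore repeats
      reported_[i] = true;
      results_[i] = status;  // slot i keeps the original request order
      fire = (--remaining_ == 0);
    }
    if (fire) on_all_done_(results_);  // runs once, after the final leaf
  }

  std::mutex mu_;
  std::vector<std::string> results_;
  std::vector<bool> reported_;
  std::size_t remaining_;
  MultiCallback on_all_done_;
};

struct MultiOutcome {
  int callback_fires = 0;
  std::vector<std::string> statuses;
};

// Mirrors the scenario in the table: cancel 0 and 2, let 1 complete.
MultiOutcome CancelZeroAndTwo() {
  MultiOutcome out;
  MockMultiCall call(3, [&out](const std::vector<std::string>& r) {
    ++out.callback_fires;
    out.statuses = r;
  });
  call.Cancel(0);
  call.Cancel(2);
  call.Complete(1);
  call.Cancel(1);  // after completion: no effect
  return out;
}
```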

Related

triton-inference-server/server#8775

@yinggeh yinggeh force-pushed the yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request branch 3 times, most recently from e26c27d to 8004de9 Compare May 14, 2026 04:35
@yinggeh yinggeh force-pushed the yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request branch from 8004de9 to a0e7e46 Compare May 14, 2026 04:57
@yinggeh yinggeh requested review from mudit-eng and whoisj May 14, 2026 04:59
@yinggeh yinggeh self-assigned this May 14, 2026
@yinggeh yinggeh added the enhancement New feature or request label May 14, 2026

@whoisj whoisj left a comment

LGTM

Comment thread src/c++/library/grpc_client.cc
  const std::vector<std::vector<const InferRequestedOutput*>>& outputs,
- const Headers& headers, grpc_compression_algorithm compression_algorithm)
+ const Headers& headers, grpc_compression_algorithm compression_algorithm,
+ std::vector<CallContext*>* ctxs_out)

Pointer to pointer is tricky and error-prone. Why not pass as a reference?

Contributor Author

Because the argument is optional. A reference cannot be null.
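To make the distinction concrete: a reference parameter must always bind to a vector, whereas a pointer can default to nullptr to mean "the caller did not ask for handles". The sketch below is illustrative only. MockHandle and MockAsyncInferMulti are hypothetical, the vector holds values rather than the CallContext* entries in the diff, and clearing the vector before filling it is an assumption rather than documented behaviour.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical handle type standing in for CallContext.
struct MockHandle {
  std::size_t request_index;
};

// Shaped like the new trailing argument of AsyncInferMulti.
// Returns how many handles were written to the caller's vector.
std::size_t MockAsyncInferMulti(std::size_t request_count,
                                std::vector<MockHandle>* ctxs_out = nullptr) {
  if (ctxs_out == nullptr) {
    return 0;  // null pointer: the caller opted out of cancellation handles
  }
  ctxs_out->clear();  // assumption: any previous contents are discarded
  for (std::size_t i = 0; i < request_count; ++i) {
    ctxs_out->push_back(MockHandle{i});  // one handle per fanned-out request
  }
  return ctxs_out->size();
}
```

In this sketch a null ctxs_out and a pointer to an empty vector mean different things: the first skips handle creation entirely, while the second is simply an output buffer waiting to be filled.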

Help me understand. Is it ctxs_out that would be null, or the vector entries inside it? I'm asking because I wonder whether we can have an empty vector.

Contributor Author

@yinggeh yinggeh May 15, 2026

Labels

enhancement New feature or request

3 participants