test: Add C++ gRPC cancellation tests to L0_request_cancellation#8775

Open
yinggeh wants to merge 1 commit into main from yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request

Conversation

Contributor

@yinggeh yinggeh commented May 13, 2026

What does the PR do?

Wires the new C++ grpc_cancellation_test gtest (added in the client-side companion PR) into qa/L0_request_cancellation/test.sh as a sibling of the existing Python cancellation suite. Each gtest case runs against a fresh tritonserver, and the number of "Cancellation notification received for" lines in the server log is asserted to match the expected count for that case.

Also temporarily bumps the model's instance_group count to 3 around the TestGrpcAsyncInferMulti case (reverted afterwards) so the three fanned-out requests can execute concurrently. The test cancels two of them while letting the middle one complete naturally; with a single CPU instance the requests would otherwise be serialized.
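
The log-count assertion described above can be sketched roughly as follows. This is a minimal illustration, not the code in test.sh: the helper names count_cancel_lines and check_cancel_count and the sample log content are hypothetical, and only CANCEL_LOG_LINE is taken from the PR.

```shell
#!/bin/bash
# Marker string taken from the PR; everything else here is illustrative.
CANCEL_LOG_LINE="Cancellation notification received for"

# Hypothetical helper: print how many lines of a log file contain the marker.
count_cancel_lines() {
    local log_file="$1"
    # grep -c prints 0 and exits non-zero when nothing matches;
    # '|| true' keeps the helper usable under 'set -e'.
    grep -c -F -- "$CANCEL_LOG_LINE" "$log_file" || true
}

# Hypothetical helper: fail when the actual count differs from the expected one.
check_cancel_count() {
    local log_file="$1" expected="$2" actual
    actual=$(count_cancel_lines "$log_file")
    if [ "$actual" -ne "$expected" ]; then
        echo "*** FAILED: expected $expected cancellation lines, found $actual"
        return 1
    fi
    return 0
}

# Demo against a made-up server log with two cancellation lines.
demo_log=$(mktemp)
printf '%s\n' \
    "Cancellation notification received for ExampleHandler, rpc_ok=1" \
    "an unrelated log line" \
    "Cancellation notification received for ExampleHandler, rpc_ok=1" > "$demo_log"
check_cancel_count "$demo_log" 2 && echo "count matches"
rm -f "$demo_log"
```

In a real run the log file would be the per-case server log rather than a temporary file, and the expected count would differ per test case.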

Checklist

  • PR title reflects the change and is of format <commit_type>: <Title>
  • Changes are described in the pull request.
  • Related issues are referenced.
  • Populated github labels field
  • Added test plan and verified test passes.
  • Verified that the PR passes existing CI.
  • Verified copyright is correct on all changed files.
  • Added succinct git squash message before merging ref.
  • All template sections are filled out.
  • Optional: Additional screenshots for behavior/output changes with before/after.

Commit Type:

  • test

Related PRs:

triton-inference-server/client#896

Where should the reviewer start?

Test plan:

L0_request_cancellation--base

  • CI Pipeline ID: 51138710

Caveats:

Background

Triton has had Python-side request cancellation tests since r23.10 but no C++ counterpart, despite the C++ gRPC client gaining matching APIs in the companion client PR. This PR adds the missing test wiring so the two clients' cancellation behavior stays in lock-step on every CI run.

@yinggeh yinggeh force-pushed the yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request branch from 0a43ae8 to 308cad3 Compare May 13, 2026 07:54
@yinggeh yinggeh force-pushed the yinggeh/tri-967-riva-speech-skills-cpp-clients-do-not-support-request branch from 308cad3 to 0c063cf Compare May 14, 2026 04:58
@yinggeh yinggeh requested review from mudit-eng and whoisj May 14, 2026 04:59
@yinggeh yinggeh self-assigned this May 14, 2026
@yinggeh yinggeh added the PR: test Adding missing tests or correcting existing test label May 14, 2026

SERVER=/opt/tritonserver/bin/tritonserver
source ../common/util.sh
CANCEL_LOG_LINE="Cancellation notification received for"
Contributor

probably needs a trailing space.

Contributor Author

Really?

Contributor

This sentence looks odd. There should be something after 'for'.
Do you have an example of what the actual log line looks like?

Contributor Author

Two locations

LOG_VERBOSE(1) << "Cancellation notification received for " << Name()
               << ", rpc_ok=" << rpc_ok << ", context "
               << state->context_->unique_id_ << " step "
               << state->context_->step_ << ", state "
               << state->unique_id_ << " step " << state->step_;

LOG_VERBOSE(1) << "Cancellation notification received for " << Name()
               << ", rpc_ok=" << rpc_ok << ", context "
               << state->context_->unique_id_ << " step "
               << state->context_->step_ << ", state "
               << state->unique_id_ << " step " << state->step_;

Contributor

As J said, there is a trailing space after 'for'.
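
For context on the trailing-space question, here is a small sketch of how the two variants of the pattern behave with a fixed-string grep. The log lines are made up for illustration, and using grep -c -F is an assumption about how the count is taken.

```shell
#!/bin/bash
# Made-up log: one line in the real format, one where "for" starts another word.
log=$(mktemp)
printf '%s\n' \
    "Cancellation notification received for ExampleHandler, rpc_ok=1" \
    "Cancellation notification received forwarded elsewhere" > "$log"

# Without the trailing space the pattern matches both lines as a substring.
without_space=$(grep -c -F -- "Cancellation notification received for" "$log")

# With the trailing space only the line followed by a handler name matches.
with_space=$(grep -c -F -- "Cancellation notification received for " "$log")

echo "without trailing space: $without_space"
echo "with trailing space: $with_space"
rm -f "$log"
```

Because the server always emits a space after "for", both patterns count the real log lines. The trailing space only makes the match stricter against unrelated text.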

requests whose results are no longer required can significantly impact server
resources.
Triton supports handling request cancellation received from the gRPC Python
client or a C API user (since r23.10), and C++ client (since r26.05).
Contributor

This change is not going in r26.05.

Contributor Author

I'll try to cherry-pick it, since an internal team is waiting for this feature.

Contributor

kind of a big overhaul this late into the release, no?

what is the risk assessment? (discuss offline)


TEST_LOG="./grpc_cancellation_test_cpp.$TEST_CASE.log"
SERVER_LOG="./grpc_cancellation_test_cpp.$TEST_CASE.server.log"

# AsyncInferMulti fans out N concurrent requests; bump to 3 CPU
Contributor

is there a check for N >= 3?

Contributor Author

Can you elaborate? A check on N requests, instances, or cancellations?

Contributor

For N concurrent requests to fan out to 3 CPU instances, shouldn't we have N > 3?

Contributor Author

In this test, each request takes 10 seconds to execute. To avoid a backlog in the request queue (and reduce overall test time), the model configuration is increased to 3 instances. If N > 3, requests after the third will wait in the queue until the first three have completed execution, which takes 10 seconds.

Contributor Author

I see what you mean. Here we are testing requests that are cancelled during execution. I can also add a test for in-queue request cancellation.

Contributor

Thanks. Yes, let's test for in-queue cancellation also.
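
The temporary instance-count bump discussed in this thread could look roughly like the sketch below. It is not the code from test.sh: the model name, the config contents, and the helpers bump_instance_count and restore_config are hypothetical, and the sed invocation assumes GNU sed (for -i without a suffix argument).

```shell
#!/bin/bash
# Hypothetical model directory and config, created only for this demo.
model_dir=$(mktemp -d)
config="$model_dir/config.pbtxt"
cat > "$config" <<'EOF'
name: "example_model"
instance_group [
  {
    count: 1
    kind: KIND_CPU
  }
]
EOF

# Hypothetical helper: back up the config, then rewrite the instance count.
bump_instance_count() {
    local cfg="$1" count="$2"
    cp "$cfg" "$cfg.bak"
    sed -i -E "s/count: *[0-9]+/count: $count/" "$cfg"
}

# Hypothetical helper: put the original config back after the test case.
restore_config() {
    local cfg="$1"
    mv "$cfg.bak" "$cfg"
}

bump_instance_count "$config" 3
grep "count:" "$config"   # bumped value while the multi-request case would run
restore_config "$config"
grep "count:" "$config"   # original value again
rm -rf "$model_dir"
```

With three instances, three 10-second requests can execute at the same time, so the cancellations are observed during execution. An in-queue cancellation test would instead need more requests than instances, so that at least one request is still waiting when it is cancelled.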

Labels

PR: test Adding missing tests or correcting existing test

3 participants