Skip to content

Add Stats to DescribeTaskQueue#596

Merged
stephanos merged 9 commits into
temporalio:masterfrom
stephanos:describe-task-queue-stats
May 30, 2025
Merged

Add Stats to DescribeTaskQueue#596
stephanos merged 9 commits into
temporalio:masterfrom
stephanos:describe-task-queue-stats

Conversation

@stephanos
Copy link
Copy Markdown
Contributor

@stephanos stephanos commented May 26, 2025

READ BEFORE MERGING: All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted.

What changed?

Added TaskQueueStats field to DescribeTaskQueueResponse.

Why?

To provide non-versioning users with relevant task queue stats.

PS: The long-term plan is to deprecate the ENHANCED mod fields.

Breaking changes

N/A

Server PR

temporalio/temporal#7816

Comment on lines +1036 to +1038
// Select the API mode to use for this request: DEFAULT mode (if unset) or ENHANCED mode.
// Consult the documentation for each field to understand which mode it is supported in.
temporal.api.enums.v1.DescribeTaskQueueMode api_mode = 5;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this near the top since it influences how any of the other fields work/are understood.

Comment on lines +1088 to +1094
// [DEFAULT MODE ONLY]
// Statistics for the task queue. Only populated when `report_stats` is set to true in the request.
temporal.api.taskqueue.v1.TaskQueueStats task_queue_stats = 5;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only actual change to the protos (apart from options and docs).

Copy link
Copy Markdown
Contributor

@cretz cretz May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm now wondering if this may necessitate at least a high-level idea of what this call may look like in non-Go SDKs. For instance in Python, what parameters might await my_client.describe_task_queue() accept and what might the return type look like? I admit I am getting a bit lost with enhanced no longer being a superset of non-deprecated default data.

Copy link
Copy Markdown
Contributor Author

@stephanos stephanos May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect these to work (Python-ish):

await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow)

(no stats queried)

await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow, include_task_queue_status=true)

(task_queue_status populated)

await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow, report_stats=true)

(task_queue_stats populated)

Does that help?

Copy link
Copy Markdown
Contributor

@cretz cretz May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I don't see a separation between default and enhanced there. Are those calls not using "enhanced" mode? What is the return type of the functions (meaning what's the return model)? Of course they can ask about multiple task queue types at a time, so we wouldn't limit there, but that's a minor thing.

My whole confusion is how a user is expected to use "enhanced" and "default". One would assume "enhanced" has everything non-enhanced does (except deprecations).

Copy link
Copy Markdown
Contributor Author

@stephanos stephanos May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the exact state of support of describe_task_queue across SDKs, but my understanding is that only the Go SDK added support for the enhanced mode (as a separate API, too). Since the (new) goal is to get rid of enhanced mode altogether, those existing implementations don't have to chance much. They just have to add support for the on/off toggle report_stats to opt-in to the new stats field in the response.

Copy link
Copy Markdown
Contributor Author

@stephanos stephanos May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PS: I think "enhanced" was not the best choice of wording here. It's really more a batch API (plus versioning selection).

Copy link
Copy Markdown
Contributor

@cretz cretz May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the (new) goal is to get rid of enhanced mode altogether

Ok, this is where I was confused and I think any user/viewer of this API will be similarly confused. I think we should deprecate enhanced mode at this time to signal this intent.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Enhanced" was supposed to be a protocol-only thing, users wouldn't have asked for it directly, the sdk would just always use it. The old mode was still supported just to not break old clients.

repeated temporal.api.taskqueue.v1.PollerInfo pollers = 1;

// [DEFAULT MODE ONLY]
// Deprecated.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field remains deprecated. TaskQueueStats supersedes it.

Comment on lines -1039 to -1041
temporal.api.enums.v1.TaskQueueType task_queue_type = 3 [deprecated = true];
// Deprecated. Ignored in `ENHANCED` mode.
bool include_task_queue_status = 4 [deprecated = true];
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the deprecated annotations as we're breathing new life into the "default" mode.

Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
@stephanos stephanos force-pushed the describe-task-queue-stats branch from 87f0547 to d08c854 Compare May 27, 2025 17:08
@stephanos stephanos marked this pull request as ready for review May 27, 2025 19:30
@stephanos stephanos requested review from a team as code owners May 27, 2025 19:30
Comment on lines +1089 to +1079
// [DEFAULT MODE ONLY]
// Statistics for the task queue. Only populated when `report_stats` is set to true in the request.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit confusing. So this says [DEFAULT MODE ONLY], but then says "Only populated when report_stats is set to true" and that field is [ENHANCED MODE ONLY]. Isn't this contradictory?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see report_stats says [BOTH MODES] though

Copy link
Copy Markdown
Contributor

@cretz cretz May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see now, I had gotten confused with report_pollers. So this is a bit confusing.

So do we hide top-level stats for enhanced users? If so, why? Is it some kind of aggregated form of what's in versions_info?

I am also a bit confused with how this works with versions_info. What does this top-level task_queue_stats contain that that one doesn't?

(sorry if we have discussed this already)

Copy link
Copy Markdown
Contributor Author

@stephanos stephanos May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right; I marked report_stats as BOTH MODES for that reason. I'll admit, it's a bit awkward, but the alternatives felt worse to me (specifically, introducing yet another field or re-using the (default mode-only) include_task_queue_status). But I'm open to suggestions; no strong opinions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry @stephanos, we submitted our comments here at the same time, but I do have some questions there)

Copy link
Copy Markdown
Contributor Author

@stephanos stephanos May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've come into this API only recently; it's definitely confusing :-/

Here's my understanding:

  • default ("legacy") mode: query via task_queue and task_queue_type
  • enhanced mode: query via task_queue, task_queue_types (plural!) and versions (allows selecting multiple build_ids)

So you can already see that the enhanced mode is for batch and versioning-related requests, and the default is only for a single task queue.

So do we hide top-level stats for enhanced users? If so, why?

So the TaskQueueStats currently, is only included in TaskQueueTypeInfo, which is only included in TaskQueueVersionInfo which is only used in the DescribeTaskQueueResponse's versions_info map (which is enhanced mode only). Meaning users of enhanced mode have access to that data here: version_info[build_id].types_info[workflow_type].stats.

What does this top-level task_queue_stats contain that that one doesn't?

task_queue_stats would only be present when using the default mode (version_info is only present in enhanced mode). So there's no duplication. The new field "upgrades" the default mode to contain the same information as the enhanced mode already does.

Long-term, the goal is to get rid of "enhanced mode" altogether. Any existing SDK implementation for DescribeTaskQueue only has to add support for the toggle field report_stats to populate task_queue_stats.

Copy link
Copy Markdown
Contributor

@ShahabT ShahabT May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cretz the enhanced mode operates on a "task queue family" which is all task queues of the same name. Officially, a task queue is identified by name + single type. So the right granularity for this API seem to be a single type.

When I proposed and implemented the enhanced mode I did not know that a task queue is a single type. I can see also how most users think of task queue as an entity encompassing all types because the UI and SDK makes it easy for people to think that way. But if one reads the docs carefully they would see that it says: "There are three types of Task Queues: Activity Task Queues, Workflow Task Queues, and Nexus Task Queues.".

When @mfateev saw the enhanced mode he strongly disliked it because it deals with multiple types meaning it's not describing a single task queue anymore.

Given that, we are thinking to double down on the default/legacy mode. An eventually deprecate the enhanced mode.

If you think it's less confusing for users I don't see an issue with marking enhanced mode as deprecated right now with these changes. After all, all the extra stuff for versioning is not deprecated anyways because they belonged to versioning V2, and the stats part is being covered for unversioned. (stats for versioned V3 can follow later).

Copy link
Copy Markdown
Contributor

@cretz cretz May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that, we are thinking to double down on the default/legacy mode. An eventually deprecate the enhanced mode.

Ok, this was the confusing point. I don't think this is clear from the API that enhanced actually may have less features than non-enhanced and that we don't expect enhanced to be used (and that we will deprecate it).

If you think it's less confusing for users I don't see an issue with marking enhanced mode as deprecated right now with these changes.

👍 IMO if we are deprecating enhanced mode, we should do it now. It will be way less confusing to all readers of this API.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I can add the deprecation comments and annotations 👍

// Only set in `ENHANCED` mode.
map<string, temporal.api.taskqueue.v1.TaskQueueVersionInfo> versions_info = 3;

// [DEFAULT MODE ONLY]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we won't let an enhanced user get this basic versioning info? Is it some kind of aggregated form of versions_info?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also boils down to task queue vs task queue family discussion. versioning_info holds the info about what is the current/ramping version for routing tasks of this task queue.

Per discussion in the versioning crew, involving Max, it was decided that each task queue type can have potentially different versioning config. so we cannot return a single versioning_info value in enhanced mode which is potentially asking for multiple types.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing that you get more enhanced versioning information if you don't use enhanced mode. We should deprecate enhanced mode if it's actually no longer the enhanced form with enhanced information.

@stephanos stephanos force-pushed the describe-task-queue-stats branch from d08c854 to 999ca6d Compare May 28, 2025 16:42
string namespace = 1;
// Sticky queues are not supported in `ENHANCED` mode.

// Deprecated.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add some clarification here that the enhanced mode itself is deprecated too?

@stephanos stephanos force-pushed the describe-task-queue-stats branch from 999ca6d to adb3589 Compare May 28, 2025 17:35
Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
// Consult the documentation for each field to understand which mode it is supported in.
temporal.api.enums.v1.DescribeTaskQueueMode api_mode = 5 [deprecated = true];;

// [BOTH MODES]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To make this more readable, what do you think about moving all the fields that apply to default mode up to here, and all the ones that apply enhanced only (i.e. the deprecated ones) down at the bottom? So a reader can see the current recommended api all in once place and the deprecated stuff is all at the bottom (except for api_mode itself). Actually maybe api_mode should also be at the bottom, as the first deprecated field?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that 👍

Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
Copy link
Copy Markdown
Contributor

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deprecated enhanced form clears things up, thanks!

Comment thread temporal/api/workflowservice/v1/request_response.proto Outdated
Comment on lines +1043 to 1048
// Deprecated, use `report_stats` instead.
// If true, the task queue status will be included in the response.
bool include_task_queue_status = 4 [deprecated = true];
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move this one down below as well?

// Only set in `ENHANCED` mode.
map<string, temporal.api.taskqueue.v1.TaskQueueVersionInfo> versions_info = 3;
// Statistics for the task queue. Only populated when `report_stats` is set to true in the request.
temporal.api.taskqueue.v1.TaskQueueStats task_queue_stats = 5;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this just stats? The task_queue_ is implied by context

@stephanos stephanos force-pushed the describe-task-queue-stats branch from e59966c to 9a6ee84 Compare May 30, 2025 02:47
@stephanos stephanos merged commit babf177 into temporalio:master May 30, 2025
4 checks passed
@stephanos stephanos deleted the describe-task-queue-stats branch May 30, 2025 02:51
stephanos added a commit that referenced this pull request Mar 10, 2026
_**READ BEFORE MERGING:** All PRs require approval by both Server AND
SDK teams before merging! This is why the number of required approvals
is "2" and not "1"--two reviewers from the same team is NOT sufficient.
If your PR is not approved by someone in BOTH teams, it may be summarily
reverted._

<!-- Describe what has changed in this PR -->
**What changed?**

Added `TaskQueueStats` field to `DescribeTaskQueueResponse`.

<!-- Tell your future self why have you made these changes -->
**Why?**

To provide non-versioning users with relevant task queue stats.

PS: The long-term plan is to deprecate the ENHANCED mod fields.

<!-- Are there any breaking changes on binary or code level? -->
**Breaking changes**

N/A

<!-- If this breaks the Server, please provide the Server PR to merge
right after this PR was merged. -->
**Server PR**

temporalio/temporal#7816
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants