Add Stats to DescribeTaskQueue#596
Conversation
| // Select the API mode to use for this request: DEFAULT mode (if unset) or ENHANCED mode. | ||
| // Consult the documentation for each field to understand which mode it is supported in. | ||
| temporal.api.enums.v1.DescribeTaskQueueMode api_mode = 5; |
There was a problem hiding this comment.
I moved this near the top since it influences how any of the other fields work/are understood.
| // [DEFAULT MODE ONLY] | ||
| // Statistics for the task queue. Only populated when `report_stats` is set to true in the request. | ||
| temporal.api.taskqueue.v1.TaskQueueStats task_queue_stats = 5; |
There was a problem hiding this comment.
This is the only actual change to the protos (apart from options and docs).
There was a problem hiding this comment.
I'm now wondering if this may necessitate at least a high-level idea of what this call may look like in non-Go SDKs. For instance in Python, what parameters might await my_client.describe_task_queue() accept and what might the return type look like? I admit I am getting a bit lost with enhanced no longer being a superset of non-deprecated default data.
There was a problem hiding this comment.
I'd expect these to work (Python-ish):
await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow)
(no stats queried)
await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow, include_task_queue_status=true)
(task_queue_status populated)
await my_client.describe_task_queue("my-task-queue", TaskQueueType.Workflow, report_stats=true)
(task_queue_stats populated)
Does that help?
There was a problem hiding this comment.
Thanks! I don't see a separation between default and enhanced there. Are those calls not using "enhanced" mode? What is the return type of the functions (meaning what's the return model)? Of course they can ask about multiple task queue types at a time, so we wouldn't limit there, but that's a minor thing.
My whole confusion is how a user is expected to use "enhanced" and "default". One would assume "enhanced" has everything non-enhanced does (except deprecations).
There was a problem hiding this comment.
I don't know the exact state of support of describe_task_queue across SDKs, but my understanding is that only the Go SDK added support for the enhanced mode (as a separate API, too). Since the (new) goal is to get rid of enhanced mode altogether, those existing implementations don't have to chance much. They just have to add support for the on/off toggle report_stats to opt-in to the new stats field in the response.
There was a problem hiding this comment.
PS: I think "enhanced" was not the best choice of wording here. It's really more a batch API (plus versioning selection).
There was a problem hiding this comment.
Since the (new) goal is to get rid of enhanced mode altogether
Ok, this is where I was confused and I think any user/viewer of this API will be similarly confused. I think we should deprecate enhanced mode at this time to signal this intent.
There was a problem hiding this comment.
"Enhanced" was supposed to be a protocol-only thing, users wouldn't have asked for it directly, the sdk would just always use it. The old mode was still supported just to not break old clients.
| repeated temporal.api.taskqueue.v1.PollerInfo pollers = 1; | ||
|
|
||
| // [DEFAULT MODE ONLY] | ||
| // Deprecated. |
There was a problem hiding this comment.
This field remains deprecated. TaskQueueStats supersedes it.
| temporal.api.enums.v1.TaskQueueType task_queue_type = 3 [deprecated = true]; | ||
| // Deprecated. Ignored in `ENHANCED` mode. | ||
| bool include_task_queue_status = 4 [deprecated = true]; |
There was a problem hiding this comment.
Removed the deprecated annotations as we're breathing new life into the "default" mode.
87f0547 to
d08c854
Compare
| // [DEFAULT MODE ONLY] | ||
| // Statistics for the task queue. Only populated when `report_stats` is set to true in the request. |
There was a problem hiding this comment.
This is a bit confusing. So this says [DEFAULT MODE ONLY], but then says "Only populated when report_stats is set to true" and that field is [ENHANCED MODE ONLY]. Isn't this contradictory?
There was a problem hiding this comment.
I see report_stats says [BOTH MODES] though
There was a problem hiding this comment.
I see now, I had gotten confused with report_pollers. So this is a bit confusing.
So do we hide top-level stats for enhanced users? If so, why? Is it some kind of aggregated form of what's in versions_info?
I am also a bit confused with how this works with versions_info. What does this top-level task_queue_stats contain that that one doesn't?
(sorry if we have discussed this already)
There was a problem hiding this comment.
Right; I marked report_stats as BOTH MODES for that reason. I'll admit, it's a bit awkward, but the alternatives felt worse to me (specifically, introducing yet another field or re-using the (default mode-only) include_task_queue_status). But I'm open to suggestions; no strong opinions.
There was a problem hiding this comment.
(sorry @stephanos, we submitted our comments here at the same time, but I do have some questions there)
There was a problem hiding this comment.
I've come into this API only recently; it's definitely confusing :-/
Here's my understanding:
- default ("legacy") mode: query via
task_queueandtask_queue_type - enhanced mode: query via
task_queue,task_queue_types(plural!) andversions(allows selecting multiplebuild_ids)
So you can already see that the enhanced mode is for batch and versioning-related requests, and the default is only for a single task queue.
So do we hide top-level stats for enhanced users? If so, why?
So the TaskQueueStats currently, is only included in TaskQueueTypeInfo, which is only included in TaskQueueVersionInfo which is only used in the DescribeTaskQueueResponse's versions_info map (which is enhanced mode only). Meaning users of enhanced mode have access to that data here: version_info[build_id].types_info[workflow_type].stats.
What does this top-level task_queue_stats contain that that one doesn't?
task_queue_stats would only be present when using the default mode (version_info is only present in enhanced mode). So there's no duplication. The new field "upgrades" the default mode to contain the same information as the enhanced mode already does.
Long-term, the goal is to get rid of "enhanced mode" altogether. Any existing SDK implementation for DescribeTaskQueue only has to add support for the toggle field report_stats to populate task_queue_stats.
There was a problem hiding this comment.
@cretz the enhanced mode operates on a "task queue family" which is all task queues of the same name. Officially, a task queue is identified by name + single type. So the right granularity for this API seem to be a single type.
When I proposed and implemented the enhanced mode I did not know that a task queue is a single type. I can see also how most users think of task queue as an entity encompassing all types because the UI and SDK makes it easy for people to think that way. But if one reads the docs carefully they would see that it says: "There are three types of Task Queues: Activity Task Queues, Workflow Task Queues, and Nexus Task Queues.".
When @mfateev saw the enhanced mode he strongly disliked it because it deals with multiple types meaning it's not describing a single task queue anymore.
Given that, we are thinking to double down on the default/legacy mode. An eventually deprecate the enhanced mode.
If you think it's less confusing for users I don't see an issue with marking enhanced mode as deprecated right now with these changes. After all, all the extra stuff for versioning is not deprecated anyways because they belonged to versioning V2, and the stats part is being covered for unversioned. (stats for versioned V3 can follow later).
There was a problem hiding this comment.
Given that, we are thinking to double down on the default/legacy mode. An eventually deprecate the enhanced mode.
Ok, this was the confusing point. I don't think this is clear from the API that enhanced actually may have less features than non-enhanced and that we don't expect enhanced to be used (and that we will deprecate it).
If you think it's less confusing for users I don't see an issue with marking enhanced mode as deprecated right now with these changes.
👍 IMO if we are deprecating enhanced mode, we should do it now. It will be way less confusing to all readers of this API.
There was a problem hiding this comment.
Alright, I can add the deprecation comments and annotations 👍
| // Only set in `ENHANCED` mode. | ||
| map<string, temporal.api.taskqueue.v1.TaskQueueVersionInfo> versions_info = 3; | ||
|
|
||
| // [DEFAULT MODE ONLY] |
There was a problem hiding this comment.
So we won't let an enhanced user get this basic versioning info? Is it some kind of aggregated form of versions_info?
There was a problem hiding this comment.
This also boils down to task queue vs task queue family discussion. versioning_info holds the info about what is the current/ramping version for routing tasks of this task queue.
Per discussion in the versioning crew, involving Max, it was decided that each task queue type can have potentially different versioning config. so we cannot return a single versioning_info value in enhanced mode which is potentially asking for multiple types.
There was a problem hiding this comment.
It's confusing that you get more enhanced versioning information if you don't use enhanced mode. We should deprecate enhanced mode if it's actually no longer the enhanced form with enhanced information.
d08c854 to
999ca6d
Compare
| string namespace = 1; | ||
| // Sticky queues are not supported in `ENHANCED` mode. | ||
|
|
||
| // Deprecated. |
There was a problem hiding this comment.
Can we add some clarification here that the enhanced mode itself is deprecated too?
999ca6d to
adb3589
Compare
| // Consult the documentation for each field to understand which mode it is supported in. | ||
| temporal.api.enums.v1.DescribeTaskQueueMode api_mode = 5 [deprecated = true];; | ||
|
|
||
| // [BOTH MODES] |
There was a problem hiding this comment.
To make this more readable, what do you think about moving all the fields that apply to default mode up to here, and all the ones that apply enhanced only (i.e. the deprecated ones) down at the bottom? So a reader can see the current recommended api all in once place and the deprecated stuff is all at the bottom (except for api_mode itself). Actually maybe api_mode should also be at the bottom, as the first deprecated field?
cretz
left a comment
There was a problem hiding this comment.
The deprecated enhanced form clears things up, thanks!
| // Deprecated, use `report_stats` instead. | ||
| // If true, the task queue status will be included in the response. | ||
| bool include_task_queue_status = 4 [deprecated = true]; |
There was a problem hiding this comment.
move this one down below as well?
| // Only set in `ENHANCED` mode. | ||
| map<string, temporal.api.taskqueue.v1.TaskQueueVersionInfo> versions_info = 3; | ||
| // Statistics for the task queue. Only populated when `report_stats` is set to true in the request. | ||
| temporal.api.taskqueue.v1.TaskQueueStats task_queue_stats = 5; |
There was a problem hiding this comment.
Can we name this just stats? The task_queue_ is implied by context
e59966c to
9a6ee84
Compare
_**READ BEFORE MERGING:** All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted._ <!-- Describe what has changed in this PR --> **What changed?** Added `TaskQueueStats` field to `DescribeTaskQueueResponse`. <!-- Tell your future self why have you made these changes --> **Why?** To provide non-versioning users with relevant task queue stats. PS: The long-term plan is to deprecate the ENHANCED mod fields. <!-- Are there any breaking changes on binary or code level? --> **Breaking changes** N/A <!-- If this breaks the Server, please provide the Server PR to merge right after this PR was merged. --> **Server PR** temporalio/temporal#7816
READ BEFORE MERGING: All PRs require approval by both Server AND SDK teams before merging! This is why the number of required approvals is "2" and not "1"--two reviewers from the same team is NOT sufficient. If your PR is not approved by someone in BOTH teams, it may be summarily reverted.
What changed?
Added
TaskQueueStatsfield toDescribeTaskQueueResponse.Why?
To provide non-versioning users with relevant task queue stats.
PS: The long-term plan is to deprecate the ENHANCED mod fields.
Breaking changes
N/A
Server PR
temporalio/temporal#7816