Queue TTFT Latency Scorer#188

Draft
Mohammad-nassar10 wants to merge 7 commits into
llm-d:mainfrom
Mohammad-nassar10:queue-ttft-scorer

Conversation

@Mohammad-nassar10 Mohammad-nassar10 commented Jun 22, 2026

Contributor

What type of PR is this?

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #54.

Release note (write NONE if no user-facing change):

NONE

Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
@Mohammad-nassar10 Mohammad-nassar10 marked this pull request as draft June 22, 2026 11:43
@github-actions github-actions Bot added the size/XL (Denotes a PR that changes 500-999 lines, ignoring generated files.) and do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) labels Jun 22, 2026
@github-actions

⚠️ Large PR detected

Your PR is large. Please consider breaking it into multiple PRs.

The do-not-merge/hold label has been added and can be removed by the reviewers based on their judgement.

Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
@nirrozenbaum nirrozenbaum removed the do-not-merge/hold (Indicates that a PR should not merge because someone has issued a /hold command.) label Jun 22, 2026
Comment thread examples/plot_capacity.py
@@ -0,0 +1,141 @@
"""

Collaborator

Is this file related to the PR?

Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Signed-off-by: Mohammad <mohammad.nassar@ibm.com>
Labels

size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Feature]: Latency-Based Model Routing

2 participants