feat: update to llm-d v0.5.0 spans patch #99
Closed
starpit wants to merge 1443 commits into
Signed-off-by: Elvir Crncevic <elvircrn@gmail.com>
…#31739) Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Co-authored-by: Benjamin Chislett <chislett.ben@gmail.com> Co-authored-by: Benjamin Chislett <bchislett@nvidia.com>
…y on ROCm (#31820) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
… Tests (Nemotron) (#31835) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: angelayi <yiangela7@gmail.com> Signed-off-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
…LLM_FLOAT32_MATMUL_PRECISION` from all ROCm tests (#31829) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
… (#29197) Signed-off-by: cmartinez <cmartinez@roblox.com> Signed-off-by: vSeamar <cmartinez@roblox.com>
Signed-off-by: Lucas Kabela <lucaskabela@meta.com>
Signed-off-by: ApostaC <yihua98@uchicago.edu>
Signed-off-by: 赵策 <alcor@mac.mynetworksettings.com> Signed-off-by: Alcor <alcor_zhao@outlook.com> Co-authored-by: 赵策 <alcor@mac.mynetworksettings.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…ber (#31855) Signed-off-by: Tyler Michael Smith <tlrmchlsmth@gmail.com>
…ound (#31465) Signed-off-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Zhuohao Yang <zy242@cornell.edu> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…kend (#31816) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
…omputed_tokens_cpu` CommonAttentionMetadata properties (#31850) Signed-off-by: Lucas Wilkinson <lwilkins@redhat.com>
Signed-off-by: tjp_zju <tanjianpingzju1990@gmail.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
Signed-off-by: Tianshu Yu <tianshuyu.formal@gmail.com>
Signed-off-by: Hollow Man <hollowman@opensuse.org>
…n (#31106) Signed-off-by: c0de128 <kevin.mckay@outlook.com> Signed-off-by: Kevin McKay <kevin@example.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
…oe (#31676) Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Wei-Yu Lin <weiyulin@google.com> Signed-off-by: weiyu <62784299+weiyu0824@users.noreply.github.com>
Signed-off-by: sihao.li <sihao.li@intel.com>
…termediate_tensors` in GLM-4-MoE (#31869) Signed-off-by: maang <maang_h@163.com>
…30593) Signed-off-by: BlankR <hjyblanche@gmail.com> Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
Signed-off-by: Andy Liu <andyliu@roblox.com>
…unc` (#31880) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com>
Signed-off-by: JaredforReal <w13431838023@gmail.com> Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…improve flow (#31962) Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…ny tests (#32040) Signed-off-by: Divakar Verma <divakar.verma@amd.com>
Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Nick Hill <nickhill123@gmail.com>
…127) Signed-off-by: Nick Hill <nickhill123@gmail.com>
Signed-off-by: cjackal <44624812+cjackal@users.noreply.github.com>
…)` (#31956) Signed-off-by: Sanghoon Yoon <seanyoon@kakao.com>
Signed-off-by: Andrew Bennett <potatosaladx@meta.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
…32099) Signed-off-by: Andreas Karatzas <akaratza@amd.com>
…243) Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Martin Hickey <martin.hickey@ie.ibm.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…of the Qwen3-Next-80B-A3B-Thinking under rocm_atten (#32336) Signed-off-by: vllmellm <vllm.ellm@embeddedllm.com> (cherry picked from commit e27078e)
Signed-off-by: Doug Lehr <douglehr@amd.com> Co-authored-by: Doug Lehr <douglehr@amd.com> (cherry picked from commit c5891b5)
…n ROCm (#32413) Signed-off-by: ganyi <ygan@amd.com> (cherry picked from commit 77c16df)
Signed-off-by: mgoin <mgoin64@gmail.com> (cherry picked from commit 1be5a73)
…2264 Signed-off-by: Kevin H. Luu <khluu000@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> (cherry picked from commit bcf2333)
Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
…enable release image building for CUDA 13.0 (#31032) (cherry picked from commit 8e61425)
…ly image (#32522) Signed-off-by: Shengqi Chen <harry-chen@outlook.com> (cherry picked from commit 965765a)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> (cherry picked from commit 8ebf271)
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn> (cherry picked from commit 444e2e7)
Signed-off-by: NickLucche <nlucches@redhat.com> (cherry picked from commit ea6102b)
…wheels.sh (#32971) Signed-off-by: Shengqi Chen <harry-chen@outlook.com> (cherry picked from commit 136c499)
Apply the spans (block-attention) patch from llm-d v0.5.0, which adds support for span semantics with fused RoPE in the Triton attention kernel.

The patch originates from llm-d v0.5.0: https://github.com/llm-d/llm-d/blob/v0.5.0/docker/Dockerfile.cuda#L67

Applied on top of upstream vllm commit d7de043.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Nick Mitchell <nickm@us.ibm.com>
Sorry, please ignore this bogus PR.