Conversation
Code Review
This pull request adds a configuration file for the Qwen3-VL-235B model with Eagle3 and an offline training script. The feedback identifies that the rope_type should be set to mrope for multimodal rotary embeddings and that the training script should use the correct model path variable to load weights.
```json
        20,
        20
    ],
    "rope_type": "default"
```
For Qwen3-VL models, the rope_type should be set to mrope to correctly utilize multimodal rotary embeddings. The current default setting will cause the model to use standard Llama rotary embeddings, which are incompatible with the 3D rope indices used in VL models. The mrope_section and mrope_interleaved parameters are only active when rope_type is mrope in the LlamaAttention implementation.
```diff
-    "rope_type": "default"
+    "rope_type": "mrope"
```
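As a rough illustration of the reviewer's point, the `rope_scaling` block of such a config might look like the following once `rope_type` is switched to `mrope`. The `mrope_section` and `mrope_interleaved` values below are placeholders for illustration only, not the actual values from this PR:

```json
{
    "rope_scaling": {
        "rope_type": "mrope",
        "mrope_section": [24, 20, 20],
        "mrope_interleaved": true
    }
}
```

With `rope_type` left at `default`, these two multimodal parameters are simply ignored by the attention implementation, which is why the config silently falls back to standard Llama rotary embeddings.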
```shell
    --standalone \
    --nproc_per_node 1 \
    $ROOT_DIR/scripts/train_eagle3.py \
    --target-model-path "$MODEL_NAME" \
```
The --target-model-path in the training step should use $MODEL_PATH instead of "$MODEL_NAME". $MODEL_NAME is a descriptive string (e.g., Qwen3-VL-235B-A22B-Instruct-FP8), whereas $MODEL_PATH contains the actual path to the model weights required for loading embeddings and the LM head during offline training. This is consistent with the first torchrun command in this script.
```diff
-    --target-model-path "$MODEL_NAME" \
+    --target-model-path "$MODEL_PATH" \
```
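The distinction the reviewer draws can be sketched as follows. The variable values here are hypothetical stand-ins, not the actual paths from the PR:

```shell
# Sketch of the two variables the review distinguishes (values are hypothetical).
# MODEL_NAME is a human-readable label; MODEL_PATH is the on-disk weights location.
MODEL_NAME="Qwen3-VL-235B-A22B-Instruct-FP8"
MODEL_PATH="/data/models/${MODEL_NAME}"

# Offline training must load embeddings and the LM head from the weights,
# so the path variable, not the label, is what the script should pass:
TRAIN_ARG="--target-model-path ${MODEL_PATH}"
echo "$TRAIN_ARG"
```

Passing the label instead of the path would make the loader try to resolve a bare model name, which only works if the weights are discoverable under that name; using the explicit path keeps the second torchrun invocation consistent with the first one in the script.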
Motivation
This PR adds support assets for training Eagle3 on Qwen3-VL-235B by introducing:
Why
Qwen3-VL-235B needs a dedicated Eagle3 configuration and a reproducible offline example for training.
The example script was also updated to avoid machine-specific absolute paths in command arguments, which makes it easier to reuse across different environments and reduces configuration mistakes.
Modifications
Related Issues
Accuracy Test
Benchmark & Profiling
Checklist