Motivation
Feature: Add Qwen3.5 Support for Speculative Decoding Training (EAGLE3 / DFlash)
Background and Goal
This feature extends the current speculative decoding training framework to the Qwen3.5 series, so that draft models such as EAGLE3 and DFlash can be trained for, and integrated with, Qwen3.5 target models.
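Target models include (but are not limited to):
- Qwen/Qwen3.5-27B: https://huggingface.co/Qwen/Qwen3.5-27B
- Qwen/Qwen3.5-35B-A3B: https://huggingface.co/Qwen/Qwen3.5-35B-A3B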
Current Progress
Building on the current DFlash training code, I have already adapted and validated Qwen3.5 DFlash training in my personal verification repository. The work so far includes:
- Adapted to newer dependency interfaces, with compatibility updates for interface changes in newer versions of:
  - SGLang
  - transformers
  - Hugging Face target-model backend interfaces

  This includes handling differences in target model loading, forward outputs, and config parsing so that training runs stably.
- Added Qwen3.5-specific training configs and logic, including:
  - Target model configuration
  - Draft model configuration
  - Qwen3.5-compatible training flow / parameter handling
  - DFlash adaptation logic (e.g., training pipeline integration)
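To make the compatibility work on config parsing and forward outputs more concrete, here is a minimal sketch of the kind of shim involved. It is illustrative only and is not the code from the verification repository: the helper names (`resolve_text_config`, `get_config_value`, `extract_hidden_states`) are hypothetical, and the assumption that some configs nest their language-model fields under a `text_config` attribute is mine, not something confirmed for Qwen3.5 here.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def resolve_text_config(config: Any) -> Any:
    """Return the nested text config if present, else the config itself.

    Assumption: some target-model configs keep language-model fields
    under a `text_config` attribute, while others keep them at top level.
    """
    nested = getattr(config, "text_config", None)
    return nested if nested is not None else config


def get_config_value(config: Any, names: Sequence[str], default: Any = None) -> Any:
    """Return the first attribute in `names` that is set on the text config."""
    text_config = resolve_text_config(config)
    for name in names:
        value = getattr(text_config, name, None)
        if value is not None:
            return value
    return default


def extract_hidden_states(outputs: Any) -> tuple:
    """Normalize forward outputs (dict-like or attribute-style) to a tuple."""
    if isinstance(outputs, dict):
        hidden = outputs.get("hidden_states")
    else:
        hidden = getattr(outputs, "hidden_states", None)
    if hidden is None:
        raise ValueError("forward outputs carry no hidden_states")
    return tuple(hidden)


# Stand-in objects for illustration; real configs and outputs come from the backend.
flat_config = SimpleNamespace(hidden_size=1024)
nested_config = SimpleNamespace(text_config=SimpleNamespace(hidden_size=2048))
flat_size = get_config_value(flat_config, ["hidden_size"])
nested_size = get_config_value(nested_config, ["hidden_size"])
```

Keeping these lookups in one place means the training loop reads `hidden_size` and hidden states the same way regardless of which dependency version produced them.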
I would like to complete this work and integrate Qwen3.5 speculative decoding training support for EAGLE3 / DFlash behind a unified training entry point and configuration workflow, which could later be extended to other speculative decoding training methods.
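As a rough illustration of what a unified training entry could mean, the sketch below dispatches on a method name taken from a single config object. Everything in it is hypothetical (`TrainConfig`, `register_trainer`, `run_training`, and the placeholder trainer bodies); it only shows the dispatch shape and is not a proposal for the final interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TrainConfig:
    method: str        # hypothetical field: "eagle3" or "dflash"
    target_model: str  # e.g. "Qwen/Qwen3.5-27B"


# Registry mapping a method name to its training function.
TRAINERS: Dict[str, Callable[[TrainConfig], str]] = {}


def register_trainer(name: str):
    """Decorator that registers a training function under `name`."""
    def decorator(fn: Callable[[TrainConfig], str]):
        TRAINERS[name] = fn
        return fn
    return decorator


@register_trainer("eagle3")
def train_eagle3(cfg: TrainConfig) -> str:
    # Placeholder: a real implementation would launch EAGLE3 training here.
    return f"eagle3 draft for {cfg.target_model}"


@register_trainer("dflash")
def train_dflash(cfg: TrainConfig) -> str:
    # Placeholder: a real implementation would launch DFlash training here.
    return f"dflash draft for {cfg.target_model}"


def run_training(cfg: TrainConfig) -> str:
    """Single entry point: look up the trainer for cfg.method and run it."""
    try:
        trainer = TRAINERS[cfg.method]
    except KeyError:
        raise ValueError(
            f"unknown method {cfg.method!r}; expected one of {sorted(TRAINERS)}"
        ) from None
    return trainer(cfg)


result = run_training(TrainConfig(method="dflash", target_model="Qwen/Qwen3.5-27B"))
```

With this shape, adding another speculative decoding method would only require registering one more function, leaving the entry point unchanged.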
Related resources
No response