Hi, may I ask a few questions about plans on supporting GLM-4.7 models?
I used the DFLASH training code in SpecForge to train a draft model for GLM-4.7-FP8, and the training accuracy is higher than 0.9. However, when running inference on sglang (using the PR sgl-project/sglang#16818), the accept rate is only 1.x. Are there any other implementation changes needed for sglang inference with models other than Qwen? Also, may I ask about your official plans for supporting GLM-4.7-FP8?
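For context, this is roughly the kind of launch command I mean. The model paths are placeholders, and whether DFLASH is accepted as a `--speculative-algorithm` value (and which of the EAGLE-style speculative flags apply to it) is an assumption on my part:

```shell
# Sketch only: paths are placeholders, and the DFLASH algorithm value
# plus the applicable speculative flags are assumptions, not confirmed.
python -m sglang.launch_server \
  --model-path /path/to/GLM-4.7-FP8 \
  --speculative-algorithm DFLASH \
  --speculative-draft-model-path /path/to/my-dflash-draft \
  --speculative-num-steps 3 \
  --speculative-num-draft-tokens 4
```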
Thanks a lot!