Support for GLM-4.7-FP8 #12

@maodoudou168

Description

Hi, may I ask a few questions about plans for supporting GLM-4.7 models?
I used the DFLASH training code in SpecForge to train a draft model for GLM-4.7-FP8, and the accuracy is higher than 0.9. However, when running on sglang, the accept rate is only 1.x. I used this sglang PR: sgl-project/sglang#16818. Are there any other implementations needed for sglang inference with models that are not Qwen? May I also know your official plans for supporting GLM-4.7-FP8?
Thanks a lot!
