Replies: 2 comments
-
KoboldCpp gets its model support from llama.cpp; you should make the request there.
-
Gotcha. I went to llama.cpp and found it's already more or less supported, or being worked on, likely among the 63 updates KoboldCPP is behind on: https://github.com/ggml-org/llama.cpp/pulls?q=llada Okay, I'll just hope KoboldCPP gets updated to support it soon :)
-
There are new diffusion models coming out that I would like to see KoboldCPP support. There are currently 8B and 100B models that, while experimental, look very promising.
To my understanding, it first determines the size of the reply, then does multiple passes, re-masking the sections that are likely low confidence (or random? Maybe fixed patterns like a checkerboard?) until the response is done. At present the AR (autoregressive) approach, one token at a time, doesn't allow for this.
The diffusion method has two main advantages: it is faster (fewer passes over the whole thing), and unlike AR models, if a bad word/token is generated early, it can be replaced with a better one in a subsequent pass (rather than the model being stuck with it and working around it, or becoming inconsistent later).
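To make the loop above concrete, here's a toy Python sketch of low-confidence re-masking: start from a fully masked reply of fixed length, fill every masked position each pass, then re-mask the fills the model was least confident about, with the re-mask budget shrinking to zero on the last pass. The mock model, confidence scores, and linear schedule are all made up for illustration; LLaDA's real sampler differs in the details.

```python
import random

MASK = "<MASK>"
VOCAB = ["the", "cat", "sat", "on", "mat"]

def mock_model(tokens, seed):
    # Hypothetical stand-in for one forward pass of a diffusion LM:
    # it proposes a token plus a confidence score for every position.
    rng = random.Random(seed)
    return [(rng.choice(VOCAB), rng.random()) for _ in tokens]

def diffusion_decode(length, steps=4, seed=0):
    """Iterative demasking with low-confidence re-masking (toy version)."""
    tokens = [MASK] * length
    for step in range(steps):
        proposals = mock_model(tokens, seed + step)
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Fill every masked position in parallel; committed tokens are
        # kept here, though a real model could also revise them (the
        # "fix a bad early token" advantage described above).
        for i in masked:
            tokens[i] = proposals[i][0]
        # Shrinking re-mask budget: hits zero on the final pass.
        n_remask = round(length * (1 - (step + 1) / steps))
        # Re-mask the freshly filled tokens with the lowest confidence.
        for i in sorted(masked, key=lambda j: proposals[j][1])[:n_remask]:
            tokens[i] = MASK
    return tokens
```

With `length=8` and `steps=4` this commits roughly a quarter of the reply per pass, which is where the speedup over one-token-at-a-time decoding comes from.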
https://llada.pro - LLaDa demo
https://huggingface.co/mradermacher/LLaDA-8B-Instruct-i1-GGUF