laylazaes-beep

Popular repositories

qwen3.6-speculative-decoding-rtx3090 Public

Benchmark speculative decoding performance for Qwen3.6-35B-A3B on an RTX 3090 GPU using llama.cpp to evaluate model throughput and structural regressions.