You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Run modern hybrid/MoE LLMs correctly and fast on cheap old Tesla P100 / GTX 1080 Ti cards. Fork of ik_llama.cpp: clean concurrent (np>1) Gated-DeltaNet hybrid decoding + Pascal sm_60 FP16 build tuning + built-in fan-out decomposer.