sm60

Here is 1 public repository matching this topic...

poisonxa16 / PXA_llama

Run modern hybrid/MoE LLMs correctly and fast on cheap old Tesla P100 / GTX 1080 Ti cards. Fork of ik_llama.cpp: clean concurrent (np>1) Gated-DeltaNet hybrid decoding + Pascal sm_60 FP16 build tuning + built-in fan-out decomposer.

pascal concurrency cuda moe homelab mixture-of-experts hybrid-models tesla-p100 llama-cpp local-llm llm-inference gguf speculative-decoding qwen3 gated-deltanet ik-llama gtx-1080-ti sm60

Updated Jun 7, 2026
Primary language: Shell


