needle-in-a-haystack

Here are 4 public repositories matching this topic...

Top-K sparse attention has no critical key budget: a 4× swing of k_eff barely moves long-context retrieval accuracy across three models (Llama-1B, Llama-3B, Qwen2.5-3B). The limiting factor is the base model's ability to disambiguate, not the compressor. Includes the paper, raw per-prompt logs, and pre-registrations. Selection is exact; the kernel port is validated bitwise.

  • Updated Jun 2, 2026
  • TeX
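
For readers unfamiliar with the term, top-K sparse attention can be sketched as follows. This is a minimal illustration and not the repository's implementation: the function names `softmax` and `topk_attention` are hypothetical, the parameter `k` is only a stand-in for the key budget the description calls k_eff, and the sketch handles a single query in pure Python rather than a batched GPU kernel.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def topk_attention(query, keys, values, k):
    """Single-query attention restricted to the k highest-scoring keys.

    Hypothetical helper for illustration only. Returns the attended
    output vector and the sorted indices of the keys that were kept.
    """
    d = len(query)
    # Scaled dot-product score of the query against every key.
    scores = [
        sum(q * kj for q, kj in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    # Clamp the budget to a valid range, then keep the top-k keys exactly.
    k = max(1, min(k, len(keys)))
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax is taken over the kept keys only; the rest get zero weight.
    weights = softmax([scores[i] for i in kept])
    dim_v = len(values[0])
    out = [
        sum(w * values[i][c] for w, i in zip(weights, kept))
        for c in range(dim_v)
    ]
    return out, sorted(kept)


# Toy "needle" setup: key 1 matches the query far better than the others.
query = [1.0, 0.0]
keys = [[0.1, 0.0], [5.0, 0.0], [0.2, 0.0], [0.0, 1.0]]
values = [[0.0], [1.0], [0.0], [0.0]]

out_sparse, kept = topk_attention(query, keys, values, k=2)
out_dense, _ = topk_attention(query, keys, values, k=len(keys))
```

In this toy case the needle key dominates the scores, so shrinking the budget from four keys to two changes the output only slightly. That is the kind of insensitivity to the key budget the description reports, though the toy numbers here say nothing about the actual models.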
