kiwi3shark

Popular repositories

  1. plasma Public

    Forked from ray-project/plasma

    A minimal shared memory object store design

    C

  2. ThunderKittens Public

    Forked from HazyResearch/ThunderKittens

    Tile primitives for speedy kernels

    Cuda

  3. tiny-llm Public

    Forked from skyzh/tiny-llm

    (🚧 WIP) a course of LLM inference serving on Apple Silicon for systems engineers.

    Python

  4. tlb_shootdowns Public

    Forked from bitcharmer/tlb_shootdowns

    C

  5. SageAttention Public

    Forked from thu-ml/SageAttention

    [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

    Cuda

  6. mini-sglang Public

    Forked from sgl-project/mini-sglang

    A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

    Python