A curated list of plugins and extensions built on top of vLLM: hardware backends, custom model integrations, and general plugins that extend vLLM without forking the core.
PRs welcome! See Contributing for how to add new plugins.
These plugins live under the vLLM hardware plugin RFC and are listed in the official installation docs.
Note: The table above is intentionally kept aligned with the hardware plugin list in the official vLLM docs. Check the vLLM installation page for the latest status and compatibility notes.
Plugins that expose additional accelerators or platform backends using `vllm.platform_plugins` (or that act as de facto hardware integrations).
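As a rough illustration of what a platform plugin's entry point can look like: according to the vLLM plugin system docs, a function registered under `vllm.platform_plugins` returns the platform class's fully qualified name when the platform is supported in the current environment, and `None` otherwise. The names below (`my_accel_runtime`, `vllm_my_accel.platform.MyAccelPlatform`) are hypothetical placeholders, and the importability check is only an assumption about how a backend might detect its runtime.

```python
import importlib.util
from typing import Optional

# Hypothetical names, used only for illustration.
RUNTIME_MODULE = "my_accel_runtime"
PLATFORM_CLASS = "vllm_my_accel.platform.MyAccelPlatform"


def register(runtime_module: str = RUNTIME_MODULE) -> Optional[str]:
    """Entry-point function for the `vllm.platform_plugins` group.

    Returns the platform class's fully qualified name when the (hypothetical)
    vendor runtime is importable, and None otherwise.
    """
    if importlib.util.find_spec(runtime_module) is None:
        return None
    return PLATFORM_CLASS
```

Returning a string rather than the class itself means the heavy platform module is only imported when it is actually selected.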
Plugins that register new model architectures or large out-of-tree projects using `vllm.general_plugins` (or similar).
| Model / Architecture | Name | Repository / Distribution | Notes |
|---|---|---|---|
| Diffusion LMs (LLaDA, Dream) | diffuplug | | Registers diffusion LMs such as LLaDA and Dream with vLLM. Implements custom samplers (e.g. LLaDASampler) and adapters so diffusion language models can run through vLLM's engine. |
| OpenVLA (Vision-Language-Action) | openvla-vllm | | vLLM plugin for OpenVLA that accelerates offline batch inference for vision-language-action models, without needing custom compilation. |
If you know of other model-registration plugins (e.g. for custom multimodal architectures), please open a PR and add them here!
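For orientation, a model-registration plugin usually boils down to one small function exposed under the `vllm.general_plugins` entry-point group. The sketch below follows the pattern from the vLLM docs on registering out-of-tree models (`ModelRegistry.get_supported_archs` and `ModelRegistry.register_model`). The architecture name and module path are hypothetical, and the optional `registry` argument is an addition made here so the function can be exercised without vLLM installed.

```python
# Hypothetical architecture name and "module:Class" path, for illustration only.
ARCH_NAME = "MyModelForCausalLM"
MODEL_PATH = "my_vllm_plugin.modeling:MyModelForCausalLM"


def register(registry=None):
    """Entry-point function for the `vllm.general_plugins` group."""
    if registry is None:
        # Imported lazily so that importing this module stays cheap.
        from vllm import ModelRegistry as registry

    # Plugins may be loaded in several processes, so keep this idempotent.
    if ARCH_NAME not in registry.get_supported_archs():
        # Passing a "module:Class" string defers importing the model code.
        registry.register_model(ARCH_NAME, MODEL_PATH)
```

The package would then declare something like `register_my_model = my_vllm_plugin:register` (again a hypothetical name) under `vllm.general_plugins` in its packaging metadata.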
Plugins that implement custom decoding strategies, sampling methods, and logits processors for advanced text generation.
These decoder plugins are part of the BudEcosystem vLLM Plugins collection.
Plugins that provide inference optimizations, speculative decoding, parallelism strategies, and performance enhancements for vLLM.
Plugins that focus on features, scheduling, logging, or IO processing rather than new hardware or models. Many of these are small code packages that hook into:
- `vllm.general_plugins`: for patches and model registration.
- `vllm.io_processor_plugins`: for custom request pre/post-processing.
- `vllm.stat_logger_plugins`: for metrics exporters and custom stats.
Examples you might encounter:
- Custom scheduling / patch packages
  - Blog & reference implementation: "vLLM Custom Patches" style plugins that add features like priority-based scheduling via a general plugin, typically wiring to `VLLM_PLUGINS` or custom env vars (e.g. `VLLM_CUSTOM_PATCHES`).
- Logging / metrics plugins
  - Packages that register stat loggers via `vllm.stat_logger_plugins` to export metrics to systems like Prometheus, custom APM, or internal dashboards.
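To make the "custom patches" idea concrete, here is a toy sketch of a general plugin whose `register()` applies only the patches named in an environment variable. It assumes `VLLM_CUSTOM_PATCHES` holds a comma-separated list of patch names, which may not match the referenced implementation. The `Scheduler` class is a stand-in and not vLLM's scheduler, and `PATCHES`, `patch`, and `priority_scheduling` are hypothetical names.

```python
import os

# Hypothetical registry of patch names -> functions that apply them.
PATCHES = {}


def patch(name):
    """Decorator that records a patch function under a name."""
    def decorator(func):
        PATCHES[name] = func
        return func
    return decorator


class Scheduler:
    """Stand-in for a class a real plugin would patch; not vLLM's scheduler."""
    policy = "fcfs"


@patch("priority_scheduling")
def enable_priority_scheduling():
    Scheduler.policy = "priority"


def register(environ=os.environ):
    """General-plugin entry point: apply only the patches named in the env var."""
    # Assumption: VLLM_CUSTOM_PATCHES is a comma-separated list of patch names.
    requested = environ.get("VLLM_CUSTOM_PATCHES", "")
    applied = []
    for name in (n.strip() for n in requested.split(",")):
        if name in PATCHES:
            PATCHES[name]()
            applied.append(name)
    return applied
```

Keeping each patch behind a name makes it possible to ship several in one package and enable them selectively per deployment, instead of maintaining a fork.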
Most of these are still emerging and tend to be internal to companies. This section is intentionally left open for community contributions.
- vLLM Plugin System (design doc)
- vLLM `vllm.plugins` API reference
- Registering out-of-tree models
- Hardware-pluggable RFC & hardware plugins section in vLLM docs
- vLLM Plugin System (official docs): https://docs.vllm.ai/en/stable/design/plugin_system/
- vLLM Installation, Hardware Plugins section: https://docs.vllm.ai/en/latest/getting_started/installation/#hardware-plugins
- vLLM Ascend Plugin docs: https://docs.vllm.ai/projects/ascend/en/latest/
- vLLM Gaudi Plugin docs: https://docs.vllm.ai/projects/gaudi/en/latest/dev_guide/plugin_system.html
- vLLM Plugin System (official blog): https://blog.vllm.ai/2025/11/20/vllm-plugin-system.html
- Introducing vLLM Hardware Plugin - Ascend NPU (official blog): https://blog.vllm.ai/2025/05/12/hardware-plugin.html
- How to Build vLLM Plugins: deep dive into plugin lifecycle, plugin groups, and best practices for writing general & platform plugins. https://blog.budecosystem.com/how-to-build-vllm-plugins-a-comprehensive-developer-guide-with-tips-and-best-practices/
- How to Make Clean, Maintainable Modifications to vLLM Using the Plugin System: practical guide showing how to replace forking/monkey-patching with a general plugin that manages patches (e.g. priority-based scheduling). https://www.xugj520.cn/en/archives/customize-vllm-plugin-system-guide.html
Spotted a new plugin, blog post, or tutorial?
- Add an entry under the appropriate section:
  - Include:
    - Name
    - Short one-line description
    - Link (GitHub, docs, PyPI, or HF card)
    - Plugin type (general / platform / IO / stats / etc.)
- Sort alphabetically within each subsection.
- Open a PR with a brief explanation:
  - What does the plugin do?
  - Which plugin group(s) does it register under?
  - Minimum vLLM version (if known).
This repo aims to list plugins, not generic vLLM forks or random scripts. As a rule of thumb: if it doesn't register entry points under a `vllm.*_plugins` group, it probably doesn't belong here.
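One way to apply this rule of thumb to an installed package is to inspect its entry points with the standard library. This is a small sketch, with helper names invented for this example; it only sees distributions installed in the current environment.

```python
from fnmatch import fnmatch
from importlib.metadata import PackageNotFoundError, distribution

PLUGIN_GROUP_PATTERN = "vllm.*_plugins"


def plugin_groups(group_names):
    """Return the sorted, de-duplicated groups that match vllm.*_plugins."""
    return sorted({g for g in group_names if fnmatch(g, PLUGIN_GROUP_PATTERN)})


def looks_like_vllm_plugin(dist_name):
    """True if the installed distribution registers any vllm.*_plugins entry point."""
    try:
        dist = distribution(dist_name)
    except PackageNotFoundError:
        return False
    return bool(plugin_groups(ep.group for ep in dist.entry_points))
```

For example, `looks_like_vllm_plugin("some-candidate-package")` returns `False` when the package is not installed or registers no matching entry-point group.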
⭐ Star this repo if you find it useful! ⭐