Inference engine for Intel devices. Serve LLMs, VLMs, Whisper, Kokoro-TTS, Embedding and Rerank models over OpenAI endpoints.
Minimal code to run an LLM chatbot from the Hugging Face Hub with OpenVINO
Optimized Climate Sentiment Classification Pipeline
Jupyter notebook for LLM compression via quantization (INT8, INT4, FP16) and evaluation with metrics such as ROUGE and BLEU, enabling efficient LLM optimization.
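To illustrate the evaluation side of such a pipeline, ROUGE-1 recall (one of the metrics mentioned) measures the fraction of reference unigrams recovered by a candidate summary. A minimal hand-rolled sketch — not the API of any particular ROUGE library — looks like this:

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: overlapping unigrams / reference unigrams."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each reference token counts at most as often
    # as it appears in the candidate.
    overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

print(rouge1_recall("the cat sat on the mat", "the cat lay on the mat"))
```

In a real compression workflow you would compute this between outputs of the full-precision and the quantized model to check how much quality the INT8/INT4 conversion costs.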