smolvlm

A benchmark suite for lightweight generative multimodal Vision-Language Models, comparing ViLT and SmolVLM under resource-constrained inference environments. Demonstrates CPU-only deployment, model evaluation, and multimodal reasoning with images and text, highlighting practical GenAI engineering for real-world applications.

python ai ml vqa gradio multimodal-deep-learning huggingface-transformers vilt generativeai visionlanguagemodel smolvlm

Updated Dec 24, 2025
Python

kucingcoder / miramo

A Flask-based web app for managing multimodal datasets (text and images) with CRUD operations via SQLite and seamless export as a structured Parquet dataset to the Hugging Face Hub.

llama datasets bert vlm multimodal huggingface llm llm-training smolvlm

Updated Jul 23, 2025
HTML

omkarsoak / VLM-Receipt-OCR

Receipt OCR using Fine-tuned VLMs

fine-tuning vision-language-models smolvlm

Updated Oct 14, 2025
Jupyter Notebook

uninterruptedpowersupply3-NEW / Sigma-Captioner

A somewhat optimized implementation of some lightweight and popular models

git gui optimized vqa tagger dataset-generation clip blip visual-question-answering caption-generation huggingface-transformers llava moondream vision-language-models florence-2 smolvlm

Updated Sep 12, 2025
Python

mohsine92 / LlamaLens

Vision-Language Model (VLM) for real-time video analysis and description via webcam.

computer-vision analysis llama vlm smolvlm

Updated Nov 20, 2025
JavaScript

smolvlm

Here are 23 public repositories matching this topic...

jamjamjon / usls

lucasjinreal / Namo-R1

kiranbaby14 / TalkMateAI

yakhyo / smolvlm-realtime-webcam-vllm

stlin256 / VLM4Classification

mvish7 / AlignVLM

iBz-04 / reeltek

snnclsr / chatgpt-from-scratch

Qengineering / SmolVLM2-2B-NPU

stlin256 / SmolVLM_with_LLM

Qengineering / SmolVLM2-256M-NPU

gabrielSantosLima / vlm_garbage_classification

mrgehlot / object_detection_using_vllm

Qengineering / SmolVLM2-500M-NPU

CasualEngineerZombie / smolvlm-realtime-face

Kumaran-Elumalai / nextgen-multimodal-generative-vlm-evaluation-suite

kucingcoder / miramo

omkarsoak / VLM-Receipt-OCR

uninterruptedpowersupply3-NEW / Sigma-Captioner

mohsine92 / LlamaLens
