Summary
Gemma 4 (model_type: gemma4) was introduced in transformers >= 5.5.0, but llm-compressor currently pins transformers>=4.56.1,<=4.57.6 in setup.py. This means Gemma 4 models cannot be loaded or quantized without manually overriding the transformers version.
Steps to reproduce
from transformers import AutoModelForImageTextToText
model = AutoModelForImageTextToText.from_pretrained("google/gemma-4-E4B-it", dtype="auto")
With the pinned transformers version, this fails because gemma4 is not a recognized model type.
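Until the pin is lifted, a fail-fast version guard makes the failure mode explicit before the model load is attempted. This is a sketch; the 5.5.0 threshold is taken from this report, and in practice you would pass `importlib.metadata.version("transformers")` to the check.

```python
# Sketch: detect whether the installed transformers predates the
# gemma4 model_type (introduced in 5.5.0 per this report).

def supports_gemma4(transformers_version: str) -> bool:
    """True if the version string is numerically >= 5.5.0."""
    major, minor = (int(p) for p in transformers_version.split(".")[:2])
    return (major, minor) >= (5, 5)

# Under the current pin (<=4.57.6), the check fails:
print(supports_gemma4("4.57.6"))  # → False
print(supports_gemma4("5.5.0"))   # → True
```

Comparing integer tuples rather than raw strings avoids the classic pitfall where `"5.10" < "5.5"` lexicographically.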
Current workaround
Install llm-compressor with --no-deps and force-install transformers from git main. A Dockerfile demonstrating this workaround is included in PR #2561.
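The workaround can be sketched as the following two commands. The package name and git URL are the standard ones, but verify the exact invocation against the Dockerfile in PR #2561; note that --no-deps also skips llm-compressor's other dependencies, which must then be installed separately.

```shell
# Install llm-compressor without its pinned dependencies,
# then force transformers from git main (which has gemma4 support).
pip install --no-deps llmcompressor
pip install git+https://github.com/huggingface/transformers.git
```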
Suggested fix
Bump the transformers upper bound in setup.py to allow transformers >= 5.5.0 once compatibility is verified across the codebase.
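A sketch of what the relaxed pin might look like in setup.py; the exact bounds (in particular the <6.0 cap) are assumptions to be settled by compatibility testing, and the elided fields stand in for the rest of the existing setup() call.

```python
# setup.py (fragment, sketch): relax the upper bound so transformers 5.5+
# is installable. Bounds shown here are illustrative, not verified.
from setuptools import setup

setup(
    # ... existing metadata unchanged ...
    install_requires=[
        "transformers>=4.56.1,<6.0",  # was: transformers>=4.56.1,<=4.57.6
        # ... other dependencies unchanged ...
    ],
)
```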
Related