Fall back to a built-in chat template by model_type by lixiangnlp · Pull Request #2 · Thump604/mlx-lm

lixiangnlp · 2026-04-30T01:06:48Z

Summary

When a checkpoint's tokenizer ships without a chat template (no
`chat_template` in `tokenizer_config.json`, no `chat_template.json` /
`chat_template.jinja` file, and no `chat_template_type` set), look up
`model_type` from `config.json` and use a built-in template module if
registered.

Currently registered:

| `model_type` | template module |
| --- | --- |
| `deepseek_v32` | `deepseek_v32` |
| `deepseek_v4` | `deepseek_v32` |

Behavior is unchanged when either `tokenizer.chat_template` or
`chat_template_type` is set, or when `model_type` is not in the registry.
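
The conditions above can be sketched as a small pure function. This is only an illustration of when the fallback applies, not the code in the PR: `fallback_template_type` and `REGISTRY` are hypothetical names, and the arguments stand in for values that `load()` would already have at hand.

```python
from typing import Optional

# Copy of the registry table above, for illustration only.
REGISTRY = {
    "deepseek_v32": "deepseek_v32",
    "deepseek_v4": "deepseek_v32",
}


def fallback_template_type(
    chat_template: Optional[str],
    explicit_type: Optional[str],
    model_type: Optional[str],
) -> Optional[str]:
    """Return a built-in template module name, or None if no fallback applies."""
    # Anything the checkpoint already provides wins; behavior is unchanged.
    if chat_template is not None or explicit_type is not None:
        return None
    # A missing or malformed model_type cannot match a registry entry.
    if not isinstance(model_type, str):
        return None
    # Unknown model types are left without a template, as before.
    return REGISTRY.get(model_type)
```

Returning `None` in every non-fallback case keeps the sketch neutral about how the existing code treats `chat_template` and `chat_template_type`, which this PR does not change.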

Motivation

`Thump604/DeepSeek-V4-Flash-MLX-Q2-mixed-gs128-affine`
ships a `tokenizer_config.json` without any `chat_template` and no
external `chat_template.jinja`, so `mlx_lm.chat` raises before generation:

```
ValueError: Cannot use chat template functions because tokenizer.chat_template
is not set and no template argument was passed!
```

Adding a `chat_template` to the upload would also fix this on the data side,
but the mlx_lm-side fallback covers any future V4 / V3.2 quant that is
distributed without one. The registry is also the natural place to reuse the
maintainer's own internal `deepseek_v32` Python template beyond the
`chat_template_type` opt-in.
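
To check whether a locally downloaded checkpoint is affected, something like the following could be used. It is a minimal sketch using only the standard library. `ships_chat_template` is a hypothetical helper, not part of mlx_lm, and it assumes `chat_template_type` is stored as a key in `tokenizer_config.json`.

```python
import json
from pathlib import Path


def ships_chat_template(model_dir) -> bool:
    """Return True if the checkpoint directory provides any chat template source."""
    model_dir = Path(model_dir)

    # Standalone template files next to the tokenizer.
    for name in ("chat_template.json", "chat_template.jinja"):
        if (model_dir / name).is_file():
            return True

    # Inline template or explicit template type in the tokenizer config.
    try:
        text = (model_dir / "tokenizer_config.json").read_text(encoding="utf-8")
        config = json.loads(text)
    except (OSError, ValueError):
        # Missing, unreadable, or invalid JSON: nothing usable was shipped.
        return False
    if not isinstance(config, dict):
        return False
    return bool(config.get("chat_template")) or bool(config.get("chat_template_type"))
```

A directory for which this returns `False` is the case where the `model_type` fallback would come into play.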

Changes

- New `_DEFAULT_CHAT_TEMPLATE_TYPE_BY_MODEL_TYPE` registry in
  `mlx_lm/tokenizer_utils.py`.
- New `_infer_chat_template_type(model_path)` helper that reads
  `config.json` and looks up `model_type`.
- The `chat_template_type` resolution in `load()` now falls through to
  the helper only when `tokenizer.chat_template is None` and the tokenizer
  config has no explicit `chat_template_type`.

Tests

Adds two cases to `tests/test_tokenizers.py`:

- `test_chat_template_falls_back_for_known_model_type`: `model_type=deepseek_v4`
  picks up the `deepseek_v32` template; `tokenizer.has_chat_template` is True.
- `test_chat_template_no_fallback_for_unknown_model_type`: unrecognized
  `model_type` leaves the tokenizer with no chat template (no behavior change).

```
$ python -m unittest tests.test_tokenizers.TestTokenizers.test_chat_template_falls_back_for_known_model_type \
tests.test_tokenizers.TestTokenizers.test_chat_template_no_fallback_for_unknown_model_type \
tests.test_tokenizers.TestTokenizers.test_unknown_model_config_tokenizer_fallback
...
Ran 3 tests in 0.004s
OK
```

Dependency

Pairs with #1. Without it, the wrapper's auto-injected `enable_thinking`
kwarg makes the registered template raise `TypeError` once the template is
actually invoked. The tests in this PR avoid that path by checking only the
identity of `tokenizer._chat_template`, so this PR is mergeable independently,
but real chat use needs both.

When a tokenizer has no chat_template (no `chat_template` field in tokenizer_config.json, no chat_template.json/.jinja, and no `chat_template_type` set), look up the model's `model_type` from config.json and use a built-in template module if registered. Currently maps `deepseek_v4` and `deepseek_v32` -> `deepseek_v32`.

Behavior is unchanged when either `chat_template` or `chat_template_type` is set, or when `model_type` is not in the registry.

This unblocks chat against quantized DeepSeek-V4 checkpoints whose upload omitted the chat template, e.g. huggingface.co/Thump604/DeepSeek-V4-Flash-MLX-Q2-mixed-gs128-affine.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

lixiangnlp wants to merge 1 commit into Thump604:deepseek-v4-support-fixes from lixiangnlp:chat-template-fallback-by-model-type
