
Checkpoint conversion tool: Optimize to_maxtext & Onboard deepseek2/3/3.2 #3184

Draft

shuningjin wants to merge 15 commits into main from shuningjin-ckpt-opt3

Conversation


@shuningjin commented Feb 18, 2026

Description

Onboard the DeepSeek family

  • deepseek2-16b, deepseek3-671b, deepseek3.2-671b

Optimize to_maxtext

  • Use bfloat16 for both loading and saving (see the sketch after this list).
  • Reduces peak memory by half in all cases.
  • Speeds up conversion of large models: e.g., deepseek3-671b was previously impractical (loading alone took 11 hr); total conversion now takes 9 hr, with loading down to 4 min.
  • Broader support: models can be converted without HuggingFace modeling code, as long as safetensors weights are available (e.g., deepseek3.2).
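For illustration, here is a minimal sketch of what a streaming bfloat16 load can look like. It assumes HuggingFace-style sharded `*.safetensors` files and uses the standard `safetensors` API; `stream_checkpoint` and `convert_weight` are hypothetical names for illustration, not the actual API of this PR.

```python
# Minimal sketch, assuming HF-style sharded *.safetensors checkpoints.
import glob

import torch
from safetensors import safe_open


def stream_checkpoint(ckpt_dir: str):
    """Yield (name, tensor) pairs one at a time, cast to bfloat16."""
    for shard in sorted(glob.glob(f"{ckpt_dir}/*.safetensors")):
        with safe_open(shard, framework="pt") as f:
            for name in f.keys():
                # get_tensor reads a single tensor from disk, so the full
                # checkpoint is never materialized at once; bfloat16 halves
                # each tensor's footprint relative to float32.
                yield name, f.get_tensor(name).to(torch.bfloat16)


# Hypothetical usage (convert_weight is illustrative, not this PR's API):
# for name, tensor in stream_checkpoint("/path/to/deepseek3-671b"):
#     maxtext_params[name] = convert_weight(name, tensor)
```

Because this path only needs the safetensors files, it works even for models whose HuggingFace modeling code is unavailable, which is what enables the deepseek3.2 conversion noted above.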

Tests

Please describe how you tested this change, and include any instructions and/or commands to reproduce.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

