Integrate Engram into custom model #3183

Open
RissyRan wants to merge 1 commit into main from new_engram_integration

RissyRan (Collaborator) commented Feb 18, 2026

Description

Integrate the Engram feature into a custom model.

  • Add Engram configs to base.yml and type.py, and integrate them with the deepseek-custom model.
  • This PR supports the unscanned version of Engram; the scanned version will follow in the next PR.
  • I tried initializing the hash map at the model level so that it is initialized only once, but ran into various JAX initialization errors (see b/478294699 and this PR for details). I also added a comment there.
  • For now, to make this work, the code mixes jnp and np: some operations run on the CPU to avoid ConcretizationTypeError and other issues.
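
To illustrate the last point, here is a minimal sketch of the general jnp/np split, not the code in this PR. The names `VOCAB_SIZE`, `NUM_BUCKETS`, `HASH_TABLE`, and `lookup_buckets`, and the toy hash itself, are assumptions made up for the example. The idea is that anything needing concrete values is computed once with NumPy on the host, and only the lookup on traced inputs uses jnp inside the jitted function.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Hypothetical sizes and hash, for illustration only; not Engram's real ones.
VOCAB_SIZE = 16
NUM_BUCKETS = 5

# Host side (NumPy): concrete values computed once on the CPU, outside tracing.
HASH_TABLE = ((np.arange(VOCAB_SIZE) * 31) % NUM_BUCKETS).astype(np.int32)


@jax.jit
def lookup_buckets(token_ids):
  # Device side (jnp): the NumPy table is baked in as a constant, and only
  # the gather on the traced token_ids runs inside the compiled function.
  table = jnp.asarray(HASH_TABLE)
  return table[token_ids]


buckets = lookup_buckets(jnp.array([0, 6, 12]))
print(buckets)
```

Applying NumPy functions to `token_ids` inside `lookup_buckets` would fail instead, because NumPy needs concrete values and `token_ids` is a tracer during compilation.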

Tests

  • Expect GitHub runners to pass
  • Unit tests still pass for Engram: link
  • End-to-end unscanned training for the custom model: link
  • Sanity check for DS v2 (no impact expected)

Before change:

I0218 19:13:51.139775 139970093350464 max_utils.py:697] 	Using (GB) 43.98 / 95.74 (45.936912%) on TPU_1(process=0,(1,0,0,0))
I0218 19:15:18.498541 139970093350464 metric_logger.py:181] completed step: 19, seconds: 4.366, TFLOP/s/device: 123.145, Tokens/s/device: 7505.830, total_weights: 131072, loss: 8.135

After change:

I0218 19:05:35.490278 140385131265600 max_utils.py:697] 	Using (GB) 43.99 / 95.74 (45.947357%) on TPU_0(process=0,(0,0,0,0))
I0218 19:07:02.849351 140385131265600 metric_logger.py:181] completed step: 19, seconds: 4.366, TFLOP/s/device: 123.144, Tokens/s/device: 7505.806, total_weights: 131072, loss: 8.135
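
One way to make the "no impact" comparison explicit is to parse the two "completed step" lines and compute relative differences per metric. This is only a sketch: the helper names `parse_metrics` and `relative_deltas` are hypothetical, and the log lines are trimmed to their metric fields.

```python
import re

# The two "completed step" lines quoted above, trimmed to the metric fields.
BEFORE = ("completed step: 19, seconds: 4.366, TFLOP/s/device: 123.145, "
          "Tokens/s/device: 7505.830, total_weights: 131072, loss: 8.135")
AFTER = ("completed step: 19, seconds: 4.366, TFLOP/s/device: 123.144, "
         "Tokens/s/device: 7505.806, total_weights: 131072, loss: 8.135")


def parse_metrics(line):
  """Return {name: float} for every 'name: number' pair in a log line."""
  pairs = re.findall(r"([^,:]+):\s*([-\d.]+)", line)
  return {name.strip(): float(value) for name, value in pairs}


def relative_deltas(before, after):
  """Return the relative change of each nonzero metric between two lines."""
  b, a = parse_metrics(before), parse_metrics(after)
  return {k: abs(a[k] - b[k]) / abs(b[k]) for k in b if b[k] != 0}


deltas = relative_deltas(BEFORE, AFTER)
worst = max(deltas, key=deltas.get)
print(f"largest relative change: {worst} ({deltas[worst]:.2e})")
```

For these two lines the largest relative change is in TFLOP/s/device and is well under 0.01%, while step time and loss are identical.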

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov bot commented Feb 18, 2026

Codecov Report

❌ Patch coverage is 94.00000% with 3 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/MaxText/layers/deepseek.py | 86.95% | 2 Missing and 1 partial ⚠️ |

@RissyRan RissyRan force-pushed the new_engram_integration branch from aa6e11c to 5c4f07b Compare February 18, 2026 20:24

@github-actions

🤖 Hi @RissyRan, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.
