models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized (1x8 mesh port) #206

@yieldthought

Description

Port this t3000 model from a 2x4 mesh to a 1x8 mesh (TP=8).

Target: models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized

Tasks:

  • Update models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py to set MESH_SHAPE = (1, 8) and adjust any mesh-axis/sharding assumptions for a 1x8 mesh.
  • Keep architecture, dtypes, and cache behavior unchanged.
  • Run demo + long eval:
    python demo.py models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py
    python eval.py models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py --model mistralai/Mistral-7B-Instruct-v0.3 --prompt_file prompts/bringup_eval_long.txt --max_new_tokens 100 --max_seq_len
  • Update MODELS.md for the t3000 row and save demo.log/eval.log under models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/.
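
The mesh change in the first task mainly affects how tensor dimensions divide across devices. Below is a minimal sketch of a divisibility check, not the repository's actual sharding code. It assumes the publicly listed Mistral-7B-Instruct-v0.3 config values (32 attention heads, 8 KV heads, hidden size 4096, intermediate size 14336) and assumes the tensor-parallel axis is the last mesh axis. The helper name `per_device_shards` is hypothetical and does not come from model.py.

```python
# Hypothetical helper for illustration; not part of the repository's model.py.
MESH_SHAPE = (1, 8)  # target mesh from this issue (TP=8)

# Assumed values taken from the public Mistral-7B-Instruct-v0.3 config.
CONFIG = {
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "hidden_size": 4096,
    "intermediate_size": 14336,
}


def per_device_shards(config, mesh_shape):
    """Return each dimension's per-device size for the given mesh.

    Assumption: the last mesh axis is the tensor-parallel axis.
    Raises ValueError if a dimension does not divide evenly.
    """
    tp = mesh_shape[-1]
    shards = {}
    for name, size in config.items():
        if size % tp != 0:
            raise ValueError(f"{name}={size} is not divisible by TP={tp}")
        shards[name] = size // tp
    return shards


shards = per_device_shards(CONFIG, MESH_SHAPE)
print(shards)
```

Under these assumptions, TP=8 leaves exactly one KV head per device, so any code that assumed two or more KV heads per device on the 2x4 mesh is a likely place to look when adjusting sharding.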

Notes:

  • Keep paged attention / paged KV cache behavior unchanged.
  • If you hit a TT-Metal cache error, set TT_METAL_CACHE=/tmp/tt-metal-cache and TT_METAL_RUNTIME_ROOT=/proj_sw/user_dev/moconnor/tt-runtime-root.
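
One way to apply the two variables from the note above without touching the shell profile is to build the environment in Python and pass it to the child process. This is only a sketch. The function name `env_with_tt_overrides` is hypothetical, and the runtime root path is copied from the note and is likely site-specific.

```python
import os

# Values quoted from the note above; the runtime root is likely site-specific.
TT_ENV_OVERRIDES = {
    "TT_METAL_CACHE": "/tmp/tt-metal-cache",
    "TT_METAL_RUNTIME_ROOT": "/proj_sw/user_dev/moconnor/tt-runtime-root",
}


def env_with_tt_overrides(base_env=None):
    """Return a copy of the environment with the two TT variables set.

    The caller's environment is not modified; only the returned dict is.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.update(TT_ENV_OVERRIDES)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching demo.py or eval.py.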

Metadata

Assignees

No one assigned

    Labels

    lb: Route bringup to lb
    run_tests: Run model tests
    run_tests_t3000: Run model tests on t3000

    Projects

    Status

    Ready

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
