models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized (1x8 mesh port) #206

Port this t3000 model from a 2x4 mesh to a 1x8 mesh (TP=8).

Target: models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized

Tasks:
- Update models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py to set MESH_SHAPE = (1, 8) and adjust any mesh-axis/sharding assumptions for a 1x8 mesh.
- Keep architecture, dtypes, and cache behavior unchanged.
- Run the demo and the long eval:
  python demo.py models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py
  python eval.py models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/model.py --model mistralai/Mistral-7B-Instruct-v0.3 --prompt_file prompts/bringup_eval_long.txt --max_new_tokens 100 --max_seq_len <current seq len>
- Update MODELS.md for the t3000 row and save demo.log/eval.log under models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized/.
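
To make the sharding consequences of the first task concrete, here is a rough Python sketch of the per-device sizes that follow from MESH_SHAPE = (1, 8) with TP=8. It is not taken from model.py: the dimension constants are assumed from the published Hugging Face config for Mistral-7B-Instruct-v0.3, and `ShardPlan` and `plan_tensor_parallel` are hypothetical helper names used only for illustration.

```python
from dataclasses import dataclass

MESH_SHAPE = (1, 8)  # target mesh from this issue (previously 2x4)

# Assumption: values from the published Hugging Face config for
# Mistral-7B-Instruct-v0.3, not read from model.py.
NUM_HEADS = 32
NUM_KV_HEADS = 8
INTERMEDIATE_SIZE = 14336
VOCAB_SIZE = 32768


@dataclass(frozen=True)
class ShardPlan:
    """Hypothetical summary of what each device holds."""
    tp_degree: int
    replicas: int
    q_heads_per_device: int
    kv_heads_per_device: int
    mlp_cols_per_device: int
    vocab_rows_per_device: int


def plan_tensor_parallel(mesh_shape, tp_axis):
    """Split each dimension evenly along tp_axis; the other axis replicates."""
    tp = mesh_shape[tp_axis]
    replicas = mesh_shape[1 - tp_axis]
    dims = {
        "num_heads": NUM_HEADS,
        "num_kv_heads": NUM_KV_HEADS,
        "intermediate_size": INTERMEDIATE_SIZE,
        "vocab_size": VOCAB_SIZE,
    }
    # Every sharded dimension must divide evenly across the TP group.
    for name, size in dims.items():
        if size % tp != 0:
            raise ValueError(f"{name}={size} is not divisible by TP={tp}")
    return ShardPlan(
        tp_degree=tp,
        replicas=replicas,
        q_heads_per_device=NUM_HEADS // tp,
        kv_heads_per_device=NUM_KV_HEADS // tp,
        mlp_cols_per_device=INTERMEDIATE_SIZE // tp,
        vocab_rows_per_device=VOCAB_SIZE // tp,
    )


if __name__ == "__main__":
    print(plan_tensor_parallel(MESH_SHAPE, tp_axis=1))
```

Under these assumptions, axis 0 has size 1 on the 1x8 mesh, so any code in model.py that shards or replicates along axis 0 degenerates to a single group, and each device ends up with exactly one KV head. Both are worth checking when adjusting the mesh-axis assumptions.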

Notes:
- Keep paged attention / paged KV cache behavior unchanged.
- If you hit a tt-metal cache error, set TT_METAL_CACHE=/tmp/tt-metal-cache and TT_METAL_RUNTIME_ROOT=/proj_sw/user_dev/moconnor/tt-runtime-root.
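
The run-and-log steps can be scripted; the following is a minimal sketch, assuming it is launched from the repository root where demo.py and eval.py live. The helper names (`build_env`, `build_commands`, `run_all`) are hypothetical, and `max_seq_len` must be supplied by the caller because the issue leaves the current sequence length unspecified.

```python
import os
import subprocess
from pathlib import Path

MODEL_DIR = Path("models/mistralai/Mistral-7B-Instruct-v0.3/t3000/optimized")
MODEL_PY = (MODEL_DIR / "model.py").as_posix()


def build_env(base=None):
    """Return a copy of the environment with the cache workaround applied."""
    env = dict(os.environ if base is None else base)
    env["TT_METAL_CACHE"] = "/tmp/tt-metal-cache"
    env["TT_METAL_RUNTIME_ROOT"] = "/proj_sw/user_dev/moconnor/tt-runtime-root"
    return env


def build_commands(max_seq_len):
    """Map each log file name to the command whose output it should hold."""
    return {
        "demo.log": ["python", "demo.py", MODEL_PY],
        "eval.log": [
            "python", "eval.py", MODEL_PY,
            "--model", "mistralai/Mistral-7B-Instruct-v0.3",
            "--prompt_file", "prompts/bringup_eval_long.txt",
            "--max_new_tokens", "100",
            "--max_seq_len", str(max_seq_len),
        ],
    }


def run_all(max_seq_len, use_cache_workaround=False):
    """Run demo then eval, writing stdout and stderr into MODEL_DIR logs."""
    env = build_env() if use_cache_workaround else None
    for log_name, cmd in build_commands(max_seq_len).items():
        with open(MODEL_DIR / log_name, "w") as log:
            subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT,
                           env=env, check=True)
```

Calling `run_all(...)` with the model's current sequence length stops at the first failing command because of `check=True`; passing `use_cache_workaround=True` applies the two environment variables from the note above only when they are needed.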
