Commit cb4cada

1 epoch, more frequent evals — fits budget, catches overfitting early
- epochs: 3 -> 1 (old run showed val_loss regressing by step 2000)
- warmup_steps: 3000 -> 1000 (proportional to shorter run)
- eval_steps: 1000 -> 500 (14 evals to find generalization peak)
- save_steps: 5000 -> 2000 (more checkpoints in shorter run)
1 parent b9ae5c7 commit cb4cada

1 file changed: configs/reasoning_core_204m.yaml (4 additions, 4 deletions)
```diff
--- a/configs/reasoning_core_204m.yaml
+++ b/configs/reasoning_core_204m.yaml
@@ -21,15 +21,15 @@ data:
 
 training:
   output_dir: ./checkpoints/reasoning_core
-  epochs: 3
+  epochs: 1
   batch_size: 1
   gradient_accumulation: 64
   lr: 2.0e-4
-  warmup_steps: 3000
+  warmup_steps: 1000
   weight_decay: 0.01
   max_grad_norm: 1.0
   fp16: true
   logging_steps: 100
-  save_steps: 5000
-  eval_steps: 1000
+  save_steps: 2000
+  eval_steps: 500
   run_name: leanformer-reasoning-core-204m
```
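A quick way to see why the new values fit together: assuming the run totals roughly 7,000 optimizer steps (implied by the commit message's 14 evals at eval_steps=500 — this total is not stated in the config itself), a short sketch comparing the old and new schedules. The `schedule_stats` helper is hypothetical, written for illustration, not part of the repo:

```python
# Sanity-check the old vs. new schedule against the commit's stated numbers.
# Assumption: ~7000 total optimizer steps, implied by "14 evals" at eval_steps=500.
total_steps = 14 * 500  # 7000 optimizer steps in the 1-epoch run

old = {"warmup_steps": 3000, "eval_steps": 1000, "save_steps": 5000}
new = {"warmup_steps": 1000, "eval_steps": 500, "save_steps": 2000}

def schedule_stats(cfg, steps):
    """Summarize how a schedule behaves over a fixed step budget."""
    return {
        "warmup_frac": cfg["warmup_steps"] / steps,   # fraction of run spent warming up
        "n_evals": steps // cfg["eval_steps"],        # how many val_loss readings we get
        "n_checkpoints": steps // cfg["save_steps"],  # how many saves land in the run
    }

# Old schedule: warmup eats ~43% of the shortened run, only 7 evals, 1 checkpoint.
print(schedule_stats(old, total_steps))
# New schedule: warmup drops to ~14%, 14 evals, 3 checkpoints.
print(schedule_stats(new, total_steps))
```

With the old values, a 1-epoch run would spend almost half its steps in warmup and save only a single checkpoint, which is the mismatch the proportional rescaling fixes.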
