Align terminal reward with the last trainable token and add ALFWorld Evaluation #31

Merged
zhusq20 merged 1 commit into open-tinker:main from Xuyan923r:fix/reward-mask-alignment
Mar 1, 2026

Align terminal reward with the last trainable token and add ALFWorld …

b631bf9