Thanks for your great work.
I checked the current public codebase to understand how the skill retrieval ablation in Table 2 is implemented.
What I could confirm from the released code:
- I could not find a public implementation of `UCB` or `KM` (k-means based) skill retrieval in the current repository.
- The released skill-based training examples use a fixed `SkillProvider` with `skill_all=false` by default. This appears to be rule-based task matching, not `Full`, not `UCB`, and not `KM`.
- `skill_all=true` would correspond to concatenating all available skills into the prompt / teacher context, so that seems closest to the `Full` row.
Because of that, I am not sure how to reproduce the `UCB` and `KM` rows reported in the paper, and I am also unsure whether the main reported results in the paper rely on `KM` retrieval or on the currently released rule-based matching.
Could you clarify:
- Is there a plan to open-source the `KM` / `UCB` skill retrieval code or the exact evaluation scripts / configs used for Table 2?
- Which retrieval setting should be considered the default one for reproducing the paper's main results? (From the paper, `KM` appears to be the setting used for the main table.)
Thanks.