Question on Table 2 skill retrieval reproducibility

<img width="784" height="203" alt="Image" src="https://github.com/user-attachments/assets/d712320d-7ff4-499b-bc48-81421ce07cfa" />

Thanks for the great work.

I checked the current public codebase to understand how the skill retrieval ablation in Table 2 is implemented.

What I could confirm from the released code:

- I could not find a public implementation of `UCB` or `KM` / k-means based skill retrieval in the current repository.
- The released skill-based training examples use a fixed `SkillProvider` with `skill_all=false` by default. This appears to be rule-based task matching rather than `Full`, `UCB`, or `KM`.
- `skill_all=true` would correspond to concatenating all available skills into the prompt / teacher context, so that seems closest to the `Full` row.
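
To make my reading concrete, here is a minimal sketch of the two behaviours as I understand them. Only the `skill_all` flag name comes from the released code; everything else (`Skill`, `select_skills`, `build_context`, the `task_type` field) is hypothetical and written for illustration, so please correct me if the actual `SkillProvider` works differently.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    # Hypothetical record; the real skill format in the repo may differ.
    name: str
    task_type: str
    text: str


def select_skills(skills, task_type, skill_all=False):
    """Return the skills that would be placed in the prompt.

    skill_all=True  -> every skill (my guess at the `Full` row).
    skill_all=False -> only skills whose task_type matches the task
                       (my guess at the released rule-based matching).
    """
    if skill_all:
        return list(skills)
    return [s for s in skills if s.task_type == task_type]


def build_context(skills, task_type, skill_all=False):
    # Concatenate the selected skill texts into one prompt context string.
    chosen = select_skills(skills, task_type, skill_all)
    return "\n\n".join(s.text for s in chosen)


if __name__ == "__main__":
    library = [
        Skill("open_drawer", "manipulation", "Pull the handle before placing."),
        Skill("find_object", "navigation", "Scan the room left to right."),
    ]
    print(build_context(library, "navigation"))
    print(build_context(library, "navigation", skill_all=True))
```

Neither branch involves any learned or clustered selection, which is why I cannot map the released code to the `UCB` or `KM` rows.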

Because of that, I am not sure how to reproduce the `UCB` and `KM` rows reported in the paper. I am also unsure whether the paper's main results rely on `KM` retrieval or on the currently released rule-based matching.

Could you clarify:

1. Is there a plan to open-source the `KM` / `UCB` skill retrieval code or the exact evaluation scripts / configs used for Table 2?
2. Which retrieval setting should be treated as the default for reproducing the paper's main results? (From the paper, `KM` appears to be the one used for the main table.)

Thanks.


Question on Table 2 skill retrieval reproducibility #19
