apply: --top-k flag not honored — returns more units than requested #21

@0b00101111

Description

Symptom

Running `metis apply "" --top-k 3 --format kx --output result.kx.json` returns more than 3 units in the output.

Observed across 15 audit queries against parentguidebook (commit ~6,821 atoms, profile=strict). Per-query unit counts ranged from 2 to 23, despite `--top-k 3` on every call.

Examples (from /tmp/audit/*.kx.json):

  • `C1.kx.json` → 5 units
  • `C13.kx.json` → 3 units (matches expected)
  • `C17.kx.json` → 23 units
  • `C30.kx.json` → 18 units

Reproduce

```bash
cd ~/garage/metis/engine
bun run src/cli.ts apply ~/garage/metis-library/parentguidebook \
"amblyopia affects 1 to 5 percent of children" \
--top-k 3 --format kx --output /tmp/test.kx.json
jq '.units | length' /tmp/test.kx.json
# expected: 3, observed: 23
```

Hypothesis

`--top-k` likely caps a different stage of the apply pipeline (e.g., per-method retrieval before RRF fusion, or before traversal expansion) rather than the final output. The KX export probably emits all candidates passed to composition, not the post-rank top-k.
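To make the hypothesis concrete, here is a minimal sketch of how a cap applied per retrieval method, before fusion, can still produce more than `top-k` fused results. Everything in it is illustrative: the `rrfFuse` helper, the three method names, and the atom ids are assumptions, not code or data from the metis engine.

```ts
// Hypothetical illustration only: names and data are invented,
// not taken from the metis codebase.
type Ranked = string[]; // atom ids, best first

// Reciprocal rank fusion: score(id) = sum over lists of 1 / (k0 + rank),
// with rank starting at 1. k0 = 60 is a commonly used default.
function rrfFuse(lists: Ranked[], k0 = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, i) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k0 + i + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const topK = 3;

// Each method is capped at topK on its own (the suspected behavior).
const lexical = ["a1", "a2", "a3", "a4"].slice(0, topK);
const vector = ["a2", "a5", "a6", "a7"].slice(0, topK);
const graph = ["a8", "a1", "a9"].slice(0, topK);

// Fusion merges the capped lists, so the union can exceed topK.
const fused = rrfFuse([lexical, vector, graph]);
console.log(fused.length); // 7 distinct atoms despite topK = 3

// Capping after fusion restores the expected count.
const capped = fused.slice(0, topK);
console.log(capped.length); // 3
```

With three methods each capped at 3, the fused list can hold up to 9 distinct atoms, and any traversal expansion that runs afterwards would add more. That would be consistent with the counts above, though it remains unconfirmed against the actual pipeline.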

Why it matters

For PGB integration (parentguidebook#109), each article needs a small, focused KX sidecar — ideally 3–10 units per topic, not 20+. Inflated unit counts:

  • Bloat the sidecar files
  • Force editors to skim irrelevant atoms when picking citations
  • Erode the "top-k = curated" mental model

Suggested fix

After the final ranking step (post-RRF, post-traversal), trim the unit list to `min(top-k, units.length)` before serializing to KX.
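A minimal sketch of that trim step, assuming the final ranked units are available as an array sorted best-first just before serialization. The `capUnits` helper and the `Unit` shape are hypothetical; the real KX unit type and the serialization call site in metis may look different.

```ts
// Hypothetical shape; the real KX unit type in metis may differ.
interface Unit {
  id: string;
  score: number;
}

// Trim an already-ranked list (best first) to at most topK entries.
// Array.prototype.slice clamps to the array length, so this is
// equivalent to taking min(topK, ranked.length) items.
function capUnits<T>(ranked: readonly T[], topK: number): T[] {
  if (!Number.isInteger(topK) || topK < 0) {
    throw new RangeError(`topK must be a non-negative integer, got ${topK}`);
  }
  return ranked.slice(0, topK);
}

// Example: five ranked units, capped to three.
const rankedUnits: Unit[] = [
  { id: "u1", score: 0.91 },
  { id: "u2", score: 0.84 },
  { id: "u3", score: 0.77 },
  { id: "u4", score: 0.52 },
  { id: "u5", score: 0.31 },
];

const finalUnits = capUnits(rankedUnits, 3);
console.log(finalUnits.map((u) => u.id)); // [ 'u1', 'u2', 'u3' ]
```

Applying the cap as the last step before serialization, rather than inside any one retrieval stage, keeps the output count independent of how many methods or expansion passes feed into ranking.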

Add a test:
```ts
// `apply`, `project`, and `query` are assumed to come from the engine's existing test setup.
test("--top-k caps final unit count", async () => {
  const result = await apply(project, query, { topK: 3, format: "kx" });
  expect(result.units.length).toBeLessThanOrEqual(3);
});
```

Context

Found during the editorial audit for parentguidebook#109. A second bug, filed separately, covers ranking on borderline matches (a relevant atom buried under unrelated ones). Both issues share the apply-pipeline retrieval surface.
