Problem
The current routing guides instruct agents to fetch llms.txt as their first step. An alternative pattern is to attempt a targeted HTML fetch first (e.g. a specific topic page) and only fall back to the llms.txt index if the direct page doesn't cover the question.
This is worth testing because:
- A direct topic-page fetch is smaller and faster when the agent routes correctly
- The fallback pattern may reduce hallucination on narrow questions, since the agent reads less unrelated content
- But it adds an extra hop if the first fetch misses
What to test
Two instruction variants in the per-source guide (e.g. storefront.md):
Variant A — llms.txt first (current behavior)
Fetch the llms.txt index, identify relevant bundles, then fetch each bundle.
Variant B — direct page first, llms.txt as fallback
Attempt to fetch the most specific topic URL you can derive from the question. If that page doesn't contain enough information, fetch the llms.txt index and navigate from there.
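The Variant B order can be sketched roughly as below. This is only an illustration of the control flow, not how the agents implement it: the function name `fetch_direct_then_index` and the injected `fetch` and `is_sufficient` callables are hypothetical, and how an agent derives the topic URL or judges whether a page is sufficient is left open.

```python
from typing import Callable, List, Tuple


def fetch_direct_then_index(
    question: str,
    topic_url: str,
    index_url: str,
    fetch: Callable[[str], str],
    is_sufficient: Callable[[str, str], bool],
) -> Tuple[str, List[str]]:
    """Return (content, urls_fetched) using the direct-page-first order.

    All parameters are hypothetical stand-ins: `fetch` retrieves a URL's
    text, and `is_sufficient` decides whether a page covers the question.
    """
    fetched = [topic_url]
    try:
        page = fetch(topic_url)
    except Exception:
        # Treat any fetch failure (missing page, network error) as a miss.
        page = ""

    if page and is_sufficient(question, page):
        # Hit: one fetch, no index needed.
        return page, fetched

    # Miss: pay the extra hop and fall back to the llms.txt index.
    fetched.append(index_url)
    return fetch(index_url), fetched
```

The returned URL list makes the cost of a miss visible: one entry when the direct page answers the question, two when the fallback is needed.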
Run against evals/commerce-storefront.json with --runs 3 per variant and score with scripts/score-evals.py.
Acceptance criteria
- Both variants tested on the same eval set with 3+ runs each
- Results captured in results/ with variant label in filename
- Recommendation documented: which pattern wins or where each is appropriate
- If Variant B is better, update the per-source guide instruction accordingly
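One way to compare the two variants once scores exist is to average each variant's per-run scores and check the run count against the first criterion. The sketch below assumes one numeric score per run, keyed by variant label; that input shape and the name `summarize_variants` are hypothetical and do not reflect the actual output of scripts/score-evals.py.

```python
from statistics import mean
from typing import Dict, List


def summarize_variants(
    scores: Dict[str, List[float]], min_runs: int = 3
) -> Dict[str, float]:
    """Return the mean score per variant, enforcing a minimum run count.

    `scores` maps a variant label to its per-run scores (assumed format).
    """
    summary: Dict[str, float] = {}
    for variant, runs in scores.items():
        if len(runs) < min_runs:
            raise ValueError(
                f"variant {variant!r} has {len(runs)} runs, need {min_runs}"
            )
        summary[variant] = mean(runs)
    return summary
```

With only three runs per variant, a small gap between the means may be noise, so the recommendation should probably also note the per-run spread.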