Summary
Values stored in the YAML frontmatter of an ingested markdown page are not promoted into the searchable chunk text, so queries for them via MCP tools/call search return no results. The page is otherwise indexed and discoverable via body content.
This matters because a common pattern is to put a unique "canonical search phrase" in frontmatter as metadata, then expect the agent to find the page by querying that phrase. The page is found if the phrase also appears in the body, but not if it only lives in frontmatter.
Repro
Vault file:
---
type: dr-payables-snapshot
date: 2026-05-19
entity: Forever Crystals, S.R.L.
canonical_search_phrase: dr-payables-usd-2026-05-19-quechibot-canonical
---
# DR Cuentas por Pagar (USD) — 2026-05-19
Entidad: **Forever Crystals, S.R.L.**
...body content with vendor names like ANTARES, LANDMARK REALTY, etc...
After a full sync (launchctl kickstart gui/$(id -u)/com.openclaw.gbrain.daily-sync), the page is indexed and the page count for type: dr-payables-snapshot increments correctly.
Expected
Searching for dr-payables-usd-2026-05-19-quechibot-canonical via MCP tools/call search returns this page.
Actual
search "dr-payables-usd-2026-05-19-quechibot-canonical" → []
search "Landmark Realty Corp" (body term) → returns the page (page_id 1834, slug finance/dominican-republic/payables/2026-05-19-dr-payables-usd, type dr-payables-snapshot)
So the page is in the graph, but tokens that exist only in YAML are unreachable.
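Until frontmatter values are searchable, affected pages can be spotted with a small check like the one below. It is a minimal sketch, not part of gbrain: `split_frontmatter` and `frontmatter_only_values` are hypothetical helpers, the parser handles only top-level `key: value` scalars rather than full YAML, and a case-insensitive substring test stands in for however the real search tokenizes text.

```python
def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a page into (scalar frontmatter, body).

    Simplification: only top-level `key: value` lines are read.
    A real tool would use a proper YAML parser.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Everything after the closing delimiter is the body.
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep and value.strip():
            meta[key.strip()] = value.strip()
    # No closing delimiter: treat the page as having no frontmatter.
    return {}, markdown


def frontmatter_only_values(markdown: str) -> dict[str, str]:
    """Return frontmatter scalars whose value never appears in the body."""
    meta, body = split_frontmatter(markdown)
    haystack = body.casefold()
    return {k: v for k, v in meta.items() if v.casefold() not in haystack}


# The repro page from this issue, with the elided body lines left out.
PAGE = """---
type: dr-payables-snapshot
date: 2026-05-19
entity: Forever Crystals, S.R.L.
canonical_search_phrase: dr-payables-usd-2026-05-19-quechibot-canonical
---
# DR Cuentas por Pagar (USD) — 2026-05-19
Entidad: **Forever Crystals, S.R.L.**
"""

unreachable = frontmatter_only_values(PAGE)
```

For the repro page this flags `type` and `canonical_search_phrase`, while `date` and `entity` pass because the same strings also occur in the body.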
Versions
- gbrain 0.31.10 (pglite engine)
- macOS 25.3.0 arm64
Why this matters
Multi-agent setups (in our case OpenClaw + Hermes) use frontmatter-tagged canonical phrases as a clean way for the issuing agent to write a receipt, and for the receiving agent (or a human) to verify the document is indexed and accessible. Today the only safe workaround is to duplicate the phrase into the markdown body, which works but defeats the purpose of structured metadata.
Suggested fix options
- Include scalar frontmatter values in the searchable text for each page chunk (probably the cleanest). The values are short and the index cost is negligible. Could be gated by a config flag (e.g. search.index_frontmatter_values: true) so existing behavior is preserved if anyone depends on it.
- Promote a known set of "search-relevant" keys (e.g. aliases, tags, canonical_search_phrase, title) into the searchable chunk by default, and let users extend the allowlist.
- At minimum, document the current behavior so authors know to put canonical phrases in the body, not only frontmatter.
Happy to PR option 2 if it sounds right; let me know which direction you prefer.
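To make the allowlist option concrete, here is a rough sketch of how promoted keys could be prepended to the text that gets chunked. It is written in Python for illustration only and does not reflect gbrain's actual ingestion code: `DEFAULT_SEARCH_KEYS`, `split_frontmatter`, and `searchable_text` are hypothetical names, and the frontmatter parsing is simplified to top-level scalars.

```python
# Hypothetical default allowlist, taken from the keys named in the option above.
DEFAULT_SEARCH_KEYS = ("aliases", "tags", "canonical_search_phrase", "title")


def split_frontmatter(markdown: str) -> tuple[dict[str, str], str]:
    """Split a page into (scalar frontmatter, body).

    Simplification: only top-level `key: value` lines are read.
    """
    lines = markdown.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, markdown
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep and value.strip():
            meta[key.strip()] = value.strip()
    return {}, markdown  # unterminated frontmatter: leave the page untouched


def searchable_text(markdown: str, search_keys=DEFAULT_SEARCH_KEYS) -> str:
    """Build the text to index: allowlisted frontmatter lines, then the body."""
    meta, body = split_frontmatter(markdown)
    promoted = [f"{key}: {meta[key]}" for key in search_keys if key in meta]
    if not promoted:
        return body
    return "\n".join(promoted) + "\n\n" + body


page = """---
type: dr-payables-snapshot
canonical_search_phrase: dr-payables-usd-2026-05-19-quechibot-canonical
---
Entidad: **Forever Crystals, S.R.L.**
"""

indexed = searchable_text(page)
# Users could extend the allowlist, e.g. to also promote `type`.
indexed_with_type = searchable_text(page, DEFAULT_SEARCH_KEYS + ("type",))
```

Keeping the promoted lines as `key: value` pairs ahead of the body means a query for the canonical phrase can match, while keys outside the allowlist stay out of the index unless a user opts in.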
Related
Not a blocker: we've already worked around it by adding a **Canonical GBrain search phrase:** line, followed by the phrase itself, to the body of every receipt-bearing file. Filing because (a) the YAML-only failure mode is surprising and (b) frontmatter-as-metadata is a natural pattern for agent receipts.