Local LLM-driven entity descriptions for archive gaps

> **Skills:** Transformers.js / WebLLM · prompt engineering · build scripting · QA review
> **Time:** ~10 hours
> **Good for:** ML engineers · NLP folks · lore curators
> **Difficulty:** Advanced

---

## Context

Of our 310 entities, ~120 have no Wikipedia entry, so `entity.long` is
empty. These are mostly minor characters, ships, and vehicles. Semantic search
struggles on them because there is no narrative text for the embedding model to
work with.

## Goal

At build time, run a small local LLM (Phi-3-mini, Gemma-2B, or similar, via
Transformers.js / WebLLM) to generate canonical descriptions from each entity's
relations + name + type. Every description is manually verified before it is
merged into `kb.json`.

## Where to start

- New `scripts/build-llm-descriptions.ts` — pure Node script that:
  - Loads kb.json
  - For each entity with empty `long`, builds a structured prompt from
    `name + type + relations + short`
  - Calls a local LLM (no external API)
  - Caches output to `data/.cache/llm/<entityId>.json` for review
- A small CLI UI to approve or reject each generated description before merging
- Re-run `build:embeddings` after merge
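
A minimal sketch of the selection and prompt-building steps listed above. The `Entity` and `Relation` shapes, the `INSUFFICIENT` sentinel, and the helper names are assumptions for illustration; the real `kb.json` schema may differ. The model call itself is left out so the sketch stays independent of whichever runtime is chosen.

```typescript
// Assumed shapes: the actual kb.json schema is not specified in this issue.
interface Relation {
  predicate: string; // hypothetical field, e.g. "crew_of"
  target: string;    // hypothetical field, e.g. the related entity's name
}

interface Entity {
  id: string;
  name: string;
  type: string;
  short: string;
  long: string;
  relations: Relation[];
}

// An entity qualifies when its `long` field is empty or whitespace only.
function needsDescription(e: Entity): boolean {
  return e.long.trim().length === 0;
}

// Build a structured prompt from name + type + relations + short.
// The instruction wording and the INSUFFICIENT sentinel are illustrative.
function buildPrompt(e: Entity): string {
  const facts = e.relations
    .map((r) => `- ${r.predicate}: ${r.target}`)
    .join("\n");
  return [
    "Write a short, factual description using ONLY the facts below.",
    "If the facts are insufficient, reply with exactly: INSUFFICIENT",
    `Name: ${e.name}`,
    `Type: ${e.type}`,
    `Summary: ${e.short || "(none)"}`,
    "Relations:",
    facts || "- (none)",
  ].join("\n");
}

// Cache location for review, following the path pattern given above.
function cachePath(entityId: string): string {
  return `data/.cache/llm/${entityId}.json`;
}
```

Keeping the prompt builder a pure function makes it easy to snapshot-test prompts before any model is downloaded.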

## Acceptance criteria

- 80%+ of empty `long` fields populated with plausible descriptions
- Every generated description manually reviewed (one-shot pass is fine)
- No hallucinated facts; if the LLM doesn't have enough signal, leave the
  field empty
- Local-only, no API keys, no network calls beyond model download
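
One way to enforce the review and coverage criteria above at merge time, sketched under assumptions: the `Review` record and both function names are hypothetical, and only descriptions that a reviewer explicitly approved are written back. Rejected or blank outputs leave `long` empty.

```typescript
// Hypothetical review record produced by the approve/reject step.
interface Review {
  entityId: string;
  text: string;
  approved: boolean;
}

// Only the fields needed for merging; the real entity has more.
interface KbEntity {
  id: string;
  long: string;
}

// Return a new entity list with approved, non-blank descriptions applied.
// Entities that already have a `long` value are never overwritten.
function mergeApproved(entities: KbEntity[], reviews: Review[]): KbEntity[] {
  const approved = new Map<string, string>();
  for (const r of reviews) {
    const text = r.text.trim();
    if (r.approved && text.length > 0) {
      approved.set(r.entityId, text);
    }
  }
  return entities.map((e) => {
    const text = approved.get(e.id);
    return e.long.trim() === "" && text !== undefined ? { ...e, long: text } : e;
  });
}

// Fraction of originally empty `long` fields that are populated after merging.
function coverage(before: KbEntity[], after: KbEntity[]): number {
  const afterById = new Map<string, string>();
  for (const e of after) afterById.set(e.id, e.long);
  const empty = before.filter((e) => e.long.trim() === "");
  if (empty.length === 0) return 1;
  const filled = empty.filter((e) => (afterById.get(e.id) ?? "").trim() !== "");
  return filled.length / empty.length;
}
```

Comparing `coverage(before, after)` against `0.8` gives a simple check for the first criterion.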

## Notes

- Model size budget: ≤2 GB on disk, ≤4 GB RAM
- Generation budget: ≤2s per entity on CPU (so ~4 minutes total)
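
The total above follows from ~120 entities at 2 s each, which is 240 s. A small sketch for checking the budget during a run; the constant and function names are hypothetical.

```typescript
// Per-entity generation budget from the notes above, in seconds.
const MAX_SECONDS_PER_ENTITY = 2;

// Worst-case total time for a run over `entityCount` entities.
function totalBudgetSeconds(
  entityCount: number,
  perEntity: number = MAX_SECONDS_PER_ENTITY,
): number {
  return entityCount * perEntity;
}

// True when a single generation stayed within the per-entity budget.
function withinBudget(elapsedMs: number): boolean {
  return elapsedMs <= MAX_SECONDS_PER_ENTITY * 1000;
}
```

For example, `totalBudgetSeconds(120) / 60` evaluates to 4 minutes.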
