Incrementally updating the docs indices #166
Replies: 4 comments 1 reply
Good idea, it's possible. It will probably need some changes to the CLI to allow this.
@dartpain could you point us in the right direction? This is important for our use case.
As far as I know, it is possible to update the index file using the incremental update method, but I don't know if this works.
Incremental index updates are critical once you have a live knowledge base that changes; full re-indexing for every document update doesn't scale. A few patterns that help:

Content-addressable chunking: chunk documents by content hash, not by position. When a document changes, only re-index the chunks whose content hash changed. Unchanged chunks reuse existing vectors. This gives you ~90% reuse on typical documentation updates (formatting changes, minor edits).

Dependency tracking: when one document links to or references another, track those edges. When Document A changes, check if any indexed chunks from Document B are actually excerpts or paraphrases of A. This avoids stale cross-references in retrieved results.

Importance-weighted update priority: not all document changes are equally important. A change to an API reference page should trigger immediate re-indexing; a change to a changelog from 3 years ago can wait. Weight update urgency by access frequency × recency × document type.

Temporal metadata on vectors: store …

For agent systems specifically, the incremental index matters because agents that retrieve from a stale knowledge base will give confidently wrong answers. We treat knowledge staleness as a first-class concern in KinthAI's memory architecture: https://blog.kinthai.ai/why-character-ai-forgets-you-persistent-memory-architecture

What's the typical change frequency for the docs you're indexing: a few updates/week or continuous?
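For anyone who wants to try the content-hash and priority ideas from the comment above, here is a rough sketch in plain Python. It is not the project's ingest code: the blank-line chunker, the function names (`chunk_text`, `chunk_id`, `plan_update`, `update_priority`), and the recency formula are all assumptions made for illustration. It only decides which chunks would need embedding or deletion; wiring the result into an actual vector store is left out.

```python
import hashlib


def chunk_text(text: str) -> list[str]:
    """Split on blank lines. A stand-in for whatever chunker the ingest step uses."""
    return [part.strip() for part in text.split("\n\n") if part.strip()]


def chunk_id(chunk: str) -> str:
    """Content address: identical text always maps to the same ID."""
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()


def plan_update(indexed_ids: set[str], new_text: str) -> tuple[dict[str, str], set[str]]:
    """Compare a changed document against the chunk IDs already indexed for it.

    Returns (chunks that need embedding, IDs whose vectors should be deleted).
    Chunks whose hash is already indexed are left alone.
    """
    current = {chunk_id(c): c for c in chunk_text(new_text)}
    to_add = {cid: c for cid, c in current.items() if cid not in indexed_ids}
    to_delete = indexed_ids - set(current)
    return to_add, to_delete


def update_priority(accesses_per_day: float, days_since_change: float, type_weight: float) -> float:
    """Hypothetical urgency score: access frequency x recency x document-type weight."""
    recency = 1.0 / (1.0 + days_since_change)  # assumed decay; newer changes score higher
    return accesses_per_day * recency * type_weight


if __name__ == "__main__":
    old = "Intro paragraph.\n\nInstall with pip.\n\nRun the server."
    new = "Intro paragraph.\n\nInstall with pip or conda.\n\nRun the server."

    indexed = {chunk_id(c) for c in chunk_text(old)}
    to_add, to_delete = plan_update(indexed, new)
    print(f"embed {len(to_add)} chunk(s), delete {len(to_delete)} stale vector(s)")
```

In this toy run only the edited middle paragraph is re-embedded and its old vector is dropped; the other two chunks are reused. Note that hashing is exact, so even a whitespace change inside a chunk counts as new content unless the text is normalized before hashing.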
Is there a way to incrementally update the index files when we get more data, instead of re-running ingestion on the previous data too?