diff --git a/docs/HINDSIGHT_EVALUATION.md b/docs/HINDSIGHT_EVALUATION.md
new file mode 100644
index 00000000..caccbc80
--- /dev/null
+++ b/docs/HINDSIGHT_EVALUATION.md
@@ -0,0 +1,412 @@
+# HUF × Hindsight Memory Evaluation
+
+## Executive Recommendation
+
+**Recommended option: B — Hindsight as an embedded sidecar service (opt-in per agent).**
+
+**Why:**
+- HUF’s current knowledge stack is strong for **static/document knowledge retrieval** (ingestion + chunking + FTS/vector search + prompt injection), but it is not a full **learning memory** system with reflection/consolidation.  
+- Hindsight is specialized for **learned memory over time** (retain/recall/reflect, memory banks, directives/dispositions).  
+- Sidecar integration preserves HUF’s current strengths while adding capabilities HUF currently does not have, without forcing a risky full-platform rewrite.
+
+---
+
+## Scope & Method Notes
+
+### What was requested
+You asked for a five-phase evaluation including cloning and code-level study of Hindsight and comparative architecture analysis.
+
+### What was done
+- I attempted to clone Hindsight to `/tmp/hindsight` exactly as requested.
+- Clone failed due network restrictions in this environment (`CONNECT tunnel failed, response 403`).
+- I then performed a best-effort fallback study using:
+  - Hindsight GitHub repo pages and raw files accessible via the web tool.
+  - Hindsight docs site pages (Developer Guide / API sections / FAQs / integrations pages).
+  - Full local HUF code review in this repository.
+
+### Confidence labeling
+- **High confidence:** HUF-side architecture findings (direct source code review).
+- **Medium confidence:** Hindsight-side deep internals (since direct local clone/source traversal was blocked; conclusions rely on public docs + accessible repo pages).
+
+---
+
+## Phase 1 — Hindsight Study Notes
+
+## 1) `AGENTS.md` / `CLAUDE.md`
+
+### Findings
+- `AGENTS.md` in Hindsight is minimal and points to `CLAUDE.md`.
+- `CLAUDE.md` is also concise and primarily references docs and workflows rather than containing extensive internal code conventions.
+
+### Practical implication for integration
+- Architectural truth for Hindsight is primarily in docs and API behavior rather than deep project-local agent instructions.
+
+## 2) `README.md`
+
+### Findings
+- Positions Hindsight as memory that helps agents **learn over time**, not only retrieve history.
+- Exposes two integration modes:
+  1. **LLM wrapper** (low-friction memory insertion around LLM calls).
+  2. **Explicit API operations** (`retain`, `recall`, `reflect`).
+- Deployment options include local Docker, embedded modes, and external PostgreSQL-oriented setup.
+
+### Practical implication for integration
+- Hindsight can be integrated incrementally via API without replacing current HUF execution architecture.
+
+## 3) `skills/` and docs references
+
+### Findings
+- Hindsight docs repeatedly advertise a `hindsight-docs` skill for coding assistants.
+- Docs structure suggests productized API-first consumption with SDKs/clients and integrations.
+
+### Practical implication
+- Hindsight is designed to be consumed as infrastructure from other agent frameworks, which fits HUF’s extensible tool architecture.
+
+## 4) `docs/` (developer/API behavior)
+
+### Findings (from docs pages)
+- Core operations are conceptually:
+  - **Retain:** ingest experiences/content into memory.
+  - **Recall:** retrieve relevant memories.
+  - **Reflect:** synthesize disposition-aware responses / higher-order reasoning over retained memory.
+- **Memory banks** are isolated containers and appear to be the main tenancy boundary.
+- Docs and examples show additional governance concepts (e.g., directives/dispositions) and memory lifecycle patterns.
+- FAQ and integration guidance emphasize re-retain/update workflows and full-context retention rather than simplistic pre-summarized facts.
+
+### Practical implication
+- Hindsight’s abstraction layer (bank + retain/recall/reflect) maps naturally to HUF agent+user/session identity and could be wrapped as tools.
+
+## 5) `hindsight-api-slim` module (best-effort)
+
+### Findings
+- I could inspect high-level repo tree presence (`hindsight-api-slim/hindsight_api/...`) but could not fully clone and deeply inspect each source file in this environment.
+- Public docs indicate server-side API orchestration around retain/recall/reflect, embeddings, and LLM wrapping.
+
+### Practical implication
+- Exact module-level internals should be validated in a follow-up once clone/network is available, but API contract appears stable enough for sidecar prototyping.
+
+## 6) `hindsight-integrations`
+
+### Findings
+- Hindsight publishes direct integrations and external API modes across agent ecosystems/tools.
+- Integration docs for external API mode reinforce the pattern: keep app logic where it is, call Hindsight memory operations remotely.
+
+### Practical implication
+- Sidecar model is aligned with how Hindsight expects production adoption.
+
+---
+
+## Phase 2 — HUF Current State (Code-Based)
+
+## Knowledge ingestion/retrieval architecture
+
+### Ingestion
+- `KnowledgeInput` validates source type (File/Text/URL), hashes content for dedupe, and queues async processing after insert.
+- `process_knowledge_input()` in `indexer.py` performs extraction, sentence-aware chunking, and backend indexing with per-source locking and status transitions.
+
+### Chunking
+- `chunkers/sentence.py` uses LlamaIndex `SentenceSplitter` with fallback chunker.
+
+### Backends
+- `Knowledge Source` supports `sqlite_fts` and `sqlite_vec` types.
+- `sqlite_fts.py` implements FTS5 schema + BM25 ranking in per-source SQLite artifacts.
+
+### Retrieval
+- `retriever.py` `knowledge_search()` searches one or many sources, respects source readiness/permissions, and returns top-k merged by score.
+- `context_builder.py` injects mandatory knowledge snippets into prompt under token budget.
+- `knowledge/tool.py` exposes optional explicit `knowledge_search` tool and knowledge source listing tool.
+
+## Conversation/session memory architecture
+
+### Persistence model
+- Conversation/session storage is first-class via:
+  - `Agent Conversation`
+  - `Agent Message`
+  - `Agent Run`
+- `conversation_manager.py` manages session-id-based active conversation lookup/creation and appends ordered messages.
+
+### Memory semantics
+- HUF persists full conversation history and supports rolling summary fields and structured `conversation_data` JSON.
+- Current memory is largely **session/conversation persistence and retrieval**, not a separate long-term reflective memory engine.
+
+## Agent configuration & memory-related controls
+
+- Agent-level controls include:
+  - `persist_conversation`
+  - `persist_user_history`
+  - context strategy/history limits/summary ratio
+  - knowledge source bindings (`Agent Knowledge` child rows with `mode`, `priority`, `max_chunks`, `token_budget`)
+- Knowledge can be **Mandatory** (prompt-injected) or **Optional** (tool-invoked).
+
+---
+
+## Phase 3 — Comparative Analysis
+
+## 3.1 Capability Comparison
+
+| Capability | HUF (current) | Hindsight |
+|---|---|---|
+| Knowledge ingestion | Strong document ingestion (File/Text/URL), async processing, dedupe hash, chunking pipeline. | Strong memory ingestion via retain semantics oriented to episodic/experience memory. |
+| Retrieval method | SQLite FTS5 BM25 (and optional sqlite_vec), top-k retrieval, mandatory prompt injection + tool-based retrieval. | Recall over memory banks; docs describe richer memory retrieval and memory-aware operations. |
+| Memory consolidation | Limited: conversation summary and `conversation_data`; no dedicated multi-stage consolidation pipeline. | Core feature: retain/recall/reflect with consolidation into higher-order memory representations (per docs claims). |
+| Temporal reasoning | Minimal explicit temporal reasoning beyond chronological conversation logs. | Explicitly marketed for long-horizon memory and temporal coherence. |
+| Entity extraction & linking | Not a dedicated subsystem in current HUF knowledge stack. | Described as memory-structured reasoning with richer linkage semantics. |
+| Cross-session learning | Partial persistence (history per session/user) but no autonomous learned memory graph/model across sessions by default. | Designed for long-term learning across interactions in banks. |
+| Per-user memory isolation | Available via session_id + `persist_user_history` patterns in HUF model. | Native bank isolation; bank-per-user/per-agent patterns are first-class. |
+| Reflection / self-improvement | No dedicated reflect operation in knowledge subsystem. | Reflect operation is first-class API primitive. |
+| LLM provider support | Broad via LiteLLM + custom provider routing in HUF. | Supports multiple providers (docs indicate provider/model configurable) but memory API is the core abstraction. |
+| Deployment model | In-process within Frappe app; MariaDB + per-source SQLite artifacts. | Separate service/embedded modes; frequently paired with PostgreSQL/pgvector style deployment options. |
+
+## 3.2 Strengths Assessment
+
+### HUF strengths
+
+**Better than Hindsight:**
+- Tight integration with Frappe DocTypes, permissions, business workflows, and MCP tool ecosystem.
+- Lower operational complexity for static knowledge retrieval in current architecture.
+- Strong tenant/application alignment with existing HUF agent config UX.
+
+**Does that Hindsight does not (in this context):**
+- Native coupling to HUF-specific automation flows (Doc Event, scheduled execution, agent tools bound to business data).
+
+**Architectural constraints:**
+- SQLite-based knowledge artifacts can become operationally awkward for large multi-tenant high-write scenarios.
+- Memory is still retrieval-centric and session-centric; advanced consolidation/reflection must be custom-built.
+
+### Hindsight strengths
+
+**Better than HUF:**
+- Purpose-built long-term agent memory lifecycle (retain/recall/reflect) rather than only RAG-style retrieval.
+- Better abstraction for learned memory and behavior adaptation over time.
+
+**Does that HUF does not:**
+- Reflection primitive as a first-class operation.
+- Bank abstraction for isolated memory lifecycles with memory directives/dispositions.
+
+**Architectural constraints:**
+- Additional infra overhead (service lifecycle, storage dependencies, monitoring).
+- Potentially higher LLM cost due multi-stage memory operations (retain + reflect in addition to generation).
+- External project dependency and version drift risk.
+
+## 3.3 Overlap & Conflict
+
+### Overlap
+- Both systems solve “bring past information into current reasoning.”
+- Both expose retrieval-like mechanisms and can be called as tools during generation.
+
+### Potential conflicts if run simultaneously
+- Duplicate memory writes (same interaction retained in HUF conversation store and Hindsight bank).
+- Contradictory context if HUF mandatory knowledge and Hindsight recall produce divergent facts.
+- Extra token/cost pressure if both memory pipelines inject large context.
+
+### Same problem or different problem?
+- **Conclusion:** Mostly **different but adjacent** problems.
+  - HUF knowledge system: best for curated/static corpus retrieval.
+  - Hindsight memory system: best for evolving learned memory and adaptation.
+
+---
+
+## Phase 4 — Integration Decision Analysis
+
+## Option A: Knowledge Transfer Only (port ideas into HUF)
+
+### What to port
+- Retain/recall/reflect lifecycle semantics.
+- Memory bank abstraction with per-agent/per-user boundaries.
+- Consolidation layers (raw events → distilled observations → stable preferences/mental models).
+- Temporal indexing and entity linking.
+
+### Effort estimate
+- **High (8–16+ weeks)** for robust first version due design, schema, migration, evaluation tooling, and cost/quality tuning.
+
+### What you lose
+- Slow time-to-value.
+- Re-inventing mechanisms already available in Hindsight.
+- Benchmark parity uncertainty.
+
+### When A is attractive
+- If strategic requirement is zero external runtime dependency and full in-house control.
+
+---
+
+## Option B: Hindsight as Embedded Sidecar (Recommended)
+
+### Integration approach
+- Keep HUF RAG stack as-is for static corpora.
+- Add optional Hindsight memory toolset per agent:
+  - `hindsight_retain`
+  - `hindsight_recall`
+  - `hindsight_reflect`
+- Add runtime orchestration policy:
+  - On each user turn: selective retain.
+  - Before response: recall and optionally reflect for eligible agents.
+
+### Identity mapping proposal
+- **Bank ID pattern:**
+  - `huf:{site}:{agent}:{user}` for user-isolated memory.
+  - `huf:{site}:{agent}:shared` when shared memory desired.
+- Map from existing HUF controls:
+  - `persist_user_history=1` ⇒ per-user bank.
+  - `persist_user_history=0` ⇒ shared agent bank.
+
+### Ops overhead
+- New service deployment + health checks + secrets management.
+- Optional PostgreSQL/pgvector infra (depending on Hindsight deployment mode chosen).
+- Increased LLM spend from memory operations (retain/reflect) requiring policy controls.
+
+### UI/UX feasibility
+- Add per-agent toggle in Agent DocType:
+  - `enable_hindsight_memory`
+  - `hindsight_mode` (`off` / `assistive` / `primary`)
+  - per-turn budget caps and max retrieved memories.
+
+### Why B wins
+- Fastest path to real long-term memory capability with bounded platform risk.
+- Preserves existing HUF value and customer workflows.
+- Reversible: can disable per agent if quality/cost not acceptable.
+
+---
+
+## Option C: Hindsight as Core Memory Layer (replace HUF memory/knowledge)
+
+### What breaks
+- Existing HUF knowledge source UX and indexing workflows would need major re-plumbing.
+- Mandatory/optional knowledge semantics and per-source SQLite artifacts would become legacy or require migration adapters.
+
+### Migration concerns
+- Need full migration from Knowledge Source / Knowledge Input model to Hindsight-compatible retain corpus and bank assignment.
+- Risk of retrieval behavior regressions for static documentation use cases.
+
+### Architecture conflict
+- HUF is centered on Frappe + MariaDB app patterns with lightweight per-source SQLite knowledge files.
+- Forcing Hindsight as canonical backend introduces non-trivial operational and conceptual coupling.
+
+### Maintenance burden
+- Highest dependency risk and lock-in to external roadmap changes.
+
+### Verdict on C
+- Not recommended at this stage.
+
+---
+
+## Phase 5 — Recommendation, Plan, Architecture, Risks
+
+## 1) Final recommendation
+
+**Choose Option B now.**
+
+**Opinion (explicit):** this is the best risk-adjusted route because it adds true learned memory without destabilizing HUF’s production-critical Frappe knowledge and automation model.
+
+## 2) Phase-1 implementation (2-week scope)
+
+### Week 1
+1. Add config and feature flags:
+   - New agent fields: `enable_hindsight_memory`, `hindsight_bank_strategy`, `hindsight_recall_top_k`, `hindsight_reflect_enabled`, budget fields.
+2. Implement Hindsight API client wrapper in `huf/ai/`.
+3. Implement bank-id resolver from `(site, agent, user, persist_user_history)`.
+4. Add resilient fail-open behavior (if Hindsight unavailable, continue with current HUF behavior).
+
+### Week 2
+5. Integrate into `AgentManager` execution pipeline:
+   - pre-response recall
+   - post-turn retain
+   - optional reflect call
+6. Store memory-operation telemetry in `Agent Run` metadata (latency, token cost, result counts).
+7. Add admin docs + runbook + feature toggle in UI.
+8. Pilot with 1–2 internal agents and collect quality/cost metrics.
+
+### Concrete HUF touch points for Phase-1
+
+| Area | Existing file(s) | Change required | Notes |
+|---|---|---|---|
+| Agent settings model | `huf/huf/doctype/agent/agent.json`, `huf/huf/doctype/agent/agent.py`, `huf/huf/doctype/agent/agent.js` | Add optional Hindsight fields (enable flag, bank strategy, recall top_k, reflect toggle, timeout/budget caps). | Keep defaults off to preserve current behavior. |
+| Runtime orchestration | `huf/ai/agent_integration.py` | Add sidecar call hooks around generation (pre: recall, post: retain, optional reflect). | Use timeouts and fail-open fallback. |
+| Conversation identity mapping | `huf/ai/conversation_manager.py` | Reuse `session_id`/`external_id` patterns when deriving bank IDs. | Align with `persist_user_history`. |
+| Knowledge coexistence policy | `huf/ai/knowledge/context_builder.py`, `huf/ai/knowledge/tool.py` | Add merge strategy for HUF knowledge context + Hindsight recall context. | Prevent duplicate/contradictory context injection. |
+| Run telemetry | `huf/huf/doctype/agent_run/agent_run.json` and run-write path in `agent_integration.py` | Store Hindsight latency/cost/error metrics in metadata JSON. | Needed for pilot evaluation. |
+| Config/secrets | `huf/hook.py` / environment config path in deployment setup | Add sidecar URL + auth secret + per-site toggle. | Must be site-aware for multi-tenant installs. |
+
+## 3) Architecture sketch
+
+```text
+User -> HUF Agent Runtime
+          |\
+          | \--(existing)--> HUF Knowledge Search (sqlite_fts/sqlite_vec) -> Prompt Context
+          |
+          \----(optional per-agent)--> Hindsight Sidecar API
+                     |-- retain(user turn / tool results)
+                     |-- recall(current query)
+                     \-- reflect(optional synthesis)
+
+HUF persists canonical conversation/run records in Frappe DocTypes.
+Hindsight stores learned memory in bank-scoped memory store.
+```
+
+## 4) Risk register
+
+| Risk | Impact | Likelihood | Mitigation |
+|---|---|---:|---|
+| Sidecar outage | Agent quality drop / possible failures | Medium | Fail-open design: skip Hindsight and continue with HUF core path. |
+| Token/cost blow-up from retain+reflect | Budget overrun | High | Per-agent rate/budget caps, sampling policy, batched retain, reflect toggle default off. |
+| Conflicting context (HUF knowledge vs Hindsight recall) | Incorrect responses | Medium | Prompt policy + source attribution + ranking rules + conflict resolution heuristic. |
+| Tenant isolation mistakes in bank mapping | Data leakage | Low/Med | Deterministic bank-id format + tests + explicit ACL validation. |
+| External dependency drift | Maintenance overhead | Medium | Pin versions, contract tests, fallback compatibility layer. |
+| Performance latency | Slower responses | Medium | Async/non-blocking retain, recall timeout budgets, cached bank metadata. |
+
+## 5) Decision revisit criteria
+
+Revisit Option B decision if any of the following occur:
+1. Hindsight introduces a mode that fully satisfies HUF static knowledge ingestion/search use cases with equal or better UX/cost.
+2. HUF scale reaches thresholds where unified external memory infra clearly outperforms per-source SQLite artifacts (e.g., sustained high-concurrency writes, large tenant counts).
+3. Sidecar memory quality remains below target after 2 release cycles despite tuning.
+4. Compliance/ops policy disallows external sidecar dependencies for customer deployments.
+
+## 6) Pilot success criteria (go/no-go)
+
+Use explicit criteria so Option B can be validated quickly:
+
+| Metric | Target in pilot | Why it matters |
+|---|---:|---|
+| Response quality (human eval on memory-heavy prompts) | +20% vs baseline agent without Hindsight | Proves memory value beyond static RAG. |
+| P95 latency overhead | <= +900 ms | Keeps UX acceptable. |
+| Extra token cost per successful run | <= +25% median | Controls operational spend. |
+| Sidecar failure impact | 0 hard failures (fail-open only) | Ensures reliability/safety in production. |
+| Memory leakage incidents across users | 0 | Validates bank-isolation mapping. |
+
+If two or more targets miss for two consecutive weeks, keep Hindsight behind feature flag and continue with Option A-style incremental native improvements.
+
+---
+
+## Implementation Appendix — Example integration pseudocode
+
+```python
+# inside agent execution flow
+if agent.enable_hindsight_memory:
+    bank_id = resolve_bank_id(site, agent.name, user, agent.persist_user_history)
+
+    recalled = hindsight.recall(
+        bank_id=bank_id,
+        query=user_prompt,
+        top_k=agent.hindsight_recall_top_k,
+    )
+
+    prompt = inject_hindsight_memories(prompt, recalled)
+
+response = llm.generate(prompt)
+
+if agent.enable_hindsight_memory:
+    hindsight.retain(
+        bank_id=bank_id,
+        content=serialize_turn(user_prompt, response, tool_events),
+    )
+
+    if agent.hindsight_reflect_enabled:
+        reflection = hindsight.reflect(bank_id=bank_id, query=user_prompt)
+        store_reflection_metadata(run_id, reflection)
+```
+
+---
+
+## Final decision statement
+
+**Ship Option B (sidecar) first, keep HUF RAG as canonical static knowledge, and treat Hindsight as an opt-in learned-memory subsystem.**