feat: distillery_extract tool and hook scripts for PreCompact transcript summarisation #199

@claude

Context

This is a follow-up to #191 (session lifecycle hooks and session_id tracking). That issue was scoped to session_id tracking (Unit 1) and extended EntrySource provenance (Unit 2). This issue covers the deferred Part C, conversation transcript summarisation: the infrastructure needed to support Claude Code's PreCompact hook.

The PreCompact Hook Pattern

Before Claude Code compacts the conversation (dropping old context to fit the window), a hook can intercept and extract knowledge that would otherwise be lost. The pattern from #191:

  1. Claude Code fires PreCompact with a transcript_path pointing to the last N messages
  2. A hook reads the transcript and sends it to a fast model (e.g. Haiku) with an extraction prompt
  3. Extracted knowledge is stored as entries with source: inference (added in #191)
  4. When the compacted session continues, the LLM still has access to those entries via distillery_list or distillery_search
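
Step 2 could start from something like the sketch below. It assumes the hook payload has already been parsed into a dict containing transcript_path, and that the transcript file holds one JSON object per line (JSONL). The function name read_recent_messages and the max_messages parameter are hypothetical, not part of any existing API.

```python
import json
from pathlib import Path


def read_recent_messages(hook_input: dict, max_messages: int = 50) -> list[dict]:
    """Return the last `max_messages` records from the transcript file.

    Assumption: `hook_input["transcript_path"]` points at a JSONL file
    (one JSON object per line). The name and signature are illustrative.
    """
    if max_messages <= 0:
        return []
    path = Path(hook_input["transcript_path"]).expanduser()
    records = []
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                # Skip malformed lines rather than failing the whole hook.
                continue
    return records[-max_messages:]
```

In a real hook the dict would typically come from `json.load(sys.stdin)`, assuming the client passes the payload on stdin. Skipping malformed lines is a deliberate choice here: a hook that crashes on one bad record would lose the whole extraction.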

Today this is a ~140-line bash script hitting a generic MCP server. Distillery could standardise the extraction prompt and make it reusable across clients.

Related Pattern: UserPromptSubmit Context Injection

The discussion in #191 also describes a complementary hook that fires on every user prompt and injects relevant memories into the system context:

"There's also a hook I once implemented that, when a user requests something, automatically adds 5 relevant 'memories' to the context that the LLM can then work with and hit the ground running. I always immediately noticed when it didn't work because the LLM would start asking 'stupid' questions that it didn't before."

This hook calls distillery_search with the user's prompt as the query and injects the top-k results. No new tool needed -- distillery_search already handles this. The gap is documentation and a reference hook script.
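
As a rough illustration of the injection step, the sketch below turns search results into a short text block. It assumes distillery_search returns a relevance-ordered list of dicts with a content field; that shape, and the name format_memories, are assumptions made for this example only.

```python
def format_memories(results: list[dict], k: int = 5) -> str:
    """Render the top-k search results as a plain-text context block.

    Assumption: `results` is already sorted by relevance and each item
    carries a "content" key. Returns "" when there is nothing to inject.
    """
    top = results[:k]
    if not top:
        return ""
    lines = ["Relevant memories from Distillery:"]
    for index, entry in enumerate(top, start=1):
        # Collapse internal whitespace so each memory stays on one line.
        content = " ".join(str(entry.get("content", "")).split())
        lines.append(f"{index}. {content}")
    return "\n".join(lines)
```

How the resulting text is handed back to the client depends on that client's hook contract. Returning an empty string when there are no results lets the hook script skip injection entirely.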

Feature Proposals

A. distillery_extract MCP Tool

A new tool that accepts a conversation transcript and returns structured entries ready to store.

Input:

Output: A list of candidate entries (content, suggested tags, suggested entry_type) that the caller can review and pass to distillery_store.

Why a tool rather than a skill? Skills are skippable -- the LLM may decide they are not relevant to the current focus. Hook-driven extraction needs to be reliable. A dedicated MCP tool is callable directly from the hook script without LLM involvement.

Extraction should identify:

  • Decisions made and their rationale
  • User preferences discovered during the session
  • Bugs found and their resolutions
  • Patterns and conventions established
  • Open questions or unresolved threads

Entries produced should use source: inference (available after #191) to signal lower trust than user-stated memories.
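
One possible shape for a candidate entry is sketched below. The fields mirror the output described above (content, suggested tags, suggested entry_type) plus source. The class name, the helper, and the example entry_type values are hypothetical and would need to match whatever distillery_store actually accepts.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class CandidateEntry:
    """A candidate produced by extraction, not yet stored."""

    content: str
    entry_type: str  # hypothetical values: "decision", "preference", "bug"
    tags: list[str] = field(default_factory=list)
    source: str = "inference"  # lower trust than user-stated memories


def to_store_arguments(candidate: CandidateEntry) -> dict:
    """Convert a candidate into an arguments dict for a store call.

    Assumption: the store tool takes these four fields by name.
    """
    if not candidate.content.strip():
        raise ValueError("candidate has no content")
    return asdict(candidate)
```

Defaulting source to inference means a caller has to opt out explicitly to claim higher trust, which matches the intent that extracted entries are marked as lower trust.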

B. Reference Hook Scripts

Provide example bash hook scripts in docs/hooks/ for:

  • PreCompact -- transcript extraction via distillery_extract
  • UserPromptSubmit -- context injection via distillery_search (top-5 results injected as system reminder)
  • SessionStart -- project-scoped context load via distillery_list with status=active

These should be drop-in examples covering the JSON-RPC-over-HTTP call pattern to distillery's MCP server, including the 1-second health check pattern from the original issue.
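
The call pattern could look roughly like the sketch below, written in Python rather than bash for readability. The tools/call envelope follows the MCP JSON-RPC convention. The health URL and the argument names passed to the tool are assumptions, and a real client may also need MCP session initialisation and headers, which are omitted here.

```python
import http.client
import json
import urllib.request


def server_is_up(health_url: str, timeout: float = 1.0) -> bool:
    """Return True if the server answers the health URL within `timeout`.

    Assumption: the server exposes some HTTP health endpoint; the URL is
    supplied by the caller. Any failure is treated as "not available".
    """
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP errors and bad URLs.
        return False


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> bytes:
    """Build the JSON-RPC 2.0 request body for an MCP tools/call."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(payload).encode("utf-8")
```

A hook would check server_is_up first and exit quietly on False, so that an unavailable server never blocks the user's session for more than about a second.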

C. New /extract Skill

A new /extract skill that guides the LLM through reviewing and confirming entries produced by distillery_extract. This gives users a manual fallback when the PreCompact hook is not configured, and a review step before auto-storing hook output.

Design Decisions to Resolve

  • Model selection: Should distillery_extract call an LLM internally, or return a structured prompt the caller uses? Internal call is simpler for users but adds an API key requirement server-side.
  • Transcript format: Claude Code's transcript_path is a file path. The tool could accept either a file path (server reads it) or transcript content directly (caller reads and sends). The latter is more portable across clients.
  • Auto-store vs. review: Should distillery_extract auto-store entries or return candidates for human review? Auto-store is convenient; review prevents noise accumulation.
  • Batching: Long transcripts may produce many candidates. Should extraction be paginated or capped?
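
To make the batching question concrete, here is one way it could work under a simple size budget. The character-based limit, the function names, and the hard cap are illustrative assumptions, not a proposed decision.

```python
def batch_messages(messages: list[str], max_chars: int) -> list[list[str]]:
    """Group messages into consecutive batches of at most `max_chars`.

    A single message longer than `max_chars` is kept whole in its own
    batch rather than being split.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    batches: list[list[str]] = []
    current: list[str] = []
    size = 0
    for message in messages:
        # Start a new batch when adding this message would exceed the budget.
        if current and size + len(message) > max_chars:
            batches.append(current)
            current, size = [], 0
        current.append(message)
        size += len(message)
    if current:
        batches.append(current)
    return batches


def cap_candidates(candidates: list, limit: int) -> list:
    """Keep at most `limit` candidates, preserving their order."""
    return candidates[: max(limit, 0)]
```

For example, `batch_messages(["aaaa", "bb", "cc", "dddddd"], 5)` yields `[["aaaa"], ["bb", "cc"], ["dddddd"]]`. Pagination would instead expose the batches one at a time, which trades a simpler tool contract for more round trips.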

Relationship to #191

Out of Scope

  • Changes to Claude Code itself
  • The UserPromptSubmit 30-prompt memory nudge (purely client-side, no server changes needed)
  • Session-level analytics dashboards
