Timmy AI Assistant

Timmy is TMI's planned conversational AI assistant for threat model analysis. Timmy operates within the scope of a single threat model and reasons over its data -- assets, threats, diagrams, documents, repositories, and notes -- to help you understand, analyze, and improve your threat models.

Status: Timmy is under active development on the dev/1.4.0 branch. The backend is functional: chat API endpoints, LLM integration via LangChainGo, vector embedding pipeline, content providers, and streaming responses are all implemented. The frontend chat UI is not yet implemented. See Implementation Status below for details.

Development demo video (YouTube)

Purpose

Timmy is inspired by Google's NotebookLM: a "grounded" chat that reasons over specific sources rather than answering from general knowledge alone. You control which sub-entities are included in the conversation via the timmy_enabled flag on each sub-entity, allowing you to focus the discussion on relevant material.
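The effect of the timmy_enabled flag can be pictured as a simple filter over a threat model's sub-entities. The sketch below is illustrative only: the `SubEntity` type, its fields, and `enabledSources` are hypothetical names and do not come from the TMI codebase.

```go
package main

import "fmt"

// SubEntity is a hypothetical, simplified view of a threat model sub-entity.
type SubEntity struct {
	Name         string
	Kind         string // e.g. "asset", "threat", "repository"
	TimmyEnabled bool   // mirrors the timmy_enabled flag
}

// enabledSources returns only the sub-entities that should be
// included in a Timmy conversation.
func enabledSources(all []SubEntity) []SubEntity {
	out := make([]SubEntity, 0, len(all))
	for _, e := range all {
		if e.TimmyEnabled {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	sources := []SubEntity{
		{Name: "Payments DB", Kind: "asset", TimmyEnabled: true},
		{Name: "Legacy notes", Kind: "note", TimmyEnabled: false},
		{Name: "api-server", Kind: "repository", TimmyEnabled: true},
	}
	for _, e := range enabledSources(sources) {
		fmt.Printf("%s (%s)\n", e.Name, e.Kind)
	}
}
```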

Problems Timmy Solves

Threat models are dense and hard to reason about holistically

A mature threat model contains dozens of assets, threats, data flows, and supporting documents. Humans struggle to hold all of that in mind simultaneously. Timmy can synthesize across the full model and surface connections, gaps, or inconsistencies that a person might miss.

Security review is bottlenecked on expert availability

Not every team has a senior security reviewer on hand. Timmy acts as an always-available collaborator -- it cannot replace a human reviewer, but it can help teams self-serve on initial analysis, ask better questions, and arrive at a review better prepared.

Threat modeling artifacts are underutilized after creation

Teams build threat models and then rarely revisit them conversationally. Timmy makes the model queryable: "What are the highest-risk data flows?", "Which assets lack mitigations?", "Summarize the threats related to authentication."

Onboarding to an existing threat model is slow

A new team member or reviewer joining a threat model must read through everything. Timmy can provide guided summaries and answer targeted questions, which shortens ramp-up time.

How Users Will Interact with Timmy

You navigate to a threat model's chat page, see your sources (sub-entities) in a sidebar, toggle which ones to include, and have a conversation. You can ask Timmy to:

Analyze threats -- identify highest-risk areas, evaluate threat severity, and assess coverage
Identify gaps -- find assets without threats, threats without mitigations, and incomplete data flows
Explain data flows -- summarize how data moves through the system based on DFD diagrams
Suggest mitigations -- recommend security controls based on identified threats
Summarize content -- provide overviews of the threat model or specific sub-entities
Answer questions -- respond to targeted queries about any aspect of the threat model

Previous chat sessions will be preserved and can be resumed.

Implementation Status

Completed (dev/1.4.0)

timmy_enabled field on all threat model sub-entity types: diagrams, assets, threats, documents, notes, and repositories. The field defaults to true and is also present on team notes and project notes
Database models for chat sessions, messages, embeddings, and usage tracking (TimmySession, TimmyMessage, TimmyEmbedding, TimmyUsage)
Database schema definitions for the four Timmy tables with proper indexes and foreign key constraints
Server configuration (TimmyConfig) with settings for LLM provider/model, dual embedding providers (text + code), retrieval parameters, rate limits, memory budgets, and chunking. See Configuration-Reference#timmy-ai-assistant for all variables. Timmy is disabled by default
Chat API endpoints -- REST endpoints for creating sessions (with SSE progress), sending messages (with SSE token streaming), listing sessions, and listing message history
LLM integration via LangChainGo -- provider-agnostic chat completion and embedding with OpenTelemetry instrumentation
Dual-index vector embedding pipeline -- text index for assets, threats, diagrams, documents, and notes; code index for repositories (optional, requires separate code embedding model). In-memory brute-force cosine similarity search with LRU eviction and memory budgeting
Content provider abstraction (ContentProvider interface and ContentProviderRegistry) for extracting plain text from source entities for embedding, including:
- Direct text extraction from database-resident entities (assets, threats, notes, repositories)
- JSON semantic extraction from DFD diagrams
- HTTP/HTML content extraction with SSRF protection
- PDF content extraction
- Content pipeline with pluggable sources and extractors (Google Drive, general HTTP)
Two-tier context building -- Tier 1 (entity overview) and Tier 2 (vector search results) assembled into LLM prompts
Rate limiting -- per-user message rate limiting (sliding window) and system-wide LLM concurrency control
SSRF validator for safely fetching external document URLs during content extraction
Import/export support in the frontend for the timmy_enabled field
Dual-index RAG (#241) -- text and code vector indexes with separate embedding models, external embedding ingestion API, optional query decomposition and cross-encoder reranking
Embedding automation API -- embedding-automation built-in group, /automation/embeddings/ endpoints for external tools to push pre-computed embeddings
Optional query decomposition -- LLM-driven splitting of user queries into index-specific sub-queries (off by default)
Optional cross-encoder reranking -- API-based reranker rescores merged results from both indexes for higher-precision context
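The in-memory brute-force cosine similarity search listed above can be sketched in a few lines of Go. This is a minimal illustration of the general technique, not the TMI implementation: the `Chunk` and `Hit` types and the `cosine` and `topK` functions are hypothetical names, and real embeddings would have hundreds of dimensions rather than two.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// Chunk is a hypothetical embedded piece of source content.
type Chunk struct {
	EntityID string
	Vector   []float32
}

// Hit pairs a chunk's entity with its similarity to the query.
type Hit struct {
	EntityID string
	Score    float64
}

// cosine returns the cosine similarity of a and b, or 0 when the
// vectors differ in length or either has zero magnitude.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK scores every chunk against the query (brute force) and
// returns the k highest-scoring hits in descending order.
func topK(query []float32, index []Chunk, k int) []Hit {
	hits := make([]Hit, 0, len(index))
	for _, c := range index {
		hits = append(hits, Hit{EntityID: c.EntityID, Score: cosine(query, c.Vector)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	index := []Chunk{
		{EntityID: "asset-1", Vector: []float32{1, 0}},
		{EntityID: "threat-7", Vector: []float32{0, 1}},
		{EntityID: "note-3", Vector: []float32{1, 1}},
	}
	for _, h := range topK([]float32{1, 0}, index, 2) {
		fmt.Printf("%s %.3f\n", h.EntityID, h.Score)
	}
}
```

Brute force scans every vector on each query, which is simple and exact; it stays practical as long as the per-threat-model index is small enough to hold in memory.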

In Progress (dev/1.4.0)

Additional content providers (#249) -- Confluence, OneDrive/SharePoint, and Google Workspace delegated access

Not Yet Implemented

Frontend chat UI -- Angular components for the chat page, source sidebar, and session management

Architecture Decisions

Key decisions from the backend design discussion:

LLM integration: Provider-agnostic via LangChainGo, allowing operators to choose their LLM provider.
Vector store: In-memory brute-force cosine similarity search with database-serialized embeddings (rows-per-embedding). No separate vector database required. Dual indexes (text + code) with composite keys per threat model.
Conversation storage: Normal relational tables in the existing threat model database.
Memory management: Explicit budget with LRU eviction and session admission control under memory pressure. Single shared memory pool across both index types.
Scope: Two vector indexes per threat model (text + code), loaded on demand, evicted independently after inactivity.
Entity-to-index mapping: Strict -- repositories go to the code index, all other entity types go to the text index.
Query pipeline: Optional two-stage enhancement -- LLM-driven query decomposition splits user questions into index-specific sub-queries (off by default), and cross-encoder reranking rescores merged results for higher precision (requires separate reranker model). Both degrade gracefully when not configured.
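The memory management decision (an explicit budget, LRU eviction, and a single pool shared by both index types) can be illustrated with a small cache keyed per threat model and index type. This is a sketch under stated assumptions: `IndexCache`, `Admit`, `Touch`, and the `tm-1/text` key format are hypothetical, and admission control is reduced here to refusing any index that could never fit in the budget.

```go
package main

import (
	"container/list"
	"fmt"
)

// cachedIndex records the size of one loaded vector index.
type cachedIndex struct {
	key   string // hypothetical composite key, e.g. "tm-1/text"
	bytes int64
}

// IndexCache is one memory pool shared by text and code indexes.
type IndexCache struct {
	budget int64
	used   int64
	order  *list.List // front = most recently used
	items  map[string]*list.Element
}

func NewIndexCache(budget int64) *IndexCache {
	return &IndexCache{budget: budget, order: list.New(), items: make(map[string]*list.Element)}
}

// Touch marks an index as recently used and reports whether it is loaded.
func (c *IndexCache) Touch(key string) bool {
	el, ok := c.items[key]
	if ok {
		c.order.MoveToFront(el)
	}
	return ok
}

// Admit loads an index of the given size, evicting least recently used
// indexes until it fits. It returns the evicted keys, and false when the
// index cannot fit in the budget at all.
func (c *IndexCache) Admit(key string, size int64) (evicted []string, ok bool) {
	if size <= 0 || size > c.budget {
		return nil, false
	}
	if c.Touch(key) {
		return nil, true // already loaded
	}
	for c.used+size > c.budget {
		oldest := c.order.Back()
		victim := oldest.Value.(cachedIndex)
		c.order.Remove(oldest)
		delete(c.items, victim.key)
		c.used -= victim.bytes
		evicted = append(evicted, victim.key)
	}
	c.items[key] = c.order.PushFront(cachedIndex{key: key, bytes: size})
	c.used += size
	return evicted, true
}

func main() {
	cache := NewIndexCache(100)
	cache.Admit("tm-1/text", 60)
	cache.Admit("tm-1/code", 30)
	cache.Touch("tm-1/text") // text index was used most recently
	evicted, ok := cache.Admit("tm-2/text", 40)
	fmt.Println(evicted, ok)
}
```

Because each index has its own key, the text and code indexes of the same threat model can be evicted independently, as described in the Scope decision.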

Query Pipeline Architecture

flowchart TD
    A[User Message] --> B{Query Decomposer\nconfigured?}
    B -->|yes| C[LLM Decompose:\ntext_query + code_query]
    B -->|no| D[Use original query\nfor both indexes]
    C --> E[Embed text_query\nwith text model]
    C --> F[Embed code_query\nwith code model]
    D --> E
    D --> F
    E --> G[Search Text Index\ntop-K results]
    F --> H{Code Index\nconfigured?}
    H -->|yes| I[Search Code Index\ntop-K results]
    H -->|no| J[Skip]
    G --> K[Merge Candidates]
    I --> K
    J --> K
    K --> L{Reranker\nconfigured?}
    L -->|yes| M[Cross-Encoder Rerank\nwith original query]
    L -->|no| N[Use merged results\nas-is]
    M --> O[Apply Rerank top-K\ncutoff]
    O --> P[Format Tier 2 Context]
    N --> P
    P --> Q[Assemble Full Prompt:\nBase + Tier 1 + Tier 2]
    Q --> R[LLM Synthesis\nstreaming response]
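
The retrieval half of the flowchart, including its graceful degradation when optional stages are not configured, can be sketched as follows. All names here (`Pipeline`, `Decomposer`, `Searcher`, `Reranker`, `Retrieve`, `scoreDescReranker`) are hypothetical, the embedding step is folded into each `Searcher` for brevity, and a nil field stands in for "not configured".

```go
package main

import (
	"fmt"
	"sort"
)

// Candidate is a retrieved chunk with a relevance score.
type Candidate struct {
	Text  string
	Score float64
}

type Decomposer func(query string) (textQuery, codeQuery string)
type Searcher func(query string, k int) []Candidate
type Reranker func(query string, in []Candidate) []Candidate

// Pipeline wires the stages together; nil optional stages are skipped.
type Pipeline struct {
	Decompose  Decomposer // optional
	SearchText Searcher   // required
	SearchCode Searcher   // optional (no code index configured)
	Rerank     Reranker   // optional
	TopK       int
	RerankTopK int
}

// Retrieve returns the candidates that would become Tier 2 context.
func (p Pipeline) Retrieve(query string) []Candidate {
	textQ, codeQ := query, query // default: original query for both indexes
	if p.Decompose != nil {
		textQ, codeQ = p.Decompose(query)
	}
	merged := append([]Candidate{}, p.SearchText(textQ, p.TopK)...)
	if p.SearchCode != nil {
		merged = append(merged, p.SearchCode(codeQ, p.TopK)...)
	}
	if p.Rerank == nil {
		return merged // use merged results as-is
	}
	ranked := p.Rerank(query, merged) // rerank against the original query
	if p.RerankTopK > 0 && len(ranked) > p.RerankTopK {
		ranked = ranked[:p.RerankTopK]
	}
	return ranked
}

// scoreDescReranker is a stand-in reranker that sorts by existing score.
func scoreDescReranker(_ string, in []Candidate) []Candidate {
	out := append([]Candidate{}, in...)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	p := Pipeline{
		SearchText: func(q string, k int) []Candidate {
			return []Candidate{{Text: "text hit for " + q, Score: 0.4}}
		},
		SearchCode: func(q string, k int) []Candidate {
			return []Candidate{{Text: "code hit for " + q, Score: 0.8}}
		},
		Rerank:     scoreDescReranker,
		TopK:       5,
		RerankTopK: 1,
	}
	fmt.Println(p.Retrieve("how is auth enforced?"))
}
```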

Developer Inspection Aids

Dump extracted text to a Note (`dump_extracted_text_to_note`)

Verifying extractor output (PDF, DOCX, PPTX, XLSX, HTML, plain text) on real-world documents is hard from logs alone. The timmy.dump_extracted_text_to_note flag turns every successful run of the content-extraction pipeline into a Note on the document's parent threat model, so the extracted markdown can be inspected directly in the existing UI.

| Field | Value |
| --- | --- |
| YAML key | `timmy.dump_extracted_text_to_note` |
| Environment variable | `TMI_TIMMY_DUMP_EXTRACTED_TEXT_TO_NOTE` |
| Default | `false` |
| Allowed in production | No. The server refuses to start if the flag is on and `auth.build_mode == "production"` |
| Note title format | `[extracted] {document.name} @ {ISO8601 timestamp}` |
| Note body | Extracted markdown (verbatim) |
| Trigger | Each successful extraction run by the access poller after a document transitions to accessible. Failed extractions still classify and persist diagnostics; they do not create a Note. |
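
As an illustration of how the environment variable and its `false` default might be read, the sketch below parses the flag with Go's `strconv.ParseBool`. How the TMI server actually parses this value is not stated here, so the parsing rules and the `dumpFlagFromEnv` helper are assumptions; only the variable name and the default come from the table.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// dumpFlagEnv is the documented environment variable name.
const dumpFlagEnv = "TMI_TIMMY_DUMP_EXTRACTED_TEXT_TO_NOTE"

// dumpFlagFromEnv reads the flag through the supplied lookup function
// (os.LookupEnv in production code, a stub in tests). An unset or empty
// variable yields the documented default of false.
func dumpFlagFromEnv(lookup func(string) (string, bool)) (bool, error) {
	raw, set := lookup(dumpFlagEnv)
	if !set || raw == "" {
		return false, nil
	}
	enabled, err := strconv.ParseBool(raw) // assumption: ParseBool semantics
	if err != nil {
		return false, fmt.Errorf("%s: %w", dumpFlagEnv, err)
	}
	return enabled, nil
}

func main() {
	enabled, err := dumpFlagFromEnv(os.LookupEnv)
	if err != nil {
		fmt.Println("invalid flag value:", err)
		return
	}
	fmt.Println("dump_extracted_text_to_note enabled:", enabled)
}
```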

Operator notes:

Strictly an inspection aid. Note creation failures are logged and otherwise ignored, because the dump must never alter the behavior of the extraction pipeline.
The dumper is wired only when the flag is on at startup; an attempt to start with the flag enabled in a production build aborts with a clear error.
When enabled, the server emits a startup WARN log so it's visible in dev/test that "every successful extraction will be persisted as a Note".
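
The operator notes above describe two behaviors that are easy to show in code: the startup guard and the rule that dump failures never propagate. The sketch below is a hedged illustration, not the server's code. `validateDumpFlag`, `NoteCreator`, and `dumpExtraction` are hypothetical names, and using UTC with Go's RFC 3339 layout for the ISO 8601 timestamp is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"time"
)

// validateDumpFlag mirrors the documented startup rule: the flag must
// not be on when auth.build_mode is "production".
func validateDumpFlag(dumpEnabled bool, buildMode string) error {
	if dumpEnabled && buildMode == "production" {
		return errors.New(`timmy.dump_extracted_text_to_note must not be enabled when auth.build_mode is "production"`)
	}
	return nil
}

// NoteCreator is a hypothetical hook that persists a Note.
type NoteCreator func(title, body string) error

// dumpExtraction writes the extracted markdown to a Note. It deliberately
// returns nothing: a failure is logged and dropped so the extraction
// pipeline behaves the same whether or not the dump succeeds.
func dumpExtraction(create NoteCreator, docName, markdown string, now time.Time) {
	if create == nil {
		return // dumper not wired because the flag was off at startup
	}
	title := fmt.Sprintf("[extracted] %s @ %s", docName, now.UTC().Format(time.RFC3339))
	if err := create(title, markdown); err != nil {
		log.Printf("dump_extracted_text_to_note: could not create note %q: %v", title, err)
	}
}

func main() {
	if err := validateDumpFlag(true, "production"); err != nil {
		fmt.Println("startup aborted:", err)
	}
	failing := func(title, body string) error { return errors.New("database unavailable") }
	dumpExtraction(failing, "design.pdf", "# Extracted text", time.Now())
	fmt.Println("pipeline continues after a failed dump")
}
```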

Related Pages

Architecture-and-Design -- System architecture and design decisions
REST-API-Reference -- API endpoint reference

Related Issues

Server backend: ericfitz/tmi#214
Client UX: ericfitz/tmi-ux#293
Dev-mode dump-to-Note hook: ericfitz/tmi#337


