Commit 6eb6031

feat(team): harden share pipeline — Ollama client, helper extraction, segmented sync
2 parents e148c75 + 891fb05 commit 6eb6031

18 files changed: 857 additions & 462 deletions

docs/demo-script.md

Lines changed: 323 additions & 0 deletions
@@ -0,0 +1,323 @@
# Smriti Demo: From Deep Dive to Team Knowledge

## The Problem

Priya is a senior engineer at a startup. She just spent 2 hours in a Claude
Code session doing a deep review of their payment service — a critical codebase
she inherited when the original author left.

During the session, she and Claude:

- Traced a race condition in the webhook handler that causes duplicate charges
- Discovered the retry logic uses `setTimeout` instead of exponential backoff
- Decided to replace the hand-rolled queue with BullMQ
- Found that the Stripe SDK is 3 major versions behind and the API they use is deprecated
- Mapped out the full payment flow across 14 files
- Identified 3 missing error boundaries that silently swallow failures

That's a **goldmine** of institutional knowledge. But the Claude session is
just a 400-message transcript buried in `~/.claude/projects/`. Tomorrow, when
her teammate Arjun picks up the webhook fix, he'll start from scratch. When the
intern asks "why BullMQ?", nobody will remember the tradeoff analysis.

**This is the problem Smriti solves.**

---
## Act 1: The Session Ends

Priya's Claude Code session just finished. Here's what her terminal looks like:

```
$ # Session over. 2 hours of deep review — bugs, decisions, architecture notes.
$ # All sitting in a Claude transcript she'll never look at again.
```

She has two paths to preserve this knowledge:

| Path | Command | What it does |
|------|---------|--------------|
| **Ingest** | `smriti ingest claude` | Import into searchable memory (personal) |
| **Share** | `smriti share --segmented` | Export as team documentation (git-committed) |

She'll do both.

---
## Act 2: Ingest — Building Personal Memory

```
$ smriti ingest claude --project payments
```

```
Discovering sessions...
Found 1 new session in payments

Agent: claude-code
Sessions found: 1
Sessions ingested: 1
Messages ingested: 412
Skipped: 0
```

That's it. 412 messages are now indexed — full-text searchable with BM25,
ready for vector embedding, tagged with project and agent metadata.

**What just happened under the hood:**

1. Smriti found the JSONL transcript in `~/.claude/projects/-Users-priya-src-payments/`
2. Parsed every message, tool call, file edit, and error
3. Stored messages in QMD's content-addressable store (SHA256 dedup)
4. Registered the session with project = `payments`, agent = `claude-code`
5. Auto-indexed into FTS5 for instant search
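Step 3 — content-addressable storage with SHA256 dedup — can be sketched in a few
lines. This is a hypothetical illustration, not Smriti's actual implementation: the
`Message` type, `contentHash`, and `ingest` names are invented, and an in-memory
`Map` stands in for QMD's store.

```typescript
// Sketch of SHA256-based dedup on ingest (hypothetical shapes).
import { createHash } from "node:crypto";

interface Message {
  role: "user" | "assistant";
  content: string;
}

// A message's identity is the SHA256 of its canonical JSON form, so
// re-ingesting the same transcript stores nothing new.
function contentHash(msg: Message): string {
  return createHash("sha256").update(JSON.stringify(msg)).digest("hex");
}

function ingest(
  store: Map<string, Message>,
  messages: Message[],
): { ingested: number; skipped: number } {
  let ingested = 0;
  let skipped = 0;
  for (const msg of messages) {
    const hash = contentHash(msg);
    if (store.has(hash)) {
      skipped++; // already stored: dedup by content hash
    } else {
      store.set(hash, msg);
      ingested++;
    }
  }
  return { ingested, skipped };
}

const store = new Map<string, Message>();
const msgs: Message[] = [
  { role: "user", content: "What's the fix?" },
  { role: "assistant", content: "Check idempotency before processing." },
  { role: "user", content: "What's the fix?" }, // duplicate delivery
];
console.log(ingest(store, msgs)); // { ingested: 2, skipped: 1 }
```

Because identity is derived from content, running `smriti ingest claude` twice over
the same transcript is naturally idempotent.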
Now Priya can search her memory:

```
$ smriti search "race condition webhook" --project payments
```

```
[0.891] Payment Service Deep Review
assistant: The race condition occurs in src/webhooks/stripe.ts at line 47.
The handler processes the event, then checks idempotency — but between
those two operations, a duplicate webhook can slip through...

[0.823] Payment Service Deep Review
user: What's the fix? Can we just add a mutex?

[0.756] Payment Service Deep Review
assistant: A mutex won't work in a multi-instance deployment. The proper
fix is to check idempotency BEFORE processing, using a database-level
unique constraint on the event ID...
```

Three weeks later, she barely remembers the session. But she can recall it:

```
$ smriti recall "why did we decide on BullMQ for payments" --synthesize
```

```
[0.834] Payment Service Deep Review
assistant: After comparing the options, BullMQ is the clear winner...

--- Synthesis ---

The decision to adopt BullMQ for the payment queue was made during a deep
review of the payment service. The existing implementation used a hand-rolled
queue with setTimeout-based retries, which had several issues:

1. No exponential backoff — failed jobs retry immediately, hammering Stripe
2. No dead-letter queue — permanently failed jobs disappear silently
3. No persistence — server restart loses the entire queue
4. No visibility — no way to inspect pending/failed jobs

BullMQ was chosen over alternatives:
- **pg-boss**: Good, but adds Postgres load to an already-strained DB
- **Custom Redis queue**: Reinventing the wheel; BullMQ is battle-tested
- **SQS/Cloud queue**: Adds AWS dependency the team wants to avoid

BullMQ provides exponential backoff, dead-letter queues, Redis persistence,
and a dashboard (Bull Board) — solving all four issues.
```

That synthesis didn't come from a new LLM call about BullMQ. It came from
**Priya's actual reasoning during the review**, reconstructed from her
session memory.

---
## Act 3: Share — Exporting Team Knowledge

Ingesting is personal. Sharing is for the team.

```
$ smriti share --project payments --segmented
```

```
Segmenting session: Payment Service Deep Review...
Found 5 knowledge units (3 above relevance threshold)
Generating documentation...

Output: /Users/priya/src/payments/.smriti
Files created: 3
Files skipped: 0
```

Here's what Smriti's 3-stage pipeline just did:

**Stage 1 — Segment**: Analyzed the 412-message session and identified 5
distinct knowledge units:

| Unit | Category | Relevance | Action |
|------|----------|-----------|--------|
| Webhook race condition | bug/investigation | 9 | Shared |
| BullMQ decision | architecture/decision | 8 | Shared |
| Stripe SDK deprecation | project/dependency | 7 | Shared |
| General code navigation | uncategorized | 3 | Filtered out |
| Test setup discussion | uncategorized | 2 | Filtered out |
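The Stage 1 threshold filter can be sketched as follows. The types and the cutoff
value are assumptions for illustration — the real scoring comes from the
segmentation pass and the actual threshold isn't shown in this commit:

```typescript
// Sketch of the relevance-threshold filter (hypothetical types and cutoff).
interface KnowledgeUnit {
  title: string;
  category: string;
  relevance: number; // 1-10, assigned during segmentation
}

const RELEVANCE_THRESHOLD = 5; // assumed cutoff for this sketch

// Only units at or above the threshold become shared documents; low-signal
// chatter (code navigation, test setup) is filtered out.
function selectShareable(units: KnowledgeUnit[]): KnowledgeUnit[] {
  return units.filter((u) => u.relevance >= RELEVANCE_THRESHOLD);
}

const units: KnowledgeUnit[] = [
  { title: "Webhook race condition", category: "bug/investigation", relevance: 9 },
  { title: "BullMQ decision", category: "architecture/decision", relevance: 8 },
  { title: "Stripe SDK deprecation", category: "project/dependency", relevance: 7 },
  { title: "General code navigation", category: "uncategorized", relevance: 3 },
  { title: "Test setup discussion", category: "uncategorized", relevance: 2 },
];
console.log(selectShareable(units).length); // 3 of the 5 units pass
```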
**Stage 2 — Document**: Generated structured markdown using category-specific
templates. A bug gets Symptoms → Root Cause → Fix → Prevention. A decision
gets Context → Options → Decision → Consequences.

**Stage 3 — Persist**: Wrote files, deduplicated via content hash, updated the
manifest.
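A minimal sketch of the Stage 3 dedup, assuming a manifest that maps file paths to
content hashes (the `Manifest` shape and `persist` name are hypothetical — this is
one way such an idempotent writer could work, not Smriti's actual code):

```typescript
// Sketch of hash-based persistence: a document is rewritten only when its
// content hash changes, so re-sharing the same session is idempotent.
import { createHash } from "node:crypto";

interface Manifest {
  [path: string]: string; // file path -> SHA256 of last written content
}

function persist(
  manifest: Manifest,
  path: string,
  markdown: string,
  write: (p: string, c: string) => void,
): "created" | "skipped" {
  const hash = createHash("sha256").update(markdown).digest("hex");
  if (manifest[path] === hash) return "skipped"; // unchanged since last share
  write(path, markdown);
  manifest[path] = hash;
  return "created";
}

const manifest: Manifest = {};
const writes: string[] = [];
persist(manifest, "knowledge/bug.md", "# Webhook Race", (p) => writes.push(p));
persist(manifest, "knowledge/bug.md", "# Webhook Race", (p) => writes.push(p));
console.log(writes.length); // 1 — second share was skipped
```

This is why the share output above reports both `Files created` and `Files skipped`.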
Here's what landed on disk:

```
payments/
└── .smriti/
    ├── CLAUDE.md          # Auto-discovered by Claude Code
    ├── index.json
    ├── config.json
    └── knowledge/
        ├── bug-investigation/
        │   └── 2026-02-28_webhook-race-condition-duplicate-charges.md
        ├── architecture-decision/
        │   └── 2026-02-28_bullmq-for-payment-queue.md
        └── project-dependency/
            └── 2026-02-28_stripe-sdk-v3-deprecation.md
```

Let's look at the bug document:
```markdown
---
id: unit-a1b2c3
session_id: 6de3c493-60fa
category: bug/investigation
pipeline: segmented
relevance_score: 9
entities: ["Stripe webhooks", "idempotency", "race condition", "PostgreSQL"]
files: ["src/webhooks/stripe.ts", "src/db/events.ts"]
project: payments
author: priya
shared_at: 2026-02-28T17:45:00Z
---

# Webhook Race Condition Causing Duplicate Charges

## Symptoms

Customers occasionally receive duplicate charges for a single purchase.
The issue occurs under high webhook volume — Stripe sends the same event
twice within milliseconds, and both get processed.

## Root Cause

In `src/webhooks/stripe.ts`, the handler processes the event first, then
checks the idempotency table. Between processing and the idempotency check,
a duplicate webhook slips through.

The vulnerable window is ~15ms (database round-trip time), which is enough
for Stripe's retry mechanism to deliver a duplicate.

## Investigation

Traced the flow: `handleWebhook()` → `processEvent()` → `markProcessed()`.
The idempotency check happens inside `markProcessed()`, AFTER the charge
is executed. It should happen BEFORE.

## Fix

Move the idempotency check to the entry point of `handleWebhook()`:

1. Add a `UNIQUE` constraint on `webhook_events.stripe_event_id`
2. `INSERT OR IGNORE` before processing — if the insert fails, the event
   was already handled
3. Wrap the entire handler in a database transaction

## Prevention

- Add integration test that fires duplicate webhooks concurrently
- Add monitoring alert on duplicate event IDs in the events table
- Consider adding Stripe's recommended `idempotency-key` header to all
  API calls
```
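The claim-before-process pattern the document prescribes can be sketched like this.
In production the atomicity comes from the database `UNIQUE` constraint (a single
`INSERT OR IGNORE` statement); here a single-threaded `Set` stands in for the
`webhook_events` table, and all names are hypothetical:

```typescript
// Sketch of idempotent webhook handling: claim the event ID BEFORE doing any
// work, so a redelivered event bails out before charging.
function makeHandler(processEvent: (id: string) => void) {
  // Stands in for the webhook_events.stripe_event_id UNIQUE constraint.
  const claimed = new Set<string>();
  return function handleWebhook(eventId: string): "processed" | "duplicate" {
    // Claim first: if the ID is already present, the "insert" fails and we
    // return before processing — closing the race window described above.
    if (claimed.has(eventId)) return "duplicate";
    claimed.add(eventId);
    processEvent(eventId);
    return "processed";
  };
}

const charges: string[] = [];
const handle = makeHandler((id) => charges.push(id));
handle("evt_123");
handle("evt_123"); // Stripe redelivery
console.log(charges.length); // 1 — no duplicate charge
```

Note that a `Set` only demonstrates the ordering; under concurrent instances, only
a database-level unique constraint (or equivalent) makes the claim step atomic.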
That's not a raw transcript. It's a **structured incident document** that any
engineer can read, understand, and act on — without ever having been in the
original session.

---
## Act 4: The Payoff

### Monday morning — Arjun picks up the webhook fix

He opens the payments repo. Claude Code automatically reads
`.smriti/CLAUDE.md` and sees the shared knowledge index.

```
$ smriti search "webhook duplicate" --project payments
```

He finds the full investigation, root cause, and fix — before writing a
single line of code.

### Two weeks later — the intern asks "why BullMQ?"

```
$ smriti recall "why BullMQ instead of pg-boss" --synthesize --project payments
```

The original tradeoff analysis surfaces instantly, with Priya's reasoning
preserved verbatim.

### A month later — Priya reviews a different service

She notices the same setTimeout retry pattern:

```
$ smriti search "setTimeout retry" --category bug
```

Her earlier finding surfaces. She already knows the fix.

---
## The Commands

```bash
# After a deep session — capture everything
smriti ingest claude

# Share structured knowledge with the team
smriti share --project payments --segmented

# Commit shared knowledge to git
cd /path/to/payments
git add .smriti/
git commit -m "docs: share payment service review findings"

# Teammates sync the knowledge
smriti sync --project payments

# Search across all your sessions
smriti search "race condition" --project payments

# Get synthesized answers from memory
smriti recall "how should we handle retries" --synthesize

# Check what you've captured
smriti status
```

---
## What Makes This Different

| Without Smriti | With Smriti |
|---|---|
| Session transcript sits in `~/.claude/` forever | Searchable, indexed, synthesizable memory |
| Knowledge dies when the session closes | Knowledge persists across sessions and engineers |
| Teammates start from scratch | Teammates find existing analysis instantly |
| "Why did we decide X?" — nobody remembers | `smriti recall "why X" --synthesize` |
| Deep dives produce code changes only | Deep dives produce code changes + documentation |

The session is ephemeral. The knowledge doesn't have to be.

src/db.ts

Lines changed: 13 additions & 4 deletions

```diff
@@ -62,11 +62,20 @@ function initializeQmdStore(db: Database): void {
     )
   `);

-  // Create virtual vec table for sqlite-vec
+  // vectors_vec is managed by QMD at embedding time because dimensions depend on
+  // the active embedding model. Do not eagerly create it here.
+  // Migration: older Smriti versions created an incompatible vectors_vec table
+  // (embedding-only, no hash_seq), which breaks embed/search paths.
   try {
-    db.exec(`CREATE VIRTUAL TABLE IF NOT EXISTS vectors_vec USING vec0(embedding float[1536])`);
+    const vecTable = db
+      .prepare(`SELECT sql FROM sqlite_master WHERE type='table' AND name='vectors_vec'`)
+      .get() as { sql: string } | null;
+
+    if (vecTable?.sql && !vecTable.sql.includes("hash_seq")) {
+      db.exec(`DROP TABLE IF EXISTS vectors_vec`);
+    }
   } catch {
-    // May fail if model doesn't support this dimension, that's OK
+    // If sqlite-vec isn't loaded or table introspection fails, continue.
   }
 }

@@ -356,7 +365,7 @@ export function initializeSmritiTables(db: Database): void {
     CREATE INDEX IF NOT EXISTS idx_smriti_shares_hash
       ON smriti_shares(content_hash);
     CREATE INDEX IF NOT EXISTS idx_smriti_shares_unit
-      ON smriti_shares(content_hash, unit_id);
+      ON smriti_shares(unit_id);

     -- Indexes (sidecar tables)
     CREATE INDEX IF NOT EXISTS idx_smriti_tool_usage_session
```
