What happened?
Noticed this while load-testing the OpenCode plugin hooks. chat.message appears to treat inputHash as a dedup key, but concurrent calls with the same payload can still create two rows.
In src/opencode.ts, the flow is check-then-insert:
- lookup via observation.getByInputHash(projectId, hash)
- if not found, call observation.add(...)
At the storage layer, src/core/db.ts has input_hash plus a non-unique index (idx_observations_project_input_hash), but no unique constraint for (project_id, input_hash). Under concurrency, both calls can miss the lookup and insert.
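To make the interleaving concrete, here is a minimal in-memory sketch of the same check-then-insert shape. FakeStore and recordCheckThenInsert are hypothetical stand-ins written for this illustration, not the obsxa API; the only assumption is that the lookup and the insert are separate awaited calls, as described above.

```typescript
// Hypothetical in-memory stand-in for the observations table; not the obsxa API.
type Row = { id: number; projectId: string; inputHash: string; frequency: number };

class FakeStore {
  rows: Row[] = [];

  // Async lookup: the await models a database round trip.
  async getByInputHash(projectId: string, inputHash: string): Promise<Row | undefined> {
    await Promise.resolve();
    return this.rows.find((r) => r.projectId === projectId && r.inputHash === inputHash);
  }

  // Plain insert with no uniqueness check, like a table without a unique constraint.
  async add(projectId: string, inputHash: string): Promise<Row> {
    await Promise.resolve();
    const row: Row = { id: this.rows.length + 1, projectId, inputHash, frequency: 1 };
    this.rows.push(row);
    return row;
  }
}

// Check-then-insert: the gap between the lookup and the insert is the race window.
async function recordCheckThenInsert(store: FakeStore, projectId: string, inputHash: string): Promise<void> {
  const existing = await store.getByInputHash(projectId, inputHash);
  if (existing) {
    existing.frequency += 1;
    return;
  }
  await store.add(projectId, inputHash);
}

// Two concurrent calls with the same key: both lookups run before either insert.
async function demoRace(): Promise<Row[]> {
  const store = new FakeStore();
  await Promise.all([
    recordCheckThenInsert(store, "p1", "abc"),
    recordCheckThenInsert(store, "p1", "abc"),
  ]);
  return store.rows;
}

demoRace().then((rows) => {
  console.log("count", rows.length); // prints: count 2
  console.log("freqs", rows.map((r) => r.frequency).join(",")); // prints: freqs 1,1
});
```

Both lookups resolve against an empty table before either insert lands, so neither call takes the increment branch.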
Expected:
- one observation row
- frequency incremented to reflect the repeated event
Actual:
- duplicate observation rows with frequency 1
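One possible shape for the expected behavior is an insert-or-increment keyed on (project_id, input_hash). The sketch below models that in memory. UniqueKeyStore and upsert are hypothetical names for illustration only, and the SQL in the comment assumes a SQLite-backed store (3.24 or later) with a hypothetical index name; it has not been checked against the real schema.

```typescript
// Hypothetical in-memory model of a table with a unique key on (projectId, inputHash).
type Observation = { projectId: string; inputHash: string; frequency: number };

class UniqueKeyStore {
  private byKey = new Map<string, Observation>();

  // Insert-or-increment with no await between the check and the write, so
  // concurrent callers cannot interleave inside it. Assumed SQLite equivalent:
  //   CREATE UNIQUE INDEX uq_observations_project_input_hash
  //     ON observations (project_id, input_hash);
  //   INSERT INTO observations (project_id, input_hash, frequency)
  //     VALUES (?, ?, 1)
  //     ON CONFLICT (project_id, input_hash)
  //     DO UPDATE SET frequency = frequency + 1;
  async upsert(projectId: string, inputHash: string): Promise<Observation> {
    await Promise.resolve(); // models the round trip to the database
    const key = JSON.stringify([projectId, inputHash]);
    const existing = this.byKey.get(key);
    if (existing) {
      existing.frequency += 1;
      return existing;
    }
    const created: Observation = { projectId, inputHash, frequency: 1 };
    this.byKey.set(key, created);
    return created;
  }

  list(projectId: string): Observation[] {
    return Array.from(this.byKey.values()).filter((o) => o.projectId === projectId);
  }
}

// Same concurrent pattern as the repro: two calls, one payload hash.
async function demoUpsert(): Promise<Observation[]> {
  const store = new UniqueKeyStore();
  await Promise.all([store.upsert("p1", "abc"), store.upsert("p1", "abc")]);
  return store.list("p1");
}

demoUpsert().then((rows) => {
  console.log("count", rows.length); // prints: count 1
  console.log("freqs", rows.map((r) => r.frequency).join(",")); // prints: freqs 2
});
```

If the fix goes the unique-index route, existing duplicate rows would need to be merged first, since creating a unique index fails while duplicates are present.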
How to reproduce
node --experimental-strip-types --input-type=module -e "import { mkdtempSync, rmSync } from 'node:fs'; import { join } from 'node:path'; import { tmpdir } from 'node:os'; import { createObsxaPlugin } from './src/opencode.ts'; import { createObsxa } from './src/index.ts'; const dir = mkdtempSync(join(tmpdir(), 'obsxa-plugin-')); const db = join(dir, 'obsxa.db'); const plugin = createObsxaPlugin({ db, projectId: 'p1', projectName: 'P1' }); const hooks = await plugin({ project: { id: 'p1' }, directory: dir, worktree: dir }); const msgIn = { sessionID: 's1', agent: 'assistant', messageID: 'm1' }; const msgOut = { message: { summary: { title: 'same' } }, parts: [{ type: 'text', text: 'same payload message that should dedupe by input hash' }] }; await Promise.all([hooks['chat.message']?.(msgIn, msgOut), hooks['chat.message']?.(msgIn, msgOut)]); const obsxa = await createObsxa({ db }); const rows = await obsxa.observation.list('p1'); console.log('count', rows.length); console.log('freqs', rows.map(r => r.frequency).join(',')); await obsxa.close(); await hooks.destroy?.(); rmSync(dir, { recursive: true, force: true });"
Observed output:
count 2
freqs 1,1
Anything else?
This looks like a data integrity bug for dedup semantics in plugin ingestion, not just a cosmetic duplicate. It can inflate counts and skew any downstream analysis that assumes one hashed payload maps to one observation identity.
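For gauging how much existing data is affected, a rough helper along these lines could group listed rows by hash. It assumes each row returned by observation.list exposes id, inputHash, and frequency; only frequency is confirmed by the repro, so the row shape and the findDuplicateGroups name are hypothetical.

```typescript
// Minimal row shape; `id` and `inputHash` on listed rows are assumptions.
type ObservationRow = { id: number; inputHash: string | null; frequency: number };

// Group rows by inputHash and keep only the groups with more than one row.
function findDuplicateGroups(rows: ObservationRow[]): Map<string, ObservationRow[]> {
  const groups = new Map<string, ObservationRow[]>();
  for (const row of rows) {
    if (row.inputHash === null) continue; // rows without a hash cannot be deduplicated
    const group = groups.get(row.inputHash) ?? [];
    group.push(row);
    groups.set(row.inputHash, group);
  }
  const duplicates = new Map<string, ObservationRow[]>();
  groups.forEach((group, hash) => {
    if (group.length > 1) duplicates.set(hash, group);
  });
  return duplicates;
}

// Example data shaped like the repro output, plus one unaffected row.
const sample: ObservationRow[] = [
  { id: 1, inputHash: "abc", frequency: 1 },
  { id: 2, inputHash: "abc", frequency: 1 },
  { id: 3, inputHash: "def", frequency: 4 },
];

findDuplicateGroups(sample).forEach((group, hash) => {
  // The merged frequency is what a single deduplicated row would have carried.
  const merged = group.reduce((sum, r) => sum + r.frequency, 0);
  console.log(hash, "rows:", group.map((r) => r.id).join(","), "merged frequency:", merged);
});
// prints: abc rows: 1,2 merged frequency: 2
```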
Related code paths:
src/opencode.ts (findByHash, chat.message)
src/core/observation.ts (getByInputHash, add, incrementFrequency)
src/core/db.ts (observations.inputHash, idx_observations_project_input_hash)