[enhancement] import-markdown: batch embedding + bulkStore for large memory imports #728

## Problem

The `import-markdown` CLI currently handles large numbers of bullet points **one entry at a time**:

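```typescript
// cli.ts:714-723 (current implementation, abridged)
for (const line of lines) {
  const text = line.slice(2).trim();                       // bullet text with the list marker stripped
  const vector = await ctx.embedder!.embedPassage(text);  // N API calls
  await ctx.store.store({ text, vector, /* ... */ });      // N lock acquisitions
}
```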

A .md file with 100 bullet points therefore triggers 100 embedding API calls plus 100 store lock acquisitions, so the import cost grows linearly with the number of bullets.

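## Observed bottlenecks

| Step | Current behavior | Proposed optimization |
|------|------------------|-----------------------|
| **Embedding** | One `embedPassage()` call per entry | Use `embedBatchPassage()` for a single API call |
| **Deduplication** | One lookup call per entry | Already switched to `retrieve()` hybrid search in PR #720 |
| **Store** | One `store()` call per entry | Use `bulkStore()` to acquire the lock once |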

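## Proposed implementation

### Phase 1: Batch collection structure

For each .md file, first collect all of the file's bullet points into an array, then process them as a batch: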

```typescript
// 1. Collect every entry to process (skipping short or already-stored ones)
const pendingEntries: Array<{ text: string; scope: string; filePath: string }> = [];
// Entries are no longer stored before the next dedup check runs, so duplicates
// inside the same file have to be caught here instead of by retrieve().
const seenInBatch = new Set<string>();
for (const line of lines) {
  if (!/^[-*+]\s/.test(line)) continue;
  const text = line.slice(2).trim();
  if (text.length < minTextLength) continue;
  // Dedup check (uses retrieve() as of PR #720); this is still one lookup per entry
  if (dedupEnabled) {
    if (seenInBatch.has(text)) { skipped++; continue; }
    const existing = await ctx.retriever.retrieve({ query: text, limit: 20, scopeFilter: [effectiveScope], source: "cli" });
    if (existing.length > 0 && existing[0].entry.text === text) { skipped++; continue; }
    seenInBatch.add(text);
  }
  pendingEntries.push({ text, scope: effectiveScope, filePath });
}

// Skip the API call and the lock entirely when nothing is left to import
if (pendingEntries.length > 0) {
  // 2. Batch embedding (one API call)
  const vectors = await ctx.embedder!.embedBatchPassage(pendingEntries.map(e => e.text));

  // 3. Assemble the entries and write them with a single bulkStore
  const entries = pendingEntries.map((e, i) => ({
    text: e.text,
    vector: vectors[i],
    importance: importanceDefault,
    category: "other",
    scope: e.scope,
    metadata: JSON.stringify({ importedFrom: e.filePath }),
  }));
  await ctx.store.bulkStore(entries);
  imported += entries.length;
}
```
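Embedding providers often cap the number of inputs per request, so a single call may not always be possible for very large files. The following is a minimal sketch of chunked batching under that assumption. `chunk`, `embedInChunks`, the `EmbedBatch` type, and the `maxBatchSize` default are all hypothetical; `EmbedBatch` only stands in for `embedBatchPassage`, whose exact signature and limits are not confirmed here.

```typescript
// Hypothetical helper: split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Assumed shape of a batch embedder; stands in for ctx.embedder.embedBatchPassage.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Embed all texts chunk by chunk, preserving input order.
async function embedInChunks(
  texts: readonly string[],
  embedBatch: EmbedBatch,
  maxBatchSize = 64, // illustrative default, not a documented provider limit
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const part of chunk(texts, maxBatchSize)) {
    const partVectors = await embedBatch(part);
    // Guard against a provider returning fewer vectors than inputs,
    // which would silently misalign vectors[i] with pendingEntries[i].
    if (partVectors.length !== part.length) {
      throw new Error(`expected ${part.length} vectors, got ${partVectors.length}`);
    }
    vectors.push(...partVectors);
  }
  return vectors;
}
```

With this shape, step 2 would become something like `embedInChunks(pendingEntries.map(e => e.text), texts => ctx.embedder!.embedBatchPassage(texts))`, and the number of API calls becomes the entry count divided by the chunk size, rounded up.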

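### Behavior that must stay compatible

-  mode: still reports each item individually; behavior unchanged
- Error handling: a failure on one entry in a batch must not affect the other entries
- Scope correctness: each entry keeps the scope originally derived for it
- Progress feedback: large imports still emit console.log output so the user can see progress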
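One way to meet the error-handling and progress requirements above is to try the batch write first and fall back to per-entry writes only when it fails. This is a sketch, not the project's implementation. `EntryStore`, `BatchResult`, and `storeWithFallback` are hypothetical names; the interface only assumes that the store exposes `bulkStore` and `store` as used elsewhere in this issue, and the fallback assumes a failed `bulkStore` writes nothing (otherwise the retry could create duplicates).

```typescript
// Assumed minimal shape of the store; stands in for ctx.store.
interface EntryStore<E> {
  bulkStore(entries: E[]): Promise<unknown>;
  store(entry: E): Promise<unknown>;
}

interface BatchResult<E> {
  imported: number;
  failed: Array<{ entry: E; error: unknown }>;
}

// Try the whole batch first; if it fails, retry entry by entry so that
// one bad entry does not block the rest of the file.
async function storeWithFallback<E>(
  store: EntryStore<E>,
  entries: E[],
  log: (msg: string) => void = console.log,
): Promise<BatchResult<E>> {
  const result: BatchResult<E> = { imported: 0, failed: [] };
  if (entries.length === 0) return result;
  try {
    await store.bulkStore(entries);
    result.imported = entries.length;
  } catch (batchError) {
    log(`bulkStore failed (${String(batchError)}); retrying ${entries.length} entries one by one`);
    for (const entry of entries) {
      try {
        await store.store(entry);
        result.imported++;
      } catch (error) {
        // Record the failure and keep going with the remaining entries.
        result.failed.push({ entry, error });
      }
    }
  }
  // Progress line per batch, so large imports still show activity.
  log(`imported ${result.imported}/${entries.length} entries`);
  return result;
}
```

The happy path still costs one lock acquisition; the per-entry cost is paid only for batches that actually fail.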

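## Related changes

- Blocked until [PR #720](https://github.com/CortexReach/memory-lancedb-pro/pull/720) is merged (dedup switches to retrieve)
- The `embedBatchPassage` API already exists and can be used directly
- The `bulkStore` API already exists and can be used directly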

## Expected benefit

| Scenario | Before | After |
|----------|--------|-------|
| A .md file with 100 entries | 100 embed calls + 100 lock acquisitions | 1 embed call + 1 lock acquisition |
| A .md file with 1000 entries | 1000 embed calls + 1000 lock acquisitions | 1 embed call + 1 lock acquisition |
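The figures in the table follow from simple counting, sketched below for one file. `perEntryCalls` and `batchedCalls` are hypothetical helpers, and the optional `maxBatchSize` parameter is an assumption covering the case where the embedding provider limits inputs per request. Dedup lookups are not counted, since they stay at one per entry in the proposal.

```typescript
interface CallCounts {
  embedCalls: number;
  lockAcquisitions: number;
}

// Current behavior: one embed call and one lock acquisition per entry.
function perEntryCalls(entryCount: number): CallCounts {
  return { embedCalls: entryCount, lockAcquisitions: entryCount };
}

// Proposed behavior: one bulkStore per file, and one embed call per chunk
// (a single chunk when no per-request cap applies).
function batchedCalls(entryCount: number, maxBatchSize = Infinity): CallCounts {
  if (entryCount === 0) return { embedCalls: 0, lockAcquisitions: 0 };
  const embedCalls = Number.isFinite(maxBatchSize)
    ? Math.ceil(entryCount / maxBatchSize)
    : 1;
  return { embedCalls, lockAcquisitions: 1 };
}

console.log(perEntryCalls(100)); // { embedCalls: 100, lockAcquisitions: 100 }
console.log(batchedCalls(1000)); // { embedCalls: 1, lockAcquisitions: 1 }
```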

## Linked issues / PRs

- [PR #720](https://github.com/CortexReach/memory-lancedb-pro/pull/720): dedup uses retrieve() hybrid search (must be merged first)
- [Issue #629](https://github.com/CortexReach/memory-lancedb-pro/issues/629): Ollama batch embedding bug fix (closed; the related API has been fixed)
