Commit ec2d1b0

unamedkr and claude committed
release: v0.4.2 — performance optimization documentation + debug tooling
v0.4 series summary (29% → 75.2% Combined QA):

Retrieval precision:
- BM25 min-max score normalization (+36.7%p vs 1.0 cap)
- Document coherence boost (same-doc chunks get +5%/extra)
- Reranker score blending (0.7 reranker + 0.3 fusion signal)
- Full ingest mode with HyPE enabled (+9.5%p)

Generation quality:
- Citation mapping handles [Source N] format + range validation
- Sentence-boundary-aware context truncation (Korean + English)
- Finance metric cross-verification in fact_verifier

Engine optimization:
- Adaptive post-correction time budget (80s total target)
- Query deadline gate (70s) skips expensive late-stage steps
- Auto-skip correction for simple confident queries
- Sub-query cap (3→2) to reduce parallel retrieval cost

Playground:
- Pipeline trace visualization with Retrieve/Generate/Other breakdown
- Source excerpts visible by default with document titles
- Code block rendering fix (placeholder-based extraction)
- Query options panel (top_k, rerank, trace, stream toggles)
- `quantumrag demo` command for instant one-line experience

Infrastructure:
- `serve` auto-detects quantumrag.yaml in current directory
- `from_yaml()` loads .env for API keys
- Server startup prints provider/model/embedding info
- FAISS upsert stale reference bug fix (850 tests, 0 failures)

Optimization lessons documented in CLAUDE.md:
- Fusion weight tuning exhausted (4 attempts, all negative)
- Current 40/35/25 weights are near-optimal
- Next breakthrough requires embedding model change or noise reduction

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
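For readers unfamiliar with two of the retrieval changes listed in the message, here is a minimal sketch of min-max score normalization and the 0.7/0.3 reranker blend. It is not the project's code: the function names `min_max_normalize` and `blend`, the handling of constant score lists, and the sample scores are all assumptions made for illustration.

```python
from typing import Sequence


def min_max_normalize(scores: Sequence[float]) -> list[float]:
    """Rescale raw scores (e.g. BM25) into [0, 1] relative to the current result set.

    Assumption: when every score is identical, each one maps to 1.0.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


def blend(reranker_score: float, fusion_score: float,
          reranker_weight: float = 0.7) -> float:
    """Weighted blend of the reranker score and the fusion signal (0.7 / 0.3 by default)."""
    return reranker_weight * reranker_score + (1.0 - reranker_weight) * fusion_score


# Hypothetical raw BM25 scores for three chunks.
bm25_raw = [12.4, 7.1, 3.0]
bm25_norm = min_max_normalize(bm25_raw)

# Capping at 1.0 would flatten all three to 1.0; min-max keeps their relative order.
capped = [min(s, 1.0) for s in bm25_raw]
print(bm25_norm, capped)
```

Unlike a hard cap, the min-max form keeps the spread between strong and weak lexical matches, which is one plausible reason the message reports a gain over the 1.0 cap.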
1 parent 9d42928 commit ec2d1b0

4 files changed

Lines changed: 13 additions & 5 deletions

CLAUDE.md

Lines changed: 6 additions & 2 deletions
@@ -44,11 +44,15 @@ Index-Heavy, Query-Light RAG engine. Python 3.10+, Apache 2.0.
 ### Current performance status
 - **Individual QA** (4 datasets, 105 questions): 77-100% pass rate → all graduated
 - **Combined QA** (73 sources + 50 noise, 436 chunks): **75% pass rate** (full mode), 2 timeouts, 30 s avg
-- **Improvement history**: 29% → 65% → **75%** (BM25 normalization + coherence boost + reranker blending + HyPE)
+- **Improvement history**: 29% → 65% → **75%** (6 measure-and-improve loops)
 - **Remaining failures**: 26 total (23 retrieval FAIL, 2 timeout, 1 generation FAIL)
-- **Ceiling analysis**: fusion weight tuning is exhausted. The next breakthrough requires an embedding model change or noise reduction
 - **Default LLM**: gemini-3.1-flash-lite-preview (free tier, cost-efficient)

+### Performance optimization lessons (verified)
+- **Effective**: BM25 min-max normalization (+36.7%p), Document Coherence Boost, reranker blending (0.7/0.3), full-ingest HyPE (+9.5%p)
+- **Ineffective (do not retry)**: fusion weight tuning (worse in all 4 attempts), dictionary expansion (-5%p), timeout optimization (0%p), query classifier change (-2%p)
+- **Ceiling analysis**: the current weights (40/35/25) are the optimum. The next breakthrough requires an embedding model change or noise reduction
+
 ## Key file locations
 - Engine entry point: `quantumrag/core/engine.py`
 - RAG config: `quantumrag/core/config.py`
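The 40/35/25 fusion weights and the coherence boost recorded in CLAUDE.md can be illustrated with a short sketch. The source does not say which retrieval signal receives which weight, nor exactly how the "+5% per extra chunk" boost is applied, so the signal order, the multiplicative form of the boost, and the names `fuse` and `coherence_boost` are assumptions, not the engine's implementation.

```python
from collections import Counter

# Weights from CLAUDE.md; which signal each weight belongs to is an assumption.
FUSION_WEIGHTS = (0.40, 0.35, 0.25)


def fuse(signals: tuple[float, float, float],
         weights: tuple[float, float, float] = FUSION_WEIGHTS) -> float:
    """Weighted sum of three normalized retrieval signals for one chunk."""
    return sum(w * s for w, s in zip(weights, signals))


def coherence_boost(results: list[tuple[str, float]],
                    per_extra: float = 0.05) -> list[tuple[str, float]]:
    """Boost each (doc_id, score) pair by 5% per additional chunk from the same document.

    Assumption: the boost is multiplicative and counts chunks within this result list.
    """
    counts = Counter(doc_id for doc_id, _ in results)
    return [
        (doc_id, score * (1.0 + per_extra * (counts[doc_id] - 1)))
        for doc_id, score in results
    ]


# Hypothetical example: two chunks from doc "a", one from doc "b".
fused = [("a", fuse((0.9, 0.6, 0.3))), ("a", fuse((0.5, 0.5, 0.5))), ("b", fuse((1.0, 0.0, 0.0)))]
print(coherence_boost(fused))
```

With fixed weights that sum to 1, retuning only shifts mass between the three signals, which fits the note that four tuning attempts all made results worse.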

datasets/debug_retrieval.py

Lines changed: 5 additions & 1 deletion
@@ -33,7 +33,11 @@


 async def debug_query(query: str, data_dir: str) -> None:
-    cfg = QuantumRAGConfig.auto(storage={"data_dir": data_dir})
+    cfg = QuantumRAGConfig.default(storage={"data_dir": data_dir})
+    # Match Combined QA runner: use local embeddings (1024d)
+    cfg.models.embedding.provider = "local"
+    cfg.models.embedding.model = "BAAI/bge-m3"
+    cfg.models.embedding.dimensions = 1024
     engine = Engine(config=cfg)
     engine._ensure_initialized()
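The change above pins the debug script to the same embedding settings as the Combined QA runner. One likely reason is that query vectors must have the same dimensionality as the vectors already in the index. A minimal sketch of such a guard follows; `check_embedding_dimensions` is a hypothetical helper, not part of quantumrag.

```python
def check_embedding_dimensions(query_vector: list[float], index_dimensions: int) -> None:
    """Raise early if a query embedding cannot be compared against the stored vectors."""
    if len(query_vector) != index_dimensions:
        raise ValueError(
            f"query embedding has {len(query_vector)} dimensions, "
            f"but the index was built with {index_dimensions}"
        )


# A 1024-d vector (the size set in the diff for BAAI/bge-m3) passes silently.
check_embedding_dimensions([0.0] * 1024, 1024)

# A vector from a different embedding model is rejected before any search runs.
try:
    check_embedding_dimensions([0.0] * 1536, 1024)
except ValueError as exc:
    print(f"rejected: {exc}")
```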

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

 [project]
 name = "quantumrag"
-version = "0.4.1"
+version = "0.4.2"
 description = "Index-Heavy, Query-Light RAG Engine — Put in docs, ask questions, it just works."
 readme = "README.md"
 license = "Apache-2.0"

quantumrag/_version.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.4.1"
+__version__ = "0.4.2"
