
feat(langgraph): optional LangGraph search pipeline (Sprint 1+2) #2

Merged
svalench merged 1 commit into main from feat/langgraph-search-pipeline
May 8, 2026


@svalench svalench commented May 8, 2026

Summary

Combines Sprint 1 (LangGraph foundation + feature flag) and Sprint 2 (query expansion + reranking + LLM abstraction) from the planned roadmap into a single, reviewable change.

The new orchestration layer is fully opt-in — defaults keep the legacy linear path so existing deployments are byte-for-byte unchanged.

What's new

LANGGRAPH settings section

GRAPH_SEARCH = {
    "LANGGRAPH": {
        "ENABLED": False,            # master switch
        "QUERY_EXPANSION": False,
        "RERANKING": False,
        "MAX_EXPANDED_QUERIES": 3,
        "RERANK_TOP_K": 20,
        "TIMEOUT_SECONDS": 15,
        "MAX_QUERY_LENGTH": 1024,
        "FALLBACK_ON_ERROR": True,
        "USE_FOR_SIMILAR": False,
        "LLM": {"BACKEND": None, "MODEL": None, "OPTIONS": {}},
    },
}
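As an illustration of how a project's overrides might be layered over these defaults, here is a minimal sketch. The helper name resolve_langgraph_settings and the specific validation rules (unknown keys rejected, numeric limits must be positive) are assumptions made for the example, not the package's confirmed behaviour.

```python
from copy import deepcopy

# Defaults copied from the LANGGRAPH section above.
DEFAULTS = {
    "ENABLED": False,
    "QUERY_EXPANSION": False,
    "RERANKING": False,
    "MAX_EXPANDED_QUERIES": 3,
    "RERANK_TOP_K": 20,
    "TIMEOUT_SECONDS": 15,
    "MAX_QUERY_LENGTH": 1024,
    "FALLBACK_ON_ERROR": True,
    "USE_FOR_SIMILAR": False,
    "LLM": {"BACKEND": None, "MODEL": None, "OPTIONS": {}},
}

POSITIVE_NUMBER_KEYS = (
    "MAX_EXPANDED_QUERIES",
    "RERANK_TOP_K",
    "TIMEOUT_SECONDS",
    "MAX_QUERY_LENGTH",
)


def resolve_langgraph_settings(graph_search):
    """Hypothetical helper: merge GRAPH_SEARCH["LANGGRAPH"] over the defaults."""
    overrides = graph_search.get("LANGGRAPH", {})
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown LANGGRAPH keys: {sorted(unknown)}")

    resolved = deepcopy(DEFAULTS)
    # The nested LLM dict is merged key by key so a partial override keeps the rest.
    llm_overrides = overrides.get("LLM", {})
    resolved.update({k: v for k, v in overrides.items() if k != "LLM"})
    resolved["LLM"].update(llm_overrides)

    for key in POSITIVE_NUMBER_KEYS:
        value = resolved[key]
        # bool is a subclass of int, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)) or value <= 0:
            raise ValueError(f"LANGGRAPH[{key!r}] must be a positive number")
    return resolved
```

With this sketch, a project that sets only {"LANGGRAPH": {"ENABLED": True, "RERANKING": True}} keeps every other default, including the empty LLM block.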

New modules

  • langgraph_agent.py — TypedDict SearchState, 5 nodes (analyze_query, expand_query, vector_search, rerank_results, postprocess_results) and a build_search_graph factory.
  • llm/ subpackage — BaseLLMBackend contract, DummyLLMBackend (deterministic, dependency-free) and a factory. Bring your own backend via dotted path.
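To make the bring-your-own-backend pattern concrete, here is a rough sketch of how a dotted path in LLM["BACKEND"] could be resolved. The class names come from this PR, but the complete method, the constructor signature and the load_backend helper are assumptions for illustration; the real contract in llm/ may differ.

```python
from abc import ABC, abstractmethod
from importlib import import_module


class BaseLLMBackend(ABC):
    """Assumed shape of the contract; the real method names may differ."""

    def __init__(self, model=None, **options):
        self.model = model
        self.options = options

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's text response for a prompt."""


class DummyLLMBackend(BaseLLMBackend):
    """Deterministic stand-in that needs no network or extra packages."""

    def complete(self, prompt: str) -> str:
        return prompt.strip().lower()


def load_backend(llm_settings: dict) -> BaseLLMBackend:
    """Hypothetical factory: turn the LLM settings block into a backend instance."""
    dotted_path = llm_settings.get("BACKEND")
    if dotted_path is None:
        # No backend configured: use the dependency-free dummy.
        return DummyLLMBackend()

    # "myproject.llm.MyBackend" -> module "myproject.llm", class "MyBackend".
    module_path, _, class_name = dotted_path.rpartition(".")
    backend_cls = getattr(import_module(module_path), class_name)
    if not issubclass(backend_cls, BaseLLMBackend):
        raise TypeError(f"{dotted_path} is not a BaseLLMBackend subclass")
    options = llm_settings.get("OPTIONS") or {}
    return backend_cls(model=llm_settings.get("MODEL"), **options)
```

Under these assumptions, a project would subclass BaseLLMBackend in its own code and point LLM["BACKEND"] at that class, which keeps provider SDKs out of the package itself.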

Pipeline

analyze_query → [expand_query] → vector_search → [rerank] → postprocess
  • Conditional edges driven by settings — disabled steps are skipped.
  • Multi-query merge with per-doc score consolidation and dedup.
  • Graceful fallback: any LLM error degrades to the deterministic vector path.
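The multi-query merge can be pictured roughly as follows. This is a sketch under stated assumptions: hits are (doc_id, score) pairs, higher scores are better, and a document found by several queries keeps its best score. The PR does not say which consolidation rule is used, so the max rule and the merge_hits name are illustrative only.

```python
def merge_hits(hit_lists, limit=None):
    """Merge per-query hit lists into one ranked, deduplicated list.

    hit_lists: one list of (doc_id, score) pairs per expanded query.
    """
    best = {}
    for hits in hit_lists:
        for doc_id, score in hits:
            # Dedup by doc_id, keeping the highest score seen so far.
            if doc_id not in best or score > best[doc_id]:
                best[doc_id] = score

    # Highest score first; ties broken by doc_id so the order is deterministic.
    ranked = sorted(best.items(), key=lambda item: (-item[1], str(item[0])))
    return ranked if limit is None else ranked[:limit]
```

For example, merging [("a", 0.9), ("b", 0.5)] with [("b", 0.7), ("c", 0.6)] yields a, b, c, with b carrying the 0.7 score from the second query.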

LangGraph dependency is optional

If langgraph is not installed, an in-tree sequential runner with the same node structure is used instead. Behaviour is identical either way and the same tests apply. The new langgraph extra installs the langgraph package.
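A sequential runner of this kind might look like the sketch below. It only probes for the langgraph package and shows the fallback path; the wiring of a real StateGraph is omitted. The run_sequential helper, the toy node bodies and the state keys are assumptions for illustration. Nodes return partial state updates that are merged into the state, which is the same convention LangGraph nodes follow.

```python
from importlib.util import find_spec

# True when the optional dependency is importable; no langgraph API is called here.
HAS_LANGGRAPH = find_spec("langgraph") is not None


def run_sequential(state, steps):
    """Run (node, is_enabled) pairs in order, skipping disabled nodes."""
    for node, is_enabled in steps:
        if is_enabled:
            # Each node returns only the keys it changed.
            state = {**state, **node(state)}
    return state


# Toy nodes standing in for the real ones.
def analyze_query(state):
    return {"query": state["query"].strip()[:1024]}


def expand_query(state):
    return {"queries": [state["query"], state["query"] + " overview"]}


def vector_search(state):
    # Falls back to the single original query when expansion was skipped.
    return {"searched": state.get("queries") or [state["query"]]}


def run_pipeline(query, query_expansion=False):
    steps = [
        (analyze_query, True),
        (expand_query, query_expansion),  # optional step, driven by settings
        (vector_search, True),
    ]
    return run_sequential({"query": query}, steps)
```

With query_expansion=False the expand_query node never runs and vector_search sees only the cleaned original query, which mirrors the bracketed optional steps in the pipeline diagram.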

Backwards compatibility

  • Searcher.search / Searcher.find_similar signatures unchanged.
  • All 8 pre-existing tests pass unchanged.
  • With LANGGRAPH.ENABLED = False (default), the linear path is executed verbatim.
  • FALLBACK_ON_ERROR=True (default) means that even with the graph enabled, any internal failure transparently falls back to the legacy linear path.
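The interaction between ENABLED and FALLBACK_ON_ERROR can be sketched as a small dispatcher. The function name and the two callables are hypothetical; the actual Searcher internals are not shown in this PR description.

```python
import logging

logger = logging.getLogger(__name__)


def search_with_fallback(query, graph_search, linear_search, *,
                         enabled=False, fallback_on_error=True):
    """Route a query through the graph when enabled, else the linear path."""
    if not enabled:
        # Default configuration: the legacy path runs and the graph is never touched.
        return linear_search(query)
    try:
        return graph_search(query)
    except Exception:
        if not fallback_on_error:
            raise
        # Log the failure, then answer from the legacy path.
        logger.exception("Graph search failed; falling back to the linear path")
        return linear_search(query)
```

Setting fallback_on_error=False in this sketch surfaces graph failures to the caller, which can be useful in test or staging environments.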

Tests

tests/test_langgraph_search.py — 13 new tests covering:

  • Settings validation (defaults, illegal values).
  • Each node in isolation (truncation, expansion fallback, multi-query merge, model filter, rerank top-K + tail preservation, limit application).
  • Searcher integration (disabled vs enabled returns same shape, multi-query expansion merges hits, factory error triggers fallback).
21 passed in 0.25s

Review notes

  • No hard dependency added to install_requires.
  • pytest-django added to the test extra (the tests already relied on it implicitly).
  • README has a new section documenting configuration and the BYO LLM backend pattern.

What this PR intentionally does not do

  • No conversational endpoint — that's Sprint 3 (next PR).
  • No smart indexing or streaming — that's Sprint 4 (PR after that).
  • No specific LLM provider implementations — those should live in user code or future opt-in extras.

…ion and reranking

Adds an opt-in orchestration layer (Sprint 1+2 combined) on top of the
existing components without changing the public API.

Highlights:
* New LANGGRAPH settings section with safe defaults (disabled).
* New langgraph_agent module with TypedDict state and 5 nodes:
  analyze_query, expand_query, vector_search, rerank, postprocess.
* In-tree fallback runner so the package works without the langgraph
  package installed; uses langgraph.StateGraph when available.
* New llm/ subpackage with BaseLLMBackend contract, DummyLLMBackend
  (deterministic, dependency-free) and a factory.
* Multi-query merge with per-doc score consolidation and dedup.
* Searcher.search and find_similar are unchanged in signature; when
  the graph fails the searcher falls back to the legacy linear path
  (FALLBACK_ON_ERROR=True by default).
* 13 new tests (settings validation, individual nodes, end-to-end
  searcher behaviour with and without LangGraph).
* README section documenting how to enable and configure the pipeline.

Backwards compatibility:
* All 8 pre-existing tests still pass unchanged.
* When LANGGRAPH.ENABLED is False (default), Searcher.search executes
  the original linear path byte-for-byte.
@svalench svalench merged commit 5e7d8ef into main May 8, 2026
0 of 3 checks passed