Query

Query Modes

GsRag supports six query modes:

Mode	Keywords	Vector Search	Graph Traversal	Chunk Retrieval	Use Case
`local`	low-level	entity vectors	Entity neighbors	No	Entity-specific questions
`global`	high-level	relation vectors	From relations	No	Broad/topic questions
`hybrid`	both	entity + relation	Both paths	No	Balanced local+global
`naive`	—	chunk vectors	No	Yes	Simple text retrieval
`mix`	both	entity + relation + chunk	Both paths	Yes	All retrieval paths combined
`bypass`	—	None	No	No	Direct LLM call, no RAG
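
The table can also be read as a lookup from mode to retrieval paths. The sketch below restates it in TypeScript; `RetrievalPlan`, `MODE_PLANS`, and `usesRetrieval` are illustrative names chosen for this example and are not part of the GsRag API.

```typescript
// Illustrative restatement of the mode table; these names are NOT GsRag exports.
type QueryMode = "local" | "global" | "hybrid" | "naive" | "mix" | "bypass";

interface RetrievalPlan {
  entityVectors: boolean;   // search entity vectors
  relationVectors: boolean; // search relation vectors
  chunkVectors: boolean;    // search chunk vectors and return chunks
  graphTraversal: boolean;  // walk the knowledge graph
}

const MODE_PLANS: Record<QueryMode, RetrievalPlan> = {
  local:  { entityVectors: true,  relationVectors: false, chunkVectors: false, graphTraversal: true },
  global: { entityVectors: false, relationVectors: true,  chunkVectors: false, graphTraversal: true },
  hybrid: { entityVectors: true,  relationVectors: true,  chunkVectors: false, graphTraversal: true },
  naive:  { entityVectors: false, relationVectors: false, chunkVectors: true,  graphTraversal: false },
  mix:    { entityVectors: true,  relationVectors: true,  chunkVectors: true,  graphTraversal: true },
  bypass: { entityVectors: false, relationVectors: false, chunkVectors: false, graphTraversal: false },
};

// True when the mode performs any vector search before the LLM call.
function usesRetrieval(mode: QueryMode): boolean {
  const plan = MODE_PLANS[mode];
  return plan.entityVectors || plan.relationVectors || plan.chunkVectors;
}
```

A lookup like this makes it easy to check, for example, that only `naive` and `mix` return raw chunks.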

Query Methods

GsRag offers three query methods, each exposing progressively more of the pipeline's data:

// Level 1 — Standard: best for most use cases
const result: QueryResult = await gsrag.query("What is...", { mode: "hybrid" });
console.log(result.content); // LLM-generated answer

// Level 2 — Raw data: retrieval context only, no LLM generation
const data = await gsrag.queryData("What is...", { mode: "hybrid" });
// Inspect entities, relations, chunks returned

// Level 3 — Full pipeline: includes llmResponse
const raw = await gsrag.queryLlm("What is...", { mode: "hybrid" });
console.log(raw.llmResponse?.content); // Full LLM payload

Two Call Signatures

// Object style (recommended)
await gsrag.query({ query: "...", mode: "hybrid", topK: 20 });

// String + param style
await gsrag.query("...", new QueryParam({ mode: "hybrid", topK: 20 }));
await gsrag.query("...", { mode: "hybrid", enableRerank: false });
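
Both call styles carry the same information. As a rough illustration of how a wrapper might reduce them to a single shape, consider the sketch below. `QueryOptions`, `ObjectStyle`, and `normalizeQueryArgs` are hypothetical names, and the sketch does not describe how GsRag handles the overloads internally.

```typescript
// Hypothetical option shape covering only fields shown in this document.
interface QueryOptions {
  mode?: string;
  topK?: number;
  enableRerank?: boolean;
}

type ObjectStyle = QueryOptions & { query: string };

// Accepts either call style and returns one object-style request.
function normalizeQueryArgs(
  first: string | ObjectStyle,
  options: QueryOptions = {},
): ObjectStyle {
  if (typeof first === "string") {
    // String + param style: fold the query text into the options object.
    return { ...options, query: first };
  }
  // Object style: already in the target shape, so return a shallow copy.
  return { ...first };
}
```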

QueryParam

import { QueryParam } from "@gsrag/core";

const param = new QueryParam({
  mode: "local",
  topK: 30,
  chunkTopK: 15,
  maxEntityTokens: 4000,
  maxRelationTokens: 5000,
  maxTotalTokens: 20000,
  enableRerank: false,
  stream: false,
  onlyNeedContext: false,
  onlyNeedPrompt: false,
  conversationHistory: [
    { role: "user", content: "Previous question" },
    { role: "assistant", content: "Previous answer" },
  ],
  historyTurns: 2,
  userPrompt: "Focus on technical details.",
  hlKeywords: ["AI", "machine learning"],
  llKeywords: ["transformer", "attention"],
  includeReferences: true,
});

See Configuration for all default values and environment-variable overrides.
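
To make the interaction between `conversationHistory` and `historyTurns` concrete, here is a minimal sketch. It assumes a "turn" means one user message plus one assistant reply. That is an assumption for illustration, not confirmed GsRag behavior, and `ChatMessage` and `lastTurns` are hypothetical names.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Keeps the last `turns` user/assistant pairs (two messages per turn).
// ASSUMPTION: a turn is one user message followed by one assistant reply.
function lastTurns(history: ChatMessage[], turns: number): ChatMessage[] {
  // Guard needed because slice(-0) would return the whole array.
  if (turns <= 0) return [];
  return history.slice(-turns * 2);
}
```

Under that assumption, `historyTurns: 2` would keep the four most recent messages.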

Streaming

const result = await gsrag.query("Tell me a story...", { stream: true });

if (result.isStreaming && result.responseIterator) {
  for await (const chunk of result.responseIterator) {
    process.stdout.write(chunk);
  }
}
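
When the caller wants the complete text regardless of whether streaming was enabled, a small helper can cover both cases. The sketch below relies only on the `content`, `responseIterator`, and `isStreaming` fields listed in this document. `StreamableResult` and `collectAnswer` are illustrative names, not GsRag exports.

```typescript
// Structural subset of QueryResult, limited to fields this document lists.
interface StreamableResult {
  content?: string;
  responseIterator?: AsyncIterable<string>;
  isStreaming: boolean;
}

// Returns the full answer text whether or not the result is streaming.
async function collectAnswer(result: StreamableResult): Promise<string> {
  if (result.isStreaming && result.responseIterator) {
    let text = "";
    for await (const chunk of result.responseIterator) {
      text += chunk; // append chunks in arrival order
    }
    return text;
  }
  // Non-streaming result: fall back to the generated content, if any.
  return result.content ?? "";
}
```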

QueryResult

class QueryResult {
  content?: string;                           // Generated answer text
  responseIterator?: AsyncIterable<string>;    // Streaming chunks
  rawData?: QueryRawData;                      // Structured retrieval data
  isStreaming: boolean;

  // Convenience properties:
  get referenceList(): Array<{ referenceId: string; filePath: string }>;
  get metadata(): Record<string, unknown>;
  get status(): "success" | "failure" | undefined;
  get data(): Record<string, unknown> | undefined;  // entities, relations, chunks
  get llmResponse(): Record<string, unknown> | undefined;
}
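
As a usage illustration, the convenience properties can be combined to render an answer with its sources. The sketch works against a structural subset of `QueryResult`. `ResultView` and `renderAnswer` are hypothetical names introduced here.

```typescript
interface Reference {
  referenceId: string;
  filePath: string;
}

// Structural subset of QueryResult used by this example.
interface ResultView {
  status: "success" | "failure" | undefined;
  content?: string;
  referenceList: Reference[];
}

// Formats the answer text followed by a numbered source list, if any.
function renderAnswer(result: ResultView): string {
  if (result.status === "failure") return "Query failed.";
  const body = result.content ?? "";
  const refs = result.referenceList.map((r) => `[${r.referenceId}] ${r.filePath}`);
  return refs.length > 0 ? `${body}\n\nReferences:\n${refs.join("\n")}` : body;
}
```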

Query Flow

User Query
    │
    ▼
Keyword Extraction (LLM or fallback)
    │
    ├── local  ──► Entity vector search  ──► Graph neighbor traversal
    ├── global ──► Relation vector search ──► Entity lookup
    ├── hybrid ──► Both paths (round-robin merge)
    ├── naive  ──► Chunk vector search
    ├── mix    ──► Both paths + chunk search
    └── bypass ──► Skip retrieval
    │
    ▼
Context Assembly → Reranking (optional) → Token Truncation
    │
    ▼
LLM Generation → QueryResult
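
The round-robin merge used by `hybrid` can be pictured as interleaving two ranked lists. The following is a generic sketch of that technique, not GsRag's actual implementation. The function name `roundRobinMerge` and the de-duplication by key are assumptions added for illustration.

```typescript
// Interleaves items from several ranked lists, skipping duplicates by key.
// ASSUMPTION: de-duplication is added for illustration only.
function roundRobinMerge<T>(lists: T[][], keyOf: (item: T) => string): T[] {
  const merged: T[] = [];
  const seen = new Set<string>();
  const longest = Math.max(0, ...lists.map((list) => list.length));
  for (let i = 0; i < longest; i++) {
    // Take the i-th item from each list in turn.
    for (const list of lists) {
      if (i >= list.length) continue;
      const item = list[i];
      const key = keyOf(item);
      if (!seen.has(key)) {
        seen.add(key);
        merged.push(item);
      }
    }
  }
  return merged;
}
```

Interleaving keeps the top-ranked items from each path near the front of the merged list, so later token truncation does not discard one path entirely.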

Cache

Query responses are cached when `enableLlmCache` is enabled. Cache keys are computed from the query text, mode, and parameters, so a repeated query with identical settings is served from the cache without calling the LLM.
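
One way to picture a key derived from query text, mode, and parameters is a canonical serialization in which parameter order does not matter. The sketch below is an assumption about the general technique, not GsRag's actual key format. `ParamValue` and `cacheKey` are hypothetical names, and a real implementation would likely hash the resulting string.

```typescript
type ParamValue = string | number | boolean | string[] | undefined;

// Builds a deterministic key: parameters are sorted by name so that
// property order does not change the result. Undefined values are dropped.
function cacheKey(
  query: string,
  mode: string,
  params: Record<string, ParamValue>,
): string {
  const sorted = Object.keys(params)
    .filter((name) => params[name] !== undefined)
    .sort()
    .map((name) => [name, params[name]]);
  return JSON.stringify([mode, query, sorted]);
}
```

Two calls that differ only in the order of their options produce the same key, while changing the mode or any parameter value produces a different one.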
