Minimal, elegant RAG framework for TypeScript
Zero dependencies · Type-safe · Production-grade · Shield included
Most RAG frameworks are heavyweight, opinionated, and leave security as an afterthought. rag-core is different:
| Principle | What it means |
|---|---|
| 🪶 Zero Dependencies | Pure TypeScript. Uses native fetch. No bloated dependency tree. |
| 🔒 Shield Built-in | Prompt injection detection out of the box. Production-grade from day one. |
| 🧩 Modular & Swappable | Every component — embedder, store, ranker — implements a clean interface. Swap OpenAI for Cohere in one line. |
| 🎯 Deep Type Safety | Generic metadata flows through the entire pipeline. Your IDE knows the shape of your data everywhere. |
| ⚡ 10-Line Quickstart | From install to working RAG pipeline in under 10 lines of code. |
```bash
npm install rag-core
```

```ts
import { RagCore, OpenAIEmbedder, MemoryStore } from 'rag-core';

const rag = new RagCore({
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorStore: new MemoryStore(),
  shield: { detectInjection: true },
});

// Ingest a document — chainable, readable, typed
await rag
  .ingest({ content: 'TypeScript is a typed superset of JavaScript...' })
  .split({ chunkSize: 500 })
  .store();

// Query with automatic shield + vector search
const results = await rag.query('What is TypeScript?', { topK: 5 });
```

That's it: under ten lines of code to a working RAG pipeline with prompt injection protection.
```text
Document → [ Ingestor ] → Chunks → [ Embedder ] → Vectors → [ Store ]
                                                                ↓
Query → [ Shield ] → [ Embedder ] → [ Store.search ] → [ Ranker ] → Results
```
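The ordering in the query path matters: the shield gates the raw query before anything is embedded or searched. A toy sketch of that flow, with stub stages (none of these are the real rag-core classes, and the stub regex and embedding are purely illustrative):

```ts
// Toy query path mirroring the diagram: shield → embed → search → rank.
type Result = { text: string; score: number };

// Stub shield: a single illustrative pattern, not a real detector.
const shield = (q: string): boolean => !/ignore (all )?previous/i.test(q);

// Stub embedder: real pipelines call an embedding API here.
const embedQuery = (q: string): number[] => [q.length, q.split(' ').length];

// Stub store lookup returning canned results.
const search = (_vector: number[], topK: number): Result[] =>
  [{ text: 'doc chunk', score: 0.9 }].slice(0, topK);

// Ranker: sort by score descending.
const rank = (results: Result[]): Result[] =>
  [...results].sort((a, b) => b.score - a.score);

const query = (q: string, topK = 5): Result[] => {
  if (!shield(q)) throw new Error('prompt injection detected'); // gate first
  return rank(search(embedQuery(q), topK));
};
```

The key design point the diagram encodes: a malicious query is rejected before it costs an embedding call or touches the store.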
| Module | Class | Purpose |
|---|---|---|
| Ingestor | `RecursiveCharacterSplitter` | Smart text chunking with a recursive separator hierarchy |
| Embedder | `OpenAIEmbedder`, `CohereEmbedder` | Map text to vectors via any embedding API |
| Store | `MemoryStore` | Vector storage with cosine similarity search |
| Ranker | `CohereRanker` | Re-rank results with cross-encoder models |
| Shield | `InjectionDetector` | Detect prompt injection attacks before they reach your LLM |
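The Store's cosine-similarity search is plain math and easy to sketch. The following is an illustrative stand-alone implementation of the idea (not the library's source): score every stored vector against the query, sort descending, take the top K.

```ts
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for nonzero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Top-K search: score everything, sort by score descending, keep K results.
function topK(
  query: number[],
  stored: { id: string; vector: number[] }[],
  k: number,
): { id: string; score: number }[] {
  return stored
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

This linear scan is exact but O(n) per query, which is why an in-memory store suits prototyping while production-scale corpora use an approximate-nearest-neighbor backend.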
`RagCore` is the main orchestrator; all operations flow through this class.
```ts
const rag = new RagCore<MyMetadata>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMetadata>(),
  ranker: new CohereRanker({ apiKey: '...' }),       // optional
  shield: { detectInjection: true, threshold: 0.7 }, // optional
});
```

Start a chainable ingestion pipeline:
```ts
await rag
  .ingest({ id: 'doc-1', content: '...', metadata: { source: 'web' } })
  .split({ chunkSize: 500, chunkOverlap: 50 })
  .store();
```

Batch-ingest multiple documents:
```ts
await rag.ingestMany([doc1, doc2, doc3], { chunkSize: 500 });
```

Query the pipeline with automatic shield → embed → search → rerank:
```ts
const results = await rag.query('How does X work?', {
  topK: 5,      // number of results (default: 5)
  rerank: true, // enable re-ranking (default: false)
  shield: true, // enable injection check (default: true)
});
```

Manually check any text for prompt injection:
```ts
const check = rag.shield(userInput);
if (!check.safe) {
  console.warn(`Blocked! Threats: ${check.threats.join(', ')}`);
}
```

```ts
new OpenAIEmbedder({
  apiKey: 'sk-...',
  model: 'text-embedding-3-small',      // default
  baseUrl: 'https://api.openai.com/v1', // default
});
```

```ts
new CohereEmbedder({
  apiKey: '...',
  model: 'embed-english-v3.0', // default
});
```

Implement the `Embedder` interface to use any provider:
```ts
import type { Embedder } from 'rag-core';

class MyEmbedder implements Embedder {
  async embed(texts: string[]): Promise<number[][]> { /* ... */ }
  async embedQuery(text: string): Promise<number[]> { /* ... */ }
}
```

`MemoryStore` is an in-memory store with pure cosine similarity search, great for prototyping and small datasets.
```ts
const store = new MemoryStore<MyMeta>();
store.size;    // number of stored chunks
store.clear(); // remove all chunks
```

Implement the `VectorStore` interface for Pinecone, Weaviate, Qdrant, etc.:
```ts
import type { VectorStore, EmbeddedChunk, SearchResult } from 'rag-core';

class PineconeStore<TMeta> implements VectorStore<TMeta> {
  async upsert(chunks: EmbeddedChunk<TMeta>[]): Promise<void> { /* ... */ }
  async search(query: number[], topK: number): Promise<SearchResult<TMeta>[]> { /* ... */ }
}
```

The Shield is rag-core's unique selling point: most frameworks ignore prompt security entirely.
```ts
import { InjectionDetector, sanitize } from 'rag-core';

const detector = new InjectionDetector(0.7); // threshold
const result = detector.analyze('Ignore all previous instructions...');
// { safe: false, score: 0.9, threats: ['role-override:ignore-previous'] }

const clean = sanitize(rawInput); // strips control chars, normalizes unicode
```

Detected threat categories:
- 🛡️ Role overrides (`ignore previous`, `you are now`, `pretend to be`)
- 🔓 Delimiter injection (`<|im_start|>`, `[INST]`, `<<SYS>>`)
- 📤 Data exfiltration (`show me your prompt`, `repeat the context`)
- 🎭 Jailbreaks (`DAN mode`, `developer mode`, `god mode`)
- 🔀 Obfuscation (`base64`, `eval()`, encoding tricks)
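A category-based detector like this can be approximated with a map from threat category to pattern. A minimal sketch (the patterns and category names below are illustrative, not the library's actual rule set):

```ts
// Hypothetical category → pattern map; a real detector uses a larger curated set.
const THREAT_PATTERNS: Record<string, RegExp> = {
  'role-override':       /ignore (all )?previous|you are now|pretend to be/i,
  'delimiter-injection': /<\|im_start\|>|\[INST\]|<<SYS>>/i,
  'data-exfiltration':   /show me your prompt|repeat the context/i,
  'jailbreak':           /\b(DAN|developer|god) mode\b/i,
};

// Return every category whose pattern matches; safe means no matches at all.
function detectThreats(text: string): { safe: boolean; threats: string[] } {
  const threats = Object.entries(THREAT_PATTERNS)
    .filter(([, pattern]) => pattern.test(text))
    .map(([category]) => category);
  return { safe: threats.length === 0, threats };
}
```

Reporting the matched category names, rather than a bare boolean, is what lets callers log or handle each threat class differently.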
```ts
import { RecursiveCharacterSplitter } from 'rag-core';

const splitter = new RecursiveCharacterSplitter({
  chunkSize: 500,   // max chars per chunk (default: 500)
  chunkOverlap: 50, // overlap between chunks (default: 50)
  separators: ['\n\n', '\n', '. ', ' ', ''], // custom hierarchy
});

const chunks = splitter.split({ id: 'doc-1', content: longText });
```

rag-core uses TypeScript generics so your metadata type flows through the entire pipeline:
```ts
interface MyMeta {
  source: string;
  page: number;
  confidential: boolean;
}

const rag = new RagCore<MyMeta>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMeta>(),
});

// Metadata is typed everywhere
await rag.ingest({
  content: '...',
  metadata: { source: 'report.pdf', page: 42, confidential: true },
}).split().store();

const results = await rag.query('...');
results[0].chunk.metadata?.source; // ✅ TypeScript knows this is `string`
results[0].chunk.metadata?.page;   // ✅ TypeScript knows this is `number`
```

MIT © rag-core contributors