Minimal, elegant RAG framework for TypeScript
Zero dependencies · Type-safe · Production-grade · Shield included
Most RAG frameworks are heavyweight, opinionated, and leave security as an afterthought. rag-core is different:
| Principle | What it means |
|---|---|
| 🪶 Zero Dependencies | Pure TypeScript. Uses native fetch. No bloated dependency tree. |
| 🔒 Shield Built-in | Prompt injection detection out of the box. Production-grade from day one. |
| 🧩 Modular & Swappable | Every component — embedder, store, ranker — implements a clean interface. Swap OpenAI for Cohere in one line. |
| 🎯 Deep Type Safety | Generic metadata flows through the entire pipeline. Your IDE knows the shape of your data everywhere. |
| ⚡ 10-Line Quickstart | From install to working RAG pipeline in under 10 lines of code. |
```bash
npm install rag-core
```

```ts
import { RagCore, OpenAIEmbedder, MemoryStore } from 'rag-core';

const rag = new RagCore({
  embedder: new OpenAIEmbedder({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorStore: new MemoryStore(),
  shield: { detectInjection: true },
});

// Ingest a document — chainable, readable, typed
await rag
  .ingest({ content: 'TypeScript is a typed superset of JavaScript...' })
  .split({ chunkSize: 500 })
  .store();

// Query with automatic shield + vector search
const results = await rag.query('What is TypeScript?', { topK: 5 });
```

That's it: under ten lines of code to a working RAG pipeline with prompt injection protection.
```text
Document → [ Ingestor ] → Chunks → [ Embedder ] → Vectors → [ Store ]
                                                                ↓
Query → [ Shield ] → [ Embedder ] → [ Store.search ] → [ Ranker ] → Results
```
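The ordering in the query path matters: the shield gates the raw query before anything is embedded or searched. A toy sketch of that flow, with stub stages (none of these are the real rag-core classes, and the stub regex and embedding are purely illustrative):

```ts
// Toy query path mirroring the diagram: shield → embed → search → rank.
type Result = { text: string; score: number };

// Stub shield: a single illustrative pattern, not a real detector.
const shield = (q: string): boolean => !/ignore (all )?previous/i.test(q);

// Stub embedder: real pipelines call an embedding API here.
const embedQuery = (q: string): number[] => [q.length, q.split(' ').length];

// Stub store lookup returning canned results.
const search = (_vector: number[], topK: number): Result[] =>
  [{ text: 'doc chunk', score: 0.9 }].slice(0, topK);

// Ranker: sort by score descending.
const rank = (results: Result[]): Result[] =>
  [...results].sort((a, b) => b.score - a.score);

const query = (q: string, topK = 5): Result[] => {
  if (!shield(q)) throw new Error('prompt injection detected'); // gate first
  return rank(search(embedQuery(q), topK));
};
```

The key design point the diagram encodes: a malicious query is rejected before it costs an embedding call or touches the store.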
| Module | Class | Purpose |
|---|---|---|
| Ingestor | `RecursiveCharacterSplitter` | Smart text chunking with a recursive separator hierarchy |
| Embedder | `OpenAIEmbedder`, `CohereEmbedder` | Map text to vectors via any embedding API |
| Store | `MemoryStore` | Vector storage with cosine similarity search |
| Ranker | `CohereRanker` | Re-rank results with cross-encoder models |
| Shield | `InjectionDetector` | Detect prompt injection attacks before they reach your LLM |
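The Store's cosine-similarity search is plain math and easy to sketch. The following is an illustrative stand-alone implementation of the idea (not the library's source): score every stored vector against the query, sort descending, take the top K.

```ts
// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1] for nonzero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Top-K search: score everything, sort by score descending, keep K results.
function topK(
  query: number[],
  stored: { id: string; vector: number[] }[],
  k: number,
): { id: string; score: number }[] {
  return stored
    .map(({ id, vector }) => ({ id, score: cosineSimilarity(query, vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

This linear scan is exact but O(n) per query, which is why an in-memory store suits prototyping while production-scale corpora use an approximate-nearest-neighbor backend.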
`RagCore` is the main orchestrator; all operations flow through this class.
```ts
const rag = new RagCore<MyMetadata>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMetadata>(),
  ranker: new CohereRanker({ apiKey: '...' }),       // optional
  shield: { detectInjection: true, threshold: 0.7 }, // optional
});
```

Start a chainable ingestion pipeline:
```ts
await rag
  .ingest({ id: 'doc-1', content: '...', metadata: { source: 'web' } })
  .split({ chunkSize: 500, chunkOverlap: 50 })
  .store();
```

Batch-ingest multiple documents:
```ts
await rag.ingestMany([doc1, doc2, doc3], { chunkSize: 500 });
```

Query the pipeline with automatic shield → embed → search → rerank:
```ts
const results = await rag.query('How does X work?', {
  topK: 5,      // number of results (default: 5)
  rerank: true, // enable re-ranking (default: false)
  shield: true, // enable injection check (default: true)
});
```

Manually check any text for prompt injection:
```ts
const check = rag.shield(userInput);
if (!check.safe) {
  console.warn(`Blocked! Threats: ${check.threats.join(', ')}`);
}
```

```ts
new OpenAIEmbedder({
  apiKey: 'sk-...',
  model: 'text-embedding-3-small',      // default
  baseUrl: 'https://api.openai.com/v1', // default
});
```

```ts
new CohereEmbedder({
  apiKey: '...',
  model: 'embed-english-v3.0', // default
});
```

Implement the `Embedder` interface to use any provider:
```ts
import type { Embedder } from 'rag-core';

class MyEmbedder implements Embedder {
  async embed(texts: string[]): Promise<number[][]> { /* ... */ }
  async embedQuery(text: string): Promise<number[]> { /* ... */ }
}
```

`MemoryStore` is an in-memory store with pure cosine similarity search, great for prototyping and small datasets.
```ts
const store = new MemoryStore<MyMeta>();
store.size;    // number of stored chunks
store.clear(); // remove all chunks
```

Implement the `VectorStore` interface for Pinecone, Weaviate, Qdrant, etc.:
```ts
import type { VectorStore, EmbeddedChunk, SearchResult } from 'rag-core';

class PineconeStore<TMeta> implements VectorStore<TMeta> {
  async upsert(chunks: EmbeddedChunk<TMeta>[]): Promise<void> { /* ... */ }
  async search(query: number[], topK: number): Promise<SearchResult<TMeta>[]> { /* ... */ }
}
```

The Shield is rag-core's unique selling point: most frameworks ignore prompt security entirely.
```ts
import { InjectionDetector, sanitize } from 'rag-core';

const detector = new InjectionDetector(0.7); // threshold
const result = detector.analyze('Ignore all previous instructions...');
// { safe: false, score: 0.9, threats: ['role-override:ignore-previous'] }

const clean = sanitize(rawInput); // strips control chars, normalizes unicode
```

Detected threat categories:
- 🛡️ Role overrides (`ignore previous`, `you are now`, `pretend to be`)
- 🔓 Delimiter injection (`<|im_start|>`, `[INST]`, `<<SYS>>`)
- 📤 Data exfiltration (`show me your prompt`, `repeat the context`)
- 🎭 Jailbreaks (`DAN mode`, `developer mode`, `god mode`)
- 🔀 Obfuscation (`base64`, `eval()`, encoding tricks)
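A category-based detector like this can be approximated with a map from threat category to pattern. A minimal sketch (the patterns and category names below are illustrative, not the library's actual rule set):

```ts
// Hypothetical category → pattern map; a real detector uses a larger curated set.
const THREAT_PATTERNS: Record<string, RegExp> = {
  'role-override':       /ignore (all )?previous|you are now|pretend to be/i,
  'delimiter-injection': /<\|im_start\|>|\[INST\]|<<SYS>>/i,
  'data-exfiltration':   /show me your prompt|repeat the context/i,
  'jailbreak':           /\b(DAN|developer|god) mode\b/i,
};

// Return every category whose pattern matches; safe means no matches at all.
function detectThreats(text: string): { safe: boolean; threats: string[] } {
  const threats = Object.entries(THREAT_PATTERNS)
    .filter(([, pattern]) => pattern.test(text))
    .map(([category]) => category);
  return { safe: threats.length === 0, threats };
}
```

Reporting the matched category names, rather than a bare boolean, is what lets callers log or handle each threat class differently.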
```ts
import { RecursiveCharacterSplitter } from 'rag-core';

const splitter = new RecursiveCharacterSplitter({
  chunkSize: 500,   // max chars per chunk (default: 500)
  chunkOverlap: 50, // overlap between chunks (default: 50)
  separators: ['\n\n', '\n', '. ', ' ', ''], // custom hierarchy
});

const chunks = splitter.split({ id: 'doc-1', content: longText });
```

rag-core uses TypeScript generics so your metadata type flows through the entire pipeline:
```ts
interface MyMeta {
  source: string;
  page: number;
  confidential: boolean;
}

const rag = new RagCore<MyMeta>({
  embedder: new OpenAIEmbedder({ apiKey: '...' }),
  vectorStore: new MemoryStore<MyMeta>(),
});

// Metadata is typed everywhere
await rag.ingest({
  content: '...',
  metadata: { source: 'report.pdf', page: 42, confidential: true },
}).split().store();

const results = await rag.query('...');
results[0].chunk.metadata?.source; // ✅ TypeScript knows this is `string`
results[0].chunk.metadata?.page;   // ✅ TypeScript knows this is `number`
```

MIT © rag-core contributors