Ingestion Engine

Convert raw documents into structured, retrieval-optimized knowledge objects for AI systems.

This prompt transforms articles, documentation, transcripts, datasets, and other sources into clean JSON knowledge structures that preserve:

evidence and provenance
structured datasets
instructions and workflows
entities and definitions
retrieval-optimized chunks

Designed for use with:

OpenClaw
ChatGPT Projects
Claude Projects
Claude Code
RAG pipelines
AI knowledge bases

Why This Exists

Most people give documents to LLMs like this:

Here is a document. Summarize it.

This works for quick answers but introduces several problems:

important details get compressed or lost
datasets become flattened text
evidence disappears
claims and facts get mixed together
the same document must be reprocessed repeatedly

Large language models are optimized for conversation, not long-term knowledge storage.

The Universal Source Ingestion Engine converts raw sources into structured knowledge objects that can be reused across AI workflows.

Instead of repeatedly analyzing the same document, you ingest it once and reuse the structured output.

What the Engine Produces

The prompt converts a source into a structured JSON object containing sections such as:

source metadata
key points
evidence index
entities
definitions
instructions
workflows
datasets
QA pairs
retrieval chunks

These structures make it easier for AI systems to:

retrieve specific information
reason over structured knowledge
reference evidence
reuse knowledge across tasks
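
As a rough illustration of working with these sections, the sketch below checks which of them are present in a returned object. The snake_case section names are assumptions modelled on the list above. Only key_points and entities appear in this README's example output, so adjust the names to whatever your run of the prompt actually returns.

```python
import json

# Hypothetical section names, derived from the list above. Only
# "key_points" and "entities" are confirmed by the README's example output.
EXPECTED_SECTIONS = [
    "source_metadata",
    "key_points",
    "evidence_index",
    "entities",
    "definitions",
    "instructions",
    "workflows",
    "datasets",
    "qa_pairs",
    "retrieval_chunks",
]


def missing_sections(knowledge_object: dict) -> list[str]:
    """Return the expected section names that are absent from the object."""
    return [name for name in EXPECTED_SECTIONS if name not in knowledge_object]


# A deliberately partial object, to show what the check reports.
raw_output = '{"key_points": [], "entities": ["AI agents"]}'
knowledge_object = json.loads(raw_output)
print(missing_sections(knowledge_object))
```

A check like this is useful mainly as a quick sanity test before storing the object. It does not validate the contents of each section.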

Quick Start

Open prompt.md
Copy the entire prompt
Paste it into your AI system
Paste or attach a source document
The model returns a structured JSON knowledge object

You can then store that JSON in:

a knowledge base
a dataset
an OpenClaw workspace
a RAG system
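
One possible way to store the result is a plain folder of JSON files keyed by a hash of the source text, so the same document is only ingested once. This is a minimal sketch, not part of the repository: the function names and the hash-based file naming are assumptions chosen for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def store_path(store_dir: Path, source_text: str) -> Path:
    """Derive a stable file name from the source text (hypothetical scheme)."""
    digest = hashlib.sha256(source_text.encode("utf-8")).hexdigest()[:16]
    return store_dir / f"{digest}.json"


def save_knowledge_object(store_dir: Path, source_text: str, model_output: str) -> Path:
    """Parse the model's JSON output and write it to the store."""
    knowledge_object = json.loads(model_output)  # raises if the output is not valid JSON
    store_dir.mkdir(parents=True, exist_ok=True)
    path = store_path(store_dir, source_text)
    path.write_text(
        json.dumps(knowledge_object, indent=2, ensure_ascii=False),
        encoding="utf-8",
    )
    return path


def load_knowledge_object(store_dir: Path, source_text: str) -> Optional[dict]:
    """Return the stored object for this source, or None if it was never ingested."""
    path = store_path(store_dir, source_text)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

Before sending a document to the model, call load_knowledge_object first and only run the prompt when it returns None.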

Using This With OpenClaw

OpenClaw works especially well with structured knowledge sources.

Typical workflow:

Copy the prompt from prompt.md
Paste it into OpenClaw
Provide a source document
The model generates a structured JSON knowledge object
Save the structured output inside your OpenClaw workspace

Benefits

faster retrieval
improved reasoning for agents
reusable knowledge objects
preserved datasets and evidence

Instead of repeatedly pasting the same document into a model, you ingest it once and reuse the result.
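
To make the reuse concrete, here is a small sketch of looking up stored chunks by keyword overlap instead of re-reading the whole document. The retrieval_chunks key and the id and text fields are assumptions, since this README does not show the chunk schema. The scoring is a deliberately naive stand-in for whatever retrieval your RAG system uses.

```python
def overlap_score(query: str, text: str) -> int:
    """Count the lowercase whitespace-separated terms shared by query and text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def top_chunks(knowledge_object: dict, query: str, limit: int = 3) -> list[dict]:
    """Return up to `limit` chunks that share at least one term with the query."""
    # "retrieval_chunks" and "text" are hypothetical field names.
    chunks = knowledge_object.get("retrieval_chunks", [])
    scored = [(overlap_score(query, chunk.get("text", "")), chunk) for chunk in chunks]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)  # stable: ties keep source order
    return [chunk for _, chunk in matches[:limit]]


stored = {
    "retrieval_chunks": [
        {"id": "c1", "text": "AI agents automate complex workflows."},
        {"id": "c2", "text": "Agent frameworks orchestrate APIs and reasoning models."},
    ]
}
print([chunk["id"] for chunk in top_chunks(stored, "agent frameworks")])
```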

Saving the Engine in OpenClaw

OpenClaw rebuilds its system prompt on each run from the context files in your workspace.

The easiest way to make the ingestion engine reusable is to store instructions in your workspace.

Method 1 — Add Instructions to AGENTS.md (Recommended)

Create or edit a file called:

AGENTS.md

Add instructions like this:

When I paste or attach a source document and ask to ingest it,
use the Universal Source Ingestion Engine from prompt.md.

Return the final structured JSON knowledge object.

Because AGENTS.md is automatically injected into the OpenClaw context, the ingestion workflow is available on every run in your project.

Method 2 — Manual Trigger Phrase

You can define a consistent trigger phrase while working with OpenClaw.

Example trigger:

/ingest

Workflow example:

/ingest

[paste document here]

The model then applies the ingestion rules and returns the structured JSON output.

Method 3 — Package as an OpenClaw Skill (Advanced)

You can package the ingestion workflow as a reusable OpenClaw skill.

Example structure:

skills/
  ingestion/
    SKILL.md

The skill can reference the ingestion prompt and apply it whenever ingestion is requested.

Using This With Claude Projects

Create a Claude Project
Add the prompt to Project Instructions
Paste or attach documents

Claude will follow the ingestion rules whenever a new source is provided.

Using This With ChatGPT Projects

Create a project
Add the prompt to Project Instructions
Paste or upload documents

The model will convert those documents into structured knowledge objects.

Example

Example Source

AI agents are increasingly used to automate complex workflows across software development and operations.

Companies are building agent frameworks that orchestrate APIs and reasoning models to perform tasks autonomously.

Example Output Structure

{
  "key_points": [
    {
      "statement": "AI agents are increasingly used to automate complex workflows.",
      "verification_status": "likely"
    }
  ],
  "entities": [
    "AI agents",
    "agent frameworks",
    "workflow automation"
  ]
}

See the examples/ folder for a full example.
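
Once the output is parsed, downstream code can read it like any other JSON. The sketch below loads the example structure shown above and pulls out key points by verification status. The helper name is hypothetical, and "likely" is the only status value this README shows.

```python
import json

# The example output structure from this README, copied verbatim.
example_output = """
{
  "key_points": [
    {
      "statement": "AI agents are increasingly used to automate complex workflows.",
      "verification_status": "likely"
    }
  ],
  "entities": [
    "AI agents",
    "agent frameworks",
    "workflow automation"
  ]
}
"""


def statements_with_status(knowledge_object: dict, status: str) -> list[str]:
    """Return the statements of all key points carrying the given status."""
    return [
        point["statement"]
        for point in knowledge_object.get("key_points", [])
        if point.get("verification_status") == status
    ]


knowledge_object = json.loads(example_output)
print(statements_with_status(knowledge_object, "likely"))
# prints ['AI agents are increasingly used to automate complex workflows.']
print(knowledge_object["entities"])
# prints ['AI agents', 'agent frameworks', 'workflow automation']
```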

Safety Note

This prompt treats source documents strictly as data, not instructions.

It tells the model to ignore any instructions embedded in the source itself, which reduces the risk of prompt injection.

However, prompt-based defenses are not perfect. Always review outputs before using them in automated systems.

Avoid executing commands, running code, or triggering actions based solely on model output.
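
One way to apply this in an automated pipeline is to parse the model output as data only and hold it for human review before anything else consumes it. This is a sketch under that assumption: the function names and the reviewed flag are hypothetical and not part of the prompt or its output format.

```python
import json


def parse_for_review(model_output: str) -> dict:
    """Parse model output as JSON data and wrap it in an unreviewed record.

    json.loads only parses text; it never evaluates or executes it.
    """
    try:
        knowledge_object = json.loads(model_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(knowledge_object, dict):
        raise ValueError("expected a JSON object at the top level")
    # The "reviewed" flag is a hypothetical convention for this sketch.
    return {"reviewed": False, "knowledge_object": knowledge_object}


def approve(record: dict) -> dict:
    """Return a copy of the record marked as reviewed by a person."""
    return {**record, "reviewed": True}


record = parse_for_review('{"entities": ["AI agents"]}')
print(record["reviewed"])
```

Downstream steps can then refuse any record whose reviewed flag is still False.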

Repository Structure

universal-source-ingestion-engine/
  README.md
  prompt.md
  examples/
    example_input.txt
    example_output.json

License

MIT License

This allows anyone to use, modify, and share the prompt, provided the copyright and license notice are kept with any copies.
