A cognitive operating system for local-first autonomous intelligence
NOUS (Greek: nous, meaning "intellect" or "mind") is a cognitive operating system — a fully local, autonomous AI that runs entirely on your machine with zero external dependencies. Unlike large language models that require cloud infrastructure, GPU clusters, or API subscriptions, NOUS operates as a single 20 MB static binary consuming approximately 50 MB of RAM, capable of running on hardware as constrained as a Raspberry Pi.
The name is also an acronym reflecting the system's architecture:
| Letter | Meaning | What it represents |
|---|---|---|
| N | Native | Runs natively on any hardware. No cloud, no containers, no runtime dependencies. Pure compiled Go. |
| O | Orchestration | Orchestrates 175 cognitive modules across 6 concurrent processing streams on a shared blackboard architecture. |
| U | Unified | Unifies perception, reasoning, planning, execution, reflection, and learning into a single coherent cognitive loop — not a pipeline of disconnected components. |
| S | Streams | Six cognitive streams (Perceiver, Reasoner, Planner, Executor, Reflector, Learner) run concurrently via goroutines on a pub/sub event bus, enabling real-time cognitive processing. |
NOUS is implemented in 235,000 lines of pure Go across 570 source files with 257 test files and zero external dependencies — no Python, no pip, no npm, no C bindings, no model downloads.
NOUS introduces several novel approaches to local AI systems:
NOUS includes the first Mamba Structured State Space Model (SSM) implemented in pure Go — a 7.6M parameter decoder-only model with selective scan, O(1)-per-token generation, and O(n) training. Unlike transformer-based LLMs that generate from statistical patterns (and thus hallucinate), NOUS constrains the Mamba decoder with a FactTrie — a prefix tree encoding all allowed token sequences from the knowledge graph. Beam search over this trie guarantees that the neural model physically cannot assert facts not present in the knowledge base.
Result: 100% factual accuracy on all tested queries. When NOUS doesn't know something, it says so honestly.
Every response NOUS generates feeds into a cognitive compiler that extracts regex-based patterns and slot-filled templates. Subsequent matching queries bypass the full NLU/reasoning pipeline entirely, resolving in ~0ms via compiled pattern handlers. Over weeks of use, the compilation rate converges toward 99%, meaning the system literally rewrites itself to be faster with every interaction. Compiled handlers persist across restarts.
NOUS is the first local AI system to implement autonomous background reasoning. When idle, the Dream Engine executes five operations: graph wandering (random walks for cross-domain connections), knowledge expansion (Wikipedia fetches for conversation topics), causal inference (edge discovery from graph topology), conversation reflection (pattern analysis from episodic memory), and cross-domain synthesis (2-hop insight generation). A quality scoring system based on cross-domain distance, entity rarity, and graph neighbor overlap ensures only genuinely surprising discoveries are surfaced. Dream cycles are scheduled adaptively — more active at night, lighter during active hours.
NOUS instances can share compiled response patterns (crystals) without sharing personal conversations. Each crystal contains only an abstract regex pattern and response template — never raw user data. Trust scoring (quality * votes * recency * bundle trust) auto-filters low-quality imports. This enables privacy-preserving collective intelligence across a network of NOUS instances.
Before responding to complex queries, NOUS convenes an Inner Council of five cognitive perspectives — Pragmatist (factual grounding), Historian (episodic memory patterns), Empath (emotional subtext), Architect (systemic connections), and Skeptic (gap identification). Multi-round debate with four move types (strengthen, concede, challenge, synthesize) enables genuine deliberation with consensus detection and early termination.
NOUS chains a GraphCausalReasoner, MultiHopReasoner, and InnerCouncil to run forward scenario simulations over the knowledge graph. Given a hypothesis ("What if the internet disappeared?"), the engine propagates causal effects step by step with confidence decay (0.85^n per step), discovers multi-hop connections between affected entities, and evaluates each intermediate state from five perspectives. Risk assessment classifies scenarios by severity, predictability, and cascading potential.
┌──────────────────────────────────────────────────────────────────┐
│ USER INPUT │
│ Terminal / HTTP API / Telegram / UNIX pipes │
└──────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ COGNITIVE COMPILER (compiled handlers, ~0ms) │
│ HIT → instant response MISS ↓ │
└──────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ NLU ENGINE (neural + pattern, <1ms, 98.2% accuracy) │
│ 51 intent categories, entity extraction, slot filling │
└──────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ DISPATCH PIPELINE │
│ Priority-tagged, phase-based routing with bypass rules │
│ │
│ ├── Simulation Engine ── causal chains + council + multi-hop │
│ ├── Expert Personas ── 7 domain-constrained experts │
│ ├── Deep Research ── decompose → fetch → infer → report │
│ ├── Knowledge Synthesis ── reason from adjacent knowledge │
│ └── Action Router ── 52 tools, knowledge graph, Wikipedia │
└──────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ RESPONSE LAYER: Mamba SSM (constrained) / Composer / GraphRAG │
│ → Quality Gate → Dedup → Fragment Cleanup → Format Compliance │
│ → Compiler Learning (async pattern extraction) │
└──────────────────────┬────────────────────────────────────────────┘
│
▼
Response (5ms typical)
┌───────────────────────────┐
Background ──>│ DREAM MODE (autonomous) │
│ Wander · Expand · Infer │
│ Reflect · Synthesize │
└───────────────────────────┘
| Layer | Persistence | Capacity | Purpose |
|---|---|---|---|
| Working | Session | 64 slots | Current conversation context with relevance decay |
| Long-term | Permanent | Unbounded | Personal facts, user preferences, profile |
| Episodic | Permanent | 10,000 interactions | Every interaction, timestamped, semantically searchable |
| Project | Per-directory | Unbounded | Project-scoped facts with confidence scores |
| Knowledge | Permanent | 5,000+ facts | Typed knowledge graph + 680 Wikipedia paragraphs + on-demand expansion |
| Compiled | Permanent | Grows with use | Cognitive compiler handlers — patterns that resolve in ~0ms |
| System | Mechanism |
|---|---|
| Mamba SSM | Pure-Go structured state space model. O(1)/token, 7.6M params, selective scan. |
| Constrained Decoding | FactTrie + beam search = neural fluency with zero hallucination. |
| Cognitive Compiler | Compiles responses into deterministic handlers. Convergence: 99% at 1 month. |
| Dream Mode | 5 autonomous operations with quality scoring (surprise, dedup, novelty). |
| Deep Research | 6-phase pipeline: decompose → fetch → infer → connect → extract → report. |
| Simulation Engine | Forward causal propagation with confidence decay and council evaluation. |
| Expert Personas | 7 domain experts with interaction-based specialization learning. |
| Inner Council | 5 perspectives, multi-round debate, 4 move types, consensus detection. |
| Predictive Intelligence | Recency-weighted topic scoring, temporal patterns, anomaly detection. |
| Cognitive Profiling | Learns depth preference, learning style, peak hours, blind spots. |
| Causal Inference | 4 strategies: temporal, dependency, inhibition, production chains. |
| GraphRAG | Spreading activation BFS, relevance ranking, diversity filtering. |
| Knowledge Synthesis | 6 strategies: generalization, decomposition, analogy, causal, contrastive, compositional. |
| Wikipedia On-Demand | Live REST API fetch → fact extraction → graph learning → permanent knowledge. |
| Knowledge Expander | Recursive frontier discovery, multi-generation gap filling. |
| Agent Experience | Records tool chain outcomes per goal type, avoids failing tools. |
| Federated Crystals | Privacy-preserving pattern sharing with trust scoring. |
| Conversation Learning | Extracts typed triples from responses into knowledge graph. |
| Causal Bootstrap | 160+ common-sense causal edges seeded at startup for simulation. |
| Socratic Engine | 5 modes, 79 question templates. Detects when to ask, not answer. |
| Capability | Performance | Details |
|---|---|---|
| Knowledge Q&A | 5ms, 100% accuracy | 680+ topics + Wikipedia on-demand for any topic on Earth |
| Mathematical computation | <1ms | Full expression parser with functions (sqrt, pow, sin, cos) |
| NLU classification | <1ms, 98.2% | 51 intent categories with entity extraction |
| Safety filtering | Instant | 50+ harmful pattern categories with compassionate self-harm response |
| Expert perspectives | 5ms | 7 domain-constrained personas that learn from interactions |
| What-if simulation | 20ms | Causal chain propagation with risk assessment |
| Autonomous research | 1-3s | Topic decomposition → knowledge fetch → causal inference → report |
| Background reasoning | Continuous | Dream mode with quality-scored discovery (surprise, novelty) |
| Predictive intelligence | Instant | Anticipates next topics from temporal + behavioral patterns |
| 52 built-in tools | <1ms-500ms | Weather, calculator, notes, todos, habits, code runner, web search, and 40+ more |
| Autonomous agents | Seconds-minutes | Goal decomposition, tool chaining, experience learning, human-in-the-loop |
| Federated learning | Instant | Privacy-preserving crystal sharing between instances |
| Limitation | Why | Mitigation |
|---|---|---|
| Prose fluency | Template-based NLG, not neural generation | Mamba SSM improves this; cognitive compiler caches good responses |
| Creative writing | Cannot write essays, stories, or poems at LLM quality | Structured creative output (haiku, concept blending) available |
| Open-ended reasoning | Graph traversal is precise but bounded by knowledge density | Knowledge expander + Wikipedia on-demand grow the graph continuously |
| Multi-turn conversation | Follow-up resolution is heuristic, not neural | Reference resolution system handles pronouns and context switching |
| Image/audio processing | Text-only cognitive architecture | Future work: multimodal perception layer |
| Code generation | Template-based, not neural | Code review, dependency analysis, and explanation are strong |
| Broad world knowledge | 680 curated topics vs. billions of LLM parameters | Wikipedia on-demand and dream mode expand knowledge autonomously |
| Cloud LLMs | Local LLMs (Ollama) | NOUS | |
|---|---|---|---|
| Runtime | Cloud servers | Your GPU (4-16 GB) | Any CPU (50 MB) |
| Latency | 500ms-3s | 100ms+ | 5ms |
| Privacy | Data leaves machine | Local | 100% local, zero telemetry |
| Dependencies | API key + internet | Python + model download | Zero |
| Hallucination | 5-15% | 5-10% | 0% (architecturally impossible) |
| Self-improvement | No | No | Yes (compiler + dream mode + learning) |
| Autonomous reasoning | No | No | Yes (dream mode, 5 operations) |
| Causal simulation | No | No | Yes (160+ causal edges, council debate) |
| Memory | Per-session | None | 6-layer persistent memory |
| Tools | Plugin-based | Wrappers | 52 built-in, all local |
| Cost | $20/month+ | GPU hardware | Free |
git clone https://github.com/artaeon/nous.git
cd nous
go build -o nous ./cmd/nous
./nousNo pip install. No model downloads. No GPU drivers. First launch trains the neural classifier (~90 seconds). After that, starts instantly.
go build -o nous-train ./cmd/nous-train
./nous-train mamba -knowledge knowledge/
# Model saved to ~/.nous/mamba.bin (~30 minutes on CPU)./nous --serve --port 3333 --api-key yoursecretkey
curl -X POST http://localhost:3333/api/chat \
-H "Authorization: Bearer yoursecretkey" \
-H "Content-Type: application/json" \
-d '{"message": "what is quantum mechanics"}'# Pipe-friendly cognitive infrastructure
echo "what is the weather in Paris" | nous understand
# {"intent":"weather","action":"weather","entities":{"location":"paris"},"confidence":0.95}
nous reason "Should I use Python or Go for a web server?"
nous remember "project.lang" "Go"nous dream 10 # Autonomous background reasoning
nous research "artificial intelligence" --depth deep # Multi-step investigation
nous simulate "what if the internet disappeared" # Causal scenario simulation
nous expand --generations 3 # Recursive knowledge expansion
nous infer # Run causal inferencenous/ # 235,000 lines of Go, 570 files, zero dependencies
├── cmd/
│ ├── nous/ # Main binary — REPL + HTTP server + agent + CLI
│ ├── nous-train/ # Neural model training (NLU, TextGen, Mamba)
│ ├── wikiimport/ # Wikipedia -> knowledge packages
│ └── wikitemplate/ # Knowledge template generation
├── internal/
│ ├── cognitive/ # 175 modules — the cognitive engine
│ ├── agent/ # Autonomous agent with experience learning
│ ├── federation/ # Federated crystal sharing with trust scoring
│ ├── memory/ # 6-layer persistent memory
│ ├── micromodel/ # Mamba SSM + transformer + constrained decoding
│ ├── tools/ # 52 built-in tools
│ ├── eval/ # Quality evaluation, red-team suites
│ ├── training/ # Distillation, preference optimization
│ ├── server/ # HTTP API + web UI
│ ├── channels/ # Telegram, Discord, Matrix integrations
│ └── hands/ # Multi-step task automation
├── knowledge/ # 78K words, 680+ topics, curated knowledge base
├── packages/ # Structured knowledge packages (JSON)
└── go.mod # go 1.22 — zero external dependencies
| Decision | Rationale |
|---|---|
| Pure Go, zero dependencies | Single static binary. Cross-compiles to any platform. No supply chain risk. Deployment is scp + ./nous. |
| Knowledge graph over embeddings | Typed relations (is_a, causes, enables, prevents) enable causal reasoning, not just similarity matching. |
| Constrained decoding over RLHF | Hallucination is eliminated architecturally, not probabilistically. The system cannot generate unknown facts. |
| Cognitive compiler over caching | Compiled handlers generalize via regex capture groups. "what is {X}" matches any topic, not just previously seen queries. |
| 6-layer memory over context windows | Working memory decays, episodic memory persists, knowledge graph grows. No 128K token limit — memory is unbounded. |
| Federated crystals over fine-tuning | Share patterns between instances without sharing data. No training required — instant import. |
- Go 1.22+ (build only)
- ~50 MB RAM at runtime
- No GPU required
- No internet required (after build; weather/search need network)
- Linux, macOS, or Windows (amd64, arm64)
| Document | Description |
|---|---|
| ARCHITECTURE.md | Detailed technical architecture with diagrams |
| CHANGELOG.md | Version history and release notes |
| BENCHMARKS.md | Performance measurements and profiling |
| TESTING.md | Test strategy, coverage, and CI pipeline |
| SECURITY.md | Threat model and defense layers |
| CONTRIBUTING.md | Development setup and contribution guidelines |
If you use NOUS in academic work, please cite:
@software{nous2026,
title = {NOUS: Native Orchestration of Unified Streams},
author = {Lugmayr, Raphael},
year = {2026},
url = {https://github.com/artaeon/nous},
note = {A cognitive operating system for local-first autonomous intelligence.
235K lines of pure Go, zero dependencies, zero hallucination.}
}MIT. See LICENSE.
Built by Artaeon