O(n) attention is deception. A backend-neutral kernel of predictive primitives — substrates, memory, gating, routing, readouts — that downstream systems combine into trained models without forking the kernel itself.
decepticons is the shared mechanism layer for predictive descendants:
substrate dynamics, controller summaries, memory primitives, feature views,
readouts, and runtime helpers extracted from a broader experiment family so
downstream systems can specialize without forking the kernel.
Requires Python ≥ 3.11. The kernel itself only needs numpy.
If you don't already have a Python virtual environment, make one first.
Modern Linux distributions block pip from writing into the system Python
(PEP 668), so a venv is the standard path:

    python3 -m venv .venv
    source .venv/bin/activate   # Windows: .venv\Scripts\activate

Then install from PyPI:

    pip install decepticons

For the model backends:

    pip install "decepticons[torch]"   # PyTorch CausalBankModel + routed readouts
    pip install "decepticons[metal]"   # Apple MLX backend (Apple Silicon)

To leave the venv when you're done: deactivate.
For development from source (clone + editable install + run tests):

    git clone https://github.com/asuramaya/decepticons
    cd decepticons
    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e ".[test]"
    pytest -v

Quick start in Python:

    from decepticons import ByteCodec, ByteLatentPredictiveCoder

    text = "predictive coding likes repeated structure.\n" * 64
    model = ByteLatentPredictiveCoder()
    report = model.fit(text)

    prompt = ByteCodec.encode_text("predictive ")
    sample = model.generate(prompt, steps=40, greedy=True)

    print(report.train_bits_per_byte)
    print(ByteCodec.decode_text(sample))

CLI:

    decepticons fit --input ./corpus.txt --prompt "predictive " --generate 80

A complete worked example lives in examples/quickstart.py.
For descendant-shaped projects, see
examples/projects/.
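
The quickstart prints `report.train_bits_per_byte`. How decepticons computes that field is not shown in this README; the sketch below only illustrates the conventional definition of the metric (mean negative log2 probability assigned to each observed byte). The function name `bits_per_byte` and its input format are assumptions made for illustration, not part of the package API.

```python
import math


def bits_per_byte(probs_of_actual: list[float]) -> float:
    """Average code length, in bits, for a sequence of observed bytes.

    probs_of_actual[i] is the probability the model assigned to the byte
    that actually occurred at position i (hypothetical input format).
    """
    if not probs_of_actual:
        raise ValueError("need at least one prediction")
    total_bits = -sum(math.log2(p) for p in probs_of_actual)
    return total_bits / len(probs_of_actual)


# A model that spreads probability uniformly over all 256 byte values
# spends exactly 8 bits on every byte.
uniform = bits_per_byte([1 / 256] * 10)

# A model that always gives the correct byte probability 1/2 spends 1 bit.
confident = bits_per_byte([0.5, 0.5, 0.5])
```

Lower is better: a value below 8 means the model has captured some structure in the byte stream.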
| Area | Highlights |
|---|---|
| Substrates | recurrent, delay, linear-memory, oscillatory, mixed, hierarchical |
| Control | controller summaries, pathway gates, summary routing, hormone modulation, predictive surprise |
| Memory | exact-context, n-gram, statistical-backoff, online n-gram, cache views |
| Views | byte-latent, hierarchical, linear-memory, sampled multiscale, bridge features, probability diagnostics |
| Readouts | ridge, frozen-readout expert, sampled multiscale, GRU recurrent, routed squared-ReLU |
| Adapters | causal predictive, oracle analysis, bridge export, noncausal reconstructive, paired teacher/export |
| Runtime | traces, fit reports, rollout evaluation, transfer probes, train-mode checkpoints, artifact accounting |
| Causal-bank | family metadata + deterministic substrate construction (frozen / learnable-decays / learnable-mixing / learned-recurrence / gated-retention) |
| Backends | numpy-only kernel; PyTorch and MLX CausalBankModel implementations |
Full capability matrix: docs/kernel_matrix.md.

    decepticons ──→ chronohorn       ──→ heinrich
    kernel          runtime              forensics
    (this repo)     training · fleet     geometry · audit

Three layers inside this repo:
- Kernel: src/decepticons/. Public package. Reusable mechanisms only.
- Project descendants: examples/projects/. Pressure-tests the kernel boundary with concrete descendant shapes (causal · oracle · bridge · noncausal · byte-latent).
- Tooling: examples/tools/. Development and analysis scripts. Not part of the public package.
Code moves into src/ only when all three hold:
- It is a mechanism, not a project policy.
- At least two descendants want the same thing.
- The generalized API is simpler than keeping the duplication.
This rule is the main defense against turning the kernel into a renamed
collection of branches. Full detail is in docs/architecture.md; the boundary
against the runtime is described in docs/chronohorn_boundary.md.
Causality of every substrate mode is verified by tests/test_causality.py,
which feeds the model two sequences that are identical up to position t and
differ after it. If the logits at position t differ, causality is violated
and CI fails. Modes verified: frozen, learnable_mixing, learnable_decays,
selective scan augment (state_dim > 0), readout_bands, and routed experts.
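
The idea behind the causality check can be sketched in a few lines of numpy. This is not the code in tests/test_causality.py: `toy_causal_logits`, `leaky_logits`, and `is_causal` are hypothetical stand-ins, and the sketch compares every position up to and including t, which is a slightly stronger check than comparing position t alone.

```python
import numpy as np


def toy_causal_logits(tokens: np.ndarray, decay: float = 0.9, vocab: int = 256) -> np.ndarray:
    """Hypothetical causal model: a decayed running count of the bytes seen so far."""
    state = np.zeros(vocab)
    logits = np.empty((len(tokens), vocab))
    for i, tok in enumerate(tokens):
        state = decay * state
        state[tok] += 1.0
        logits[i] = state  # depends only on tokens[0..i]
    return logits


def leaky_logits(tokens: np.ndarray, vocab: int = 256) -> np.ndarray:
    """Deliberately non-causal: every position sees the whole-sequence histogram."""
    hist = np.bincount(tokens, minlength=vocab).astype(float)
    return np.tile(hist, (len(tokens), 1))


def is_causal(model, length: int = 32, t: int = 15, vocab: int = 256, seed: int = 0) -> bool:
    """Feed two sequences that agree up to t and differ after t; compare logits up to t."""
    rng = np.random.default_rng(seed)
    a = rng.integers(0, vocab, size=length)
    b = a.copy()
    b[t + 1:] = (b[t + 1:] + 1) % vocab  # change every token after position t
    logits_a, logits_b = model(a), model(b)
    return bool(np.allclose(logits_a[: t + 1], logits_b[: t + 1]))


print(is_causal(toy_causal_logits))
print(is_causal(leaky_logits))
```

The running-count model passes because its output at position i never reads past i; the histogram model fails because a change after t alters the logits at every earlier position.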
decepticons never imports its descendants — enforced by an AST scan in
tests/test_dependency_firewall.py.
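
An AST-based import firewall of this kind might look like the sketch below. It is an illustration using only the standard library, not the contents of tests/test_dependency_firewall.py; the function names and the `FORBIDDEN` set (filled with the two descendant names from the diagram) are assumptions.

```python
import ast
from pathlib import Path

# Assumed list of descendant packages the kernel must never import.
FORBIDDEN = {"chronohorn", "heinrich"}


def forbidden_imports(source: str, forbidden: set[str] = FORBIDDEN) -> list[str]:
    """Return the imported module names whose top-level package is forbidden."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        hits += [name for name in names if name.split(".")[0] in forbidden]
    return hits


def scan_package(root: str) -> dict[str, list[str]]:
    """Map each offending .py file under root to its forbidden imports."""
    offenders: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = forbidden_imports(path.read_text(encoding="utf-8"))
        if hits:
            offenders[str(path)] = hits
    return offenders
```

A test could then assert that `scan_package("src/decepticons")` returns an empty dict. Parsing the source rather than importing it means the check also catches imports hidden inside functions or conditional branches.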
- Architecture — package map, three-layer model, promotion rule
- Kernel matrix — capability matrix
- Chronohorn boundary — boundary against the runtime descendant
- Downstream patterns — causal, noncausal, oracle, bridge, byte-latent patterns
- Related work — research anchors and prior art
- Landscape — ecosystem snapshot (March 2026)
- Lineage — source attribution
- Examples — example descendants and tooling
- Tests — verification surface
This is a research kernel and reference implementation. The current pressure from descendants is O(n) causal-bank architecture search — cheap ablation lanes to separate mechanisms before promotion, with scale and context survival checked in the descendant runtime.
It is not a frontier runtime, a production compression stack, or a benchmark claim. It exists to keep the shared mechanism layer reusable and legible.
See CONTRIBUTING.md.
Issues and pull requests welcome.
MIT — see LICENSE.