neuron7xLab/time-rupture-inference

time rupture inference

prediction-error temporal adaptation · hidden rupture · fail-closed evidence

error → update → prediction → action → next-state estimate

The one mechanism

A hidden inter-event interval τ₀ ruptures to τ₁ at an unseen step T*. The agent never sees τ₀, τ₁, T*, or the noise — only the realised interval. It must infer the change from its own prediction error and re-adapt. Nothing is handed to the learner.

S(t) → O(t) → B(t) → P(S(t+Δ)) → E(t) → U(t) → S(t+1)
state   obs    belief  prediction  error   update
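The loop above can be sketched in a few lines. This is a minimal illustration, not the code in src/ctios: the class name `RuptureEnv`, the Gaussian noise model, and the fixed-gain belief update are all assumptions made for the example. The point it shows is the information boundary: `step()` returns the realised interval and nothing else.

```python
import random
from dataclasses import dataclass


@dataclass
class RuptureEnv:
    """Hidden-rupture interval source (hypothetical names, assumed Gaussian noise)."""
    tau0: float       # hidden interval before the rupture
    tau1: float       # hidden interval after the rupture
    t_star: int       # hidden rupture step T*
    noise_sd: float   # hidden noise scale
    seed: int = 0

    def __post_init__(self) -> None:
        self._rng = random.Random(self.seed)  # seeded for deterministic replay
        self._t = 0

    def step(self) -> float:
        """Return only the realised interval; tau0, tau1, T* and noise stay hidden."""
        tau = self.tau0 if self._t < self.t_star else self.tau1
        self._t += 1
        return max(0.0, tau + self._rng.gauss(0.0, self.noise_sd))


def run_loop(env: RuptureEnv, steps: int, gain: float = 0.2) -> list[float]:
    """O(t) -> B(t) -> P -> E(t) -> U(t) with a fixed-gain scalar belief."""
    belief = env.step()               # initialise the belief from the first observation
    errors = []
    for _ in range(steps - 1):
        prediction = belief           # P(S(t+Δ)): predict the next interval
        obs = env.step()              # O(t): the realised interval
        error = obs - prediction      # E(t): prediction error
        belief += gain * error        # U(t): error-driven update
        errors.append(error)
    return errors
```

Two runs with the same seed return identical error sequences, which is the property deterministic replay relies on.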

What is established — and what is not

Allowed claim. The learned agent adapts to hidden temporal regime shifts better than fixed and naive baselines under preregistered metrics, deterministic replay, no-leakage constraints, and ablation controls.

Forbidden claim. The agent has intelligence, consciousness, biological neuroplasticity, or understanding of time.

This release is prediction-error temporal adaptation — frozen and scoped. Causal action (do(A) → S(t+Δ)) is the next pre-registered lineage, deliberately not begun here.

What this is / is not (model taxonomy card)

| dimension | this system | explicitly NOT |
|---|---|---|
| learner | one scalar estimate m += gain·error + Page-Hinkley drift trigger | a neural network / backprop / gradients |
| representation | a single inter-event interval scalar | a representational hierarchy or world model |
| "adaptation" | online error-driven parameter + gain update | cognition, understanding, or sentience |
| "neuroplastic-like" markers | 4 measured operational quantities, ablation-gated | biological neuroplasticity or brain fidelity |
| scope | one hidden-regime temporal-rupture benchmark | general intelligence / AGI |
| claim status | narrow, gate-enforced, falsifiable | an ontological claim about mind |
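To make the "learner" row concrete, here is a hedged sketch of a scalar estimate updated by `m += gain·error`, paired with a two-sided Page-Hinkley test on the signed prediction error. It is not the repository's implementation. The class names, the parameter values (`delta`, `threshold`, the gains), and the choice to boost the gain on an alarm and let it decay back are all assumptions for illustration.

```python
class PageHinkley:
    """Two-sided Page-Hinkley test on a stream of signed prediction errors."""

    def __init__(self, delta: float = 0.5, threshold: float = 8.0) -> None:
        self.delta = delta          # tolerated drift magnitude (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.reset()

    def reset(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.up = self.up_min = 0.0  # cumulative statistic for upward shifts
        self.dn = self.dn_max = 0.0  # cumulative statistic for downward shifts

    def update(self, x: float) -> bool:
        """Feed one error; return True when a shift in its mean is flagged."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.up += x - self.mean - self.delta
        self.dn += x - self.mean + self.delta
        self.up_min = min(self.up_min, self.up)
        self.dn_max = max(self.dn_max, self.dn)
        return (self.up - self.up_min > self.threshold
                or self.dn_max - self.dn > self.threshold)


class ScalarLearner:
    """One scalar estimate m, updated by m += gain * error (illustrative only)."""

    def __init__(self, base_gain: float = 0.1, boost_gain: float = 0.8,
                 decay: float = 0.5) -> None:
        self.m = None                # current interval estimate
        self.base_gain = base_gain
        self.boost_gain = boost_gain
        self.decay = decay
        self.gain = base_gain
        self.alarms = 0
        self.detector = PageHinkley()

    def step(self, obs: float) -> float:
        """Consume one realised interval; return the prediction made before seeing it."""
        if self.m is None:           # cold start: adopt the first observation
            self.m = obs
            return obs
        prediction = self.m
        error = obs - prediction
        if self.detector.update(error):
            self.alarms += 1
            self.gain = self.boost_gain   # re-open learning after a flagged rupture
            self.detector.reset()
        self.m += self.gain * error
        self.gain = max(self.base_gain, self.gain * self.decay)  # relax toward base gain
        return prediction
```

On a noise-free stream that jumps from 10 to 18, this sketch raises an alarm within a few steps of the jump and the estimate settles near 18. With noise, `delta` and `threshold` would need to be set against the noise scale.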

A machine-checkable lexicon (claims.yaml) plus scripts/claims_lint.py is enforced in CI and pytest: external-facing text cannot assert cognition / neural-equivalence / AGI outside an explicit disclaimer block (all such terms are forbidden unless negated).

The risk this kills is interpretation-layer inflation, not a code defect.
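The "forbidden unless negated" rule can be illustrated with a small sketch. This is not scripts/claims_lint.py, and the format of claims.yaml is not assumed here: the term list, the negation cues, and the sentence-level rule are hypothetical stand-ins. Disclaimer-block handling is left out of the sketch.

```python
import re

# Hypothetical lexicon; the real terms live in claims.yaml.
FORBIDDEN = ["intelligence", "consciousness", "cognition", "AGI", "sentience"]
# Hypothetical negation cues that make a mention acceptable.
NEGATIONS = ("not", "no", "never", "without")


def lint(text: str) -> list[str]:
    """Return one message per forbidden term asserted without a preceding negation."""
    violations = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = re.findall(r"[A-Za-z]+", sentence.lower())
        for term in FORBIDDEN:
            t = term.lower()
            if t not in words:
                continue
            before = words[: words.index(t)]  # words preceding the first mention
            if not any(cue in before for cue in NEGATIONS):
                violations.append(f"{term!r} asserted in: {sentence.strip()}")
    return violations
```

For example, `lint("This is NOT intelligence, NOT AGI.")` returns an empty list, while `lint("The agent shows cognition.")` returns one violation.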

Result · 30 seeds × 3 shift magnitudes (incl. a decrease)

| agent | post-shift MAE | role |
|---|---|---|
| oracle (knows the schedule) | 0.793 | irreducible noise floor |
| learned | 0.883 | the claim |
| exp_smoothing / moving_avg / last | 0.94 – 1.14 | naive baselines |
| injected (τ₀ hard-wired) | 8.003 | strawman, must fail |

Win-rate vs injected 1.000 · vs best naive 1.000 · on every shift. Ablation proves the drift mechanism necessary. Four neuroplasticity-like markers (synaptic / homeostatic / neuromodulation / extinction) are measured, never asserted.
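For readers who want the two headline quantities spelled out, here is a minimal sketch of post-shift MAE and paired win-rate. The function names are hypothetical, and the exact definitions used by the preregistered metrics (for instance, whether error is measured against the realised interval, and how ties count) are assumptions here. The preregistration is the authority.

```python
from statistics import mean


def post_shift_mae(predictions: list[float], observations: list[float],
                   t_star: int) -> float:
    """Mean absolute error over steps at or after the rupture step."""
    errs = [abs(p - o) for p, o in zip(predictions[t_star:], observations[t_star:])]
    return mean(errs)


def win_rate(candidate: list[float], baseline: list[float]) -> float:
    """Fraction of paired runs (same seed, same shift) where the candidate MAE is strictly lower."""
    wins = sum(c < b for c, b in zip(candidate, baseline))
    return wins / len(candidate)
```

A win-rate of 1.000 under this definition means the candidate's MAE was lower than the baseline's in every paired run.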

Falsification lineage — every RED kept

| tag | verdict | what it records |
|---|---|---|
| cti-os-v1-RED | 🔴 | drift detector poisoned by cold-start |
| cti-os-v2-GREEN | 🟢 | base proof of life |
| cti-os-v3-RED | 🔴 | two new controls mis-specified |
| cti-os-v4-GREEN | 🟢 | doctoral critique closed → v0.1.0 |
| v5 (PR #1) | 🟢 | minimal causal-action: gain 0.868, action_null 0.000 |
| v6 | 🔴 | precision-weighting (Kalman) RED: principled ≠ better, kept |
| v7 | 🔴 | learned reservoir/SSM ≤ heuristic: NO_HEADROOM (boundary = task) |
| v8 | 🔴 | scalar-inexpressible env: trigger too rare, decorative |
| v8.1 | 🔴 | frequency fixed; inexpressibility real but carrier-masked |
| v8.2 | 🟠 | trigger-scoped+carrier-controlled: signal confirmed & carrier-robust; history-oracle under-specified (PARTIAL_RED) |

git checkout cti-os-v1-RED reproduces the failure. Scientific thresholds are byte-identical v2→v4. No threshold was ever tuned to green; every RED (v1, v3, v6, v7, v8, v8.1) and PARTIAL_RED (v8.2) is a preserved artifact. Full state: docs/reports/LINEAGE_STATE.md.
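A byte-identity claim of this kind is typically checked by hashing. The sketch below shows one way to do it with SHA-256. It is an illustration only: the function names are hypothetical, and the assumption that a pin file holds a hex digest as its first whitespace-separated token may not match the actual layout of prereg/sha_pin.txt.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_pin(target: Path, pin_file: Path) -> bool:
    """True only if the target's digest equals the pinned digest.

    Any read problem (missing file, empty pin) returns False, so the
    check fails closed rather than passing by default.
    """
    try:
        pinned = pin_file.read_text().split()[0]  # assumed: digest is the first token
        return sha256_of(target) == pinned
    except (OSError, IndexError):
        return False
```

Changing a single byte of the pinned file changes the digest, so a tuned threshold cannot pass this check silently.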

Run

pip install -e ".[dev]"
python scripts_prereg.py        # pin the falsifier, then commit it
PYTHONPATH=src pytest tests -q
PYTHONPATH=src python -m ctios.runner --mode full   # fail-closed gate
PYTHONPATH=src python -m ctios.automation            # full chain → runs/ UTC ledger

Exit 0 ⇔ GREEN. Evidence regenerated every run: evidence/release_gate.md, evidence_ledger.jsonl, NEGATIVE_RESULT_*.md, metrics_summary.csv, plots/.
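"Fail-closed" means that anything other than a complete, in-bounds result maps to a non-zero exit. A minimal sketch of that idea follows. The function name, the metric names, and the "value must not exceed bound" rule are assumptions for illustration, not the gate in src/ctios.

```python
import math


def gate(metrics: dict, thresholds: dict) -> int:
    """Return 0 (GREEN) only if every thresholded metric is present, finite, and within bound."""
    if not thresholds:
        return 1                      # nothing to check is not a pass
    try:
        for name, bound in thresholds.items():
            value = float(metrics[name])   # a missing metric raises KeyError
            if not math.isfinite(value) or value > bound:
                return 1
    except (KeyError, TypeError, ValueError):
        return 1                      # malformed evidence is RED, never GREEN
    return 0
```

A runner could pass the result to `sys.exit(gate(metrics, thresholds))`, so a missing file, a NaN, or an empty threshold set all surface as a failing exit code.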

Structure

src/ctios/   env · agents · drift · metrics · gates · ledger · runner · automation
prereg/      preregistration.yaml · falsifier_contract.yaml · sha_pin.txt
configs/     env · agents · metrics · experiment (the 30×3 grid)
evidence/    ledger · negatives · v4 baseline lock · release gate
tests/       CI-verified tests incl. no-leakage, shuffle kill-control, contract
invariants.yaml  machine-readable invariant register (enforced refs)

Formal specification (claim → falsifier → evidence → boundary): docs/SPEC.md. Reproducible demonstration: docs/DEMO.md.

Reviewer map

Citations do not expand scientific claims; they only map boundaries, prior art, and reviewer context.

System identity

TRI-Falsify is a falsification-first temporal inference and hypothesis-auditing apparatus. A pinned, hashed hypothesis runs through an adversarial battery into a sealed, reproducible verdict; on failure it auto-proposes a human-gated next experiment and never runs anything itself. Reviewer entry points:

PR21 adds adversarial portability stress tests: eight deterministic degenerate probes run fail-closed against the battery over a seven-family synthetic portfolio plus a data-sensitivity scan, with a sealed evidence artifact and a CI gate. It improves battery coverage against degenerate probes; it does not assert real-world validity, and an external collaborator run of the private layer remains open.

One command for a reviewer: bash scripts/conference_smoke.sh. One command for the adversarial gate: bash scripts/external_adversarial_demo.sh.

External review package

Start here if you are reviewing the private-safe redacted-hypothesis R&D package: docs/INDI_README.md. One-command check: bash scripts/indi_demo.sh. A redacted private hypothesis runs through the apparatus with no proprietary mechanism, dataset, or theorem content entering this repository.

CTI-OS v7 · GCP readiness (CPU-first, no GPU default)

v7 tests whether a learned sequence model (small GRU / linear state-space) beats the frozen v4 heuristic and a from-scratch conventional baseline on a harder multi-regime, partially-observable environment. The repo is prepared as a deterministic, cost-guarded, reproducible cloud-run artifact — Google is only an execution surface.

# local readiness gates (no cloud, no spend)
make test
make v7-prereg-check
make v7-cpu-smoke
make v7-artifact-check
make gcp-doctor          # actionable PASS/FAIL (install gcloud + auth)
make gcp-dry-run         # non-destructive plan, creates nothing
# only if all green AND operator-reviewed cost guardrails:
PROJECT_ID=cti-os-v7 ZONE=europe-west4-a APPLY=1 make gcp-cpu-run
APPLY=1 PROJECT_ID=cti-os-v7 make gcp-cleanup

Phase 1 is CPU-only; GPU is a documented, manually-unlocked future phase. No command creates cloud resources unless APPLY=1 is explicit. Budgets are alerts, not hard caps — see docs/cloud/gcp_cost_guardrails.md. Pre-registration: docs/prereg/cti_os_v7_preregistration.md.
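The APPLY=1 convention can be pictured as a guard that defaults to planning. The sketch below is illustrative only: `plan_or_apply` is a hypothetical helper, the command is a placeholder, and the actual Makefile targets may implement the guard differently.

```python
import os
from typing import Mapping, Optional


def plan_or_apply(command: list[str],
                  env: Optional[Mapping[str, str]] = None) -> tuple[str, list[str]]:
    """Return ("apply", command) only when APPLY is exactly "1"; otherwise ("dry-run", command).

    The function never executes anything itself. The caller decides what to do
    with an "apply" result, so the default path cannot create resources.
    """
    env = os.environ if env is None else env
    mode = "apply" if env.get("APPLY") == "1" else "dry-run"
    return mode, command
```

Values such as `APPLY=true` or `APPLY=yes` deliberately fall through to dry-run in this sketch, so only the explicit `1` unlocks the apply path.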

Claim boundary: a learned sequence model with a representational advantage on a harder task — NOT intelligence, NOT AGI, NOT cognition.

Build the smallest environment where language cannot fake adaptation; then make the model adapt under hidden temporal rupture, or fail loudly.

About

Falsification-first temporal-inference & hypothesis-auditing apparatus: pinned falsifier → adversarial battery → sealed verdict → human-gated next experiment. Preserved RED lineage, byte-identical frozen invariants, verifier trust boundary. Synthetic benchmarks only — not cognition/AGI.
