GitHub - neuron7xLab/time-rupture-inference: Falsification-first temporal-inference & hypothesis-auditing apparatus: pinned falsifier → adversarial battery → sealed verdict → human-gated next experiment. Preserved RED lineage, byte-identical frozen invariants, verifier trust boundary. Synthetic benchmarks only — not cognition/AGI.

prediction-error temporal adaptation · hidden rupture · fail-closed evidence

error → update → prediction → action → next-state estimate

The one mechanism

A hidden inter-event interval τ₀ ruptures to τ₁ at an unseen step T*. The agent never sees τ₀, τ₁, T*, or the noise — only the realised interval. It must infer the change from its own prediction error and re-adapt. Nothing is handed to the learner.

S(t) → O(t) → B(t) → P(S(t+Δ)) → E(t) → U(t) → S(t+1)
state   obs    belief  prediction  error   update
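A minimal sketch of the kind of environment described above, for orientation only. The function name `rupture_stream`, the Gaussian noise model, and all parameter values are assumptions made for illustration; they are not taken from `src/ctios`.

```python
import random

def rupture_stream(tau0, tau1, t_star, n_steps, noise_sd, seed):
    """Yield realised intervals whose hidden mean jumps from tau0 to tau1 at step t_star.

    Hypothetical helper: the noise model (Gaussian) is an assumption of this sketch.
    """
    rng = random.Random(seed)  # seeded generator, so a replay is deterministic
    for t in range(n_steps):
        tau = tau0 if t < t_star else tau1
        yield tau + rng.gauss(0.0, noise_sd)

# An agent would consume only the yielded values; tau0, tau1, t_star and
# noise_sd stay on the environment side of the boundary.
intervals = list(rupture_stream(tau0=10.0, tau1=14.0, t_star=50,
                                n_steps=100, noise_sd=1.0, seed=0))
```

Re-running with the same seed reproduces the same sequence, which is the property deterministic replay relies on.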

What is established — and what is not

Allowed claim. The learned agent adapts to hidden temporal regime shifts better than fixed and naive baselines under preregistered metrics, deterministic replay, no-leakage constraints, and ablation controls.

Forbidden claim. The agent has intelligence, consciousness, biological neuroplasticity, or understanding of time.

This release is prediction-error temporal adaptation — frozen and scoped. Causal action (do(A) → S(t+Δ)) is the next pre-registered lineage, deliberately not begun here.

What this is / is not (model taxonomy card)

dimension	this system	explicitly NOT
learner	one scalar estimate `m += gain·error` + Page-Hinkley drift trigger	a neural network / backprop / gradients
representation	a single inter-event interval scalar	a representational hierarchy or world model
"adaptation"	online error-driven parameter + gain update	cognition, understanding, or sentience
"neuroplastic-like" markers	4 measured operational quantities, ablation-gated	biological neuroplasticity or brain fidelity
scope	one hidden-regime temporal-rupture benchmark	general intelligence / AGI
claim status	narrow, gate-enforced, falsifiable	an ontological claim about mind
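To make the "learner" row concrete, here is a hedged sketch of a scalar estimator with a textbook two-sided Page-Hinkley trigger on the prediction error. The class name `ScalarLearner`, the gain-boost-then-decay rule, and every default value are assumptions of this sketch; the repository's detector and gain schedule may differ.

```python
class ScalarLearner:
    """One scalar estimate m, updated as m += gain * error, plus a drift trigger."""

    def __init__(self, base_gain=0.1, boosted_gain=0.6, gain_decay=0.5,
                 delta=0.05, threshold=5.0):
        self.m = None                    # current interval estimate
        self.base_gain = base_gain
        self.boosted_gain = boosted_gain
        self.gain_decay = gain_decay     # how fast a boosted gain relaxes to base
        self.gain = base_gain
        self.delta = delta               # Page-Hinkley tolerance
        self.threshold = threshold       # Page-Hinkley alarm level
        self._reset_detector()

    def _reset_detector(self):
        self.n = 0
        self.mean_err = 0.0
        self.cum_up = 0.0
        self.cum_down = 0.0

    def step(self, observed):
        """Consume one realised interval; return (prediction error, trigger fired)."""
        if self.m is None:               # the first observation seeds the estimate
            self.m = observed
            return 0.0, False
        error = observed - self.m
        # Page-Hinkley: accumulate deviations of the error from its running mean.
        self.n += 1
        self.mean_err += (error - self.mean_err) / self.n
        dev = error - self.mean_err
        self.cum_up = max(0.0, self.cum_up + dev - self.delta)
        self.cum_down = max(0.0, self.cum_down - dev - self.delta)
        fired = max(self.cum_up, self.cum_down) > self.threshold
        if fired:                        # drift suspected: learn faster, restart test
            self.gain = self.boosted_gain
            self._reset_detector()
        self.m += self.gain * error
        # Relax the gain back toward its base value (assumed schedule).
        self.gain = self.base_gain + self.gain_decay * (self.gain - self.base_gain)
        return error, fired

stream = [10.0] * 30 + [14.0] * 30       # noise-free toy rupture at step 30
learner = ScalarLearner()
fired_at = [t for t, x in enumerate(stream) if learner.step(x)[1]]
```

On this toy stream the trigger stays silent before the rupture and fires shortly after it, and the estimate then moves toward the new interval. The `max(0, ...)` recursion is the usual equivalent of tracking the cumulative sum minus its running minimum.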

A machine-checkable lexicon (claims.yaml) plus scripts/claims_lint.py is enforced in CI and pytest: external-facing text cannot assert cognition / neural-equivalence / AGI outside an explicit disclaimer block (all such terms are forbidden unless negated).

The risk this kills is interpretation-layer inflation, not a code defect.
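The "forbidden unless negated" rule can be illustrated with a small line-based check. This is a sketch under stated assumptions, not the logic of scripts/claims_lint.py: the term list, the negation cues, and the function name `lint_claims` are all hypothetical, and the real lexicon lives in claims.yaml, whose schema is not shown here.

```python
import re

FORBIDDEN = ("cognition", "agi", "consciousness")   # hypothetical lexicon entries
NEGATORS = ("not", "no", "never", "forbidden")      # hypothetical negation cues

def lint_claims(text, forbidden=FORBIDDEN, negators=NEGATORS):
    """Return (line number, offending terms) for lines that assert a forbidden term."""
    violations = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        words = re.findall(r"[a-z]+", line.lower())
        hits = [w for w in words if w in forbidden]
        negated = any(w in negators for w in words)
        if hits and not negated:         # a forbidden term with no negation on the line
            violations.append((lineno, hits))
    return violations
```

For example, `lint_claims("The agent shows cognition.\nThis is NOT cognition or AGI.")` flags only the first line. A per-line negation test is deliberately crude; a production linter would likely need scoped disclaimer blocks, as the text above describes.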

Result · 30 seeds × 3 shift magnitudes (incl. a decrease)

agent	post-shift MAE	role
oracle (knows the schedule)	0.793	irreducible noise floor
learned	0.883	the claim
exp_smoothing / moving_avg / last	0.94 – 1.14	naive baselines
injected (τ₀ hard-wired)	8.003	strawman, must fail

Win-rate vs injected 1.000 · vs best naive 1.000 · on every shift. Ablation shows the drift mechanism is necessary. Four neuroplasticity-like markers (synaptic / homeostatic / neuromodulation / extinction) are measured, never asserted.
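For readers unfamiliar with the two headline metrics, a hedged sketch of how post-shift MAE and a per-run win-rate could be computed. The function names, the choice to score every step from the rupture onward, and the strict-inequality tie rule are assumptions; the preregistered definitions in the repository take precedence.

```python
def post_shift_mae(predictions, observations, t_star):
    """Mean absolute error over the steps from the rupture index t_star onward."""
    errs = [abs(p - o) for p, o in zip(predictions[t_star:], observations[t_star:])]
    return sum(errs) / len(errs)

def win_rate(candidate_maes, rival_maes):
    """Fraction of paired runs in which the candidate's MAE is strictly lower."""
    wins = sum(c < r for c, r in zip(candidate_maes, rival_maes))
    return wins / len(candidate_maes)
```

With these definitions, a win-rate of 1.000 means the candidate had the lower post-shift MAE in every paired run.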

Falsification lineage — every RED kept

tag	verdict	what it records
`cti-os-v1-RED`	🔴	drift detector poisoned by cold-start
`cti-os-v2-GREEN`	🟢	base proof of life
`cti-os-v3-RED`	🔴	two new controls mis-specified
`cti-os-v4-GREEN`	🟢	doctoral critique closed → `v0.1.0`
v5 (PR #1)	🟢	minimal causal-action: gain 0.868, action_null 0.000
v6	🔴	precision-weighting (Kalman) RED — principled ≠ better, kept
v7	🔴	learned reservoir/SSM ≤ heuristic — NO_HEADROOM (boundary = task)
v8	🔴	scalar-inexpressible env: trigger too rare, decorative
v8.1	🔴	frequency fixed; inexpressibility real but carrier-masked
v8.2	🟠	trigger-scoped+carrier-controlled: signal confirmed & carrier-robust; history-oracle under-specified (PARTIAL_RED)

git checkout cti-os-v1-RED reproduces the failure. Scientific thresholds are byte-identical v2→v4. No threshold was ever tuned to green; every RED (v1, v3, v6, v7, v8, v8.1) and PARTIAL_RED (v8.2) is a preserved artifact. Full state: docs/reports/LINEAGE_STATE.md.
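One common way to check a byte-identity claim like the one above is to compare a SHA-256 digest of the file against a pinned value. The sketch below assumes that approach; the helper names are hypothetical, and how prereg/sha_pin.txt is actually consumed by the repository is not shown in this README.

```python
import hashlib
from pathlib import Path

def sha256_of(path):
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def is_byte_identical(path, pinned_hex):
    """True only if the file hashes to the pinned digest (hypothetical helper)."""
    return sha256_of(path) == pinned_hex.strip().lower()
```

Any change to the file, including whitespace or line endings, changes the digest, so a threshold cannot be edited without the check failing.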

Run

pip install -e ".[dev]"
python scripts_prereg.py        # pin the falsifier, then commit it
PYTHONPATH=src pytest tests -q
PYTHONPATH=src python -m ctios.runner --mode full   # fail-closed gate
PYTHONPATH=src python -m ctios.automation            # full chain → runs/ UTC ledger

Exit 0 ⇔ GREEN. Evidence regenerated every run: evidence/release_gate.md, evidence_ledger.jsonl, NEGATIVE_RESULT_*.md, metrics_summary.csv, plots/.
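The "Exit 0 ⇔ GREEN" rule is a fail-closed pattern: anything other than an explicit pass on every check yields a non-zero exit. A minimal sketch, assuming the gate results arrive as a name-to-result mapping; `gate_exit_code` is a hypothetical name and not the interface of `ctios.runner`.

```python
def gate_exit_code(checks):
    """Return 0 only if every named check is literally True; otherwise 1.

    Missing results, None, errors recorded as strings, or an empty mapping
    all fail closed (hypothetical helper).
    """
    if not checks:
        return 1
    return 0 if all(result is True for result in checks.values()) else 1

# Typical use at the end of a runner script:
#     sys.exit(gate_exit_code(results))
```

Using `is True` rather than truthiness means a check that returned a non-empty string or a non-zero number by accident cannot pass the gate.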

Structure

src/ctios/   env · agents · drift · metrics · gates · ledger · runner · automation
prereg/      preregistration.yaml · falsifier_contract.yaml · sha_pin.txt
configs/     env · agents · metrics · experiment (the 30×3 grid)
evidence/    ledger · negatives · v4 baseline lock · release gate
tests/       CI-verified tests incl. no-leakage, shuffle kill-control, contract
invariants.yaml  machine-readable invariant register (enforced refs)

Formal specification (claim → falsifier → evidence → boundary): docs/SPEC.md. Reproducible demonstration: docs/DEMO.md.

Reviewer map

Trust architecture: docs/TRUST_LAYER.md
Reproduction contract: docs/REPRODUCIBILITY_CONTRACT.md
Claim/source matrix: docs/CLAIM_SOURCE_MATRIX.md
References (claim-mapped): docs/REFERENCES.md
Prior-art boundary map: docs/PRIOR_ART_BOUNDARY_MAP.md
Review paths: docs/REVIEW_PATH.md
Value positioning: docs/VALUE_POSITIONING.md
Open structural gaps: docs/OPEN_STRUCTURAL_GAPS.md

Citations do not expand scientific claims; they only map boundaries, prior art, and reviewer context.

System identity

TRI-Falsify is a falsification-first temporal inference and hypothesis-auditing apparatus. A pinned, hashed hypothesis runs through an adversarial battery into a sealed, reproducible verdict; on failure it auto-proposes a human-gated next experiment and never runs anything itself. Reviewer entry points:

docs/REVIEWER_ONE_PAGER.md — 60-second system identity.
docs/SYSTEM_CARD.md — abstraction, inputs/outputs, boundaries.
docs/CONFERENCE_ABSTRACT.md — workshop abstract.
docs/CONTRIBUTION_CLAIMS.md — original vs prior art.
docs/REPRODUCIBILITY_CONTRACT.md — clean-clone contract + frozen outputs.
docs/FAILURE_TAXONOMY.md — defenses + residual risk.
docs/ARCHITECTURE.md — pipeline + module map.

PR21 adds adversarial portability stress tests: eight deterministic degenerate probes run fail-closed against the battery over a seven-family synthetic portfolio plus a data-sensitivity scan, with a sealed evidence artifact and a CI gate. It improves battery coverage against degenerate probes; it does not assert real-world validity, and an external collaborator run of the private layer remains open.

One command for a reviewer: bash scripts/conference_smoke.sh. One command for the adversarial gate: bash scripts/external_adversarial_demo.sh.

External review package

Start here if you are reviewing the private-safe redacted-hypothesis R&D package: docs/INDI_README.md. One-command check: bash scripts/indi_demo.sh. A redacted private hypothesis runs through the apparatus with no proprietary mechanism, dataset, or theorem content entering this repository.

docs/INDI_README.md — read this first.
docs/INDI_EXECUTIVE_SUMMARY.md — 3-minute summary.
docs/PRIVATE_RND_PROTOCOL.md — redacted hypothesis interface + IP boundary table.
docs/INDI_REVIEWER_CHECKLIST.md — fastest useful path, then graded tiers.

CTI-OS v7 · GCP readiness (CPU-first, no GPU default)

v7 tests whether a learned sequence model (small GRU / linear state-space) beats the frozen v4 heuristic and a from-scratch conventional baseline on a harder multi-regime, partially-observable environment. The repo is prepared as a deterministic, cost-guarded, reproducible cloud-run artifact — Google is only an execution surface.

# local readiness gates (no cloud, no spend)
make test
make v7-prereg-check
make v7-cpu-smoke
make v7-artifact-check
make gcp-doctor          # actionable PASS/FAIL (install gcloud + auth)
make gcp-dry-run         # non-destructive plan, creates nothing
# only if all green AND operator-reviewed cost guardrails:
PROJECT_ID=cti-os-v7 ZONE=europe-west4-a APPLY=1 make gcp-cpu-run
APPLY=1 PROJECT_ID=cti-os-v7 make gcp-cleanup

Phase 1 is CPU-only; GPU is a documented, manually-unlocked future phase. No command creates cloud resources unless APPLY=1 is explicit. Budgets are alerts, not hard caps — see docs/cloud/gcp_cost_guardrails.md. Pre-registration: docs/prereg/cti_os_v7_preregistration.md.
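The APPLY=1 convention is an explicit opt-in guard. The sketch below shows the idea in Python; the function names are hypothetical, and the Makefile's actual implementation is not shown in this README.

```python
import os

def should_apply(env=None):
    """True only when APPLY is set to exactly "1" (hypothetical helper)."""
    env = os.environ if env is None else env
    return env.get("APPLY") == "1"

def plan_or_run(action, env=None):
    """Describe the action; mark it as a dry run unless APPLY=1 is explicit."""
    if should_apply(env):
        return f"RUN: {action}"
    return f"DRY-RUN: {action} (set APPLY=1 to execute)"
```

Comparing against the exact string "1" keeps values such as "true" or "yes" from unlocking a resource-creating path by accident.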

Claim boundary: a learned sequence model with a representational advantage on a harder task — NOT intelligence, NOT AGI, NOT cognition.

Build the smallest environment where language cannot fake adaptation; then make the model adapt under hidden temporal rupture, or fail loudly.

