14 changes: 9 additions & 5 deletions README.md
@@ -1,17 +1,19 @@
# Glimmer

> **A research-object knowledge base for AI-native scientific workflows.**
> **An AI-native solution to the reproducibility problem.**
>
> The 2010s gave us reproducible pipelines. Glimmer is the next layer up — the typed-entity graph that makes the agentic feedback loop traversable over those pipelines.
> The 2010s gave us reproducible pipelines. Glimmer is the next layer up — a typed-entity graph over those pipelines whose runs are **executable, standard-gated, and self-verifying**, so "this result reproduces" is a contract the machine checks, not a footnote you trust.

[![Status: v0.3](https://img.shields.io/badge/status-v0.3-blue.svg)](https://github.com/hebbianloop/glimmer)
[![Status: v0.6](https://img.shields.io/badge/status-v0.6-blue.svg)](https://github.com/hebbianloop/glimmer)
[![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Template](https://img.shields.io/badge/repo-template-purple.svg)](https://github.com/hebbianloop/glimmer/generate)

## What this is

Existing standards (BIDS, DataLad, NIDM, Nipype) give your project syntactic structure. Glimmer adds the **graph layer**: datasets, methods, derivatives, findings, standards, and publications become first-class typed nodes with versioned edges, distributed across per-entity sidecars. An AI agent traverses the graph to render verifiable decisions with auditable reasoning traces.

**Proactive provenance (v0.6).** The graph is not just descriptive — it *runs*. A [`run-record`](glimmer/schema/schema.md#run-record) is one concrete, replayable invocation (the PROV `Activity`); the `glimmer run` / `glimmer rerun` node runner gates its inputs against their standards, replays the recorded command in a pinned container, and verifies outputs at one of three tiers — **byte-identical**, **numeric-within-tolerance** (re-derive a published number from source), or **structural** (for agent/LLM outputs). This is the executable unit of the [agentic loop](docs/agentic-loop.md). Start at [`docs/proactive-provenance.md`](docs/proactive-provenance.md) and the [`synthetic-provenance`](examples/synthetic-provenance/) example.
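
The three verification tiers can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the code in `glimmer/tools/run.py`: the function names, the default tolerance, and the key-presence check used as a stand-in for the structural tier are all hypothetical.

```python
import hashlib
import math


def sha256_hex(data: bytes) -> str:
    """Hex digest used as the recorded fingerprint of an output (assumed hash choice)."""
    return hashlib.sha256(data).hexdigest()


def verify_byte_identical(recorded_sha256: str, output: bytes) -> bool:
    """Tier 1: the replayed output must hash to exactly the recorded digest."""
    return sha256_hex(output) == recorded_sha256


def verify_numeric(recorded: float, observed: float, rel_tol: float = 1e-6) -> bool:
    """Tier 2: a re-derived number must match the recorded one within a tolerance."""
    return math.isclose(recorded, observed, rel_tol=rel_tol)


def verify_structural(output: dict, required_keys: set) -> bool:
    """Tier 3 (hypothetical stand-in): the output must carry the required fields."""
    return required_keys.issubset(output)
```

The tiers are ordered from strictest to loosest, so a runner along these lines would pick the strictest tier a given output can honestly satisfy.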

Glimmer is domain-agnostic. The canonical worked example in this repo is neuroimaging because that's where standards like BIDS and tools like DataLad and Nipype are most developed — but the architectural pattern (typed-entity graph over a versioned-data substrate) applies to any compute-intensive scientific domain backed by a mature standards ecosystem.

Glimmer is both an architectural pattern and a reference implementation. The full case for it is in the CAISC 2026 paper (see [`docs/paper-citation.md`](docs/paper-citation.md)).
@@ -42,15 +44,17 @@ The line between "core" and "project" is the `glimmer/` directory. Anything insi
```
glimmer/
├── schema/
│ ├── schema.md # v0.3 spec — 10 entity types, edge taxonomy, sidecar format
│ ├── schema.md # v0.6 spec — 13 entity types (incl. run-record), edge taxonomy, sidecar format
│ ├── frontmatter.yaml # machine-readable contract for validators
│ └── glimmer-version # current core version (0.3.1)
│ └── glimmer-version # current core version (0.6.0)
└── tools/
├── validate.py # schema validator (enforces agent-protocol verifiability)
├── run.py # node runner — `glimmer run` / `glimmer rerun` (gate → replay → verify)
├── cli.py # `glimmer` CLI single entry point
└── figure_schema.py # render the schema diagram

examples/
├── synthetic-provenance/ # v0.6 proactive-provenance demo: loop + 3 tiers + gate + equivalence
└── ds000114-nipype/ # canonical worked example from the CAISC 2026 paper
├── install.sh # `datalad install ///openneuro/ds000114` + selective `datalad get`
├── workflow.py # Nipype anatomical preprocessing (BET → FAST)
2 changes: 2 additions & 0 deletions docs/agent-protocol.md
@@ -92,6 +92,8 @@ The structure makes audit possible; the audit itself remains a human responsibil

For deterministic outputs (a Nipype workflow), verification is exact: re-running from the cited SHAs must produce a byte-identical output. For LLM-inferred outputs, verification is structural: the trace must cite real nodes, the cited nodes must contain the values the trace claims, and the interpretation must be a plausible reading of those values. Neither test guarantees correctness; together they guarantee auditability.

As of v0.6 these regimes are no longer prose-only — they are the three tiers the **node runner** (`glimmer run` / `glimmer rerun`, see [`proactive-provenance.md`](proactive-provenance.md)) actually checks: **byte-identical** (deterministic, with header normalization), **numeric-within-tolerance** (stochastic), and **structural** (agent-inferred — this section's contract, executed). A `run-record` with `produced-by-agent` set carries the same mandatory `reasoning-trace`, and the runner's structural tier verifies exactly the three conditions above.
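
A minimal sketch of the two mechanically checkable conditions (cited nodes exist, cited nodes hold the claimed values) might look like the following. The trace layout (`cites`, `claims`) and the in-memory `graph` dictionary are assumptions made for illustration, not the schema's actual field names; the third condition, plausibility of the interpretation, is not attempted here.

```python
def check_trace(trace: list, graph: dict) -> list:
    """Return a list of problems; an empty list means both mechanical checks pass.

    trace: list of steps, each {"cites": node_id, "claims": {field: value}} (hypothetical layout)
    graph: mapping of node_id -> node fields (hypothetical in-memory view of the sidecars)
    """
    problems = []
    for step in trace:
        node_id = step["cites"]
        node = graph.get(node_id)
        if node is None:
            # Condition 1: the trace must cite a node that really exists.
            problems.append(f"{node_id}: cited node does not exist")
            continue
        for field, claimed in step.get("claims", {}).items():
            # Condition 2: the cited node must contain the value the trace claims.
            if field not in node:
                problems.append(f"{node_id}.{field}: field missing")
            elif node[field] != claimed:
                problems.append(
                    f"{node_id}.{field}: claimed {claimed!r}, found {node[field]!r}"
                )
    return problems
```

Returning the full list of problems rather than a bare boolean keeps the result auditable: a reviewer can see which citation failed and why.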

## What this protocol does not solve

- **The agent's reasoning may still be wrong**, even when grounded in real evidence. Glimmer's audit makes errors traceable; it does not eliminate them.
35 changes: 35 additions & 0 deletions docs/agentic-loop.md
@@ -57,6 +57,40 @@
└─────────────────────────────┘
```

## Making the loop executable: `run-record`s (v0.6)

Through v0.5 this loop was a *pattern* — the "launch agent runs" box had no runtime
primitive; an analysis "run" left only a `derivative` behind, with no replayable record of
the act that produced it. v0.6 makes the box executable with the [`run-record`](../glimmer/schema/schema.md#run-record)
node type and the `glimmer run` / `glimmer rerun` runner (see [`proactive-provenance.md`](proactive-provenance.md)).

The loop now closes *in the graph*:

```
concept ──decompose──▶ hypotheses
│ │ each hypothesis gets one or more PLANNED run-records
│ ▼ (tests-hypothesis → concept; inputs may be spec'd)
│ run-record [planned] ──gate──▶ [ready] ──glimmer run──▶ [executed]
│ │ the runner: validates inputs against their standards,
│ │ replays the command, hashes/verifies outputs
│ ▼
│ regenerates → derivative + emits → finding
│ │ addresses-concept
└──────────────────── feedback ◀───────────────────────┘
the next planning pass reads verdicts + findings, not memory
```

- **Plan** = a `concept` decomposed into hypotheses, each with `planned` run-records.
- **Run** = `glimmer run` gates, executes, and records — advancing `planned → executed`
and writing a verdict. The **analysis agent** role below now *authors and runs*
run-records rather than emitting bare derivatives.
- **Feedback** = the `replay-verdict` + emitted `finding` (which `addresses-concept`) are
what the next iteration reads. Re-running later (`glimmer rerun`) re-verifies the chain.

This is why the loop is *self-sustaining*: every iteration leaves behind a replayable,
standard-gated, self-verifying record, so the agent can't silently lose a finding or
repeat a settled mistake. The verdict is in the graph.
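
The `planned → ready → executed` progression can be sketched as a small state machine. This is an illustrative sketch only: the `RunRecord` class, the `gate` and `run` helpers, and the `"pass"` / `"fail"` verdict values are hypothetical names, not the runner's real interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunRecord:
    command: str
    status: str = "planned"        # planned -> ready -> executed
    verdict: Optional[str] = None  # written only once the run has executed


def gate(record: RunRecord, inputs_ok: bool) -> None:
    """Advance planned -> ready only if the inputs passed their standards."""
    if record.status != "planned":
        raise ValueError(f"cannot gate a record in state {record.status!r}")
    if inputs_ok:
        record.status = "ready"


def run(record: RunRecord, execute: Callable[[str], bool]) -> None:
    """Execute a gated record and write the verdict back onto it."""
    if record.status != "ready":
        raise ValueError("record must pass the gate before it can run")
    passed = execute(record.command)  # True if the outputs verified
    record.status = "executed"
    record.verdict = "pass" if passed else "fail"
```

Because the verdict lives on the record itself, a later planning pass can read it directly instead of relying on the agent's memory of what happened.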

## The four agent roles

A Glimmer project typically uses four distinct agent roles, each operating with a different protocol mode and access scope.
@@ -158,6 +192,7 @@ Shipped in v0.3 (this loop now runs against the released schema):
- `experiment` node type — for task/acquisition paradigms (Experiment Factory containers, jsPsych/PsychoPy tasks).
- Cross-cutting edges: `addresses-concept` (finding/publication → concept), `tests-hypothesis` (experiment → concept), `extends-concept` / `subsumed-by` / `competes-with` / `superseded-by` (concept → concept), and the universal `contributed-by` attribution edge.
- `persona` and `organization` node types + the in-graph attribution edges `authored-by`, `affiliated-with`, `funded-by`, `mentors`, `leads`, `part-of` (v0.3.1). A literature scout or synthesis agent can now resolve "who worked on this concept" by walking `authored-by` / `leads` to persona nodes rather than parsing free-text author strings.
- `run-record` node type + the `glimmer run` / `glimmer rerun` runner (v0.6) — the executable unit of this loop: a `planned → ready → running → executed` invocation tied to its hypothesis via `tests-hypothesis`, producing `derivative`s (`regenerates`) and a `finding` (`emits`) with a recorded verdict. See [`proactive-provenance.md`](proactive-provenance.md).

Still on the roadmap:

2 changes: 2 additions & 0 deletions docs/datalad-pattern.md
@@ -91,6 +91,8 @@ These fields are optional in v0.1.1 and recommended in v0.1.2. A Glimmer instanc

Every step is verifiable. Every output cites its inputs by content-hash. The Glimmer graph isn't a fragile parallel database — it's a thin reasoning layer over a DataLad superdataset that is itself the source of truth.

The "verification agent re-runs methods, compares output-hashes" step above is concrete as of v0.6: it is the `glimmer rerun` node runner (`glimmer/tools/run.py`, see [`proactive-provenance.md`](proactive-provenance.md)). The runner replays a `run-record`'s command pinned to its `container-digest` via `datalad containers-run`, materializing inputs with `datalad get` and matching their `datalad-annex-key` / `datalad-commit-sha` before execution — the DataLad coordinates on each node are exactly what makes re-fetch-and-replay possible.
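
The coordinate-matching step before execution can be illustrated with a short sketch. It assumes each node's recorded coordinates and the freshly observed ones are available as plain dictionaries; the helper name and the placeholder values in the usage example are hypothetical, and no DataLad calls are made here.

```python
def coordinate_mismatches(recorded: dict, observed: dict, keys: tuple) -> dict:
    """Return {key: (recorded, observed)} for every coordinate that differs.

    An empty result means the materialized inputs match what the record pinned,
    so replay can proceed.
    """
    return {
        key: (recorded.get(key), observed.get(key))
        for key in keys
        if recorded.get(key) != observed.get(key)
    }


# Usage with placeholder values (not real keys or SHAs):
input_keys = ("datalad-annex-key", "datalad-commit-sha")
recorded = {"datalad-annex-key": "KEY-A", "datalad-commit-sha": "abc123"}
observed = {"datalad-annex-key": "KEY-A", "datalad-commit-sha": "def456"}
print(coordinate_mismatches(recorded, observed, input_keys))
```

Any non-empty result would be a reason to refuse the replay, since the inputs on disk are no longer the ones the `run-record` cites.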

## How this relates to the format-agnostic position

Earlier docs argued for a "two-tier" Glimmer/BIDS sidecar strategy. The deeper point is simpler: **format doesn't matter if the agent can translate between formats.** What matters is that the data has the structure the schema requires.
6 changes: 5 additions & 1 deletion docs/paper-citation.md
@@ -19,7 +19,11 @@ And the code:
title={Glimmer: a Research-Object Knowledge Base for AI-Native Scientific Workflows},
author={El Damaty, Shady},
url={https://github.com/hebbianloop/glimmer},
version={0.1.0},
version={0.6.0},
year={2026}
}
```

The working manuscript that extends the architecture to **proactive provenance**
(executable run-records + the node runner) is drafted at
[`docs/paper-draft.md`](paper-draft.md).