Storage & archival as a first-class part of the Glimmer model (v0.3.1) + multi-tenant self-provisioning of bares #6

Description

@hebbianloop

Problem

Glimmer v0.3 is a post-hoc documentation ledger: scripts run, then emit nodes (if glimmer:). Nodes record what was produced plus a content hash (derivative.output-path/output-hash/provenance-mode) but delegate where it lives and how many copies exist to datalad/git-annex, so storage state is invisible in the research object (RO). A multi-day, compute-expensive SST once sat only in /tmp, discoverable only by asking. An output that can't be located, or that exists as a single copy, is not reproducible, so durability belongs inside the RO model.

Proposed (v0.3.1, additive)

  1. Typed storage fields on derivative + dataset nodes: stored-at [{remote,uuid}], copies, numcopies-required, verified, storage-class (irreplaceable|generated).
  2. Archive-on-emit: GlimmerGraph.derivative()/.dataset() run git annex add, git annex copy --to each durable remote, and record the whereis result. Creating the node = durably storing it. Refuse paths under /tmp or outside the annex.
  3. Enforcement via config+hooks (not scripts): annex.numcopies in .gitattributes (3 irreplaceable / 2 generated); pre-push hook + glimmer validate CI; teardown gates.
  4. First-class reporting: generic glimmer report over the graph + live git annex whereis — storage never hidden.
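
A minimal sketch of how items 1 and 2 could fit together, under stated assumptions. The names `Remote`, `StorageRecord`, `REQUIRED_COPIES`, and `check_emit_path` are hypothetical and not part of Glimmer. The copy thresholds come from item 3. The path check is purely lexical (no symlink or `..` resolution) and does not call git-annex.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import List

# Thresholds from item 3: 3 copies for irreplaceable data, 2 for generated.
REQUIRED_COPIES = {"irreplaceable": 3, "generated": 2}


@dataclass(frozen=True)
class Remote:
    remote: str  # remote name (hypothetical field layout for stored-at)
    uuid: str    # git-annex repository UUID


@dataclass
class StorageRecord:
    storage_class: str                       # "irreplaceable" or "generated"
    stored_at: List[Remote] = field(default_factory=list)
    verified: bool = False

    def __post_init__(self) -> None:
        if self.storage_class not in REQUIRED_COPIES:
            raise ValueError(f"unknown storage-class: {self.storage_class!r}")

    @property
    def copies(self) -> int:
        # Count distinct repositories by UUID, so a remote listed twice
        # is not double-counted.
        return len({r.uuid for r in self.stored_at})

    @property
    def numcopies_required(self) -> int:
        return REQUIRED_COPIES[self.storage_class]

    def is_durable(self) -> bool:
        return self.copies >= self.numcopies_required


def _is_under(path: PurePosixPath, root: PurePosixPath) -> bool:
    # Lexical prefix test on path components.
    return path.parts[: len(root.parts)] == root.parts


def check_emit_path(output_path: str, annex_root: str) -> None:
    """Raise ValueError for paths under /tmp or outside the annex root."""
    path = PurePosixPath(output_path)
    if _is_under(path, PurePosixPath("/tmp")):
        raise ValueError(f"refusing ephemeral path: {output_path}")
    if not _is_under(path, PurePosixPath(annex_root)):
        raise ValueError(f"path is outside the annex: {output_path}")
```

In this sketch a node emitter would call `check_emit_path` before archiving, then fill in `stored_at` from the recorded whereis result and gate teardown on `is_durable()`.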

Commit-tracking bares

Durable copy := a --bare git+annex tracking full commit history. Object-only stores (sftp/rsync boxes that can't run git) are adjuncts, not commit-tracking copies. Tiers: GitHub (git+pointers) + cheap object store (adjunct) + >=2 commit-tracking bares.
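
The tier rule above can be sketched as a small classification check. This is an illustration only: `Kind`, `RemoteInfo`, and `missing_tiers` are hypothetical names, and how Glimmer would actually detect a remote's kind is not specified in this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Kind(Enum):
    GITHUB = "github"        # git history plus annex pointers
    OBJECT_STORE = "object"  # content only, cannot run git (adjunct)
    BARE = "bare"            # --bare git+annex with full commit history


@dataclass(frozen=True)
class RemoteInfo:
    name: str
    kind: Kind


MIN_BARES = 2  # ">=2 commit-tracking bares" from the tier description


def missing_tiers(remotes: List[RemoteInfo]) -> List[str]:
    """Return a human-readable list of unmet tier requirements."""
    problems = []
    if not any(r.kind is Kind.GITHUB for r in remotes):
        problems.append("no GitHub remote")
    if not any(r.kind is Kind.OBJECT_STORE for r in remotes):
        problems.append("no object-store adjunct")
    # Only bares count as durable copies; adjuncts and GitHub do not.
    bares = sum(1 for r in remotes if r.kind is Kind.BARE)
    if bares < MIN_BARES:
        problems.append(f"only {bares} commit-tracking bare(s), need {MIN_BARES}")
    return problems
```

An empty result means every tier is present; a validation step could fail whenever the list is non-empty.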

Multi-tenant self-provisioning

Glimmer should let users provision their own tenants + bares at dataset init. ads-glimmer is the first consumer and will migrate onto this once it lands.

Reference

Full design: ads-glimmer/docs/data/INFORMATION-ARCHITECTURE.md
