Everything in Unix is a file, except for the filesystem itself. Until now.
C4 is a file management system designed to simplify distributed systems. It decouples file storage and transport from organization and metadata by providing a standardized content-derived identifier, and using these identifiers in place of file content in a simple, readable, text-based filesystem format.
    $ echo "hello" | c4
    c447Fm3BJZQ62765jMZJH4m28hrDM7Szbj9CUmj4F4gnvyDYXYz4WfnK2nYRhFvRgYEectEXYBYWLDpLo6XGNAfKdt

That 90-character string is a SMPTE ST 2114:2017 identifier, an international standard built on SHA-512. Run it on any machine, in any language, ten years from now. Same input, same ID. Unlike a UUID, which is assigned arbitrarily and requires coordination to be meaningful, a C4 ID is discovered from the data itself. If two people have a file with the same ID, they have identical copies of the same file.
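To make "discovered from the data itself" concrete, here is a minimal sketch of how such an identifier could be derived using only the Go standard library. This README states only that an ID is SHA-512, base58, 90 characters, and prefixed with c4. The Bitcoin-style alphabet, the left-padding with `1`, and the name `sketchID` are assumptions made for illustration, so the output is not guaranteed to match the real tool. Use the c4 library for real IDs.

```go
package main

import (
	"crypto/sha512"
	"fmt"
	"math/big"
	"strings"
)

// Assumption: the Bitcoin-style base58 alphabet (no 0, O, I, or l).
const alphabet = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

// sketchID (hypothetical name) hashes data with SHA-512, writes the digest
// as a base58 number, left-pads it to 88 digits, and prepends "c4".
func sketchID(data []byte) string {
	sum := sha512.Sum512(data)
	n := new(big.Int).SetBytes(sum[:])
	base, rem := big.NewInt(58), new(big.Int)

	// Peel off base58 digits, least significant first.
	digits := make([]byte, 0, 88)
	for n.Sign() > 0 {
		n.DivMod(n, base, rem)
		digits = append(digits, alphabet[rem.Int64()])
	}
	// Reverse so the most significant digit comes first.
	for i, j := 0, len(digits)-1; i < j; i, j = i+1, j-1 {
		digits[i], digits[j] = digits[j], digits[i]
	}
	// A 512-bit number needs at most 88 base58 digits; pad shorter ones.
	return "c4" + strings.Repeat("1", 88-len(digits)) + string(digits)
}

func main() {
	a := sketchID([]byte("hello"))
	b := sketchID([]byte("hello\n"))
	fmt.Println(len(a), a[:2], a == b) // prints: 90 c4 false
}
```

The ID depends on every byte of the input, so even a trailing newline produces a different identifier.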
C4 extends this idea to entire filesystems. c4 id produces a c4m
file — a plain-text description that captures the identity of every
file and directory in a tree:
    $ c4 id ./project/
    -rw-r--r-- 2026-03-18T22:22:57Z 13 README.md c44iCq6un9W47x7ydjJSWp4arMJ...
    drwxr-xr-x 2026-03-18T22:22:57Z 66 src/ c44nbgL6nkBWsEBDCUCr4LufsjVhJt...
    -rw-r--r-- 2026-03-18T22:22:57Z 66 main.go c43Q4j81SxGkV9FhbeW23YrMTj6...

A c4m file is just text. Pipe it, grep it, diff it, email it. Each
line looks like ls -l with a content ID at the end. The format is
designed to compose with awk, sort, grep, and the rest of the
Unix toolkit. No special tools are required, though C4 provides some
that are useful.
    brew install mrjoshuak/tap/c4

Pre-built archives for macOS, Linux, and Windows: c4toolkit releases

    go install github.com/Avalanche-io/c4/cmd/c4@latest
    go install github.com/Avalanche-io/c4sh@latest

    pip install c4py                 # Python
    npm install @avalanche-io/c4     # TypeScript / JavaScript

See c4toolkit for the full suite and version matrix.
Know what changed. Snapshot a directory, come back later, compare:

    c4 id ./deliverables/ > monday.c4m
    # ... work happens ...
    c4 diff monday.c4m ./deliverables/

Build history. Append diffs to a c4m file to create a version chain:

    c4 id ./project/ > project.c4m                   # snapshot
    c4 diff project.c4m ./project/ >> project.c4m    # append changes
    c4 log project.c4m                               # see what changed
    c4 patch -n 1 project.c4m                        # recover the original state

Make a directory match a description. Declarative: you say what the result should be, and C4 figures out how to get there:

    c4 patch target.c4m ./dir/

Creates, moves, and removes files as needed. Nothing starts until all required content is confirmed available. Safe to re-run after interruption.
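The declarative idea can be illustrated with a naive sketch that compares a current state with a target state and lists the operations needed. It is not how c4 patch is implemented. The `Manifest` type, the `plan` function, the operation strings, and the placeholder IDs are all hypothetical, and the sketch does not order operations safely or check that content is available.

```go
package main

import (
	"fmt"
	"sort"
)

// Manifest (hypothetical) maps a path to its content ID.
type Manifest map[string]string

// sortedKeys returns the paths of m in lexical order for stable output.
func sortedKeys(m Manifest) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// plan lists the operations that would turn current into target.
func plan(current, target Manifest) []string {
	// Index current paths by content ID so content can be found elsewhere.
	byID := map[string][]string{}
	for _, p := range sortedKeys(current) {
		byID[current[p]] = append(byID[current[p]], p)
	}

	var ops []string
	moved := map[string]bool{}
	for _, p := range sortedKeys(target) {
		id := target[p]
		if current[p] == id {
			continue // already in place
		}
		// Prefer moving a file that the target no longer wants at its old path.
		src := ""
		for _, c := range byID[id] {
			if _, keep := target[c]; !keep && !moved[c] {
				src = c
				break
			}
		}
		if src != "" {
			moved[src] = true
			ops = append(ops, "move "+src+" -> "+p)
		} else {
			// Content must be materialized by ID (from a store or another copy).
			ops = append(ops, "create "+p+" from "+id)
		}
	}
	// Anything left over that the target does not mention is removed.
	for _, p := range sortedKeys(current) {
		if _, keep := target[p]; !keep && !moved[p] {
			ops = append(ops, "remove "+p)
		}
	}
	return ops
}

func main() {
	current := Manifest{"a.txt": "c4AAA", "old/b.txt": "c4BBB", "tmp.log": "c4CCC"}
	target := Manifest{"a.txt": "c4AAA", "new/b.txt": "c4BBB", "c.txt": "c4DDD"}
	for _, op := range plan(current, target) {
		fmt.Println(op)
	}
	// prints:
	// create c.txt from c4DDD
	// move old/b.txt -> new/b.txt
	// remove tmp.log
}
```

Planning a state against itself yields no operations, which is the property that makes re-running after an interruption safe in this model.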
Store and retrieve content by ID:

    c4 id -s ./final/ > delivery.c4m    # identify + store
    c4 cat c44iCq6un9W47...             # retrieve by ID

The store can be a local directory, an S3-compatible object store, or both. Content goes in once, comes out by ID.
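"Content goes in once, comes out by ID" is the defining property of a content-addressed store. The sketch below shows that property with an in-memory map. It is an illustration only: the `Store` type and its methods are hypothetical, and a hex-encoded SHA-512 digest stands in for a real C4 ID.

```go
package main

import (
	"crypto/sha512"
	"encoding/hex"
	"fmt"
)

// Store (hypothetical) keeps each distinct blob once, keyed by its digest.
type Store struct {
	blobs map[string][]byte
}

func NewStore() *Store { return &Store{blobs: map[string][]byte{}} }

// Put stores data if it is not already present and returns its key.
// The key here is a hex SHA-512 digest, a stand-in for a C4 ID.
func (s *Store) Put(data []byte) string {
	sum := sha512.Sum512(data)
	key := hex.EncodeToString(sum[:])
	if _, ok := s.blobs[key]; !ok {
		s.blobs[key] = append([]byte(nil), data...) // keep a private copy
	}
	return key
}

// Get returns the content stored under key, if any.
func (s *Store) Get(key string) ([]byte, bool) {
	data, ok := s.blobs[key]
	return data, ok
}

func main() {
	s := NewStore()
	k1 := s.Put([]byte("final render"))
	k2 := s.Put([]byte("final render")) // same content, same key, no second copy
	data, _ := s.Get(k1)
	fmt.Println(k1 == k2, len(s.blobs), string(data)) // prints: true 1 final render
}
```

Because the key is derived from the content, storing the same bytes twice is naturally deduplicated, and a lookup needs nothing but the ID.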
Undo what you just did:

    c4 patch -s new_state.c4m ./dir/ > changeset.c4m    # apply + preserve
    c4 patch -r changeset.c4m ./dir/                    # revert

Work with Unix tools. One entry per line, fields in predictable positions:

    awk '{print $NF}' project.c4m | sort | uniq -d    # find duplicates
    grep '\.exr ' project.c4m | wc -l                 # count EXR files

Merge two trees. Combine branches, deliveries, or any two c4m files into one:

    c4 merge branch-a.c4m branch-b.c4m > merged.c4m

Conflicts (same path, different content) are reported. Non-overlapping entries are combined.
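The conflict rule above (same path, different content) can be sketched over two path-to-ID maps. This is an illustration, not the c4 merge implementation. The `merge` function and the placeholder IDs are hypothetical, and leaving conflicting paths out of the result is an assumption, since this README says only that conflicts are reported.

```go
package main

import (
	"fmt"
	"sort"
)

// merge (hypothetical) combines two path -> ID maps. A path present in both
// with different IDs is reported as a conflict and, by assumption, left out
// of the merged result for the user to resolve.
func merge(a, b map[string]string) (map[string]string, []string) {
	merged := make(map[string]string, len(a)+len(b))
	for p, id := range a {
		merged[p] = id
	}
	var conflicts []string
	for p, id := range b {
		if prev, ok := a[p]; ok && prev != id {
			conflicts = append(conflicts, p)
			delete(merged, p)
			continue
		}
		merged[p] = id // new path, or same path with identical content
	}
	sort.Strings(conflicts) // stable report order
	return merged, conflicts
}

func main() {
	a := map[string]string{"shot01.exr": "c4AAA", "notes.txt": "c4BBB"}
	b := map[string]string{"shot01.exr": "c4AAA", "notes.txt": "c4CCC", "shot02.exr": "c4DDD"}
	merged, conflicts := merge(a, b)
	fmt.Println(len(merged), conflicts) // prints: 2 [notes.txt]
}
```

Comparing IDs rather than bytes is what keeps this cheap: two entries agree exactly when their identifiers are equal.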
Scan fast, hash later. Structure-only scans skip content hashing. Edit the manifest, then hash only what survived:

    c4 id -m s ./project/ > scan.c4m    # names only
    vi scan.c4m                         # remove what you don't want
    c4 id -c scan.c4m ./project/        # hash the rest

| Command | What it does |
|---|---|
| c4 id | Identify files, directories, or c4m files |
| c4 cat | Retrieve/display content by C4 ID or c4m file path |
| c4 diff | Compare two states (c4m files or directories) |
| c4 patch | Apply a target state: reconcile directories, resolve chains |
| c4 merge | Combine two or more trees |
| c4 log | Show patch history |
| c4 split | Split a patch chain |
| c4 explain | Human-readable narration of what a command would do |
| c4 paths | Convert between c4m format and plain path lists |
| c4 intersect | Find common entries between two c4m files |

Every command that takes a c4m file also takes a directory, and vice
versa. c4 <path> is a shortcut for c4 id -s (identify and store).
echo "data" | c4 produces a bare ID from stdin and stores it.
A c4m file is a complete filesystem description in plain text. Each line has permissions, timestamp, size, name, and C4 ID. Readable, editable, diffable, composable.
A 10,000-entry c4m file is about 1.4 MB. It describes the identity of every file in the tree regardless of how large those files are. The description is the lightweight handle; the content is the heavy thing it refers to.
    import "github.com/Avalanche-io/c4"

    id := c4.Identify(strings.NewReader("hello"))
    fmt.Println(id)
    // c447Fm3BJZQ62765jMZJH4m28hrDM7Szbj9CUmj4F4gnvyDYXYz4WfnK2nYRhFvRgYEectEXYBYWLDpLo6XGNAfKdt

C4 IDs are 90-character base58 strings: SHA-512 with a c4 prefix.
URL-safe, filename-safe, double-click selectable.

Zero external dependencies. Go 1.16+.

- C4 Framework Universal Asset ID (video)
- C4 ID Whitepaper
- CLI Reference
- Getting Started
- FAQ — design decisions (SHA-512 permanence, c4m format, store scaling)
C4 is part of a cross-language ecosystem. Every tool reads and writes the same c4m format and produces identical C4 IDs:
| Tool | Language | What it does |
|---|---|---|
| c4 | Go | CLI — identify, diff, patch, merge |
| c4sh | Go | Shell — cd into c4m files, browse, copy |
| c4py | Python | Library — scan, diff, verify, store |
| c4git | Go | Git filter — version large files |
| c4ts | TypeScript | Browser + Node.js — zero dependencies |
| c4-swift | Swift | Apple platforms — Sendable, Codable |
| libc4 | C | Embed in any application |
See c4toolkit for the full suite, install options, and version matrix.
License: Apache 2.0