Add composefs-ostree and some basic CLI tools by alexlarsson · Pull Request #144 · composefs/composefs-rs

alexlarsson · 2025-06-16T16:42:07Z

Based on ideas from #141

This is an initial version of ostree support. This allows pulling
from local and remote ostree repos, which will create a set of
regular file content objects, as well as a blob containing all the
remaining ostree objects. From the blob we can create an image.

When pulling a commit, a base blob (i.e. "the previous version") can be
specified. Any objects in that base blob will not be downloaded. If a
name is given for the pulled commit, then pre-existing blobs with the
same name will automatically be used as a base blob.
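
The base-blob rule above amounts to a set difference between the objects a commit needs and the objects the base already provides. A minimal sketch, assuming object ids are 32-byte binary SHA-256 digests; `ObjectId` and `objects_to_fetch` are hypothetical names used for illustration, not the API in this PR:

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an ostree object id (binary SHA-256).
type ObjectId = [u8; 32];

/// Returns the ids that the commit needs but the base blob lacks,
/// preserving the order in which the commit lists them.
fn objects_to_fetch(needed: &[ObjectId], base: &HashSet<ObjectId>) -> Vec<ObjectId> {
    needed
        .iter()
        .filter(|id| !base.contains(*id))
        .copied()
        .collect()
}
```

Only the returned ids would have to be downloaded; everything else is already reachable through the base blob.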

This is an initial version and there are several things missing:

* Pull operations are completely serial
* There is no support for ostree summary files
* There is no support for ostree delta files
* There is no caching of local file availability (other than base blob)
* Local ostree repos only support archive mode

allisonkarlitskaya

I love this! Thanks for working on it!

I made some comments on the first round of commits. Feel free to adjust those and PR them separately: we can merge those now without further discussion.

The blobs thing is going to need a call.

I didn't review the crate addition in any detail at all. That's probably also going to need a call :)

alexlarsson · 2025-06-19T07:41:11Z

Hmmm, thinking more about this. We probably want a "content type" magic thing in the splitstream header as well, so we can error out if the wrapped thing is of the wrong type.

alexlarsson · 2025-06-26T12:07:47Z

Ok. Reworked this to use splitstreams for object maps and commits. And, by using an object mapping to find the object map we make the content of the splitstream for the commit be just the commit data, and thus the sha256 of that splitstream matches the ostree commit id.

alexlarsson · 2025-06-26T12:08:17Z

@allisonkarlitskaya There is still lots to do here. But have a look at this approach and see what you think.

alexlarsson · 2025-06-27T16:19:58Z

Added some further changes. We now validate all objects when pulling and all non-file objects when creating images. It's hard to efficiently validate file objects during create-image, though: we would like to avoid re-reading the external object files to compute the sha256.

Remaining things to do:

* Stream larger objects into repo
* Support summaries and summary branches for remote repos
* Support deltas when remote pulling
* Parallelize downloads of objects
* Report pull progress in some sane way
* Use some kind of local cache for available objects other than just those from "previous version"
* Handle GPG validation of commit objects

alexlarsson · 2025-06-30T14:38:26Z

I started working on the delta support, but it failed because of an issue in gvariant-rs.

allisonkarlitskaya

It occurs to me that it might be interesting not to sort the table of fs-verity references, and it might also be interesting to permit duplicate items.

On the topic of deferring writing of objects to a background thread, this would allow us to write "external object #123" based on a sequential index to the splitstream without actually knowing the hash value yet, and then fill in the actual values in the header at the end when we're writing: it helps there that the fs-verity references aren't compressed and therefore not part of the stream...
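
The deferred-reference idea above could look roughly like the following. This is a sketch under stated assumptions: `Digest` and `PendingRefs` are hypothetical names, and the table is kept in reservation order (unsorted, duplicates allowed), as the comment suggests.

```rust
/// Hypothetical stand-in for an fs-verity digest.
type Digest = [u8; 32];

/// Reference table whose slots are allocated before their digests are known.
#[derive(Default)]
struct PendingRefs {
    slots: Vec<Option<Digest>>,
}

impl PendingRefs {
    /// Reserve the next sequential index; the stream can record it immediately.
    fn reserve(&mut self) -> usize {
        self.slots.push(None);
        self.slots.len() - 1
    }

    /// Called once a background writer has computed the digest for `index`.
    fn resolve(&mut self, index: usize, digest: Digest) {
        self.slots[index] = Some(digest);
    }

    /// Produce the final table for the header, in reservation order.
    /// Returns the index of the first unresolved slot on failure.
    fn finish(self) -> Result<Vec<Digest>, usize> {
        self.slots
            .into_iter()
            .enumerate()
            .map(|(i, slot)| slot.ok_or(i))
            .collect()
    }
}
```

Because indices are handed out sequentially, the stream body never needs rewriting; only the uncompressed header table is filled in at the end.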

cgwalters · 2025-09-05T12:58:46Z

It seems like we should get in the splitstream changes in 0f6d69e at least sooner rather than later? Can you file a separate PR?

This changes the splitstream format a bit, with the goal of allowing splitstreams to support ostree files as well (see composefs#144). The primary differences are:

* The header is not compressed
* All referenced fs-verity objects are stored in the header, including external chunks, mapped splitstreams and (a new feature) references that are not used in chunks.
* The mapping table is separate from the reference table (and generally smaller), and indexes into it.
* There is a magic value to detect the file format.
* There is a magic content type to detect the type wrapped in the stream.
* We store a tag for what ObjectID format is used
* The total size of the stream is stored in the header.

The ability to reference file objects in the repo even if they are not part of the splitstream "content" will be useful for the ostree support to reference file content objects.

This change also allows more efficient GC enumeration, because we don't have to parse the entire splitstream to find the referenced objects.

Signed-off-by: Alexander Larsson <alexl@redhat.com>
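
To make the listed header contents concrete, here is an illustrative model. The commit message says what is stored, not how it is laid out, so every field name, width, and ordering below is an assumption, and `HeaderSketch` and `HeaderError` are hypothetical types.

```rust
/// Illustrative only: names, widths, and ordering are assumptions.
#[derive(Debug)]
struct HeaderSketch {
    magic: u64,          // detects the file format
    content_type: u64,   // detects the type wrapped in the stream
    object_id_tag: u8,   // which ObjectID format is used
    total_size: u64,     // total size of the stream
    refs: Vec<[u8; 32]>, // all referenced fs-verity objects
    mappings: Vec<u32>,  // mapping table entries, each an index into `refs`
}

#[derive(Debug, PartialEq)]
enum HeaderError {
    BadMagic,
    WrongContentType,
    BadMappingIndex(u32),
}

impl HeaderSketch {
    /// Reject a stream of the wrong format or wrapped type, and make sure
    /// every mapping entry points at an existing reference.
    fn check(&self, expected_magic: u64, expected_content_type: u64) -> Result<(), HeaderError> {
        if self.magic != expected_magic {
            return Err(HeaderError::BadMagic);
        }
        if self.content_type != expected_content_type {
            return Err(HeaderError::WrongContentType);
        }
        for &index in &self.mappings {
            if index as usize >= self.refs.len() {
                return Err(HeaderError::BadMappingIndex(index));
            }
        }
        Ok(())
    }

    /// GC enumeration only needs the reference table, not the stream body.
    fn gc_references(&self) -> &[[u8; 32]] {
        &self.refs
    }
}
```

The point of the sketch is the split between the two tables: garbage collection reads `refs` alone, while the mapping table stays small because it stores indexes rather than digests.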

alexlarsson · 2026-04-20T16:46:27Z

I rebased this and fixed some comments. Still some work to do though.

alexlarsson · 2026-06-16T15:27:56Z

Ok, I updated this to the latest version and added streaming creation of repo files and parallelized fetching. Plus some other cleanups.

alexlarsson · 2026-06-17T13:23:45Z

Ok, I spent some time on this. It's now much more like the "cfsctl oci" commands and behavior, and it does parallel fetches. I also added various integration tests. I think this is pretty complete for what it does (i.e. imports ostree commits into composefs and lets you mount them).

There are some TODOs for summary and delta support, but those are not necessarily super important for the basic functionality.

cgwalters · 2026-06-17T15:29:21Z

+                ref ostree_ref,
+                base_name,
+            } => {
+                eprintln!("Fetching {ostree_ref}");


Don't log via eprintln!; we have the progress API now.

Also on that topic...I think we should expose a varlink API for this now, right?

I guess neither of these need to strictly block merging though.

🤔 I guess actually...if we go down this varlink path, perhaps in theory we could have both the oci and ostree fetchers be extension binaries i.e. something like /usr/libexec/composefs/ext/oci is automatically cfsctl oci? That could be interesting...and would actually force us to have a good "core" varlink api.

cgwalters · 2026-06-18T08:41:32Z

@allisonkarlitskaya You have a "changes requested" here which blocks merges

Signed-off-by: Alexander Larsson <alexl@redhat.com>

This lets you look up a ref digest from the splitstream by index and is needed by the ostree code. Signed-off-by: Alexander Larsson <alexl@redhat.com>

This is basically ensure_object_from_fd(), but for anything implementing Read. ensure_object_from_fd() itself is reimplemented on top of this. We will need this in the ostree support code for streaming a zlib compressed file to the repo. Signed-off-by: Alexander Larsson <alexl@redhat.com>
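
The relationship described in this commit message, a `Read`-based helper with the fd variant layered on top, can be sketched as follows. `store_from_read` and `store_from_file` are hypothetical names; the real function would also compute the fs-verity digest and place the object in the repository, which is omitted here.

```rust
use std::fs::File;
use std::io::{self, Read, Write};

/// Hypothetical analogue of the Read-based helper: copy everything from
/// `reader` into `sink` in fixed-size chunks and return the byte count.
fn store_from_read<R: Read, W: Write>(mut reader: R, sink: &mut W) -> io::Result<u64> {
    let mut buf = [0u8; 64 * 1024];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            Ok(0) => break, // end of input
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink.write_all(&buf[..n])?;
        total += n as u64;
    }
    Ok(total)
}

/// The file-based variant becomes a thin wrapper, since `File` implements `Read`.
fn store_from_file<W: Write>(file: File, sink: &mut W) -> io::Result<u64> {
    store_from_read(file, sink)
}
```

A zlib decoder that implements `Read` could be passed to the generic function in the same way, so the decompressed data never has to be held in memory as a whole.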

Based on ideas from composefs#141.

This is an initial version of ostree support. This allows pulling from local and remote ostree repos, which will create a set of regular file content objects, as well as a commit splitstream containing all the remaining ostree objects and file data. From the splitstream we can create an image.

When pulling a commit, base commits (i.e. "the previous version") can be specified manually and/or added automatically based on the parent commit or the previous commit for the pulled ref. Any objects in that base commit will not be downloaded.

Commits are splitstreams named ostree-commit-xxxx, and refs that point to these are refs/ostree/$ref.

erofs images are automatically created for pulled commits, and they can be mounted with "cfsctl ostree mount". There are also some other subcommands that are similar to those of oci:

* dump
* compute-id
* inspect
* tag
* untag
* images

Signed-off-by: Alexander Larsson <alexl@redhat.com>
Assisted-by: Claude Code (Opus 4.6)

Unblocking

cgwalters

Overall: Nothing blocking, from my PoV we can get this in as a clearly experimental separate crate and iterate from there.

A lot of interesting stuff here, but I think we need to write up an even more concrete plan for how consumers (flatpak?) would use this - gets into the varlink (or a -capi crate?) discussion.

cgwalters · 2026-06-24T14:40:27Z

+OSTree file objects are stored in the archive-z2 format, except not
+compressed, and optionally the file content part of it may be stored


The -z2 is obsolete terminology; it's just archive.

Also: the core library calls the "metadata+data" an "object stream". It's really pretty similar to a tar header + content.

cgwalters · 2026-06-24T14:40:39Z

+
+OSTree file objects are stored in the archive-z2 format, except not
+compressed, and optionally the file content part of it may be stored
+as referencing the index of an external object. The z2 format is,


So not "z2 format" just "object format".

Like: when we do ostree fsck we reconstruct in memory that format and that's what we compare vs the digest - there's no compression involved.

cgwalters · 2026-06-24T14:42:24Z

+content data and optionally a reference to an external object.
+
+The exact form of the data looks like this, packed in order from the
+start of the splitstream content. All ints are in little endian.


How about "numbers" or "integers". "ints" is a bit too informal...

cgwalters · 2026-06-24T15:54:10Z

+limit the search.  We can also compute the total number of objects
+(n_objects) by looking in the last bucket.
+
+### Object ids


Should re-mention the sorting here
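
The sorting matters because the bucket table only narrows a binary search over the id table. A sketch of that lookup, assuming buckets are keyed by the first byte of the object id and that `bucket_ends[b]` holds the number of ids whose first byte is at most `b`; `lookup` and `n_objects` are hypothetical helper names:

```rust
/// Binary SHA-256 object id.
type ObjectId = [u8; 32];

/// Find `id` in a sorted id table, using the bucket table to limit the search
/// to the ids that share its first byte.
fn lookup(ids: &[ObjectId], bucket_ends: &[u32; 256], id: &ObjectId) -> Option<usize> {
    let bucket = id[0] as usize;
    let start = if bucket == 0 {
        0
    } else {
        bucket_ends[bucket - 1] as usize
    };
    let end = bucket_ends[bucket] as usize;
    // Binary search is only valid because `ids` is sorted.
    ids.get(start..end)?
        .binary_search(id)
        .ok()
        .map(|pos| start + pos)
}

/// The last bucket's end is the total number of objects.
fn n_objects(bucket_ends: &[u32; 256]) -> usize {
    bucket_ends[255] as usize
}
```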

cgwalters · 2026-06-24T15:54:23Z

+```
+ n_objects x
+-----------------------------------+
+|  [u8; 32] ostree object id        |


(binary sha256) would help clarify

cgwalters · 2026-06-24T15:57:48Z

+```
+ n_objects x
+-----------------------------------+
+| u32: Offset to per-object data    |


I'd hope no one ever wants to ship more than 4G of metadata...but perhaps worth calling out. Or...perhaps instead of raw "offset" since we have the alignment requirement, we could make it "block" offset with an 8-byte block. We'd then go up to 32G of metadata. But dunno.
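
The arithmetic behind the 32G figure: a u32 can count 2^32 blocks, and 2^32 blocks of 8 bytes cover 2^35 bytes, i.e. 32 GiB. A small sketch of such an encoding, with hypothetical helper names and the 8-byte block size taken from the suggestion above:

```rust
use std::convert::TryFrom;

/// Assumed block size: offsets are counted in 8-byte units.
const BLOCK_SIZE: u64 = 8;

/// Number of bytes addressable by a u32 count of blocks (2^32 * 8 = 32 GiB).
const MAX_ADDRESSABLE: u64 = (u32::MAX as u64 + 1) * BLOCK_SIZE;

/// Encode a byte offset as a block offset; fails if the offset is unaligned
/// or lies beyond the addressable range.
fn to_block_offset(byte_offset: u64) -> Option<u32> {
    if byte_offset % BLOCK_SIZE != 0 {
        return None;
    }
    u32::try_from(byte_offset / BLOCK_SIZE).ok()
}

/// Decode a block offset back into a byte offset.
fn to_byte_offset(block_offset: u32) -> u64 {
    u64::from(block_offset) * BLOCK_SIZE
}
```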

cgwalters · 2026-06-24T16:05:07Z

+        }
+        for i in 1..256 {
+            // Then we sum them up to the end
+            header.bucket_ends[i] += header.bucket_ends[i - 1];


Same comment here re windows()
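
The quoted loop is an in-place prefix sum that turns per-bucket counts into cumulative end offsets. One index-free way to write it keeps a running total; this is a sketch of an alternative, not necessarily the form the review had in mind, and `counts_to_ends` is a hypothetical name:

```rust
/// Turn per-bucket counts into cumulative end offsets, in place.
fn counts_to_ends(buckets: &mut [u32; 256]) {
    let mut running = 0u32;
    for slot in buckets.iter_mut() {
        // Each slot becomes the sum of its own count and all counts before it.
        running += *slot;
        *slot = running;
    }
}
```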

cgwalters · 2026-06-24T16:06:55Z

+}
+
+/// Abstraction over local and remote ostree repository access.
+pub(crate) trait OstreeRepo<ObjectID: FsVerityHashValue>: Send + Sync {


One operation I think we want here is fsck too at least

cgwalters · 2026-06-24T17:05:37Z

A lot of interesting stuff here, but I think we need to write up an even more concrete plan for how consumers (flatpak?) would use this - gets into the varlink (or a -capi crate?) discussion.

Or to say it a different way - I think we do need a tight focus on the "core" things to get to the current milestones. I didn't add this to a milestone, but if we did I think it would be e.g. 1.1 or so? I am happy to debate it of course, we could say that this is part of 1.0 too.

What I keep thinking about again here is making it really clear and obvious how to use composefs to store arbitrary content without writing a custom crate and glue. One idea here is: have a (varlink) API that accepts metadata + tarball and stores it as an OCI artifact, but where we do the splitstream stuff and support mounting it too. Think e.g. storing a dpkg/rpm in composefs.

alexlarsson · 2026-06-25T08:37:51Z

@cgwalters I agree, we should have a way that lets you just store unstructured data from a tar or a directory, with no guarantees of "exact bitwise reproducibility" of the original data. That seems very useful; however, it's not quite the same as the oci/ostree case where we want to roundtrip both ways losslessly.

Anyway, lots of good comments here, but I'm gonna merge this for now and let's consider this unstable. I'll do updates to it in separate PRs, because this one is getting unwieldy.

alexlarsson force-pushed the ostree-support branch 2 times, most recently from e0e827f to 9c5b086 Compare June 17, 2025 06:54

allisonkarlitskaya previously requested changes Jun 17, 2025

alexlarsson mentioned this pull request Jun 17, 2025

Various repository fixes #146

Merged

alexlarsson force-pushed the ostree-support branch from 9c5b086 to cd067c5 Compare June 18, 2025 14:17

alexlarsson force-pushed the ostree-support branch 2 times, most recently from 2ed83a2 to c041afe Compare June 19, 2025 09:11

alexlarsson force-pushed the ostree-support branch from c041afe to dd0bf65 Compare June 26, 2025 12:06

alexlarsson force-pushed the ostree-support branch from dd0bf65 to d6a5b39 Compare June 27, 2025 16:02

alexlarsson force-pushed the ostree-support branch 4 times, most recently from 481e604 to e88573d Compare June 30, 2025 14:26

allisonkarlitskaya reviewed Jul 4, 2025

alexlarsson mentioned this pull request Sep 29, 2025

Preparatory splitstream format changes for ostree support #185

Merged

alexlarsson force-pushed the ostree-support branch 2 times, most recently from c788da2 to 2ee193a Compare October 6, 2025 14:58

alexlarsson force-pushed the ostree-support branch from 2ee193a to da310b0 Compare November 26, 2025 17:38

alexlarsson force-pushed the ostree-support branch from da310b0 to 8b32f51 Compare January 19, 2026 09:42

alexlarsson force-pushed the ostree-support branch 2 times, most recently from 5fba232 to 1228e9b Compare June 2, 2026 15:49

alexlarsson force-pushed the ostree-support branch from 1228e9b to 8d8c6b2 Compare June 16, 2026 15:26

alexlarsson force-pushed the ostree-support branch from 8d8c6b2 to 5837fb4 Compare June 17, 2026 13:19

alexlarsson force-pushed the ostree-support branch 7 times, most recently from 188640d to efba46d Compare June 17, 2026 16:20

cgwalters reviewed Jun 17, 2026

alexlarsson force-pushed the ostree-support branch from efba46d to e966ce5 Compare June 18, 2026 11:04

allisonkarlitskaya self-requested a review June 23, 2026 11:14

alexlarsson added 4 commits June 23, 2026 13:16

Expose ErrnoFilter for other crates

703b734

Signed-off-by: Alexander Larsson <alexl@redhat.com>

SplitStreamReader: Add lookup_external_ref()

9fb2d16

This lets you look up a ref digest from the splitstream by index and is needed by the ostree code. Signed-off-by: Alexander Larsson <alexl@redhat.com>

alexlarsson force-pushed the ostree-support branch from e966ce5 to 6047295 Compare June 23, 2026 11:18

cgwalters approved these changes Jun 24, 2026

alexlarsson added this pull request to the merge queue Jun 25, 2026

Merged via the queue into composefs:main with commit 0b6424e Jun 25, 2026
17 checks passed
