Skip to content

Fix in-container upgrade failures by patching start.sh tar flags#20

Closed
decriptor wants to merge 2 commits into
mainfrom
shaw/fix-tar-command
Closed

Fix in-container upgrade failures by patching start.sh tar flags#20
decriptor wants to merge 2 commits into
mainfrom
shaw/fix-tar-command

Conversation

@decriptor
Copy link
Copy Markdown
Collaborator

Summary

Users on hosts with restricted-chown semantics (userns-remap Docker daemons, NFS bind mounts with root_squash, TrueNAS / Synology bind mounts with strict ACLs) report that in-container Roon Server upgrades fail silently, leaving them stuck on whatever version was installed at first start. Initial install works fine.

The reporter on a recent support ticket nailed the cause: entrypoint.sh extracts with tar xjf ... --no-same-permissions --no-same-owner, but RoonServer's bundled start.sh (which handles in-container upgrades) uses bare tar xf "$PKG". The build tarball stores files as vsts_azpcontainer:vsts_azpcontainer (uid 1001, the Azure Pipelines build agent). When start.sh runs as root in a container whose host filesystem rejects chown 1001:1001, tar exits non-zero, start.sh's || doerror reports "ERROR: failed to tar xf $PKG", and the upgrade aborts.

Mechanism (verified)

Host setup Initial install (entrypoint.sh) Upgrade (start.sh)
Local ext4/xfs, no remapping ✓ — --no-same-owner makes tar use container's effective uid ✓ — root can chown to anything
userns-remap Docker daemon ✓ — --no-same-owner sidesteps the chown ✗ — bare tar xf tries chown 1001 outside the userns mapping → EPERM → exit non-zero
NFS root_squash ✗ — same chown rejection
TrueNAS / Synology strict ACL ✗ — same

Confirmed by extracting the upstream tarball in a roonserver:1.0.6 container and inspecting tar tjvf --numeric-owner output (uid 1001) and start.sh:21 (bare tar xf).

What this PR does

entrypoint.sh now patches RoonServer/start.sh after the install block to inject --no-same-owner --no-same-permissions into its tar invocation. The patch is:

  • Idempotent — guarded by ! grep -q -- "--no-same-owner", so it skips when already applied (own past run, or upstream's eventual fix).
  • Safe under upstream evolution — if start.sh's pattern changes, sed simply doesn't substitute and the patch silently does nothing (the existing || doerror would surface a new failure if any). The grep guard covers both the "already patched" case and "upstream now has the flag".
  • Self-healing across upgrades — every successful in-container upgrade replaces start.sh with a fresh upstream copy. Next container restart re-runs the entrypoint and re-patches.

The proper fix is upstream — start.sh should match entrypoint.sh's flags. This PR is the in-image workaround until that lands.

Test plan

  • tests/smoke.sh extended (56 → 60 checks):
    • patches the upstream-shape tar xf "$PKG" correctly
    • inserts exactly one set of flags (no double-application)
    • emits the "Patched RoonServer/start.sh" log line
    • skips silently when the flag is already present
  • All 60 smoke + 37 runtime tests green locally on roonserver:test build

…lity

In-container Roon Server upgrades are silently failing for users with
restricted-chown host setups (userns-remap, NFS root_squash, TrueNAS /
Synology bind mounts with strict ACLs). Initial install works because
entrypoint.sh extracts with `tar xjf ... --no-same-permissions
--no-same-owner`, but RoonServer's bundled start.sh handles in-container
upgrades with bare `tar xf "$PKG"`. The build tarballs ship with files
owned by uid 1001 (the Azure Pipelines build agent), so when start.sh
runs as root and tries to chown extracted files to 1001 on a host that
rejects foreign-uid chown, tar exits non-zero and start.sh's
`|| doerror` reports "ERROR: failed to tar xf $PKG". The user is left
on the old version with no obvious diagnostic.

The proper fix is upstream — start.sh should use the same flags
entrypoint.sh uses. While that's pending, patch the on-disk start.sh
from our entrypoint to inject the missing flags. Idempotent: skipped
when the flag is already present, so it survives upstream's eventual
fix without conflict, and re-runs after every successful upgrade
(which restores start.sh from the new build tarball).

Tests: pre-seed an existing install with an upstream-shape start.sh,
run the entrypoint, assert the patched line is present and that the
flag count is exactly 1 (not double-applied). Second test pre-seeds
with an already-patched start.sh and asserts the log message is
absent (idempotency on second pass).

Smoke suite: 56 → 60 checks.
Copilot AI review requested due to automatic review settings April 28, 2026 20:49
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses in-container Roon Server upgrade failures on filesystems/hosts where chown is restricted by ensuring the upgrade extraction in RoonServer/start.sh uses tar flags that avoid preserving owner/permissions from the build artifact.

Changes:

  • Patch the on-disk RoonServer/start.sh at container startup to add --no-same-owner --no-same-permissions to its tar xf "$PKG" invocation.
  • Add smoke tests that seed an “upstream-shaped” start.sh, verify the patch is applied exactly once, and verify patching is skipped when already present.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
entrypoint.sh Adds an idempotent sed-based patch step that injects tar flags into RoonServer/start.sh to make upgrades resilient on restricted-chown hosts.
tests/smoke.sh Adds smoke checks that validate patching behavior, idempotency, and logging.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/smoke.sh Outdated
Comment thread entrypoint.sh Outdated
Two fixes from PR review on #20:

- entrypoint.sh: patch guard now requires BOTH --no-same-owner AND
  --no-same-permissions before considering start.sh already patched.
  A partial upstream fix (one flag added, the other missing) would
  otherwise leave the file half-fixed with our patch silently skipped.
  Post-sed verification checks both flags before logging success.

- smoke.sh: assert exit==0 on the second-run idempotency check. The
  previous version let \`|| true\` mask container startup failures,
  which would have made the negative "no log" assertion pass for the
  wrong reason. Now: capture exit code, assert it's 0, then assert
  the log is absent — so the negative assertion only triggers when
  the container actually ran cleanly.
@yo3fxy
Copy link
Copy Markdown

yo3fxy commented Apr 28, 2026

i'm not 100% sure this will be enough; during my troubleshooting i noticed that patching Roon 2.64 's RoonServer/start.sh script was not enough; with patched start.sh , triggering an update from one of my Remotes still failed; from what i can tell, when triggered from a Roon Remote, Roon is not using /Roon/app/RoonServer/start.sh to upgrade itself, but a version extracted from the tar.bz2 archive ; that's why i ended up upgrading Roon manually (with command: docker exec 553609b4306f /Roon/app/RoonServer/start.sh --update /tmp/RoonServer_linuxx64.tar.bz2 /Roon/app/RoonServer)

[admin@myTVS-682 app]# docker exec 553609b4306f ls -la /tmp
total 111084
drwxrwxrwt 1 root root 4096 Apr 28 11:42 .
drwxr-xr-x 1 root root 4096 Apr 28 11:09 ..
-rw-r--r-- 1 root root 0 Apr 28 11:09 .rnsems0-
-rw-r--r-- 1 root root 0 Apr 28 11:09 .rnsgem0-
-rw-r--r-- 1 root root 113735316 Apr 28 11:42 4a0efd1e-cea0-4cb6-bcc6-fa67f053e5a9__RoonServer_linuxx64_206501653.tar.bz2
drwxr-xr-x 3 root root 4096 Apr 28 11:42 602ea097-378f-400a-9fc8-710c009fd0f9
prwx------ 1 root root 0 Apr 28 11:09 clr-debug-pipe-25-163403866-in
prwx------ 1 root root 0 Apr 28 11:09 clr-debug-pipe-25-163403866-out
[ . . . ]
srw------- 1 root root 0 Apr 28 11:24 dotnet-diagnostic-96-163496605-socket
[admin@myTVS-682 app]# docker exec 553609b4306f ls -la /tmp/602ea097-378f-400a-9fc8-710c009fd0f9
total 12
drwxr-xr-x 3 root root 4096 Apr 28 11:42 .
drwxrwxrwt 1 root root 4096 Apr 28 11:42 ..
drwxr-xr-x 2 root root 4096 Apr 28 11:42 RoonServer
[admin@myTVS-682 app]# docker exec 553609b4306f ls -la /tmp/602ea097-378f-400a-9fc8-710c009fd0f9/RoonServer
total 12
drwxr-xr-x 2 root root 4096 Apr 28 11:42 .
drwxr-xr-x 3 root root 4096 Apr 28 11:42 ..
-rwxr-xr-x 1 1001 1001 1311 Apr 16 21:57 start.sh
[admin@myTVS-682 app]#

@gtunes-dev
Copy link
Copy Markdown
Collaborator

I'm wondering if it's possible to fix how the tarball is built in the Azure pipeline. Something like:

tar --owner=0 --group=0 --numeric-owner -cjf RoonServer_linuxx64.tar.bz2 RoonServer/

Right now, the build is leaking a bad userid into the tarball. This fixes that and I think it might work with no changes to either entrypoint.sh or start.sh. You should be able to drop the --no-same-owner flag from entrypoint.sh. There's a good chance you never needed the --no-same-permissions flag. If you actually do, it either needs to be retained in entrypoint + added to start.sh or the tree needs to be cleaned before it gets tarballed.

I'm just suggesting this as an approach because it seems a lot cleaner to try to address this at the source rather than patch files with sed or whatever after they're installed.

@decriptor
Copy link
Copy Markdown
Collaborator Author

@gtunes-dev Agreed! I didn't mean to actually push this branch/PR yet. I was toying around with this to see if I could create a quick "patch" for the current image.

I'm definitely planning on fixing the build. :)

@gtunes-dev
Copy link
Copy Markdown
Collaborator

@gtunes-dev Agreed! I didn't mean to actually push this branch/PR yet. I was toying around with this to see if I could create a quick "patch" for the current image.

It's going to be tricky for some of your users to install an updated image. The build fix hopefully will just work.

But...I'm not 100% sure what the change I proposed will actually do in all situations and on all linux platforms. I think it would work. I'm also thinking about what happens in the future if you ever want to support running as non-root and I think it will work in those situations, too. Like I said, though, not 100% on any of it.

BTW - running as non root I think works like this in the container-based approach:

  • entrypoint.sh starts off as root (always)
  • it figures out if it needs to install stuff (first install or branch change) and, if so, does that as root
  • whether or not it installed anything, it chown's the app and database dirs as the user-specified user (or root) always. it does this defensively. in my tests, this is very quick (sub second). It does not chown the /Roon directory or any other user-mapped directories - just the stuff it created. This seems right
  • then uses gosu to run start.sh as either root or user-specified user

I think the change I'm suggesting to the tarball is ok with this approach for self updates done by start.sh. Because self updates will happen as either root or user-specified user, the tar (with no flags) of the clean tarball should result in files either all as root or as the user-specified user. Should just work. I hope :)

Just saying this stuff for the record so that the clean tarball approach is seen as forward facing :)

@yo3fxy
Copy link
Copy Markdown

yo3fxy commented Apr 29, 2026

until we fix the issue at source, as gtunes-dev suggested, i found another possible solution on QNAP's forum - EFAULT (Bad address) error during file extraction in Container Station 3:

Since files need to be uncompressed to the docker volume space , you can add :z to the end of volumes.

@decriptor
Copy link
Copy Markdown
Collaborator Author

ok, I've created a fix in our code. I'm going to close this out because we should have a hotfix out soon with a couple other changes.

@decriptor decriptor closed this Apr 30, 2026
@decriptor decriptor deleted the shaw/fix-tar-command branch April 30, 2026 02:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants