Extend cvmfs info#3

Open
DrDaveD wants to merge 17 commits into reduce-caps from extend-cvmfs-info

@DrDaveD DrDaveD commented Nov 12, 2025

Just to show the difference between reduce-caps and extend-cvmfs-info.

@DrDaveD DrDaveD force-pushed the reduce-caps branch 2 times, most recently from 2f7b61a to ffa3eb8 Compare December 22, 2025 16:38
@DrDaveD DrDaveD force-pushed the extend-cvmfs-info branch 8 times, most recently from 8f0d70a to d4a8132 Compare December 23, 2025 19:30
@DrDaveD DrDaveD force-pushed the reduce-caps branch 5 times, most recently from 94620f7 to 7d8933c Compare January 14, 2026 19:20
This is an initial PR setting up for a PR fixing cvmfs#3666. It reduces the
capabilities in all the client processes: the main fuse process, the
cache manager process, and the watchdog/monitor processes of each. For
now this leaves CAP_DAC_READ_SEARCH capability available to the main
fuse process, but in the followup PR that will be made available only
when needed.

To avoid the fuse process needing CAP_SYS_ADMIN, this relies on the
watchdog to do the unmount also when the fuse process exits on its own
for some reason, for example after a kill signal. In the normal case
it's the automounter that does the unmount.

In summary, the permitted capabilities of the 4 processes after this PR
are (when started as root):

1. fuse process - CAP_DAC_READ_SEARCH
2. fuse watchdog - CAP_SYS_ADMIN and CAP_DAC_READ_SEARCH.
CAP_DAC_READ_SEARCH is inheritable & ambient so only it is passed to gdb
when it does a stack trace. That capability is only needed because the
fuse process has it and ptrace requires the attaching process to have at
least the capabilities of the attached process.
3. cache manager - none
4. cache manager watchdog - none
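
One way to check a running client against this list on Linux is to read the `CapPrm` mask from `/proc/<pid>/status` and test individual bits. The sketch below is only an illustration and not part of this PR: the helper names `has_cap` and `permitted_caps` are made up here, while the bit numbers 2 and 21 are the values of CAP_DAC_READ_SEARCH and CAP_SYS_ADMIN in `linux/capability.h`.

```shell
# Capability bit numbers from <linux/capability.h>.
CAP_DAC_READ_SEARCH=2
CAP_SYS_ADMIN=21

# has_cap HEXMASK BIT (hypothetical helper): succeeds if bit BIT is set
# in HEXMASK, a hex string without the 0x prefix as printed by the kernel.
has_cap() {
  [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# permitted_caps PID (hypothetical helper): prints the permitted
# capability mask of process PID as reported in /proc/PID/status.
permitted_caps() {
  awk '/^CapPrm:/ { print $2 }' "/proc/$1/status"
}

# Mask expected for the fuse watchdog per the list above:
# CAP_SYS_ADMIN | CAP_DAC_READ_SEARCH = (1 << 21) | (1 << 2).
printf 'watchdog mask: %016x\n' $(( (1 << CAP_SYS_ADMIN) | (1 << CAP_DAC_READ_SEARCH) ))
```

With the pid of a client process in `$pid`, `has_cap "$(permitted_caps "$pid")" "$CAP_SYS_ADMIN"` should then succeed only for the fuse watchdog.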

The fuse process and cache manager process have their PR_SET_DUMPABLE
flag enabled so they can be ptraced, but their RLIMIT_CORE is set to
zero so they still don't dump core. Previously they didn't dump core
because PR_SET_DUMPABLE was disabled, since they started privileged.
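
The RLIMIT_CORE half of this can be observed from a shell on Linux. This is a rough sketch under assumptions: `ulimit -c 0` is only the shell-level counterpart of setting RLIMIT_CORE to zero (the client does this in its own code), the dumpable flag has no shell equivalent, and `soft_core_limit` is a helper name invented for this example.

```shell
# soft_core_limit PID (hypothetical helper): prints the soft RLIMIT_CORE
# of process PID, read from the Linux-specific /proc/PID/limits table.
soft_core_limit() {
  awk '/^Max core file size/ { print $5 }' "/proc/$1/limits"
}

# Lower the limit in a subshell only. awk inherits the limit, and "self"
# resolves to the awk process that opens the file.
( ulimit -c 0 && soft_core_limit self )
```

Passing the pid of the fuse or cache manager process instead of `self` shows the limit that process is running with.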

For some reason the GitHub diff doesn't show that
`swissknife_capabilities.{h|cc}` were first renamed to
`capabilities.{h|cc}`, although the git diff showed that.

---------

Co-authored-by: Valentin Volkl <valentin.volkl@cern.ch>
@DrDaveD DrDaveD force-pushed the extend-cvmfs-info branch from d4a8132 to 64c78c2 Compare March 9, 2026 16:42
This is the first step of cvmfs#4118.

The basic idea is that it's more efficient to do a bulk delete operation
via POST with up to 1000 objects instead of doing one DELETE request per
object. See
[here](https://docs.aws.amazon.com/AmazonS3/latest/API/API_DeleteObjects.html)
for more details.
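
The batching idea can be sketched as follows. This is not the code from the PR: `delete_in_batches` is a name invented for the example, and `send_bulk_delete` is a placeholder that only reports the batch size where a real implementation would issue one bulk-delete POST. The limit of 1000 is the one mentioned above.

```shell
# Placeholder for one bulk-delete POST; here it only reports the batch size.
send_bulk_delete() {
  echo "bulk delete of $# objects"
}

# delete_in_batches MAX: reads one object key per line from stdin and
# calls send_bulk_delete once per group of at most MAX keys.
delete_in_batches() (
  max=$1
  set --                      # positional parameters hold the current batch
  while IFS= read -r key; do
    set -- "$@" "$key"
    if [ "$#" -eq "$max" ]; then
      send_bulk_delete "$@"
      set --
    fi
  done
  # Flush the final, possibly smaller, batch.
  if [ "$#" -gt 0 ]; then
    send_bulk_delete "$@"
  fi
)

# 2500 keys turn into 3 requests instead of 2500.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "obj" i }' | delete_in_batches 1000
```
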
@DrDaveD DrDaveD force-pushed the extend-cvmfs-info branch from 64c78c2 to 38dd41a Compare March 9, 2026 17:42
DrDaveD and others added 10 commits March 9, 2026 22:38
I have sometimes seen very long run times in the dockerized cloud tests
where they seemed to be stuck. One time I investigated it, and one of
these unused cleanup traps was implicated in interfering with the
300-second per-test timeout function in run.sh. I searched through all
the tests and found these two with trap statements but no cleanup
function.
"Try to cd into the path I'm invoked with (if any), and regardless of
success, print current directory" is not an adequate way to deduce the
script location. It is guaranteed to work only if invoked with an
absolute path, or if the CWD is the actual script location.

On Fedora 43, execution goes as follows, when invoked as
- `sudo cvmfs_config chksetup` (used in test/src/001-chksetup)
- `bash cvmfs_config chksetup`
- `sudo bash cvmfs_config chksetup`
- `sudo bash -x cvmfs_config chksetup`

But not if invoked as
- `cvmfs_config chksetup`
- `/usr/bin/cvmfs_config chksetup`
- `bash -c 'cvmfs_config chksetup'`

```
+++ dirname cvmfs_config
++ cd .
++ pwd
+ SCRIPT_LOCATION=/usr/local/src/cvmfs/test
++ echo /usr/local/src/cvmfs/test
++ sed -e 's/^\(.*\)\/bin$/\1/'
+ INSTALL_BASE=/usr/local/src/cvmfs/test
++ '[' x/usr/local/src/cvmfs/test = x/usr ']'
++ echo /usr/local/src/cvmfs/test/sbin
+ SBIN_BASE=/usr/local/src/cvmfs/test/sbin
```


I tried to change `test/src/001-chksetup` in a way to provoke the same
error in the actions runner, for demonstration and solution testing
purposes, but so far I failed at provoking it. I can confirm that this
patch solves the issue in my Fedora 43 container where I run tests.
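
For illustration, here is one more robust way to derive the location from `$0`. It is a sketch under assumptions, not necessarily what this patch does: `resolve_script_dir` is a hypothetical helper, and it covers the failing case above, where `$0` is a bare name such as `cvmfs_config` that the shell located through PATH.

```shell
# resolve_script_dir INVOKED_AS (hypothetical helper): prints the
# directory containing the script, given the value the script sees as $0.
resolve_script_dir() (
  CDPATH=            # keep cd from searching CDPATH for relative names
  invoked=$1
  case $invoked in
    */*) ;;          # has a path component: use it as given
    *)
      # Bare name, e.g. from "bash cvmfs_config": unless the file is in
      # the current directory, the shell found it through PATH.
      if [ ! -f "$invoked" ]; then
        found=$(command -v "$invoked") && invoked=$found
      fi
      ;;
  esac
  cd "$(dirname "$invoked")" && pwd -P
)
```

Inside a script this would be used as `SCRIPT_LOCATION=$(resolve_script_dir "$0")`. The sketch does not follow a symlink that points at the script file itself.
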
For convenience, so that CVMFS_TEST_USER no longer needs to be set
explicitly on test machines with a user other than sftnight.
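
The commit message does not say where the new default comes from. Assuming it falls back to the account that runs the tests, the logic could look like the sketch below, where `default_test_user` is a name invented for this example.

```shell
# default_test_user (hypothetical helper): prints CVMFS_TEST_USER if it
# is set and non-empty, otherwise the name of the invoking user.
default_test_user() {
  printf '%s\n' "${CVMFS_TEST_USER:-$(id -un)}"
}

default_test_user
```

An explicit `CVMFS_TEST_USER=someuser` in the environment still takes precedence over the fallback.
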
As discussed in cvmfs#3923:

> This is opening a can of worms for sure :-) I agree that this printout
should show the most relevant information in a way that catches the eye.
For me that's actually the failing tests, and the return code they
failed with, which can be added in a digest at the end. Then I think all
of the skipped tests could be similarly grouped in the beginning - we can
do a first loop to find those.
> 
> It should be noted that these scripts originally intended for the test
report to be generated from the JUnit xml, relying on Jenkins to do
this. We could still use JUnit, but it should still work without
additional dependencies when just invoking run.sh from bash.
> 
> One thing that would make sense to me is to generate CTest testcases
for all the integration tests. That way running the tests becomes much
more standard. However I also appreciate the flexibility that comes from
having our own test runner script, so I would still keep run.sh.
> 
> So, back to the run.sh output. With this PR this goes from:
> 
> ```
> docker exec -u sftnight -t cvmfs-dev bash -c \
> "cd /home/sftnight/cvmfs/test/common/container && CVMFS_TEST_PROXY=DIRECT bash test.sh"
>   shell: /usr/bin/bash -e {0}
> running CernVM-FS client test cases...
> Limiting test cases to suite(s): quick
> Testing Dummy test... OK
> Testing Check installation... OK
> Testing Probing atlas, lhcb... OK
> Testing Nested catalogs with same prefix... Skipped by suite restriction
> ...
> Testing Check changing CVMFS_CACHE_REFCOUNT between reloads... OK
> Testing Check that simultaneous mount attempts only mount the repository once... Skipped by exclusion
> Testing Browsing with the streaming cache manager... Skipped by exclusion
> 
> Tests:    107
> Skipped:  26
> Passed:   77
> Warnings: 4
> Failures: 0
> ```
> 
> To:
> 
> ```
>  docker exec -u sftnight -t cvmfs-dev bash -c \
> "cd /home/sftnight/cvmfs/test/common/container && CVMFS_TEST_PROXY=DIRECT bash test.sh"
>   shell: /usr/bin/bash -e {0}
> running CernVM-FS client test cases...
> Limiting test cases to suite(s): quick
> Testing src/000-dummy Dummy test... OK
> Testing src/001-chksetup Check installation... OK
> Testing src/002-probe Probing atlas, lhcb... OK
> Testing src/003-nested Nested catalogs with same prefix... Skipped by suite restriction
> Testing src/004-davinci Setup Davinci... Skipped by suite restriction
> Testing src/005-asetup Setup ATLAS... Skipped by suite restriction
> ...
> Testing src/103-reloadcachemgr Check changing CVMFS_CACHE_REFCOUNT between reloads... OK
> Testing src/104-concurrent_mounts Check that simultaneous mount attempts only mount the repository once... Skipped by exclusion
> Testing src/105-streaming-cache Browsing with the streaming cache manager... Skipped by exclusion
> 
> Tests:    107
> Skipped:  26
> Passed:   78
> Warnings: 3
> Failures: 0
> ```
> 
> A very nice improvement for sure! I think the "src/" can be dropped,
as GitHub and any IDE allow you to easily find the location of
`105-streaming-cache`. I would also add some padding to make the OKs
more visible, and print the time taken to complete the test. So it could
look something like (imagine proper alignment):
> 
> ```
> Testing 001-chksetup Check installation ............................. OK in 125.3s
> Testing 002-probe Probing atlas, lhcb ........................... OK in 34.3s
> Testing 003-nested Nested catalogs with same... ........ Skipped by suite restriction
> ```



The output now looks roughly like this (but with richer formatting/colors):
```
./run.sh test.log -x src/007-testjobs --  src/00*

Limiting test cases to suite(s): quick
Starting CernVM-FS integration test run at Wed Mar 11 17:18:45 CET 2026
Logging test output to /tmp/cvmfs-test-logs/cvmfs-test-log-20260311-172130.log
Running from git revision 8571ddd

Selection preview:
Will run 9 tests.
Excluded by -x (1):
007-testjobs ........... SKIP

000-dummy .............. PASS |   0.1s
001-chksetup ........... PASS |   0.4s
002-probe .............. PASS |   1.2s
003-nested ............. PASS |   0.8s
004-davinci ............ WARN |  16.7s | memory limit exceeded
005-asetup ............. PASS |   8.1s
006-buildkernel ........ FAIL |   1.1s | retval 2
008-default_domain ..... PASS |   3.5s
009-tar ................ PASS | 349.8s

Tests:    10
Skipped:  1
Passed:   7
Warnings: 1
Failures: 1

took 6 minutes 30.392 seconds

Warnings / failures digest:
004-davinci ............ WARN |  16.7s | memory limit exceeded
006-buildkernel ........ FAIL |   1.1s | retval 2
```
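
The dot padding and fixed-width timing in the result lines above could be produced along these lines. This is a minimal sketch, not the code in run.sh: `format_result` is a hypothetical helper, and the column width of 24 is read off the sample output rather than taken from the script.

```shell
# format_result NAME STATUS SECONDS [NOTE] (hypothetical helper):
# prints one result line in the layout shown in the sample output.
format_result() (
  LC_ALL=C           # force "." as the decimal separator in printf
  export LC_ALL
  name=$1 status=$2 secs=$3 note=${4:-}
  width=24           # length of "name + dots" in the sample output
  line="$name "
  # Pad with dots up to the status column; always emit at least three.
  n=$(( width - ${#line} ))
  if [ "$n" -lt 3 ]; then n=3; fi
  while [ "$n" -gt 0 ]; do
    line="$line."
    n=$(( n - 1 ))
  done
  line=$(printf '%s %s | %5.1fs' "$line" "$status" "$secs")
  if [ -n "$note" ]; then line="$line | $note"; fi
  printf '%s\n' "$line"
)

format_result 000-dummy PASS 0.1
format_result 006-buildkernel FAIL 1.1 "retval 2"
```

Keeping the layout in one function means the per-test lines and the warnings/failures digest at the end can share the same formatting.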

Output with -s argument:

```
 ./run.sh test.log -s quick -x src/007-testjobs -- src/00*
Limiting test cases to suite(s): quick
Starting CernVM-FS integration test run at Fri Mar 13 13:17:12 CET 2026
Logging test output to /home/sftnight/cvmfs/test/test.log
Running from git revision 6626535

Selection preview:
Will run 4 tests.
Excluded by suite (-s) (5):
003-nested ............. SKIP
004-davinci ............ SKIP
005-asetup ............. SKIP
006-buildkernel ........ SKIP
009-tar ................ SKIP
Excluded by -x (1):
007-testjobs ........... SKIP

000-dummy .............. PASS |   0.1s
001-chksetup ........... PASS |   0.3s
002-probe .............. PASS |   4.0s
008-default_domain ..... PASS |   9.9s

Tests:    10
Skipped:  6
Passed:   4
Warnings: 0
Failures: 0

took 18.220 seconds

All executed tests ran **successfully.**
```

run.sh now also has sensible defaults when run without any arguments,
and prints a warning if it's not run from the parent directory.

See also the output of the checks of this PR.
…exists (cvmfs#4128)

It was reported on Mattermost that a `cvmfs_server resign` requires a
repository's `.crt` file to be present, which was surprising. This
surprise can be avoided by actually resigning an existing
`.cvmfswhitelist` file rather than recreating it.

However, this does not avoid the need for the configuration of the repo
to be present; only the `-w` option does that. There was no hint about
that in the usage info though, so this PR also adds that.

---------

Co-authored-by: Valentin Volkl <valentin.volkl@cern.ch>
This makes catalog access for snapshot (pull), gc, and check ignore any
bulk hashes when there are chunks available. This cleans out all the old
legacy bulk hashes from the early days of CVMFS without needing to run
`cvmfs_server eliminate-bulk-hashes`.

The repository most affected is cms.cern.ch which has about 2T of these
old bulk hashes. I first tried using `cvmfs_server
eliminate-bulk-hashes` on a copy of the repository and for some reason
it was not successful in removing the old hashes.

The caveat with the approach in this PR is that if someone tries to
replicate from a repository that has been cleaned out with this while
using an older version of cvmfs-server, they will run into missing
files. There are no longer any existing clients that still use the old
bulk hashes. That was already the case when the `eliminate-bulk-hashes`
command was added almost 6 years ago in cvmfs#2545.
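
The selection rule described above can be summarized in a few lines. This is only a sketch of the rule, not the catalog code from this PR, and `objects_for_entry` is a hypothetical helper that takes a file entry's bulk hash followed by its chunk hashes.

```shell
# objects_for_entry BULK_HASH [CHUNK_HASH...] (hypothetical helper):
# prints the object hashes that snapshot, gc and check would consider
# for one file entry. The bulk hash is ignored as soon as chunks exist.
objects_for_entry() (
  bulk=$1
  shift
  if [ "$#" -gt 0 ]; then
    printf '%s\n' "$@"
  elif [ -n "$bulk" ]; then
    printf '%s\n' "$bulk"
  fi
)

objects_for_entry bulk-hash chunk-1 chunk-2   # chunked file: chunks only
objects_for_entry bulk-hash                   # unchunked file: bulk hash
```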

---------

Co-authored-by: Valentin Volkl <valentin.volkl@cern.ch>
* Fixes the migration tests and adds a github action workflow
  - updates tarsum helper to python3
* Includes bump to `2.14.0~pre1` in packaging
* Updates the Dockerfile for containerized CI with better caching
* Some small fixes for running tests in a barebones container (`command
-v` instead of `which`, `#!/usr/bin/env sh`)

---------

Co-authored-by: Dave Dykstra <2129743+DrDaveD@users.noreply.github.com>
vvolkl and others added 5 commits March 17, 2026 18:15
The lines we had in cvmfs_options.cmake were not effective because this
has to be set before project()
…fs#4062)

After cvmfs#4029, upgrading a running mount crashes. I tracked it down to a
Panic() on the running quota manager because it did not recognize the
new `kSetCleanupPolicy` command. This PR bumps the protocol revision
number so a new cvmfs module won't send the new commands to an old quota
manager, avoiding the crash.
@vvolkl vvolkl force-pushed the extend-cvmfs-info branch from 58aef7e to f5c078b Compare March 18, 2026 11:29
4 participants