babs merge incorrectly raises Exception "Unable to find file content for some file(s)" #355

@tien-tong

Summary

babs merge checks the output of git annex find --not --in output-storage; any non-empty stdout (if msg != '') raises the Exception "Unable to find file content for some file(s)".

https://github.com/PennLINC/babs/blob/main/babs/merge.py#L320-L332

A warning message from git annex find --not --in output-storage therefore incorrectly triggers this Exception, and an empty file is written to <babs_root>/merge_ds/code/list_content_missing.txt.

This issue came up with git-annex version 10.20260213, which produces the warning message shown in the log below; previous versions (e.g., 10.20250828) did not.

Pushing merging actions to output RIA...
Enumerating objects: 12, done.
Counting objects: 100% (12/12), done.
Delta compression using up to 40 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (2/2), 85.07 KiB | 1.09 MiB/s, done.
Total 2 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
remote: Checking connectivity: 2, done.
To /gpfs/fs001/cbica/projects/pennlinc_hcpd/hcp-d/derivatives/xcpd-0-10-6-babs/output_ria/7f1/8e816-3c96-43fd-af35-941e6ad1d31d
   a2f5d0f..2b4c31b  master -> master

  Remote origin: This repository is not initialized for use by git-annex, but /gpfs/fs001/cbica/projects/pennlinc_hcpd/hcp-d/derivatives/xcpd-0-10-6-babs/output_ria/7f1/8e816-3c96-43fd-af35-941e6ad1d31d/annex/objects/ exists, which indicates this repository was used by git-annex before, and may have lost its annex.uuid and annex.version configs. Either set back missing configs, or run git-annex init to initialize with a new uuid.
Traceback (most recent call last):
  File "/cbica/projects/pennlinc_hcpd/miniforge3/bin/babs", line 8, in <module>
    sys.exit(_main())
             ^^^^^^^
  File "/gpfs/fs001/cbica/projects/pennlinc_hcpd/software/babs/babs/cli.py", line 752, in _main
    options.func(**args)
  File "/gpfs/fs001/cbica/projects/pennlinc_hcpd/software/babs/babs/cli.py", line 580, in babs_merge_main
    babs_proj.babs_merge(chunk_size, trial_run)
  File "/gpfs/fs001/cbica/projects/pennlinc_hcpd/software/babs/babs/merge.py", line 327, in babs_merge
    raise Exception(
Exception: Unable to find file content for some file(s). The information has been saved to this text file: '/gpfs/fs001/cbica/projects/pennlinc_hcpd/hcp-d/derivatives/xcpd-0-10-6-babs/merge_ds/code/list_content_missing.txt'.

With this error, the job branches are still merged to master. However, after datalad clone ria+file://path/to/output_ria#~data test, git annex fsck -f output-storage --fast * must be run before the outputs can be retrieved with datalad get.
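
One possible direction for a fix, as a minimal sketch rather than the actual code in babs/merge.py: capture stdout and stderr separately, and decide whether content is missing from the file paths on stdout only. This assumes the git-annex warning is emitted on stderr, which I have not confirmed (the empty list_content_missing.txt is at least consistent with it). The names parse_missing_files and find_missing_content, and the injectable run parameter, are hypothetical and only here for illustration.

```python
import subprocess
import sys
from typing import Callable, List


def parse_missing_files(stdout: str) -> List[str]:
    """Return the non-blank lines of stdout, one per file reported as missing."""
    return [line for line in stdout.splitlines() if line.strip()]


def find_missing_content(
    repo_dir: str,
    remote: str = "output-storage",
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> List[str]:
    """List files whose content is not in `remote` (hypothetical helper).

    Assumption: git-annex prints file paths on stdout and warnings on stderr.
    """
    proc = run(
        ["git", "annex", "find", "--not", "--in", remote],
        cwd=repo_dir,
        capture_output=True,  # keep stdout and stderr apart
        text=True,
        check=False,
    )
    if proc.stderr.strip():
        # Surface diagnostics to the user, but do not treat them as missing files.
        print(proc.stderr.strip(), file=sys.stderr)
    return parse_missing_files(proc.stdout)
```

With something like this, the Exception would be raised only when the returned list is non-empty, and list_content_missing.txt would be written only in that case. If the warning turns out to be printed on stdout instead, the output would need to be filtered differently.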

Labels: bug
