Add extra_paths to input dataset config to retain non-subject files (e.g. nidm.ttl)
Problem
The participant_job.sh generated by babs init hard-codes the per-subject
sparse-checkout of each non-zipped input dataset to only ${subid} (and
${sesid}) plus dataset_description.json:
babs/templates/participant_job.sh.jinja2:102
That means any other dataset-root files needed by the BIDS app are dropped from
the working tree.
Concrete case discovered with @djarecka: an NIDM input dataset has
sourcedata/NIDM/nidm.ttl which is symlinked into each subject. After the
per-subject sparse-checkout runs, nidm.ttl is no longer in the working tree,
the symlinks dangle, and datalad run fails with "no such file".
There should be a way to configure extra_paths so these files are included
in the sparse-checkout set.
The existing required_files field in the input dataset YAML doesn't help.
It's a per-subject inclusion filter, and per
docs/preparation_config_yaml_file.rst:115 it's not implemented yet anyway.
Proposal
Add an optional extra_paths list to each entry in input_datasets, e.g.:
    input_datasets:
      NIDM:
        is_zipped: false
        origin_url: ...
        path_in_babs: sourcedata/NIDM
        extra_paths:
          - nidm.ttl
In babs/templates/participant_job.sh.jinja2, each extra_path should be:
- Added to the per-subject git sparse-checkout set --stdin list (line 102).
  This is the load-bearing change: it's what makes the file visible to
  the BIDS app.
- Optionally added to datalad run -i so annex content (if any) is fetched
  and provenance is recorded.
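
As a rough illustration of the first bullet, the sketch below builds the list that would be fed to git sparse-checkout set --stdin. It is not the current template code: build_sparse_patterns is a hypothetical helper, the ${subid}/${sesid} form of the session entry is an assumption, and the real template may write its patterns differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BABS): print one sparse-checkout entry per line.
# Usage: build_sparse_patterns SUBID SESID [EXTRA_PATH...]
# Pass an empty string as SESID for single-session datasets.
build_sparse_patterns() {
    local subid="$1" sesid="$2"
    shift 2

    # Subject (and, if given, session) directory. The "subid/sesid" form is an assumption.
    if [ -n "$sesid" ]; then
        printf '%s\n' "${subid}/${sesid}"
    else
        printf '%s\n' "${subid}"
    fi

    # Always kept by the existing template, per the issue text.
    printf '%s\n' "dataset_description.json"

    # Proposed addition: every configured extra_paths entry.
    local p
    for p in "$@"; do
        printf '%s\n' "$p"
    done
}

# In the job script the output would be piped to git, for example
# (dataset_dir is a placeholder for the input dataset's checkout location):
#   build_sparse_patterns "$subid" "$sesid" nidm.ttl \
#       | git -C "$dataset_dir" sparse-checkout set --stdin
```

In the Jinja2 template the extra entries would come from a loop over the dataset's extra_paths rather than from positional arguments; the same list could also be reused to build the optional datalad run -i arguments.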