Add pre_run/post_run hooks; make zipping an optional built-in hook#381
Open
asmacdo wants to merge 18 commits into
Open
Add pre_run/post_run hooks; make zipping an optional built-in hook#381asmacdo wants to merge 18 commits into
asmacdo wants to merge 18 commits into
Conversation
Add two optional, ordered hook lists to the participant job script:
- pre_app: spliced before the `datalad run` wrapper
- post_run: spliced after the run, before the push to output storage
Each splice region is a subshell that exports the contract vars
(subid, BRANCH, PROJECT_ROOT, JOB_SCRATCH_DIR; plus sesid at session
level) so separate-process hook scripts read them by name, while the
export cannot leak into the datalad run / container. processing_level
is intentionally not exported: it is a render-time constant, and sesid
presence already distinguishes subject vs session.
Hooks reach the template as list[str]; resolving the YAML hooks: block
into those strings is a later step. With no hooks configured the render
is byte-identical to before (jinja {% if %} guard + trim/lstrip).
Thread hook_pre_app/hook_post_run (default None) through
generate_submit_script(); both existing call sites are unaffected.
Add render tests: no-hooks (None == [], no markers, shellcheck),
pre_app+post_run expansion (contract export, snippet order, position
around the run wrapper, shellcheck), and session-only sesid export.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First resolution step for the participant_job splice points (8b4185a): turn a `hooks:` config block into the command strings those splice points consume, plus the files babs materializes at init. This is what lets cross-cutting concerns -- optional zip, NORDIC, a BIDS-validator gate -- live as user-configurable hooks in one codepath instead of diverged templates. Design: PennLINC#365. babs/hooks.py (pure, no I/O): resolve_hooks(hooks_config, *, source_base) classifies each entry into Verbatim (form a, raw snippet) or CopyIn (form b, `{script: <path>}` copied to code/hooks/<basename>.sh). Render (templated built-ins / container hooks) is defined as the forward-compat seam but not yet produced. Fail-fast ValueError on bad config. CopyIn.as_import() emits the {original_path, analysis_path} shape _init_import_files already consumes, so a later slice wires materialization with no duplicated copy logic. tests/test_hooks.py: 21 unit tests (run in the slurm-docker-ci container; conftest pulls slurm deps at collection). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A hook collision is about the materialized file, not the name alone. Key the check on descriptor equality: the *same* hook reused at multiple splice points (e.g. a BIDS validator at both pre_app and post_run) is copied once and referenced from each list, while two *different* descriptors claiming the same code/hooks/<name>.sh still raise. Because Render carries `context`, this also catches one template rendered two ways into the same name (a real conflict) once Render is wired. tests: same-source-at-both-points materializes once; different-source-same- name still collides; Render equality distinguishes context (pins the invariant the rule depends on, since resolve_hooks doesn't emit Render yet). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Follow babs's existing convention for copied-in local files instead of
inventing a parallel one. `imported_files.original_path` -- which form (b)
reuses via _init_import_files -- is an absolute local path used as-is
(op.exists + copy, no cwd-join, no abspath validation); mechababs likewise
resolves the inclusion file to absolute before passing it ("so datalad run cp
doesn't need the cwd to match"). So resolve_hooks no longer takes source_base
and no longer joins: CopyIn.original_path is the `script:` value verbatim, and
_init_import_files raises FileNotFoundError on a bad path exactly as it does
for imported_files today.
This also lands on the right side of the path-handling review on
PennLINC#369: source paths are absolute (the explicitly-supported
copy-in case), while the destination (code/hooks/<name>.sh) stays relative
and is containment-validated by _validate_name.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Connect the resolver to babs for the single-app codepath:
- Render: Container.generate_bash_participant_job resolves
`self.config.get('hooks')` and passes hook_pre_app/hook_post_run to
generate_submit_script (container.py).
- Materialize: babs_bootstrap appends the resolved CopyIns'
as_import() dicts to the existing shared _init_import_files call, so
hook scripts copy into code/hooks/ with no duplicated import logic.
- _init_import_files now makedirs the destination's parent (hooks land in
code/hooks/, which -- unlike flat code/ -- doesn't pre-exist).
The pipeline call site (bootstrap.py:508) is left unwired by design: it
defaults hook_pre_app=None so pipeline renders byte-identical, and pipeline
mode is slated for deletion once NORDIC is a hook. No hooks configured =>
no code/hooks/ dir and no splice subshell (both are guarded), so projects
without hooks are untouched.
Tests:
- test_babs_init_raw_bids: splice a form-(b) contract-guard hook at both
pre_app and post_run. It runs as a separate process, so it only sees the
contract vars (subid/BRANCH/PROJECT_ROOT/JOB_SCRATCH_DIR) because the
splice subshell exports them; `${var:?}` fails the job under set -e if any
is unset. Same source at both points also exercises copy-once dedup. Plus
a post-init assertion that the guard is materialized + spliced -- a dropped
hook would otherwise pass silently (no guard left to fail).
- test_babs_init_single_app_hooks: init-only wiring check (no job execution)
-- asserts a configured hook lands in code/hooks/ and is referenced twice.
- test_babs_init_list_sub_file: assert no code/hooks/ dir when no hooks.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the alphabetic form labels in docstrings/comments with names derived
from the config syntax, so a reader doesn't have to decode "(a)/(b)/(c)":
(a) raw snippet -> snippet (Verbatim)
(b) user script -> script (CopyIn)
(c) container -> templated built-in (Render)
"templated built-in" names the Render *mode* (babs renders a shipped
*.sh.jinja2 into code/hooks/), which serves both a zip hook and a
container-running hook (e.g. nordic) alike -- they differ only in what the
shipped template does, not in how it's produced. So there is no distinct
"container" form; a container-running hook is just a {builtin: <name>} whose
template composes a singularity run.
Comments/docstrings only -- no behavior change.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The two splice points bracket the same `datalad run`, so name them for that boundary: pre_run / post_run (was pre_app / post_run, which mixed "app" and "run" vocab). Renames the config key, the hook_pre_app -> hook_pre_run template param, and the test markers. Config-key churn is cheapest now, before the keys are public. (Review feedback, PR1-review Q8.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a `hooks` section to the config-yaml docs: the pre_run/post_run splice points, the snippet and script entry forms, the exported runtime contract (subid/sesid/BRANCH/PROJECT_ROOT/JOB_SCRATCH_DIR) + subshell/set -e semantics, and -- most importantly -- that hooks splice OUTSIDE the datalad run wrapper, so a hook that writes files leaves uncommitted changes that aren't pushed; the safe default is validation that fails the job, and persisting output requires the hook to own its own datalad run/save. (Review feedback, PR1-review Q1/Q2/Q10/Q11.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two different scripts sharing a basename within one splice point produced "(in 'pre_run' and 'pre_run')". Name the single point when prior_point == point, the pair otherwise. Adds test_same_point_same_name_collide and tightens the cross-point test to assert the full location. (Review feedback, PR1-review Q12.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This was referenced Jun 10, 2026
A {builtin: <name>} entry now resolves to a Render of the shipped
templates/hooks/<name>.sh.jinja2; keys beyond `builtin` become the
per-hook part of the render context (e.g. zip's optional `path`).
Bootstrap still gates with NotImplementedError until the next step
wires the actual rendering; CopyIns keep flowing to _init_import_files.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
`_init_render_hooks` renders each Render's shipped template (babs/templates/hooks/) into code/hooks/<name>.sh at init and datalad saves, replacing the NotImplementedError gate. Render context = the per-hook config params plus the derived part (processing_level), with derived values taking precedence. zip.sh.jinja2 is the first built-in: unlock -> 7z inside its own `datalad run --explicit` -> remove the granular outputs in a separate commit (the datalad/datalad#7822 workaround, TODO'd for removal once babs's minimum datalad carries the fix). One hook = one zip; the archive is named ${subid}[_${sesid}]_<basename(path)>.zip and contains basename(path) at its top level. Also wraps a pre-existing >99-char line in hooks.py that failed ruff. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Asserts {builtin: zip, path: ...} renders code/hooks/zip.sh (id, archive
name, 7z line, granular removal), splices once at post_run, and lands
the "Materialize built-in hook scripts" init commit. The TODO-fenced
demo block copies the rendered artifacts into demo-output/ for
inspection after a docker run; remove before PR assembly.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
BUILTIN_PARAMS maps each built-in to its accepted per-hook params
(zip -> {path}). An unknown builtin or a typo'd param (e.g. `paht:`)
now fails at resolve time instead of flowing silently into the render
context (StrictUndefined only catches missing context, not unexpected
keys). The registry also subsumes name validation for built-ins: only
known names resolve, so a path-escaping name can't reach the template
loader.
Also wraps a >99-char line in test_hooks.py that failed ruff.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The inner run script stops being named for zipping (zip moves to the post_run hook). generate_submit_script grows an optional `output_dir` (mutually exclusive with the pipeline-only `zip_foldernames`): when given, the datalad run declares the app output folder itself as the output, committing granular results. Single-app still passes zip_foldernames until the config-surface swap lands; pipeline mode is untouched. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…t_dir The top-level output_dir (the folder the app writes into, carrying the versioned derivative name) is now the single source for the app write dir, the datalad run -o declaration, and the default zip-hook path. Legacy keys hard-error with a migration message (app_output_settings_from_config stays for pipeline mode; dies in PR 3). - bidsapp_run.sh.jinja2 de-zipped: cmd_zip gone, and the granular outputs survive the script (the run now commits them; the zip hook owns their removal) - the zip built-in becomes a STATIC script taking <path> [<name>] as splice-site arguments (visible in participant_job.sh): path defaults to output_dir at resolve time, name (a free-form archive stem, the old map value's only real job) defaults to basename(path) at runtime, and sesid presence encodes the processing level. All zip instances share one code/hooks/zip.sh; multi-zip differs only in args. The Render seam stays reserved for the container-running built-ins (PR 3). - notebooks/eg_*.yaml + test fixtures migrated; docs: output_dir section replaces zip_foldernames; raw_bids e2e flips post_run to [contract guard, zip] (guard before zip -- ordering is load-bearing) WIP: not yet run in docker (units + e2e pending; docker held by the PR-1 CI session). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
With zip shipped as a static arg-taking script, nothing produces a Render: drop the dataclass, bootstrap's _init_render_hooks, and the templated-builtin resolution branch; built-in CopyIns ride _init_import_files like script hooks. Reintroduce a rendered form alongside the container-running built-ins (NORDIC) only if runtime arg/env composition turns out not to suffice there. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implements the hooks proposal discussed in #365 — the splice-point mechanism plus its first real consumer: output zipping moves out of the generated run script into an optional built-in
post_runhook, making zipping opt-in (#327, #364).(Originally split as two stacked PRs; combined per the 2026-06-10 community meeting, since a release will be cut before this merges and the split bought nothing.)
Part 1 — the hooks mechanism
A new optional
hooks:section in the container config YAML splices user-supplied shell commands into each participant job at two splice points that bracket the BIDS App run:pre_run— after job setup, just before thedatalad runthat executes the BIDS App.post_run— after thatdatalad runcompletes (its outputs committed), before results are pushed to the output RIA.Three entry forms:
participant_job.sh.{script: <absolute path>}; copied intoanalysis/code/hooks/<basename>.shatbabs init(the same mechanism asimported_files, so the hook is git-tracked in the analysis dataset) and run asbash ./code/hooks/<name>.sh.{builtin: <name>}; a static script shipped inside BABS, copied in like a script hook. Its params become shell arguments at the splice site, resolved at config time — several instances of one built-in share a single script and differ only in args.zipis the first built-in.Design points:
datalad runwrapper. The hook author owns commit semantics: the safe default use is validation that fails the job (a non-zeropost_runhook aborts before push, so bad results never leave the node); a hook that persists output runs its owndatalad run/save— which is exactly what the zip built-in does. This is also what lets a future preprocessing hook (NORDIC) produce an intermediate without committing it.subid(+sesidat session level),BRANCH,PROJECT_ROOT,JOB_SCRATCH_DIR, so a separate-process hook can read them; a hook'scd/variable changes can't leak into the rest of the job, and the export can't leak into the BIDS App environment.set -eis preserved: a failing hook fails the job.babs/hooks.py:resolve_hooks()classifies config entries (no I/O); bootstrap executes the copies through the existing_init_import_files. Collisions key on conflicting content, not name reuse: the same hook used at both splice points is copied once; two different sources claiming the same destination raise at init. Built-in names and params are validated against a registry, so a typo failsbabs init, not the job.Part 2 — zip as a built-in hook, and the config swap
The zip/output config is redefined around one new top-level key,
output_dir— the folder the app writes into (relative to the dataset root), carrying the full versioned derivative name. It replaces bothzip_foldernames(the name was derived from key + version) andall_results_in_one_zip:output_diris the single source for the versioned name — app write dir =datalad run -odeclaration = zip source = zip name, one string that can't drift. Zipping itself becomes just a hook: no zip hook = no zipping.The zip hook takes two optional parameters:
path:(the folder to zip; defaults to the top-leveloutput_dir) andname:(the archive-name stem — the X in${subid}[_${sesid}]_X.zip; defaults topath's basename). In the old schema the map value only ever versioned the archive name; a free-form stem keeps that without any derivation.Multi-zip (what the multi-entry
zip_foldernamesmap did): one zip hook per folder, each with its ownpath:(+name:when the app controls the folder name and the archive should stay versioned):The zip built-in is a static script taking its params as args at the splice site (visible in
participant_job.sh); all instances share onecode/hooks/zip.sh. Nothing zip varies on needs render time —sesidpresence encodes the processing level at runtime, per the splice contract. (An init-time rendered built-in form was prototyped during this work and removed as unconsumed; it returns alongside the container-running built-ins only if runtime arg/env composition turns out not to suffice there.)The hook owns its own commit:
datalad unlock→7zinside its owndatalad run --explicit→ a separategit rmof the granular outputs (separate becausedatalad run --explicitdoesn't track deletions — datalad run --explicitdoes not save file deletions in--output` paths datalad/datalad#7822, since fixed upstream; TODO in-script to fold it back in once babs's minimum datalad has the fix).The generated run script no longer zips:
<container>_zip.sh→<container>_run.sh, the in-script 7z +rm -rf outputstail is gone, and the participant job'sdatalad run -odeclares the granularoutput_dir.All shipped example configs (
notebooks/eg_*.yaml), test fixtures, and docs are migrated; the legacy derivation survives only for pipeline mode, which the NORDIC follow-up deletes.Breaking change, on purpose: the legacy keys hard-error with a migration message — a clean break, no deprecation shim (accepted at the 2026-06-10 community meeting; a release is cut before this merges).
Demo: what
babs initproducesFrom a config with
output_dir: outputs/fmriprep_anat-24-1-1and an argless{builtin: zip}atpost_run(simbids, in theslurm-docker-cicontainer):code/hooks/zip.sh— the static zip built-in, copied in verbatim (git-tracked, so you can read exactly what will run)code/participant_job.sh— the run's-odeclares the granularoutput_dir; thepost_runsplice runs the zip hook with its resolved argumentanalysis dataset history after
babs init— the static hook rides the imported-files commitWhat this deliberately does not do
Testing
sesidexported only at session level.code/hooks/(negative), zip builtin copied verbatim with its resolved arg visible.raw_bidse2e splices a contract-guard script hook at both points (: "${subid:?}"etc. fails the job at runtime if the exported contract breaks) and runspost_run: [guard, {builtin: zip}]— guard before zip because hook ordering is load-bearing (a validator must see the outputs before zip flattens them).*prepapps (suspected:sesid="$2"now unused once in-script zip is gone) — to fix, plus a fresh full-e2e run.Follow-ups (next PRs)
singularitybuilt-in (container-running hook, no datalad) — NORDIC then becomes that hook plus its specifics as additional hooks, after which pipeline mode is deleted (NORDIC is its only user).babs mergeas a dependent slurm compute job (upstream issue to come) — makes the two-commit zip's heavier merge acceptable at ~1k-subject scale.babs statuswhen zipping is off: merged-results accounting currently assumes zips; likely derivable from branch ancestry in git history instead of new persisted state.