Merged
Commits
25 commits
e98f7c9
fix: correct local envrc file reference and add feature specification…
miloswrath Feb 24, 2026
9ead686
feat: implement manifest handling and reindexing logic in Save class …
miloswrath Feb 24, 2026
efd5f99
feat: implement subject transaction safety and conflict detection in …
miloswrath Feb 24, 2026
7260df3
feat: enhance Save class with manifest-driven run stability tests and…
miloswrath Feb 24, 2026
f778051
automated commit by vosslab linux
miloswrath Feb 24, 2026
50fef75
automated commit by vosslab linux
miloswrath Feb 24, 2026
a4599fa
automated commit by vosslab linux
miloswrath Feb 24, 2026
47e2139
feat: refactor code structure to use 'act' namespace and update relat…
miloswrath Feb 24, 2026
e800568
fix: update import path for ID_COMPARISONS to use 'act' namespace
miloswrath Feb 24, 2026
83f4158
feat: implement ID_COMPARISONS stub and enhance logging in comparison…
miloswrath Feb 26, 2026
24abac9
feat: add logging for subject file plans and disable shutil.copy in s…
miloswrath Feb 26, 2026
3a4719c
fix: restore shutil.copy functionality in save methods
miloswrath Feb 26, 2026
1df58ea
feat: add feature implementation and specification for CLI modernizat…
miloswrath Feb 26, 2026
48071ec
feat: modernize CLI with argparse and add --rebuild-manifest-only flag
miloswrath Feb 26, 2026
289a6f1
fix: correct variable references for cleaning codes in QC class methods
miloswrath Feb 26, 2026
b293ecd
feat: implement manifest-only mode in pipeline, skipping GGIR and plo…
miloswrath Feb 26, 2026
e9c4bc0
Refactor code structure for improved readability and maintainability
miloswrath Feb 26, 2026
1fa94d5
feat: add RedCap subject-lab mapping functionality with unit tests
miloswrath Feb 26, 2026
ce8541d
feat: implement RDSS metadata resolution and associated tests for LSS…
miloswrath Feb 26, 2026
41b407a
feat: implement manifest payload rebuilding from LSS with strict erro…
miloswrath Feb 26, 2026
ed4dc20
feat: implement atomic manifest write and error handling for rebuild …
miloswrath Feb 26, 2026
924dc6f
docs: add documentation for rebuild manifest only CLI mode and operat…
miloswrath Feb 26, 2026
8b01ca1
Merge pull request #11 from HBClab/10-modernize-cli-and-add---manifes…
miloswrath Feb 26, 2026
f9c923e
Update directory paths and modify script arguments for improved funct…
miloswrath Mar 2, 2026
7bc408a
fix: update monkeypatch for shutil.copy to use save_module in test_mo…
miloswrath Mar 2, 2026
2 changes: 1 addition & 1 deletion .envrc
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
 use flake # uses the flake
-source_env_if_exists .envrc.local # uses a local envrc file if it exists
+source_env_if_exists .env.local # uses a local envrc file if it exists
2 changes: 1 addition & 1 deletion AGENTS.md
@@ -10,7 +10,7 @@
## Build, Test, and Development Commands
- `python -m venv .venv && source .venv/bin/activate` – create a local env (Python 3.11).
- `pip install -r act/requirements.txt` – install runtime deps; add extras here only when they are runtime-critical.
-- `python -m code.main 1 "$BOOST_TOKEN" vosslnx` – run the ingest + GGIR pipeline; system flag may be `vosslnxft`, `local`, or `argon`.
+- `python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system vosslnx` – run the ingest + GGIR pipeline; `--system` may also be `vosslnxft`, `local`, or `argon`.
- `python act/tests/gt3x/plots.py` – regenerate GT3X diagnostic plots and CSV summaries; confirm paths before running.
- `bash cron.sh` – mirrors production cron behaviour; ensure credentials and git remotes are safe before invoking.

49 changes: 45 additions & 4 deletions README.md
@@ -10,6 +10,8 @@ An automation stack for synchronizing raw actigraphy exports, routing them throu
- [Prerequisites](#prerequisites)
- [Quick Start](#quick-start)
- [Running the Pipeline](#running-the-pipeline)
- [Rebuild Manifest Only](#rebuild-manifest-only)
- [Operator Runbook](#operator-runbook)
- [Configuration](#configuration)
- [Testing \& QA](#testing--qa)
- [Automation \& Cron Support](#automation--cron-support)
@@ -63,17 +65,56 @@ conda activate act-newer
## Running the Pipeline
```bash
export BOOST_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-python -m code.main <daysago> $BOOST_TOKEN <system>
+python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system vosslnx
```
-- `daysago` filters RDSS files by acquisition date; use `1` for “yesterday’s drops”.
-- `system` controls filesystem roots: `vosslnx` (default), `vosslnxft`, `local`, or `argon`.
+- `--daysago` filters RDSS files by acquisition date; use `1` for “yesterday’s drops”.
+- `--system` controls filesystem roots: `vosslnx`, `vosslnxft`, `local`, or `argon`.
- The run will:
  1. Create fresh symlinks under `../mnt` (see `utils.mnt`).
  2. Match REDCap IDs to RDSS filenames (`utils.comparison_utils`).
  3. Copy curated CSVs into the correct LSS project folders (`utils.save`).
  4. Call GGIR through `core/acc_new.R` and execute QC/plotting (`utils.qc`, `utils.group`).
  5. Write a subject manifest to `act/res/data.json`.

### Rebuild Manifest Only
Use this mode to rebuild `res/data.json` from current LSS session folders without ingest copy, GGIR, or plotting:

```bash
python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system vosslnx --rebuild-manifest-only
```

Behavior in `--rebuild-manifest-only` mode:
- The source of truth is the LSS layout (`sub-*/accel/ses-*/*_accel.csv`).
- The run number is derived directly from the session folder name (`ses-# -> run=#`).
- REDCap resolves `subject_id -> labID`, and RDSS enriches each session with `filename`, `labID`, and `date`.
- Manifest writes are atomic (`temp -> fsync -> replace`) and preserve prior `res/data.json` on failure.
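
The `temp -> fsync -> replace` sequence above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the function name `write_manifest_atomic` and the JSON formatting options are hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path


def write_manifest_atomic(path: Path, payload: dict) -> None:
    """Write payload as JSON to path via temp file -> fsync -> replace.

    Hypothetical helper; the name and signature are assumptions.
    """
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives in the destination directory so that os.replace
    # never has to cross a filesystem boundary.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2, sort_keys=True)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before the swap
        os.replace(tmp_name, path)  # swap the complete file into place
    except BaseException:
        # Any failure leaves the previous manifest untouched; only the
        # temp file is cleaned up.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```

Because the destination is only touched by `os.replace`, a serialization or I/O error before that call leaves the prior `res/data.json` in place.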

Strict failure conditions (non-zero exit):
- Multiple candidate accel CSVs in one session folder.
- Missing REDCap mapping for any discovered subject.
- Missing RDSS metadata (`filename`, `labID`, `date`) for any discovered session.
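
The layout scan and the first strict failure condition might look roughly like the sketch below. It assumes the `sub-*/accel/ses-*/*_accel.csv` layout described above; `discover_sessions`, `RebuildError`, the record keys, and the choice to ignore empty session folders are all hypothetical. The error subclasses `ValueError` only because `main` in this PR catches `ValueError`; whether the real rebuild code raises that type is an assumption.

```python
import re
from pathlib import Path

# Matches session folder names such as "ses-3".
SESSION_RE = re.compile(r"^ses-(\d+)$")


class RebuildError(ValueError):
    """Hypothetical error type for strict rebuild failures."""


def discover_sessions(lss_root: Path) -> list[dict]:
    """Scan sub-*/accel/ses-*/ and return one record per session folder."""
    records = []
    for session_dir in sorted(lss_root.glob("sub-*/accel/ses-*")):
        match = SESSION_RE.match(session_dir.name)
        if match is None or not session_dir.is_dir():
            continue
        candidates = sorted(session_dir.glob("*_accel.csv"))
        if len(candidates) > 1:
            names = ", ".join(path.name for path in candidates)
            raise RebuildError(f"multiple accel CSVs in {session_dir}: {names}")
        if not candidates:
            continue  # assumption: empty session folders are ignored
        records.append(
            {
                # parents[1] is the sub-* folder two levels above ses-*.
                "subject_id": session_dir.parents[1].name.removeprefix("sub-"),
                "run": int(match.group(1)),  # ses-# -> run=#
                "csv": candidates[0].name,
            }
        )
    return records
```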

### Operator Runbook
Generic Linux (venv):

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r act/requirements.txt
export BOOST_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system local --rebuild-manifest-only
```

NixOS / nix shell:

```bash
nix develop
export BOOST_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system vosslnx --rebuild-manifest-only
```

For routine ingest + GGIR runs, omit `--rebuild-manifest-only`.

For ad-hoc diagnostics, re-run plot generation with `python act/tests/gt3x/plots.py` (requires adjusting the hard-coded file path).

## Configuration
@@ -99,7 +140,7 @@ For ad-hoc diagnostics, re-run plot generation with `python act/tests/gt3x/plots
- Review git staging before enabling cron on a new host to avoid committing large raw exports.

## Troubleshooting Tips
-- **Missing symlinks:** run `python -c "from code.utils.mnt import create_symlinks; create_symlinks('../mnt', system='argon')"` (swap `system` as needed) and confirm mount availability.
+- **Missing symlinks:** run `python -c "from act.utils.mnt import create_symlinks; create_symlinks('../mnt', system='argon')"` (swap `system` as needed) and confirm mount availability.
- **GGIR failures:** check the console output and logs under `act/core/` or R’s stderr; ensure the conda env includes GGIR dependencies.
- **REDCap mismatches:** `utils.comparison_utils.ID_COMPARISONS` logs duplicate IDs; review its stdout and `AGENTS.md` for remediation steps.
- **Permission errors:** verify the executing user can read RDSS and write to the LSS target directories.
40 changes: 0 additions & 40 deletions act/__init__.py
@@ -1,40 +0,0 @@
-"""
-Project package namespace that also exposes the stdlib `code` helpers so modules
-like `pdb` continue to work when this package shadows Python's builtin `code`.
-"""
-
-from __future__ import annotations
-
-import importlib.util
-import sys
-import sysconfig
-from pathlib import Path
-
-
-def _load_stdlib_code() -> object:
-    """Load the standard-library `code` module without triggering recursive imports."""
-    stdlib_dir = Path(sysconfig.get_path("stdlib"))
-    stdlib_code = stdlib_dir / "code.py"
-    if not stdlib_code.exists():
-        raise RuntimeError(f"Unable to locate stdlib code.py at {stdlib_code}")
-
-    spec = importlib.util.spec_from_file_location("_stdlib_code", stdlib_code)
-    module = importlib.util.module_from_spec(spec)
-    assert spec.loader is not None
-    spec.loader.exec_module(module)
-    return module
-
-
-_STDLIB_CODE = _load_stdlib_code()
-
-# Register the stdlib module under an internal name so it can be reused.
-sys.modules.setdefault("_stdlib_code", _STDLIB_CODE)
-
-# Re-export stdlib attributes that our package doesn't define so callers such
-# as pdb can access InteractiveConsole, compile_command, etc.
-for _attr in dir(_STDLIB_CODE):
-    if _attr.startswith("_") or _attr in globals():
-        continue
-    globals()[_attr] = getattr(_STDLIB_CODE, _attr)
-
-del _attr
2 changes: 1 addition & 1 deletion act/core/gg.py
@@ -30,7 +30,7 @@ def run_gg(self):
         After each GGIR run, invoke the QC pipeline for that project.
         """
         # Assume QC is available at this import path
-        from code.utils.qc import QC
+        from act.utils.qc import QC

         for project_dir in [self.INTDIR, self.OBSDIR]:
             command = f"Rscript act/core/acc_new.R --project_dir {project_dir} --deriv_dir {self.DERIVATIVES}"
56 changes: 56 additions & 0 deletions act/docs/TESTING.md
@@ -58,6 +58,62 @@ When adding save logic tests:
- destination paths include expected `accel/ses-#` structure
- For error paths, mock file operations (for example `shutil.copy`) and assert graceful continuation or expected failure semantics.

## Manifest Reindex Testing
Manifest-only session reindex tests live in:
- `act/tests/test_save_manifest_reindex.py`
- `act/tests/test_save_edge_cases.py`

Use these focused commands during development:

```bash
pytest --collect-only -q act/tests/test_save_manifest_reindex.py
pytest -q act/tests/test_save_manifest_reindex.py act/tests/test_save_edge_cases.py
```

Expected behaviors covered by this suite:
- append: incoming session later than existing history receives the next dense run.
- backfill: incoming earlier session inserts chronologically and shifts later runs.
- tie-date skip: same-date subject conflicts are skipped with no manifest/filesystem mutation.
- duplicate noop: repeat ingest of the same `(labID, date, filename)` does not drift runs.
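
The four behaviors above can be illustrated with a small planning function. This is a sketch under assumptions, not the `Save` class logic from this PR: `plan_runs`, its return shape, and the ISO `YYYY-MM-DD` date strings are hypothetical, and it only models one subject's history in memory.

```python
from datetime import date


def plan_runs(history: list[dict], incoming: dict) -> tuple[str, list[dict]]:
    """Return (action, sessions) for one subject's session history.

    Entries are dicts with "labID", "date" (assumed ISO string),
    "filename", and "run".
    """
    identity = (incoming["labID"], incoming["date"], incoming["filename"])
    for entry in history:
        if (entry["labID"], entry["date"], entry["filename"]) == identity:
            return "noop", history  # duplicate ingest: runs must not drift
    if any(entry["date"] == incoming["date"] for entry in history):
        return "skip", history  # tie-date conflict: mutate nothing

    def by_date(entry: dict) -> date:
        return date.fromisoformat(entry["date"])

    is_append = all(by_date(entry) < by_date(incoming) for entry in history)
    merged = sorted([*history, dict(incoming)], key=by_date)
    # Dense renumbering: runs are 1..N in chronological order, so a
    # backfilled session shifts every later run up by one.
    sessions = [{**entry, "run": index} for index, entry in enumerate(merged, start=1)]
    return ("append" if is_append else "backfill"), sessions
```

In this sketch a backfill returns renumbered copies of the later sessions; the real pipeline would also need to rename the matching `ses-#` folders, which is out of scope here.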

## Operator Guidance
- `res/data.json` is the canonical source of truth for session run ordering.
- Current design assumes single-writer ingest semantics for `res/data.json`.
- Manual edits to `res/data.json` can force session reindex/rename behavior on the next run.
- If manual edits are necessary, run the manifest-focused tests above before production ingest.

### Manifest Rebuild-Only Operations
- CLI mode: `--rebuild-manifest-only`.
- Rebuild mode skips ingest copy/rename, GGIR, and plotting.
- Rebuild mode still requires a valid REDCap token because subject→lab mapping is enforced.
- Rebuild exits non-zero on strict failures:
  - multi-candidate session CSVs in a single `ses-*` folder,
  - missing REDCap subject mapping,
  - missing RDSS metadata for any discovered LSS session.

Linux (venv) example:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r act/requirements.txt
python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system local --rebuild-manifest-only
```

NixOS example:

```bash
nix develop
python -m act.main --daysago 1 --token "$BOOST_TOKEN" --system vosslnx --rebuild-manifest-only
```

Checkpoint-8 validation commands:

```bash
pytest --collect-only -q act/tests/test_manifest_rebuild_from_lss.py
pytest -q act/tests/test_manifest_rebuild_from_lss.py act/tests/test_pipeline_smoke.py
```

## Smoke E2E Constraints
- Smoke tests must stay Python-only and fast.
- Mock external boundaries:
137 changes: 94 additions & 43 deletions act/main.py
@@ -1,8 +1,9 @@
+import argparse
 import logging
 import os
-import sys
-from code.utils.group import Group
-from code.utils.pipe import Pipe


+DEFAULT_SYSTEMS = ("vosslnx", "vosslnxft", "argon", "local")
+
+
 def _configure_logging() -> None:
@@ -27,45 +28,95 @@ def _configure_logging() -> None:
     )


-if __name__ == "__main__":
+def _daysago_type(value: str) -> int:
+    try:
+        parsed = int(value)
+    except ValueError as exc:
+        raise argparse.ArgumentTypeError("daysago must be an integer") from exc
+
+    if parsed < 0:
+        raise argparse.ArgumentTypeError("daysago must be non-negative")
+    return parsed
+
+
+def _token_type(value: str) -> str:
+    if not value.strip():
+        raise argparse.ArgumentTypeError("token must be a non-empty string")
+    return value
+
+
+def _available_systems() -> tuple[str, ...]:
+    try:
+        from act.utils.pipe import Pipe
+
+        available = getattr(Pipe, "available_systems", None)
+        if callable(available):
+            return tuple(available())
+    except Exception:
+        pass
+
+    return DEFAULT_SYSTEMS
+
+
+def build_parser() -> argparse.ArgumentParser:
+    parser = argparse.ArgumentParser(
+        prog="python -m act.main",
+        description=(
+            "Run BOOST ingest pipeline using explicit typed arguments. "
+            "Use --rebuild-manifest-only to rebuild manifest and skip GGIR/plotting."
+        ),
+    )
+    parser.add_argument(
+        "--token",
+        type=_token_type,
+        required=True,
+        help="RedCap API token (required, non-empty)",
+    )
+    parser.add_argument(
+        "--daysago",
+        type=_daysago_type,
+        required=True,
+        help="Lookback window in days (required, integer >= 0)",
+    )
+    parser.add_argument(
+        "--system",
+        choices=_available_systems(),
+        required=True,
+        help="Target system path profile",
+    )
+    parser.add_argument(
+        "--rebuild-manifest-only",
+        action="store_true",
+        help="Rebuild manifest-only mode (skips GGIR and plotting)",
+    )
+    return parser
+
+
+def main(argv: list[str] | None = None) -> int:
+    from act.utils.group import Group
+    from act.utils.pipe import Pipe
+
     _configure_logging()
-    # Expect at least 2 arguments: daysago (integer) and token (string)
-    if len(sys.argv) < 3:
-        print("Usage: python main.py <daysago> <token> [system]")
-        print(" <daysago> must be an integer, <token> must be a non-empty string.")
-        print(
-            " [system] optional values: 'vosslnx', 'vosslnxft', 'argon', 'local' (default 'vosslnx')."
-        )
-        sys.exit(1)
-
-    # Parse daysago
+    args = build_parser().parse_args(argv)
+
+    p = Pipe(
+        token=args.token,
+        daysago=args.daysago,
+        system=args.system,
+        rebuild_manifest_only=args.rebuild_manifest_only,
+    )
+
     try:
-        daysago = int(sys.argv[1])
-    except ValueError:
-        print("Error: <daysago> must be an integer.")
-        sys.exit(1)
-
-    # Parse token
-    token = sys.argv[2]
-    if not token:
-        print("Error: <token> cannot be empty.")
-        sys.exit(1)
-
-    # Parse system
-    if len(sys.argv) > 3:
-        system = sys.argv[3]
-    else:
-        system = None
-    if not system:
-        print("System not specified, defaulting to 'vosslnx'.")
-    elif system not in ["vosslnx", "vosslnxft", "argon", "local"]:
-        print(
-            "Error: <system> must be one of 'vosslnx', 'vosslnxft', 'argon', or 'local'."
-        )
-        sys.exit(1)
-
-    p = Pipe(token, daysago, system)
-    p.run_pipe()
-
-    Group(system).plot_person()
-    Group(system).plot_session()
+        p.run_pipe()
+    except ValueError as exc:
+        logging.error("%s", exc)
+        return 1
+
+    if not args.rebuild_manifest_only:
+        Group(args.system).plot_person()
+        Group(args.system).plot_session()
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
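
The new entry point has two distinct non-zero exit paths: `argparse` rejects bad arguments with status 2 before the pipeline starts, while a `ValueError` from `run_pipe()` is logged and returned as 1. The first path can be checked in isolation with a small stand-in parser. The names `non_negative_int`, `build_demo_parser`, and `exit_code_for` below are hypothetical; only the validation idea mirrors `_daysago_type`.

```python
import argparse


def non_negative_int(value: str) -> int:
    """Same validation idea as `_daysago_type` in the diff above."""
    try:
        parsed = int(value)
    except ValueError as exc:
        raise argparse.ArgumentTypeError("must be an integer") from exc
    if parsed < 0:
        raise argparse.ArgumentTypeError("must be non-negative")
    return parsed


def build_demo_parser() -> argparse.ArgumentParser:
    # Stand-in parser with two of the four flags; not the real build_parser().
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--daysago", type=non_negative_int, required=True)
    parser.add_argument("--rebuild-manifest-only", action="store_true")
    return parser


def exit_code_for(argv: list[str]) -> int:
    """Return the status argparse would exit with, or 0 if parsing succeeds."""
    try:
        build_demo_parser().parse_args(argv)
    except SystemExit as exc:
        return int(exc.code)
    return 0
```

Passing `argv` explicitly, as `main(argv)` does in this PR, is what makes this kind of check possible without patching `sys.argv`.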