Collect LLVM code coverage for instrumented deep-learning libraries (currently PyTorch) by executing large corpora of Python scripts in a controlled, parallel, and fault-tolerant workflow. Results are merged into `.profdata` files for analysis.
- Uses an instrumented container image (PyTorch 2.2.0 provided) to run Python inputs under coverage.
- Buckets Python inputs by time intervals to stage work in batches.
- For each bucket, runs a driver that:
  - Iterates each top-level subdirectory and executes all `.py` files via `exec()` in-process (tolerating exceptions and `SystemExit`).
  - Produces one `.profraw` per subdirectory and merges them into a single `.profdata` per bucket.
- Copies merged `.profdata` (and a log if present) back to the host.
- `run.py` – CLI entrypoint to orchestrate collection inside Docker.
- `cov.py` – Core orchestration: bucketing, container lifecycle, copy-in/out, and running drivers.
- `scripts/acetest_driver.py` – Orchestrates per-bucket runs; launches `torch_driver.py` once per subdirectory and merges coverage.
- `scripts/torch_driver.py` – Executes all Python files within a directory using `exec()`; never fails the overall run.
- `dockerfile/torch-2.2.0-instrumented.Dockerfile` – PyTorch 2.2.0 image with Clang/LLVM coverage enabled.
- `build.sh` – Convenience build script for the provided images.
- Docker installed and running.
- Disk space for instrumented builds and coverage artifacts.
- Python 3.10+ on the host to run orchestration scripts.
Note: llvm-profdata is installed inside the instrumented container image and used there to merge coverage. If you run the drivers on the host, ensure llvm-profdata is available on your PATH.
The PyTorch 2.2.0 instrumented image is provided. Build it with:
```bash
bash build.sh
```

This creates the image `ncsu-swat/torch-2.2.0-instrumented`.
You can point to a directory of Python files, or a directory already organized into interval buckets like 0-60/, 60-120/, etc. If your target directory is not bucketed, the tool will bucket by file modification times (mtime).
Example target layout (not pre-bucketed):

```
target/
  project_a/...
  project_b/...
  single.py
```

Example bucketed layout (pre-categorized):

```
target/
  0-60/
    project_a/...
  60-120/
    project_b/...
```
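The mtime bucketing step can be illustrated with a minimal sketch. This is an assumption-level illustration of the behavior described above (the real logic lives in `cov.py` and may differ in detail); the function name `bucket_by_mtime` is hypothetical.

```python
import os
from collections import defaultdict

def bucket_by_mtime(paths, itv=60):
    """Group files into interval buckets named "<lo>-<hi>" by mtime
    offset from the oldest file. Illustrative sketch only."""
    if not paths:
        return {}
    base = min(os.path.getmtime(p) for p in paths)
    buckets = defaultdict(list)
    for p in paths:
        offset = int(os.path.getmtime(p) - base)
        lo = (offset // itv) * itv
        buckets[f"{lo}-{lo + itv}"].append(p)
    return dict(buckets)
```

A file modified 75 seconds after the oldest file would land in the `60-120` bucket with the default 60-second interval.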
Run the collector for PyTorch 2.2.0:
```bash
python3 run.py \
  --dll torch \
  --ver 2.2.0 \
  --target /absolute/path/to/target \
  --output _result \
  --baseline acetest \
  --itv 60 \
  --num_parallel 16
```

Arguments:
- `--dll`: Currently only `torch` is wired through the collector.
- `--ver`: Must match a built image tag, e.g., `2.2.0` → `ncsu-swat/torch-2.2.0-instrumented`.
- `--target`: Directory containing `.py` files or interval buckets. The tool will bucket if needed.
- `--output`: Host directory for results (default `_result`).
- `--baseline`: Driver set to use; pass `acetest` to use `scripts/acetest_driver.py`.
- `--itv`: Interval in seconds for bucketing if `--target` is not pre-bucketed (default 60).
- `--filter`: Optional regex to include specific files (matched against the path relative to `--target`).
- `--num_parallel`: Parallelism for per-subdirectory runs within a bucket.
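For the `--filter` option, a sketch of how regex matching against the path relative to `--target` might work (the function name `select_inputs` and the exact matching semantics are assumptions, not the tool's actual code):

```python
import os
import re

def select_inputs(target, pattern=None):
    """Collect .py files under target; if pattern is given, keep only
    files whose path relative to target matches the regex. Sketch of
    the --filter behavior described above."""
    selected = []
    for root, _, files in os.walk(target):
        for name in files:
            if not name.endswith(".py"):
                continue
            rel = os.path.relpath(os.path.join(root, name), target)
            if pattern is None or re.search(pattern, rel):
                selected.append(rel)
    return sorted(selected)
```

For example, `--filter 'project_a/'` would restrict collection to inputs under that subdirectory.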
- If `--target` is not already bucketed (folder names matching `^\d+-\d+$`), files are classified by mtime into interval buckets under `_result/<baseline>/`.
- A Docker container is started from the instrumented image.
- For each bucket:
  - The bucket directory is copied into `/root/inputs/<bucket>` in the container.
  - `scripts/acetest_driver.py` is executed inside the container. It:
    - Spawns one process per top-level subdirectory in the bucket.
    - For each subdirectory, runs `scripts/torch_driver.py` once to execute all `.py` files via `exec()`; exceptions are caught to avoid aborting coverage.
    - Sets `LLVM_PROFILE_FILE` so each subdirectory produces one `.profraw` at `/root/profraw/<bucket>/<subdir>/coverage.profraw`.
    - Merges all `.profraw` into `/root/profraw/<bucket>/merged.profdata`.
  - The merged `.profdata` (and `profile.log` if generated) is copied back to the host under `_result/profdata/<baseline>/<bucket>/`.
- The container is stopped and removed.
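The per-subdirectory environment and merge step above can be sketched as follows. Paths mirror the layout described in this section; the function names are hypothetical, and `-sparse` is a commonly used `llvm-profdata merge` flag rather than a confirmed detail of this tool.

```python
import os

def profile_env(bucket, subdir):
    """Environment for one subdirectory run: LLVM_PROFILE_FILE routes
    the raw profile to a subdir-specific path (sketch, paths assumed
    from the layout described above)."""
    env = dict(os.environ)
    env["LLVM_PROFILE_FILE"] = f"/root/profraw/{bucket}/{subdir}/coverage.profraw"
    return env

def merge_command(bucket, profraws):
    """llvm-profdata invocation merging a bucket's .profraw files into
    one merged.profdata (illustrative command construction only)."""
    return ["llvm-profdata", "merge", "-sparse", *profraws,
            "-o", f"/root/profraw/{bucket}/merged.profdata"]
```

The driver would pass `profile_env(...)` to each child process and run `merge_command(...)` once all subdirectories in the bucket have finished.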
Threading is constrained (OMP/MKL/BLAS/TF env vars) to reduce nondeterminism and resource contention.
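A minimal sketch of that threading constraint, assuming a typical set of OMP/MKL/BLAS/TF variables (the exact variable names and values used by the tool are not confirmed here):

```python
import os

# Pin math libraries to a single thread to reduce nondeterminism and
# resource contention across parallel subdirectory runs (assumed set).
THREAD_LIMIT_ENV = {
    "OMP_NUM_THREADS": "1",
    "MKL_NUM_THREADS": "1",
    "OPENBLAS_NUM_THREADS": "1",
    "TF_NUM_INTRAOP_THREADS": "1",
    "TF_NUM_INTEROP_THREADS": "1",
}

def constrained_env():
    """Copy of the current environment with thread limits applied."""
    env = dict(os.environ)
    env.update(THREAD_LIMIT_ENV)
    return env
```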
- Merged coverage: `_result/profdata/<baseline>/<bucket>/merged.profdata`
- Optional logs: `_result/profdata/<baseline>/<bucket>/profile.log`
- If bucketing was created by the tool: `_result/<baseline>/<bucket>/...` contains the copied Python inputs per bucket.
- `scripts/acetest_driver.py` (baseline `acetest`)
  - Discovers top-level subdirectories under `--inputs-dir`.
  - For each subdirectory, runs `torch_driver.py` once with `LLVM_PROFILE_FILE` pointing to a subdir-specific `.profraw`.
  - Merges all `.profraw` in the bucket into a single `.profdata`.
- `scripts/torch_driver.py`
  - Executes all `.py` files within a directory (recursively by default) using `compile()` + `exec()`.
  - Catches exceptions and treats `SystemExit` as non-fatal to keep the batch running.
  - Prints a summary and always returns success so coverage collection continues.
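The fault-tolerant execution pattern attributed to `scripts/torch_driver.py` can be sketched like this. It is a simplified reconstruction of the described behavior (execute every `.py` file via `compile()` + `exec()`, swallow exceptions and `SystemExit`, always succeed), not the driver's actual code.

```python
import os
import sys
import traceback

def run_all_py(directory):
    """Execute every .py file under directory in-process, tolerating
    exceptions and SystemExit so one bad input never aborts the batch."""
    ok = failed = 0
    for root, _, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(".py"):
                continue
            path = os.path.join(root, name)
            try:
                with open(path) as f:
                    code = compile(f.read(), path, "exec")
                exec(code, {"__name__": "__main__", "__file__": path})
                ok += 1
            except SystemExit:
                ok += 1  # sys.exit() in an input script is non-fatal
            except BaseException:
                failed += 1
                traceback.print_exc(file=sys.stderr)
    print(f"executed={ok + failed} ok={ok} failed={failed}")
    return 0  # always succeed so coverage collection continues
```

Because every input runs in the same process, coverage counters keep accumulating across scripts and are flushed to the `.profraw` path named by `LLVM_PROFILE_FILE` when the driver exits.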
- Docker image not found: run `bash build.sh` and ensure `--ver` matches the image tag.
- No Python files found: check your `--target` path or provide `--filter` if needed.
- Coverage not produced: ensure the instrumented container is used; the drivers set `LLVM_PROFILE_FILE` for you.
- Performance tuning: adjust `--num_parallel` and the per-subdirectory timeout (`scripts/acetest_driver.py` default is 180s).
- The TensorFlow pipeline is scaffolded with an instrumented Dockerfile. A dedicated collector can be added, similar to `TorchCovCollector`.