PERF: AlignSections Filters OoC Optimization by joeykleingers · Pull Request #1560 · BlueQuartzSoftware/simplnx

joeykleingers · 2026-03-05T15:10:16Z

Summary

Adds slice-buffered OOC algorithm paths for the 4 AlignSections filters, using dual-dispatch (Strategy C) to preserve the original in-core code untouched while adding OOC-optimized variants.

Changes

Base class (AlignSections.cpp)

New AlignSectionsTransferDataOocImpl<T> dispatched when any cell array is OOC
Reads each Z-slice into a local buffer, applies the 2D X/Y shift in memory, writes back — eliminates per-tuple chunk thrashing
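
The slice-buffered transfer described above can be sketched as follows. This is a minimal illustration, not the code in AlignSections.cpp: `shiftSlicesBuffered` is a hypothetical name, a `std::vector` stands in for the cell array, and both the shift sign convention (source = destination + shift) and the zero-fill of out-of-bounds cells are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a cell array. In the real filter this would be a
// DataStore whose per-element access may be expensive when the data is OOC.
template <typename T>
void shiftSlicesBuffered(std::vector<T>& volume, std::size_t dimX, std::size_t dimY, std::size_t dimZ,
                         const std::vector<std::int64_t>& xShifts, const std::vector<std::int64_t>& yShifts)
{
  assert(volume.size() == dimX * dimY * dimZ);
  assert(xShifts.size() == dimZ && yShifts.size() == dimZ);

  const std::size_t sliceSize = dimX * dimY;
  std::vector<T> buffer(sliceSize);

  for(std::size_t z = 0; z < dimZ; z++)
  {
    const std::size_t sliceStart = z * sliceSize;

    // 1) One sequential read of the whole slice into local memory.
    for(std::size_t i = 0; i < sliceSize; i++)
    {
      buffer[i] = volume[sliceStart + i];
    }

    // 2) One sequential write-back. The shifted source is looked up in the
    //    buffer, so the store itself is never accessed out of order.
    for(std::size_t y = 0; y < dimY; y++)
    {
      for(std::size_t x = 0; x < dimX; x++)
      {
        const std::int64_t srcX = static_cast<std::int64_t>(x) + xShifts[z];
        const std::int64_t srcY = static_cast<std::int64_t>(y) + yShifts[z];
        const bool inBounds = srcX >= 0 && srcX < static_cast<std::int64_t>(dimX) && srcY >= 0 && srcY < static_cast<std::int64_t>(dimY);
        // Assumption of this sketch: cells with no in-bounds source are value-initialized.
        volume[sliceStart + y * dimX + x] = inBounds ? buffer[static_cast<std::size_t>(srcY) * dimX + static_cast<std::size_t>(srcX)] : T{};
      }
    }
  }
}
```

Each slice is touched in exactly two linear passes, so a chunked store would see sequential access only; the random access implied by the shift stays inside the in-memory buffer.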

AlignSectionsMisorientation

New findShiftsOoc() buffers 2 adjacent Z-slices of quats, cellPhases, and mask before the convergence loop
Pre-loads first reference slice, then std::swap cur→ref each iteration to avoid re-reading the reference from DataStore
Eliminates repeated ZarrStore reads during the 7×7 candidate shift grid convergence
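
A minimal sketch of the reference-slice swap, under stated assumptions: `CountingStore` and `forEachAdjacentPair` are hypothetical names, a single float array stands in for quats, cellPhases, and mask, and the slice iteration order is chosen for simplicity rather than taken from findShiftsOoc().

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical store that counts slice reads, standing in for an expensive OOC read.
struct CountingStore
{
  std::vector<float> data;
  std::size_t sliceSize = 0;
  std::size_t sliceReads = 0;

  void readSlice(std::size_t z, std::vector<float>& out)
  {
    assert((z + 1) * sliceSize <= data.size());
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(z * sliceSize);
    out.assign(first, first + static_cast<std::ptrdiff_t>(sliceSize));
    sliceReads++;
  }
};

// Visits every adjacent (reference, current) slice pair while reading each
// slice from the store exactly once.
template <typename PairFn>
void forEachAdjacentPair(CountingStore& store, std::size_t numSlices, PairFn&& compare)
{
  if(numSlices < 2)
  {
    return;
  }
  std::vector<float> ref;
  std::vector<float> cur;
  store.readSlice(0, ref); // pre-load the first reference slice

  for(std::size_t z = 1; z < numSlices; z++)
  {
    store.readSlice(z, cur);
    compare(z, ref, cur); // e.g. the candidate-shift search would run here, entirely in memory
    std::swap(ref, cur);  // current slice becomes the next reference; no re-read, no copy
  }
}
```

With N slices this performs N slice reads instead of 2(N - 1), which is the halving of per-iteration reads the swap is meant to achieve.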

AlignSectionsMutualInformation

New formFeaturesSectionsOoc() buffers one Z-slice of quats, cellPhases, and mask for the per-slice 2D flood-fill segmentation
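
To illustrate why buffering one slice helps here, the sketch below runs a 2D flood fill entirely on in-memory slice buffers. It is a simplification, not the filter's algorithm: `labelSlice` is a hypothetical name, and neighbors are grouped by equal phase where the real filter would compare orientations from the quats array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Labels 4-connected regions of equal phase in one buffered slice.
// Returns the number of features; featureIds receives 1-based ids, 0 = masked out.
inline std::int32_t labelSlice(const std::vector<std::int32_t>& phases, const std::vector<bool>& mask, std::size_t dimX, std::size_t dimY,
                               std::vector<std::int32_t>& featureIds)
{
  assert(phases.size() == dimX * dimY && mask.size() == dimX * dimY);
  featureIds.assign(dimX * dimY, 0);

  std::int32_t featureCount = 0;
  std::vector<std::size_t> stack;

  for(std::size_t seed = 0; seed < phases.size(); seed++)
  {
    if(!mask[seed] || featureIds[seed] != 0)
    {
      continue;
    }
    featureCount++;
    featureIds[seed] = featureCount;
    stack.push_back(seed);

    while(!stack.empty())
    {
      const std::size_t index = stack.back();
      stack.pop_back();
      const std::size_t x = index % dimX;
      const std::size_t y = index / dimX;

      // Random-access neighbor checks hit only the local buffers.
      auto tryVisit = [&](std::size_t neighbor) {
        if(mask[neighbor] && featureIds[neighbor] == 0 && phases[neighbor] == phases[index])
        {
          featureIds[neighbor] = featureCount;
          stack.push_back(neighbor);
        }
      };
      if(x > 0) { tryVisit(index - 1); }
      if(x + 1 < dimX) { tryVisit(index + 1); }
      if(y > 0) { tryVisit(index - dimX); }
      if(y + 1 < dimY) { tryVisit(index + dimX); }
    }
  }
  return featureCount;
}
```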

AlignSectionsFeatureCentroid & AlignSectionsListFilter

Benefit from transfer phase optimization only (findShifts access patterns are already sequential/trivial)

Tests

All 9 correctness tests now exercise both in-core and OOC algorithm paths via GENERATE(false, true) + ForceOocAlgorithmGuard
Benchmark tests (200³ programmatic datasets) left without GENERATE for clean timing
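
The pattern of running one test body under both algorithm paths can be sketched without Catch2 as below. Everything here is hypothetical and for illustration only: `forceOocAlgorithmFlag`, `ScopedForceOoc`, and `runBothPaths` are not the project's ForceOocAlgorithmGuard or its API, and the loop merely imitates what GENERATE(false, true) does.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical process-wide switch; the real project may store this differently.
inline bool& forceOocAlgorithmFlag()
{
  static bool s_Flag = false;
  return s_Flag;
}

// Hypothetical RAII guard: sets the flag for the lifetime of the object and
// restores the previous value on scope exit, even if the test body throws.
class ScopedForceOoc
{
public:
  explicit ScopedForceOoc(bool force)
  : m_Previous(forceOocAlgorithmFlag())
  {
    forceOocAlgorithmFlag() = force;
  }
  ~ScopedForceOoc()
  {
    forceOocAlgorithmFlag() = m_Previous;
  }
  ScopedForceOoc(const ScopedForceOoc&) = delete;
  ScopedForceOoc& operator=(const ScopedForceOoc&) = delete;

private:
  bool m_Previous;
};

// Runs the same body once per algorithm path and returns the number of runs.
template <typename TestFn>
int runBothPaths(TestFn&& testBody)
{
  int runs = 0;
  for(bool forceOoc : {false, true})
  {
    const ScopedForceOoc guard(forceOoc);
    testBody(forceOocAlgorithmFlag());
    runs++;
  }
  return runs;
}
```

Scoping the override to a guard object keeps one test's setting from leaking into the next, which matters when both paths run in the same process.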

Benchmark Results (200×200×200)

Filter	In-Core Before → After	OOC Before → After	OOC Speedup
AlignSectionsMisorientation	0.74s → 0.79s (~1.0x)	32.89s → 16.14s	2.0x
AlignSectionsMutualInformation	0.49s → 0.52s (~1.0x)	15.61s → 15.14s	~1.0x
AlignSectionsFeatureCentroid	0.24s → 0.28s (~1.0x)	8.41s → 8.82s	~1.0x
AlignSectionsListFilter	0.22s → 0.25s (~1.0x)	7.50s → 7.95s	~1.0x

Optimization Ceiling Analysis

The OOC speedups are more modest than Groups B/C/D because the transfer phase dominates OOC runtime and is bottlenecked by ZarrStore's per-element overhead (~55–75ns per operator[]: mutex lock/unlock + chunk lookup vs ~1ns for in-core DataStore). The Misorientation filter shows the most benefit (2.0x) because its findShifts convergence loop re-reads the same 2 slices many times — slice buffering plus reference-slice swap (reusing the previous iteration's current-slice buffer as the next iteration's reference) eliminates that repeated I/O.
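
As a rough worked example of those per-element figures: a single pass that touches every cell of one 200×200×200 array once is 8,000,000 accesses, which comes to about 0.44–0.60 s at 55–75 ns each versus about 0.008 s at 1 ns. The helper below only reproduces that arithmetic; `passSeconds` is a hypothetical name, and the estimate ignores caching and everything other than the quoted per-access cost.

```cpp
#include <cassert>
#include <cstdint>

// Number of cells in the 200x200x200 benchmark volume.
constexpr std::uint64_t k_NumCells = 200ULL * 200ULL * 200ULL; // 8,000,000

// Estimated seconds for a pass making numAccesses element accesses
// at a fixed cost of nanosPerAccess nanoseconds each.
constexpr double passSeconds(std::uint64_t numAccesses, double nanosPerAccess)
{
  return static_cast<double>(numAccesses) * nanosPerAccess * 1.0e-9;
}
```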

Further improvement requires deeper OOC infrastructure changes:

Bulk read/write API on AbstractDataStore — eliminates ~47.8M mutex lock/unlock cycles per filter (~1–2s savings)
Chunk-level bulk transfer in FileCore — bypasses per-element chunk lookup entirely (estimated 3–5x transfer phase improvement)
Larger FIFO cache or per-array cache isolation — enables parallel OOC array processing
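
The first item can be illustrated with a toy store that counts lock acquisitions. This is a sketch of the idea only: `LockedStore`, `getValue`, and `readRange` are hypothetical and do not reflect the AbstractDataStore interface or any planned signature.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical store guarded by a mutex; not the AbstractDataStore interface.
class LockedStore
{
public:
  explicit LockedStore(std::vector<float> values)
  : m_Values(std::move(values))
  {
  }

  // Per-element access: one lock/unlock cycle per value.
  float getValue(std::size_t index)
  {
    const std::lock_guard<std::mutex> lock(m_Mutex);
    m_LockCount++;
    return m_Values[index];
  }

  // Hypothetical bulk read: one lock/unlock cycle for the whole range.
  void readRange(std::size_t start, std::size_t count, std::vector<float>& out)
  {
    const std::lock_guard<std::mutex> lock(m_Mutex);
    m_LockCount++;
    assert(start + count <= m_Values.size());
    const auto first = m_Values.begin() + static_cast<std::ptrdiff_t>(start);
    out.assign(first, first + static_cast<std::ptrdiff_t>(count));
  }

  // Read without locking; adequate for this single-threaded sketch only.
  std::size_t lockCount() const
  {
    return m_LockCount;
  }

private:
  std::vector<float> m_Values;
  std::mutex m_Mutex;
  std::size_t m_LockCount = 0;
};
```

Reading a slice of N values through `getValue` costs N lock cycles, while `readRange` costs one regardless of N; that difference is the saving the bulk API is meant to capture.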

Test Plan

All 9 correctness tests pass on in-core build (simplnx-Rel)
All 9 correctness tests pass on OOC build (simplnx-ooc-Rel)
GENERATE(false, true) exercises both algorithm paths in both builds
Benchmark tests confirm no meaningful in-core regression (all four filters stay within 0.05s of their baseline timings)

Add slice-buffered OOC paths for the AlignSections filter family:
- AlignSectionsMisorientation: OOC findShiftsOoc() with 2-slice quats/phases/mask buffering (1.6x OOC speedup)
- AlignSectionsMutualInformation: OOC formFeaturesSectionsOoc() with per-slice buffering
- AlignSectionsFeatureCentroid: Transfer phase optimization only
- AlignSectionsListFilter: Transfer phase optimization only

Base class AlignSections::execute() now dispatches to AlignSectionsTransferDataOocImpl when any cell array is OOC, using sequential read-into-buffer then write-back-shifted pattern that eliminates per-tuple chunk thrashing.

All correctness tests now exercise both in-core and OOC algorithm paths via GENERATE(false, true) + ForceOocAlgorithmGuard.

Signed-off-by: BlueQuartz Software <info@bluequartz.net>

…ndShifts

Pre-load the first reference slice before the convergence loop and swap cur→ref buffers at each iteration instead of re-reading the reference from DataStore. Halves per-iteration DataStore reads, improving OOC Misorientation from 21s to 16s (2.0x vs baseline).

Add Doxygen comments for private OOC methods in Misorientation and MutualInformation headers.

- Remove unused #include <iostream> from AlignSectionsMisorientation.cpp
- Remove duplicate cancel check (m_ShouldCancel before getCancel())
- Fix local variable naming: m_CellPhases/m_CrystalStructures → cellPhases/crystalStructures in formFeaturesSections
- Use hidden Catch2 tag [.Benchmark] so benchmark tests don't run in default CI
- Run clang-format on all PR files

joeykleingers added the Out-of-Core label Mar 5, 2026

joeykleingers requested a review from imikejackson March 5, 2026 15:10

joeykleingers force-pushed the worktree-OptimizeGroupE branch from 2504647 to 200daaa March 5, 2026 15:11

joeykleingers force-pushed the worktree-OptimizeGroupE branch from 200daaa to 8251640 March 5, 2026 15:12

imikejackson changed the title ~~PERF: OOC optimization for AlignSections family (Group E)~~ PERF: OoC optimization for AlignSections Filters Mar 5, 2026

joeykleingers added 2 commits March 5, 2026 12:53

imikejackson changed the title ~~PERF: OoC optimization for AlignSections Filters~~ PERF: AlignSections Filters OoC Optimization Mar 9, 2026

joeykleingers marked this pull request as draft March 10, 2026 01:16

joeykleingers wants to merge 3 commits into BlueQuartzSoftware:develop from joeykleingers:worktree-OptimizeGroupE
