feat: OpenMP + AVX-512/AVX2 parallelization for multi-core performance #106

Open
aminkvh wants to merge 2 commits into mittinatten:master from aminkvh:master

aminkvh commented May 14, 2026

Summary

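This PR adds OpenMP multi-threading and AVX-512/AVX2 SIMD vectorization to
FreeSASA with no changes to the public API. On a 58,674-atom structure at 16
threads, Lee & Richards (L&R) runs 6.5-8.4× faster and Shrake & Rupley (S&R)
about 2.1× faster than with a single thread. All 54 unit tests pass with
bit-identical results.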

What Changed

  • src/sasa_lr.c — Replaced pthreads with OpenMP; removed the hard
    thread cap; added dynamic scheduling
  • src/sasa_sr.c — AVX-512/AVX2 SIMD inner distance check using a
    float32 SoA neighbor cache; scalar/auto-vectorization fallback for
    portability; C99-compatible aligned allocation (posix_memalign wrapper)
  • src/nb.c — Two-phase parallel neighbor-list construction (was
    entirely serial O(N²))
  • src/freesasa.c — New freesasa_calc_structures_parallel() for
    processing multiple structures (trajectory frames) concurrently
  • src/freesasa.h — Declaration for the above; no existing API changed
  • src/main.cc — CLI auto-detects core count via omp_get_max_threads()
  • CMakeLists.txt — CMake build system (existing autotools build unchanged)
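
For reviewers unfamiliar with the two-phase pattern mentioned for src/nb.c, here is a minimal sketch of the general idea: count neighbors per atom in parallel, turn the counts into offsets with a serial prefix sum, then fill the index array in parallel. It is illustrative only and is not the code in this PR. The names `nb_sketch`, `nb_sketch_build`, `nb_sketch_free` and `overlaps` are hypothetical, and the brute-force all-pairs test is an assumption chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical CSR-style neighbor list (not FreeSASA's own type):
   the neighbors of atom i are idx[start[i]] .. idx[start[i+1] - 1]. */
typedef struct {
    int n;
    int *start; /* n + 1 offsets */
    int *idx;   /* start[n] neighbor indices */
} nb_sketch;

/* Two atoms count as neighbors when their spheres overlap. */
static int overlaps(const double *xyz, const double *r, int i, int j)
{
    double dx = xyz[3 * i] - xyz[3 * j];
    double dy = xyz[3 * i + 1] - xyz[3 * j + 1];
    double dz = xyz[3 * i + 2] - xyz[3 * j + 2];
    double cut = r[i] + r[j];
    return dx * dx + dy * dy + dz * dz < cut * cut;
}

/* Returns 0 on success, -1 on allocation failure. */
int nb_sketch_build(nb_sketch *nb, const double *xyz, const double *r, int n)
{
    int i;
    size_t total;
    assert(nb != NULL && n >= 0);
    nb->n = n;
    nb->idx = NULL;
    nb->start = calloc((size_t)n + 1, sizeof *nb->start);
    if (nb->start == NULL) return -1;

    /* Phase 1: each iteration writes only start[i + 1], so there are no races. */
    #pragma omp parallel for schedule(dynamic, 64)
    for (i = 0; i < n; ++i) {
        int j, count = 0;
        for (j = 0; j < n; ++j)
            if (j != i && overlaps(xyz, r, i, j)) ++count;
        nb->start[i + 1] = count;
    }

    /* Serial prefix sum turns per-atom counts into offsets. */
    for (i = 0; i < n; ++i) nb->start[i + 1] += nb->start[i];

    total = (size_t)nb->start[n];
    nb->idx = malloc((total > 0 ? total : 1) * sizeof *nb->idx);
    if (nb->idx == NULL) { free(nb->start); nb->start = NULL; return -1; }

    /* Phase 2: each atom fills its own disjoint slice of idx. */
    #pragma omp parallel for schedule(dynamic, 64)
    for (i = 0; i < n; ++i) {
        int j, k = nb->start[i];
        for (j = 0; j < n; ++j)
            if (j != i && overlaps(xyz, r, i, j)) nb->idx[k++] = j;
    }
    return 0;
}

void nb_sketch_free(nb_sketch *nb)
{
    free(nb->start);
    free(nb->idx);
}
```

Because every thread writes to a region fixed by the prefix sum, the resulting list does not depend on the thread count or on the order in which iterations are scheduled. Without OpenMP enabled at compile time the pragmas are ignored and the same code runs serially.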

Performance (58,674-atom structure)

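| Algorithm                 | 1 thread | 8 threads | 16 threads    |
|---------------------------|----------|-----------|---------------|
| L&R standard (20 slices)  | 1.19 s   | 0.23 s    | 0.18 s (6.5×) |
| L&R high-res (100 slices) | 4.08 s   | 0.62 s    | 0.48 s (8.4×) |
| S&R (100 pts, AVX-512)    | 0.27 s   | 0.13 s    | 0.13 s (2.1×) |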

Backward Compatibility

  • All 54 existing unit tests pass with bit-identical results
  • OMP_NUM_THREADS=1 or parameters.n_threads=1 restores single-threaded behavior
  • No existing API function was modified or removed
  • FREESASA_DEF_NUMBER_THREADS is preserved and exported correctly
  • C99 standard maintained throughout (no C11 APIs used)
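
To illustrate how aligned allocation can stay within C99 plus POSIX (rather than C11 `aligned_alloc`), here is a minimal sketch of a `posix_memalign` wrapper feeding a float32 structure-of-arrays cache, with a branch-free scalar distance check that a compiler may auto-vectorize. It is not the code in src/sasa_sr.c. The names `soa_cache`, `aligned_malloc64`, `soa_cache_init`, `soa_cache_free` and `soa_point_buried` are hypothetical, and the 64-byte alignment is an assumption matching the AVX-512 register width.

```c
#define _POSIX_C_SOURCE 200112L /* expose posix_memalign under -std=c99 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical float32 SoA cache of one atom's neighbors:
   separate arrays for x, y, z and squared radius. */
typedef struct {
    size_t n;
    float *x, *y, *z, *r2;
} soa_cache;

/* 64-byte aligned allocation without C11; release with free(). */
static void *aligned_malloc64(size_t bytes)
{
    void *p = NULL;
    if (bytes == 0) bytes = 64;
    return posix_memalign(&p, 64, bytes) == 0 ? p : NULL;
}

void soa_cache_free(soa_cache *c)
{
    free(c->x);
    free(c->y);
    free(c->z);
    free(c->r2);
    c->x = c->y = c->z = c->r2 = NULL;
    c->n = 0;
}

/* Returns 0 on success, -1 if any allocation fails. */
int soa_cache_init(soa_cache *c, size_t n)
{
    assert(c != NULL);
    c->n = n;
    c->x = aligned_malloc64(n * sizeof(float));
    c->y = aligned_malloc64(n * sizeof(float));
    c->z = aligned_malloc64(n * sizeof(float));
    c->r2 = aligned_malloc64(n * sizeof(float));
    if (!c->x || !c->y || !c->z || !c->r2) {
        soa_cache_free(c);
        return -1;
    }
    return 0;
}

/* Returns 1 if the test point lies inside any cached neighbor sphere.
   The loop has no early exit, so every element is handled identically. */
int soa_point_buried(const soa_cache *c, float px, float py, float pz)
{
    size_t k;
    int buried = 0;
    for (k = 0; k < c->n; ++k) {
        float dx = c->x[k] - px;
        float dy = c->y[k] - py;
        float dz = c->z[k] - pz;
        buried |= (dx * dx + dy * dy + dz * dz < c->r2[k]);
    }
    return buried;
}
```

Keeping each coordinate in its own contiguous array means a SIMD implementation could load 8 (AVX2) or 16 (AVX-512) neighbors per instruction, while this scalar form serves as the portable fallback.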

Notes

I noticed in CONTRIBUTING.md that new features require unit tests. The new
freesasa_calc_structures_parallel() function is currently covered by the
Python test suite. I am happy to add a pure C unit test if you would prefer
that before merging.
