Avoid memory-universe copies during conformational topology discovery #371
Merged
jimboid
approved these changes
Jun 24, 2026
jimboid
left a comment
Member
Happy to approve this PR. We discussed this both at the catchup on Monday and at the all-hands meeting on Tuesday this week.
…om-axes Cache customised united-atom axes topology for frame covariance
Summary
This PR improves conformational dihedral topology discovery by avoiding repeated construction of standalone in-memory MDAnalysis universes during static topology setup.
This optimisation builds directly on the previous conformational refactor PRs. Those changes separated conformational analysis from the static LevelDAG stage, introduced a dedicated ConformationDAG, and split the conformational workflow into clearer phases for topology discovery, angle collection, peak construction, state assignment, and reduction. That architecture made it possible to profile the conformational path more precisely and identify the true bottleneck.
While investigating Dask-backed conformational parallelisation, profiling showed that Dask introduced additional runtime overhead for the current conformational workload. Further investigation showed that the main bottleneck was not the serial conformational algorithm itself, but repeated MDAnalysis memory-universe creation during molecule and residue topology discovery.
Instead of adding Dask to conformational analysis, this PR applies a smaller targeted optimisation: topology discovery now uses lightweight AtomGroup selections where possible, while preserving the existing serial conformational workflow and regression behaviour.
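As a rough illustration of the idea only (this is not the project's code and does not use MDAnalysis), the sketch below uses two hypothetical stand-in classes, ToyUniverse and ToyAtomGroup. It contrasts copying a fragment's coordinates for every frame into a standalone container with returning a lightweight selection that stores only atom indices and reads from the parent on demand.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Coordinate = Tuple[float, float, float]


@dataclass
class ToyUniverse:
    """Hypothetical stand-in for a trajectory container: frames[frame][atom]."""

    frames: List[List[Coordinate]]

    def copy_fragment(self, atom_indices: Sequence[int]) -> "ToyUniverse":
        # Expensive path: copy the selected atoms out of every frame
        # into a brand-new standalone container.
        return ToyUniverse(
            [[frame[i] for i in atom_indices] for frame in self.frames]
        )

    def select(self, atom_indices: Sequence[int]) -> "ToyAtomGroup":
        # Cheap path: remember only which atoms were selected.
        return ToyAtomGroup(self, tuple(atom_indices))


@dataclass
class ToyAtomGroup:
    """Hypothetical lightweight selection that refers back to its parent."""

    universe: ToyUniverse
    indices: Tuple[int, ...]

    def positions(self, frame: int) -> List[Coordinate]:
        # Coordinates are looked up in the parent only when requested.
        return [self.universe.frames[frame][i] for i in self.indices]


def make_universe(n_frames: int, n_atoms: int) -> ToyUniverse:
    # Synthetic coordinates: (frame index, atom index, 0.0).
    return ToyUniverse(
        [[(float(f), float(a), 0.0) for a in range(n_atoms)]
         for f in range(n_frames)]
    )


universe = make_universe(n_frames=4, n_atoms=6)
residue_atoms = [3, 4, 5]

standalone = universe.copy_fragment(residue_atoms)  # copies 4 frames x 3 atoms
view = universe.select(residue_atoms)               # stores 3 indices only

# Number of coordinate triples duplicated by the copying path.
copied_coordinates = sum(len(frame) for frame in standalone.frames)
```

Both paths expose the same coordinates for a given frame, but the copying path scales with the number of frames for every fragment extracted, while the selection path does not. That is the cost this PR avoids during static topology setup.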
Changes
Lightweight topology fragment extraction:
… extract_fragment_atomgroup(...) to UniverseOperations.extract_fragment(...) but returns an AtomGroup instead of building a standalone in-memory universe.
Avoid memory-universe copies in dihedral topology discovery:
… UniverseOperations.select_atoms(...) with lightweight AtomGroup selection.
Preserve residue and united-atom topology behaviour:
… resindex selection strings when operating on lightweight molecule AtomGroups.
… MDAnalysis.analysis.dihedrals.Dihedral.
Update unit test coverage:
… UniverseOperations.select_atoms(...).
Impact
… Dihedral.run(...) workflow.