Skip to content

refactor: Detect molecules#2528

Open
RobBuchananCompPhys wants to merge 8 commits into
develop2from
dissolve2/refactor-detect-molecules
Open

refactor: Detect molecules#2528
RobBuchananCompPhys wants to merge 8 commits into
develop2from
dissolve2/refactor-detect-molecules

Conversation

@RobBuchananCompPhys

Copy link
Copy Markdown
Contributor

No description provided.

@RobBuchananCompPhys RobBuchananCompPhys force-pushed the dissolve2/refactor-detect-molecules branch from 353a3e6 to cd27f8f Compare June 23, 2026 14:56
@RobBuchananCompPhys RobBuchananCompPhys changed the base branch from dissolve2/detect-molecules to develop2 June 23, 2026 14:56
@RobBuchananCompPhys RobBuchananCompPhys force-pushed the dissolve2/refactor-detect-molecules branch from cd27f8f to 150dd4c Compare June 23, 2026 15:02
@RobBuchananCompPhys RobBuchananCompPhys marked this pull request as ready for review June 23, 2026 15:04
@RobBuchananCompPhys RobBuchananCompPhys force-pushed the dissolve2/refactor-detect-molecules branch 2 times, most recently from 2b5a4f2 to 6e08831 Compare June 25, 2026 09:27
@RobBuchananCompPhys RobBuchananCompPhys force-pushed the dissolve2/refactor-detect-molecules branch from a160f32 to 9303bb2 Compare June 25, 2026 10:46

@trisyoungs trisyoungs left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is excellent and a whole lot cleaner than the old one, but I think there are some looping / workflow errors to address!

Comment thread src/classes/fragment.h

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still need a rebase? This was in #2530 ...

// Inputs
addInput<Structure>("Structure", "Input structure", inputStructure_);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*
* Definition
*/

std::string_view DetectMoleculesNode::type() const { return "DetectMolecules"; }

std::string_view DetectMoleculesNode::summary() const { return "Detect molecular species within a structure"; }

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/*
* Processing
*/

Comment on lines +28 to +29
for (int i = 0; i < structure.nAtoms(); i++)
fragments.emplace_back(fragment(i));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this function start from the next un-selected atom? Need something like a std::set to track which atoms have been detected within instances (a bug that doesn't show up for the atomic NaCl or MgO cases!)

detectedStructures_.clear();

// Return all discovered molecular fragment index vectors
auto allFragmentIndices = findMolecularFragments(inputStructure_);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're pre-detecting all bound fragments here (which is a good idea)....


std::set<const StructureAtom *> atomMask;

for (int i = 0; i < inputStructure_.nAtoms(); i++)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...why is this a loop over structure atoms and not fragments?

Comment on lines +94 to +138
// Set up the return value and bind its contents
NETADefinition bestNETA;
std::vector<StructureAtom *> rootAtoms;

// Maintain a set of atoms matched by any NETA description we generate
std::set<StructureAtom *> alreadyMatched;

// Skip this atom?
if (alreadyMatched.find(fragmentAtom) != alreadyMatched.end())
continue;

// Create a NETA definition with this atom as the root
NETADefinition neta;
neta.create(static_cast<AtomBase *>(fragmentAtom), std::nullopt,
Flags<NETADefinition::NETACreationFlags>(NETADefinition::NETACreationFlags::ExplicitHydrogens,
NETADefinition::NETACreationFlags::IncludeRootElement));

// Apply this match over the whole species
std::vector<StructureAtom *> currentRootAtoms;
for (auto fragAtomIndex : fragmentIndices)
{
const auto fragmentAtom = inputStructure_.atom(fragAtomIndex);
if (neta.matches(fragmentAtom))
{
currentRootAtoms.push_back(fragmentAtom);
alreadyMatched.insert(fragmentAtom);
}
}

// Is this a better description?
auto better = false;
if (rootAtoms.empty() || currentRootAtoms.size() < rootAtoms.size())
better = true;
else if (currentRootAtoms.size() == rootAtoms.size())
{
// Replace the current match if there are more bonds on the current atom.
if (fragmentAtom->nBonds() > rootAtoms.front()->nBonds())
better = true;
}

if (better)
{
bestNETA = neta;
rootAtoms = currentRootAtoms;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole NETA creation part needs to run on each atom within a fragment, reason being to find the best, most concrete way of describing uniquely the bonding / atoms within it. Then it can be applied to each fragment (of the same size) in turn to see if the NETA matches.

Comment on lines +180 to +182
// Unfold all detected structures
for (auto &structure : detectedStructures_)
structure.unFold();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This step needs to happen before you store the coordinates of the instances at the end of the loop above. Could happen immediately after the fragment detection at the top of the routine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants