Conversation
There was a problem hiding this comment.
Pull request overview
This PR strengthens ARC’s Transition State (TS) validation logic by improving the statistical robustness of the NMD (Normal Mode Displacement) check and making the IRC (Intrinsic Reaction Coordinate) endpoint validation chemically aware via molecular graph isomorphism, with a fallback to the previous bond-list approach.
Changes:
- Update NMD TS validation to use robust statistics (median + MAD) and add a formed/broken bond directionality gate.
- Update IRC endpoint validation to perceive fragments and validate reactant/product identity via graph isomorphism (with a connectivity fallback).
- Add/adjust unit tests for the new NMD and IRC behaviors.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
arc/checks/ts.py |
Adds fragment perception + isomorphism-based IRC validation with fallback. |
arc/checks/ts_test.py |
Adds tests for fragment perception, fragment/species matching, and IRC endpoint swapping. |
arc/checks/nmd.py |
Adds directionality gate, module constants, and switches baseline/spread to median+MAD. |
arc/checks/nmd_test.py |
Updates expectations for new robust stats and adds tests for directionality + constants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #838 +/- ##
==========================================
+ Coverage 58.66% 58.70% +0.04%
==========================================
Files 97 97
Lines 29197 29308 +111
Branches 7752 7782 +30
==========================================
+ Hits 17127 17205 +78
- Misses 9872 9891 +19
- Partials 2198 2212 +14
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
NMD Improvements (arc/checks/ts.py): - Replaced mean-based displacement baseline with Median + MAD (Median Absolute Deviation) to insulate the check from floppy rotors. - Implemented a mandatory Directionality Check: ensures formed and broken bonds move in anti-correlated directions along the imaginary mode. - Separated primary (formed/broken) from secondary (changed-order) bonds in the sigma test to reflect physical displacement scales. - Set a global numerical noise floor (1e-4 A) for the Hessian and raised the default validation threshold to 3.0 sigma.
IRC Improvements (arc/checks/ts.py): - Upgraded 'check_irc_species_and_rxn' to use molecular graph isomorphism as the primary validation method. - Added 'perceive_irc_fragments' using DFS-based connected component detection to handle multi-species reactions (e.g., A + B) reliably. - Implemented permutation-based matching to verify that the set of perceived IRC fragments matches the expected reaction species. - Retained distance-matrix bond-list comparison as a robust fallback.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| frag_symbols = tuple(symbols[i] for i in frag_idx) | ||
| frag_coords = tuple(coords[i] for i in frag_idx) | ||
| frag_xyz = xyz_from_data(coords=frag_coords, symbols=frag_symbols) | ||
| frag_charge = charge if len(fragment_indices) == 1 else 0 | ||
| mol = perceive_molecule_from_xyz(frag_xyz, charge=frag_charge, n_fragments=1) | ||
| if mol is None: |
There was a problem hiding this comment.
For multi-fragment IRC endpoints, _perceive_irc_fragments() forces frag_charge = 0, which prevents correct perception of ionic fragments (e.g., quaternary ammonium + hydroxide, net charge 0 but fragment charges +1/-1). This will make the isomorphism path fail for charged multi-species reactions and fall back to the bond-list check, losing the intended validation of formal charges/radicals. Consider leveraging perceive_molecule_from_xyz(..., charge=total_charge, n_fragments=len(fragment_indices)) so charge can be distributed across fragments (see _combine_fragments in arc/species/perceive.py), then split the perceived disconnected molecule into fragments, or otherwise implement/borrow a charge-splitting strategy instead of hard-coding 0.
| """ | ||
| diffs = list() | ||
| report = None | ||
| amplitude = amplitude or 1.0 |
There was a problem hiding this comment.
I know that this code isn't new or being updated but I do think it's worth pointing out that since amplitude = amplitude or 1.0 then amplitude can never be None so the first half check we have at line 445 is redundant
| continue | ||
| diff = abs(r_bond_length - p_bond_length) | ||
| if amplitude is not None and return_none_if_change_is_insignificant \ | ||
| and abs(diff * amplitude / r_bond_length) < 0.05 and abs(diff * amplitude / p_bond_length) < 0.05: |
There was a problem hiding this comment.
Correct me if I'm but since diff is the change between the two xyzs and those geometries were already generated with amplitude applied in get_displace_xyzs(), hasn't the amplitude already influenced the result. So we are scaling it a second time. Is that intended?
This PR significantly improves how ARC validates Transition States by making the NMD (Normal Mode Displacement) check more statistically sound and the IRC (Intrinsic Reaction Coordinate) check chemically aware.
1. NMD: Smarter "Sanity Checks"
The NMD check is our first line of defense. It now uses Median + MAD (Median Absolute Deviation) statistics instead of simple averages.
2. IRC: From Geometry to Chemistry
Previously, the IRC check only looked at a "list of distances." It now performs Full Molecular Graph Isomorphism.
3. Verification
arc/checks/ts_test.pypass.