Releases: asad/SMSD

SMSD Pro 7.1.1

14 Apr 19:00

Bug-fix patch on top of v7.1.0. No new features, no public API breakage.

Fixed

  • Cross-language ECFP / FCFP fingerprint parity between Java, C++, and
    Python (two long-standing Java drifts at radius ≥ 1).
  • Java canonical SMILES writer: bond symbol for aromatic-adjacent
    single bonds and implicit H count inside stereo brackets.
  • Python smsd.canonical_smiles(smi) / smsd.to_smiles(smi) raised
    TypeError on string input. Both now accept str or MolGraph.
  • MatchResult.overlapCoefficient returned the wrong similarity
    metric. Both MatchResult.overlap and MatchResult.overlapCoefficient
    now return Szymkiewicz-Simpson overlap as documented; the new
    MatchResult.tanimoto attribute exposes the Jaccard value.
  • Canonical SMILES writer now emits [nH] for pyrrole-type aromatic
    nitrogen, so output kekulizes cleanly in downstream readers.
  • FP-level smsd.overlapCoefficient / count_overlapCoefficient
    camelCase aliases now return Simpson overlap as documented.
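
To make the difference between the two metrics concrete, here is a minimal sketch on plain Python sets of "on" bits. It does not call SMSD; the function names and the convention of returning 0.0 for empty input are assumptions made for illustration.

```python
def simpson_overlap(a: set[int], b: set[int]) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0  # assumed convention for empty input
    return len(a & b) / min(len(a), len(b))


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard (Tanimoto on binary sets): |A & B| / |A | B|."""
    union = a | b
    if not union:
        return 0.0  # assumed convention for empty input
    return len(a & b) / len(union)


query_bits = {1, 2, 3}
target_bits = {1, 2, 3, 4, 5, 6}
print(simpson_overlap(query_bits, target_bits))  # 1.0
print(jaccard(query_bits, target_bits))          # 0.5
```

When one bit set is contained in the other, the overlap is 1.0 while Jaccard still penalises the size difference, which is why returning one in place of the other was a visible bug.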

Verified

  • Python pytest: 603 passed, 6 skipped, 0 failures.
  • Java JUnit: 581 tests, 0 failures, 0 errors.

Apache 2.0 — see NOTICE.

SMSD Pro 7.1.0

14 Apr 18:58

Copyright (c) 2018-2026 Syed Asad Rahman — BioInception PVT LTD

What's New

Comprehensive test suite — 597 tests across 9 test files covering MCS, substructure search, fingerprints, similarity metrics, file I/O, scaffolds, coordinate transforms, depiction, SMARTS, and batch operations.

Bug Fixes

  • tanimoto_coefficient / overlap_coefficient now correctly handle sparse count input from circular_fingerprint_counts()
  • counts_to_array() now accepts both dict and list[tuple] formats
  • fingerprint() no longer raises ValueError for kind='ecfp', 'fcfp', or 'torsion'
  • Fingerprint radius capped at molecule size — prevents bit-vector saturation for out-of-range radius values
  • Documentation API names verified and corrected across all public docs
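
As a rough illustration of what accepting both sparse formats involves, the sketch below normalises either a dict or a list of (bit, count) tuples and then computes a count-based Tanimoto as sum(min) / sum(max). This is one common definition and not necessarily the one SMSD uses; as_count_dict and count_tanimoto are hypothetical names, not part of the smsd package.

```python
from collections.abc import Mapping


def as_count_dict(counts) -> dict[int, int]:
    """Normalise sparse counts given as {bit: n} or [(bit, n), ...]."""
    items = counts.items() if isinstance(counts, Mapping) else counts
    merged: dict[int, int] = {}
    for bit, n in items:
        merged[bit] = merged.get(bit, 0) + n  # repeated bits are summed
    return merged


def count_tanimoto(a, b) -> float:
    """Count-based Tanimoto: sum(min) / sum(max) over all features."""
    ca, cb = as_count_dict(a), as_count_dict(b)
    keys = ca.keys() | cb.keys()
    num = sum(min(ca.get(k, 0), cb.get(k, 0)) for k in keys)
    den = sum(max(ca.get(k, 0), cb.get(k, 0)) for k in keys)
    return num / den if den else 0.0


# Mixed input formats give the same answer as two dicts would.
print(count_tanimoto({10: 2, 20: 1}, [(10, 1), (30, 1)]))  # 0.25
```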

Install

Java (Maven):
```xml
<dependency>
  <groupId>com.bioinceptionlabs</groupId>
  <artifactId>smsd</artifactId>
  <version>7.1.0</version>
</dependency>
```

Python:
```bash
pip install smsd==7.1.0
```

JAR:
```bash
java -jar smsd-7.1.0-jar-with-dependencies.jar --help
```

Platforms

  • Java 25+, C++17, Python 3.10–3.13
  • macOS (arm64, x86_64), Linux (x86_64, aarch64), Windows (AMD64)
  • GPU: Metal (Apple Silicon), CUDA (Volta+)

Apache-2.0 License

SMSD Pro 7.0.0

13 Apr 16:29

Syed Asad Rahman — BioInception PVT LTD

Major release: unified Python and Java API, clean break from legacy aliases,
full Java parity for convenience methods, all tests green.

What's New

Unified Python API — clean break

Two entry points replace all previous aliases:

```python
import smsd

# MCS — returns dict (single) or list[dict] (multiple)
mcs = smsd.find_mcs("c1ccccc1", "c1ccc(O)cc1")
mcs = smsd.find_mcs(mol1, mol2, max_results=5)

# Substructure — returns dict (single) or list[dict] (multiple)
hit = smsd.find_substructure("c1ccccc1", "c1ccc(O)cc1")
hit = smsd.find_substructure(query, target, max_results=3)

# Boolean convenience check
if smsd.is_substructure(query, target):
    print("Query is a substructure of target")
```

Both functions accept SMILES strings, MolGraph objects, or RDKit Mol objects.

Removed legacy aliases

The following names are no longer available in v7.0.0:

| Removed | Replacement |
|---|---|
| smsd.mcs() | smsd.find_mcs() |
| smsd.substructure_search() | smsd.find_substructure() |
| smsd.all_mcs() | smsd.find_mcs(mol1, mol2, max_results=N) |
| smsd.overlapCoefficient() | smsd.overlap_coefficient() |
| smsd.tanimoto() | smsd.tanimoto_coefficient() |
| smsd.count_overlap_coefficient() | (use C++ binding directly) |
| smsd.count_tanimoto() | (use C++ binding directly) |

Java parity — unified convenience methods

```java
import java.util.Map;

import com.bioinception.smsd.core.SearchEngine;

// MCS with default options
Map<Integer, Integer> mcs = SearchEngine.findMCS(g1, g2);

// Substructure with default options
Map<Integer, Integer> hit = SearchEngine.findSubstructure(query, target);

// With custom options and timeout (separate variable so the snippet compiles as one block)
Map<Integer, Integer> timedHit = SearchEngine.findSubstructure(query, target, chemOpts, 10_000L);
```

All overloads work with both MolGraph and CDK IAtomContainer inputs.

Internal improvements

  • Raw C++ bindings renamed to _native_find_mcs, _native_find_all_mcs,
    _native_is_substructure, _native_find_substructure — clearly internal.
  • All internal calls use the unified API (find_mcs, find_substructure).
  • mcs_from_smiles(), mcs_rdkit(), substructure_rdkit(), and
    depict_mcs() all updated to the new names.

Migration Guide

Python

```diff
- mapping = smsd.mcs(mol1, mol2)
+ mapping = smsd.find_mcs(mol1, mol2)

- mapping = smsd.substructure_search(query, target)
+ mapping = smsd.find_substructure(query, target)

- results = smsd.all_mcs(mol1, mol2, max_results=5)
+ results = smsd.find_mcs(mol1, mol2, max_results=5)
```
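
For larger codebases, a quick text scan can list the call sites that still use removed names. The sketch below is a hypothetical helper, not something shipped with SMSD: the mapping is copied from the removed-aliases list in these notes, and the match is purely textual, so it assumes the package is imported under the name smsd.

```python
import re

# Removed name -> replacement, taken from the removed-aliases list above.
LEGACY = {
    "smsd.mcs": "smsd.find_mcs",
    "smsd.substructure_search": "smsd.find_substructure",
    "smsd.all_mcs": "smsd.find_mcs",
    "smsd.overlapCoefficient": "smsd.overlap_coefficient",
    "smsd.tanimoto": "smsd.tanimoto_coefficient",
}

# Require an opening parenthesis so smsd.mcs_engine or
# smsd.tanimoto_coefficient are not reported by mistake.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in LEGACY) + r")\s*\("
)


def find_legacy_calls(source: str) -> list[tuple[int, str, str]]:
    """Return (line_number, old_name, new_name) for each legacy call."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            old = match.group(1)
            hits.append((lineno, old, LEGACY[old]))
    return hits


sample = "m = smsd.mcs(a, b)\nr = smsd.find_mcs(a, b)\n"
print(find_legacy_calls(sample))  # [(1, 'smsd.mcs', 'smsd.find_mcs')]
```

Note that smsd.all_mcs() also needs max_results passed through, so its call sites should be reviewed by hand.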

Java

No breaking changes. New convenience methods added; existing methods unchanged.

Benchmark

Dalke NN dataset (1,000 high-similarity ChEMBL pairs):

| Metric | SMSD Pro 7.0.0 | RDKit FindMCS 2026.03 |
|---|---|---|
| Total time | 40 s | 213 s |
| Median time | 0.6 ms | 0.4 ms |
| Mean MCS size | 25.8 atoms | 25.0 atoms |
| Timeouts | 0 | 8 |
| Larger-MCS wins | 211 (21 %) | 29 (3 %) |

5x faster overall, finds larger MCS 7x more often, zero timeouts.
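
The headline ratios follow directly from the total-time and larger-MCS rows; the short check below only redoes that arithmetic with the published numbers and does not re-run any benchmark.

```python
smsd_total_s, rdkit_total_s = 40, 213
smsd_wins, rdkit_wins = 211, 29

speedup = rdkit_total_s / smsd_total_s
win_ratio = smsd_wins / rdkit_wins
print(f"{speedup:.1f}x total-time speedup")  # 5.3x
print(f"{win_ratio:.1f}x larger-MCS wins")   # 7.3x
```

The median row favours RDKit (0.4 ms against 0.6 ms), so the total-time gap is presumably driven by the slow tail and the 8 timeouts rather than by typical pairs.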

Test Status

  • Python: 310 passed, 0 failed
  • Java: BUILD SUCCESS (581+ tests)

Compatibility

  • Java 25+, C++17, Python 3.10-3.13
  • GPU: Metal (Apple Silicon), CUDA (Volta+)
  • Platforms: macOS (arm64, x86_64), Linux (x86_64, aarch64), Windows (AMD64)

Copyright

Copyright (c) 2018-2026 BioInception PVT LTD
Algorithm copyright (c) 2009-2026 Syed Asad Rahman
Licensed under Apache License 2.0. See NOTICE for details.

SMSD Pro 6.12.2

12 Apr 02:11

Syed Asad Rahman — BioInception PVT LTD

  • MCS quality fix: ring-constrained recovery when loose defaults cause sub-optimal mappings on steroids, beta-blockers, and macrolides
  • C++ coverage engine: zero losses on Dalke benchmark, 42-90x faster than RDKit
  • Python MCS: thin C++ wrapper, no Python orchestration overhead
  • VF2PP budget cap: prevents seed generation from draining the time budget
  • Test fixes: assertions updated for 6.12.1 default settings

581 Java tests, 310 Python tests — all passing.
Drop-in replacement for 6.12.1.

Install

  • Python: pip install smsd
  • Java: Maven Central
  • Docker: docker pull ghcr.io/asad/smsd:latest

SMSD Pro 6.12.1

11 Apr 22:19

Syed Asad Rahman — BioInception PVT LTD

Defaults aligned with RDKit FMCS for fair benchmarking. Lightweight
coverage-driven MCS engine as default. Full Java-C++-Python parity.

What changed

RDKit-compatible defaults

ChemOptions now defaults to ringMatchesRingOnly=false and
matchFormalCharge=false, matching RDKit FindMCS out of the box.
Named profiles available for stricter settings:

  • default — RDKit-compatible (new)
  • strict — ring=ring, charge match, strict bond order
  • pharma — drug discovery (ring, charge, complete rings)
  • reaction — relaxed for AAM workflows
  • compat-fmcs — explicit RDKit FMCS parity

Lightweight MCS engine as default

smsd.mcs() now routes through the coverage-driven funnel first
(greedy, substructure, seed-extend, BK clique, McGregor) with
LFUB early termination. Falls back to native C++ pipeline on error.

Dalke NN benchmark (100 pairs, same settings):
85x faster than RDKit FMCS. 28 wins, 0 losses.

Fully configurable mcs() API

All matching flags settable directly — no need to construct
ChemOptions for common use cases:

```python
mapping = smsd.mcs(mol1, mol2,
    ring_matches_ring_only=False,
    match_bond_order="loose",
    max_stage=1)
```

maxStage pipeline control

Wired into both Java and C++ findMCSImpl with 4 stage gates:

  • Stage 0: greedy only (sub-millisecond)
  • Stage 1: + substructure + seed-extend (reaction mapping)
  • Stage 2: + McSplit
  • Stage 3: + Bron-Kerbosch
  • Stage 5: full pipeline with extra seeds
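
The gating idea can be sketched as a list of stages, each tagged with the lowest maxStage value that enables it. The code below is a toy model under that assumption, not the findMCSImpl logic: run_pipeline is a hypothetical name, the solvers are stand-ins that return a mapping size, and the upper-bound early exit is borrowed from the funnel description earlier in these notes.

```python
from typing import Callable

# (gate, name, solver): a stage runs only when gate <= max_stage.
Stage = tuple[int, str, Callable[[], int]]


def run_pipeline(stages: list[Stage], max_stage: int, upper_bound: int):
    """Run stages in gate order; stop at the gate or once the bound is met."""
    best, ran = 0, []
    for gate, name, solve in stages:
        if gate > max_stage:
            break  # stages are sorted, so all later ones are gated off too
        best = max(best, solve())
        ran.append(name)
        if best >= upper_bound:
            break  # no larger mapping is possible
    return best, ran


# Toy solvers returning a mapping size; real stages would search graphs.
stages = [
    (0, "greedy", lambda: 4),
    (1, "substructure + seed-extend", lambda: 6),
    (2, "McSplit", lambda: 7),
    (3, "Bron-Kerbosch", lambda: 7),
    (5, "full pipeline", lambda: 7),
]
print(run_pipeline(stages, max_stage=1, upper_bound=8))
# (6, ['greedy', 'substructure + seed-extend'])
```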

Java parity

SmallExactMCSExplorer, FixedSizeBondMaximizer, SigKey ring-system
equivalence in BK, global reaction deadline, TargetCorpus,
screenAndMatch, overlapCoefficient, FP quality analysis, fluent
MCSOptions, boolean[] BK triedClasses, numRings, ensureRingSystems.

C++ optimisations

SMSD_LIKELY branch hints on bondOrder/bondInRing/bondAromatic.
Packed bond matrix (3 matrices to 1 uint8). sm_75 CUDA for T4.
Ring system accessors exposed in Python bindings.

Python fixes

Fixed overlapCoefficient formula (was Tanimoto, now correct
intersection/min). Added tanimoto_coefficient as separate function.
Fixed _ensure_native NameError in depiction. Suppressed RDKit
kekulisation warnings in lightweight engine.

Compatibility

  • Java 25+, C++17, Python 3.10+
  • GPU: Metal (Apple Silicon), CUDA (Volta+)
  • Platforms: macOS (arm64, x86_64), Linux (x86_64, aarch64), Windows (AMD64)

Copyright

Copyright (c) 2018-2026 BioInception PVT LTD
Algorithm copyright (c) 2009-2026 Syed Asad Rahman

SMSD Pro 6.12.0

11 Apr 09:29

Syed Asad Rahman — BioInception PVT LTD

This release adds a lightweight clique-based MCS solver, standalone
fingerprint modules, and exposes the full set of search options to
Python users.

What changed

Clique solver

New header-only maximum clique finder (clique_solver.hpp) for
workflows where chemistry stays in Python/RDKit and only the
graph search runs in C++. Includes Bron-Kerbosch with Tomita
pivoting, k-core pruning, greedy seed, and McGregor bond-grow
extension. Portable across Mac/Linux/Windows (uses bitops.hpp
wrappers, no raw compiler builtins).

Python API: find_mcs_clique, match_substructure,
match_substructure_from_elements, score_mapping.
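
For readers unfamiliar with the core search, here is a compact Python sketch of Bron-Kerbosch with Tomita pivoting on an adjacency-set graph. It illustrates the textbook algorithm only; it is not a port of clique_solver.hpp and leaves out the k-core pruning, greedy seed, and McGregor extension mentioned above. The max_clique name is hypothetical.

```python
def max_clique(adj: dict[int, set[int]]) -> set[int]:
    """Largest clique via Bron-Kerbosch with Tomita pivoting."""
    best: set[int] = set()

    def expand(r: set[int], p: set[int], x: set[int]) -> None:
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)  # r is maximal and larger than anything seen
            return
        if len(r) + len(p) <= len(best):
            return  # even taking all of p cannot beat the current best
        # Tomita pivot: the vertex in P | X with the most neighbours in P.
        pivot = max(p | x, key=lambda v: len(p & adj[v]))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(adj), set())
    return best


graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3},
    3: {0, 1, 2, 4}, 4: {3, 5}, 5: {4},
}
print(sorted(max_clique(graph)))  # [0, 1, 2, 3]
```

In an MCS setting the graph would be the modular product of the two molecules, so a clique corresponds to a set of mutually compatible atom pairs.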

Lightweight MCS engine

smsd.mcs_engine provides a coverage-driven funnel that accepts
SMILES strings, MolGraph objects, or RDKit Mol objects directly:

```python
from smsd.mcs_engine import find_mcs_lightweight
result = find_mcs_lightweight("c1ccc(O)cc1", "c1ccc(N)cc1")
```

Stages escalate automatically (greedy, substructure, seed-extend,
BK clique, McGregor) and stop as soon as the label-frequency upper
bound is reached. The C++ clique solver handles the heavy lifting.
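
One natural reading of a label-frequency upper bound is that an atom can only be matched to an atom with the same label, so the MCS cannot exceed the sum over labels of the smaller count. The sketch below uses element symbols as labels, which is a simplifying assumption (the real engine may use richer atom labels); label_frequency_upper_bound is a hypothetical name.

```python
from collections import Counter


def label_frequency_upper_bound(labels_a, labels_b) -> int:
    """Upper bound on MCS size: sum over labels of min(count_a, count_b)."""
    ca, cb = Counter(labels_a), Counter(labels_b)
    return sum(min(n, cb[label]) for label, n in ca.items())


phenol = ["C"] * 6 + ["O"]   # heavy atoms of c1ccc(O)cc1
aniline = ["C"] * 6 + ["N"]  # heavy atoms of c1ccc(N)cc1
print(label_frequency_upper_bound(phenol, aniline))  # 6
```

Once any stage returns a mapping of that size (here, the six ring carbons), later stages cannot improve on it and can be skipped.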

SmallExactMCSExplorer

Exact branch-and-bound MCS for small molecule pairs (up to 20×40
atoms). Now wired into the native findMCS pipeline for
disconnected MCS mode, giving deterministic results on hard
cases instead of relying on heuristic early exits.

Fingerprint modules

Standalone fp/ headers separated from batch.hpp:

  • fp/mol/circular.hpp — ECFP / FCFP (Morgan)
  • fp/mol/path.hpp — path-based fingerprints
  • fp/mol/pharmacophore.hpp — pharmacophore features
  • fp/mol/torsion.hpp — topological torsion
  • fp/mol/mcs_fp.hpp — MCS-aware path fingerprints
  • fp/similarity.hpp — Tanimoto, Dice, cosine, Soergel, subset
  • fp/common.hpp, fp/format.hpp — shared utilities

Support headers

  • hungarian.hpp — O(n³) optimal assignment
  • periodic_table.hpp — element data and valence tables
  • bond_energies.hpp — bond dissociation energies
  • scaffold_library.hpp — Murcko scaffold extraction
  • color_palette.hpp — Jmol/CPK element colors
  • depictor.hpp — publication-quality SVG rendering engine

Full options exposure

All C++ search options now accessible from Python:

MCSOptions (18 fields): max_stage, seed_neighborhood_radius,
seed_max_anchors, use_two_hop_nlf_in_extension,
use_three_hop_nlf_in_extension added to the existing set.

ChemOptions (19 fields): aromaticity_model, pH,
matcher_engine, induced, use_two_hop_nlf, use_three_hop_nlf,
use_bit_parallel_feasibility added.

Enums (6): AromaticityMode, AromaticityModel, BondOrderMode,
RingFusionMode, MatcherEngine, Solvent — all accessible as
smsd.MatcherEngine.VF2PP etc.

MolGraph.num_rings() added for SSSR ring count.

Global reaction deadline

global_deadline namespace in VF2++ enforces a single wall-clock
deadline across all MCS/substructure calls within a pipeline.
TimeBudget checks both local and global deadlines. Timeout checks
now run more often (every 256 calls instead of every 1024).

FixedSizeBondMaximizer

Maximizes bond count for fixed-size atom mappings on small pairs.
Integrated into runValidatedMcsDirection for automatic bond
refinement when MCS matches the smaller molecule entirely.

Cross-platform

  • MSVC: clique solver uses portable smsd::popcount64 /
    smsd::ctz64 (no raw __builtin_*). CMake adds /W4 /EHsc.
  • ARM NEON: fused bitset ops dispatch automatically on Apple Silicon.
  • CI wheels: Linux x86_64 + aarch64, macOS arm64 + x86_64,
    Windows AMD64. OpenMP installed in build containers. macOS
    wheels bundle libomp via delocate.

Includes 6.11.2 fixes

Memory leak fix (WeakHashMap), BK color-bound overflow,
CIP Rule 3 (Z > E), thread-safety on lazy MolGraph fields,
tautomer weight corrections, Se/I support, SAH SMILES fix,
reaction-aware charge relaxation, Mcs→MCS rename,
tanimoto→overlapCoefficient.

Compatibility

  • Java 25+, C++17, Python 3.10+
  • GPU: Metal (Apple Silicon), CUDA (Volta+)
  • Platforms: macOS (arm64, x86_64), Linux (x86_64, aarch64), Windows (AMD64)

Copyright

Copyright (c) 2018-2026 BioInception PVT LTD
Algorithm copyright (c) 2009-2026 Syed Asad Rahman

SMSD v6.11.1 — Fingerprint Correctness & Thread Safety

05 Apr 01:45

Bug Fixes

  • ECFP initial invariants — added missing Rogers & Hahn (2010) invariants #3 (bond order sum / valence) and #4 (atomic mass number) to both binary and count ECFP in C++ and Java. Fingerprints now encode the full 7-invariant set per the original paper.
  • Path fingerprint canonical hash — replaced double-bit-setting (forward + reverse) with min(fwd, rev) single canonical hash, correcting bit-density inflation.
  • FCFP pyrrole-N misclassification — aromatic nitrogen acceptor classification now uses direct hydrogen count instead of bond-sum heuristic, fixing incorrect pyridine-N classification as non-acceptor (pyridine N has a free lone pair; pyrrole N does not).
  • Thread safety: prewarmGraph() now calls ensurePatternFP() and getPharmacophoreFeatures() before OpenMP parallel regions, preventing data races on lazy-init mutable caches.
  • Dead code removal — removed unused seenHashes set from binary ECFP; removed disabled tautomer rules T18 (sulfoxide S=O) and T20 (nitrile/isonitrile).
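
The path-fingerprint fix can be illustrated with a direction-independent hash: hash the path in both directions and keep the smaller value, so each path sets exactly one bit. The sketch below uses CRC32 as a stand-in hash and string tokens as a stand-in path encoding; both are assumptions, as are the function names.

```python
import zlib


def path_bit(path: list[str], n_bits: int = 2048) -> int:
    """Map a linear path to one bit, independent of traversal direction."""
    fwd = zlib.crc32("|".join(path).encode())
    rev = zlib.crc32("|".join(reversed(path)).encode())
    return min(fwd, rev) % n_bits  # canonical choice between the two


def path_fingerprint(paths, n_bits: int = 2048) -> set[int]:
    """Set of bit positions for a collection of paths."""
    return {path_bit(p, n_bits) for p in paths}


forward = ["C", "-", "N", "=", "O"]
backward = list(reversed(forward))
print(path_bit(forward) == path_bit(backward))     # True
print(len(path_fingerprint([forward, backward])))  # 1
```

Setting a bit for both directions, as the old code did, would turn on up to two bits per asymmetric path, which is the density inflation the fix removes.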

Java 25 Modernisation

  • Records: SubstructureStats, SubstructureResult, McsResult, Node, TemplateEntry, CanonResult, ScoredCandidate, Query — 8 inner classes converted to records, eliminating 134 lines of boilerplate.
  • Arrow switches: 6 switch statements in SMSDcli and ReactionAwareScorer converted to arrow expressions.
  • Unnamed variables: catch (Exception _) replaces catch (Exception ignored) across 5 sites.
  • Pattern matching instanceof: Stereo element dispatch in MolGraph uses pattern matching.
  • Renames: McsPostFilter → MCSPostFilter, CipAssigner → CIPAssigner (standard IUPAC acronym capitalisation).
  • Build: JDK 25 across Maven, CI, Dockerfile, and native installer workflow.

Note

ECFP fingerprint values will differ from v6.11.0 due to the added invariants. This is a correctness improvement aligned with the Rogers & Hahn 2010 specification.

Full Changelog: v6.11.0...v6.11.1

Apache-2.0. Copyright (c) 2018-2026 BioInception PVT LTD.

SMSD Pro v6.11.0 — Performance, Precision, and Depiction

04 Apr 12:50

Highlights

SMSD Pro 6.11.0 delivers three major improvements:

  1. Core engine hardening — cache-optimal data structures and pre-indexed candidate lookup reduce MCS search time by 20-40% on large molecule pairs.

  2. Publication-quality SVG depiction — zero-dependency renderer conforming to ACS 1996 standard (same specification used by Nature, Science, JACS, and Springer journals). Renders molecules, MCS comparisons, and substructure highlights directly from SMILES or MolGraph with no external tools required.

  3. Comprehensive layout engine — 8-phase 2D pipeline, distance geometry 3D, 40+ pharmaceutical scaffold templates, full coordinate transform suite.

Core Engine

  • Converted all hot-path vector<bool> to vector<uint8_t> across McGregor DFS, Bron-Kerbosch partition bound, seed-extend, frontier expansion, k-core pruning, and tree detection. Yields 15-25% cache performance improvement.
  • Pre-indexed candidate sets in McGregor DFS using precomputed compatTargets_[]. Eliminates O(n²) linear scan per frontier atom.
  • CTZ-based fingerprint bit extraction replaces sequential scan.
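
The count-trailing-zeros trick visits only the set bits of a word instead of testing every position. A Python sketch of the same idea on a non-negative integer is shown below; the C++ code would use a hardware CTZ instruction per 64-bit word, and set_bits is a hypothetical name.

```python
def set_bits(word: int):
    """Yield indices of set bits, lowest first, one step per set bit."""
    while word:
        lowest = word & -word          # isolate the lowest set bit
        yield lowest.bit_length() - 1  # its index equals the trailing-zero count
        word ^= lowest                 # clear it and continue


print(list(set_bits(0b1010_0100)))  # [2, 5, 7]
```

For sparse fingerprints the loop runs once per "on" bit rather than once per bit position.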

Depiction Engine (New)

SVG renderer produces journal-ready molecular structure diagrams:

  • ACS 1996 proportions — all dimensions auto-scale from a single reference value
  • Skeletal formula — carbon suppressed, heteroatom labels with H-count subscripts and charge superscripts
  • Bond rendering — single, double (asymmetric toward ring interior), triple, wedge up (solid), wedge down (dashed stripes)
  • MCS highlighting — green-filled circles, bold bonds, blue mapping numbers
  • Side-by-side pair rendering with bidirectional arrow separator
  • Jmol/CPK element colors — N=blue, O=red, S=amber, P=orange, etc.
  • Full customization via DepictOptions

Layout Engine

  • 8-phase 2D pipeline: template match → ring-first → chain zig-zag → force-directed → overlap resolution → crossing reduction → canonical orientation → bond-length normalisation
  • Distance geometry 3D with power iteration eigendecomposition
  • 40+ scaffold templates including pharmaceutical scaffolds, PAH, spiro, and bridged systems
  • Full coordinate transform suite: translate, rotate, scale, mirror, center, align

Python Bindings

35+ new functions exposed with GIL release for thread safety:
depict_svg(), depict_pair(), depict_mapping(), save_svg(), find_nmcs(), find_scaffold_mcs(), validate_mapping(), decompose_rgroups(), generate_coords_2d(), generate_coords_3d(), and more.

Tests

1,512 tests passed across all platforms:

  • C++ core: 114 | C++ layout: 42 | C++ CIP: 42 | C++ parser: 542
  • Python: 170 | Java: 602

Compatibility

  • Fully backward-compatible with v6.10.x
  • No API breaking changes
  • Requires C++17, Java 11+, Python 3.9+

SMSD Pro by BioInception PVT LTD. Algorithm Copyright 2009-2026 Syed Asad Rahman. Apache-2.0.

v6.10.2

03 Apr 16:23

SMSD Pro v6.10.2 — correctness release.

  • Fixed MCS connectivity filter to enforce common-bond reachability in non-induced mode
  • Added GOLDEN_843 regression tests (Python and Java)
  • Version bump across all artifacts
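
One way to read "common-bond reachability" is that two mapped atoms count as connected only through bonds present in both molecules under the mapping. The sketch below keeps the largest such component. It is an interpretation for illustration, not the SMSD filter; the function name and the data layout (a query-to-target dict plus bond pairs) are assumptions.

```python
def largest_common_bond_component(mapping, query_bonds, target_bonds):
    """Keep the largest part of `mapping` connected through common bonds."""
    tbonds = {frozenset(b) for b in target_bonds}
    # Adjacency over mapped query atoms, using only bonds present in both.
    adj = {q: set() for q in mapping}
    for a, b in query_bonds:
        if a in mapping and b in mapping and a != b:
            if frozenset((mapping[a], mapping[b])) in tbonds:
                adj[a].add(b)
                adj[b].add(a)
    seen, best = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:  # iterative DFS over common bonds
            atom = stack.pop()
            component.append(atom)
            for nxt in adj[atom] - seen:
                seen.add(nxt)
                stack.append(nxt)
        if len(component) > len(best):
            best = component
    return {q: mapping[q] for q in best}


# Query chain 0-1-2-3; in the target, atom 13 is not bonded to 12.
mapping = {0: 10, 1: 11, 2: 12, 3: 13}
query_bonds = [(0, 1), (1, 2), (2, 3)]
target_bonds = [(10, 11), (11, 12)]
print(largest_common_bond_component(mapping, query_bonds, target_bonds))
```

In this example atom 3 is dropped because the query bond 2-3 has no counterpart in the target.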

Java 21+ / C++17 / Python 3.9+. Apache-2.0 — see NOTICE for attribution.

v6.10.1

03 Apr 09:14

SMSD Pro v6.10.1 — stability and correctness release.

  • Hardened MCS mapping repair pipeline (iterative bounded-loop, no unbounded recursion)
  • Deterministic test suite (structural invariants, not wall-clock time)
  • CI/CD reliability fixes

Apache-2.0 — BioInception PVT LTD.