Solvation data preparation #8

Open
craabreu wants to merge 3 commits into QuantumPioneer:main from craabreu:solvation_data

@craabreu
Contributor

This PR introduces two notebooks that prepare the solvation data for publication. The prepared data are organized as follows:

  1. The data are classified into three categories: transition_states, closed_shell_species, and open_shell_species.
  2. For each category, data points are split by solvent and stored in files named <solvent-name>.csv, distributed in subdirectories according to the first letter of the solvent's name (ignoring non-letter characters).
  3. Reaction SMILES are stored without atom mappings, with each reactant and product written as a canonical SMILES.
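
The directory layout in item 2 can be sketched as below. This is a minimal illustration, not the code from the notebooks: the function names, the `solvent` column name, and the choice to lowercase the subdirectory letter are all assumptions made for the example.

```python
import csv
from collections import defaultdict
from pathlib import Path


def solvent_csv_path(root, category, solvent_name):
    """Return <root>/<category>/<first letter>/<solvent_name>.csv."""
    # Use the first alphabetic character, skipping digits and punctuation,
    # so that "1,4-dioxane" is filed under "d".
    first_letter = next((ch for ch in solvent_name if ch.isalpha()), None)
    if first_letter is None:
        raise ValueError(f"no letter in solvent name: {solvent_name!r}")
    # Lowercasing the directory name is an assumption of this sketch.
    return Path(root) / category / first_letter.lower() / f"{solvent_name}.csv"


def write_split_by_solvent(rows, root, category, solvent_key="solvent"):
    """Group dict rows by solvent and write one CSV file per solvent."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[solvent_key]].append(row)

    written = []
    for solvent_name, group in groups.items():
        path = solvent_csv_path(root, category, solvent_name)
        path.parent.mkdir(parents=True, exist_ok=True)
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(group[0].keys()))
            writer.writeheader()
            writer.writerows(group)
        written.append(path)
    return written
```

Calling `write_split_by_solvent(rows, "data", "transition_states")` once per category would produce one tree per category, with each solvent's rows in its own file.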

Note: A file solvents.csv is included, which maps solvent names to canonical SMILES without stereochemical information. The solvent SMILES in the original solvation data are not in canonical form, so an intermediate canonicalization step is required.
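
As a rough sketch of the reaction SMILES rewrite described above, the snippet below strips atom-map numbers with a regular expression and applies a caller-supplied canonicalizer to each species. The function names are hypothetical, and the default `canonicalize` is an identity placeholder: real canonicalization (for the reactions or for the solvent SMILES) needs a cheminformatics toolkit, which this example deliberately leaves out. Note also that splitting on `.` treats every dot-separated fragment as its own species.

```python
import re

# An atom map is ":<digits>" immediately before the closing bracket of an atom.
_ATOM_MAP = re.compile(r":\d+(?=\])")


def strip_atom_maps(smiles):
    """Remove atom-map numbers such as the ":1" in "[CH3:1]"."""
    return _ATOM_MAP.sub("", smiles)


def rewrite_reaction(reaction_smiles, canonicalize=lambda s: s):
    """Strip atom maps, then canonicalize each species of a reaction SMILES.

    `canonicalize` is a placeholder (identity by default); pass a function
    backed by a cheminformatics toolkit to obtain canonical SMILES.
    """
    sides = reaction_smiles.split(">")
    if len(sides) != 3:
        raise ValueError(f"expected reactants>agents>products: {reaction_smiles!r}")

    def rewrite_side(side):
        # Skip empty entries so that an empty agents field stays empty.
        species = [canonicalize(strip_atom_maps(s)) for s in side.split(".") if s]
        return ".".join(species)

    return ">".join(rewrite_side(side) for side in sides)
```

Stripping the maps before canonicalizing matters because atom-map numbers are part of the SMILES string and would otherwise carry over into the output. The regular expression leaves the square brackets in place; a toolkit-based canonicalizer would normally simplify them.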

@craabreu craabreu requested a review from KnathanM March 19, 2026 19:09