Collecting SMILES from MolPort with Selenium

1. Introduction

This document introduces a script that collects SMILES strings from MolPort by web scraping with Selenium. The script writes a .csv file containing the IDs and SMILES strings of the requested molecules. This file can be used, for example, to build a library of reactants for reaction-based enumeration in lead optimization.
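The overall flow described above (look up each ID in a browser, then write ID and SMILES pairs to a .csv file) can be sketched roughly as follows. This is a minimal illustration, not the code in molport_webscraper.py: the function names, the `ID`/`SMILES` column headers, and the `url_template` and `css_selector` parameters are all assumptions, and the actual MolPort page address and element selector have to be supplied by the user.

```python
import csv


def collect_smiles(compound_ids, lookup):
    """Return (id, smiles) rows; lookup(id) gives a SMILES string or None."""
    rows = []
    for compound_id in compound_ids:
        try:
            smiles = lookup(compound_id)
        except Exception:
            # Any failed lookup (timeout, missing page, ...) becomes an empty cell
            smiles = None
        rows.append((compound_id, smiles or ""))
    return rows


def write_rows(rows, out):
    """Write the rows to an open text file with a header line."""
    writer = csv.writer(out)
    writer.writerow(["ID", "SMILES"])  # hypothetical column names
    writer.writerows(rows)


def make_selenium_lookup(driver, url_template, css_selector, timeout=10):
    """Build a lookup backed by a Selenium driver.

    url_template (containing "{id}") and css_selector are placeholders that
    the caller must fill in; they are not taken from MolPort.
    """
    # Imported here so the helpers above work without Selenium installed
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    def lookup(compound_id):
        driver.get(url_template.format(id=compound_id))
        # Wait until the element that holds the SMILES string is present
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return element.text.strip()

    return lookup
```

A driver for `make_selenium_lookup` could be created with `webdriver.Chrome(service=Service(ChromeDriverManager().install()))`, using `selenium.webdriver`, `selenium.webdriver.chrome.service.Service` and `webdriver_manager.chrome.ChromeDriverManager`. Keeping the lookup separate from the loop makes it possible to try the .csv output with a stand-in function before opening a browser.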

2. Usage

To use this script:

Download an .sdf file containing the molecules you want to collect.
Convert this file to a .csv file (you can use tools like the DataWarrior suite for this step).
Launch the script and follow the provided instructions.
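If DataWarrior is not at hand, the conversion step can also be approximated in plain Python. The sketch below is an assumption-laden alternative, not part of this repository: it only copies the data fields of each SDF record (the `> <NAME>` blocks) into .csv columns and ignores the structure block, and the names `sdf_records` and `sdf_to_csv` are made up for illustration. Which field holds the MolPort ID depends on the downloaded file.

```python
import csv
import re

# A data-field header line looks like "> <FIELD_NAME>"
HEADER = re.compile(r"^>.*?<([^>]+)>")


def sdf_records(text):
    """Yield one dict of data fields per SDF record that has any."""
    for block in text.split("$$$$"):  # "$$$$" ends each record
        lines = block.splitlines()
        fields = {}
        i = 0
        while i < len(lines):
            match = HEADER.match(lines[i])
            i += 1
            if not match:
                continue
            # The value runs until the next blank line
            value = []
            while i < len(lines) and lines[i].strip():
                value.append(lines[i].strip())
                i += 1
            fields[match.group(1)] = " ".join(value)
        if fields:
            yield fields


def sdf_to_csv(text, out):
    """Write all data fields to an open text file; return the record count."""
    records = list(sdf_records(text))
    columns = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)
    writer = csv.DictWriter(out, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(records)
    return len(records)
```

To convert a file, read it with `open("spiro_all.sdf").read()` and pass an output file opened with `newline=""` as the second argument.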

3. Content

molport_webscraper.py - the web scraping script.
spiro_all.sdf - file with spirocyclic compounds, downloaded from MolPort.
spiro_all.csv - file with spirocyclic compounds, input file for the script.

4. Dependencies

To run this script, ensure the following packages are installed in your virtual environment:

pandas
selenium
webdriver_manager

You can install them by running the following command in a terminal:

pip install pandas selenium webdriver_manager
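To confirm that the packages are visible in the active virtual environment before launching the script, a short check along these lines may help. It is only a sketch: `missing_packages` is a hypothetical helper, and it tests whether each package can be found for import, not which version is installed.

```python
from importlib.util import find_spec

# Import names of the packages listed above
REQUIRED = ("pandas", "selenium", "webdriver_manager")


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


# Example: print(missing_packages()) shows an empty list when all are installed
```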

5. Selenium documentation

https://selenium-python.readthedocs.io/
