Adam-maz/MolPort-webscraper

Collecting SMILES from MolPort with Selenium

1. Introduction

This document introduces a script that collects SMILES strings from MolPort by web scraping with Selenium. The script writes a .csv file containing the IDs and SMILES strings of the requested molecules. This file can be used, for example, to build a library of reactants for reaction-based enumeration in lead optimization.

2. Usage

To use this script:

  1. Download an .sdf file containing the molecules you want to collect.
  2. Convert this file to a .csv file (you can use tools like the DataWarrior suite for this step).
  3. Launch the script and follow the provided instructions.
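If DataWarrior is not at hand, step 2 can also be approximated in plain Python. The sketch below is not part of the repository; it relies only on the general SDF layout (records end with `$$$$`, data fields start with a `> <NAME>` header and end at a blank line). The field name `MOLPORT_ID` and the IDs in the sample are hypothetical placeholders, so check the headers in your own .sdf file before using it.

```python
import csv
import io

# Tiny stand-in for a downloaded .sdf file. The field name and the IDs are
# placeholders for illustration, not real MolPort data.
EXAMPLE_SDF = """\
example-1
  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <MOLPORT_ID>
ID-PLACEHOLDER-1

$$$$
example-2
  placeholder

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <MOLPORT_ID>
ID-PLACEHOLDER-2

$$$$
"""


def sdf_to_rows(sdf_text):
    """Collect the data fields of every SDF record into one dict per record."""
    rows = []
    for record in sdf_text.split("$$$$"):
        lines = record.splitlines()
        if not any(line.strip() for line in lines):
            continue  # skip the empty chunk after the last delimiter
        row = {}
        field = None
        for line in lines:
            if line.startswith(">") and "<" in line:
                # Data header such as "> <FIELD_NAME>"
                field = line[line.index("<") + 1 : line.rindex(">")]
                row[field] = ""
            elif field is not None:
                if line.strip() == "":
                    field = None  # a blank line ends the field value
                else:
                    row[field] = (row[field] + "\n" + line) if row[field] else line
        rows.append(row)
    return rows


def rows_to_csv(rows, fieldnames):
    """Serialize the selected fields as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = rows_to_csv(sdf_to_rows(EXAMPLE_SDF), ["MOLPORT_ID"])
```

For a real file, read its contents with `open(path).read()`, pass the text to `sdf_to_rows`, and write the returned CSV text to disk. Only the data fields are extracted; the molecular structure block is ignored.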

3. Content

  1. molport_webscraper - the web-scraping script.
  2. spiro_all.sdf - spirocyclic compounds downloaded from MolPort.
  3. sprio_all.csv - the same spirocyclic compounds in .csv form, used as the input file for the script.

4. Dependencies

To run this script, ensure the following packages are installed in your virtual environment:

  • pandas
  • selenium
  • webdriver_manager

You can install them by running the following command in a terminal:

pip install pandas selenium webdriver_manager
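As a rough illustration of how the three packages can work together, here is a minimal sketch. It is not the repository's actual script: `url_template`, `smiles_selector`, and all function names are hypothetical, and the real MolPort URL pattern and page markup have to be looked up in the browser. The third-party imports sit inside the functions so that the pure helper `collect_smiles` can be tried without a browser.

```python
def make_selenium_fetcher(url_template, smiles_selector, timeout=10):
    """Return (fetch, close); fetch(mol_id) loads a page and reads one element.

    url_template and smiles_selector are hypothetical placeholders supplied by
    the caller, e.g. a template containing "{mol_id}" and a CSS selector.
    """
    # Imported lazily so collect_smiles() below works without a browser.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    from webdriver_manager.chrome import ChromeDriverManager

    # webdriver_manager downloads a matching chromedriver binary.
    driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

    def fetch(mol_id):
        driver.get(url_template.format(mol_id=mol_id))
        # Wait until the element that holds the SMILES string is in the DOM.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, smiles_selector))
        )
        return element.text.strip()

    return fetch, driver.quit


def collect_smiles(mol_ids, fetch):
    """Call fetch for every ID; store None when a lookup fails."""
    records = []
    for mol_id in mol_ids:
        try:
            smiles = fetch(mol_id)
        except Exception:  # e.g. a Selenium timeout for a missing page
            smiles = None
        records.append({"id": mol_id, "smiles": smiles})
    return records


def save_records(records, path):
    """Write the collected ID/SMILES pairs to a .csv file with pandas."""
    import pandas as pd  # lazy import, see above

    pd.DataFrame(records, columns=["id", "smiles"]).to_csv(path, index=False)
```

With real values filled in, one would call `fetch, close = make_selenium_fetcher(...)`, then `save_records(collect_smiles(ids, fetch), "smiles.csv")`, and finally `close()` to shut the browser down. Passing `fetch` as an argument keeps the loop independent of Selenium, so it can be checked with a plain dictionary lookup first.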

5. Selenium documentation

https://selenium-python.readthedocs.io/
