A Python library that queries Google, Bing, Yahoo and other search engines and collects the results from multiple search engine results pages.
Please note that web scraping may violate the terms of service (TOS) of some search engines and may result in a temporary ban.
Google
Bing
Yahoo
Duckduckgo
Startpage
Aol
Dogpile
Ask
Mojeek
Brave
Torch
- Creates output files (html, csv, json).
- Supports search filters (url, title, text).
- HTTP and SOCKS proxy support.
- Collects dark web links with Torch.
- Easy to add new search engines. You can add a new engine by creating a new class in search_engines/engines/ and adding it to the search_engines_dict dictionary in search_engines/engines/__init__.py. The new class should subclass SearchEngine and override the following methods: _selectors, _first_page, _next_page.
- Python 3 only.
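The subclassing steps described above could be sketched roughly as follows. This is an illustration under assumptions, not the library's actual code: the `SearchEngine` class here is a minimal stand-in so the snippet runs on its own, `ExampleEngine`, its URL, and its CSS selectors are hypothetical, and the real method signatures and return values (in particular what `_next_page` receives) may differ.

```python
from urllib.parse import quote_plus


class SearchEngine:
    """Stand-in for the library's base class (an assumption, not the real code)."""

    def __init__(self):
        self._base_url = ""
        self._query = ""


class ExampleEngine(SearchEngine):
    """Hypothetical engine; the site and selectors are made up."""

    def __init__(self):
        super().__init__()
        self._base_url = "https://search.example.com"

    def _selectors(self, element):
        # Map a logical element name to a CSS selector on the results page.
        selectors = {
            "links": "div.result",
            "url": "a[href]",
            "title": "h2",
            "text": "p",
            "next": "a.next",
        }
        return selectors[element]

    def _first_page(self):
        # Describe the request for the first results page.
        url = "{}/search?q={}".format(self._base_url, quote_plus(self._query))
        return {"url": url, "data": None}

    def _next_page(self, next_href):
        # Assumption: receives the relative href of the "next" link, or None.
        url = self._base_url + next_href if next_href else None
        return {"url": url, "data": None}


# Registration, mirroring the search_engines_dict dictionary mentioned above.
search_engines_dict = {"example": ExampleEngine}
```

Returning a `url` of `None` from `_next_page` is one possible way to signal that there are no more pages; check an existing engine class in search_engines/engines/ for the convention the library actually uses.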
- Python 3.6+
- Aiohttp
- BeautifulSoup

Install the dependencies this way: $ pip3 install -r requirements.txt
Run the setup file: $ python setup.py install
As a library:
from search_engines import Google
engine = Google()
results = engine.search("my query")
links = results.links()
print(links)
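To combine results from several engines, the same two calls can be looped over. The helper below is a small sketch, not part of the library: `collect_links` is a hypothetical name, and it assumes only that each engine object has a `search(query)` method whose result has a `links()` method, as in the snippet above.

```python
def collect_links(engines, query):
    """Query each engine in turn and return unique links in first-seen order."""
    seen = set()
    merged = []
    for engine in engines:
        results = engine.search(query)
        for link in results.links():
            # Skip links already returned by an earlier engine.
            if link not in seen:
                seen.add(link)
                merged.append(link)
    return merged


# Example usage (needs the library installed and network access).
# Importing Bing the same way as Google is an assumption:
#
#   from search_engines import Google, Bing
#   print(collect_links([Google(), Bing()], "my query"))
```

Keeping first-seen order means the links from the first engine in the list come first, which may or may not be the ordering you want.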
As a CLI script:
$ python search_engines_cli.py -e google,bing -q "my query" -o json,print
- async-search-scraper: a really cool asynchronous implementation, written by @soxoj