- Install packages
pip install -r requirements.txt
- Set values in the `.env` file
FIRECRAWL_API_KEY = ""
URL = "" # URL to crawl
SOURCE_LIBRARY = "" # Name of the library being crawled (optional)
- Crawl and save the data
python crawl_and_save.py
- Process the saved data to markdown
python process.py
- The output is available inside the `markdown_docs` folder.
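
The three settings from the `.env` step can be checked before any credits are spent. The sketch below is only an illustration: how the actual scripts load these values is not shown here, and `read_env_file` and `load_settings` are hypothetical helper names. It uses only the standard library and assumes the simple `KEY = "value" # comment` layout shown above.

```python
def read_env_file(path):
    """Parse simple KEY = "value" lines; skip blank lines and comment lines."""
    values = {}
    with open(path, encoding="utf-8") as fh:
        for raw_line in fh:
            line = raw_line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            value = value.strip()
            if value[:1] in ('"', "'"):
                # Quoted value: keep the text up to the closing quote,
                # which also drops any trailing "# comment".
                quote = value[0]
                end = value.find(quote, 1)
                value = value[1:end] if end != -1 else value[1:]
            else:
                # Unquoted value: a " #" starts a trailing comment.
                value = value.split(" #", 1)[0].strip()
            values[key.strip()] = value
    return values


def load_settings(path=".env"):
    """Return the settings as a dict, failing early if a required one is empty."""
    env = read_env_file(path)
    api_key = env.get("FIRECRAWL_API_KEY", "")
    url = env.get("URL", "")
    if not api_key or not url:
        raise ValueError("FIRECRAWL_API_KEY and URL must be set in the .env file")
    return {
        "api_key": api_key,
        "url": url,
        # Optional in the README, so an empty string is accepted.
        "source_library": env.get("SOURCE_LIBRARY", ""),
    }
```

Calling `load_settings()` on a freshly copied template, where every value is still `""`, raises `ValueError` instead of starting a crawl with an empty key or URL.
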
The work is split into two scripts, `crawl_and_save.py` and `process.py`. The raw crawl data is saved first, so if processing fails you can rerun `process.py` on the saved data without crawling again and spending unnecessary credits.
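
The crawl-then-process split can be sketched roughly as follows. This is not the code of the two scripts, only an illustration of the pattern: `fetch_pages` is a hypothetical stand-in for whatever call performs the paid crawl, and the assumption that each page is a dict with a `"markdown"` key is made up for the example, as are the file names.

```python
import json
from pathlib import Path


def crawl_and_save(fetch_pages, url, raw_path):
    """Stage 1: crawl once and persist the raw result before any processing."""
    raw_path = Path(raw_path)
    if raw_path.exists():
        # Raw data is already on disk, so no new crawl (and no new credits).
        return False
    pages = fetch_pages(url)  # hypothetical callable wrapping the crawl API
    raw_path.parent.mkdir(parents=True, exist_ok=True)
    raw_path.write_text(json.dumps(pages, indent=2), encoding="utf-8")
    return True


def process(raw_path, out_dir):
    """Stage 2: read the saved raw data and write one markdown file per page."""
    pages = json.loads(Path(raw_path).read_text(encoding="utf-8"))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, page in enumerate(pages):
        target = out_dir / f"page_{index:04d}.md"
        # Assumed page shape: {"markdown": "..."}.
        target.write_text(page["markdown"], encoding="utf-8")
        written.append(target)
    return written
```

If `process` fails partway, it can be fixed and rerun against the same raw file; a second call to `crawl_and_save` returns `False` without invoking `fetch_pages` again.
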