Scraping covid-19 data from worldometers.info
This repo scrapes covid-19 data from worldometers and is public for everyone to view and work on.
This code stores data from worldometers.info in the data directory of this repo. It stores the data in both JSON and CSV formats, with file names of the form data-D-M-Y.FORMAT, where D is the day, M the month, Y the year, and FORMAT the file format (json or csv). If you want to view the data on GitHub itself, I recommend the CSV files, as the website renders them as a clean table.
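The naming scheme above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the helper names `data_paths` and `save_rows` are hypothetical, and it assumes the day and month are zero-padded (as in the `29-01-2020` date style used by the command-line options), which may not match the real files.

```python
import csv
import json
from datetime import date
from pathlib import Path


def data_paths(day: date, data_dir: Path = Path("data")) -> dict:
    """Return the JSON and CSV paths for one day's data (hypothetical helper)."""
    # Assumption: D and M are zero-padded, e.g. data-29-01-2020.csv.
    stem = f"data-{day:%d-%m-%Y}"
    return {fmt: data_dir / f"{stem}.{fmt}" for fmt in ("json", "csv")}


def save_rows(rows: list, day: date, data_dir: Path = Path("data")) -> dict:
    """Write the same rows (a list of dicts) as JSON and as CSV."""
    paths = data_paths(day, data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    paths["json"].write_text(json.dumps(rows, indent=2), encoding="utf-8")
    # The CSV header is taken from the keys of the first row, if any.
    fieldnames = list(rows[0].keys()) if rows else []
    with paths["csv"].open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return paths
```

Calling `save_rows(rows, date.today())` would then produce one `.json` and one `.csv` file for the current day inside `data/`.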
It does two things:
- It scrapes the important data (the table) from worldometers every day, 10 minutes before midnight, thanks to GitHub Actions.
- It can scrape data from the last Wayback Machine snapshot of each day for which a snapshot exists, and store it.
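As a rough illustration of the second point, the Wayback Machine serves captures at URLs of the form `https://web.archive.org/web/<timestamp>/<original-url>` and redirects to the capture closest to the requested timestamp. The sketch below builds such a URL from a date written like `29-01-2020`. The function name `wayback_url` and the target page `https://www.worldometers.info/coronavirus/` are assumptions for illustration; this is not necessarily how `Scrap_wayback.py` locates snapshots, and the closest capture may fall on a neighbouring day.

```python
from datetime import datetime

WAYBACK_PREFIX = "https://web.archive.org/web"
# Assumption: the page being scraped; the repo may target a different URL.
TARGET_URL = "https://www.worldometers.info/coronavirus/"


def wayback_url(date_text: str, which: str = "last-snapshot") -> str:
    """Build a Wayback Machine URL near the start or end of a given day."""
    day = datetime.strptime(date_text, "%d-%m-%Y")
    if which == "last-snapshot":
        clock = "235959"  # ask for the capture closest to the end of the day
    elif which == "first-snapshot":
        clock = "000000"  # ask for the capture closest to the start of the day
    else:
        raise ValueError(f"unknown snapshot selector: {which!r}")
    return f"{WAYBACK_PREFIX}/{day:%Y%m%d}{clock}/{TARGET_URL}"
```

For example, `wayback_url("29-01-2020")` yields a URL with the timestamp `20200129235959`.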
Use the package manager pip to install the requirements.
    python3 -m pip install -r requirements.txt
Run yesterday's scrape locally:
    python3 Scrap.py
Scrape the data for all dates from worldometers via the Wayback Machine:
Note: When Scrap_wayback.py is run, a scrap_error.txt file is created if it does not already exist, and the raw_url of each failed request is appended to it. That raw_url can be reused, for example after a 1040 Database Error from the Wayback Machine.
    python3 Scrap_wayback.py
Scrape data for a particular date via the Wayback Machine:
    python3 Scrap_wayback.py [<--date>|<-d>] <last-snapshot|first-snapshot> <date-like-29-01-2020>
Scrape data from a particular raw_url via the Wayback Machine:
    python3 Scrap_wayback.py [<--raw-url>|<-r>] <raw_url>
TODO:
- Work on making the code more usable
- Make Scrap_wayback.py use multiprocessing for faster processing
- Make a cleaner Python application to clean the data from statistics/curve/get_data.py
- Make a better way of extracting argv(s)
- Make colored logs...
- Make
- Make a PyPI library from it
- Write proper documentation on how to use the library
- Work on a better README
- List it on other repos
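
For the item about extracting argv(s), one possible direction is the standard library's `argparse`. The sketch below mirrors the `--date`/`-d` and `--raw-url`/`-r` options described earlier. The function names `build_parser` and `parse_args` are hypothetical, and the validation rules are assumptions rather than the current behaviour of `Scrap_wayback.py`.

```python
import argparse

SNAPSHOT_CHOICES = ("last-snapshot", "first-snapshot")


def build_parser() -> argparse.ArgumentParser:
    """Describe the two optional modes; with no option, all dates are scraped."""
    parser = argparse.ArgumentParser(
        prog="Scrap_wayback.py",
        description="Scrape worldometers data via the Wayback Machine.",
    )
    group = parser.add_mutually_exclusive_group()
    group.add_argument(
        "-d", "--date", nargs=2, metavar=("WHICH", "DATE"),
        help="WHICH is last-snapshot or first-snapshot; DATE looks like 29-01-2020",
    )
    group.add_argument(
        "-r", "--raw-url", metavar="RAW_URL",
        help="scrape one specific raw_url",
    )
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    """Parse argv (defaults to sys.argv[1:]) and validate the snapshot selector."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.date is not None and args.date[0] not in SNAPSHOT_CHOICES:
        parser.error(f"WHICH must be one of: {', '.join(SNAPSHOT_CHOICES)}")
    return args
```

With this, `parse_args(["-d", "last-snapshot", "29-01-2020"])` returns a namespace whose `date` attribute holds both values, and `--help` output is generated automatically.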