The Vaccine Adverse Event Reporting System (VAERS) was created by the FDA and CDC to receive and compile reports of adverse events that may be associated with vaccinations worldwide, including the COVID-19 vaccines. Physicians and vaccine providers are encouraged to report adverse events after a vaccination. We used the Kaggle COVID-19 Vaccine Adverse Event Reporting System dataset to gather vaccine records.
We developed a web application to support individuals who want to understand what side effects they may experience from a COVID-19 vaccination, based on side effects reported after previous vaccine administrations. Our site uses machine learning and user input to predict how a user will respond post-vaccination. A Logistic Regression model predicts the severity category of the possible side effects ("Mild", "Moderate", or "Severe"), and a KNN model determines the possible side effects specific to each user.
Perform ETL and create a web-based platform that gathers user inputs and returns the side effects the user may experience. From the output, a user can look up definitions of the medical terms associated with their predicted side effects.
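To make the severity prediction concrete, here is a minimal sketch of multinomial logistic regression (softmax regression trained with batch gradient descent) in dependency-free Python. The two toy features, the training points, the learning rate, the epoch count, and all helper names are assumptions for illustration; the deployed model, its features, and its training code are not reproduced here.

```python
import math

CLASSES = ["Mild", "Moderate", "Severe"]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(weights, x):
    """Probability of each class for one feature vector (bias term appended)."""
    xb = list(x) + [1.0]
    scores = [sum(w_j * x_j for w_j, x_j in zip(w, xb)) for w in weights]
    return softmax(scores)


def fit(X, y, epochs=5000, lr=0.5):
    """Fit one weight vector per class with full-batch gradient descent."""
    n_features = len(X[0]) + 1  # +1 for the bias term
    weights = [[0.0] * n_features for _ in CLASSES]
    for _ in range(epochs):
        grads = [[0.0] * n_features for _ in CLASSES]
        for x, label in zip(X, y):
            probs = predict_proba(weights, x)
            xb = list(x) + [1.0]
            for c, name in enumerate(CLASSES):
                # Gradient of cross-entropy: predicted probability minus target.
                err = probs[c] - (1.0 if name == label else 0.0)
                for j in range(n_features):
                    grads[c][j] += err * xb[j]
        for c in range(len(CLASSES)):
            for j in range(n_features):
                weights[c][j] -= lr * grads[c][j] / len(X)
    return weights


def predict(weights, x):
    """Return the most probable severity class."""
    probs = predict_proba(weights, x)
    return CLASSES[probs.index(max(probs))]


# Invented, already-scaled toy features; not taken from VAERS.
X = [(0.0, 0.0), (0.1, 0.1), (1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
y = ["Mild", "Mild", "Moderate", "Moderate", "Severe", "Severe"]

weights = fit(X, y)
predictions = [predict(weights, x) for x in X]
```

On real data a library implementation would be the more practical choice; the hand-written version only shows what the model computes.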
Tools, Languages, & Libraries Utilized
- ETL
  - Combined 3 datasets
  - Removed non-COVID vaccine data
  - Researched symptoms and assigned them to categorical groupings
  - Features: vaccine manufacturer, dose series number, age, gender, preexisting medical conditions, and prior vaccination-related complications
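The ETL steps above can be sketched in plain Python. This is a minimal illustration rather than the project's actual pipeline: the toy rows, the column names (`VAERS_ID`, `VAX_TYPE`, `VAX_MANU`, and so on), the `build_records` helper, and the symptom-to-severity mapping are all assumptions made for the example.

```python
from collections import defaultdict

# Toy rows standing in for the three source tables (values invented).
patients = [
    {"VAERS_ID": 1, "AGE_YRS": 34, "SEX": "F"},
    {"VAERS_ID": 2, "AGE_YRS": 61, "SEX": "M"},
    {"VAERS_ID": 3, "AGE_YRS": 45, "SEX": "F"},
]
vaccines = [
    {"VAERS_ID": 1, "VAX_TYPE": "COVID19", "VAX_MANU": "MODERNA", "VAX_DOSE_SERIES": "1"},
    {"VAERS_ID": 2, "VAX_TYPE": "FLU4", "VAX_MANU": "OTHER", "VAX_DOSE_SERIES": "1"},
    {"VAERS_ID": 3, "VAX_TYPE": "COVID19", "VAX_MANU": "PFIZER", "VAX_DOSE_SERIES": "2"},
]
symptoms = [
    {"VAERS_ID": 1, "SYMPTOM": "Headache"},
    {"VAERS_ID": 1, "SYMPTOM": "Pyrexia"},
    {"VAERS_ID": 2, "SYMPTOM": "Injection site pain"},
    {"VAERS_ID": 3, "SYMPTOM": "Anaphylactic reaction"},
]

# Hypothetical grouping for illustration only; not the project's mapping.
SEVERITY_BY_SYMPTOM = {
    "Headache": "Mild",
    "Injection site pain": "Mild",
    "Pyrexia": "Moderate",
    "Anaphylactic reaction": "Severe",
}
SEVERITY_RANK = {"Mild": 0, "Moderate": 1, "Severe": 2}


def build_records(patients, vaccines, symptoms, severity_by_symptom):
    """Join the three tables on VAERS_ID, keep COVID rows, label severity."""
    # Step 1: drop non-COVID vaccine rows (keeps the last row per report ID).
    covid = {v["VAERS_ID"]: v for v in vaccines if v["VAX_TYPE"] == "COVID19"}

    # Step 2: group reported symptoms by report ID.
    symptoms_by_id = defaultdict(list)
    for row in symptoms:
        symptoms_by_id[row["VAERS_ID"]].append(row["SYMPTOM"])

    # Step 3: combine the tables and label each report with its worst category.
    records = []
    for patient in patients:
        report_id = patient["VAERS_ID"]
        if report_id not in covid:
            continue
        reported = symptoms_by_id[report_id]
        labels = [severity_by_symptom[s] for s in reported if s in severity_by_symptom]
        if not labels:
            continue  # no categorized symptom, so nothing to learn from
        worst = max(labels, key=SEVERITY_RANK.get)
        records.append({**patient, **covid[report_id], "SYMPTOMS": reported, "SEVERITY": worst})
    return records


records = build_records(patients, vaccines, symptoms, SEVERITY_BY_SYMPTOM)
```

Labeling each report with its most severe symptom category is one possible design choice among several; the subjective nature of that grouping is discussed under Challenges.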
- User Interface
  - Age - #
  - Gender - F/M
  - Taking Other Medications - Y/N
  - Pre-existing Conditions - Y/N
  - Allergies to Medications, Food or Other Products - Y/N
  - Previous Adverse Reactions to Vaccines - Y/N
  - Vaccine Type: Pfizer, Moderna, Janssen
  - Dose Series: First Dose or Second Dose
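As a rough sketch of how the form fields above could be validated and turned into a numeric feature vector for a model, consider the following. The `UserInput` class, its field names, the age bounds, and the feature ordering are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List

MANUFACTURERS = ("Pfizer", "Moderna", "Janssen")


@dataclass
class UserInput:
    age: int
    gender: str                    # "F" or "M"
    other_medications: bool        # Y/N
    preexisting_conditions: bool   # Y/N
    allergies: bool                # Y/N
    prior_vaccine_reaction: bool   # Y/N
    vaccine: str                   # one of MANUFACTURERS
    dose: int                      # 1 = first dose, 2 = second dose

    def to_features(self) -> List[float]:
        """Validate the form values and encode them as numbers."""
        if not 0 <= self.age <= 120:
            raise ValueError("age out of range")
        if self.gender not in ("F", "M"):
            raise ValueError("gender must be 'F' or 'M'")
        if self.vaccine not in MANUFACTURERS:
            raise ValueError("unknown vaccine type")
        if self.dose not in (1, 2):
            raise ValueError("dose must be 1 or 2")
        # One-hot encode the vaccine type so no ordering is implied.
        one_hot = [1.0 if self.vaccine == m else 0.0 for m in MANUFACTURERS]
        return [
            float(self.age),
            1.0 if self.gender == "F" else 0.0,
            float(self.other_medications),
            float(self.preexisting_conditions),
            float(self.allergies),
            float(self.prior_vaccine_reaction),
            1.0 if self.dose == 2 else 0.0,
        ] + one_hot


features = UserInput(
    age=42,
    gender="F",
    other_medications=False,
    preexisting_conditions=True,
    allergies=False,
    prior_vaccine_reaction=False,
    vaccine="Moderna",
    dose=2,
).to_features()
# features == [42.0, 1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
```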
- EDA
  - KNN to build a list of side effects associated with the user's input selections and output a word cloud of actual reported side effects
  - Logistic Regression model, which reached 87% accuracy
  - Heroku to deploy the app
  - User interface
  - Tableau to further review possible side effect outcomes
  - Word cloud linked to a dictionary so users can look up unfamiliar medical terms in their personalized predicted side effects
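The KNN step in the list above can be illustrated with a small dependency-free sketch: find the k reports closest to the user's feature vector and count the side effects they list, which yields the word frequencies a word cloud needs. The toy reports, the two-feature layout (scaled age, second-dose flag), the choice of Euclidean distance, and the `nearest_side_effects` helper are assumptions for illustration, not the project's code.

```python
import math
from collections import Counter


def nearest_side_effects(query, reports, k=3):
    """Count symptoms reported by the k reports closest to the query."""
    ranked = sorted(reports, key=lambda r: math.dist(query, r["features"]))
    counts = Counter()
    for report in ranked[:k]:
        counts.update(report["symptoms"])
    return counts


# Invented reports: features are (age scaled to 0-1, 1.0 if second dose).
reports = [
    {"features": (0.30, 0.0), "symptoms": ["Headache", "Fatigue"]},
    {"features": (0.32, 0.0), "symptoms": ["Headache"]},
    {"features": (0.35, 1.0), "symptoms": ["Pyrexia", "Chills"]},
    {"features": (0.80, 1.0), "symptoms": ["Dizziness"]},
    {"features": (0.28, 0.0), "symptoms": ["Fatigue", "Headache", "Myalgia"]},
]

counts = nearest_side_effects((0.31, 0.0), reports, k=3)
# The three nearest reports are all first-dose reports of similar age,
# so counts is Counter({"Headache": 3, "Fatigue": 2, "Myalgia": 1}).
```

Because distance-based methods are sensitive to feature scale, the sketch assumes the features were scaled to comparable ranges beforehand.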
Challenges
- As data scientists, we have limited medical expertise, and many medical symptom terms had to be researched. As a result, the assignment of category groups and the classification of side effect severity were subjective.
- Model selection: initially the KNN model returned predominantly "Mild" predictions for user input data, an imbalance that is consistent with vaccine safety. We kept the KNN model to provide side effects for the user interface and our outputs for the word cloud. We then chose the Logistic Regression model, given the large amount of data gathered, to produce the categorical outputs "Mild", "Moderate", or "Severe".
- The dataset covers COVID-19 side effects only from December 2020 to March 2021.
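The imbalance toward "Mild" described above is worth checking explicitly, because a model that always answers "Mild" can still score a high accuracy. The sketch below uses invented labels (not project results) to compare class shares, overall accuracy, and per-class recall; the helper names are hypothetical.

```python
from collections import Counter


def class_shares(labels):
    """Fraction of samples that fall in each class."""
    counts = Counter(labels)
    return {label: n / len(labels) for label, n in counts.items()}


def per_class_recall(y_true, y_pred):
    """Share of each true class that the model predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Invented example: 8 of 10 reports are "Mild" and the model always says "Mild".
y_true = ["Mild"] * 8 + ["Moderate", "Severe"]
y_pred = ["Mild"] * 10

shares = class_shares(y_true)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
# accuracy is 0.8, yet recall for "Moderate" and "Severe" is 0.0.
```

Reporting per-class recall alongside accuracy makes it easier to see whether the rarer "Moderate" and "Severe" classes are being predicted at all.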
Future Implications
- Our site is scalable given more time and resources. Data is readily available for all vaccinations, so we could potentially expand into a general vaccine information site that helps patients understand future risks and make informed decisions when considering upcoming vaccinations.
- We would go directly to the source instead of Kaggle and extract data from the beginning of COVID vaccinations to the present to obtain more data.
- Recruit a medical professional to define the mild, moderate, and severe side effect classes and review the quality of the data.
- Dive deeper into the reported patient data beyond side effects to look for trends that could improve the predicted probabilities.
COVID-19 Vaccine Adverse Event Reporting System
Amy Bednarz, Darrell Horich, Taylor Lyons, Samantha Perez






