kevin3302/fake-news-detection

fake-news-detection

I developed an end-to-end fake-news detection pipeline that combines rigorous exploratory data analysis with production-ready machine-learning practices. I began by profiling class balance and text-length distributions, then built a clean corpus using regular expressions, contraction expansion, stop-word removal, and lemmatization. Unigram, bigram, and trigram frequency analysis surfaced word- and character-level patterns, while VADER sentiment scores captured tonal cues that traditional lexical features often miss.
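
The cleaning and n-gram steps can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the repository's actual code: `CONTRACTIONS`, `STOP_WORDS`, `clean_text`, and `ngram_counts` are hypothetical names, the lookup tables are deliberately tiny, and lemmatization and VADER scoring are left out because they need an external library such as NLTK.

```python
import re
from collections import Counter

# Hypothetical, deliberately small lookup tables for illustration only;
# a real pipeline would likely use fuller lists from an NLP library.
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "don't": "do not", "it's": "it is"}
STOP_WORDS = {"a", "an", "the", "is", "it", "to", "of", "and", "in", "that", "this"}


def clean_text(text):
    """Lowercase, drop URLs, expand contractions, keep letters only, remove stop words."""
    text = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    text = re.sub(r"https?://\S+", " ", text)   # strip URLs before touching punctuation
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    text = re.sub(r"[^a-z\s]", " ", text)       # replace digits and punctuation with spaces
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def ngram_counts(tokens, n):
    """Count contiguous n-grams (n=1 unigrams, n=2 bigrams, n=3 trigrams)."""
    return Counter(zip(*(tokens[i:] for i in range(n))))


tokens = clean_text("It's shocking: the senator can't explain this! https://example.com")
bigrams = ngram_counts(tokens, 2)
```

Running `ngram_counts` separately on the real and fake subsets and comparing the most common entries is one simple way to see which phrases distinguish the two classes.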

For feature engineering, I combined sentiment signals with high-dimensional TF-IDF vectors, then trained and tuned Logistic Regression and SVM classifiers under stratified five-fold cross-validation. The model achieved an F1-score above 92% with balanced performance across classes, and the confusion matrix helped verify that the classifier was not biased toward the majority class or systematically misclassifying borderline examples. To support interpretability and bias detection, I also examined the top TF-IDF tokens and analyzed misclassified samples.
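
To illustrate why stratified folds and a confusion matrix matter here, the sketch below assigns samples to folds so each fold keeps the overall class ratio, then derives per-class F1 from a confusion matrix. It uses only the standard library and is an assumption-laden illustration, not the project's implementation: the function names are hypothetical, no shuffling is done, and in practice a library routine such as scikit-learn's `StratifiedKFold` would more likely be used.

```python
from collections import defaultdict


def stratified_folds(labels, k=5):
    """Assign sample indices to k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)  # each class is spread evenly over the folds
    return folds


def confusion_matrix(y_true, y_pred, classes):
    """Nested dict: matrix[true_label][predicted_label] -> count."""
    matrix = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_f1(matrix):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    scores = {}
    for c in matrix:
        tp = matrix[c][c]
        fp = sum(matrix[t][c] for t in matrix if t != c)
        fn = sum(matrix[c][p] for p in matrix[c] if p != c)
        denom = 2 * tp + fp + fn
        scores[c] = 2 * tp / denom if denom else 0.0
    return scores


# A classifier that always predicts the majority class looks fine on
# accuracy (80% here) but scores an F1 of 0 on the minority class.
y_true = ["real"] * 8 + ["fake"] * 2
y_pred = ["real"] * 10
scores = per_class_f1(confusion_matrix(y_true, y_pred, ["real", "fake"]))
```

In this toy case `scores["fake"]` is `0.0`, which is the kind of majority-class bias that per-class metrics expose and overall accuracy hides.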
