TonioDominguez/Fake_News_Detector



FAKE NEWS PREDICTOR PROJECT 📰🔍


Make sure that what you are being told is the truth


In today's world, we're bombarded with information constantly. It's everywhere: social media, digital newspapers, TV news, instant messaging. But here's the thing: not all of it is reliable. Sometimes what seems legit might be totally bogus. That's why it's so important to double-check stuff. Don't just swallow everything you see or hear.

Take a step back, ask questions, and look for different angles.

It's kinda like sifting through a pile of junk to find that one golden nugget. Sure, it takes a bit of effort, but it's worth it. Because when you're armed with the truth, you're not just informed, you're empowered. And that's what we need to navigate this crazy world - a healthy dose of skepticism and a keen eye for the real deal.

So, let's embrace our inner skeptics. Let's challenge the status quo and demand accountability from those who shape the narratives we consume. In a world where information is power, it's up to us to wield that power responsibly. And that starts with being discerning consumers of information.


But what happens when we're drowning in information overload?


This is where technology can be our greatest ally. With the right tools and strategies, technology can help us sift through the noise and find the signal—the truth—in the midst of the chaos. From fact-checking websites and browser extensions that detect misinformation to algorithms that curate personalized news feeds based on our interests and credibility ratings, technology offers a multitude of resources to help us discern fact from fiction.


For all these reasons, I present my humble project


Here you have FAKE NEWS DETECTOR! A small personal project that can help you separate fact from fiction.

This detector uses a supervised machine learning classification algorithm (in this case Naive Bayes, chosen from among several models for its accuracy and reliability). It analyzes and evaluates texts (article titles and news extracts), identifying patterns and features that reveal their authenticity. Trained on a dataset of 2000 entries (labeled as fake or real news), the algorithm learns to detect signals of misinformation, bias and manipulation.

My goal is to provide you with an additional layer of security and confidence when evaluating whether a piece of information is true or fake. Whether you're scrolling through social media, reading an article or browsing a website, this FAKE NEWS DETECTOR can offer you a reliable guide to discern what you are being told.
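To give a feel for how a Naive Bayes text classifier works, here is a minimal sketch written with the Python standard library only. It is not the code from the notebooks: the class name `TinyNaiveBayes`, the `tokenize` helper and the four training sentences are all invented for illustration, and the real model is trained on the 2000-entry dataset rather than on toy data.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class TinyNaiveBayes:
    """Multinomial Naive Bayes with Laplace (add-one) smoothing.

    Hypothetical teaching class, not the model used in the project.
    """

    def fit(self, texts, labels):
        # Count how many documents and how many words belong to each class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            scores[label] = score
        # The class with the highest log score wins.
        return max(scores, key=scores.get)


# Invented toy examples, only to make the sketch runnable.
train_texts = [
    "shocking miracle cure doctors hate",
    "you won't believe this secret trick",
    "government publishes annual budget report",
    "city council approves new transport plan",
]
train_labels = ["fake", "fake", "real", "real"]

model = TinyNaiveBayes().fit(train_texts, train_labels)
print(model.predict("secret miracle trick revealed"))
print(model.predict("council approves annual budget"))
```

With so little data the sketch only reacts to vocabulary overlap, which is also why a real detector needs a much larger and more recent training set.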



PROJECT STRUCTURE 📂


This project is developed through a bunch of Python notebooks.

  1. FAKE NEWS PREPROCESSING & EDA: A first notebook where I clean and process the datasets that I'm going to use to build my prediction model. I also perform EDA (Exploratory Data Analysis) on crucial variables and Sentiment Analysis.
  2. ML DEVELOPING & TESTING: Here, I select, fine-tune, and measure the results of the ML model that best fits the project. I also perform tests of the complete model with a dataset external to the training data.
  3. FAKE NEWS DETECTOR PER SOLO IMPUTS: As a summary, I create this Notebook by importing the complete model and adding some functions that clean the user input to process their text and test it.
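The input-cleaning step mentioned in the third notebook could look roughly like the sketch below. The function name `clean_text` and the exact rules (lowercasing, dropping URLs, stripping accents and punctuation) are assumptions made for illustration; the notebook may clean the text differently.

```python
import re
import unicodedata


def clean_text(raw):
    """Normalize a user-supplied text before it reaches the classifier.

    Hypothetical helper: the cleaning rules here are assumptions,
    not the ones used in the project notebooks.
    """
    text = raw.lower()
    # Drop URLs, which rarely carry useful signal for the model.
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    # Split accented letters into base letter + accent, then drop the accents.
    # Note that this also turns "ñ" into "n".
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Keep only ASCII letters, digits and whitespace.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse repeated whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


sample = "¡ÚLTIMA HORA! Lee más en https://example.com/noticia ..."
print(clean_text(sample))  # ultima hora lee mas en
```

Whatever rules are chosen, the same cleaning has to be applied to the training data and to the user input, otherwise the model sees tokens it was never trained on.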

As a final step, I develop a Streamlit app where I tell the full story of how the project was created and let you use the predictor.



PROJECT DEVELOPMENT TIME ⏰


10 days from 03/12/24 to 03/22/24



SOURCES ⛲


A few sites I have turned to for enlightenment in the creation process:

Kaggle: Where I found the datasets that started it all.
Roberto Esteves Github: Where I looked into a project that also developed a fake news classifier.
sentiment-analysis-spanish: Library for performing sentiment analysis NLP.
My Github repositories: Where I reviewed old labs to refresh my memory on working with supervised classification models.
Streamlit Docs: Documentation checked to know how to tame this beast.
ChatGPT: The snitch that kept whispering to me what I was doing wrong when the code errors were driving me crazy.



LET'S TALK ABOUT FEELINGS ⭐

Hope you found this project interesting!


Hello, I'm Toño Domínguez, and this is my final project for the Data Analyst Bootcamp at Ironhack, which has been my life for the past two months. With this work, I aim to summarize a large part of the knowledge acquired during this time and, in a way, put a final signature on my introduction to the world of data science.

The theme of this project is not random. I am a journalist by profession — yes, I know, it's rare... a journalist becoming a data analyst? — and the field of information, especially politics, is one of my great passions. That's why developing a final project that combines the world of information with the world of data made a lot of sense to me.

My Fake News predictor is not perfect, of course not. I need to update it to make it more effective with more recent news (it struggles a bit because it was trained on information from five years ago) and find a way to overcome the geographical barrier.

And yet... it works! I'm very proud of it because developing a tool that can help us discern what is real and what is fictional is, I believe, one of the greatest needs we have today.

And I'm also proud of myself, hell yes!

I hope you find it interesting and that it inspires you, just as many other projects have inspired me to develop things that until recently I wouldn't have believed I was capable of.

Best regards and VIVA LA DATA!


Can't forget the acknowledgments


Thanks to several people for being there, not only during the final project but throughout this journey in the Bootcamp.

To my classmates Carlos, Óscar, Axier and Víctor. Thank you for motivating me with your work throughout the entire time and for being such great colleagues.

To Xisca for that masterclass on Big Data and the help with the SQL block. It was a pleasure to meet such a whirlwind of a person!

To Isi. I don't have enough words to thank you for all your work and dedication with us. I can't imagine having had a better mentor than you.

To Gali, the guy who one day suggested to me that maybe Ironhack could be a good idea. And who, a month before it started, was more excited than I was.

To Cristina, for being by my side all this time and helping me navigate my frustrations until they became achievements.


About

ML Binary Fake News Predictor | NLP
