A nostalgic first-year university project that turns WhatsApp chat exports into beautiful data stories.
This project holds a special place in my heart—it was my very first dive into the world of data science back in my first year at the University of Waterloo. I was just getting interested in data and wanted a fun, personal way to learn data manipulation and visualization.
One day, I discovered that WhatsApp lets you export entire chat histories as text files. That's when inspiration struck: What if I could turn those messy text files into meaningful insights?
I started by analyzing two chats:
- My high school class group chat - watching the chaos of high school life unfold in data form
- My chat with my closest friend (let's call her Joe)
While analyzing message timestamps with Joe, I stumbled upon something magical: a huge spike in messages every Wednesday around 2pm. At first, I was puzzled. Then it hit me—Wednesday afternoons were when we'd both just gotten back from school, and it was the moment we'd been waiting for all week.
You see, we were absolutely obsessed with Storm and Silence by Rob Thier, a historical romance novel that was being published chapter-by-chapter on Wattpad every Wednesday. The story follows Lilly Linton, a fierce feminist in Victorian London who disguises herself as a man to work for the cold, calculating (and undeniably swoon-worthy) Mr. Rikkard Ambrose. Every Wednesday, a new chapter would drop, and Joe and I would immediately start dissecting every plot twist, every romantic moment, and every one of Lilly's hilarious feminist rants.
Our 2pm message spike wasn't random—it was pure excitement captured in data. Those were simpler days, and honestly? I miss them.
This tool takes exported WhatsApp chat files and transforms them into a structured dataframe for analysis. You can discover:
- 📅 Temporal patterns: When do people message most? (Apparently Wednesdays at 2pm for book fans!)
- 😂 Most popular emojis: What emotions dominate your conversations?
- 💬 Message frequency: Who's the most active in the group chat?
- ⏰ Time-of-day insights: Are you a morning person or a night owl texter?
- 🔤 Popular phrases: What inside jokes or phrases get repeated?
- 📈 Conversation trends: How does your texting behavior change over time?
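Two of these questions (top emojis and most active sender) come down to simple counting. The sketch below is illustrative rather than the project's actual code: the `messages` sample, the function names, and the "Symbol, other" Unicode-category heuristic for spotting emojis are all assumptions.

```python
import unicodedata
from collections import Counter

# Hypothetical sample data: (sender, text) pairs, as a parser might produce.
messages = [
    ("Joe", "NEW CHAPTER 😂😂"),
    ("Me", "Mr. Ambrose said WHAT 😱"),
    ("Joe", "I can't 😂"),
]

def is_emoji_like(ch):
    # Rough heuristic (an assumption, not an official emoji test):
    # most emoji sit in Unicode category "So" (Symbol, other).
    return unicodedata.category(ch) == "So"

def top_emojis(msgs, n=3):
    """Return the n most common emoji-like characters with their counts."""
    counts = Counter(
        ch for _, text in msgs for ch in text if is_emoji_like(ch)
    )
    return counts.most_common(n)

def messages_per_sender(msgs):
    """Count how many messages each sender wrote."""
    return Counter(sender for sender, _ in msgs)

print(top_emojis(messages))
print(messages_per_sender(messages).most_common())
```

The heuristic also counts symbols such as © and treats multi-character emojis (skin tones, joined sequences) as separate characters, so a dedicated emoji library would likely give cleaner results.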
This project was my playground for learning fundamental data science tools:
- Python: Core programming language
- Pandas: Data manipulation and creating dataframes from messy text
- Matplotlib: Creating visualizations to tell the data story
- Regular Expressions (Regex): Parsing the WhatsApp export format
- Datetime: Handling timestamps and temporal analysis
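To give a feel for how regex and datetime fit together, here is a minimal parsing sketch. It assumes one particular export layout (`M/D/YY, H:MM AM/PM - Sender: text`); real exports differ by phone, locale, and app version, so the pattern would need adjusting. The names `STAMP_RE` and `parse_chat` are made up for illustration.

```python
import re
from datetime import datetime

# Assumed line layout (varies by phone, locale, and app version):
#   3/14/18, 2:03 PM - Joe: message text
STAMP_RE = re.compile(
    r"^(\d{1,2}/\d{1,2}/\d{2}), (\d{1,2}:\d{2})\s?([AP]M) - (.*)$"
)

def parse_chat(lines):
    """Turn raw export lines into a list of message records (dicts)."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = STAMP_RE.match(line)
        if match is None:
            # No timestamp: treat the line as a continuation of the
            # previous message (multi-line messages are exported this way).
            if records:
                records[-1]["text"] += "\n" + line
            continue
        date, time, ampm, rest = match.groups()
        sender, sep, text = rest.partition(": ")
        if not sep:
            # Timestamped line without "Sender: " is a system notice; skip it.
            continue
        stamp = datetime.strptime(f"{date} {time} {ampm}", "%m/%d/%y %I:%M %p")
        records.append({"timestamp": stamp, "sender": sender, "text": text})
    return records

# Hypothetical sample lines.
sample = [
    "3/14/18, 2:01 PM - Messages to this chat are now secured.",
    "3/14/18, 2:03 PM - Joe: NEW CHAPTER IS UP",
    "3/14/18, 2:04 PM - Me: reading now",
    "do not spoil it",
]
records = parse_chat(sample)
for record in records:
    print(record["timestamp"], record["sender"], repr(record["text"]))
```

A list of dicts like this can be passed straight to `pandas.DataFrame(records)` to get one row per message.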
- Export your WhatsApp chat:
  - Open any WhatsApp chat
  - Go to Settings → More → Export Chat
  - Choose "Without Media" to get a .txt file
- Run the analysis: Point the code to your .txt file
- Explore the insights:
  - View generated visualizations
  - Discover your messaging patterns
  - Find your own "Wednesday at 2pm" moments!
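Hunting for a "Wednesday at 2pm" moment amounts to counting messages per weekday-and-hour slot. The following is a small sketch of that idea using made-up timestamps; `busiest_slots` is a hypothetical helper, not a function from this project.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical timestamps standing in for a parsed chat.
timestamps = [
    datetime(2018, 3, 14, 14, 3),
    datetime(2018, 3, 14, 14, 4),
    datetime(2018, 3, 15, 9, 30),
    datetime(2018, 3, 21, 14, 10),
]

def busiest_slots(stamps, n=3):
    """Count messages per (weekday name, hour) and return the top n slots."""
    counts = Counter((DAYS[ts.weekday()], ts.hour) for ts in stamps)
    return counts.most_common(n)

for (day, hour), count in busiest_slots(timestamps):
    print(f"{day} {hour:02d}:00 -> {count} messages")
```

With real data, the same counts can be arranged into a 7 x 24 grid and drawn as a heatmap.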
The analysis includes visualizations for:
- Message frequency over time (line charts)
- Hourly/daily message distribution (heatmaps)
- Most used emojis (bar charts)
- Active hours comparison (radar charts)
- Conversation dynamics (who starts conversations, response times)
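Conversation dynamics need a definition of where one conversation ends and the next begins. One simple approach, sketched below, is to treat any silence longer than a threshold as a boundary. The one-hour `GAP`, the sample `events`, and the function name are assumptions chosen for illustration, not the project's actual settings.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical (timestamp, sender) pairs in chronological order.
events = [
    (datetime(2018, 3, 14, 14, 0), "Joe"),
    (datetime(2018, 3, 14, 14, 2), "Me"),
    (datetime(2018, 3, 14, 14, 3), "Joe"),
    (datetime(2018, 3, 14, 21, 0), "Me"),
    (datetime(2018, 3, 14, 21, 5), "Joe"),
]

GAP = timedelta(hours=1)  # assumed silence that marks a new conversation

def conversation_dynamics(events, gap=GAP):
    """Return (conversation starters per sender, reply delays per sender)."""
    starters = Counter()
    reply_delays = {}  # sender -> list of timedeltas
    prev_ts, prev_sender = None, None
    for ts, sender in events:
        if prev_ts is None or ts - prev_ts > gap:
            # First message, or first message after a long silence.
            starters[sender] += 1
        elif sender != prev_sender:
            # A reply: the sender changed within the same conversation.
            reply_delays.setdefault(sender, []).append(ts - prev_ts)
        prev_ts, prev_sender = ts, sender
    return starters, reply_delays

starters, reply_delays = conversation_dynamics(events)
print(dict(starters))
for sender, delays in reply_delays.items():
    average = sum(delays, timedelta()) / len(delays)
    print(sender, "average reply time:", average)
```

The threshold matters a lot: a shorter gap splits chats into more conversations and shifts who counts as the starter, so it is worth trying a few values.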
This project taught me:
- How to wrangle messy, real-world data
- The power of regex for text parsing
- Data cleaning and preprocessing techniques
- Creating meaningful visualizations
- That data can tell beautiful, personal stories
While this project is complete and functional, here are some ideas I'd love to explore:
- Desktop GUI Application: Create a user-friendly interface where anyone can:
  - Drag-and-drop their chat exports
  - Select analysis options (emojis, time patterns, word clouds)
  - Generate automatic insights and visualizations
  - Export a shareable report
- Sentiment Analysis: Add NLP to detect conversation mood over time
- Multi-Language Support: Handle WhatsApp chats in different languages
- Comparison Mode: Compare messaging patterns between different chats
- Interactive Dashboards: Use Plotly or Streamlit for interactive exploration
- Group Chat Analytics: Special features for analyzing group dynamics (who responds to whom, conversation threads)
This project works with any WhatsApp chat export. Whether you want to:
- Analyze your family group chat dynamics
- See how your friendships have evolved over time
- Find out which friend uses the most emojis
- Discover your peak texting hours
The data is there, waiting to tell its story!
This tool processes chat data locally on your machine. No data is uploaded or shared anywhere. Your conversations remain private—this is just for your own exploration and fun!