AVrachimis/post-sentiment-and-subreddit-classification

Single Label Multi Class Text Classification

The script consists of two parts:

  • Part A: Sentiment Polarity Classification (5 distinct labels)
  • Part B: Subreddit Classification (20 distinct labels)

Both parts involve the following steps:

  1. Data pre-processing

     • Tokenization
     • Normalization
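
The README does not say which normalization rules the script applies, so the following is only a minimal sketch, assuming lowercasing as the normalization step and a simple regular expression as the tokenizer. The names `TOKEN_RE` and `preprocess` are hypothetical and do not come from the repository.

```python
import re

# Hypothetical token pattern (an assumption): runs of lowercase letters,
# digits, or apostrophes. The actual script may tokenize differently.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def preprocess(text):
    """Lowercase the text (normalization), then split it into tokens."""
    return TOKEN_RE.findall(text.lower())


print(preprocess("I LOVED this movie, didn't you?"))
# ['i', 'loved', 'this', 'movie', "didn't", 'you']
```

Stemming, lemmatization, or stop-word removal could be added inside `preprocess` if the task calls for them.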

  2. Vectorization

     • One-Hot Encoding
     • TF-IDF
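
As a rough illustration of the two vectorization schemes, the sketch below uses scikit-learn. It assumes that "One-Hot Encoding" here means a binary bag-of-words representation (`CountVectorizer(binary=True)`); the toy corpus is invented for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Toy corpus, invented for illustration only.
docs = [
    "good movie good plot",
    "bad movie",
    "good acting",
]

# Binary bag-of-words: 1 if the term occurs in the document, else 0.
onehot = CountVectorizer(binary=True)
X_onehot = onehot.fit_transform(docs)

# TF-IDF: term counts reweighted by inverse document frequency,
# with each row L2-normalized (the scikit-learn default).
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(docs)

# Recover the vocabulary in column order.
vocab = sorted(onehot.vocabulary_, key=onehot.vocabulary_.get)
print(vocab)
print(X_onehot.toarray())
print(X_tfidf.shape)
```

Both vectorizers return sparse matrices with one row per document and one column per vocabulary term, so either can be passed to the classifiers in the next step.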

  3. Model Creation

     • Logistic Regression
     • SVC
     • Random Forest
     • BernoulliNB
     • Decision Tree
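
The five model families listed above all exist as scikit-learn estimators, so they can be trained through one loop. This is a sketch under assumptions: the toy texts, labels, and hyperparameters (`max_iter`, `random_state`) are placeholders, not values taken from the script.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Toy training data, invented for illustration only.
texts = ["great film", "loved it", "awful film",
         "hated it", "great acting", "awful acting"]
labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

X = TfidfVectorizer().fit_transform(texts)

# One estimator per model family named in the list above.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVC": SVC(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "BernoulliNB": BernoulliNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

fitted = {}
for name, model in models.items():
    fitted[name] = model.fit(X, labels)
    print(name, model.predict(X[:1]))
```

In practice the models would be compared on a held-out split rather than on the training data used here.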

  4. Parameter Tuning

     • GridSearchCV
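
A minimal sketch of tuning with `GridSearchCV`, assuming a TF-IDF plus logistic regression pipeline. The parameter grid, fold count, and toy data are illustrative choices, not the ones used in the script.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy data, invented for illustration only (four examples per class).
texts = [
    "great film", "loved the plot", "great acting", "loved this film",
    "awful film", "hated the plot", "awful acting", "hated this film",
]
labels = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: step name, double underscore, parameter name.
param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__C": [0.1, 1.0, 10.0],
}

# cv=2 keeps the toy example small; a real run would use more folds.
search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
search.fit(texts, labels)

print(search.best_params_)
print(search.best_score_)
```

Wrapping the vectorizer and the classifier in one `Pipeline` means the vectorizer is refit on each training fold, which avoids leaking vocabulary statistics from the validation fold.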

  5. Error Analysis
  6. Word embeddings, used to optimize the models created

     • Word2Vec
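
The README does not describe how the Word2Vec vectors are fed to the classifiers. One common approach, assumed here, is to average the vectors of a document's tokens into a single fixed-length feature vector. The sketch below shows only that averaging step. `toy_embeddings` is a hand-written stand-in for a trained Word2Vec lookup, and `document_vector` is a hypothetical helper name.

```python
def document_vector(tokens, embeddings, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the values of each dimension across all vectors.
    return [sum(values) / len(known) for values in zip(*known)]


# Stand-in for trained word vectors (values invented for illustration).
toy_embeddings = {
    "good": [1.0, 0.0],
    "bad": [-1.0, 0.0],
    "movie": [0.0, 1.0],
}

print(document_vector(["good", "movie"], toy_embeddings, dim=2))  # [0.5, 0.5]
print(document_vector(["unseen"], toy_embeddings, dim=2))         # [0.0, 0.0]
```

The resulting vectors can replace the one-hot or TF-IDF features as classifier input. Tokens missing from the embedding vocabulary are skipped rather than raising an error.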

About

A Python script that predicts sentiment polarity and classifies posts by subreddit for a given dataset.