GitHub - aditya-satope/Brand-Sentiment-Analysis

We briefly explain the salient features of our approach here. In #Approach, we explain each task in detail.

1) Simpler and faster models for binary classification

Binary classification for mobile-theme identification is a comparatively easy task.
This step processes about 4 times as much data as each of the other steps: the ratio of mobile-themed to non-mobile-themed data is about 1:3, and the other tasks only need to run on mobile-themed data.
It therefore makes sense to use simpler and faster models for this step.
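
The load argument above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `stage_loads`, `run_cascade`, and the keyword-based `is_mobile` stand-in are hypothetical names, not code from this repository.

```python
def stage_loads(total_items, mobile_to_other=(1, 3)):
    """Return (items seen by the binary filter, items seen by each later step).

    mobile_to_other is the approximate 1:3 ratio quoted above.
    """
    mobile, other = mobile_to_other
    mobile_items = total_items * mobile // (mobile + other)
    return total_items, mobile_items


def run_cascade(items, is_mobile, expensive_step):
    """Run the cheap filter on every item and the costly step only on items that pass."""
    return [expensive_step(item) for item in items if is_mobile(item)]


# Hypothetical stand-in for a fast classifier: a plain keyword check.
def is_mobile(text):
    return "phone" in text.lower()


filter_load, downstream_load = stage_loads(4000)
print(filter_load / downstream_load)
```

With a 1:3 ratio, one item in four is mobile-themed, so the filter sees four times as many items as any later step.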

2) Translation of all data to English for headline generation and sentiment analysis

Headline generation is a difficult task, and it yielded poor results on multilingual data.
Translating all data to English with an accurate model makes it easier to scale to additional languages, and it also improves performance on other tasks for which superior pretrained English models may already exist.
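
As a rough sketch of this translate-first design, the function below normalises every record to English before any downstream model sees it. The record fields (`text`, `lang`), the `text_en` output field, and the injected `translate` callable are assumptions made for illustration; the README does not specify the actual data schema or translation model.

```python
def to_english(records, translate):
    """Attach an English version of each record's text.

    records:   iterable of dicts with hypothetical keys "text" and "lang".
    translate: callable (text, source_lang) -> English text; a placeholder
               for whatever translation model is actually used.
    """
    normalised = []
    for record in records:
        if record["lang"] == "en":
            english = record["text"]  # already English, no translation needed
        else:
            english = translate(record["text"], record["lang"])
        normalised.append({**record, "text_en": english})
    return normalised
```

Because downstream steps read only `text_en`, supporting a new language means extending the translator rather than retraining the headline or sentiment models.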

3) Regex matching for brand identification

The set of all possible mobile brands is modestly sized.
Regex matching against this set, instead of framing the problem as NER, is much faster and often more reliable.
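
A minimal sketch of regex-based brand matching follows. The brand list is a made-up sample, and `find_brands` is a hypothetical helper; the set and patterns used in the repository may differ.

```python
import re

# Hypothetical sample of brands; the real list is not given in this README.
BRANDS = ["Samsung", "Apple", "Xiaomi", "OnePlus", "Nokia"]

# One case-insensitive alternation over all brands. Longer names come first so
# they win over shorter ones, and \b stops matches inside other words.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(b) for b in sorted(BRANDS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)
_CANONICAL = {b.lower(): b for b in BRANDS}


def find_brands(text):
    """Return canonical brand names found in text, in order of first appearance."""
    found = []
    for match in _PATTERN.finditer(text):
        brand = _CANONICAL[match.group(1).lower()]
        if brand not in found:
            found.append(brand)
    return found
```

The word boundaries keep "pineapple" from matching "Apple", but a plain pattern like this cannot tell the brand from the fruit when "apple" stands alone; that kind of ambiguity is where a pure regex approach would need extra rules.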

4) Using advanced models like T5 for headline generation

We tried many variants, but T5 performed best.
