This project demonstrates how to use Spark MLlib end to end, from training models to predicting the category of e-commerce products.
Training Stage:
- Uses TF-IDF to vectorize the text content
- Trains the model with classifier algorithms such as Naive Bayes, one-vs-rest (OvR) logistic regression, and random forest
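
To make the TF-IDF step concrete, here is a minimal Spark-free sketch in plain Python. It is not this project's code: it only illustrates the weighting, using the smoothed formula idf = log((m + 1) / (df + 1)) documented for Spark MLlib's IDF, where m is the number of documents and df is the number of documents containing the term. The function name `tfidf` and the sample product titles are made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Take a list of token lists and return one {term: weight} dict per document."""
    m = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency: log((m + 1) / (df + 1)).
    idf = {term: math.log((m + 1) / (count + 1)) for term, count in df.items()}
    # Weight of a term in a document: raw term count times IDF.
    return [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in docs
    ]


# Hypothetical product titles, already split into tokens.
titles = [
    "red cotton shirt".split(),
    "blue cotton shirt".split(),
    "wireless phone charger".split(),
]
vectors = tfidf(titles)
```

Terms that appear in few titles (such as "red") receive a higher weight than terms shared by several titles (such as "cotton"), which is what lets a classifier focus on the distinguishing words.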
Predict Stage:
- Loads the trained models from HDFS
- Uses the models to classify each product's category
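
The prediction steps above could look roughly like the following PySpark sketch. It rests on several assumptions that the description does not confirm: that the training stage saved a fitted `PipelineModel`, that the pipeline reads its text from a column named `title`, and that the caller supplies the HDFS path. The function name `predict_categories` is hypothetical.

```python
def predict_categories(spark, model_path, titles):
    """Load a saved pipeline model and return (title, predicted label index) pairs."""
    # Imported here so this module can still be imported where Spark is not installed.
    from pyspark.ml import PipelineModel

    # Assumption: training saved a fitted PipelineModel at model_path,
    # for example a hypothetical "hdfs:///models/product-category".
    model = PipelineModel.load(model_path)

    # Assumption: the pipeline expects the raw text in a column named "title".
    products = spark.createDataFrame([(t,) for t in titles], ["title"])

    # transform() appends a "prediction" column (the default output column
    # of Spark ML classifiers) holding the numeric label index.
    rows = model.transform(products).select("title", "prediction").collect()
    return [(row["title"], row["prediction"]) for row in rows]
```

It would be called with an active `SparkSession`, for instance one obtained from `SparkSession.builder.getOrCreate()`. The returned prediction is a numeric label index, so mapping it back to a category name requires the label ordering used during training.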