MachineLearningProject

Authors: António Pinto, Davide Farinati, Mohamed Elbawab, Tomás de Sá

Grade: 18.33/20

Average: 14.8, Median: 15.05, Q75: 16.57

Best project in the course

Context

This paper explores how to predict the taxes that the residents of a newly found planet should pay, based on their personal information. The biggest challenge of the whole process was handling the size of the dataset. Models from sklearn, namely Decision Trees, Random Forest, Gradient Boosting and Neural Networks, were used to build a solid predictive model. Gradient Boosting proved to be the best model, with 87% accuracy on the training set and 86.6% on the validation set. Different approaches were analyzed for each model to ensure that all options were explored, but the number of categorical values in the dataset may have been a limiting factor, both because of the time cost and because of how well each model can handle them.
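The model comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the data is synthetic (generated with `make_classification`), the task is assumed to be classification because accuracy is the reported metric, and all hyperparameters and the 70/30 split are assumptions chosen only to keep the example fast.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): the real project loads its own dataset.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)

# Hold out a validation set; the 70/30 split is an illustrative choice.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The four model families named in the text, with illustrative hyperparameters.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Neural Network": MLPClassifier(
        hidden_layer_sizes=(32,), max_iter=500, random_state=0
    ),
}

# Fit each model and record (train accuracy, validation accuracy).
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_val, y_val))
    print(f"{name}: train={scores[name][0]:.3f}, validation={scores[name][1]:.3f}")
```

Comparing train and validation accuracy side by side, as in the 87% and 86.6% figures reported above, is a quick way to check that a model is not overfitting.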

Reproducibility

To reproduce the project, you can find the dataset in the data folder and the project description in another folder. Be sure to also download the requirements.txt file. Set the data import paths in the notebook to match the locations on your device.
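One way to keep the path setup in a single place is a small helper like the one below. It is only a sketch: the function name `resolve_data_file` is hypothetical and not part of the project, and the only detail taken from this README is that the dataset lives in a folder called `data`.

```python
from pathlib import Path


def resolve_data_file(project_root, filename, data_dir="data"):
    """Return the path to a dataset file inside the project's data folder.

    project_root: where the repository was downloaded on your device.
    filename: name of the dataset file (not specified in this README).
    data_dir: folder holding the dataset; "data" per the README.
    """
    path = Path(project_root).expanduser() / data_dir / filename
    if not path.is_file():
        # Fail early with a clear message instead of a later read error.
        raise FileNotFoundError(
            f"Dataset not found at {path}; adjust project_root or filename."
        )
    return path
```

In the notebook, a call such as `resolve_data_file("~/MachineLearningProject", "dataset.csv")` could then be used wherever the data is loaded; both arguments here are placeholders to be replaced with your own location and the actual file name.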