This repository was created as part of the Machine Learning course in the Master’s program in Computer Science.
It includes datasets, exercises, lecture materials, implementations, and a final project focused on applying machine learning to a problem related to transport phenomena.
Machine Learning (ML) is a subfield of Artificial Intelligence that develops algorithms capable of learning patterns from data and making predictions or decisions without being explicitly programmed.
The core tasks in Machine Learning can be broadly categorized into:
- Supervised Learning: The model is trained on labeled data.
  - Regression: Predicting a continuous value (e.g., predicting house prices).
  - Classification: Predicting a discrete class label (e.g., spam vs. not spam).
- Unsupervised Learning: The model works with unlabeled data to find patterns or structures.
  - Clustering: Grouping similar data points together (e.g., customer segmentation).
  - Dimensionality Reduction: Reducing the number of variables in the data.
- Reinforcement Learning: An agent learns to make decisions by taking actions in an environment to maximize a cumulative reward.
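
To make the supervised setting concrete, here is a minimal, dependency-free sketch of regression: fitting a straight line to labeled data by ordinary least squares. The toy data and the function names `fit_line` and `predict` are made up for illustration and are not taken from the repository's `src/` code.

```python
# Toy labeled data (made up for illustration): the targets follow y = 2*x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both up to the same 1/n factor).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict a continuous value for a new input x."""
    return slope * x + intercept


slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("prediction for x=5:", predict(5.0, slope, intercept))
```

In practice, a library such as scikit-learn would normally be used for this; the hand-written version only shows what "learning from labeled data" amounts to in the simplest case.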
Some of the fundamental algorithms and models studied in this field include:
- Linear and Logistic Regression
- Support Vector Machines (SVM)
- Decision Trees and Random Forests
- K-Means Clustering
- Principal Component Analysis (PCA)
- Artificial Neural Networks (ANN) and Deep Learning
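
As an example of one of the listed methods, the following is a rough sketch of K-Means (Lloyd's algorithm) in plain Python. It assumes 2-D points, user-supplied initial centroids, and at least one iteration; the data and the function name `kmeans` are hypothetical and chosen only for illustration.

```python
def kmeans(points, centroids, n_iter=10):
    """Lloyd's algorithm on 2-D points with user-supplied initial centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(c)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, clusters


# Two well-separated groups of made-up points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
print(clusters)
```

The result depends on the initial centroids, which is why library implementations typically run several initializations and keep the best one.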
- data/: datasets for experiments and analyses
- docs/: lecture notes, papers, summaries, and presentations
- exercises/: exercise lists from the course
- project/: final project (ML applied to a transport phenomenon problem, to be defined)
- results/: plots and figures generated from analyses
- src/: source code implementations for the studied methods
Clone the repository for study and experimentation:
git clone https://github.com/thiagoneye/master-machine_learning.git
cd master-machine_learning

The code implementations in this repository rely on common Python libraries for data science and machine learning. It is recommended to create a virtual environment to manage dependencies.
The main libraries used are:
- NumPy
- Pandas
- Scikit-learn
- Matplotlib
- Seaborn
- SciPy
- pyGAM
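
To see which of these libraries are already available in the active environment, a small check like the following may help. It is only a sketch using the standard library; the helper name `missing_modules` is hypothetical. Note that the import names differ from the package names in two cases: scikit-learn is imported as `sklearn` and pyGAM as `pygam`.

```python
from importlib.util import find_spec

# Import names for the libraries listed above.
REQUIRED = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn", "scipy", "pygam"]


def missing_modules(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed libraries are importable.")
```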
You can install them with pip:
pip install numpy pandas scikit-learn matplotlib seaborn scipy pygam

Most of the material in this repository (study notes and code) was developed with the help of LLMs (ChatGPT and Gemini).