This repository contains a set of mini machine learning projects covering a range of ML concepts, datasets, and problem types (classification, regression, clustering, anomaly detection, and more). Each project is implemented in a Jupyter Notebook and relies mainly on libraries from the Python data science ecosystem.
| Project | Description | Techniques & Tech Stack |
|---|---|---|
| P1 - Sonar Rock vs Mine | Classify sonar signals as rocks or mines. | Logistic Regression, sklearn |
| P2 - Diabetes Prediction | Predict diabetes based on patient data. | Logistic Regression, Random Forest, sklearn |
| P3 - House Price Prediction | Predict housing prices using features like area and location. | Linear Regression, XGBoost |
| P4 - Loan Prediction | Predict loan approval based on applicant details. | Decision Trees, Feature Engineering |
| P5 - Wine Quality | Classify wine quality based on physicochemical data. | Classification, sklearn, Data Preprocessing |
| P6 - Car Price Prediction | Predict resale prices of cars. | Multiple Linear Regression, sklearn |
| P7 - Gold Price Prediction | Forecast gold price trends. | Time Series Analysis, Regression |
| P8 - Heart Disease Detection | Predict heart disease based on diagnostic data. | Classification, Logistic Regression |
| P9 - Credit Card Fraud Detection | Detect fraudulent transactions. | Anomaly Detection, Imbalanced Data Handling |
| P10 - Medical Insurance Cost Prediction | Estimate insurance charges using age, BMI, etc. | Regression, sklearn |
| P11 - Sales Prediction | Predict sales from historical data. | Regression, sklearn |
| P12 - Spam Detection | Classify emails as spam or not spam using text data. | NLP, TF-IDF, Naive Bayes |
| P13 - Customer Segmentation | Cluster customers based on behavior for targeted marketing. | K-Means Clustering, PCA |
| P14 - Parkinson’s Disease Detection | Predict presence of Parkinson’s using voice data. | SVM, Feature Scaling |
| P15 - Titanic Survival Prediction | Predict survival of Titanic passengers. | Logistic Regression, Random Forest |
| P16 - Calories Burned Prediction | Predict calories burned from activity data. | Regression, Feature Engineering |
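
Most of the classification projects in the table follow the same shape: split the data, scale the features, fit a model, and score it on held-out data. The sketch below illustrates that shape with scikit-learn. It is not taken from any of the notebooks: the data is synthetic (generated with `make_classification`), and the split ratio, `random_state`, and `max_iter` values are assumptions chosen only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 samples, 20 numeric features, 2 classes.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Hold out 20% of the rows for evaluation, keeping the class balance in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling and the classifier live in one pipeline, so the scaler is fit
# on the training split only and reused unchanged on the test split.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Score the model on the held-out rows.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print(f"Test accuracy: {accuracy:.3f}")
print("Confusion matrix (rows = true class, columns = predicted class):")
print(cm)
```

Swapping `LogisticRegression` for another estimator (for example a random forest or an SVM) leaves the rest of the pipeline unchanged, which is why the same pattern carries across several projects.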

The projects draw on the following tools:

- Python 3
- Jupyter Notebooks
- scikit-learn
- pandas, numpy, matplotlib, seaborn
- XGBoost, plus models such as Linear Regression and Logistic Regression
- Natural Language Toolkit (for NLP)
- Preprocessing: StandardScaler, LabelEncoder, OneHotEncoder
- Model evaluation: accuracy, precision, recall, confusion matrix
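
As a quick reference for the evaluation metrics listed above, here is a dependency-free sketch of how accuracy, precision, and recall follow from the four confusion-matrix counts for a binary problem. The function names and the toy labels are hypothetical and exist only for illustration; in the notebooks one would more likely call the equivalents in `sklearn.metrics`.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    return {
        # Share of all predictions that were correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that were truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy labels (hypothetical): 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts)   # (3, 1, 1, 3)
print(metrics)  # accuracy, precision, and recall are all 0.75 here
```

On imbalanced data such as fraud detection, accuracy alone can look high while recall on the rare class stays low, so precision and recall are worth reading alongside it.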

To run a project:

- Clone this repository.
- Navigate into any project directory.
- Open the notebook with Jupyter or VS Code.
- Run the cells to see preprocessing, training, evaluation, and results.

This project is licensed under the MIT License.