Welcome to Machine Learning From Scratch, a comprehensive, beginner-friendly, and professional repository containing classic Machine Learning algorithms implemented in two ways:
- From Scratch (Pure Python): for a complete understanding of the math and logic
- With Scikit-learn: for a fast, production-ready implementation
This project includes only Machine Learning algorithms (no deep learning), with clean code, real datasets, and Jupyter notebooks for practice.
Link: Machine Learning From Scratch
The goal is to help students, developers, and AI enthusiasts learn Machine Learning by implementing the core algorithms manually first, then applying the same models with Scikit-learn for practical use.
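As a rough illustration of the "from scratch" half of this approach, here is a minimal sketch of simple linear regression in pure Python using the closed-form least-squares solution. The function names and the toy data are hypothetical and chosen only for this example; they are not taken from the repository's code.

```python
# Minimal sketch: simple linear regression (one feature) in pure Python.
# Names and data are illustrative assumptions, not the repository's own code.

def fit_simple_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least squares: slope = S_xy / S_xx
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("xs must contain at least two distinct values")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(xs, slope, intercept):
    """Apply the fitted line to new inputs."""
    return [slope * x + intercept for x in xs]


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
slope, intercept = fit_simple_linear_regression(xs, ys)
```

For the Scikit-learn half, fitting `sklearn.linear_model.LinearRegression` on the same data (with the feature reshaped into a 2-D array) should yield matching `coef_` and `intercept_` values up to floating-point error, which makes a handy sanity check for a hand-written version.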
Each folder includes:
- Python code (from scratch)
- Scikit-learn version
- Dataset(s)
- Jupyter notebook(s)
- Simple Linear Regression
- Polynomial Linear Regression
- Multiple Linear Regression
- Lasso Regression
- Ridge Regression
- Random Forest Regression
- XGBoost Regressor (Scikit-learn version only)
- Decision Tree Regressor
- LightGBM Regressor
- CatBoost Regression
- ElasticNet Regression
- Bayesian Ridge Regression
- Gradient Boosting Regression
- Support Vector Machine (SVM) Regression
- K-Nearest Neighbors (KNN) Regression
- Huber Regression
- Orthogonal Matching Pursuit (OMP)
- Theil-Sen Regression
- Quantile Regression
- Tweedie Regression
- Principal Component Regression (PCR)
- Gamma Regression
- AdaBoost Regression
- Stepwise Regression
  - Forward Stepwise
  - Backward Stepwise
- Logistic Regression
- Random Forest Classification
- Decision Tree Classification
- LightGBM Classification
- K-Nearest Neighbors (KNN) Classification
- Support Vector Machine (SVM) Classification
- Gaussian Naive Bayes Classification
- Bernoulli Naive Bayes Classification
- Multinomial Naive Bayes Classification
- AdaBoost Classification
- CatBoost Classification
- XGBoost Classification
- Gradient Boosting Classification
- Extra Tree Classification
- Linear Discriminant Analysis (LDA)
- K-Means Clustering
- Hierarchical Clustering
- DBSCAN
- PCA (Principal Component Analysis)
- Apriori Algorithm (Association Rule Mining)
- Frequent Pattern Growth (FP-Growth)
- Python 3.x
- NumPy, Pandas, Matplotlib, Seaborn
- Scikit-learn (for sklearn versions only)
Install dependencies:
pip install numpy pandas matplotlib seaborn scikit-learn

This repository is perfect for:
- Students learning ML theory and code
- Preparing for interviews and exams
- Building data science portfolios
- Training models with real datasets
Zohaib Sattar
Email: zabizubi86@gmail.com
LinkedIn: Zohaib Sattar
If you find this project helpful, please star the repo and share it with your network. It motivates further open-source contributions!