Skip to content

HossamSaoud/MachineLearning_Notebooks

Repository files navigation

MachineLearning_Notebooks

Here are all my Machine Learning/LLMs/Deep Learning Notebooks.

Project List:

  1. Applied Machine Learning Techniques
  2. NanoGPT
  3. Company Brochure Generator
  4. Meeting Minutes Generator
  5. Python Code to C++ Code Converter

Applied Machine Learning Techniques:

  • Implemented and fine-tuned machine learning algorithms: Random Forests, XGBoost (XGB), LightGBM (LGBM), Logistic Regression, Linear Regression, Decision Trees, and Ensemble Techniques.
  • Worked with diverse datasets (thousands to millions of records) for classification, regression, and feature engineering.
  • Achieved up to 95% predictive accuracy in various projects.
  • Conducted model evaluation, cross-validation, and hyperparameter tuning for robust and optimized performance.

NanoGPT :

  • Designed and implemented a small-scale GPT-inspired language model trained on Shakespeare’s complete works.
  • Generated text mimicking Shakespeare’s language, tone, poetic style, and literary devices.
  • Preprocessed and tokenized Shakespeare’s texts to create a high-quality training dataset.
  • Utilized transformer-based architecture with attention mechanisms to capture complex syntax and metaphors.
  • Fine-tuned the model using state-of-the-art techniques for coherent, stylistically accurate text generation.
  • Demonstrated expertise in NLP, deep learning, and creative text generation by adapting modern techniques to classical literature.

Company Brochure Generator:

  • Designed and implemented a web-based summarization tool using a custom-built web scraper to extract and summarize website content.
  • Developed a scraper to handle diverse HTML structures and dynamically loaded elements for accurate data extraction.
  • Applied advanced text preprocessing (e.g., cleaning HTML, removing redundancy) to prepare data for summarization.
  • Utilized NLP techniques to identify key sentences and generate concise, context-preserving summaries.
  • Enabled users to reduce lengthy content into digestible highlights while retaining key insights.
  • Demonstrated expertise in web scraping, data processing, NLP, and building end-to-end solutions to address information overload.

Meeting Minutes Generator :

  • Designed and developed a Meeting Minutes Generator to automate meeting summaries from audio recordings.
  • Integrated OpenAI’s Whisper for accurate transcription of meeting discussions.
  • Utilized Google’s Gemma for summarization to extract key insights (summary, discussion points, takeaways, action items).
  • Automated the generation of structured meeting minutes in markdown format, including attendees, location, date, and assigned action items.
  • Enhanced productivity by reducing manual effort and improving accessibility of meeting information.
  • Built the tool to be scalable and adaptable for various meeting types and industries.

Python Code to C++ Code Converter :

  • Developed a Python-to-C++ Code Generator using ChatGPT-4.0-mini and Claude Sonnet3.5 APIs for code generation and optimization.
  • Built an interactive Gradio-based UI for users to input Python code, customize parameters, and view real-time C++ output.
  • Enhanced usability and efficiency, enabling seamless Python-to-C++ translation with minimal developer effort.
  • Integrated modern programming standards and best practices to ensure high-quality, clean C++ code generation.
  • Delivered a practical tool for improving performance or transitioning Python prototypes to production-grade C++.

Predictive Modeling for Kaggle Competitions:

  • Designed and implemented machine learning models for Kaggle competitions (House Prices Prediction, Titanic Disaster), achieving over 80% accuracy in both.
  • Utilized feature engineering, exploratory data analysis (EDA), and data preprocessing to extract insights and enhance model performance.
  • Applied ensemble methods like Random Forests and Gradient Boosting to address prediction challenges.
  • Conducted hyperparameter tuning and model evaluation to optimize performance and ensure robustness.
  • Demonstrated problem-solving skills by applying data-driven approaches to real-world predictive tasks.

PyTorch: Linear Regression, CNN, and ANN

1. Introduction

  • Overview of PyTorch and its libraries
  • Implementing Linear Regression, CNN, and ANN efficiently

2. Setup

  • Install and import necessary libraries
  • Enable GPU acceleration if available

3. Linear Regression

  • Generate synthetic data using torch.randn
  • Define model with torch.nn.Linear
  • Train using MSE loss and SGD optimizer
  • Plot regression results

4. Convolutional Neural Network (CNN)

  • Load MNIST/CIFAR-10 with torchvision.datasets
  • Normalize and batch data using DataLoader
  • Define CNN with convolution, pooling, and fully connected layers
  • Train using Cross-Entropy loss and Adam optimizer
  • Evaluate accuracy on test data

5. Artificial Neural Network (ANN)

  • Prepare tabular data with torch.utils.data.TensorDataset
  • Define MLP using torch.nn.Sequential
  • Train using BCE/Cross-Entropy loss and Adam optimizer
  • Evaluate performance

6. Summary

  • Key insights from Linear Regression, CNN, and ANN
  • Future exploration in deep learning

About

Applying machine learning techniques

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors