preetsojitra2712/README.md

Hi there, I'm Preet Sojitra 👋

About Me

  • MS in Computer Science at UC Riverside, focused on Machine Learning and AI
  • Interested in building intelligent systems, optimizing models, and designing scalable ML and backend infrastructure
  • Currently working on bias detection tools for vision-language models and predictive autoscaling for AI workloads
  • Strong interest in generative modeling, semantic search, and AI systems deployed on cloud-native platforms
  • Google Summer of Code 2025 Contributor at the Unicode Consortium, building an LLM-based validator for global locale data quality (CLDR)
  • Actively seeking Software Engineering and Machine Learning roles focused on real-world, large-scale systems
  • Always open to meaningful ML collaborations and impactful open-source work

Skills

Programming Languages
Python, C++, SQL, Bash

Machine Learning and AI
Diffusion Models, Transformers, Vision-Language Models, Supervised Learning, Semantic Search, Bias Analysis, Retrieval-Augmented Generation

Computer Vision
OpenCV, ResNet, DenseNet, MobileNet, Vision Transformers, Semantic Segmentation, Deconvolutional Networks, Person Re-Identification

Cloud and DevOps
AWS, Kubernetes (EKS), Docker, CloudLab, Horizontal Pod Autoscaling, CI/CD pipelines

Frameworks and Libraries
PyTorch, TensorFlow, HuggingFace, Scikit-learn, LangChain, FastAPI, Gradio

Databases
PostgreSQL, MySQL, MongoDB


Featured Projects

Gravitational Lensing Image Generator
Realistic astrophysical image synthesis using denoising diffusion probabilistic models (DDPM), implemented in PyTorch and evaluated with the Fréchet Inception Distance (FID).

Semantic Search Engine for Medicine
End-to-end system combining web crawling, document indexing, and semantic retrieval to enable intelligent search across online pharmacy data.

Smart Autoscaler for Kubernetes
Machine learning-driven horizontal autoscaler designed to optimize GPU and AI workloads under dynamic traffic patterns.

Vision-Language Bias Analyzer
Research-oriented pipeline to detect and quantify social and occupational biases in vision-language models using synthetic and benchmark datasets.

GSoC 2025: CLDR LLM Validator
An automated validation system that uses large language models to detect inconsistencies and errors in the multilingual locale data used by ICU and other internationalization libraries.

DocFlow AI (Early-Stage Startup Project)
A production-oriented backend system for document upload, storage, and secure retrieval.
Built with Node.js, Fastify, PostgreSQL, Prisma, and S3-compatible object storage.
Focuses on clean architecture, security, auditability, and scalable system design.

Project link:
https://github.com/preetsojitra2712/docflow-ai


Connect With Me


Turning complex ML models and backend systems into practical, real-world solutions.

Pinned

  1. fake-profile-detection-n-social-media-acct (Public, Jupyter Notebook)

  2. University-prediction-for-Masters-student-in-USA (Public, Jupyter Notebook)