- MS in Computer Science at UC Riverside, focused on Machine Learning and AI
- Interested in building intelligent systems, optimizing models, and designing scalable ML and backend infrastructure
- Currently working on bias detection tools for vision-language models and predictive autoscaling for AI workloads
- Strong interest in generative modeling, semantic search, and AI systems deployed on cloud-native platforms
- Google Summer of Code 2025 Contributor at the Unicode Consortium, building an LLM-based validator for locale data quality in the Common Locale Data Repository (CLDR)
- Actively seeking Software Engineering and Machine Learning roles focused on real-world, large-scale systems
- Always open to meaningful ML collaborations and impactful open-source work
Programming Languages
Python, C++, SQL, Bash
Machine Learning and AI
Diffusion Models, Transformers, Vision-Language Models, Supervised Learning, Semantic Search, Bias Analysis, Retrieval-Augmented Generation
Computer Vision
OpenCV, ResNet, DenseNet, MobileNet, Vision Transformers, Semantic Segmentation, Deconvolutional Networks, Person Re-Identification
Cloud and DevOps
AWS, Kubernetes (EKS), Docker, CloudLab, Horizontal Pod Autoscaling, CI/CD pipelines
Frameworks and Libraries
PyTorch, TensorFlow, HuggingFace, Scikit-learn, LangChain, FastAPI, Gradio
Databases
PostgreSQL, MySQL, MongoDB
Gravitational Lensing Image Generator
Realistic astrophysical image synthesis using denoising diffusion probabilistic models (DDPM), implemented in PyTorch and evaluated with the Fréchet Inception Distance (FID).
Semantic Search Engine for Medicine
End-to-end system combining web crawling, document indexing, and semantic retrieval to enable intelligent search across online pharmacy data.
Smart Autoscaler for Kubernetes
Machine learning-driven horizontal autoscaler designed to optimize GPU and AI workloads under dynamic traffic patterns.
Vision-Language Bias Analyzer
Research-oriented pipeline to detect and quantify social and occupational biases in vision-language models using synthetic and benchmark datasets.
GSoC 2025: CLDR LLM Validator
An automated validation system that uses large language models to detect inconsistencies and errors in multilingual locale data used by ICU and other internationalization libraries.
DocFlow AI (Early-Stage Startup Project)
A production-oriented backend system for document upload, storage, and secure retrieval.
Built with Node.js, Fastify, PostgreSQL, Prisma, and S3-compatible object storage.
Focuses on clean architecture, security, auditability, and scalable system design.
Project link: https://github.com/preetsojitra2712/docflow-ai
- LinkedIn: https://www.linkedin.com/in/preet-sojitra-a4b616208
- GitHub: https://github.com/preetsojitra2712
Turning complex ML models and backend systems into practical, real-world solutions.
