Kunal-Somani/README.md

Kunal

B.E. Robotics and Artificial Intelligence - Thapar Institute of Engineering and Technology

B.S. Data Science and Applications - Indian Institute of Technology Madras

LinkedIn Email


Pre-final year undergrad building at the intersection of robotics, computer vision, and autonomous systems. My work runs on real hardware and real infrastructure: multimodal ML pipelines, edge AI systems, agentic backends, and production-grade open-source tooling.

Open Source: Active contributor at JdeRobot/RoboticsAcademy with 16 merged PRs. Resolved an FP16 precision crash in the Object Detection pipeline, fixed deployment script bugs across run_academy.sh and develop_academy.sh, refactored the Hardware Abstraction Layer, and shipped 52 unit tests across 5 test classes.

Research at Thapar ELC (Summer 2025): Multimodal CNN and DNN for early detection of Parkinson's disease. Fused MPU9250 tremor signals with voice recordings via late fusion, raising accuracy from 88% (voice-only CNN) to 91% (fused model).
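Late fusion combines the outputs of independently trained models rather than their input features. The README does not say which fusion rule the project uses, so the sketch below assumes a simple weighted average of per-class probabilities; the function names, the weight `w_voice`, and the example probabilities are all hypothetical.

```python
def late_fusion(p_voice, p_tremor, w_voice=0.5):
    """Fuse two per-class probability vectors by weighted averaging.

    Assumption: weighted averaging is one common late-fusion rule;
    the project may use a different one (e.g. a learned meta-classifier).
    """
    if len(p_voice) != len(p_tremor):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= w_voice <= 1.0:
        raise ValueError("w_voice must lie in [0, 1]")
    fused = [w_voice * v + (1.0 - w_voice) * t
             for v, t in zip(p_voice, p_tremor)]
    total = sum(fused)
    # Renormalise so the fused scores sum to 1 even if the inputs did not.
    return [f / total for f in fused]


def predict(fused):
    """Return the index of the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical outputs: [P(healthy), P(parkinsons)] from each model.
voice_probs = [0.6, 0.4]
tremor_probs = [0.2, 0.8]
fused_probs = late_fusion(voice_probs, tremor_probs)
label = predict(fused_probs)
```

With equal weights, the tremor model's stronger confidence outweighs the voice model's weak vote for the other class, which is the kind of disagreement late fusion is meant to resolve.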

Computer Vision: Toll fraud detection system built on HOG features and LinearSVC. 24K+ images, 97% accuracy on multi-axle vehicle classification.
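The project table in this README cites 3780-dimensional HOG vectors. That figure matches the widely used HOG configuration of a 64x128 window, 8x8-pixel cells, 2x2-cell blocks, a one-cell block stride, and 9 orientation bins. The README does not state the actual parameters, so treat these values as an assumption; the helper `hog_length` is hypothetical and only does the arithmetic.

```python
def hog_length(win_w=64, win_h=128, cell=8, block=2, stride_cells=1, bins=9):
    """Length of a HOG descriptor for a single detection window.

    Assumed defaults: 64x128 window, 8x8-pixel cells, 2x2-cell blocks,
    block stride of one cell, 9 orientation bins.
    """
    cells_x = win_w // cell                          # 8 cells across
    cells_y = win_h // cell                          # 16 cells down
    blocks_x = (cells_x - block) // stride_cells + 1  # 7 block positions
    blocks_y = (cells_y - block) // stride_cells + 1  # 15 block positions
    values_per_block = block * block * bins          # 4 cells * 9 bins = 36
    return blocks_x * blocks_y * values_per_block


length = hog_length()  # 7 * 15 * 36
```

Each image therefore becomes one fixed-length vector, which is the input shape a linear classifier such as LinearSVC expects.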

Robotics Research (ongoing): Audio-Visual-Thermal fusion architecture for autonomous search-and-rescue (SAR) navigation in visually degraded environments, under Dr. Ankit Soni at Thapar, using Isaac Sim and ROS 2.


Featured Projects

| Project | Description | Stack |
| --- | --- | --- |
| Archon | Production-deployable instruction-to-deployment backend. Hybrid RAG (Cohere dense + BM25 sparse, RRF fusion) retrieves context, Anthropic Tool Use generates schema-validated code, and a GitHub App deployer pushes live sites to GitHub Pages. FastAPI and Celery handle async execution; Redis Pub/Sub streams logs over WebSocket to a React and TypeScript dashboard; full observability via Prometheus, Grafana, and OpenTelemetry. | FastAPI, Celery, Redis, Cohere, Anthropic API, React, TypeScript, Vite, PostgreSQL, SQLAlchemy, Alembic, Prometheus, Grafana, OpenTelemetry, Docker |
| Parkinson's Early Detection | Multimodal early detection fusing MPU9250 IMU tremor signals with voice recordings. CNN on voice features (88% accuracy) and DNN on tremor data combined via late fusion to reach 91%. Custom ESP32 hardware pipeline from sensor to model inference. | TensorFlow, Keras, Librosa, Parselmouth, scikit-learn, SoundDevice |
| Axon Core | Production-deployable, fully local tri-modal AI assistant. A BART-MNLI zero-shot router dispatches across three paths: knowledge retrieval via Qdrant and local Gemma, OS-level tool execution with user confirmation, and general conversation. Hybrid RAG with MiniLM and BM25, reranked by a cross-encoder; GBNF-constrained sampling for tool calling. | FastAPI, LangChain, Qdrant, Ollama, Next.js, Docker, SQLAlchemy |
| Helix | Production-deployable recursive autonomous web agent built on the OODA loop. Playwright handles JS-heavy DOMs, Claude Tool Use synthesizes Python solutions just-in-time, RestrictedPython and SIGALRM sandbox execution, and HTTP submission loops until a terminal state. Durable jobs via ARQ on Redis; Prometheus, Loki, and Grafana cover observability. | FastAPI, Playwright, Claude API, ARQ, Redis, Prometheus, Loki, Grafana, Docker |
| TruthTag: Toll-Audit | Classical CV pipeline cross-verifying RFID FASTag claims against physical vehicle geometry at toll plazas. 3780-dimensional HOG vectors, LinearSVC trained on 24K+ images, 97% accuracy on multi-axle classification. Cross-modal centroid tracker, MOG2 virtual tripwire, and a Streamlit audit dashboard. | OpenCV, scikit-learn, HOG, LinearSVC, NumPy, Streamlit, Seaborn, Matplotlib |
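Archon's hybrid retrieval merges a dense ranking and a BM25 ranking with Reciprocal Rank Fusion (RRF). A minimal sketch of RRF follows, assuming each retriever returns an ordered list of document IDs; the function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are illustrative assumptions, not taken from the project's code.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a BM25 retriever.
dense_hits = ["a", "b", "c"]
sparse_hits = ["b", "c", "d"]
fused = rrf_fuse([dense_hits, sparse_hits])
```

RRF uses only rank positions, so it needs no calibration between dense similarity scores and BM25 scores, which live on different scales.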

Active Research

| Project | Description | Status |
| --- | --- | --- |
| Canary Rover | Autonomous mine inspection rover. PPO locomotion trained in PyBullet (200K timesteps), real-time ROS 2 sensor stack for IMU, LiDAR, and BLDC encoders, SLAM via slam_toolbox and RTAB-Map, and full 3D simulation in NVIDIA Isaac Sim 5.1. Capstone project, team of 5. | Ongoing (capstone) |
| MRI Reconstruction | Dual-branch physics-guided framework for accelerated MRI reconstruction. Learned gating network routes k-space data adaptively without anatomy labels at inference. Achieves +1.78% SSIM over single-branch baselines at 112 ms and 302 GFLOPs on an RTX 4060. | Paper authored |
| Audio-Visual-Thermal SAR | Multimodal fusion architecture for autonomous SAR in visually degraded environments. Thermal, acoustic, and visual modalities fused for robust SLAM and detection, under Dr. Ankit Soni at Thapar. | Review paper authored |

Tech Stack

Languages: Python, C++, C, SQL, Bash, JavaScript, TypeScript

Robotics & Edge: ROS 2, NVIDIA Jetson, Isaac Sim, PyBullet, Stable Baselines3, Gymnasium, Gazebo, Eclipse Zenoh, MATLAB, CUDA

AI & Computer Vision: PyTorch, PyTorch Lightning, TensorFlow, Keras, OpenCV, YOLOv8, LangChain, Hugging Face, Scikit-learn, NumPy, Pandas, SciPy, Librosa, Parselmouth, einops, Matplotlib, Seaborn, Plotly

Backend & Databases: FastAPI, Flask, Next.js, React, PostgreSQL, SQLAlchemy, Alembic, Qdrant, Playwright, Prometheus, Streamlit, WebSocket, Pydantic

DevOps & Infra: Docker, Kubernetes, GitHub Actions, Linux, AWS, Vercel, Anaconda


GitHub Stats

[Images: GitHub stats card and streak stats card]


Currently working on: physics-guided deep learning for medical imaging; RL-based autonomous navigation; multimodal sensor fusion for SAR; beyond-transformer sequence architectures.

Pinned

  1. eshaansingla/ParkinsonsEarlyPrediction (Public, Python)
  2. axon-core (Public, Python)
  3. archon (Public, Python)
  4. helix-agent (Public, TypeScript)
  5. TruthTag-Toll-Audit (Public, Python)