Kunal Kunal-Somani

Kunal

B.E. Robotics and Artificial Intelligence - Thapar Institute of Engineering and Technology

B.S. Data Science and Applications - Indian Institute of Technology Madras

Pre-final year undergrad building at the intersection of robotics, computer vision, and autonomous systems. My work runs on real hardware and real infrastructure: multimodal ML pipelines, edge AI systems, agentic backends, and production-grade open-source tooling.

Open Source: Active contributor at JdeRobot/RoboticsAcademy with 16 merged PRs. Resolved an FP16 precision crash in the Object Detection pipeline, fixed deployment script bugs across run_academy.sh and develop_academy.sh, refactored the Hardware Abstraction Layer, and shipped 52 unit tests across 5 test classes.

Research at Thapar ELC (Summer 2025): Multimodal CNN and DNN for Parkinson's early detection. Fused MPU9250 tremor signals with voice recordings via late fusion, raising accuracy from 88% (voice-only CNN) to 91% (fused model).
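For readers unfamiliar with late fusion, here is a minimal sketch: each modality gets its own model, and only the output probabilities are combined. The weighted-average rule, the `w_voice` weight, and the sample probabilities are illustrative assumptions, not the project's actual fusion rule or data.

```python
import numpy as np

def late_fusion(p_voice, p_tremor, w_voice=0.5):
    """Combine per-class probabilities from two independently trained models.

    w_voice is a hypothetical weight; 0.5 gives a plain average.
    """
    p_voice = np.asarray(p_voice, dtype=float)
    p_tremor = np.asarray(p_tremor, dtype=float)
    fused = w_voice * p_voice + (1.0 - w_voice) * p_tremor
    # Predicted class is the argmax of the fused distribution.
    return fused, fused.argmax(axis=-1)

# Made-up probabilities for two samples; columns = [healthy, parkinsons].
voice = [[0.60, 0.40], [0.30, 0.70]]
tremor = [[0.20, 0.80], [0.40, 0.60]]
fused, labels = late_fusion(voice, tremor)
print(fused, labels)
```

In the first sample the voice model alone would predict "healthy", but the tremor model's stronger evidence flips the fused decision, which is the kind of disagreement late fusion is meant to resolve.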

Computer Vision: Toll fraud detection system built on HOG features and LinearSVC. Trained on 24K+ images, reaching 97% accuracy on multi-axle vehicle classification.
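The 3780-dimensional HOG vector listed for TruthTag under Featured Projects matches the descriptor length of the standard HOG configuration (64x128 window, 8x8-pixel cells, 2x2-cell blocks, 8-pixel block stride, 9 orientation bins). The arithmetic is sketched below, assuming that standard configuration is the one used. The function name and its defaults are illustrative.

```python
def hog_feature_length(win=(64, 128), cell=8, block_cells=2, stride=8, bins=9):
    """Length of a HOG descriptor for one detection window.

    Defaults are the standard configuration (an assumption about this project).
    """
    win_w, win_h = win
    block_px = cell * block_cells                 # block side in pixels (16)
    blocks_x = (win_w - block_px) // stride + 1   # block positions across (7)
    blocks_y = (win_h - block_px) // stride + 1   # block positions down (15)
    per_block = block_cells * block_cells * bins  # 4 cells x 9 bins = 36
    return blocks_x * blocks_y * per_block

length = hog_feature_length()
print(length)  # 7 * 15 * 36 = 3780
```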

Robotics Research (ongoing): Audio-Visual-Thermal fusion architecture for autonomous SAR navigation in visually degraded environments, under Dr. Ankit Soni at Thapar, using Isaac Sim and ROS 2.

Featured Projects

Project	Description	Stack
Archon	Production-deployable instruction-to-deployment backend. Hybrid RAG (Cohere dense + BM25 sparse, RRF fusion) retrieves context, Anthropic Tool Use generates schema-validated code, and a GitHub App deployer pushes live sites to GitHub Pages. FastAPI and Celery handle async execution; Redis Pub/Sub streams logs over WebSocket to a React and TypeScript dashboard; full observability via Prometheus, Grafana, and OpenTelemetry.	FastAPI - Celery - Redis - Cohere - Anthropic API - React - TypeScript - Vite - PostgreSQL - SQLAlchemy - Alembic - Prometheus - Grafana - OpenTelemetry - Docker
Parkinson's Early Detection	Multimodal early detection fusing MPU9250 IMU tremor signals with voice recordings. CNN on voice features (88% accuracy) and DNN on tremor data combined via late fusion to reach 91%. Custom ESP32 hardware pipeline from sensor to model inference.	TensorFlow - Keras - Librosa - Parselmouth - scikit-learn - SoundDevice
Axon Core	Production-deployable fully local tri-modal AI assistant. A BART-MNLI zero-shot router dispatches across three paths: knowledge retrieval via Qdrant and local Gemma, OS-level tool execution with user confirmation, and general conversation. Hybrid RAG with MiniLM and BM25, reranked by a cross-encoder; GBNF-constrained sampling for tool calling.	FastAPI - LangChain - Qdrant - Ollama - Next.js - Docker - SQLAlchemy
Helix	Production-deployable recursive autonomous web agent built on the OODA loop. Playwright handles JS-heavy DOMs, Claude Tool Use synthesizes Python solutions just-in-time, RestrictedPython and SIGALRM timeouts sandbox the execution, and HTTP submission loops until a terminal state. Durable jobs via ARQ on Redis; Prometheus, Loki, and Grafana cover observability.	FastAPI - Playwright - Claude API - ARQ - Redis - Prometheus - Loki - Grafana - Docker
TruthTag: Toll-Audit	Classical CV pipeline cross-verifying RFID FASTag claims against physical vehicle geometry at toll plazas. 3780-dimensional HOG vectors, LinearSVC trained on 24K+ images, 97% accuracy on multi-axle classification. Cross-modal centroid tracker, MOG2 virtual tripwire, and a Streamlit audit dashboard.	OpenCV - scikit-learn - HOG - LinearSVC - NumPy - Streamlit - Seaborn - Matplotlib
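As an illustration of the RRF fusion step mentioned for Archon's hybrid RAG, here is a minimal sketch of Reciprocal Rank Fusion: each document scores the sum of `1 / (k + rank)` over every ranking it appears in. The `rrf_fuse` helper, the document IDs, and `k = 60` (a commonly used constant) are assumptions for illustration, not Archon's actual code or settings.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a dense retriever and a BM25 retriever.
dense = ["doc_a", "doc_b", "doc_c"]
sparse = ["doc_b", "doc_d", "doc_a"]
fused = rrf_fuse([dense, sparse])
print(fused)
```

Because RRF uses only ranks, it needs no score normalization between the dense and sparse retrievers, whose raw scores are not on comparable scales.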

Active Research

Project	Description	Status
Canary Rover	Autonomous mine inspection rover. PPO locomotion trained in PyBullet (200K timesteps), real-time ROS 2 sensor stack for IMU, LiDAR, and BLDC encoders, SLAM via slam_toolbox and RTAB-Map, and full 3D simulation in NVIDIA Isaac Sim 5.1. Capstone project, team of 5.	Ongoing -- Capstone
MRI Reconstruction	Dual-branch physics-guided framework for accelerated MRI reconstruction. Learned gating network routes k-space data adaptively without anatomy labels at inference. Achieves +1.78% SSIM over single-branch baselines at 112 ms and 302 GFLOPs on an RTX 4060.	Paper authored
Audio-Visual-Thermal SAR	Multimodal fusion architecture for autonomous SAR in visually degraded environments. Thermal, acoustic, and visual modalities fused for robust SLAM and detection, under Dr. Ankit Soni at Thapar.	Review paper authored

Tech Stack

Languages

Robotics & Edge

AI & Computer Vision

Backend & Databases

DevOps & Infra

GitHub Stats

Currently working on: Physics-guided deep learning for medical imaging -- RL-based autonomous navigation -- Multimodal sensor fusion for SAR -- Beyond-transformer sequence architectures
