╔═══════════════════════════════════════════════════════════════════╗
║ PROCESS : sahil.sharma                                            ║
║ ROLE    : Full-Stack AI Engineer · Systems Architect              ║
║ STACK   : MERN · LangGraph · Reverse-RAG · Real-time ML Pipelines ║
║ UPTIME  : B.Tech CSE Sem 5 @ JSS Academy, Noida                   ║
║ STATUS  : ▶ BUILDING SYSTEMS THAT THINK                           ║
╚═══════════════════════════════════════════════════════════════════╝
I don't write features. I design intelligence layers: deterministic state machines that reason, hallucination firewalls that self-heal, and inference pipelines that scale. Every abstraction I write, I can trace back to the disk seek that executes it.
"The abstraction layer is a lease, not a permanent home. Own the metal beneath it."
ARCHITECTURE: enterprise orchestration + distributed ML inference
STATUS: active
A legal compliance engine that doesn't just flag dark patterns; it maps them to specific clauses in the DPDP Act, EU AI Act, and CCPA. Built on a five-layer pipeline: Puppeteer DOM extraction → YOLO + Open-CLIP visual deception detection → Transformer + XGBoost + LightGBM NLP ensemble → ChromaDB Legal RAG → Gemini cross-verification with automated multi-key failover.
Express · Puppeteer · YOLO · Open-CLIP · LangChain · ChromaDB · Gemini 2.5 · XGBoost · LightGBM
Architecture deep-dive
| Layer | Technology | Signal |
|---|---|---|
| 🕸️ Crawler | Express + Puppeteer | DOM tree, bounding boxes, computed CSS tokens |
| 👁️ Vision | YOLO + Open-CLIP | Layout distortion, fake urgency, visual deception |
| 🧠 NLP | Transformers + XGBoost + LightGBM | Deceptive text classification ensemble |
| ⚖️ Legal RAG | ChromaDB + LangChain | Clause-level mapping to DPDP / EU AI Act / CCPA |
| Verifier | Gemini secondary vectors | LLM cross-check with automated API failover |
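
The Verifier row's "automated API failover" can be pictured roughly like this. It is a minimal sketch, not the project's actual code: `call_with_failover`, `AllKeysExhausted`, and `fake_verifier` are hypothetical names, and the fake verifier stands in for a real LLM request so the example runs offline.

```python
from typing import Callable, Sequence


class AllKeysExhausted(RuntimeError):
    """Raised when every configured API key has failed."""


def call_with_failover(
    keys: Sequence[str],
    call: Callable[[str], str],
    retryable: tuple[type[Exception], ...] = (RuntimeError,),
) -> str:
    """Try `call` with each key in order; return the first successful result."""
    errors: list[str] = []
    for key in keys:
        try:
            return call(key)
        except retryable as exc:
            # Record the failure (key prefix only) and move on to the next key.
            errors.append(f"{key[:4]}...: {exc}")
    raise AllKeysExhausted("; ".join(errors))


def fake_verifier(key: str) -> str:
    # Hypothetical stand-in for an LLM call: only one key has quota left.
    if key != "key-c":
        raise RuntimeError("quota exceeded")
    return f"verified with {key}"


result = call_with_failover(["key-a", "key-b", "key-c"], fake_verifier)
print(result)
```

Only errors listed in `retryable` trigger a switch to the next key; anything else propagates, so a malformed request is not silently retried across every key.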
ARCHITECTURE: real-time streaming answer synthesis
STATUS: active
A Perplexity-class answer engine built from scratch. The non-obvious engineering here: a Socket.io streaming bridge that keeps token latency under 80 ms end-to-end, a Redis JWT blacklist that keeps revoked tokens from reaching LLM endpoints, and Mistral Large used purely for deterministic title generation, not because it's cheaper, but because its output is more structurally consistent.
Node.js · Socket.io · LangChain · Gemini 2.5 Flash · Mistral Large · Tavily AI · Redis · React 19
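
The JWT blacklist idea, sketched under assumptions: the project runs on Node.js with Redis, while this illustration uses Python and an in-memory dict standing in for Redis `SET <jti> 1 EX <ttl>` and `EXISTS <jti>`. `TokenBlacklist`, `authorize`, and the claim layout are hypothetical.

```python
import time


class TokenBlacklist:
    """In-memory stand-in for a Redis key with a TTL per revoked token id."""

    def __init__(self, clock=time.time):
        self._expiry: dict[str, float] = {}
        self._clock = clock

    def revoke(self, jti: str, token_exp: float) -> None:
        # Keep the entry only until the token would have expired anyway,
        # so the blacklist never grows without bound.
        if token_exp > self._clock():
            self._expiry[jti] = token_exp

    def is_revoked(self, jti: str) -> bool:
        exp = self._expiry.get(jti)
        if exp is None:
            return False
        if exp <= self._clock():
            del self._expiry[jti]  # lazy expiry, similar in effect to a TTL
            return False
        return True


def authorize(claims: dict, blacklist: TokenBlacklist, now: float) -> bool:
    """Reject expired or revoked tokens before any model call is made."""
    return claims["exp"] > now and not blacklist.is_revoked(claims["jti"])


now = 1_000.0
blacklist = TokenBlacklist(clock=lambda: now)
claims = {"jti": "abc123", "exp": now + 900}

before = authorize(claims, blacklist, now)
blacklist.revoke(claims["jti"], claims["exp"])  # e.g. on logout
after = authorize(claims, blacklist, now)
```

Tying the blacklist entry's lifetime to the token's own `exp` is the design choice worth noting: once the token is expired it fails the first check anyway, so the entry can disappear.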
ARCHITECTURE: reverse-RAG stream interception
STATUS: research
The core insight: instead of post-hoc fact-checking, intercept the LLM output stream before it reaches the client. A LangGraph state machine routes live token emissions through MongoDB Atlas Vector Search for semantic factual validation. Corrections are injected mid-stream, not appended as disclaimers. The system heals its own output.
LangGraph · MongoDB Atlas Vector Search · Reverse-RAG · Python · FastAPI
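
A toy version of the interception loop, assuming sentence-level buffering (the source does not say at what granularity validation happens). `intercept_stream` and `toy_check` are hypothetical; `toy_check` uses a dict lookup where the real system would query vector search.

```python
from typing import Callable, Iterable, Iterator, Optional

SENTENCE_END = (".", "!", "?")
Checker = Callable[[str], Optional[str]]


def _emit(sentence: str, check: Checker) -> str:
    """Return the sentence, or its correction with surrounding whitespace kept."""
    core = sentence.strip()
    fixed = check(core)
    if fixed is None:
        return sentence
    return sentence.replace(core, fixed, 1)


def intercept_stream(tokens: Iterable[str], check: Checker) -> Iterator[str]:
    """Buffer tokens into sentences and validate each one before emitting it."""
    buffer = ""
    for token in tokens:
        buffer += token
        if buffer.rstrip().endswith(SENTENCE_END):
            yield _emit(buffer, check)
            buffer = ""
    if buffer:  # flush a trailing partial sentence
        yield _emit(buffer, check)


# Hypothetical fact store standing in for a vector-search lookup.
KNOWN_FACTS = {"capital of France": "Paris"}


def toy_check(sentence: str) -> Optional[str]:
    for topic, truth in KNOWN_FACTS.items():
        if topic in sentence and truth not in sentence:
            return f"The {topic} is {truth}."
    return None


tokens = ["The ", "capital of France", " is ", "Lyon", ".", " It ", "is ", "sunny", "."]
out = "".join(intercept_stream(tokens, toy_check))
print(out)
```

The client only ever sees validated sentences, which is the trade-off of this approach: buffering to a sentence boundary adds a little delay in exchange for corrections landing in place.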
ARCHITECTURE: parallel LangGraph state machines
STATUS: shipped
Performance duels between competing LLMs, orchestrated through LangGraph state machines that safely isolate and pipeline parallel model I/O. MongoDB aggregation handles win/loss tracking, category performance curves, and global leaderboard generation. Secure cookie architecture: HttpOnly, SameSite=None, Secure, with JWT refresh rotation.
LangGraph · Cohere · Gemini · Mistral · MongoDB · TypeScript · React
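
One way to picture "safely isolate and pipeline parallel model I/O", as a sketch only: plain `asyncio` here stands in for the LangGraph orchestration, and `duel`, `model_a`, and `model_b` are hypothetical names with fake models so it runs offline.

```python
import asyncio
from typing import Awaitable, Callable

Model = Callable[[str], Awaitable[str]]


async def duel(prompt: str, models: dict[str, Model], timeout: float = 5.0) -> dict[str, str]:
    """Run every model on the same prompt concurrently, isolating failures."""

    async def run(name: str, model: Model) -> tuple[str, str]:
        try:
            return name, await asyncio.wait_for(model(prompt), timeout)
        except Exception as exc:
            # One model failing or timing out must not sink the whole duel.
            return name, f"<error: {type(exc).__name__}>"

    pairs = await asyncio.gather(*(run(n, m) for n, m in models.items()))
    return dict(pairs)


async def model_a(prompt: str) -> str:
    # Hypothetical model: responds after a short delay.
    await asyncio.sleep(0.01)
    return prompt.upper()


async def model_b(prompt: str) -> str:
    # Hypothetical model whose provider is unavailable.
    raise ConnectionError("provider down")


answers = asyncio.run(duel("hi", {"a": model_a, "b": model_b}))
print(answers)
```

Each contestant is wrapped in its own `try`, so an error becomes a recorded result for that model rather than an exception that cancels its rival.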
ARCHITECTURE: amazon-class storefront ecosystem
STATUS: in progress
Not another CRUD store. Dual Buyer/Seller dashboards with RBAC + Google OAuth 2.0, a LangChain + LangGraph style-recommendation engine that builds a "Style DNA" profile per user, and a GSAP + glassmorphism frontend that treats UI as a first-class product decision.
MERN · LangChain · LangGraph · GSAP · SCSS · ImageKit · JWT · OAuth 2.0
ARCHITECTURE: on-device ML + adaptive UI
STATUS: shipped
Full biometric emotion classification running client-side, with no round-trip to inference servers. MediaPipe classifies six emotional states at 30 fps, Redis tracks session state and token invalidation arrays, and the UI adapts its layout and content in real time. Inference adds no network latency because the model runs entirely in the browser.
MediaPipe · React · Redis · Node.js · JWT
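
At 30 fps, per-frame labels can flip from one frame to the next, so an adaptive UI usually needs some smoothing before it reacts. The sketch below shows one possible approach, a sliding-window majority vote. It is an assumption rather than something the project is known to do, it is written in Python for illustration although the real classifier runs in the browser, and `EmotionSmoother` with its parameters is hypothetical.

```python
from collections import Counter, deque
from typing import Optional


class EmotionSmoother:
    """Majority vote over the last `window` per-frame labels."""

    def __init__(self, window: int = 30, min_share: float = 0.6):
        self.frames: deque = deque(maxlen=window)
        self.min_share = min_share
        self.current: Optional[str] = None

    def update(self, label: str) -> Optional[str]:
        """Feed one frame label; return the stable emotion (None until warmed up)."""
        self.frames.append(label)
        top, count = Counter(self.frames).most_common(1)[0]
        window_full = len(self.frames) == self.frames.maxlen
        # Switch only when the window is full and one label clearly dominates.
        if window_full and count / len(self.frames) >= self.min_share:
            self.current = top
        return self.current


smoother = EmotionSmoother(window=5, min_share=0.6)
history = [smoother.update(label) for label in ["neutral"] * 5 + ["happy"] * 4]
print(history)
```

A single stray "happy" frame leaves the reported state at "neutral"; the layout changes only once the new label dominates the window.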
AI · ML · Orchestration
Backend · Systems
Frontend · UI
Data · Infrastructure
SAHIL(1) Developer Manual SAHIL(1)
NAME
sahil - full-stack AI engineer, systems thinker
SYNOPSIS
sahil [--build] [--research] [--obsess-over-fundamentals]
DESCRIPTION
Builds production-grade AI systems as a B.Tech undergrad.
Traces every abstraction to its machine-level origin.
Ships systems most engineers won't touch for years.
OPTIONS
--build MERN + LangGraph + real-time ML pipelines
--research Reverse-RAG hallucination firewalls
--obsess-over-fundamentals B-Trees, OS memory layout, theory of computation
--friends Samarth, Meghanshu, Shubh
--hardware MacBook Air M4
PHILOSOPHY
The abstraction layer is a lease, not a permanent home.
Trace your queries to disk seeks.
Map your variables to memory.
SEE ALSO
github(1), langgraph(1), mongodb-atlas-vector-search(1)
SAHIL 2025 SAHIL(1)
