I build autonomous AI systems — reinforcement learning agents, LLM-orchestrated pipelines, and full-stack AI applications that operate in high-stakes real-world domains.
| Project | Description |
|---|---|
| surgical-robot-tremor-compensator-rl | SAC RL agent filters involuntary surgical hand tremor in real time, with 0.30 mm compensation error |
| brand-conscience | Fully autonomous Meta ad system — writes briefs, generates creatives, deploys & self-improves via RL |
| puzzlegen-rl | PPO agent solves infinite DCGAN-generated mazes; Claude provides reward shaping at plateaus |
| icu-treatment-sequencer-rl | RL learns optimal ICU treatment sequences for critically ill patients |
| neighborhood-microgrid-balancer-marl | 10 MARL agents self-organize on a shared power grid without explicit coordination |
| threadline | Full-stack social feed — Next.js 14 + FastAPI + PostgreSQL, JWT auth, cursor-based pagination |

| Area | Tools |
|---|---|
| AI/ML | PyTorch, Stable-Baselines3, Gymnasium, LangGraph, Claude API, OpenCLIP, RAG |
| Backend | Python, FastAPI, Django, PostgreSQL, Redis, Elasticsearch, Docker |
| Frontend | Next.js, React, TypeScript, Vue 3 |
| Infra | AWS (DynamoDB, CDK), Docker, Alembic |

- Reinforcement Learning — continuous control, MARL, offline-to-online, RL from human feedback
- Autonomous Agents — LangGraph orchestration, multi-step reasoning, self-improving systems
- LLM Integration — RAG pipelines, reward shaping with LLMs, document intelligence
- Full-Stack AI Apps — end-to-end systems from model training to production API to React dashboard
- GitHub: Motssembillahmahin
- LinkedIn: A S M Motssem Billah Mahin


