Bangash aqibbangash

Turning hard AI problems into reliable products — from green-field architecture to teams shipping at scale across PK · USA · UK · UAE.

👋 About

I'm an engineering leader with 10+ years of shipping full-stack systems, the last several spent leading AI transformation for ambitious products. I move comfortably from a whiteboard system diagram to a Rust async worker to a multi-tenant Kubernetes deploy, and I most enjoy the messy middle where research-grade models have to survive production traffic.

Today I lead engineering at Clerint Media, building a real-time broadcast intelligence platform that analyzes dozens of live TV streams concurrently and turns face recognition, OCR, and speech-to-text output into a searchable, alertable evidence layer.

🚀 What I Help Teams Do


🧠 AI products from zero	Take a model that works in a notebook and turn it into a pipeline that survives 30 concurrent live streams, restarts, GPU loss, and 3 AM pages.
🏗️ Solution architecture audits	Walk into an existing system and find the 20% of the design driving 80% of the incidents and the cloud bill. Output: a phased plan, not a 60-slide deck.
🧭 Fractional CTO for early-stage	Pick the stack, hire the first engineers, ship the first version, and stay long enough to make sure it doesn't collapse under its own weight.
🔁 AI transformation for established orgs	Wire LLMs, RAG, vector search, and computer vision into workflows that move real metrics — not demo metrics.

🛰️ Selected Work

Clerint Media — Real-time Broadcast Intelligence (currently leading)

A multi-tenant SaaS analyzing 30+ live HLS/RTSP TV channels in parallel, on a single-node Kubernetes cluster.

Rust ML worker (tonic gRPC + tokio) supervises an FFmpeg frame + audio pipeline per channel, fanning frames out via broadcast channels to OCR / face / speech workers.
Face recognition with SCRFD + ArcFace ONNX models; embeddings stored in pgvector for sub-second identity search across hours of footage.
OCR via PaddleOCR HTTP service; speech-to-text via Deepgram WebSocket streams.
NestJS orchestrator drains gRPC events → Prisma writes + Socket.io fan-out + BullMQ stories/alerts.
React 19 + Vite + Tailwind 4 SPA with a live DVR timeline and custom clip range slider.
Plain-YAML Kubernetes — two parallel deployments (main + MOIB) on bare-metal.
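
The per-channel fan-out described above can be sketched in a few lines. This is an illustration in Python with asyncio, not the Rust worker itself; the `Broadcast` class, the worker names, and the drop-oldest policy for slow consumers are all assumptions made for the example.

```python
import asyncio


class Broadcast:
    """Fan-out hub: every subscriber gets its own bounded queue (hypothetical helper)."""

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.subscribers = {}

    def subscribe(self, name):
        queue = asyncio.Queue(maxsize=self.capacity)
        self.subscribers[name] = queue
        return queue

    def publish(self, frame):
        for queue in self.subscribers.values():
            if queue.full():
                # Assumed policy: drop the oldest frame so a slow worker
                # falls behind instead of stalling the whole channel.
                queue.get_nowait()
            queue.put_nowait(frame)

    def close(self):
        self.publish(None)  # None is the end-of-stream sentinel


async def worker(name, queue, results):
    """Stand-in for an OCR / face / speech worker: records the frames it saw."""
    seen = []
    while (frame := await queue.get()) is not None:
        seen.append(frame)
    results[name] = seen


async def run_channel(frames):
    hub = Broadcast(capacity=8)
    results = {}
    tasks = [
        asyncio.create_task(worker(name, hub.subscribe(name), results))
        for name in ("ocr", "face", "speech")
    ]
    for frame in frames:
        hub.publish(frame)
        await asyncio.sleep(0)  # yield so the workers can drain their queues
    hub.close()
    await asyncio.gather(*tasks)
    return results


if __name__ == "__main__":
    print(asyncio.run(run_channel(["frame-0", "frame-1", "frame-2"])))
```

Giving each consumer its own bounded queue means one lagging worker only loses its own oldest frames, and the decoder loop never blocks.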

Urdu STT Benchmark — open source

github.com/aqibbangash/urdu-stt-bench

A CPU-only benchmark harness for offline Urdu speech-to-text, built with faster-whisper / CTranslate2, a Streamlit UI, and Docker. It is a decision-support tool for picking the right STT model for low-resource languages without burning a GPU budget.
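
For a sense of what such a harness compares, here is a minimal sketch of two metrics commonly used to rank STT models: word error rate and real-time factor. That the benchmark reports exactly these is an assumption, and both function names are hypothetical rather than taken from the repository.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Below 1.0 means the model transcribes faster than the audio plays."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

Because `split()` tokenizes on whitespace, the same function works on Urdu transcripts, though real comparisons usually normalize punctuation and diacritics first.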

🧰 Stack

AI / ML

Backend

Data

Frontend

Infrastructure & DevOps

Mobile (legacy)

🧭 Engineering Principles

Boring tech for the load-bearing parts. Plain YAML over Helm, Postgres over five exotic stores, monolith-until-it-hurts.
Pipelines, not point solutions. A model that works in isolation is a science project; a supervised, restartable, observable pipeline is a product.
Architecture follows team shape. I pick stacks for the people who'll maintain them on Wednesday at 4 PM, not for a conference talk.
AI is plumbing, not magic. The interesting work is in latency budgets, fallbacks, eval harnesses, and what happens when the model is wrong.
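
As one concrete reading of "latency budgets and fallbacks", the sketch below gives a primary model a fixed time budget and falls back to a cheaper path if it is slow or fails. It is an illustration under assumed names (`with_latency_budget`, `slow_model`, `cheap_model`), not code from any project above.

```python
import asyncio


async def with_latency_budget(primary, fallback, budget_seconds):
    """Return (result, source); use the fallback if primary is slow or fails."""
    try:
        result = await asyncio.wait_for(primary(), timeout=budget_seconds)
        return result, "primary"
    except Exception:
        # Covers the timeout raised by wait_for as well as model errors.
        return await fallback(), "fallback"


async def slow_model():
    await asyncio.sleep(0.5)  # stands in for a large model under load
    return "detailed answer"


async def cheap_model():
    return "rough answer"


async def demo():
    return await with_latency_budget(slow_model, cheap_model, budget_seconds=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning the source alongside the result keeps the fallback rate measurable, which is what turns "the model is wrong or slow" into a number a team can watch.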

📊 GitHub

📫 Let's Build

If you're shipping something at the intersection of real-time systems, computer vision, NLP, or AI-into-existing-workflows — and you want a partner who'll architect it, code the hard parts, and stay until it ships — I'd like to hear about it.

Thanks for stopping by 🤝
