harsha-gouru/README.md

Sri Harsha Gouru

Website Email

I take things apart until I understand them: models, protocols, binaries, whatever's in front of me. Curiosity is the engine; reverse engineering is the method. If I can't open it up and see what's actually happening inside, I don't trust that I know it.

Right now I'm deep in agents, x402, and how small models actually behave under the hood: training them from scratch, watching them learn, hooking them up to onchain payments so they can pay for their own resources and compose with other agents. I think the web is about to get a lot more autonomous, and I want to be building the plumbing for that.

I work across the whole stack: training loops, inference kernels, deployment, the interface someone actually touches. The interesting problems live at the seams between layers, not inside them. Lately that's looked like: porting CUDA kernels to AMD, running on-device LLMs on Apple Silicon, reverse-engineering iOS/macOS internals, and building tooling that makes opaque systems legible.


Research

  • peft-hybrid-paper: LoRA placement determines continual learning outcomes in hybrid SSM-attention models. Attention-only LoRA → 9x lower perplexity, 5.6x fewer params
  • ane-gpu-speculative: first speculative decoding using Apple Neural Engine as draft + GPU as verifier
  • ane-gmlp-research: training a custom gated MLP directly on the ANE
  • apple-neural-engine-notes: hands-on findings on model execution and training on Apple Silicon
  • attention-residuals: study of Kimi's AttnRes, learned softmax attention over depth

Building


Looking to collaborate with domain experts who need AI tooling for their work. If you're great at what you do (research, medicine, finance, hardware, anything) and the tool you need doesn't exist yet, reach out. You bring the problem and the taste, I'll bring the AI and systems side.

Pinned

  1. ai-traffic-audit Public

    Local tool for auditing browser-based AI traffic via Chrome DevTools Protocol.

    Python 1

  2. apple-fm-server Public

    Serve Apple Foundation Models through an OpenAI-compatible local API for macOS development.

    Swift 1

  3. TDDSDD Public

    Multi-agent TDD/SDD scaffold. Rebuilds Spring Boot apps from specs + tests using parallel headless Claude agents with context isolation.

    Java 5