I take things apart until I understand them: models, protocols, binaries, whatever's in front of me. Curiosity is the engine; reverse engineering is the method. If I can't open it up and see what's actually happening inside, I don't trust that I know it.
Right now I'm deep in agents, x402, and how small models actually behave under the hood: training them from scratch, watching them learn, hooking them up to onchain payments so they can pay for their own resources and compose with other agents. I think the web is about to get a lot more autonomous, and I want to be building the plumbing for that.
I work across the whole stack: training loops, inference kernels, deployment, the interface someone actually touches. The interesting problems live at the seams between layers, not inside them. Lately that's looked like: porting CUDA kernels to AMD, running on-device LLMs on Apple Silicon, reverse-engineering iOS/macOS internals, and building tooling that makes opaque systems legible.
Research
- peft-hybrid-paper: LoRA placement determines continual learning outcomes in hybrid SSM-attention models. Attention-only LoRA: 9x lower perplexity, 5.6x fewer params
- ane-gpu-speculative: first speculative decoding using Apple Neural Engine as draft + GPU as verifier
- ane-gmlp-research: training a custom gated MLP directly on the ANE
- apple-neural-engine-notes: hands-on findings on model execution and training on Apple Silicon
- attention-residuals: study of Kimi's AttnRes, learned softmax attention over depth
Building
- apple-fm-server: Apple's on-device Foundation Model as an OpenAI-compatible local API
- localdictate: Right-Option-to-dictate menu bar app, 100% on-device
- llm-inspector: see what an LLM is actually doing, token by token
- ai-traffic-audit: what browser-based AI products send over the network
- ai-privacy-monitor: track what AI platforms track about you, locally
- x402-pay-per-joke: micropayment-powered API on Base, an agent paying per request
- gpu-inference-playground: benchmark LLM inference on H100 / MI300X
Looking to collaborate with domain experts who need AI tooling for their work. If you're great at what you do (research, medicine, finance, hardware, anything) and the tool you need doesn't exist yet, reach out. You bring the problem and the taste, I'll bring the AI and systems side.



