theo-ai-lab

AI product builder: agentic workflows, evals, RAG, trace regression, proof-first UX. USC Marshall + Viterbi ’24.

Popular repositories

ai-career-coach Public

Resume-grounded AI career coach — multi-agent RAG with a preregistered, falsifiable LLM-as-judge eval benchmark and adversarial red-team.

TypeScript

plimsoll Public

Deterministic, zero-dependency CLI that catches AI-agent regressions in CI from recorded traces: policy + baseline checks, no LLM judge, no account.

Python
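
As a rough illustration of what "policy + baseline checks" on a recorded trace could look like, here is a minimal sketch. It is not plimsoll's actual API or trace format: the `Policy` class, the `check_trace` function, and the list-of-dicts trace shape with a `tool` key are all assumptions made for this example. The point it shows is that both checks are plain deterministic comparisons, so no LLM judge is needed.

```python
from dataclasses import dataclass


@dataclass
class Policy:
    # Hypothetical policy shape: tools the agent must never call, and a step budget.
    forbidden_tools: frozenset = frozenset()
    max_steps: int = 50


def check_trace(trace, policy, baseline):
    """Return a list of findings; an empty list means the trace passes."""
    findings = []
    # Policy check 1: the run must stay within the step budget.
    if len(trace) > policy.max_steps:
        findings.append(
            f"policy: {len(trace)} steps exceeds max_steps={policy.max_steps}"
        )
    # Policy check 2: no step may call a forbidden tool.
    for i, step in enumerate(trace):
        if step["tool"] in policy.forbidden_tools:
            findings.append(f"policy: step {i} calls forbidden tool {step['tool']!r}")
    # Baseline check: the sequence of tool calls must match the recorded baseline.
    tools = [s["tool"] for s in trace]
    base_tools = [s["tool"] for s in baseline]
    if tools != base_tools:
        findings.append(f"baseline: tool sequence changed from {base_tools} to {tools}")
    return findings


# Hypothetical recorded traces.
baseline = [{"tool": "search"}, {"tool": "summarize"}]
candidate = [{"tool": "search"}, {"tool": "delete_file"}, {"tool": "summarize"}]
policy = Policy(forbidden_tools=frozenset({"delete_file"}), max_steps=10)

findings = check_trace(candidate, policy, baseline)
for line in findings:
    print(line)
```

In CI, a wrapper would exit non-zero whenever `findings` is non-empty, which fails the build on a regression.
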

toffoli Public

The undo layer for AI agents: classifies each agent action as reversible, compensable, or irreversible, plans the restitution, and escalates only what truly can't be undone. Measured as an eval.

TypeScript
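
The classify, plan, escalate flow described above can be sketched as follows. This is an illustration only, written in Python rather than the repository's TypeScript, and it is not toffoli's real interface: the `Reversibility` enum, the `CATALOGUE` table, the action names, and the `plan` function are all hypothetical.

```python
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = "reversible"      # can be undone directly
    COMPENSABLE = "compensable"    # cannot be undone, but can be offset
    IRREVERSIBLE = "irreversible"  # no remedy; needs a human


# Hypothetical catalogue: action name -> (class, restitution action or None).
CATALOGUE = {
    "create_draft": (Reversibility.REVERSIBLE, "delete_draft"),
    "charge_card": (Reversibility.COMPENSABLE, "issue_refund"),
    "send_email": (Reversibility.IRREVERSIBLE, None),
}


def plan(action):
    """Classify an action and attach its restitution plan."""
    # Unknown actions are treated as irreversible, the conservative default.
    kind, remedy = CATALOGUE.get(action, (Reversibility.IRREVERSIBLE, None))
    return {
        "action": action,
        "class": kind.value,
        "restitution": remedy,
        # Escalate only when nothing can undo or offset the action.
        "escalate": kind is Reversibility.IRREVERSIBLE,
    }


plans = [plan(a) for a in ["create_draft", "charge_card", "send_email", "drop_table"]]
escalated = [p["action"] for p in plans if p["escalate"]]
print(escalated)
```

Defaulting unknown actions to irreversible is a design choice of this sketch: it escalates too often rather than too rarely.
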

pacioli Public

Double-entry bookkeeping for AI agents — reconcile what your agent claimed against what the evidence shows, and get a receipt.

TypeScript

coehoorn Public

Adversarial red-teaming for chat and tool-using agents — every failure cited to the turn that proves it.

Python