WillLewis/README.md

Will Lewis

AI/ML Product Manager — agentic systems, evals, and ML decisioning in regulated environments.

Agents propose, code decides. Models are good at compressing the messy middle of a workflow. The harness — policy layer, held-out evals, deterministic gates — is what makes them shippable. 🚀

Each pin below is the same thesis tested in a different domain: the eval surface is the product.

Live demos and full case studies: wxl3.com

Pinned

  1. atlas-agentic-fraud-lab Public

    Adversarial Testing Lab for Agentic Safeguards (ATLAS). A synthetic multi-agent eval environment for adversarial fraud decisioning inspired by Anthropic's Project Deal. Measures how model quality, …

    Python 1

  2. agent-harness-environment Public

    An eval and observability cockpit for coding agents. It runs policy-controlled coding agents in sandboxed toy repos, tool-use traces, MCP tools, compares harness policies, scores recovery and safet…

    Python

  3. regulated-agent-launch-kit Public

    A regulated-agent deployment kit for turning traces, evals, regressions, and approval gates into launch/no-launch decisions.

    Python

  4. voice-agent-prompt-lab Public

    A voice-agent demo and prompt evaluation harness for insurance first-notice-of-loss claims.

    TypeScript

  5. coachbench Public

    Do agents make for good offensive & defensive coordinators in football? This is an adversarial-agent arena for short red-zone strategy contests. OC & DC agents compete through simultaneous legal pl…

    Python 1

  6. canon-ai Public

    Can you eval an art form? Canon is a continuity linter for serialized TV, YouTube, and micro-drama fiction. Canon plays the role of what's currently the scriptwriting coordinator, verifies your story…

    Python