From 0108b5bd091ead708f0632db28d08dae99d6a964 Mon Sep 17 00:00:00 2001
From: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
Date: Thu, 28 May 2026 03:37:51 +0000
Subject: [PATCH] blog: introduce Agent Shin triage bot for external PRs and issues

Adds the public-facing explanation linked from every Agent Shin comment.
Covers what the bot checks, the PR + issue rubrics (including the
non-mocked QA-proof requirement), the 24h grace window, and the
@agent-shin reconsider / @greptileai recovery paths.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 blog/agent_shin_triage/index.md | 120 ++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)
 create mode 100644 blog/agent_shin_triage/index.md

diff --git a/blog/agent_shin_triage/index.md b/blog/agent_shin_triage/index.md
new file mode 100644
index 00000000..8daefebd
--- /dev/null
+++ b/blog/agent_shin_triage/index.md
@@ -0,0 +1,120 @@
+---
+slug: agent-shin-triage
+title: "Meet Agent Shin: how we triage external PRs and issues on LiteLLM"
+date: 2026-05-28T10:00:00
+authors:
+  - mateo
+description: "Agent Shin is the automated triage bot for BerriAI/litellm. This post explains what it checks, why we built it, and exactly what to do if it comments on your PR or issue."
+tags: [community, contributing]
+hide_table_of_contents: false
+---
+
+If you've opened a pull request or issue on [BerriAI/litellm](https://github.com/BerriAI/litellm), you may have heard from **Agent Shin** — the automated triage bot that runs against external contributions. This post is the long-form explanation that every Agent Shin comment links to: what it does, why, and what to do if you think it got things wrong.
+
+{/* truncate */}
+
+## Why we built it
+
+LiteLLM gets a lot of inbound. On a typical week the repo sees ~150 new pull requests and ~100 new issues, most of them from external contributors we've never worked with before. That volume is wonderful — it's why LiteLLM has integrations with 100+ providers — but it has a cost: a meaningful fraction of those PRs and issues are not actionable as filed.
+
+The most common patterns we see are:
+
+- A PR with no body, or a body that's just the template with nothing filled in.
+- A PR that touches non-trivial provider code but doesn't link an issue, doesn't describe the bug, and doesn't show any output of the fix actually working.
+- A bug report that's two sentences and a screenshot of a stack trace — no repro, no config, no way for a maintainer to make it happen on their machine.
+- A feature request with no use case (just "support X").
+
+Each of these costs a maintainer 10–30 minutes of reading code, guessing intent, and writing a comment asking for more information. Multiply by a few hundred and the result is that high-quality contributions — the ones that *did* link an issue, *did* attach a screenshot, *did* show a clean before/after — wait days for a first response while we sift.
+
+Agent Shin's job is to do that first sift consistently, in public, and with a clear path back into the queue.
+
+## What Agent Shin does
+
+Agent Shin runs on every external PR and issue. For each one, it asks a small LLM to check the body against the contribution rubric. Based on the verdict, it does one of three things:
+
+1. **Pass quietly** — most PRs and issues land here. Agent Shin leaves no comment; the contribution flows to the normal review queue.
+2. **Post a heads-up and give you 24 hours** — if the body is missing the basics, Agent Shin comments on the PR/issue with a list of what it couldn't find and a one-day grace window to fix it.
+3. **Auto-close after the grace window** — if 24 hours pass and the body still doesn't meet the bar, Agent Shin closes the PR/issue with a follow-up comment explaining how to bring it back. Closing is reversible — see [If you disagree](#if-you-disagree) below.
+
+A few important things Agent Shin will **not** do:
+
+- It does not touch PRs or issues from BerriAI org members, repo collaborators, or other bots.
+- It does not auto-close anything a human maintainer has already engaged with or labeled for review.
+- It does not delete your work. A closed PR keeps all its commits, comments, and diff — it's just moved out of the open queue.
+
+## The rubric for pull requests
+
+A PR passes triage when **both** of the following are true:
+
+### (1) Context — at least one of:
+
+- **A linked issue.** `Fixes #1234`, `Closes #1234`, `Resolves #1234`, or a link to the issue. A bare `#1234` without a closing keyword counts only if it's clearly the related issue.
+- **Or a clear problem description in the body** that explains what bug or missing feature this addresses (beyond the title), plus expected vs. actual behavior (or, for features, "what's possible now vs. with this PR").
+
+### (2) End-to-end QA proof — at least one of:
+
+- **A screenshot** (or before/after screenshots) showing the fix or feature working.
+- **A short screen recording / video** showing the fix or feature working.
+- **The exact commands you ran, paired with their real output**, demonstrating the change works against the real system — a real curl against the real upstream, a real proxy request, real log output from your dev instance.
+
+> **"No mocking allowed" for option 3.** Unit tests in `tests/test_litellm/*` stub the upstream LLM provider, the database, and the network. They're great for catching regressions and you should still add them — they're a hard requirement in the PR template — but they don't *prove* the change works end-to-end. For Agent Shin's purposes, the output of a real integration run, a real proxy hit, or a screenshot of the feature in the UI is what counts. "I ran pytest, 36 passed" is not QA proof on its own.
+
+If your PR has a linked issue but no QA proof, it fails. The linked issue gives the maintainer context; QA proof gives them evidence. We need both.
+
+### What doesn't count as QA proof
+
+- Generic claims: "I tested it", "works locally", "all tests pass".
+- A checked "I added tests" box with no output shown.
+- A description of what tests exist without their actual output in the PR body.
+- A linked issue (that's context, not proof).
+
+## The rubric for issues
+
+### Bug reports must contain:
+
+- **A reproduction** — runnable code, a curl command, or an example config a maintainer can paste into their machine.
+- **A screenshot, traceback, or log** showing the bug.
+- **Expected vs. actual behavior.**
+
+### Feature requests must contain:
+
+- **A clear description** of what LiteLLM should do that it doesn't today.
+- **A use case with a concrete example** — config, API call, UI flow, or scenario showing what's blocked today.
+
+## What happens if you got the heads-up comment
+
+The first time Agent Shin spots a problem, you'll get a comment that looks roughly like this:
+
+> 👋 Hi, thanks for the PR! I'm Agent Shin… I couldn't find: visual QA proof… ⏳ You have 1 day to address this before this PR is auto-closed.
+
+You have three ways to clear it:
+
+1. **Edit the PR/issue description** to add what's missing, then comment `@agent-shin reconsider`. Agent Shin will re-run the rubric. If it now passes, the grace warning is dismissed and the PR/issue stays open.
+2. **Comment `@greptileai`** on a PR to request a fresh Greptile review. A Greptile confidence score of 4/5 or higher is one of the signals that lifts a PR out of the close queue, even after the grace warning. (This works for closed PRs too.)
+3. **Do nothing** — if the body genuinely doesn't have what's being asked for, the PR/issue will be auto-closed after 24 hours. That's not a final answer — see below.
+
+## What happens after auto-close
+
+A closed PR or issue is not a "won't fix". To bring it back:
+
+- For PRs: you can either open a fresh PR with the same commits (the most reliable path, since GitHub doesn't always let an external contributor reopen a bot-closed PR), or comment `@agent-shin reconsider` after updating the description. If the rubric now passes, Agent Shin reopens the PR.
+- For issues: comment `@agent-shin reconsider` after editing the issue. The bot re-evaluates and reopens if it now meets the bar.
+- For either: a human maintainer can always reopen with a one-line comment. If you ping one and they agree the close was wrong, they'll override Agent Shin.
+
+## If you disagree
+
+Agent Shin uses an LLM, and LLMs aren't perfect. We've spot-checked it against 50+ recent contributions, but it will misjudge edge cases.
+
+If you think Agent Shin got it wrong:
+
+- **Comment `@agent-shin reconsider`** with a sentence explaining why. It re-runs the rubric on the current body and posts a fresh verdict.
+- **Ping a maintainer.** A human can override the bot at any time.
+- **File feedback** on the [Agent Shin tracking issue](https://github.com/BerriAI/litellm/issues) if you think the *rubric itself* is wrong — too strict, too lenient, or missing a category that should pass.
+
+The goal of this system is to make external contributions faster to land, not harder. If Agent Shin is doing the opposite for you, we want to know.
+
+## A note on what this isn't
+
+Agent Shin doesn't decide whether your idea is good, whether the code is correct, or whether the PR will be merged. Those are humans-and-Greptile decisions and they happen after triage. All Agent Shin does is check that the PR or issue has enough context and evidence for a maintainer to act on it. If you've linked the issue, described the problem, and shown that your fix works against the real system, you've cleared the bar — the rest of the review is up to the team.
+
+Thanks for contributing to LiteLLM.
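
The PR rubric and the three outcomes the post describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Agent Shin's actual implementation: `PRVerdict`, `Action`, `triage`, and every field name are hypothetical, the boolean fields stand in for whatever the LLM extracts from the PR body, and the `WAIT` state is an assumed placeholder for the time between the heads-up and the end of the grace window. The exemptions (org members, maintainer-engaged PRs) and the Greptile score signal are deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional

# The post describes a 24-hour grace window between heads-up and auto-close.
GRACE_WINDOW = timedelta(hours=24)


class Action(Enum):
    PASS_QUIETLY = "pass_quietly"
    POST_HEADS_UP = "post_heads_up"
    WAIT = "wait"  # assumed intermediate state, not named in the post
    AUTO_CLOSE = "auto_close"


@dataclass
class PRVerdict:
    """Hypothetical summary of what the rubric check found in a PR body."""

    has_linked_issue: bool = False
    has_problem_description: bool = False
    has_screenshot: bool = False
    has_recording: bool = False
    has_real_command_output: bool = False  # real commands plus real output, no mocks

    def has_context(self) -> bool:
        # Rubric part (1): at least one source of context.
        return self.has_linked_issue or self.has_problem_description

    def has_qa_proof(self) -> bool:
        # Rubric part (2): at least one form of end-to-end QA proof.
        return (
            self.has_screenshot
            or self.has_recording
            or self.has_real_command_output
        )

    def passes(self) -> bool:
        # Both parts are required; a linked issue alone is not enough.
        return self.has_context() and self.has_qa_proof()


def triage(
    verdict: PRVerdict,
    now: datetime,
    warned_at: Optional[datetime] = None,
) -> Action:
    """Map a rubric verdict and the warning timestamp to one outcome."""
    if verdict.passes():
        return Action.PASS_QUIETLY
    if warned_at is None:
        return Action.POST_HEADS_UP
    if now - warned_at >= GRACE_WINDOW:
        return Action.AUTO_CLOSE
    return Action.WAIT
```

Under these assumptions, a PR with a linked issue and a screenshot passes quietly, while a PR with only a linked issue gets the heads-up first and is closed only once the grace window has elapsed without the body being fixed.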