From 0108b5bd091ead708f0632db28d08dae99d6a964 Mon Sep 17 00:00:00 2001
From: mateo-berri <277851410+mateo-berri@users.noreply.github.com>
Date: Thu, 28 May 2026 03:37:51 +0000
Subject: [PATCH] blog: introduce Agent Shin triage bot for external PRs and
 issues

Adds the public-facing explanation linked from every Agent Shin comment.
Covers what the bot checks, the PR + issue rubrics (including the
non-mocked QA-proof requirement), the 24h grace window, and the
@agent-shin reconsider / @greptileai recovery paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---
 blog/agent_shin_triage/index.md | 120 ++++++++++++++++++++++++++++++++
 1 file changed, 120 insertions(+)
 create mode 100644 blog/agent_shin_triage/index.md

diff --git a/blog/agent_shin_triage/index.md b/blog/agent_shin_triage/index.md
new file mode 100644
index 00000000..8daefebd
--- /dev/null
+++ b/blog/agent_shin_triage/index.md
@@ -0,0 +1,120 @@
+---
+slug: agent-shin-triage
+title: "Meet Agent Shin: how we triage external PRs and issues on LiteLLM"
+date: 2026-05-28T10:00:00
+authors:
+  - mateo
+description: "Agent Shin is the automated triage bot for BerriAI/litellm. This post explains what it checks, why we built it, and exactly what to do if it comments on your PR or issue."
+tags: [community, contributing]
+hide_table_of_contents: false
+---
+
+If you've opened a pull request or issue on [BerriAI/litellm](https://github.com/BerriAI/litellm), you may have heard from **Agent Shin** — the automated triage bot that runs against external contributions. This post is the long-form explanation that every Agent Shin comment links to: what it does, why, and what to do if you think it got things wrong.
+
+{/* truncate */}
+
+## Why we built it
+
+LiteLLM gets a lot of inbound. On a typical week the repo sees ~150 new pull requests and ~100 new issues, most of them from external contributors we've never worked with before. That volume is wonderful — it's why LiteLLM has integrations with 100+ providers — but it has a cost: a meaningful fraction of those PRs and issues are not actionable as filed.
+
+The most common patterns we see are:
+
+- A PR with no body, or a body that's just the template with nothing filled in.
+- A PR that touches non-trivial provider code but doesn't link an issue, doesn't describe the bug, and doesn't show any output of the fix actually working.
+- A bug report that's two sentences and a screenshot of a stack trace — no repro, no config, no way for a maintainer to make it happen on their machine.
+- A feature request with no use case (just "support X").
+
+Each of these costs a maintainer 10–30 minutes of reading code, guessing intent, and writing a comment asking for more information. Multiply by a few hundred and the result is that high-quality contributions — the ones that *did* link an issue, *did* attach a screenshot, *did* show a clean before/after — wait days for a first response while we sift.
+
+Agent Shin's job is to do that first sift consistently, in public, and with a clear path back into the queue.
+
+## What Agent Shin does
+
+Agent Shin runs on every external PR and issue. For each one, it asks a small LLM to check the body against the contribution rubric. Based on the verdict, it does one of three things:
+
+1. **Pass quietly** — most PRs and issues land here. Agent Shin leaves no comment; the contribution flows to the normal review queue.
+2. **Post a heads-up and give you 24 hours** — if the body is missing the basics, Agent Shin comments on the PR/issue with a list of what it couldn't find and a one-day grace window to fix it.
+3. **Auto-close after the grace window** — if 24 hours pass and the body still doesn't meet the bar, Agent Shin closes the PR/issue with a follow-up comment explaining how to bring it back. Closing is reversible — see [If you disagree](#if-you-disagree) below.
+
+A few important things Agent Shin will **not** do:
+
+- It does not touch PRs or issues from BerriAI org members, repo collaborators, or other bots.
+- It does not auto-close anything a human maintainer has already engaged with or labeled for review.
+- It does not delete your work. A closed PR keeps all its commits, comments, and diff — it's just moved out of the open queue.
+
+## The rubric for pull requests
+
+A PR passes triage when **both** of the following are true:
+
+### (1) Context — at least one of:
+
+- **A linked issue.** `Fixes #1234`, `Closes #1234`, `Resolves #1234`, or a link to the issue. A bare `#1234` without a closing keyword counts only if it's clearly the related issue.
+- **Or a clear problem description in the body** that explains what bug or missing feature this addresses (beyond the title), plus expected vs. actual behavior (or, for features, "what's possible now vs. with this PR").
+
+### (2) End-to-end QA proof — at least one of:
+
+- **A screenshot** (or before/after screenshots) showing the fix or feature working.
+- **A short screen recording / video** showing the fix or feature working.
+- **The exact commands you ran, paired with their real output**, demonstrating the change works against the real system — a real curl against the real upstream, a real proxy request, real log output from your dev instance.
+
+> **"No mocking allowed" for option 3.** Unit tests in `tests/test_litellm/*` stub the upstream LLM provider, the database, and the network. They're great for catching regressions and you should still add them — they're a hard requirement in the PR template — but they don't *prove* the change works end-to-end. For Agent Shin's purposes, the output of a real integration run, a real proxy hit, or a screenshot of the feature in the UI is what counts. "I ran pytest, 36 passed" is not QA proof on its own.
+
+If your PR has a linked issue but no QA proof, it fails. The linked issue gives the maintainer context; QA proof gives them evidence. We need both.
+
+### What doesn't count as QA proof
+
+- Generic claims: "I tested it", "works locally", "all tests pass".
+- A checked "I added tests" box with no output shown.
+- A description of what tests exist without their actual output in the PR body.
+- A linked issue (that's context, not proof).
+
+## The rubric for issues
+
+### Bug reports must contain:
+
+- **A reproduction** — runnable code, a curl command, or an example config a maintainer can paste into their machine.
+- **A screenshot, traceback, or log** showing the bug.
+- **Expected vs. actual behavior.**
+
+### Feature requests must contain:
+
+- **A clear description** of what LiteLLM should do that it doesn't today.
+- **A use case with a concrete example** — config, API call, UI flow, or scenario showing what's blocked today.
+
+## What happens if you got the heads-up comment
+
+The first time Agent Shin spots a problem, you'll get a comment that looks roughly like this:
+
+> 👋 Hi, thanks for the PR! I'm Agent Shin… I couldn't find: visual QA proof… ⏳ You have 1 day to address this before this PR is auto-closed.
+
+You have three ways to clear it:
+
+1. **Edit the PR/issue description** to add what's missing, then comment `@agent-shin reconsider`. Agent Shin will re-run the rubric. If it now passes, the grace warning is dismissed and the PR/issue stays open.
+2. **Comment `@greptileai`** on a PR to request a fresh Greptile review. A Greptile confidence score of 4/5 or higher is one of the signals that lifts a PR out of the close queue, even after the grace warning. (This works for closed PRs too.)
+3. **Do nothing** — if the body genuinely doesn't have what's being asked for, the PR/issue will be auto-closed after 24 hours. That's not a final answer — see below.
+
+## What happens after auto-close
+
+A closed PR or issue is not a "won't fix". To bring it back:
+
+- For PRs: you can either open a fresh PR with the same commits (the most reliable path, since GitHub doesn't always let an external contributor reopen a bot-closed PR), or comment `@agent-shin reconsider` after updating the description. If the rubric now passes, Agent Shin reopens the PR.
+- For issues: comment `@agent-shin reconsider` after editing the issue. The bot re-evaluates and reopens if it now meets the bar.
+- For either: a human maintainer can always reopen with a one-line comment. If you ping one and they agree the close was wrong, they'll override Agent Shin.
+
+## If you disagree
+
+Agent Shin uses an LLM, and LLMs aren't perfect. We've spot-checked it against 50+ recent contributions, but it will misjudge edge cases.
+
+If you think Agent Shin got it wrong:
+
+- **Comment `@agent-shin reconsider`** with a sentence explaining why. It re-runs the rubric on the current body and posts a fresh verdict.
+- **Ping a maintainer.** A human can override the bot at any time.
+- **File feedback** on the [Agent Shin tracking issue](https://github.com/BerriAI/litellm/issues) if you think the *rubric itself* is wrong — too strict, too lenient, or missing a category that should pass.
+
+The goal of this system is to make external contributions faster to land, not harder. If Agent Shin is doing the opposite for you, we want to know.
+
+## A note on what this isn't
+
+Agent Shin doesn't decide whether your idea is good, whether the code is correct, or whether the PR will be merged. Those are humans-and-Greptile decisions and they happen after triage. All Agent Shin does is check that the PR or issue has enough context and evidence for a maintainer to act on it. If you've linked the issue, described the problem, and shown that your fix works against the real system, you've cleared the bar — the rest of the review is up to the team.
+
+Thanks for contributing to LiteLLM.