TL;DR: Paste a support ticket, get a structured triage report in 2-5 minutes. The agent pulls evidence from PostHog's own data, live docs, and GitHub issues in parallel — then produces a confidence-graded root cause assessment with a ready-to-send customer response. No guessing, no stale knowledge, and a read-only investigation workflow by design.
Watch it diagnose a real unanswered issue
Every support engineer gets a head start on every ticket.
The bottleneck in support isn't answering tickets — it's the 15-30 minutes of investigation before an answer is even possible. Checking person properties, querying events, searching GitHub for known bugs, cross-referencing docs, figuring out which SDK version matters — all before a single word is written back to the customer.
This agent runs that same investigation in parallel — the same MCP queries, the same GitHub searches, the same docs lookups — and produces a structured report that a human can review, edit, and send. It doesn't replace support engineers; it gives them a head start so they can focus on the hard problems.
```
Without the agent                    With the agent

Ticket arrives                       Ticket arrives
      │                                    │
      ▼                                    ▼
TSE reads ticket (2 min)             Agent triages (5-10 min)
      │                                    │
      ▼                                    ▼
Check PostHog data (5 min)           TSE reviews report (2 min)
      │                                    │
      ▼                                    ▼
Search GitHub issues (5 min)         Edit + send response (3 min)
      │                                    │
      ▼                                    ▼
Read docs (5 min)                    Done ✓ (~10-15 min total)
      │
      ▼
Draft response (5 min)
      │
      ▼
Done ✓ (~20-40 min total)
```
Here's the agent triaging an actual support ticket — a customer reporting that `posthog.displaySurvey()` fails to show a survey on subsequent calls after closing it.
> "When I use `posthog.displaySurvey()` to programmatically show a survey, it displays correctly the first time but fails to appear on subsequent attempts after being closed. My survey is API-mode. I call `displaySurvey()` with `ignoreConditions: true` and `ignoreDelay: true`. After closing and calling `displaySurvey()` again with the same survey ID, nothing appears. Manually removing the DOM element (`.PostHogSurvey-{surveyId}`) before calling `displaySurvey()` fixes it. posthog-js latest. Chrome and Safari."
Phase 0 — Intake (5s): Parsed the ticket, identified product area (Surveys), extracted key details (API mode, displaySurvey, DOM cleanup workaround), routed to posthog-survey-diagnosis skill.
Phase 1 — Parallel Research (30s): Fired these simultaneously:
| Query | Source | Finding |
|---|---|---|
| `gh search issues --repo PostHog/posthog-js "displaySurvey"` | GitHub CLI | Exact match: Issue #2586 — closed Nov 2025 |
| `gh issue view 2586` | GitHub CLI | Root cause confirmed by @adboio: cleanup only triggered for Popover type, not API type displayed as popover |
| `gh pr view 2595` | GitHub CLI | Fix PR merged Nov 19 2025 — changed cleanup to check displayType not surveyType |
| `npm view posthog-js version` | npm | Current latest: v1.363.5 — fix available since v1.296.0 |
| `docs-search("displaySurvey API mode")` | PostHog MCP | Confirmed displaySurvey is the recommended method |
| Web search: displaySurvey not showing second time | WebSearch | Additional confirmation across multiple reports |
Phase 2 — Triage Report:
- Root Cause: Known fixed bug (posthog-js#2586)
- Confidence: ✅ Confirmed by data
- Fix: Upgrade to posthog-js >= 1.296.0
- Workaround: Remove DOM element before re-calling (customer already found this)
- Escalation: Not needed — unless the customer confirms they're already on >= 1.296.0
Thanks for the detailed report and reproduction steps — really helpful.
This is a known bug fixed in posthog-js v1.296.0 (November 2025). The close handler only cleaned up DOM elements for Popover-type surveys, not API-type surveys rendered as popovers. Fix: PR #2595.
To resolve:
Run `npm install posthog-js@latest` (current latest is v1.363.5). Could you confirm your exact version? Your workaround of removing the DOM element is safe to keep using until you upgrade.
Time: 2-5 minutes end-to-end versus 20-40 minutes of manual investigation.
```bash
# 1. Clone and enter the project
git clone https://github.com/mongo-ai/posthog-triage-agent.git
cd posthog-triage-agent

# 2. Create .env from the tracked template
cp .env.example .env

# 3. Fill in .env (or export these in your shell)
export POSTHOG_API_KEY="phx_..."   # Personal API key (Settings → Personal API Keys → MCP preset)
export POSTHOG_ORG_ID="..."        # Settings → Organization → General
export POSTHOG_PROJECT_ID="..."    # Visible in URL: posthog.com/project/<ID>
export GITHUB_PAT="ghp_..."        # GitHub PAT with repo read access

# 4. Install Playwright for browser inspection
npx playwright install chromium

# 5. Run the setup smoke test
./test-setup.sh

# 6. Open Claude Code
claude

# Then invoke the agent with a ticket:
#   /posthog-support-agent <paste ticket text here>
#
# Demo tickets test the workflow (intake, search, report structure).
# They reference synthetic entities that won't exist in your project —
# "not found" results are expected. For a full demo, use a real ticket.
```

The agent runs a three-phase workflow. Code gathers facts; the model connects them.
| Phase | What happens | Time |
|---|---|---|
| Phase 0: Intake | Parse the ticket — extract distinct_id, product area, flag keys, timeframe, urgency | ~5s |
| Phase 1: Parallel Research | Fire 7+ queries simultaneously across PostHog MCP, GitHub, DeepWiki, Context7, and web search | ~30s |
| Phase 2: Synthesis | Produce an evidence-graded triage report with root cause, known-bug match, and draft customer response | ~15s |
Total: 2-5 minutes per ticket, versus 20-40 minutes of manual investigation.
The ticket intake skill parses messy customer messages into structured investigation inputs:
- Identifiers: distinct_id, person ID, event names, flag keys, error fingerprints
- Context: product area, timeframe, blast radius, urgency level
- Routing: which SDK repo is relevant, which diagnosis skill to invoke
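As a rough sketch, the intake step's structured output might look like the following (field names and extraction heuristics are illustrative, not the skill's actual schema):

```python
import re

def parse_ticket(text: str) -> dict:
    """Hypothetical Phase 0 parser: raw ticket text -> structured investigation inputs."""
    return {
        "identifiers": {
            # Naive extraction: grab anything that looks like a distinct_id mention
            "distinct_id": (m.group(1) if (m := re.search(r"distinct_id[:=]\s*(\S+)", text)) else None),
            "flag_keys": re.findall(r"flag[:=]\s*([\w-]+)", text),
        },
        "context": {
            "product_area": "surveys" if "survey" in text.lower() else "unknown",
            "urgency": "P1" if "production" in text.lower() else "P2",
        },
        "routing": {
            "skill": "posthog-survey-diagnosis" if "survey" in text.lower() else "posthog-event-diagnosis",
        },
    }

ticket = parse_ticket("Survey not showing. distinct_id: user_123")
```

The real skill handles far messier input; the point is the shape of the output, which downstream phases consume directly.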
Parallelization is mandatory. Every tool call that could run simultaneously does:
| Track | Source | What it checks |
|---|---|---|
| PostHog MCP | Customer's pinned project | Person properties, events, flag definitions, error details, survey config, pipeline status |
| PostHog Docs | `docs-search` | Current feature config, known limitations, setup requirements |
| DeepWiki | Source code analysis | PostHog codebase architecture for the relevant component |
| Context7 | SDK documentation | Version-specific changes, migration guides, API surface |
| GitHub | Issue search (2+ variants) | Known bugs, PRs, fix status — searches BEFORE blaming the customer |
| Web Search | Broader symptom search | Community reports, Stack Overflow, related issues |
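The fan-out/fan-in shape of the research blast can be sketched with `asyncio`; the stubbed coroutine below stands in for the real MCP and CLI calls, which are not shown:

```python
import asyncio

async def run_query(source: str, query: str) -> dict:
    # Stand-in for an MCP or CLI tool call; a real implementation awaits I/O here.
    await asyncio.sleep(0)
    return {"source": source, "query": query, "finding": "..."}

async def research_blast(ticket: dict) -> list[dict]:
    tracks = [
        ("PostHog MCP", f"events for {ticket['distinct_id']}"),
        ("PostHog Docs", f"docs-search {ticket['product_area']}"),
        ("GitHub", f"issues matching {ticket['symptom']}"),
        ("Web Search", ticket["symptom"]),
    ]
    # Every query that can run simultaneously does: one gather, not a loop of awaits.
    return await asyncio.gather(*(run_query(s, q) for s, q in tracks))

results = asyncio.run(research_blast(
    {"distinct_id": "user_123", "product_area": "surveys", "symptom": "survey not showing"}
))
```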
The output is always structured:
- Evidence table — every claim cites a real tool/query
- Root cause assessment — with honest confidence grading
- Known bug check — GitHub search with 2+ query variants
- Draft customer response — empathetic, specific, actionable
- Escalation decision — with engineering-ready context if needed
Every conclusion is tagged with one of three levels. The agent downgrades confidence rather than overstates certainty.
| Level | Meaning | When |
|---|---|---|
| Confirmed by data | Direct evidence proves the cause | Flag definition shows wrong condition; events query shows zero ingestion |
| Likely based on pattern match | Evidence strongly suggests but doesn't prove | Symptoms match a known GitHub issue; docs say feature requires X |
| Suspected, needs human verification | Plausible hypothesis but insufficient evidence | Can't inspect customer's code; multiple possible causes |
If confidence is only "suspected," the agent produces an engineering-ready escalation packet instead of pretending certainty.
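A minimal sketch of downgrade-by-default grading (the `Evidence` type and `grade_confidence` helper are hypothetical, not from the repo):

```python
from dataclasses import dataclass

CONFIRMED = "Confirmed by data"
LIKELY = "Likely based on pattern match"
SUSPECTED = "Suspected, needs human verification"

@dataclass
class Evidence:
    source: str            # e.g. "PostHog MCP", "GitHub CLI"
    proves_cause: bool     # direct proof: flag definition, zero-ingestion query
    matches_pattern: bool  # symptom match against a known issue or doc

def grade_confidence(evidence: list[Evidence]) -> str:
    if any(e.proves_cause for e in evidence):
        return CONFIRMED
    if any(e.matches_pattern for e in evidence):
        return LIKELY
    # Insufficient evidence: emit an escalation packet, never a guess
    return SUSPECTED
```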
The tracked repo currently includes 18 skills. The agent routes across domain diagnosis skills plus supporting workflow skills.
| Skill | Covers | Key Checks |
|---|---|---|
| Ticket Intake | Normalize raw tickets | Issue type, product area, identifiers, blast radius, first investigation path |
| Feature Flag Diagnosis | Flags, experiments, A/B tests | Flag definition, evaluation events, identify timing, property race conditions, bootstrap flicker, local evaluation latency |
| Session Replay Diagnosis | Web + mobile replay | SDK version, capture vs playback, CSP/CORS, config flags, mobile OOM/throttle, masking issues |
| Event Diagnosis | Missing, delayed, duplicated, filtered events | Event query, SDK version/region, ad-blocker/proxy detection, ingestion warnings, pipeline filters, UUID dedup |
| Error Tracking Diagnosis | Exceptions, stack traces, source maps | Exception events, source map upload verification, release context, symbolication |
| Survey Diagnosis | Display, targeting, response collection | Survey status, targeting conditions, URL rules, repeat suppression, surveys_opt_in config, API-mode integration |
| Pipeline Diagnosis | CDP functions, destinations | Function status, execution logs, auth/rate-limit errors, trigger/filter conditions, credential/payload verification |
| Data Warehouse Diagnosis | Warehouse syncs and joins | Sync failures, schema mismatches, stale data, query errors, source-specific setup |
| Web Analytics Diagnosis | Pageviews, bounce rate, attribution | $pageview/$pageleave, SPA routing, UTMs, reverse proxy gaps, GA discrepancies |
| Billing Diagnosis | Plan/entitlement/account issues | Plan tier, feature availability, billing bugs, access vs role confusion |
| Self-Hosted Diagnosis | Docker/Helm/Kubernetes issues | Infra boundaries, ClickHouse/Postgres health, reverse proxy, migrations |
| Site Inspector | Browser evidence (escalation only) | PostHog SDK presence, config extraction, network resources, CSP violations — requires public URL and browser-relevant issue |
| Skill | Purpose |
|---|---|
| Triage Report | Final synthesis with evidence and confidence grading |
| Response Drafting | Customer-facing replies matched to severity and certainty |
| Escalation | Engineering-ready escalation packet |
| Ship Fix | PR-ready docs or small-fix brief |
| KB Article | Knowledge-base or docs draft from a resolved issue |
| Slack Triage | Slack formatting and posting workflow |
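Each skill is a directory under `.claude/skills/` containing a `SKILL.md` that the agent loads on demand. A minimal sketch of what one skill's frontmatter and body might look like (illustrative only; the tracked skills' actual names, descriptions, and checklists may differ):

```markdown
---
name: posthog-survey-diagnosis
description: Diagnose survey display, targeting, and response-collection issues. Use when a ticket mentions surveys, displaySurvey, or missing responses.
---

# Survey Diagnosis

1. Check survey status and targeting conditions via PostHog MCP.
2. Search posthog-js issues for known display bugs (2+ query variants).
3. Verify surveys_opt_in and API-mode integration details.
```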
```
Support Ticket
      │
      ▼
┌─────────────────────┐
│   Ticket Intake     │  Parse → extract identifiers → route to skill
│   (Phase 0, ~5s)    │
└─────────┬───────────┘
          │
          ▼
┌─────────────────────────────────────────────────────────┐
│              PARALLEL RESEARCH BLAST                    │
│                  (Phase 1, ~30s)                        │
│                                                         │
│  ┌──────────────┐  ┌───────────┐  ┌──────────────────┐  │
│  │ Domain Skill │  │  PostHog  │  │     GitHub       │  │
│  │ (specialized)│  │    MCP    │  │  (MCP + gh CLI)  │  │
│  │              │──│  EU / US  │  │  10 SDK repos    │  │
│  │ Flag diag    │  │           │  │  PostHog/posthog │  │
│  │ Replay diag  │  ├───────────┤  └──────────────────┘  │
│  │ Event diag   │  │ DeepWiki  │  ┌──────────────────┐  │
│  │ Error diag   │  │    MCP    │  │   Playwright     │  │
│  │ Survey diag  │  ├───────────┤  │ (escalation only)│  │
│  │ Pipeline diag│  │ Context7  │  └──────────────────┘  │
│  │ Site inspect │  │    MCP    │  ┌──────────────────┐  │
│  └──────────────┘  └───────────┘  │   Web Search     │  │
│                                   └──────────────────┘  │
└─────────────────────────┬───────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                    TRIAGE REPORT                        │
│                   (Phase 2, ~15s)                       │
│                                                         │
│  Evidence Table → Root Cause → Known Bugs → Draft Reply │
│                                                         │
│  ┌────────────────┐ ┌──────────┐ ┌───────────────────┐  │
│  │ ● Confirmed    │ │ ● Likely │ │ ● Suspected       │  │
│  └────────────────┘ └──────────┘ └───────────────────┘  │
└─────────────────────────────────────────────────────────┘
```
The agent investigates issues across PostHog's multi-layer ingestion pipeline. This reference shows how data flows from SDKs through Kafka to ClickHouse/Postgres, plus the diagnostic strategies the agent applies at each layer.
Credit: Generated with NotebookLM
The workflow is intended to be read-only. The agent prompt forbids mutations, and .claude/settings.json explicitly denies:
- `rm` — no file deletion
- `git push` — no repo mutations
- `curl -X POST/PUT/PATCH/DELETE` — no write API calls
Important nuance: this is a workflow-level guarantee, not a formally sandboxed proof of non-mutation. The shell allowlist is still broader than a hard read-only sandbox because local diagnostics rely on gh, node, npx, and Playwright. Treat the repo as operator-reviewed automation, not an unbreakable enforcement boundary.
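For context, deny rules of this kind take roughly the following shape in `.claude/settings.json` (an illustrative sketch; the tracked file's exact patterns may differ):

```json
{
  "permissions": {
    "deny": [
      "Bash(rm:*)",
      "Bash(git push:*)",
      "Bash(curl -X POST:*)",
      "Bash(curl -X PUT:*)",
      "Bash(curl -X PATCH:*)",
      "Bash(curl -X DELETE:*)"
    ]
  }
}
```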
| MCP Server | Purpose | What it provides |
|---|---|---|
| PostHog (EU + US) | Primary evidence source | Persons, events, flags, errors, surveys, logs, HogQL, CDP functions, docs search |
| DeepWiki | Source code analysis | Architecture-level understanding of PostHog codebase components |
| Context7 | Customer stack docs | Framework docs (Next.js, React, Django, etc.) for diagnosing integration-boundary issues |
| GitHub | Known-issue search (Agent SDK) | Issue tracking, PR status, release notes — required for headless Agent SDK deployments where no CLI is available |
| CLI Tool | Purpose | Why CLI over MCP |
|---|---|---|
| `gh` (GitHub CLI) | Local GitHub fallback and deep inspection | Useful for fast local inspection, `--json` output, and full comment threads. The tracked agent prompt prefers GitHub MCP first, then falls back to `gh` when needed. |
| Playwright | Browser inspection (escalation only) | SDK presence, config extraction, CSP/CORS detection — only for browser-relevant issues with a public URL |
The current prototype runs interactively in Claude Code. The production vision is a pipeline that complements existing support tooling (Zendesk, Pylon, HogHero) by pre-triaging tickets so engineers start with evidence, not a blank page:
```
 Zendesk MCP        PostHog MCP        Slack MCP
(ticket arrives) → (investigation) → (report delivered)
                                          │
                                          ▼
                                 TSE reviews + sends
                                   (5 min, not 20)
```
| Integration | MCP Server | Status | What it provides |
|---|---|---|---|
| Zendesk | `zendesk-mcp-server` | External option only | Pull tickets, read comments/tags/priority, search ticket history |
| Slack | Official Slack MCP | Configured in `.mcp.json` and permitted in `settings.json` — requires OAuth login on first use | Post triage reports to channels, thread follow-ups, notify on-call |
| PostHog | `mcp.posthog.com` | In use | Project data, docs search, event queries, flag definitions |
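For reference, a minimal `.mcp.json` covering the servers already in use might look like this (a sketch; the tracked file includes more servers and auth configuration):

```json
{
  "mcpServers": {
    "posthog": {
      "type": "http",
      "url": "https://mcp.posthog.com/mcp"
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"]
    }
  }
}
```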
Target: "Triage Zendesk ticket #48291 and post the report to #support-escalations"
The path from prototype to production is the Claude Agent SDK — the same tools, agent loop, and context management that power Claude Code, programmable in Python and TypeScript.
What it unlocks:
- Programmatic invocation — trigger triage from an API call, not a chat window
- Hooks — `PreToolUse`/`PostToolUse` for audit logging, cost tracking, and guardrails
- Subagents — spawn specialized diagnosis agents per product area in parallel
- Sessions — resume investigations across multiple exchanges with full context
```python
# Future: headless triage agent
from claude_agent_sdk import query, ClaudeAgentOptions

async for message in query(
    prompt=f"Triage this support ticket: {ticket_text}",
    options=ClaudeAgentOptions(
        allowed_tools=["Read", "Glob", "Grep", "WebSearch", "WebFetch", "Agent"],
        mcp_servers={
            "posthog": {"type": "http", "url": "https://mcp.posthog.com/mcp", ...},
            "github": {"command": "npx", "args": ["-y", "@modelcontextprotocol/server-github"]},
        },
    ),
):
    if hasattr(message, "result"):
        post_to_slack(message.result)
```

Use PostHog itself to observe the agent. PostHog LLM Analytics tracks generations, traces, costs, and latency — the same product the agent investigates, now monitoring the agent.
```python
# `client` is assumed here to be an Anthropic client wrapped by PostHog's
# LLM analytics SDK, which accepts the posthog_* keyword arguments below.
response = client.messages.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": ticket_text}],
    posthog_distinct_id="support-agent-v1",
    posthog_trace_id=f"triage-{ticket_id}",
    posthog_properties={
        "product_area": "session_replay",
        "priority": "P1",
        "$ai_prompt_name": "posthog-triage-agent",
    },
)
```

A feedback loop: the agent triages PostHog issues while PostHog monitors the agent's performance.
This is a working prototype, not a production system. Deploying it for real support requires answering several questions that are intentionally left open.
| Question | Status | Notes |
|---|---|---|
| Can an AI agent directly query customer project data via MCP? | Needs investigation | The prototype currently pins to a single org/project. In production, would the agent query customer accounts directly, or only PostHog's internal instance? This is a policy decision, not a technical one. |
| What customer data flows through the LLM? | Needs review | Event properties, person properties, and distinct IDs pass through Claude during triage. A compliance review is needed to determine what data can be sent to third-party LLM providers, and whether PII scrubbing or data masking is required before queries. |
| EU data residency | Partially addressed | PostHog MCP supports EU/US region pinning. But the LLM provider (Anthropic) processes data — the compliance implications of routing EU customer data through a US-based LLM need evaluation. |
| Component | Status | What's needed |
|---|---|---|
| PostHog MCP | Working | Core evidence source — stable and in use |
| GitHub search | Working | gh CLI + MCP fallback — reliable |
| Zendesk MCP | Not configured in this repo | The community Zendesk MCP exists but hasn't been evaluated here for reliability, auth model, or feature completeness. |
| Slack MCP | Partially wired | .mcp.json and a Slack formatting skill exist, but end-to-end posting has not been validated in the tracked repo. |
| Claude Agent SDK | Not started | The headless deployment path. The prototype validates the workflow; migrating to the Agent SDK is the production path. |
- Cost per triage: Each triage involves multiple LLM calls + MCP queries. The per-ticket cost hasn't been measured. PostHog's own LLM Analytics could track this.
- Accuracy measurement: Initial blind tests score 97/100 across 5 real issues (see `evaluations.md`), but a production accuracy loop comparing agent output to TSE-written responses at scale would be needed.
- Internal vs customer data: The safest starting point may be pointing the agent at PostHog's internal dogfood instance rather than customer accounts — triaging based on what the support team can see, not direct customer data access. This sidesteps most compliance questions while still providing value.
- Hallucination risk: The agent is designed to cite sources and grade confidence, but LLMs can still present plausible-sounding information that isn't grounded in evidence. Human review of every triage report remains essential.
```
.
├── .claude/
│   ├── agents/
│   │   └── posthog-support-agent.md   # Agent brain — workflow, rules, speed requirements
│   ├── skills/
│   │   ├── posthog-ticket-intake/     # Phase 0: parse tickets into structured inputs
│   │   ├── posthog-feature-flag-diagnosis/
│   │   ├── posthog-session-replay-diagnosis/
│   │   ├── posthog-event-diagnosis/
│   │   ├── posthog-error-tracking-diagnosis/
│   │   ├── posthog-survey-diagnosis/
│   │   ├── posthog-pipeline-diagnosis/
│   │   ├── posthog-data-warehouse-diagnosis/
│   │   ├── posthog-web-analytics-diagnosis/
│   │   ├── posthog-billing-diagnosis/
│   │   ├── posthog-selfhosted-diagnosis/
│   │   ├── posthog-site-inspector/
│   │   ├── posthog-triage-report/     # Phase 2: synthesize evidence into report
│   │   ├── posthog-response-drafting/
│   │   ├── posthog-escalation/
│   │   ├── posthog-ship-fix/
│   │   ├── posthog-kb-article/
│   │   └── posthog-slack-triage/
│   └── settings.json                  # Read-only permissions + deny rules
├── .mcp.json                          # MCP server connections (PostHog EU/US, DeepWiki, Context7, GitHub, Slack)
├── .env.example                       # Copy to .env before running the smoke test
├── CLAUDE.md                          # Workflow instructions
├── demo-tickets.md                    # (local-only, gitignored) Synthetic tickets for demos
├── evaluations.md                     # Blind test results + live issue diagnoses
├── test-setup.sh                      # Smoke test for all connections
└── docs/
    ├── architecture.svg
    └── posthog-architecture-blueprint.jpeg
```
Inspired by PostHog's own lessons: What we wish we knew before building AI agents.
| Decision | Why |
|---|---|
| Read-only only | Support agents should never mutate customer state — one misconfigured flag could break production |
| Least privilege | MCP access is scoped to one org/project with feature filtering, not blanket admin access |
| Live docs over cached knowledge | SDK behavior changes every release — the agent fetches current docs before making claims |
| Known-bug search before user blame | Searching GitHub with 2+ query variants before concluding "misconfiguration" prevents false accusations |
| Confidence grading | Three explicit levels instead of fake certainty — support teams need to know what's proven vs suspected |
| Parallel-first architecture | 7+ queries fire simultaneously — a human doing this sequentially takes 15 minutes; the agent takes 30 seconds |
| Specialized skills over one big prompt | Multiple diagnosis and workflow skills with specific checks per product area, not one generic "investigate everything" instruction |
| Escalation packets, not guesses | When evidence is insufficient, the agent produces an engineering-ready escalation packet instead of pretending to know |
