feat(042): telemetry tier 2 — privacy-respecting usage signals#373

Open
Dumbris wants to merge 8 commits into main from 042-telemetry-tier2

Dumbris (Member) commented on Apr 10, 2026

Summary

Additive expansion of the v1 anonymous telemetry heartbeat with 12 new privacy-respecting signals that answer the product questions we cannot answer today: "which client surface do users prefer?", "which built-in MCP tools matter?", "are installs hitting startup failures?", "where do errors cluster?". Implements spec 042-telemetry-tier2.

  • In-memory counter registry with atomic surface counters and locked maps for built-in tools, REST endpoint histogram, error categories, doctor checks.
  • Counters reset only on a successful 2xx heartbeat send; failures preserve state for retry.
  • Daily heartbeat now includes schema_version: 2, surface_requests, builtin_tool_calls, upstream_tool_call_count_bucket, rest_endpoint_calls, feature_flags, last_startup_outcome, previous_version/current_version, error_category_counts, doctor_checks, anonymous_id_created_at.
  • New CLI: mcpproxy telemetry show-payload (no network call, audit tool).
  • New env vars honored: DO_NOT_TRACK (consoledonottrack.com convention), CI (auto-disable in CI).
  • One-time first-run notice on stderr the first time mcpproxy serve runs.
  • Annual anonymous-ID rotation with legacy-install migration and clock-skew safety.
  • X-MCPProxy-Client header set by the CLI HTTP client (cli/<version>), the web UI fetch (webui/web), and the macOS tray URLRequest (tray/<bundle version>).
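The counter model in the bullets above (atomic surface counters, lock-protected maps, reset only after a 2xx response) can be sketched roughly as follows. This is a minimal illustration, not the code in internal/telemetry/registry.go: the type layout, the method names (RecordSurface, RecordBuiltinTool, FinishSend) and the retrieve_tools key are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// CounterRegistry is a hypothetical, simplified stand-in for the registry
// described in this PR. Only two surfaces and one map are modelled.
type CounterRegistry struct {
	mcpRequests atomic.Int64 // lock-free surface counters
	cliRequests atomic.Int64

	mu           sync.RWMutex
	builtinTools map[string]int64 // keys are assumed to come from a fixed allow-list
}

func NewCounterRegistry() *CounterRegistry {
	return &CounterRegistry{builtinTools: make(map[string]int64)}
}

// RecordSurface bumps the counter for a known surface and ignores the rest.
func (r *CounterRegistry) RecordSurface(surface string) {
	switch surface {
	case "mcp":
		r.mcpRequests.Add(1)
	case "cli":
		r.cliRequests.Add(1)
	}
}

func (r *CounterRegistry) RecordBuiltinTool(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.builtinTools[name]++
}

// Snapshot is a point-in-time copy that is safe to marshal.
type Snapshot struct {
	Surface      map[string]int64
	BuiltinTools map[string]int64
}

func (r *CounterRegistry) Snapshot() Snapshot {
	r.mu.RLock()
	defer r.mu.RUnlock()
	tools := make(map[string]int64, len(r.builtinTools))
	for k, v := range r.builtinTools {
		tools[k] = v
	}
	return Snapshot{
		Surface: map[string]int64{
			"mcp": r.mcpRequests.Load(),
			"cli": r.cliRequests.Load(),
		},
		BuiltinTools: tools,
	}
}

func (r *CounterRegistry) Reset() {
	r.mcpRequests.Store(0)
	r.cliRequests.Store(0)
	r.mu.Lock()
	defer r.mu.Unlock()
	r.builtinTools = make(map[string]int64)
}

// FinishSend resets the counters only when the heartbeat got a 2xx status.
// It reports whether a reset happened.
func (r *CounterRegistry) FinishSend(statusCode int) bool {
	if statusCode >= 200 && statusCode < 300 {
		r.Reset()
		return true
	}
	return false
}

func main() {
	r := NewCounterRegistry()
	r.RecordSurface("cli")
	r.RecordBuiltinTool("retrieve_tools")

	r.FinishSend(503) // failed send: counts are kept for the next attempt
	fmt.Println(r.Snapshot().Surface["cli"])

	r.FinishSend(200) // successful send: counts start over
	fmt.Println(r.Snapshot().Surface["cli"])
}
```

In this sketch an event recorded between Snapshot and Reset would be dropped; a real implementation may want to close that window, for example by swapping counters atomically.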

Privacy contract

The spec's hard privacy constraints are enforced by internal/telemetry/payload_privacy_test.go::TestPayloadHasNoForbiddenSubstrings. The test builds a fully populated payload from deliberately distinctive canary inputs (an upstream server named MY-CANARY-SERVER, a path-like URL /Users/alice/private-token-store, and an attempt to leak an upstream tool name) and asserts that the rendered JSON contains none of the following: server names, upstream tool names, file paths, hostnames, IPs, bearer tokens, client secrets, or free-text error messages. It also asserts that the payload is smaller than 8 KB.

The canary test passes, so the privacy contract is mechanically verifiable.
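The shape of such a canary check can be illustrated with a small, self-contained sketch. It is not the test from payload_privacy_test.go: the payload struct is a trimmed-down assumption that reuses three field names from the summary, the helper names (findLeaks, checkPayload) are hypothetical, and the "1-10" bucket label is invented for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// payload is a hypothetical, trimmed-down heartbeat used only for this sketch.
type payload struct {
	SchemaVersion               int              `json:"schema_version"`
	SurfaceRequests             map[string]int64 `json:"surface_requests"`
	UpstreamToolCallCountBucket string           `json:"upstream_tool_call_count_bucket"`
}

// The PR states that the payload must stay under 8 KB.
const maxPayloadBytes = 8 * 1024

// findLeaks returns every canary string that appears in the rendered JSON.
// The match is case-sensitive, which is a simplification.
func findLeaks(rendered []byte, canaries []string) []string {
	var leaks []string
	s := string(rendered)
	for _, c := range canaries {
		if strings.Contains(s, c) {
			leaks = append(leaks, c)
		}
	}
	return leaks
}

// checkPayload marshals the payload and fails on oversize output or leaks.
func checkPayload(p payload, canaries []string) error {
	rendered, err := json.Marshal(p)
	if err != nil {
		return err
	}
	if len(rendered) >= maxPayloadBytes {
		return fmt.Errorf("payload is %d bytes, limit is %d", len(rendered), maxPayloadBytes)
	}
	if leaks := findLeaks(rendered, canaries); len(leaks) > 0 {
		return fmt.Errorf("payload leaks canaries: %v", leaks)
	}
	return nil
}

func main() {
	canaries := []string{"MY-CANARY-SERVER", "/Users/alice/private-token-store"}
	p := payload{
		SchemaVersion:               2,
		SurfaceRequests:             map[string]int64{"cli": 3},
		UpstreamToolCallCountBucket: "1-10", // assumed label
	}
	fmt.Println(checkPayload(p, canaries))
}
```

Checking the rendered JSON rather than individual struct fields means a leak through any field, including ones added later, fails the same assertion.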

Test plan

  • go test -race ./internal/telemetry/... → pass (14 test files)
  • go test -race ./internal/httpapi/... → pass (new surface + REST histogram middleware tests)
  • go test -race ./internal/cliclient/... → pass (new header-injection test)
  • go test -race -run "TestHandleUpstream|TestHandleCallTool|TestHandleRetrieveTools|TestHandleQuarantine" ./internal/server/ → pass
  • go test -race ./cmd/mcpproxy/... → pass (new startup-outcome mapping test)
  • ./scripts/run-linter.sh → 0 issues
  • go build ./cmd/mcpproxy (personal edition) → pass
  • go build -tags server ./cmd/mcpproxy (server edition) → pass
  • Manual: ./mcpproxy telemetry show-payload emits schema_version=2 and all new fields with no network call
  • Manual: DO_NOT_TRACK=1 ./mcpproxy telemetry status reports Override: DO_NOT_TRACK
  • ./scripts/test-api-e2e.sh → 61 pass, 10 fail (all 10 failures are pre-existing on main, verified by running the same script on main — same test names, same failures, unrelated to telemetry)

Documents

Follow-ups (out of scope)

  • Backend ingester at telemetry.mcpproxy.app needs a separate PR to accept the new schema_version: 2 fields.
  • https://mcpproxy.app/telemetry privacy policy page referenced in the first-run notice.
  • docs/features/telemetry.md full Tier 2 field inventory.

Related #42

claude added 8 commits April 10, 2026 18:21
… telemetry tier 2

Related #42

Initial spec artifacts for the Tier 2 telemetry expansion:
- spec.md: 10 user stories (P1-P3), 44 functional requirements, privacy
  constraints, success criteria, assumptions, out-of-scope
- plan.md: technical context, constitution gate, project structure, complexity
  tracking
- research.md: 16 technical decisions resolved (counter primitives, surface
  classification, REST template extraction, env var precedence, ID rotation,
  schema versioning, etc.)
- data-model.md: HeartbeatPayloadV2, CounterRegistry, ErrorCategory enum,
  FeatureFlagSnapshot, extended TelemetryConfig
- contracts/heartbeat-v2.schema.json: full JSON Schema for the new payload
- quickstart.md: build/run/verify instructions
- tasks.md: 91 dependency-ordered sub-tasks across 13 phases

Spec deliberately follows the autonomous-mode constraint from CLAUDE.md: no
[NEEDS CLARIFICATION] markers; all ambiguous decisions captured in the
Assumptions section. TDD is mandatory: every implementation task is preceded
by a failing test.

…tegories, env overrides

Related #42

Phase 2 of spec 042. Adds the in-memory CounterRegistry that aggregates Tier 2
telemetry counters, the fixed ErrorCategory enum, env-var-based opt-out, and
the extended HeartbeatPayload schema. Hooks ID rotation, upgrade funnel, and
counter reset into the existing telemetry Service so successful sends advance
state and failures preserve it.

## Changes
- internal/config/config.go: extend TelemetryConfig with AnonymousIDCreatedAt,
  LastReportedVersion, LastStartupOutcome, NoticeShown
- internal/telemetry/error_categories.go: 11-value ErrorCategory enum + helper
- internal/telemetry/registry.go: CounterRegistry with atomic surface counters,
  RWMutex-protected built-in/REST/error/doctor maps, Snapshot, Reset, surface
  parser, OAuth provider sort/dedupe helper
- internal/telemetry/feature_flags.go: BuildFeatureFlagSnapshot derives the
  config-flag matrix and OAuth provider type list (no client IDs/URLs ever)
- internal/telemetry/env_overrides.go: DO_NOT_TRACK, CI, MCPPROXY_TELEMETRY
  precedence chain
- internal/telemetry/telemetry.go: HeartbeatPayload extended with all Tier 2
  fields; Service captures envDisabledReason at construction; Start respects
  it; sendHeartbeat resets counters and advances upgrade funnel only on 2xx;
  Snapshot triggers annual ID rotation; legacy installs get created_at
  initialized non-destructively
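The rotation rules mentioned in the last item (annual rotation, non-destructive handling of legacy installs, clock-skew safety) might look roughly like the sketch below. The 365-day interval, the idState type, the rotate function and the choice to clamp a future timestamp to the current time are all assumptions for illustration; the PR text does not spell out the actual policy.

```go
package main

import (
	"fmt"
	"time"
)

// Assumption: "annual" is modelled as 365 days.
const rotationInterval = 365 * 24 * time.Hour

// demoNow is a fixed reference time so the example is deterministic.
var demoNow = time.Date(2026, time.April, 10, 0, 0, 0, 0, time.UTC)

// idState is a hypothetical view of the persisted ID and its creation time.
type idState struct {
	AnonymousID string
	CreatedAt   time.Time // zero value for installs that predate this field
}

// rotate returns the state to persist. newID is injected so callers
// (and tests) control how identifiers are generated.
func rotate(s idState, now time.Time, newID func() string) idState {
	switch {
	case s.CreatedAt.IsZero():
		// Legacy install: keep the existing ID and just start the clock.
		s.CreatedAt = now
	case s.CreatedAt.After(now):
		// Clock skew: the stored timestamp is in the future. Assumed policy:
		// keep the ID and clamp the timestamp instead of rotating.
		s.CreatedAt = now
	case now.Sub(s.CreatedAt) >= rotationInterval:
		// The ID is at least a year old: issue a new one.
		s.AnonymousID = newID()
		s.CreatedAt = now
	}
	return s
}

func main() {
	gen := func() string { return "new-id" }

	legacy := rotate(idState{AnonymousID: "old-id"}, demoNow, gen)
	fmt.Println(legacy.AnonymousID, legacy.CreatedAt.Equal(demoNow))

	stale := rotate(idState{AnonymousID: "old-id", CreatedAt: demoNow.AddDate(-2, 0, 0)}, demoNow, gen)
	fmt.Println(stale.AnonymousID)
}
```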

## Testing
All telemetry tests pass under -race. Personal and server editions both build
clean.

Related #42

Wires Tier 2 counters into:
- HTTP middleware (surface classifier, REST endpoint histogram with Chi templates)
- MCP server (built-in tool histogram, MCP surface, upstream bucketed counter)
- CLI client (X-MCPProxy-Client: cli/<version> via transport wrapper)
- serve startup (first-run notice, startup outcome categorization)
- telemetry CLI (new show-payload subcommand)
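As an illustration of the "transport wrapper" approach for the CLI client, the sketch below wraps an http.RoundTripper so every outgoing request carries X-MCPProxy-Client: cli/<version>. The type and function names (clientHeaderTransport, roundTripFunc, observedHeader) and the placeholder URL are hypothetical; only the header name and value format come from this PR.

```go
package main

import (
	"fmt"
	"net/http"
)

const clientHeader = "X-MCPProxy-Client"

// clientHeaderTransport is a hypothetical wrapper that stamps each request
// with the client surface before handing it to the underlying transport.
type clientHeaderTransport struct {
	base    http.RoundTripper
	surface string // for example "cli/1.2.3"
}

func (t *clientHeaderTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not modify the caller's request, so work on a clone.
	clone := req.Clone(req.Context())
	clone.Header.Set(clientHeader, t.surface)
	base := t.base
	if base == nil {
		base = http.DefaultTransport
	}
	return base.RoundTrip(clone)
}

// roundTripFunc lets a plain function act as a transport in the demo.
type roundTripFunc func(*http.Request) (*http.Response, error)

func (f roundTripFunc) RoundTrip(r *http.Request) (*http.Response, error) { return f(r) }

// observedHeader sends one request through the wrapper, using a fake base
// transport (no network), and returns the header value that transport saw.
func observedHeader(version string) (string, error) {
	var seen string
	fake := roundTripFunc(func(r *http.Request) (*http.Response, error) {
		seen = r.Header.Get(clientHeader)
		return &http.Response{
			StatusCode: http.StatusOK,
			Header:     make(http.Header),
			Body:       http.NoBody,
			Request:    r,
		}, nil
	})
	client := &http.Client{
		Transport: &clientHeaderTransport{base: fake, surface: "cli/" + version},
	}
	resp, err := client.Get("http://mcpproxy.invalid/") // placeholder URL
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	return seen, nil
}

func main() {
	fmt.Println(observedHeader("1.2.3"))
}
```

Setting the header in the transport rather than at each call site means new CLI commands pick it up without extra code.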

All counters are nil-safe so the registry can be attached after route setup.
Upstream tool names are NEVER recorded — only a bucketed total. Built-in
tools come from a fixed allow-list. REST templates come from chi.RouteContext
to prevent path-parameter leakage.
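Two of the properties above, nil-safe counters and a bucketed upstream total, can be shown in a few lines. This is a sketch under stated assumptions: the Registry type, its method names and especially the bucket boundaries are invented for the example and are not taken from the implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Registry is a hypothetical, minimal registry with a single counter.
type Registry struct {
	upstreamCalls atomic.Int64
}

// RecordUpstreamCall is nil-safe: calling it on a nil *Registry is a no-op,
// so handlers can be wired up before the registry is attached.
// Note that it takes no tool name at all; only the total is tracked.
func (r *Registry) RecordUpstreamCall() {
	if r == nil {
		return
	}
	r.upstreamCalls.Add(1)
}

// UpstreamBucket reports the coarse bucket for the current total.
func (r *Registry) UpstreamBucket() string {
	if r == nil {
		return bucketFor(0)
	}
	return bucketFor(r.upstreamCalls.Load())
}

// bucketFor maps an exact count to a coarse label.
// The boundaries below are assumptions chosen for illustration.
func bucketFor(n int64) string {
	switch {
	case n <= 0:
		return "0"
	case n <= 10:
		return "1-10"
	case n <= 100:
		return "11-100"
	case n <= 1000:
		return "101-1000"
	default:
		return "1000+"
	}
}

func main() {
	var unattached *Registry
	unattached.RecordUpstreamCall() // safe even though the registry is nil
	fmt.Println(unattached.UpstreamBucket())

	r := &Registry{}
	for i := 0; i < 11; i++ {
		r.RecordUpstreamCall()
	}
	fmt.Println(r.UpstreamBucket())
}
```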

All tests pass under -race. Both editions build clean.

…ntend, swift

Related #42

Final integration pass: error category counters at OAuth refresh, tool
quarantine, and upstream tool call error sites; doctor result aggregation;
X-MCPProxy-Client header on web UI fetch and macOS tray URLRequest;
canonical privacy substring regression test.

Related #42

Removes EnableTray from FeatureFlagSnapshot because the underlying config
field is documented as deprecated/unused. The snapshot should only report
flags that have runtime effect.

## Changes
- internal/telemetry/feature_flags.go: drop EnableTray field and assignment
- internal/telemetry/feature_flags_test.go: drop EnableTray assertions
- internal/telemetry/payload_v2_test.go: drop EnableTray
- internal/telemetry/payload_privacy_test.go: drop EnableTray

Related #42

Previously `mcpproxy telemetry status` only checked MCPPROXY_TELEMETRY=false
for the env override flag. Now it uses the same precedence chain as the
service itself (DO_NOT_TRACK > CI > MCPPROXY_TELEMETRY) and reports the
specific override reason in both the table and JSON outputs.
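The precedence chain (DO_NOT_TRACK > CI > MCPPROXY_TELEMETRY) could be expressed as a single function that returns the winning override reason. The sketch below is illustrative only: disabledReason and truthy are hypothetical names, and the rule for what counts as a "set" value of DO_NOT_TRACK or CI is an assumption. The MCPPROXY_TELEMETRY=false check mirrors the description above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// truthy is an assumed parsing rule: "1" or any casing of "true".
func truthy(v string) bool {
	return v == "1" || strings.EqualFold(v, "true")
}

// disabledReason returns the name of the highest-precedence variable that
// disables telemetry, or "" when none applies. The lookup function is
// injected so the chain can be exercised without touching the real
// process environment.
func disabledReason(getenv func(string) string) string {
	if truthy(getenv("DO_NOT_TRACK")) {
		return "DO_NOT_TRACK"
	}
	if truthy(getenv("CI")) {
		return "CI"
	}
	if strings.EqualFold(getenv("MCPPROXY_TELEMETRY"), "false") {
		return "MCPPROXY_TELEMETRY"
	}
	return ""
}

func main() {
	// With the real environment; prints an empty line when nothing is set.
	fmt.Println(disabledReason(os.Getenv))
}
```

Returning the reason rather than a boolean lets both the service and a status command report the same specific override.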
Related #42

Output of running make swagger, which picks up the new TelemetryConfig fields
from spec 042: anonymous_id_created_at, last_reported_version,
last_startup_outcome, notice_shown. No runtime change.
@cloudflare-workers-and-pages

Deploying mcpproxy-docs with Cloudflare Pages

Latest commit: 0f74e52
Status: ✅  Deploy successful!
Preview URL: https://851f24ff.mcpproxy-docs.pages.dev
Branch Preview URL: https://042-telemetry-tier2.mcpproxy-docs.pages.dev
