fix(badges): harden rendering against upstream failures and bad input #123
GitHub provider (the main API-dependent badge source):
- Detect 403 rate limits (x-ratelimit-remaining: 0 / Retry-After), not just
429/503 — GitHub signals primary and secondary limits as 403, so a
rate-limited pool previously kept getting hammered with no backoff and
badges degraded to "not found"
- Vet the unauthenticated 401-retry response through the same status checks
instead of returning it raw (error bodies could flow into badge values)
- Fix /github/last-commit/{owner}/{repo}/{branch}: the URL-building hack
produced "commits?sha=X?per_page=1", breaking branch-scoped last-commit
- Fail downloads/downloads-asset (all releases) on any failed page instead
of rendering "0" — a transient blip could persist a bogus 0 as the 7-day
last-known-good value, clobbering the real count
- Only render dependabot "not found" on a definitive 404 of both spellings;
transient failures now fall back to last-known-good
- Validate milestone/tag/release fields so undefined/NaN never reach text
- Guard JSON parsing and invalid dates; share the repo-exists HEAD probe
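The 403 rate-limit detection described in the first bullet could look roughly like the sketch below. The helper names (`isRateLimited`, `retryDelayMs`) and the lower-cased header map are assumptions for illustration, not the code in this PR.

```typescript
// Hypothetical helpers; header names are assumed to be lower-cased already.
type HeaderBag = Record<string, string | undefined>

function isRateLimited(status: number, headers: HeaderBag): boolean {
  // 429 and 503 were already treated as backoff signals.
  if (status === 429 || status === 503) return true
  if (status !== 403) return false
  // A 403 only counts as a rate limit when the headers say so; a plain
  // 403 (for example a permission error) must not trigger backoff.
  return headers["x-ratelimit-remaining"] === "0" || headers["retry-after"] !== undefined
}

function retryDelayMs(headers: HeaderBag, fallbackMs: number): number {
  // Assumes Retry-After carries a number of seconds.
  const raw = headers["retry-after"]
  if (!raw) return fallbackMs
  const seconds = Number(raw)
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs
}
```

With this shape, a 403 without either header is still treated as an ordinary error, so the pool only backs off when GitHub actually signals a limit.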
All providers:
- Add an 8s upstream timeout (race-based, leaves Next fetch caching intact)
to providerFetch/providerFetchText, GitHub, VS Code, twemoji, flags, the
https proxy, and dynamic JSON badges — a hung upstream no longer hangs
the badge until the platform kills the request
- Guard JSON body parsing in the shared fetch helpers
- tokscale: return null on fetch failure instead of a fake "not found"
badge that was cached as a success for an hour
- sonar: numeric measure guards — non-numeric values rendered "NaN%"
- gitlab: fix pipeline URL missing "?" when no branch is set (the default
pipeline badge was broken); validate status/ref fields
- chocolatey/liberapay/vscode: NaN and missing-field guards
- cocoapods: drop a dead CDN fetch whose every branch returned the same badge
- https proxy + dynamic JSON badges: mark failure verdicts error:true so
they get short error cache headers and self-heal instead of being pinned
at the CDN; coerce arbitrary endpoint JSON to strings (no more
"[object Object]", falsy 0 values now render)
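A minimal sketch of the string coercion mentioned in the last bullet, assuming a hypothetical `coerceBadgeValue` helper that receives whatever the endpoint JSON contained at the selected path. The exact rules in the PR may differ.

```typescript
// Hypothetical helper: turn an arbitrary parsed-JSON value into badge text,
// or null when there is nothing sensible to render.
function coerceBadgeValue(value: unknown): string | null {
  if (value === null || value === undefined) return null
  if (typeof value === "string") return value
  // Checking the type (not truthiness) keeps 0 and false renderable.
  if (typeof value === "number") return Number.isFinite(value) ? String(value) : null
  if (typeof value === "boolean") return String(value)
  // Objects and arrays: serialize instead of printing "[object Object]".
  if (typeof value === "object") return JSON.stringify(value)
  return null
}
```

Returning null rather than a placeholder string lets the caller decide whether to fall back to an error verdict.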
Render pipeline:
- Validate all user color params (color, labelColor, logoColor, valueColor,
labelTextColor) through a shared resolver: named colors now work
everywhere, short hex is expanded, garbage is dropped instead of reaching
the SVG
- Clamp numeric layout params (height, fontSize, padX, ...) to sane ranges
- Truncate badge text at 256 chars and coerce non-strings so a provider bug
can never paint "undefined" or balloon the SVG
- Guard malformed viewBox in user-supplied logo SVGs (scale(NaN))
- Guard parseFormat against empty path segments
Adds tests for color resolution and text sanitization.
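As an illustration of what color resolution and text sanitization might involve, here is a hedged sketch. The function names, the tiny named-color table, and its hex values are placeholders chosen for the example; only the 256-character limit and the described behaviors (named colors, short hex expansion, dropping garbage, coercing non-strings) come from the description above.

```typescript
// Placeholder palette; the real named-color table is not shown in this PR.
const NAMED_COLORS: Record<string, string> = {
  red: "#ff0000",
  green: "#00ff00",
  gray: "#808080",
}

function resolveColor(input: string | undefined): string | undefined {
  if (!input) return undefined
  const raw = input.trim().toLowerCase()
  if (Object.prototype.hasOwnProperty.call(NAMED_COLORS, raw)) return NAMED_COLORS[raw]
  const hex = raw.startsWith("#") ? raw.slice(1) : raw
  // Expand short hex by doubling each digit: "f80" becomes "#ff8800".
  if (/^[0-9a-f]{3}$/.test(hex)) return "#" + hex.split("").map(c => c + c).join("")
  if (/^[0-9a-f]{6}$/.test(hex)) return "#" + hex
  // Anything else is dropped so it never reaches the SVG.
  return undefined
}

const MAX_TEXT_LENGTH = 256

function sanitizeText(value: unknown): string {
  // Missing values become empty text instead of "undefined" or "null".
  if (value === null || value === undefined) return ""
  const text = typeof value === "string" ? value : String(value)
  return text.length > MAX_TEXT_LENGTH ? text.slice(0, MAX_TEXT_LENGTH) : text
}
```

Because `resolveColor` only ever returns a `#rrggbb` string or `undefined`, callers can interpolate the result into SVG attributes without further escaping.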
https://claude.ai/code/session_019og7QYLUhEsszoq3266W7X
export function raceTimeout<T>(promise: Promise<T>, ms: number = UPSTREAM_TIMEOUT_MS): Promise<T | null> {
  // Swallow the late rejection if the timeout wins the race.
  promise.catch(() => {})
  return Promise.race([
    promise,
    new Promise<null>(resolve => setTimeout(() => resolve(null), ms)),
  ])
}
Bug: When a fetch request times out via raceTimeout, the underlying fetch continues, and the token's last_used_at timestamp is updated, which can degrade the effectiveness of the token rotation.
Severity: LOW
Suggested Fix
To improve token rotation efficiency, consider updating the last_used_at timestamp only after the fetch operation successfully completes, rather than when the token is initially picked. This would prevent timed-out requests from skewing the rotation logic.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: packages/core/src/provider-fetch.ts#L25-L31
Potential issue: The `raceTimeout` function does not cancel the underlying fetch
operation when a timeout occurs. While the request's result is discarded, the fetch
continues to run in the background. The `pickToken` function updates the `last_used_at`
timestamp for a token when it's selected, not when the associated operation completes.
If many requests time out, the `last_used_at` timestamps for their tokens are updated,
which pollutes the timestamp data and makes the round-robin token rotation less
effective. This can lead to a minor performance degradation as the token pool's rotation
becomes less efficient.
Also affects:
packages/core/src/providers/github.ts:41~53
Not applying this one. The request was sent — a timed-out fetch still consumes the token's GitHub quota even though we stop awaiting it, so updating last_used_at at pick time is the accurate accounting. Moving the update to successful completion would make timed-out/failed requests invisible to rotation, so the pool would keep re-picking the same (possibly degraded) token instead of spreading load. Round-robin on "last picked" is the behavior we want here.
Generated by Claude Code
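To make the "last picked" argument in the reply concrete, here is a small sketch of pick-time stamping. The `PoolToken` shape and this `pickToken` body are assumptions for illustration; only the names `pickToken` and `last_used_at` come from the discussion.

```typescript
// Hypothetical token pool entry.
interface PoolToken {
  value: string
  last_used_at: number
}

function pickToken(pool: PoolToken[], now: number): PoolToken | null {
  if (pool.length === 0) return null
  // Choose the least recently picked token; ties go to the earliest entry.
  const chosen = pool.reduce((best, t) => (t.last_used_at < best.last_used_at ? t : best))
  // Stamp at pick time: the request is about to be sent and will consume
  // quota whether or not the caller later stops awaiting it.
  chosen.last_used_at = now
  return chosen
}

const pool: PoolToken[] = [
  { value: "a", last_used_at: 0 },
  { value: "b", last_used_at: 0 },
  { value: "c", last_used_at: 0 },
]
// Even if every request times out, successive picks still rotate.
const picks = [1, 2, 3, 4].map(now => pickToken(pool, now)?.value)
console.log(picks.join(",")) // a,b,c,a
```

If the stamp were written only on successful completion, the three timed-out picks above would all have returned "a".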
When GitHub is rate limited or backed off and a badge key has no last-known-good value (e.g. a recently added stars badge), the route rendered a red "github | not found", which reads as "your repo/badge URL is wrong" even though the repo is perfectly valid. Badges that happened to have a 7-day stale copy (e.g. last-commit) kept rendering, making the failure look repo-specific.

Now, when the failure is provably transient (the repo-exists probe couldn't reach GitHub, or the provider is in a backoff window), the badge renders a gray "github | unavailable" verdict instead: error-cached for 60s, never persisted as last-known-good, and self-healing as soon as the upstream recovers. A genuine 404 still renders "invalid repository", and a repo that exists but has no data for the topic still renders "not found".

Group badges now also use short error cache headers when any segment carries a terminal-error verdict.

https://claude.ai/code/session_019og7QYLUhEsszoq3266W7X
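The three failure verdicts described in this commit message can be sketched as a small decision function. The type names, the `failureVerdict` function, and the order of the checks are assumptions; only the messages and the 60s error cache for the transient case come from the description.

```typescript
// Hypothetical inputs: the outcome of the repo-exists probe and whether the
// provider is currently inside a backoff window.
type RepoProbe = "exists" | "missing" | "unreachable"

interface FailureVerdict {
  message: "unavailable" | "invalid repository" | "not found"
  transient: boolean
  // Only the transient case has a cache lifetime stated in the description.
  cacheSeconds?: number
}

function failureVerdict(probe: RepoProbe, inBackoff: boolean): FailureVerdict {
  // A definitive 404 on the repo is a real user-facing error.
  if (probe === "missing") return { message: "invalid repository", transient: false }
  // Provably transient: short error cache, never stored as last-known-good.
  if (probe === "unreachable" || inBackoff) {
    return { message: "unavailable", transient: true, cacheSeconds: 60 }
  }
  // Repo exists and upstream is healthy, but there is no data for the topic.
  return { message: "not found", transient: false }
}
```

A caller would consult this only after the last-known-good lookup has come back empty, since a stale copy still takes precedence over any failure verdict.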