feat: opt-in ETag read cache for github storage by openrijal · Pull Request #6 · awecode/autoadmin

openrijal · 2026-05-20T22:59:32Z

Stacked on #3 — please merge #3 first. Until #3 lands, the diff here shows both #3's commits and the new cache commit (`6c3103f`). After #3 merges, only the cache commit will remain in the diff.

Summary

Adds an in-process LRU read cache (cap 64) keyed by `owner/repo@ref:path`, storing parsed JSON alongside the GitHub response `ETag`. When enabled, subsequent reads send `If-None-Match` and short-circuit on 304. For authenticated requests, a 304 does not count against GitHub's primary rate limit, so this is a meaningful latency and quota win on repeated reads of the same resource, such as admin list re-renders or navigation between detail and list views.
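The read path described above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `cachedRead`, `remember`, `Fetcher`, and `RawResponse` are hypothetical names, and the injected `fetcher` stands in for the real GitHub call so the sketch stays self-contained.

```ts
// Hypothetical shapes for illustration; none of these names come from the PR.
interface CacheEntry { etag: string; data: unknown }
interface RawResponse { status: number; etag: string | null; json(): Promise<unknown> }
type Fetcher = (url: string, headers: Record<string, string>) => Promise<RawResponse>

const CAP = 64 // the cap stated in the summary
// A Map iterates in insertion order, so the first key is the least recently used.
const cache = new Map<string, CacheEntry>()

function cacheKey(owner: string, repo: string, ref: string, path: string): string {
  return `${owner}/${repo}@${ref}:${path}`
}

function remember(key: string, entry: CacheEntry): void {
  cache.delete(key) // re-insert so the key becomes the most recently used
  cache.set(key, entry)
  if (cache.size > CAP) {
    const oldest = cache.keys().next().value
    if (oldest !== undefined) cache.delete(oldest)
  }
}

async function cachedRead(
  key: string,
  url: string,
  fetcher: Fetcher,
  cacheReads: boolean,
): Promise<unknown> {
  // With the flag off, the cache is neither consulted nor populated.
  const hit = cacheReads ? cache.get(key) : undefined
  const headers: Record<string, string> = hit ? { 'If-None-Match': hit.etag } : {}
  const res = await fetcher(url, headers)
  if (res.status === 304 && hit) {
    remember(key, hit) // refresh recency, reuse the parsed JSON
    return hit.data
  }
  if (res.status !== 200) {
    // Errors and 404s are never cached.
    throw new Error(`GitHub read failed with status ${res.status}`)
  }
  const data = await res.json()
  if (cacheReads && res.etag) remember(key, { etag: res.etag, data })
  return data
}
```

Relying on `Map` insertion order keeps the LRU bookkeeping to a delete-and-reinsert, which is adequate at a cap of 64.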

Disabled by default at every layer.

Enabling

Globally (recommended for trusted deployments where you own all writes):

```ts
// nuxt.config.ts
export default defineNuxtConfig({
  runtimeConfig: {
    autoadmin: {
      github: { cacheReads: true },
    },
  },
})
```

Per-resource (overrides the global default):

```ts
register({
  kind: 'array',
  key: 'blogs',
  storage: {
    kind: 'github',
    owner: 'me',
    repo: 'cms',
    path: 'data/blogs.json',
    cacheReads: true,
  },
  // ...
})
```
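The precedence between the two layers can be summarised in a few lines. `resolveCacheReads` is a hypothetical helper written for illustration, assuming an unset value at either layer is `undefined`; the PR may resolve the flag differently.

```ts
// Hypothetical helper: per-resource wins, then the global default, then off.
function resolveCacheReads(
  perResource: boolean | undefined,
  globalDefault: boolean | undefined,
): boolean {
  return perResource ?? globalDefault ?? false
}
```

Using `??` rather than `||` matters here: an explicit per-resource `false` should still override a global `true`.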

Why opt-in, not always-on

- Module-scoped state is undesirable in multi-tenant shared isolates.
- A stale cache could hide a manual repo edit (e.g. someone edits the JSON in GitHub's web UI between admin reads).
- Some deployments deliberately want every read to hit GitHub for audit reasons.

Writes always invalidate the cached entry, whether they succeed or fail with a 409 conflict, and they do so regardless of the flag, so the cache invariants stay correct. The flag only controls whether reads populate and consult the cache.
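A rough sketch of that invalidation rule, under stated assumptions: `writeAndInvalidate` and `readCache` are hypothetical names, and `put` stands in for the actual Contents API write. How the PR treats other failure statuses is not stated, so the sketch leaves the entry alone in those cases.

```ts
// Hypothetical module-level cache, keyed by `owner/repo@ref:path`.
const readCache = new Map<string, { etag: string; data: unknown }>()

// `put` is a stand-in for the real write call; only its status is used here.
async function writeAndInvalidate(
  key: string,
  put: () => Promise<{ status: number }>,
): Promise<number> {
  const res = await put()
  const succeeded = res.status >= 200 && res.status < 300
  // Drop the entry on success and on a 409 conflict, whether or not
  // cacheReads is enabled, so the next read cannot serve stale JSON.
  if (succeeded || res.status === 409) readCache.delete(key)
  return res.status
}
```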

What it does NOT do

- No TTL. Entries live until they are evicted by LRU or invalidated by a write. ETag-conditional requests are cheap (a 304 is free), so a TTL would only discard entries that could still be revalidated and force a full re-fetch. If you want time-based eviction, call `clearGithubReadCache()` from your own scheduler.
- No cross-isolate sharing. Each worker/isolate has its own cache. On Cloudflare Pages this means warm isolates benefit and cold starts do not.
- Does not cache writes, error responses, or 404s.
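For deployments that do want time-based eviction, a scheduler along these lines might work. It is a sketch that assumes a long-lived Node-style process where `setInterval` is available at runtime; `scheduleCacheClear` is a hypothetical helper, and the `clear` argument would be the exported `clearGithubReadCache`.

```ts
// Hypothetical helper: calls `clear` on a fixed interval and returns a stop function.
function scheduleCacheClear(clear: () => void, intervalMs: number): () => void {
  const timer = setInterval(clear, intervalMs)
  return () => clearInterval(timer)
}
```

In practice this would be called once at startup, for example `scheduleCacheClear(clearGithubReadCache, 10 * 60 * 1000)`, keeping the returned function around for shutdown.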

API additions (all optional)

- `GithubReadOptions.cacheReads?: boolean`
- `GithubJsonRepositoryOptions.cacheReads?: boolean`
- `JsonStorageConfig` (github variant): `cacheReads?: boolean`
- `runtimeConfig.autoadmin.github.cacheReads?: boolean`
- `clearGithubReadCache(): void` exported from `server/utils/githubContents.ts` for tests

Test plan

- Standalone typecheck clean (only Nitro auto-imports flagged).
- Manual: enable, hit the same endpoint twice, confirm the second response from GitHub is a 304.
- Manual: write a row, then read; confirm the cached entry was invalidated and the new content is returned.
- CI / lint on this repo.

The GitHub Contents API only returns the `content` field for files under 1 MB. For files between 1 MB and 100 MB it responds with `type: 'file'` and `encoding: 'none'` but an empty `content`, causing the existing guard to throw `GitHub response is not a single file with content.`

When `content` is missing, fetch the blob by sha via the Git Blobs API (`GET /repos/{owner}/{repo}/git/blobs/{sha}`), which returns base64 content up to 100 MB, then decode as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
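The fallback described in this commit message can be sketched as below. The names `readBase64Content`, `GetJson`, `ContentsBody`, and `BlobBody` are hypothetical, the injected `getJson` stands in for an authenticated GitHub request, and the `ref` query parameter is omitted for brevity.

```ts
// Hypothetical response shapes, reduced to the fields this sketch needs.
interface ContentsBody { type: string; sha: string; size: number; content?: string; encoding?: string }
interface BlobBody { content: string; encoding: string }
type GetJson = (apiPath: string) => Promise<unknown>

async function readBase64Content(
  owner: string,
  repo: string,
  path: string,
  getJson: GetJson,
): Promise<string> {
  const body = (await getJson(`/repos/${owner}/${repo}/contents/${path}`)) as ContentsBody
  if (body.type !== 'file') {
    throw new Error(`${owner}/${repo}:${path} is not a single file`)
  }
  // Small files carry their base64 content inline.
  if (body.content) return body.content
  // Larger files come back with an empty `content`, so fetch the blob by sha.
  const blob = (await getJson(`/repos/${owner}/${repo}/git/blobs/${body.sha}`)) as BlobBody
  return blob.content
}
```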


Builds on the Blobs API fallback with three operational improvements:

1. `maxBytes` / `warnAtBytes` size guardrail. New per-resource `storage.maxBytes` throws a 413 with the actual byte count when a read or write exceeds it; `warnAtBytes` logs `console.warn` once per path. Enforced against the Contents API's `body.size` on reads and `Buffer.byteLength(payload.content, 'base64')` on writes.
2. Locator-prefixed error messages. Every `createError` now embeds `owner/repo:path[@ref]` and, where relevant, the file size or short blob sha. This matters in serverless logs where the original request context is otherwise lost.
3. Explicit narrowing for `base64Content`. Replaces the post-fallback non-null assertion with a typed check, and surfaces empty-file decode as a clear 422 instead of the previous misleading "not valid JSON".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
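A minimal sketch of the size guardrail, assuming the byte count has already been measured by the caller. `checkSize`, `HttpError`, and `warnedPaths` are hypothetical names (the commit itself uses `createError`), and whether the warning threshold is inclusive is an assumption.

```ts
// Hypothetical error type standing in for the framework's createError.
class HttpError extends Error {
  statusCode: number
  constructor(statusCode: number, message: string) {
    super(message)
    this.statusCode = statusCode
  }
}

// Locators that have already produced a warning, so each path warns once.
const warnedPaths = new Set<string>()

function checkSize(
  locator: string, // e.g. "owner/repo:path@ref"
  bytes: number,
  limits: { maxBytes?: number; warnAtBytes?: number },
  warn: (message: string) => void = console.warn,
): void {
  if (limits.maxBytes !== undefined && bytes > limits.maxBytes) {
    throw new HttpError(413, `${locator}: ${bytes} bytes exceeds maxBytes ${limits.maxBytes}`)
  }
  // Assumption: the warning fires at or above the threshold.
  if (
    limits.warnAtBytes !== undefined &&
    bytes >= limits.warnAtBytes &&
    !warnedPaths.has(locator)
  ) {
    warnedPaths.add(locator)
    warn(`${locator}: ${bytes} bytes is at or above warnAtBytes ${limits.warnAtBytes}`)
  }
}
```

Putting the locator first in every message mirrors the second improvement above, so a log line can be traced without the original request context.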

Adds an in-process LRU read cache (cap 64) keyed by `owner/repo@ref:path`, storing parsed JSON alongside the response ETag. When enabled, subsequent reads send `If-None-Match` and short-circuit on 304. GitHub returns 304 without spending rate-limit budget, so this is a meaningful latency and quota win on repeated reads of the same resource (e.g. admin list re-renders).

**Disabled by default at every layer.** Enable globally via `runtimeConfig.autoadmin.github.cacheReads = true` or per-resource via `storage.cacheReads = true`. Per-resource takes precedence.

Opt-in rather than always-on because module-scoped state is undesirable in multi-tenant shared isolates and a stale read could hide a manual repo edit. Successful and conflicting writes always invalidate the cached entry, so the cache code is safe to leave in even when `cacheReads` is `false` (the gate ensures it is never populated in that case).

Also exports `clearGithubReadCache()` for tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

openrijal and others added 3 commits May 20, 2026 09:37

openrijal mentioned this pull request May 20, 2026

fix: fall back to git blobs API for github files >1MB #3

Open

3 tasks

openrijal wants to merge 3 commits into awecode:main from openrijal:feat/github-etag-cache