perf(indexer): batched JSON-RPC + composite address indexes + edge cache#45

Merged: github-actions[bot] merged 1 commit into `main` from `chore/indexer-tier1-perf` on May 10, 2026

Conversation

@satyakwok
Member

Summary

Tier 1 of an indexer audit pass: three independent perf improvements that share one PR because they touch different layers (chain client, DB schema, API edge) and introduce no user-visible behavior change.

1. viem batch transport

`packages/chain/src/index.ts`: `http(url, { batch: { batchSize: 100, wait: 0 } })`. Indexer backfill fires 50–5000 concurrent JSON-RPC calls per block batch (getBlock + getLogs + per-tx fetch). With batching, viem coalesces calls issued in the same tick into HTTP requests of up to 100 calls each, so N round-trips collapse into roughly ⌈N/100⌉ per block batch. `wait: 0` adds no artificial delay, so single-call latency stays essentially unchanged.
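To make the round-trip arithmetic concrete, here is a minimal sketch of how a queue of pending calls maps onto JSON-RPC 2.0 batch payloads when each HTTP request carries at most `batchSize` calls. It does not use viem; `toBatches`, `RpcCall`, and the placeholder hashes are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper, not part of viem: groups pending calls into
// JSON-RPC 2.0 batch payloads of at most `batchSize` requests each.
interface RpcCall {
  method: string;
  params: unknown[];
}

interface JsonRpcRequest extends RpcCall {
  jsonrpc: "2.0";
  id: number;
}

function toBatches(calls: RpcCall[], batchSize: number): JsonRpcRequest[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches: JsonRpcRequest[][] = [];
  for (let start = 0; start < calls.length; start += batchSize) {
    // Each chunk becomes the body of one HTTP POST (a JSON array of requests).
    const chunk = calls.slice(start, start + batchSize).map((call, offset) => ({
      jsonrpc: "2.0" as const,
      id: start + offset, // unique id so responses can be matched to requests
      method: call.method,
      params: call.params,
    }));
    batches.push(chunk);
  }
  return batches;
}

// 250 receipt lookups queued together (the hashes are placeholders).
const pending: RpcCall[] = Array.from({ length: 250 }, (_, i) => ({
  method: "eth_getTransactionReceipt",
  params: ["0xplaceholder" + i],
}));

const batches = toBatches(pending, 100);
console.log(batches.map((b) => b.length));
```

With `batchSize: 100`, the 250 queued calls go out as three HTTP requests instead of 250, which is the effect the transport change relies on during backfill.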

2. Composite address-history indexes

`packages/db/src/schema.ts` + drizzle migration `0004_workable_zeigeist.sql`: four indexes on `(from_addr, block_height)` and `(to_addr, block_height)` for both `transactions` and `token_transfers`. The single-column `from_idx` / `to_idx` indexes filter by address, but the planner has to sort by `block_height` separately for `/address/:addr/txs?before=cursor`. The composite index serves filter + sort in one index scan.
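The query shape that benefits from a composite `(from_addr, block_height)` index can be sketched as a keyset-paginated lookup. This is an illustration only: `buildAddressTxQuery` is a hypothetical helper, and treating the `before` cursor as a block height is an assumption, not something the PR states.

```typescript
// Hypothetical query builder; the real API handler may differ.
interface AddressTxQuery {
  text: string;
  values: (string | number)[];
}

function buildAddressTxQuery(
  fromAddr: string,
  limit: number,
  beforeHeight?: number, // assumed meaning of the `before` cursor
): AddressTxQuery {
  const values: (string | number)[] = [fromAddr];
  let where = "from_addr = $1";
  if (beforeHeight !== undefined) {
    values.push(beforeHeight);
    // Equality on the leading column plus a range on the second column:
    // both predicates can be answered by the same composite index.
    where += ` AND block_height < $${values.length}`;
  }
  values.push(limit);
  // ORDER BY on the second index column lets the planner walk the index
  // backwards instead of sorting the matching rows.
  const text =
    `SELECT * FROM transactions WHERE ${where} ` +
    `ORDER BY block_height DESC LIMIT $${values.length}`;
  return { text, values };
}

const page = buildAddressTxQuery("0xabc", 25, 1000);
console.log(page.text);
console.log(page.values);
```

Because the equality column comes first in the index, rows for one address are already ordered by `block_height`, so `LIMIT 25` can stop after reading 25 index entries.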

Migration uses `IF NOT EXISTS` so the operator can pre-create indexes with `CREATE INDEX CONCURRENTLY` on a write-active production indexer to avoid blocking writes:

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS txs_from_block_idx ON transactions(from_addr, block_height);
CREATE INDEX CONCURRENTLY IF NOT EXISTS txs_to_block_idx ON transactions(to_addr, block_height);
CREATE INDEX CONCURRENTLY IF NOT EXISTS transfers_from_block_idx ON token_transfers(from_addr, block_height);
CREATE INDEX CONCURRENTLY IF NOT EXISTS transfers_to_block_idx ON token_transfers(to_addr, block_height);
```

The Drizzle migration then becomes a no-op when run.

3. HTTP Cache-Control hook

`apps/api/src/cache-control.ts` — per-route TTL policy:

| Pattern | TTL | Rationale |
| --- | --- | --- |
| `/chain/info`, `/blocks` | 2 s | Live-tip; one block-time window |
| `/blocks/:height`, `/tx/:hash`, `/contracts/pioneers` | 5 min + immutable | Finalized object, never changes |
| `/address/:addr/txs`, `/address/:addr/transfers`, `/contracts/recent`, `/whale/tx` | 10 s | Paginated lists, dedupe burst |
| `/stats/daily`, `/epochs` | 60 s | Aggregate stats |
| `/contracts/stats`, `/accounts/active`, `/validators`, `/tokens`, `/tokens/:addr/holders` | 30 s | Mid-rate aggregates |
| `/health` | no-store | Probe should always live-fetch |

Hook respects any explicit `Cache-Control` set inside a route handler. Sets `Vary: Accept, Accept-Encoding` so future content-type variants don't collide in shared caches.
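A rough sketch of how such a policy could be expressed as a lookup from request path to header value follows. The rule list mirrors the table above, but `cacheControlFor`, the regular expressions, and the exact directive strings (for example the `public` prefix) are assumptions for illustration; the actual hook in `apps/api/src/cache-control.ts` may be structured differently.

```typescript
// Hypothetical policy table: first matching rule wins.
type Rule = { pattern: RegExp; value: string };

const RULES: Rule[] = [
  { pattern: /^\/health$/, value: "no-store" },
  { pattern: /^\/(chain\/info|blocks)$/, value: "public, max-age=2" },
  {
    pattern: /^\/(blocks\/\d+|tx\/0x[0-9a-fA-F]+|contracts\/pioneers)$/,
    value: "public, max-age=300, immutable",
  },
  { pattern: /^\/address\/[^/]+\/(txs|transfers)$/, value: "public, max-age=10" },
  { pattern: /^\/(contracts\/recent|whale\/tx)$/, value: "public, max-age=10" },
  { pattern: /^\/(stats\/daily|epochs)$/, value: "public, max-age=60" },
  {
    pattern: /^\/(contracts\/stats|accounts\/active|validators|tokens|tokens\/[^/]+\/holders)$/,
    value: "public, max-age=30",
  },
];

// Returns the header value to send, or undefined when no rule applies.
// An explicit header already set by the route handler is left untouched.
function cacheControlFor(path: string, existing?: string): string | undefined {
  if (existing) return existing;
  const queryStart = path.indexOf("?");
  const pathname = queryStart === -1 ? path : path.slice(0, queryStart);
  for (const rule of RULES) {
    if (rule.pattern.test(pathname)) return rule.value;
  }
  return undefined;
}

console.log(cacheControlFor("/blocks"));
console.log(cacheControlFor("/blocks/123"));
console.log(cacheControlFor("/address/0xabc/txs?before=100"));
console.log(cacheControlFor("/health"));
```

Matching on the path with the query string stripped keeps cursor parameters such as `?before=` from affecting which TTL is chosen, while the cache key itself still includes the full URL.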

Out of scope (deferred to follow-ups)

Per the audit doc: batched per-tx fetch in `sync.ts` (Tier 2), materialized view for `/stats/daily`, RPC failover, reorg buffer, declarative event handlers, GraphQL, partitioning.

Test plan

  • `pnpm turbo build` passes
  • `pnpm db:migrate` against staging DB applies 0004 cleanly
  • `EXPLAIN ANALYZE SELECT * FROM transactions WHERE from_addr = '0x…' ORDER BY block_height DESC LIMIT 25` shows Index Scan on txs_from_block_idx (not Bitmap Heap Scan)
  • Live indexer endpoints return Cache-Control headers per the policy table
  • viem batched mode reduces `/chain/info?` round-trip count visible in indexer worker pino logs during backfill burst

Tier 1 of an indexer audit pass: three independent perf improvements
that share one PR because they touch different layers (chain client,
DB schema, API edge) with no user-visible behavior changes.

1. **viem batch transport** (packages/chain/src/index.ts)
   `http(url, { batch: { batchSize: 100, wait: 0 } })`. Indexer backfill
   fires 50–5000 concurrent JSON-RPC calls per block batch (getBlock +
   getLogs + per-tx fetch). With batching, viem coalesces calls issued
   in the same tick into HTTP requests of up to 100 calls each, so N
   round-trips collapse into roughly ceil(N/100) per block batch.
   wait:0 keeps single-call latency unchanged.

2. **Composite address-history indexes** (packages/db/src/schema.ts +
   drizzle migration 0004)
   Four indexes on (from_addr, block_height) and (to_addr, block_height)
   for both `transactions` and `token_transfers`. The single-column
   from_idx / to_idx indexes filter by address, but the planner has to
   sort block_height separately for `/address/:addr/txs?before=cursor`.
   The composite serves filter + sort in one index scan. Migration uses
   `IF NOT EXISTS` so the operator can pre-create with `CONCURRENTLY`
   on a write-active production indexer to avoid blocking writes.

3. **HTTP Cache-Control hook** (apps/api/src/cache-control.ts +
   index.ts)
   Per-route TTL policy: 2 s for live-tip endpoints (chain/info, /blocks
   list), 10 s for paginated history (address/txs, contracts/recent,
   whale/tx), 30–60 s for aggregate stats, 5 min + immutable for
   finalized objects (specific block / tx, contracts/pioneers). The same
   data fetched by 100 clients is now served from the edge cache, which
   frees the indexer DB pool for unique queries. The hook respects any
   explicit Cache-Control set inside a route handler.

No schema-incompatible changes — safe to deploy without coordinating
with the running worker. New migration adds indexes only; query plans
on existing endpoints will pick up the composites automatically.

Subsequent tiers (per the audit doc): batched per-tx fetch in sync.ts,
materialized view for /stats/daily, RPC failover, declarative event
handlers, partitioning, GraphQL.
github-actions[bot] enabled auto-merge (squash) May 10, 2026 20:02
github-actions[bot] merged commit d71d63f into main May 10, 2026
5 checks passed