Context
Surfaced while dogfooding `@stackbilt/worker-observability` in `tarotscript-worker` (Stackbilt-dev/tarotscript#163). Every dogfood target repeats the same ~30 lines of boilerplate to get root-span-per-request, plus awkward child-span creation. These patterns should live in the library.
Pain point 1 — child spans require .getContext() + { parent } wrapping
Every child-span call site becomes:
```ts
const child = tracer.startSpan('scaffold.classify', {
parent: rootSpan.getContext(),
attributes: { ... },
});
// ... work ...
child.end();
```
Three lines of mechanical plumbing per child span. In tarotscript's `/run` handler I have 4 child spans, so ~12 lines of `rootSpan.getContext()` + `parent:` passing that carries zero signal.
Proposed ergonomic: `Span.startChild(name, attrs?)`
```ts
const child = rootSpan.startChild('scaffold.classify', { ... });
```
The implementation is trivial: `Span.startChild()` is a method that calls `this.tracer.startSpan(name, { parent: this.getContext(), attributes })` on the same tracer. It's an ergonomic shape familiar from other tracing SDKs, and it makes the child-span relationship textually obvious at the call site.
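A minimal sketch of that delegation, for illustration only. The `Tracer` and `Span` classes below are hypothetical stand-ins, not the library's real types: the ID generation, constructor shape, and field names are all assumptions. The only part taken from the proposal is that `startChild()` forwards to `tracer.startSpan()` with `parent: this.getContext()`.

```ts
// Hypothetical minimal shapes; the real library's Tracer/Span may differ.
interface SpanContext {
  traceId: string;
  spanId: string;
}

type Attributes = Record<string, unknown>;

interface StartSpanOptions {
  parent?: SpanContext;
  attributes?: Attributes;
}

// Toy ID generator, for the sketch only.
let nextId = 0;
const newId = (): string => `id-${++nextId}`;

class Span {
  readonly spanId: string = newId();
  ended = false;

  constructor(
    private readonly tracer: Tracer,
    readonly name: string,
    readonly traceId: string,
    readonly parentSpanId: string | undefined,
    readonly attributes: Attributes,
  ) {}

  getContext(): SpanContext {
    return { traceId: this.traceId, spanId: this.spanId };
  }

  // The proposed helper: delegate to the owning tracer with this span as parent.
  startChild(name: string, attributes: Attributes = {}): Span {
    return this.tracer.startSpan(name, { parent: this.getContext(), attributes });
  }

  end(): void {
    this.ended = true;
  }
}

class Tracer {
  startSpan(name: string, options: StartSpanOptions = {}): Span {
    // A child inherits the parent's trace ID; a span without a parent starts a new trace.
    const traceId = options.parent?.traceId ?? newId();
    return new Span(this, name, traceId, options.parent?.spanId, options.attributes ?? {});
  }
}

// Usage: the parent/child relationship is visible at the call site.
const tracer = new Tracer();
const rootSpan = tracer.startSpan('POST /run');
const child = rootSpan.startChild('scaffold.classify', { step: 'classify' });
child.end();
rootSpan.end();
```

Because the span holds a reference to the tracer that created it, call sites no longer need the tracer in scope to create children.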
Pain point 2 — no Hono-aware middleware helper
The library exports `tracingMiddleware(tracer)` but:
- It doesn't know about Hono's Variables context → can't stash the root span for downstream handlers to create child spans against
- It doesn't use `c.executionCtx.waitUntil(tracer.flush())` → blocks response on HTTP ingest
- It doesn't record errors via `span.recordError()` on thrown exceptions
- It doesn't set `span.setStatus('error')` on 5xx responses
Every dogfood worker ends up writing this ~30-line middleware themselves:
```ts
app.use('*', async (c, next) => {
const obs = getMonitoring(c.env);
if (!obs?.tracer) return next();
const url = new URL(c.req.url);
const span = obs.tracer.startTrace(`${c.req.method} ${url.pathname}`, {
'http.method': c.req.method,
'http.target': url.pathname,
'http.host': url.host,
});
c.set('rootSpan', span);
try {
await next();
span.setAttributes({ 'http.status_code': c.res.status });
if (c.res.status >= 500) span.setStatus('error');
} catch (err) {
span.recordError(err as Error);
span.setStatus('error');
throw err;
} finally {
span.end();
c.executionCtx.waitUntil(Promise.allSettled([
obs.tracer.flush(),
obs.metrics.flush(),
]));
}
});
```
Proposed ergonomic: `honoTracing(monitoring, options?)`
```ts
import { honoTracing } from '@stackbilt/worker-observability/hono';
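// `c` only exists per request, so pass a resolver instead of calling getMonitoring(c.env) at registration time
app.use('*', honoTracing((c) => getMonitoring(c.env)));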
// downstream handlers:
const rootSpan = c.get('rootSpan');
const child = rootSpan?.startChild('scaffold.classify');
```
Options could include:
- `skip?: (c) => boolean` — e.g., skip health checks or static asset routes
- `attributes?: (c) => Record<string, any>` — add tenant_id, user_id, route_pattern
- `spanNamer?: (c) => string` — custom naming (default: `${method} ${pathname}`)
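One way the three options could be wired into the hand-written middleware shown earlier, as a rough sketch rather than a final design. To keep it self-contained, `ContextLike`, `SpanLike`, and `MonitoringLike` are hypothetical structural stand-ins for Hono's `Context` and the library's objects, declaring only the members the middleware touches. The sketch also assumes the first argument is a per-request resolver, because `c.env` is not available when the middleware is registered.

```ts
// Hypothetical structural stand-ins; only the members used below are declared.
interface SpanLike {
  setAttributes(attrs: Record<string, unknown>): void;
  setStatus(status: 'ok' | 'error'): void;
  recordError(err: Error): void;
  end(): void;
}

interface MonitoringLike {
  tracer: {
    startTrace(name: string, attrs: Record<string, unknown>): SpanLike;
    flush(): Promise<unknown>;
  };
  metrics: { flush(): Promise<unknown> };
}

interface ContextLike {
  env: unknown;
  req: { method: string; url: string };
  res: { status: number };
  executionCtx: { waitUntil(p: Promise<unknown>): void };
  set(key: 'rootSpan', value: SpanLike): void;
}

interface HonoTracingOptions {
  skip?: (c: ContextLike) => boolean;
  attributes?: (c: ContextLike) => Record<string, unknown>;
  spanNamer?: (c: ContextLike) => string;
}

function honoTracing(
  resolve: (c: ContextLike) => MonitoringLike | undefined,
  options: HonoTracingOptions = {},
) {
  return async (c: ContextLike, next: () => Promise<void>): Promise<void> => {
    const obs = resolve(c);
    // No monitoring configured, or the route is skipped: pass through untraced.
    if (!obs || options.skip?.(c)) return next();

    const url = new URL(c.req.url);
    const name = options.spanNamer?.(c) ?? `${c.req.method} ${url.pathname}`;
    const span = obs.tracer.startTrace(name, {
      'http.method': c.req.method,
      'http.target': url.pathname,
      'http.host': url.host,
      ...options.attributes?.(c), // caller-supplied attributes, e.g. tenant_id
    });
    c.set('rootSpan', span);

    try {
      await next();
      span.setAttributes({ 'http.status_code': c.res.status });
      if (c.res.status >= 500) span.setStatus('error');
    } catch (err) {
      span.recordError(err instanceof Error ? err : new Error(String(err)));
      span.setStatus('error');
      throw err;
    } finally {
      span.end();
      // Flush after the response is sent instead of blocking on HTTP ingest.
      c.executionCtx.waitUntil(
        Promise.allSettled([obs.tracer.flush(), obs.metrics.flush()]),
      );
    }
  };
}
```

In this sketch, caller-supplied attributes are spread last, so they can override the default `http.*` keys. Whether that should be allowed is a design question for the real helper.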
Impact on users
Every Stackbilt worker instrumented with this library (stackbilt-web, edge-auth, tarotscript, img-forge, aegis, and future pro-tier customer workers) copy-pastes the same middleware. Centralizing it means:
- Bug fixes propagate to everyone automatically (e.g., if we discover a `waitUntil` edge case)
- New workers go from zero to traces in 2 lines instead of 30
- Consistency across dashboards — same attribute names, same error-recording semantics
Related
- Sibling issue for AsyncLocalStorage active span context — will file separately
- Sibling issue for npm publishing + quickstart docs — will file separately
These three issues together represent the friction a dogfood user hits on first instrumentation. Fixing them turns the library from "read the source to figure it out" into "3-line integration."