feat: pluggable QuotaHook for consumer-side quota enforcement #27

## Summary

Add an optional `quotaHook` configuration to `LLMProviderFactory` that allows consumers to inject their own `check` / `record` logic for per-tenant quota enforcement, without llm-providers having any knowledge of a specific auth or billing backend.

## Motivation

Stackbilt (the primary consumer of this package) is a multi-tenant platform — Hobby / Pro tiers, paying users, per-tenant budget caps. Per-tenant budget enforcement is a cross-cutting concern that affects **every** resource package in the ecosystem: LLM, images, storage, email, etc. To keep llm-providers OSS-clean and auth-backend-agnostic, the tenant/billing logic must live elsewhere.

The canonical contract is being defined in `Stackbilt-dev/edge-auth#82` (`ResourceQuotaProvider`). llm-providers just needs to expose the seam so consumers can wire in their edge-auth-backed implementation. Other (non-Stackbilt) ecosystems can wire in anything that implements the minimal interface, or skip the hook entirely and use the built-in `CreditLedger` for single-tenant deployments.

## Proposed interface (minimal, no Stackbilt coupling)

```ts
export interface QuotaHook {
  // Called before request dispatch; `allowed: false` blocks the request.
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  // Called after a successful request with the actual usage.
  record(input: QuotaRecordInput): Promise<void>;
}

export interface QuotaCheckInput {
  tenantId?: string;         // from LLMRequest.tenantId if set
  provider: string;          // 'anthropic', 'groq', etc.
  model: string;
  estimatedCost: number;     // USD, computed from model cost table + request tokens
  metadata?: Record<string, unknown>;
}

export interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;           // passed to QuotaExceededError as context when denied
  remainingBudget?: number;
}

export interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
```
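
To illustrate how small a consumer-side implementation of this interface could be, here is a minimal in-memory sketch. `InMemoryQuotaHook`, its fixed per-tenant USD budget, and the `"default"` tenant fallback are assumptions made up for this example, not part of the proposal; a real multi-tenant deployment would presumably delegate to its auth/billing backend instead.

```typescript
// Interfaces restated from the proposal so the sketch compiles on its own.
interface QuotaCheckInput {
  tenantId?: string;
  provider: string;
  model: string;
  estimatedCost: number;
  metadata?: Record<string, unknown>;
}
interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}
interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
interface QuotaHook {
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  record(input: QuotaRecordInput): Promise<void>;
}

// Hypothetical hook: one fixed USD budget per tenant, kept in memory only.
class InMemoryQuotaHook implements QuotaHook {
  private readonly budgetUsd: number;
  private readonly spent = new Map<string, number>();

  constructor(budgetUsd: number) {
    this.budgetUsd = budgetUsd;
  }

  async check(input: QuotaCheckInput): Promise<QuotaCheckResult> {
    // Assumption: requests without a tenantId share a single "default" bucket.
    const tenant = input.tenantId ?? "default";
    const remaining = this.budgetUsd - (this.spent.get(tenant) ?? 0);
    if (input.estimatedCost > remaining) {
      return {
        allowed: false,
        reason: `tenant ${tenant} would exceed its budget`,
        remainingBudget: remaining,
      };
    }
    return { allowed: true, remainingBudget: remaining };
  }

  async record(input: QuotaRecordInput): Promise<void> {
    const tenant = input.tenantId ?? "default";
    this.spent.set(tenant, (this.spent.get(tenant) ?? 0) + input.actualCost);
  }
}
```

Note that `check` compares against the estimate while `record` accumulates the actual cost, so a tenant can overshoot its budget by at most one request's worth of estimation error.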

## Factory integration

1. **Before request dispatch**, the factory calls `quotaHook.check()` (if configured). If the result is `allowed: false`, the factory throws `QuotaExceededError` (the class already exists in llm-providers) with `reason` as context. This fires *before* circuit breaker and retry logic.
2. **After a successful request**, the factory calls `quotaHook.record()` in parallel with the existing `CostTracker.trackCost()`. Hook errors on `record()` should be non-fatal: they are logged but must not break the response.
3. **Hook errors on `check()`** follow a configurable fail policy:
   - `failPolicy: 'closed'` (default): a hook error is treated as `allowed: false`. Safer; prevents uncontrolled spend if the auth backend is down.
   - `failPolicy: 'open'`: a hook error is treated as `allowed: true`. Availability-first; risks spend overrun.
4. **The existing CreditLedger continues to work.** QuotaHook is additive. Simple deployments with no multi-tenant needs use the global ledger as today. Multi-tenant deployments layer the hook on top.
5. **Observability**: fire new `onQuotaCheck` / `onQuotaDenied` events via the existing `ObservabilityHooks` surface.
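
The flow described in the steps above could be sketched roughly as follows. This is not the factory's actual code: `dispatchWithQuota`, `GuardOptions`, the simplified input shapes, and the callback signatures for `onQuotaCheck` / `onQuotaDenied` are hypothetical names invented for illustration, and the `QuotaExceededError` here is a stand-in for the class that already exists in llm-providers (its real constructor is not shown in this issue). `CostTracker` and the circuit breaker / retry layers are omitted.

```typescript
interface QuotaCheckResult { allowed: boolean; reason?: string }
interface QuotaRequestInfo { tenantId?: string; provider: string; model: string; estimatedCost: number }
interface QuotaHook {
  check(input: QuotaRequestInfo): Promise<QuotaCheckResult>;
  record(input: { tenantId?: string; provider: string; model: string; actualCost: number }): Promise<void>;
}

// Stand-in for the existing llm-providers error class (assumed shape).
class QuotaExceededError extends Error {}

// Hypothetical option bag; the real factory config may look different.
interface GuardOptions {
  hook?: QuotaHook;
  failPolicy?: "closed" | "open";
  onQuotaCheck?: (result: QuotaCheckResult) => void;
  onQuotaDenied?: (reason: string) => void;
}

async function dispatchWithQuota<T>(
  opts: GuardOptions,
  request: QuotaRequestInfo,
  dispatch: () => Promise<{ value: T; actualCost: number }>,
): Promise<T> {
  const { hook, failPolicy = "closed" } = opts;

  if (hook) {
    let result: QuotaCheckResult;
    try {
      result = await hook.check(request);
    } catch (err) {
      // Step 3: a failing check is mapped to allow/deny by the fail policy.
      result = failPolicy === "open"
        ? { allowed: true }
        : { allowed: false, reason: `quota check failed: ${String(err)}` };
    }
    opts.onQuotaCheck?.(result);
    if (!result.allowed) {
      const reason = result.reason ?? "quota exceeded";
      opts.onQuotaDenied?.(reason);
      // Step 1: deny before any dispatch, circuit breaker, or retry runs.
      throw new QuotaExceededError(reason);
    }
  }

  const { value, actualCost } = await dispatch();

  if (hook) {
    try {
      // Step 2: record actual usage; failures are logged, never rethrown.
      await hook.record({
        tenantId: request.tenantId,
        provider: request.provider,
        model: request.model,
        actualCost,
      });
    } catch (err) {
      console.warn("quotaHook.record failed (non-fatal):", err);
    }
  }
  return value;
}
```

With no `hook` configured the function reduces to a plain `dispatch()` call, which matches the opt-in, backwards-compatible intent. The sketch awaits `record()` before returning for simplicity; running it alongside cost tracking, as the proposal suggests, would avoid delaying the response.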

## Non-goals

- llm-providers does NOT implement multi-tenant budgets internally (that's the consumer's auth/billing backend)
- llm-providers does NOT force consumers to use a quota hook — purely opt-in, backwards compatible
- The existing global `CreditLedger` is not removed or changed

## Cross-reference

- **edge-auth#82**: canonical `ResourceQuotaProvider` contract that Stackbilt consumers will wrap when wiring this hook.
- The same hook pattern will be used by future `llm-providers` txt2img support, img-forge, storage wrappers, etc.

## Priority

**MEDIUM** — no consumer is blocked *today*, but it's the seam that unlocks multi-tenant production deployments at scale and is core to the fractal resource-quota architecture we're standardizing on.

🤖 Filed by AEGIS during Phase D scoping session
