feat: pluggable QuotaHook for consumer-side quota enforcement #27

@stackbilt-admin

Summary

Add an optional `quotaHook` configuration to `LLMProviderFactory` that allows consumers to inject their own `check` / `record` logic for per-tenant quota enforcement — without llm-providers having any knowledge of a specific auth or billing backend.

Motivation

Stackbilt (the primary consumer of this package) is a multi-tenant platform — Hobby / Pro tiers, paying users, per-tenant budget caps. Per-tenant budget enforcement is a cross-cutting concern that affects every resource package in the ecosystem: LLM, images, storage, email, etc. To keep llm-providers OSS-clean and auth-backend-agnostic, the tenant/billing logic must live elsewhere.

The canonical contract is being defined in `Stackbilt-dev/edge-auth#82` (`ResourceQuotaProvider`). llm-providers just needs to expose the seam so consumers can wire their edge-auth-backed implementation in. Other ecosystems (non-Stackbilt) can wire in anything that implements the minimal interface — or skip the hook entirely and use the built-in `CreditLedger` for single-tenant deployments.

Proposed interface (minimal, no Stackbilt coupling)

```ts
export interface QuotaHook {
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  record(input: QuotaRecordInput): Promise<void>;
}

export interface QuotaCheckInput {
  tenantId?: string; // from LLMRequest.tenantId if set
  provider: string; // 'anthropic', 'groq', etc.
  model: string;
  estimatedCost: number; // USD, computed from model cost table + request tokens
  metadata?: Record<string, unknown>;
}

export interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}

export interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
```
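
To make the contract concrete, here is a minimal sketch of a consumer-side implementation. `InMemoryQuotaHook`, its `budgets` map, and the `"default"` fallback tenant are hypothetical names invented for illustration and are not part of llm-providers or edge-auth. The interfaces are restated (in compact form) only so the block is self-contained, and `remainingBudget` is assumed to be in USD like `estimatedCost`.

```typescript
// Interfaces restated from the proposal so the sketch stands alone.
interface QuotaCheckInput {
  tenantId?: string;
  provider: string;
  model: string;
  estimatedCost: number;
  metadata?: Record<string, unknown>;
}
interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}
interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
interface QuotaHook {
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  record(input: QuotaRecordInput): Promise<void>;
}

// Hypothetical hook: one USD budget per tenant, held in memory, no persistence.
class InMemoryQuotaHook implements QuotaHook {
  private readonly budgets: Record<string, number>;
  private readonly spent = new Map<string, number>();

  constructor(budgets: Record<string, number>) {
    this.budgets = budgets;
  }

  async check(input: QuotaCheckInput): Promise<QuotaCheckResult> {
    // Assumption: requests without a tenantId share a "default" bucket.
    const tenant = input.tenantId ?? "default";
    const budget = this.budgets[tenant];
    if (budget === undefined) {
      return { allowed: false, reason: `no budget configured for tenant "${tenant}"` };
    }
    const remaining = budget - (this.spent.get(tenant) ?? 0);
    if (input.estimatedCost > remaining) {
      return {
        allowed: false,
        reason: "estimated cost exceeds remaining budget",
        remainingBudget: remaining,
      };
    }
    return { allowed: true, remainingBudget: remaining };
  }

  async record(input: QuotaRecordInput): Promise<void> {
    // Accumulate the actual cost against the tenant's running total.
    const tenant = input.tenantId ?? "default";
    this.spent.set(tenant, (this.spent.get(tenant) ?? 0) + input.actualCost);
  }
}
```

A production implementation would presumably delegate both methods to the auth/billing backend rather than keep state in process; the point here is only that nothing in the interface requires knowledge of that backend.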

Factory integration

  1. Before dispatching a request, the factory calls `quotaHook.check()` (if configured). If the result is `allowed: false`, the factory throws `QuotaExceededError` (the class already exists in llm-providers) with `reason` as context. The check runs before the circuit breaker and retry logic.
  2. After a successful request, the factory calls `quotaHook.record()` in parallel with the existing `CostTracker.trackCost()`. Errors thrown by `record()` are non-fatal: they are logged but do not break the response.
  3. Hook errors on `check()` — configurable fail policy:
    • `failPolicy: 'closed'` (default) — hook error treated as `allowed: false`. Safer, prevents uncontrolled spend if auth backend is down.
    • `failPolicy: 'open'` — hook error treated as `allowed: true`. Availability-first; risks spend overrun.
  4. Existing CreditLedger continues to work — QuotaHook is additive. Simple deployments with no multi-tenant needs use the global ledger as today. Multi-tenant deployments layer the hook on top.
  5. Observability — fire new `onQuotaCheck` / `onQuotaDenied` events via the existing `ObservabilityHooks` surface.
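
The check / dispatch / record ordering and the two fail policies above can be sketched as follows. This is an illustration under stated assumptions, not the factory's actual code: `dispatchWithQuota`, `QuotaRequest`, and the `cost` field on the response are hypothetical, and `QuotaExceededError` is a local stand-in because the constructor of the real class in llm-providers is not shown in this issue. The circuit breaker, retries, `CostTracker`, and observability events are omitted, and `record()` is awaited sequentially here for simplicity rather than run in parallel.

```typescript
type FailPolicy = "closed" | "open";

// Hypothetical request shape carrying only what the hook needs.
interface QuotaRequest {
  tenantId?: string;
  provider: string;
  model: string;
  estimatedCost: number;
}
interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
}
interface QuotaHook {
  check(input: QuotaRequest): Promise<QuotaCheckResult>;
  record(input: { tenantId?: string; provider: string; model: string; actualCost: number }): Promise<void>;
}

// Local stand-in; the real QuotaExceededError may have a different constructor.
class QuotaExceededError extends Error {
  constructor(reason?: string) {
    super(reason ?? "quota exceeded");
    this.name = "QuotaExceededError";
  }
}

async function dispatchWithQuota<T extends { cost: number }>(
  hook: QuotaHook | undefined,
  failPolicy: FailPolicy,
  request: QuotaRequest,
  dispatch: () => Promise<T>,
): Promise<T> {
  if (hook) {
    let result: QuotaCheckResult;
    try {
      result = await hook.check(request);
    } catch (err) {
      // A throwing hook is mapped to allow or deny according to failPolicy.
      result =
        failPolicy === "open"
          ? { allowed: true }
          : { allowed: false, reason: `quota check failed: ${String(err)}` };
    }
    if (!result.allowed) throw new QuotaExceededError(result.reason);
  }

  // The provider call happens only after the check has passed.
  const response = await dispatch();

  if (hook) {
    try {
      await hook.record({
        tenantId: request.tenantId,
        provider: request.provider,
        model: request.model,
        actualCost: response.cost,
      });
    } catch (err) {
      // Non-fatal: the caller still receives the response.
      console.warn("quotaHook.record failed (non-fatal):", err);
    }
  }
  return response;
}
```

With `failPolicy: 'closed'`, an unreachable quota backend surfaces to the caller as a `QuotaExceededError`; with `'open'`, the request proceeds as if no hook were configured.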

Non-goals

  • llm-providers does NOT implement multi-tenant budgets internally (that's the consumer's auth/billing backend)
  • llm-providers does NOT force consumers to use a quota hook — purely opt-in, backwards compatible
  • The existing global `CreditLedger` is not removed or changed

Cross-reference

  • edge-auth#82 — canonical `ResourceQuotaProvider` contract that Stackbilt consumers will wrap when wiring this hook.
  • Same hook pattern will be used by future `llm-providers` txt2img support, img-forge, storage wrappers, etc.

Priority

MEDIUM — no consumer is blocked today, but it's the seam that unlocks multi-tenant production deployments at scale and is core to the fractal resource-quota architecture we're standardizing on.

🤖 Filed by AEGIS during Phase D scoping session
