Summary
Add an optional `quotaHook` configuration to `LLMProviderFactory` that allows consumers to inject their own `check` / `record` logic for per-tenant quota enforcement — without llm-providers having any knowledge of a specific auth or billing backend.
Motivation
Stackbilt (the primary consumer of this package) is a multi-tenant platform — Hobby / Pro tiers, paying users, per-tenant budget caps. Per-tenant budget enforcement is a cross-cutting concern that affects every resource package in the ecosystem: LLM, images, storage, email, etc. To keep llm-providers OSS-clean and auth-backend-agnostic, the tenant/billing logic must live elsewhere.
The canonical contract is being defined in `Stackbilt-dev/edge-auth#82` (`ResourceQuotaProvider`). llm-providers just needs to expose the seam so consumers can wire their edge-auth-backed implementation in. Other ecosystems (non-Stackbilt) can wire in anything that implements the minimal interface — or skip the hook entirely and use the built-in `CreditLedger` for single-tenant deployments.
Proposed interface (minimal, no Stackbilt coupling)
```ts
export interface QuotaHook {
  // Called before dispatch; resolves with the allow/deny decision.
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  // Called after a successful request to report actual usage.
  record(input: QuotaRecordInput): Promise<void>;
}
export interface QuotaCheckInput {
  tenantId?: string; // from LLMRequest.tenantId if set
  provider: string; // 'anthropic', 'groq', etc.
  model: string;
  estimatedCost: number; // USD, computed from model cost table + request tokens
  metadata?: Record<string, unknown>;
}
export interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}
export interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
```
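To make the contract concrete, here is a minimal sketch of a consumer-side hook, assuming a flat per-tenant USD cap held in memory. `InMemoryQuotaHook`, the `'anonymous'` fallback key, and the `Promise<void>` return type of `record` are illustrative assumptions, not part of the proposal; a real Stackbilt consumer would wrap its edge-auth-backed provider instead.

```typescript
// Interfaces repeated from the proposal so the sketch compiles on its own.
interface QuotaCheckInput {
  tenantId?: string;
  provider: string;
  model: string;
  estimatedCost: number;
  metadata?: Record<string, unknown>;
}
interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}
interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
  inputTokens?: number;
  outputTokens?: number;
  metadata?: Record<string, unknown>;
}
interface QuotaHook {
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  record(input: QuotaRecordInput): Promise<void>; // return type assumed
}

// Hypothetical hook: every tenant gets the same USD cap, tracked in memory.
class InMemoryQuotaHook implements QuotaHook {
  private readonly capUsd: number;
  private readonly spent = new Map<string, number>();

  constructor(capUsd: number) {
    this.capUsd = capUsd;
  }

  async check(input: QuotaCheckInput): Promise<QuotaCheckResult> {
    // Requests without a tenantId share one bucket (an assumption of this sketch).
    const key = input.tenantId ?? 'anonymous';
    const remaining = this.capUsd - (this.spent.get(key) ?? 0);
    if (input.estimatedCost > remaining) {
      return {
        allowed: false,
        reason: `tenant ${key} would exceed its budget`,
        remainingBudget: remaining,
      };
    }
    return { allowed: true, remainingBudget: remaining };
  }

  async record(input: QuotaRecordInput): Promise<void> {
    const key = input.tenantId ?? 'anonymous';
    this.spent.set(key, (this.spent.get(key) ?? 0) + input.actualCost);
  }
}
```

Because the hook only sees `QuotaCheckInput` and `QuotaRecordInput`, swapping this in-memory version for a backend-backed one would not require any change on the llm-providers side.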
Factory integration
- Before request dispatch, the factory calls `quotaHook.check()` (if configured). If the result has `allowed: false`, the factory throws `QuotaExceededError` (the class already exists in llm-providers) with `reason` as context. This check runs before the circuit breaker and retry logic.
- After a successful request, the factory calls `quotaHook.record()` in parallel with the existing `CostTracker.trackCost()`. Errors from `record()` should be non-fatal: they are logged but do not break the response.
- Errors from `check()` follow a configurable fail policy:
  - `failPolicy: 'closed'` (default): a hook error is treated as `allowed: false`. Safer; prevents uncontrolled spend if the auth backend is down.
  - `failPolicy: 'open'`: a hook error is treated as `allowed: true`. Availability-first; risks spend overrun.
- Existing CreditLedger continues to work — QuotaHook is additive. Simple deployments with no multi-tenant needs use the global ledger as today. Multi-tenant deployments layer the hook on top.
- Observability — fire new `onQuotaCheck` / `onQuotaDenied` events via the existing `ObservabilityHooks` surface.
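The flow described in the list above can be sketched as a single wrapper function. This is an illustration under stated assumptions, not the factory's actual code: `withQuota`, `QuotaOptions`, and the `dispatch` callback are hypothetical names, the local `QuotaExceededError` is a stand-in for the class that already exists in llm-providers (its real constructor may differ), and `record()` is awaited sequentially here for simplicity rather than run in parallel with `CostTracker.trackCost()`.

```typescript
interface QuotaCheckInput {
  tenantId?: string;
  provider: string;
  model: string;
  estimatedCost: number;
}
interface QuotaCheckResult {
  allowed: boolean;
  reason?: string;
  remainingBudget?: number;
}
interface QuotaRecordInput {
  tenantId?: string;
  provider: string;
  model: string;
  actualCost: number;
}
interface QuotaHook {
  check(input: QuotaCheckInput): Promise<QuotaCheckResult>;
  record(input: QuotaRecordInput): Promise<void>;
}

// Stand-in for the existing llm-providers error class.
class QuotaExceededError extends Error {
  constructor(reason?: string) {
    super(reason ?? 'quota exceeded');
    this.name = 'QuotaExceededError';
  }
}

// Hypothetical option bag; the two callbacks mirror the proposed events.
interface QuotaOptions {
  quotaHook?: QuotaHook;
  failPolicy?: 'closed' | 'open';
  onQuotaCheck?: (result: QuotaCheckResult) => void;
  onQuotaDenied?: (result: QuotaCheckResult) => void;
}

async function withQuota<T>(
  opts: QuotaOptions,
  input: QuotaCheckInput,
  dispatch: () => Promise<{ response: T; actualCost: number }>,
): Promise<T> {
  const hook = opts.quotaHook;
  if (hook) {
    let result: QuotaCheckResult;
    try {
      result = await hook.check(input);
    } catch (err) {
      // Fail policy: 'closed' (default) denies, 'open' allows.
      result = {
        allowed: opts.failPolicy === 'open',
        reason: `quota check failed: ${String(err)}`,
      };
    }
    opts.onQuotaCheck?.(result);
    if (!result.allowed) {
      opts.onQuotaDenied?.(result);
      throw new QuotaExceededError(result.reason);
    }
  }

  // Circuit breaker and retry logic would wrap this call in the real factory.
  const { response, actualCost } = await dispatch();

  if (hook) {
    try {
      await hook.record({
        tenantId: input.tenantId,
        provider: input.provider,
        model: input.model,
        actualCost,
      });
    } catch (err) {
      // Non-fatal: log and still return the response.
      console.warn('quotaHook.record failed', err);
    }
  }
  return response;
}
```

When no `quotaHook` is configured, the wrapper reduces to a plain call to `dispatch`, which matches the opt-in, backwards-compatible intent.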
Non-goals
- llm-providers does NOT implement multi-tenant budgets internally (that's the consumer's auth/billing backend)
- llm-providers does NOT force consumers to use a quota hook — purely opt-in, backwards compatible
- The existing global `CreditLedger` is not removed or changed
Cross-reference
- edge-auth#82 — canonical `ResourceQuotaProvider` contract that Stackbilt consumers will wrap when wiring this hook.
- Same hook pattern will be used by future `llm-providers` txt2img support, img-forge, storage wrappers, etc.
Priority
MEDIUM — no consumer is blocked today, but it's the seam that unlocks multi-tenant production deployments at scale and is core to the fractal resource-quota architecture we're standardizing on.
🤖 Filed by AEGIS during Phase D scoping session