Commit d469278

RFC-0909 v9: adopt RFC-0903 spend_ledger schema
- Replace parallel usage_ledger table with RFC-0903 spend_ledger
- Rename columns: prompt_tokens→input_tokens, completion_tokens→output_tokens, cost_units→cost_amount
- Add provider_usage_json field, remove route column
- Update approval criteria, relationship section, footer version
- Add v9 changelog entry

1 parent a1c0d3c commit d469278

1 file changed

Lines changed: 86 additions & 77 deletions

rfcs/draft/economics/0909-deterministic-quota-accounting.md

@@ -2,7 +2,7 @@

 ## Status

-Draft (v8 - binary hash storage)
+Draft (v9 - adopts RFC-0903 spend_ledger, removes parallel ledger schema)

 ## Authors

@@ -107,11 +107,11 @@ Cost is computed using deterministic rules.

 ```rust
 // Simple cost: just tokens
-let cost = prompt_tokens + completion_tokens;
+let cost = input_tokens + output_tokens;

 // Or rate-based cost:
-let cost = (prompt_tokens * prompt_rate) +
-           (completion_tokens * completion_rate);
+let cost = (input_tokens * prompt_rate) +
+           (output_tokens * completion_rate);
 ```

 Rates must be represented using **integer scaling**.
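
As a minimal sketch of integer scaling (not taken from the RFC), the rate-based formula above could be written with rates stored as scaled integers. The constant `RATE_SCALE`, both function names, and the choice of micro-units per token are assumptions made for illustration only.

```rust
/// Hypothetical scale factor: rates are stored as micro-units per token,
/// so a rate of 1.5 cost units per token is stored as 1_500_000.
pub const RATE_SCALE: u64 = 1_000_000;

/// Rate-based cost in scaled (micro) units, using integer math only.
/// Returns `None` if a multiplication or the final addition would overflow `u64`.
pub fn scaled_cost(
    input_tokens: u32,
    output_tokens: u32,
    input_rate_scaled: u64,
    output_rate_scaled: u64,
) -> Option<u64> {
    let input_cost = u64::from(input_tokens).checked_mul(input_rate_scaled)?;
    let output_cost = u64::from(output_tokens).checked_mul(output_rate_scaled)?;
    input_cost.checked_add(output_cost)
}

/// Convert a scaled cost to whole cost units, truncating toward zero.
pub fn to_whole_units(scaled: u64) -> u64 {
    scaled / RATE_SCALE
}
```

For example, `scaled_cost(1000, 500, 1_500_000, 2_000_000)` yields `Some(2_500_000_000)`, which `to_whole_units` reduces to `2500`. Summing the scaled terms before dividing truncates once rather than once per term; whether that matches the RFC's intended rounding is not stated in this diff.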
@@ -151,24 +151,24 @@ pub struct UsageEvent {
     pub team_id: Option<String>,
     /// Unix timestamp (seconds)
     pub timestamp: u64,
-    /// Route that was called
-    pub route: String,
     /// Provider name
     pub provider: String,
     /// Model name
     pub model: String,
     /// Number of prompt tokens
-    pub prompt_tokens: u32,
+    pub input_tokens: u32,
     /// Number of completion tokens
-    pub completion_tokens: u32,
+    pub output_tokens: u32,
     /// Total cost units (deterministic)
-    pub cost_units: u64,
+    pub cost_amount: u64,
     /// Pricing hash (SHA256 of pricing table used)
     pub pricing_hash: [u8; 32],
     /// Token source for deterministic accounting (CRITICAL for cross-router determinism)
     pub token_source: TokenSource,
     /// Canonical tokenizer version (if token_source is CanonicalTokenizer)
     pub tokenizer_version: Option<String>,
+    /// Raw provider usage JSON for audit
+    pub provider_usage_json: Option<String>,
 }

 /// Generate deterministic event_id from request content
@@ -308,40 +308,39 @@ Duplicate requests therefore cannot double charge.
 All usage events are written to a **ledger table**.

 ```sql
--- Usage ledger - THE authoritative economic record
+-- Spend ledger - THE authoritative economic record
+-- Adopted from RFC-0903 (Final) spend_ledger schema for consistency
 -- Token counts MUST originate from provider when available (see Canonical Token Accounting)
--- Hash storage: BYTEA(32) for SHA256 hashes (32 bytes) instead of TEXT hex (64+ chars)
-CREATE TABLE usage_ledger (
-    event_id BYTEA(32) PRIMARY KEY, -- SHA256 = 32 bytes (not 64-char hex string)
-    request_id BYTEA(32) NOT NULL, -- SHA256 = 32 bytes (not 64-char hex string)
-    key_id TEXT NOT NULL, -- UUID as text (36 chars with dashes)
+CREATE TABLE spend_ledger (
+    event_id TEXT PRIMARY KEY, -- UUID as text (36 chars with dashes)
+    request_id TEXT NOT NULL, -- UUID as text
+    key_id TEXT NOT NULL,
     team_id TEXT,
-    timestamp BIGINT NOT NULL, -- Unix epoch (authoritative event time)
-    route TEXT NOT NULL, -- Route path (e.g., "/v1/chat/completions")
-    provider TEXT NOT NULL, -- Provider name
-    model TEXT NOT NULL, -- Model name
-    prompt_tokens INTEGER NOT NULL,
-    completion_tokens INTEGER NOT NULL,
-    cost_units BIGINT NOT NULL,
-    pricing_hash BYTEA(32) NOT NULL, -- SHA256 of pricing table used
-    -- Token source for deterministic accounting (CRITICAL)
+    provider TEXT NOT NULL, -- Provider name
+    model TEXT NOT NULL, -- Model name
+    input_tokens INTEGER NOT NULL, -- Prompt tokens
+    output_tokens INTEGER NOT NULL, -- Completion tokens
+    cost_amount BIGINT NOT NULL, -- Cost in smallest unit
+    pricing_hash BYTEA(32) NOT NULL, -- SHA256 of pricing table used
+    timestamp INTEGER NOT NULL, -- Unix epoch (authoritative event time)
     token_source TEXT NOT NULL CHECK (token_source IN ('provider_usage', 'canonical_tokenizer')),
     tokenizer_version TEXT,
-    -- Note: created_at removed - timestamp is authoritative for determinism
+    provider_usage_json TEXT, -- Raw provider usage for audit
+    created_at INTEGER NOT NULL DEFAULT (strftime('%s', 'now')),
     -- Scoped uniqueness: request_id unique per key (idempotency constraint)
     UNIQUE(key_id, request_id),
     -- Foreign keys for integrity
     FOREIGN KEY(key_id) REFERENCES api_keys(key_id) ON DELETE CASCADE,
     FOREIGN KEY(team_id) REFERENCES teams(team_id) ON DELETE SET NULL
 );

-CREATE INDEX idx_usage_ledger_key_id ON usage_ledger(key_id);
-CREATE INDEX idx_usage_ledger_team_id ON usage_ledger(team_id);
-CREATE INDEX idx_usage_ledger_timestamp ON usage_ledger(timestamp);
+CREATE INDEX idx_spend_ledger_key_id ON spend_ledger(key_id);
+CREATE INDEX idx_spend_ledger_team_id ON spend_ledger(team_id);
+CREATE INDEX idx_spend_ledger_timestamp ON spend_ledger(timestamp);
 -- Composite index for efficient quota queries
-CREATE INDEX idx_usage_ledger_key_time ON usage_ledger(key_id, timestamp);
+CREATE INDEX idx_spend_ledger_key_time ON spend_ledger(key_id, timestamp);
 -- Index for pricing verification queries
-CREATE INDEX idx_usage_ledger_pricing_hash ON usage_ledger(pricing_hash);
+CREATE INDEX idx_spend_ledger_pricing_hash ON spend_ledger(pricing_hash);
 ```
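
The effect of `UNIQUE(key_id, request_id)` combined with `ON CONFLICT ... DO NOTHING` can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the RFC's storage layer: `LedgerModel` and its methods are hypothetical names, and a `HashSet` stands in for the database constraint.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for the ledger's
/// UNIQUE(key_id, request_id) constraint.
#[derive(Default)]
pub struct LedgerModel {
    seen: HashSet<(String, String)>,
    total_cost: u64,
}

impl LedgerModel {
    /// Mirrors INSERT ... ON CONFLICT(key_id, request_id) DO NOTHING:
    /// returns true if the row was recorded, false if it was a duplicate.
    pub fn insert(&mut self, key_id: &str, request_id: &str, cost_amount: u64) -> bool {
        let fresh = self
            .seen
            .insert((key_id.to_string(), request_id.to_string()));
        if fresh {
            // Only a first-time (key_id, request_id) pair adds to spend.
            self.total_cost = self.total_cost.saturating_add(cost_amount);
        }
        fresh
    }

    /// Sum of cost_amount over all recorded rows.
    pub fn total_cost(&self) -> u64 {
        self.total_cost
    }
}
```

Inserting `("key-a", "req-1", 40)` twice records the cost once, while the same `request_id` under a different `key_id` is treated as a separate row, which is what the scoped uniqueness constraint in the schema expresses.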

 ## Replay and Verification
@@ -364,7 +363,7 @@ pub fn replay_events(events: &[UsageEvent]) -> BTreeMap<Uuid, u64> {

     for event in sorted_events {
         let entry = key_spend.entry(event.key_id).or_insert(0);
-        *entry = entry.saturating_add(event.cost_units);
+        *entry = entry.saturating_add(event.cost_amount);
     }

     key_spend
@@ -382,11 +381,11 @@ Verification nodes can reconstruct:
 For audit and verification, deterministic replay MUST follow this procedure:

 ```
-1. Load all usage_ledger for a key_id
+1. Load all spend_ledger for a key_id
 2. Order by timestamp ASC, then event_id ASC (canonical identity)
-3. Compute current_spend = SUM(events.cost_units)
+3. Compute current_spend = SUM(events.cost_amount)
 4. Verify equality: computed_spend == stored current_spend
-5. If mismatch, trust usage_ledger as authoritative
+5. If mismatch, trust spend_ledger as authoritative
 ```

 This ensures economic audit can always reconcile the ledger.
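
The five-step procedure above can be sketched in Rust as follows. This is an illustration only: `LedgerRow`, `replay_spend`, and `reconcile` are hypothetical names, and identifiers are plain strings rather than the RFC's `Uuid` and hash types so the example stays self-contained.

```rust
use std::collections::BTreeMap;

/// Hypothetical, reduced view of one spend_ledger row.
#[derive(Clone, Debug)]
pub struct LedgerRow {
    pub key_id: String,
    pub event_id: String,
    pub timestamp: u64,
    pub cost_amount: u64,
}

/// Steps 1-3: order rows by (timestamp, event_id) and sum cost_amount per key.
pub fn replay_spend(rows: &[LedgerRow]) -> BTreeMap<String, u64> {
    let mut sorted: Vec<&LedgerRow> = rows.iter().collect();
    sorted.sort_by(|a, b| {
        a.timestamp
            .cmp(&b.timestamp)
            .then_with(|| a.event_id.cmp(&b.event_id))
    });

    let mut spend = BTreeMap::new();
    for row in sorted {
        let entry = spend.entry(row.key_id.clone()).or_insert(0u64);
        *entry = entry.saturating_add(row.cost_amount);
    }
    spend
}

/// Steps 4-5: compare the replayed value with a stored balance.
/// Returns the ledger-derived value (which is authoritative on mismatch)
/// and whether the stored value agreed with it.
pub fn reconcile(rows: &[LedgerRow], key_id: &str, stored: u64) -> (u64, bool) {
    let computed = replay_spend(rows).get(key_id).copied().unwrap_or(0);
    (computed, computed == stored)
}
```

A caller that receives `(computed, false)` would overwrite any cached balance with `computed`, since the procedure treats the ledger as authoritative.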
@@ -396,8 +395,8 @@ This ensures economic audit can always reconcile the ledger.
 The following invariants MUST hold at all times:

 ```
-1. usage_ledger are the authoritative economic record
-2. current_spend = SUM(usage_ledger.cost_units)
+1. spend_ledger are the authoritative economic record
+2. current_spend = SUM(spend_ledger.cost_amount)
 3. 0 ≤ current_spend ≤ budget_limit
 4. request_id uniqueness prevents double charging
 5. pricing_hash ensures deterministic cost calculation
@@ -450,15 +449,15 @@ pub fn get_pricing(model: &str) -> Option<PricingModel> {
 /// Calculate cost deterministically
 pub fn calculate_cost(
     model: &str,
-    prompt_tokens: u32,
-    completion_tokens: u32,
+    input_tokens: u32,
+    output_tokens: u32,
 ) -> Result<u64, Error> {
     let pricing = get_pricing(model)
         .ok_or_else(|| Error::UnknownModel(model.to_string()))?;

     // Integer math only - no floating point
-    let prompt_cost = (prompt_tokens as u64 * pricing.prompt_cost_per_1k) / 1000;
-    let completion_cost = (completion_tokens as u64 * pricing.completion_cost_per_1k) / 1000;
+    let prompt_cost = (input_tokens as u64 * pricing.prompt_cost_per_1k) / 1000;
+    let completion_cost = (output_tokens as u64 * pricing.completion_cost_per_1k) / 1000;

     Ok(prompt_cost + completion_cost)
 }
@@ -533,19 +532,20 @@ The router must recompute cost using **its own pricing tables**, ignoring provid
 ```rust
 /// Process response and record usage
 /// CRITICAL: Uses provider-reported tokens and deterministic event_id for cross-router determinism
+/// Note: ProviderResponse.provider_usage_json contains the raw provider usage JSON for audit
 pub fn process_response(
     db: &Database,
     key_id: &Uuid,
     team_id: Option<&str>,
     provider: &str,
     model: &str,
-    response: &ProviderResponse,
+    response: &ProviderResponse, // Contains: usage, timestamp, id, provider_usage_json
     pricing_hash: [u8; 32],
 ) -> Result<UsageEvent, Error> {
     // CRITICAL: Use provider-reported tokens for deterministic accounting
     // This ensures all routers produce identical token counts
-    let prompt_tokens = response.prompt_tokens;
-    let completion_tokens = response.completion_tokens;
+    let input_tokens = response.input_tokens;
+    let output_tokens = response.output_tokens;

     // Determine token source: check if provider returned usage metadata
     // A provider may legitimately return 0 tokens, so check .is_some() not token count
@@ -557,7 +557,7 @@ pub fn process_response(
     };

     // Calculate cost using deterministic pricing
-    let cost_units = calculate_cost(model, prompt_tokens, completion_tokens)?;
+    let cost_amount = calculate_cost(model, input_tokens, output_tokens)?;

     // Generate deterministic request_id (binary SHA256)
     let request_id = compute_request_id(key_id, response.timestamp, &response.id);
@@ -568,8 +568,8 @@ pub fn process_response(
         key_id,
         provider,
         model,
-        prompt_tokens,
-        completion_tokens,
+        input_tokens,
+        output_tokens,
         &pricing_hash,
         token_source,
     );
@@ -581,15 +581,15 @@ pub fn process_response(
         key_id: *key_id,
         team_id: team_id.map(String::from),
         timestamp: response.timestamp,
-        route: response.route.clone(),
         provider: provider.to_string(),
         model: model.to_string(),
-        prompt_tokens,
-        completion_tokens,
-        cost_units,
+        input_tokens,
+        output_tokens,
+        cost_amount,
         pricing_hash,
         token_source,
         tokenizer_version,
+        provider_usage_json: response.provider_usage_json.clone(),
     };

     // Wrap in transaction for atomicity - prevents orphan ledger entries
@@ -604,42 +604,42 @@ pub fn process_response(

     // 2. Compute current spend from ledger
     let current: i64 = tx.query_row(
-        "SELECT COALESCE(SUM(cost_units), 0) FROM usage_ledger WHERE key_id = $1",
+        "SELECT COALESCE(SUM(cost_amount), 0) FROM spend_ledger WHERE key_id = $1",
         params![key_id.to_string()],
         |row| row.get(0),
     )?;

     // 3. Check budget
-    if current + cost_units as i64 > budget {
+    if current + cost_amount as i64 > budget {
         return Err(Error::BudgetExceeded { current: current as u64, limit: budget as u64 });
     }

-    // 4. Insert into ledger (binary hash storage for event_id, request_id)
+    // 4. Insert into ledger
     tx.execute(
-        "INSERT INTO usage_ledger (
-            event_id, request_id, key_id, team_id, timestamp, route,
-            provider, model, prompt_tokens, completion_tokens, cost_units,
-            pricing_hash, token_source, tokenizer_version
+        "INSERT INTO spend_ledger (
+            event_id, request_id, key_id, team_id, timestamp,
+            provider, model, input_tokens, output_tokens, cost_amount,
+            pricing_hash, token_source, tokenizer_version, provider_usage_json
         ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
         ON CONFLICT(key_id, request_id) DO NOTHING",
         params![
-            &event.event_id, -- BYTEA(32) binary
-            &event.request_id, -- BYTEA(32) binary
+            &event.event_id,
+            &event.request_id,
             event.key_id.to_string(),
             event.team_id,
             event.timestamp as i64,
-            &event.route,
             &event.provider,
             &event.model,
-            event.prompt_tokens as i32,
-            event.completion_tokens as i32,
-            event.cost_units as i64,
-            &event.pricing_hash, -- BYTEA(32) binary
+            event.input_tokens as i32,
+            event.output_tokens as i32,
+            event.cost_amount as i64,
+            &event.pricing_hash,
             match event.token_source {
                 TokenSource::ProviderUsage => "provider_usage",
                 TokenSource::CanonicalTokenizer => "canonical_tokenizer",
             },
             event.tokenizer_version,
+            &event.provider_usage_json,
         ],
     )?;

@@ -722,7 +722,7 @@ pub fn build_merkle_tree(events: &[UsageEvent]) -> MerkleNode {
         .map(|e| {
             let mut hasher = Sha256::new();
             hasher.update(&e.event_id); // Binary hash, not hex string
-            hasher.update(e.cost_units.to_le_bytes());
+            hasher.update(e.cost_amount.to_le_bytes());
             hasher.finalize().into()
         })
         .collect();
@@ -832,7 +832,7 @@ RFC-0909 follows a **ledger-based architecture** for deterministic quota account
 **Core principle:**

 ```
-usage_ledger is the authoritative economic record.
+spend_ledger is the authoritative economic record.
 All balances MUST be derived from the ledger.
 ```

@@ -846,7 +846,7 @@ This simplifies the system and makes it more deterministic:

 **Key architectural points:**

-1. **Ledger is authoritative** - All economic events are appended to `usage_ledger`
+1. **Ledger is authoritative** - All economic events are appended to `spend_ledger`
 2. **Balances are derived** - `current_spend` is computed from ledger, not stored
 3. **Idempotent events** - `request_id UNIQUE` prevents double charging
 4. **Deterministic event_id** - SHA256 hash ensures same request = same event across routers
@@ -875,25 +875,25 @@ pub fn record_usage(

     // 2. Compute current spend from ledger (not a counter)
     let current: i64 = tx.query_row(
-        "SELECT COALESCE(SUM(cost_units), 0) FROM usage_ledger WHERE key_id = $1",
+        "SELECT COALESCE(SUM(cost_amount), 0) FROM spend_ledger WHERE key_id = $1",
         params![key_id.to_string()],
         |row| row.get(0),
     )?;

     // 3. Check budget with locked row
-    if current + event.cost_units as i64 > budget {
+    if current + event.cost_amount as i64 > budget {
         return Err(KeyError::BudgetExceeded { current: current as u64, limit: budget as u64 });
     }

     // 4. Insert into ledger (idempotent with ON CONFLICT - must match UNIQUE(key_id, request_id))
     tx.execute(
-        "INSERT INTO usage_ledger (
-            event_id, request_id, key_id, team_id, timestamp, route,
-            provider, model, prompt_tokens, completion_tokens, cost_units,
-            pricing_hash, token_source, tokenizer_version
+        "INSERT INTO spend_ledger (
+            event_id, request_id, key_id, team_id, timestamp,
+            provider, model, input_tokens, output_tokens, cost_amount,
+            pricing_hash, token_source, tokenizer_version, provider_usage_json
         ) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14)
         ON CONFLICT(key_id, request_id) DO NOTHING",
-        params![...],
+        params![...], // Same params as record_spend above
     )?;

     tx.commit()?;
@@ -908,7 +908,7 @@ Without row locking, two routers can race and overspend. With `FOR UPDATE`, only
 **Deterministic replay:**

 ```
-1. SELECT * FROM usage_ledger ORDER BY timestamp, event_id
+1. SELECT * FROM spend_ledger ORDER BY timestamp, event_id
 2. Recompute balances
 3. Verify equality with any cached balances
 ```
@@ -951,6 +951,7 @@ authentication
 authorization
 rate limits
 budgets
+spend_ledger table schema (Final)
 ```

 RFC-0909 defines:
@@ -959,20 +960,28 @@ RFC-0909 defines:
 how usage is measured and deducted
 ```

+**Ledger adoption (v9):** RFC-0909 previously defined a parallel `usage_ledger` table with different column names and types. As of v9, RFC-0909 adopts RFC-0903's `spend_ledger` schema as the canonical ledger. Both RFCs now share the same ledger table definition (`spend_ledger` with `input_tokens`/`output_tokens`/`cost_amount`/`provider_usage_json` columns). This eliminates the earlier inconsistency where the two RFCs had conflicting ledger schemas.
+
 Together they form the **quota router economic core**.

 ## Approval Criteria

 This RFC can be approved when:

 - deterministic cost units are implemented
-- usage ledger is append-only
+- spend_ledger is append-only (per RFC-0903)
 - atomic quota deduction is implemented
 - idempotent request accounting exists

+## Changelog
+
+| Version | Date | Changes |
+|---------|------|---------|
+| v9 | 2026-03-27 | Adopt RFC-0903 `spend_ledger` schema; remove parallel `usage_ledger` table; rename columns (`prompt_tokens`→`input_tokens`, `completion_tokens`→`output_tokens`, `cost_units`→`cost_amount`); add `provider_usage_json` field; remove `route` column |
+
 ---

 **Draft Date:** 2026-03-25
-**Version:** v8
+**Version:** v9
 **Related Use Case:** Enhanced Quota Router Gateway
 **Related RFCs:** RFC-0903 (Virtual API Key System)