size escrow off real prompt + output cap with headroom + per-job ceiling #16
Open
ffaerber wants to merge 2 commits into
The previous maxPayment formula was derived from max_tokens (default 1024), padded by a fixed 1M tokens on each side. That under-budgets long-context requests (Gemini 2M, GPT-5 long inputs): the provider's honest claimJob reverts with PaymentTooHigh, the gateway times out, and the provider gets slashed for an honest, client-sized prompt.

The new compute_max_payment sizes the escrow off the estimated prompt length (chars/4 fallback) and the requested or default output cap, each padded by T4T_ESCROW_HEADROOM_RATIO. The optional T4T_MAX_ESCROW_PER_JOB rejects oversized requests with HTTP 413 instead of locking that much xBZZ.
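As a rough illustration of the sizing described above, here is a minimal sketch, not the code in this PR. The function signature, the `EscrowTooLarge` exception, the per-million price parameters, and the reading of T4T_ESCROW_HEADROOM_RATIO as a multiplier (for example 1.5) are all assumptions made for the example.

```python
import math


class EscrowTooLarge(Exception):
    """Hypothetical error; a gateway would map it to HTTP 413."""


def estimate_prompt_tokens(messages):
    # chars/4 fallback: roughly four characters per token.
    chars = sum(len(m.get("content") or "") for m in messages)
    return math.ceil(chars / 4)


def compute_max_payment(messages, max_tokens, price_in_per_m, price_out_per_m,
                        headroom_ratio=1.5, default_max_tokens=1024,
                        max_escrow_per_job=None):
    """Size the escrow from the estimated prompt and the output cap.

    Prices are integer token-units per million tokens (an assumption).
    headroom_ratio is treated as a multiplier (also an assumption).
    """
    prompt_tokens = estimate_prompt_tokens(messages)
    output_tokens = max_tokens if max_tokens is not None else default_max_tokens

    # Pad each side independently by the headroom ratio.
    padded_in = math.ceil(prompt_tokens * headroom_ratio)
    padded_out = math.ceil(output_tokens * headroom_ratio)

    # Integer ceiling division so the escrow never rounds down.
    total = padded_in * price_in_per_m + padded_out * price_out_per_m
    escrow = -(-total // 1_000_000)

    # Optional per-job ceiling: reject instead of locking that much.
    if max_escrow_per_job is not None and escrow > max_escrow_per_job:
        raise EscrowTooLarge(f"escrow {escrow} exceeds ceiling {max_escrow_per_job}")
    return escrow
```

With this shape, a long prompt raises the escrow in proportion to its length, and a request that would exceed the ceiling fails fast before any funds are locked.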
…ment

Two defensive changes so an honest provider can't get slashed for an honest workload that overshoots the escrow:

1. Before calling chatCompletion, the worker derives the maximum number of completion tokens the on-chain maxPayment can pay for (given the provider's declared per-million prices and a conservative chars/4 prompt estimate), and lowers req.max_tokens if it is higher (or absent). Any backend that honors max_tokens then physically cannot produce a response whose actualPayment would exceed maxPayment.
2. In the claim path, the worker re-reads the on-chain Job to get the authoritative maxPayment (a defense against a gateway that tampers with notify.body), then clips actualPayment = min(actual, maxPayment). A backend that ignores max_tokens still claims what it can instead of reverting with PaymentTooHigh and falling through to timeoutJob's 3x slash.
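The two guards can be sketched as follows. This is a hedged illustration, not the worker code from this PR: the helper names, the integer per-million price units, and the assumption that the output price is positive are all hypothetical.

```python
import math


def max_affordable_completion_tokens(max_payment, prompt_chars,
                                     price_in_per_m, price_out_per_m):
    """Largest completion size that maxPayment can still cover.

    Assumes price_out_per_m > 0 and integer prices per million tokens.
    """
    # Conservative chars/4 estimate of the prompt, costed with rounding up.
    prompt_tokens = math.ceil(prompt_chars / 4)
    prompt_cost = -(-(prompt_tokens * price_in_per_m) // 1_000_000)

    remaining = max_payment - prompt_cost
    if remaining <= 0:
        return 0
    # Floor division: never promise more tokens than the budget pays for.
    return remaining * 1_000_000 // price_out_per_m


def cap_max_tokens(requested, affordable):
    # Guard 1: lower max_tokens if it is higher than affordable, or absent.
    if requested is None:
        return affordable
    return min(requested, affordable)


def clip_actual_payment(actual, on_chain_max_payment):
    # Guard 2: claim at most the authoritative on-chain maxPayment.
    return min(actual, on_chain_max_payment)
```

The first guard protects against backends that honor max_tokens; the second covers backends that ignore it, so the claim stays within maxPayment either way.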