feat(bifrost): enforce the rule-16 spend cap via a governance budget #186
Draft
elronbandel wants to merge 1 commit into
The per-run EVAL_MODEL_MAX_BUDGET cap (models/RULES.md rule 16) was enforced only by core/litellm; the default gpt-5.4--bifrost path ignored it.

Add bifrost governance: a budget on the sk-proxy virtual key (the bearer the agent already sends), priced by a pricing_override at the same $2.50/$10-per-1M rates as the litellm config, behind the governance plugin. The start script renders EVAL_MODEL_MAX_BUDGET into the config (mirrors core/litellm's wrapper).

Tests: a static guard locks the wiring; an #[ignore] behavioral test (upstream_bifrost_budget_cap_rejects) proves bifrost rejects once spend crosses the cap. It is also the empirical check that config.json governance enforces in our single-node setup (the silent no-op maximhq/bifrost#2408 is multinode-only).

Lean config: the budget attaches directly to the virtual key (no team object, no client block). Enforcement is validated by the #[ignore] test against the live endpoint.

Resolves #185.

Signed-off-by: Elron Bandel <elron.bandel@ibm.com>
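To make the pricing and cap in the commit message concrete, here is a minimal Rust sketch of the arithmetic. It assumes the $2.50/$10-per-1M rates are input/output token prices, uses the $1 default stated in the PR description, and treats "exceeds the cap" as a strict comparison. The helper names (`resolve_cap`, `request_cost_usd`, `cap_exceeded`) are hypothetical and are not part of bifrost or this repository; how bifrost accounts for spend internally is not shown here.

```rust
// Assumed prices taken from the PR text: USD per 1M input / output tokens.
const INPUT_USD_PER_MILLION: f64 = 2.50;
const OUTPUT_USD_PER_MILLION: f64 = 10.0;
// Default cap in USD when EVAL_MODEL_MAX_BUDGET is unset (the PR states $1).
const DEFAULT_CAP_USD: f64 = 1.0;

/// Hypothetical helper: resolve the cap from an optional raw env value.
fn resolve_cap(raw: Option<&str>) -> Result<f64, String> {
    match raw {
        None => Ok(DEFAULT_CAP_USD),
        Some(s) => {
            let value: f64 = s
                .trim()
                .parse()
                .map_err(|e| format!("invalid EVAL_MODEL_MAX_BUDGET {s:?}: {e}"))?;
            if value.is_finite() && value > 0.0 {
                Ok(value)
            } else {
                Err(format!("EVAL_MODEL_MAX_BUDGET must be a positive number, got {s:?}"))
            }
        }
    }
}

/// Cost of one request at the assumed rates.
fn request_cost_usd(input_tokens: u64, output_tokens: u64) -> f64 {
    (input_tokens as f64 * INPUT_USD_PER_MILLION
        + output_tokens as f64 * OUTPUT_USD_PER_MILLION)
        / 1_000_000.0
}

/// Assumption: rejection starts once accumulated spend is strictly above the cap.
fn cap_exceeded(spent_usd: f64, cap_usd: f64) -> bool {
    spent_usd > cap_usd
}

fn main() {
    let raw = std::env::var("EVAL_MODEL_MAX_BUDGET").ok();
    let cap = resolve_cap(raw.as_deref()).expect("EVAL_MODEL_MAX_BUDGET should be valid");

    // 200k input tokens cost $0.50 and 50k output tokens cost $0.50.
    let cost = request_cost_usd(200_000, 50_000);
    println!("cap = ${cap:.2}, request cost = ${cost:.2}");
    println!("cap exceeded after this request: {}", cap_exceeded(cost, cap));
}
```

At these rates a single request with 200k input and 50k output tokens costs exactly $1.00, so under the default cap the next billed request would push spend over the limit.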
Resolves #185. Draft — do not merge until the enforcement test passes on a live endpoint (see Verification).
Summary
The per-run spend cap (EVAL_MODEL_MAX_BUDGET, default $1; models/RULES.md rule 16) was enforced only by core/litellm; the default gpt-5.4--bifrost path ignored it. This adds the cap to the bifrost gateway via bifrost governance: a budget on the sk-proxy virtual key (the bearer the agent already sends), priced by a pricing_override at the same $2.50/$10-per-1M rates as the litellm config, behind the governance plugin. gateways/bifrost/start renders EVAL_MODEL_MAX_BUDGET into the config (mirrors core/litellm's wrapper). Prerequisite for retiring the 2 GB core/litellm.
Type of change
Rules checked against
.agents/models/RULES.md rule 16 (hard budget cap): implements it for the bifrost path; rule text unchanged.
.agents/gateways/RULES.md: gateway flavor contract preserved.
.agents/contributing/RULES.md 2 (code-only) + 3 (declared rules); tests/run/gateways/RULES.md (#[ignore] for credentialed behavior).
Verification
cargo fmt --check + scoped clippy -D warnings (gateways target) clean
static_bifrost_wires_budget_cap passes
#[ignore] (needs a real upstream): provider_configs.
RULES.md impact
Breaking changes
sk-proxy virtual key (is_vk_mandatory: true). The agent already sends sk-proxy (run-agent / runner.yaml), so real runs are unaffected; the test client was updated to match. Other callers to a bifrost gateway without sk-proxy will now get 401.
Test added (R-15)
upstream_bifrost_budget_cap_rejects (#[ignore]) asserts rejection once spend exceeds the cap; static_bifrost_wires_budget_cap locks the config wiring.
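As an illustration of what a static wiring guard can look like, here is a minimal Rust sketch. It is not the actual static_bifrost_wires_budget_cap test. It only checks that the identifiers this PR names (the governance plugin, the sk-proxy virtual key, pricing_override, is_vk_mandatory) appear somewhere in a config text. The function `missing_markers`, the marker list, and the placeholder text are assumptions for the example and do not reflect bifrost's real config.json layout.

```rust
use std::{env, fs, process};

// Identifiers the PR text names as part of the budget wiring.
const REQUIRED_MARKERS: [&str; 4] = [
    "governance",
    "sk-proxy",
    "pricing_override",
    "is_vk_mandatory",
];

// Stand-in text used when no file path is given. This is NOT a real config;
// it only lets the sketch run without a repository checkout.
const PLACEHOLDER: &str = "governance sk-proxy pricing_override is_vk_mandatory";

/// Hypothetical helper: return every required marker absent from the text.
fn missing_markers(config_text: &str) -> Vec<&'static str> {
    REQUIRED_MARKERS
        .iter()
        .copied()
        .filter(|marker| !config_text.contains(*marker))
        .collect()
}

fn main() {
    // Optional first argument: path to the config file to check.
    let text = match env::args().nth(1) {
        Some(path) => fs::read_to_string(&path).expect("config file should be readable"),
        None => PLACEHOLDER.to_string(),
    };

    let missing = missing_markers(&text);
    if missing.is_empty() {
        println!("all budget wiring markers present");
    } else {
        eprintln!("missing budget wiring markers: {missing:?}");
        process::exit(1);
    }
}
```

A substring guard like this is cheap and needs no credentials, but it cannot show that the cap is enforced; that is why the behavioral check remains a separate #[ignore] test against a live endpoint.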