Agentic AI Systems Should Be Designed as Marginal Token Allocators

📄 Read the paper (PDF) · 🌐 Project page

TL;DR

Routers, agents, serving stacks, and trainers look like four different engineering problems. They are not. They are four readings of one allocation problem, evaluated at four shadow prices that today no single layer can see.

The system should spend the next token on:

$$ i^{*} = \arg\max_{i}\Big[V,\Delta Q_i - \Delta C_i - \lambda,\Delta L_i - \rho,\Delta R_i\Big] $$

where $V$ is task value, $\Delta C_i$ is marginal compute cost, $\lambda$ is the shadow price of latency, and $\rho$ is the shadow price of risk.

The four layers

Layer	Mechanism	Index	Prices observed
Demand	Routing as screening	model tier	$V$, $\Delta C_i$
Action	Agent as principal–agent	plan / act / verify	$\rho$, $V$
Supply	Serving as production	prefill / decode / KV	$\lambda$, $\Delta C_i$
Capital	Caches & RL as investment	rollout / store	$\Delta C_i$, $\rho$

Predicted failure modes

The unified view turns failures into corner cases of one equation: over-routing, under-routing, over-delegation, under-verification, serving congestion, stale RL rollouts, and cache misuse.

Design implications

Token-aware evaluation — report all four prices, not just dollar cost.
Risk-adjusted routing — publish a regret bound or an incentive-compatible menu.
Autonomy pricing — make action class explicit; price irreversible actions higher.
Congestion-priced serving — expose shadow prices for prefill / decode / KV.
RL token budgeting — equalize marginal capability gain across rollouts, verifiers, and updates.

Citation

@misc{zhu2026marginaltoken,
  title  = {Agentic AI Systems Should Be Designed as Marginal Token Allocators},
  author = {Siqi Zhu},
  year   = {2026},
  note   = {Position paper, preprint}
}

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
README.md		README.md
UPLOAD_INSTRUCTIONS.md		UPLOAD_INSTRUCTIONS.md
index.html		index.html
paper.pdf		paper.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Agentic AI Systems Should Be Designed as Marginal Token Allocators

TL;DR

The four layers

Predicted failure modes

Design implications

Citation

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Agentic AI Systems Should Be Designed as Marginal Token Allocators

TL;DR

The four layers

Predicted failure modes

Design implications

Citation

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages