Agentic Provider Gateway is a portable model-provider layer for agentic applications. It gives agents, task runners, and AI apps one stable gateway for model calls while keeping provider-specific behavior inside provider-owned packages.
It is intentionally lower-level than an agent framework. It does not decide goals, memory, task planning, or tool permissions. It handles the model/provider infrastructure those systems need:
- provider registry
- model capability metadata
- request compilation
- provider routing
- per-provider adapters
- per-provider rate limits
- structured output handling
- tool-call normalization
- streaming event normalization
- token and usage accounting
- SQLite trace storage
- real-provider smoke tooling
Agent runtimes tend to develop the same failure mode: model metadata, provider quirks, rate limits, streaming chunks, tool calls, JSON schemas, and usage tracking end up mixed into the agent logic.
That makes agents harder to reason about and harder to move between providers.
This project separates the layers:
- agents decide what work should happen
- the gateway decides which provider lane can satisfy the model request
- provider packages own provider-specific API details
The result is a clean lower layer that can be reused by many agent systems.
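As a rough illustration of the gateway's routing role, the sketch below picks a provider lane by capability. It is not the project's routing code: `ProviderLane`, `route`, and the capability strings are hypothetical names chosen for the example, and real routing may also weigh rate limits or provider health.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderLane:
    """Hypothetical description of one provider/model pairing."""
    provider: str
    model: str
    capabilities: frozenset


def route(lanes, required, provider=None):
    """Return the first lane that covers every required capability.

    Illustrative only: an optional provider name narrows the search.
    """
    for lane in lanes:
        if provider is not None and lane.provider != provider:
            continue
        # Subset test: every required capability must be offered by the lane.
        if required <= lane.capabilities:
            return lane
    raise LookupError(f"no lane satisfies {sorted(required)}")


# Capability strings here are made up for the example.
lanes = [
    ProviderLane("fake", "fake-text-model", frozenset({"text_input", "text_output"})),
    ProviderLane(
        "gemini",
        "gemini-2.5-flash",
        frozenset({"text_input", "text_output", "image_input"}),
    ),
]

chosen = route(lanes, {"text_input", "image_input"})
print(chosen.provider, chosen.model)
```

The agent only states what it needs; it never names provider-specific request fields.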
Alpha reference implementation.
Implemented providers:
- fake: contract provider for local tests
- gemini: native REST provider
- groq: official SDK provider
- openai_compatible: shared base for OpenAI-compatible providers
Verified locally:
- unit tests for gateway contracts, registry, routing, trace store, Gemini, and Groq
- real Gemini text, streaming, structured output, function tools, native tools, media input, embeddings, file upload, file references, and token counting
- real Groq text, streaming, function tools, reasoning, and strict structured JSON
Generated media support is wired for Gemini catalogue entries, but real success depends on model access and quota.
- Run local contract tests with no provider keys.
- Route text requests through a deterministic fake provider.
- Call Gemini and Groq with opt-in real-provider smokes.
- Query model capability metadata for UI/model selection.
- Normalize streaming, tool calls, structured output, and usage.
- Use SQLite trace receipts for debugging and observability.
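To make "normalize streaming" concrete, here is a minimal sketch that maps two differently shaped provider chunks onto one event type. The chunk layouts are simplified versions of the OpenAI-compatible and Gemini REST streaming shapes; `TextDelta` and both converter functions are hypothetical and are not the gateway's actual event types.

```python
from dataclasses import dataclass


@dataclass
class TextDelta:
    """Hypothetical provider-neutral streaming event."""
    provider: str
    text: str


def from_openai_style(chunk, provider):
    # OpenAI-compatible chunks carry text under choices[].delta.content.
    for choice in chunk.get("choices", []):
        text = choice.get("delta", {}).get("content")
        if text:
            yield TextDelta(provider, text)


def from_gemini_style(chunk):
    # Gemini REST chunks carry text under candidates[].content.parts[].text.
    for candidate in chunk.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            if part.get("text"):
                yield TextDelta("gemini", part["text"])


groq_chunks = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": "lo"}}]},
]
gemini_chunks = [{"candidates": [{"content": {"parts": [{"text": "Hello"}]}}]}]

groq_text = "".join(e.text for c in groq_chunks for e in from_openai_style(c, "groq"))
gemini_text = "".join(e.text for c in gemini_chunks for e in from_gemini_style(c))
print(groq_text, gemini_text)
```

Downstream code can then consume a single event shape regardless of which provider produced the stream.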
python -m pip install -e .[dev]
For runtime-only usage without Groq:
python -m pip install -e .
For Groq:
python -m pip install -e .[groq]
from agentic_provider_gateway import ModelGatewayService
from agentic_provider_gateway.core.contracts import (
    Capability,
    CapabilityRequest,
    GatewayRequest,
    Identity,
    InputItem,
    InputPart,
    InputRole,
    OutputContract,
    OutputMode,
    PartType,
    RuntimeOptions,
    Target,
    TaskType,
)

gateway = ModelGatewayService.from_defaults()
request = GatewayRequest(
    identity=Identity(caller="example"),
    task=TaskType.GENERATE_TEXT,
    target=Target(model="fake-text-model", provider="fake"),
    capability=CapabilityRequest(required={Capability.TEXT_INPUT, Capability.TEXT_OUTPUT}),
    input=[
        InputItem(
            role=InputRole.USER,
            parts=[InputPart(type=PartType.TEXT, text="Say hello.")],
        )
    ],
    output=OutputContract(mode=OutputMode.TEXT),
    runtime=RuntimeOptions(stream=False),
)
response = gateway.execute(request)
print(response.status, response.output[0].text)

More examples:
- Basic fake request
- Streaming request
- Model catalogue query
- Groq structured output
- Gemini media input
apg-smoke health
apg-smoke catalogue
apg-smoke execute --model fake-text-model
apg-smoke stream --model fake-text-model
Real provider calls are opt-in:
Copy-Item .env.example .env
# Add GEMINI_API_KEY and/or GROQ_API_KEY
apg-smoke execute --provider gemini --model gemini-2.5-flash --allow-real-provider
apg-smoke stream --provider groq --model openai/gpt-oss-20b --allow-real-provider
By default, generated trace and rate-limit state is written to .apg/ in the current working directory.
Override it with:
$env:APG_STORAGE_DIR="C:\path\to\gateway-state"
Do not commit .apg/ or provider keys.
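The storage rule described here (use `APG_STORAGE_DIR` when set, otherwise `.apg/` under the current working directory) can be sketched in a few lines. `resolve_storage_dir` is a hypothetical helper written for this example; the gateway's own lookup code is not shown in this document and may differ.

```python
import os
from pathlib import Path


def resolve_storage_dir(env=None):
    """Return the state directory: APG_STORAGE_DIR if set, else ./.apg."""
    # Accept an explicit mapping so the function is easy to test.
    env = os.environ if env is None else env
    override = env.get("APG_STORAGE_DIR")
    return Path(override) if override else Path.cwd() / ".apg"


print(resolve_storage_dir({}))
print(resolve_storage_dir({"APG_STORAGE_DIR": "/tmp/gateway-state"}))
```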
The gateway keeps routing logic provider-neutral. Provider runtimes own provider-specific details:
- credentials
- API request conversion
- response parsing
- streaming conversion
- model catalogue sync
- rate-limit policy
- provider-specific tools and surfaces
Upper layers should call ModelGatewayService, not provider adapters directly.
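The boundary can be pictured with a small facade-and-adapter sketch. Everything in it is hypothetical (`ProviderAdapter`, `EchoAdapter`, the `Gateway` class, and the method names); it only illustrates why callers go through one service object, and it is far simpler than the request contract `ModelGatewayService` actually accepts.

```python
from typing import Protocol


class ProviderAdapter(Protocol):
    """Hypothetical adapter surface; the real provider contract may differ."""
    name: str

    def complete(self, model: str, prompt: str) -> str: ...


class EchoAdapter:
    """Stand-in adapter that needs no credentials."""
    name = "echo"

    def complete(self, model: str, prompt: str) -> str:
        return f"[{model}] {prompt}"


class Gateway:
    """Facade: callers name a provider and never import its adapter."""

    def __init__(self):
        self._adapters = {}

    def register(self, adapter: ProviderAdapter) -> None:
        self._adapters[adapter.name] = adapter

    def execute(self, provider: str, model: str, prompt: str) -> str:
        try:
            adapter = self._adapters[provider]
        except KeyError:
            raise LookupError(f"unknown provider: {provider}") from None
        return adapter.complete(model, prompt)


gateway = Gateway()
gateway.register(EchoAdapter())
result = gateway.execute("echo", "demo-model", "Say hello.")
print(result)
```

Because upper layers depend only on the facade, swapping or adding an adapter does not touch agent code.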
See:
- Why This Exists
- Quickstart
- Gateway Architecture
- Provider Contract
- Capability Model
- API Overview
- Request Contract
- Real Provider Smokes
- Roadmap
- OSS Checklist
python -m unittest discover -s tests
python -m compileall -q src tests examples
- Secrets are read from environment variables or local .env files.
- .env, .env.local, .apg, traces, and rate-limit state are gitignored.
- Trace storage redacts common secret-like keys before writing payloads.
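Key-based redaction of this kind might look like the sketch below. The hint list, the `[REDACTED]` placeholder, and the `redact` function are assumptions made for illustration; the trace store's actual rules are not documented here.

```python
# Substrings that mark a key as secret-like (an assumed list, not the project's).
# "token" alone is avoided so usage fields such as total_tokens survive.
SECRET_HINTS = ("api_key", "apikey", "authorization", "secret", "password", "access_token")


def _looks_secret(key):
    lowered = str(key).lower()
    return any(hint in lowered for hint in SECRET_HINTS)


def redact(value):
    """Return a copy with values under secret-like keys replaced."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if _looks_secret(key) else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


payload = {
    "headers": {"Authorization": "Bearer abc"},
    "body": {"model": "fake-text-model", "GEMINI_API_KEY": "xyz", "messages": [{"text": "hi"}]},
    "usage": {"total_tokens": 12},
}
clean = redact(payload)
print(clean)
```

Matching on key names is a best-effort filter: a secret stored under an innocuous key, or embedded in free text, would pass through unchanged.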
Apache-2.0.