A weekly brief of what's actually new from the frontier AI labs — an autonomous agent that remembers what it has already told you and reports only the delta.
OpenAI, Anthropic, and Google ship something almost every week. Keeping up means re-reading the same changelogs and news pages and mentally diffing them against last time. Frontier Brief does that for you: each week it reads the canonical sources for all three labs, compares them against a durable memory of everything it has reported before, and writes a short, cited brief of only the genuinely new items.
It's also, deliberately, a code-first study of Claude Managed Agents — no console clicking; the agent and its infrastructure are version-controlled config. A weekly digest is a small, low-stakes workload that happens to exercise the most distinctive parts of the platform: persistent cross-session memory, server-side web tools, the session/event lifecycle, and a reconcilable control plane.
Why Managed Agents, concretely: a coordinator fans out per-provider research to isolated subagent threads, each fetching evolving, self-healing source URLs inside a sandbox (a hijacked link can't reach your infrastructure) and rewriting its own slice of a shared memory store that carries state across runs. It's the multi-agent runtime — orchestration, tool harness, sandbox, and persistence — you'd otherwise build and secure yourself.
A cited, deduplicated Markdown brief, assembled by the coordinator from its researchers' findings. An excerpt from a real run (2026-05-28, Claude Sonnet 4.6, ~$0.65 — 18 OpenAI / 15 Anthropic / 12 Gemini items across three parallel researcher threads):
# Frontier Brief — 2026-05-28
## OpenAI
- Workload identity federation — exchange external identity tokens for short-lived
access tokens, no stored API keys. (developers.openai.com/…/workload-identity-federation)
## Anthropic
- Claude Platform on AWS — the Claude API on Anthropic-managed infra via AWS, with AWS
billing + IAM auth. (platform.claude.com/…/claude-platform-on-aws)
- Fast mode now supports Claude Opus 4.7. (platform.claude.com/…/fast-mode)
## Google Gemini
- Gemini 3.5 Flash GA — beats 3.1 Pro on agentic/coding, ~4× faster output, now the
default behind the Gemini app and AI Mode in Search. (deepmind.google/blog/…)
- Gemini Omni Flash — video generation from text/image/audio/video, multi-turn editing.
Every item is cited; nothing reported in a previous week appears again.
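The "only the delta" guarantee comes down to checking each fetched item against a ledger of what has already been reported. Below is a minimal local sketch of that idea, not the project's actual ledger format: the `Item` fields, the `item_key` normalization, and the in-memory `set` standing in for the memory store are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One reported item. Field names are illustrative, not the project's schema."""
    url: str
    title: str


def item_key(item: Item) -> str:
    # Assumption: the URL identifies an item. Normalizing case and a trailing
    # slash keeps trivial variations from defeating deduplication.
    return item.url.strip().rstrip("/").lower()


def new_items(fetched: list[Item], ledger: set[str]) -> list[Item]:
    """Return items whose key is not yet in the ledger, recording them as seen."""
    fresh = []
    for item in fetched:
        key = item_key(item)
        if key not in ledger:
            ledger.add(key)
            fresh.append(item)
    return fresh
```

Lowercasing the whole URL is a simplification; a source with case-sensitive paths would need a gentler normalization.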
A coordinator + a generalized research subagent. The frontier-brief coordinator reads
the provider list, dispatches one frontier-brief-researcher per provider as parallel,
context-isolated threads, collects their findings, and assembles the brief. Each researcher
owns its provider's slice of a shared memory store, so the threads never step on each other.
Adding a provider is a config edit, not a code change.
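The fan-out and assembly pattern can be pictured with a local analogy. This sketch does not use the Managed Agents API: `research` is a hypothetical stand-in for a researcher thread, and a thread pool plays the role of the platform's parallel, isolated subagent threads.

```python
from concurrent.futures import ThreadPoolExecutor


def research(provider: str) -> list[str]:
    # Hypothetical stand-in for a researcher; a real one would fetch its
    # provider's sources and diff them against its own ledger.
    return [f"{provider}: example item"]


def assemble_brief(providers: list[str], title: str) -> str:
    """Fan out one research call per provider, then assemble a Markdown brief."""
    with ThreadPoolExecutor(max_workers=max(1, len(providers))) as pool:
        # pool.map preserves input order, so sections follow the provider list.
        findings = list(pool.map(research, providers))
    lines = [f"# {title}"]
    for provider, items in zip(providers, findings):
        lines.append(f"## {provider}")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

Because the provider list is plain data, adding a fourth provider here means appending one string, which mirrors the config-edit property described above.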
Two layers, kept deliberately separate:
- Config: in git, reconciled by apply. The agents, environment, and memory-store definition, each a small YAML file. apply is a thin name-as-key reconciler: create if absent, update an agent only on a real diff, resolve the coordinator's roster names→ids, and never touch a store's contents.
- State: server-side, owned by the agents. The memory store's contents: per-provider ledgers, watermarks, and self-healing source registries. Git never syncs this back; the only path in is a one-way seed, applied once when the store is created.
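The name-as-key reconciliation rule (create if absent, update only on a real diff) can be sketched as a pure planning function. This is an illustration under assumptions, not the code in scripts/apply.py: the `plan` name, the dict-shaped definitions, and the decision to ignore live resources that are missing from config are all hypothetical.

```python
def plan(desired: dict[str, dict], live: dict[str, dict]) -> list[tuple[str, str]]:
    """Compute (action, name) pairs that bring live config in line with desired."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
        # Identical definitions produce no action, so re-applying is a no-op.
    # Assumption: live resources absent from config are left alone here;
    # deletion is a separate, explicit step.
    return actions
```

Keeping the plan separate from its execution is what makes a preview mode cheap: printing the returned list involves no writes.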
local harness → run.py → coordinator session (memory store mounted)
→ coordinator reads /providers.yaml → delegates a researcher per provider (parallel)
→ each researcher: read its slice → fetch → self-heal → diff → write its ledger
→ return new items
→ coordinator assembles + emits the brief
Sources move — pages get reorganized, feeds change paths. So each researcher keeps its provider's source list in memory and repairs it: when a fetch comes back broken or redirected, it rediscovers the canonical page, validates it, and — only if the replacement stays on the source's trusted domains — updates the registry and notes the change in the brief. Untrusted web content never becomes trusted memory without that guard.
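The trusted-domain guard is the piece that keeps a redirected or hijacked link out of the registry. A minimal sketch of such a check follows, assuming the registry is a plain dict and that "trusted" means the exact domain or any subdomain of it; the `is_trusted` and `heal` names are hypothetical, and the real agent applies this rule through its instructions rather than through this code.

```python
from urllib.parse import urlsplit


def is_trusted(url: str, trusted_domains: set[str]) -> bool:
    """True if the URL's host is a trusted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    # Matching on a leading dot prevents lookalikes such as "notexample.com"
    # from passing as "example.com".
    return any(host == d or host.endswith("." + d) for d in trusted_domains)


def heal(registry: dict[str, str], source: str, candidate: str,
         trusted_domains: set[str]) -> bool:
    """Replace a source's URL only if the candidate passes the domain guard."""
    if not is_trusted(candidate, trusted_domains):
        return False  # Registry is left untouched.
    registry[source] = candidate
    return True
```

Comparing parsed hostnames rather than substrings of the raw URL matters: a candidate whose host merely begins with a trusted name, followed by an attacker-controlled suffix, is rejected.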
- Python 3.11+
- An Anthropic API key with Claude Managed Agents access (enabled by default on API
accounts). Copy .env.example to .env and set ANTHROPIC_API_KEY.
pip install -r requirements.txt # set ANTHROPIC_API_KEY — see .env.example
python scripts/apply.py --plan # preview control-plane changes (no writes)
python scripts/apply.py # reconcile coordinator + researcher + env + seeded store
python scripts/run.py # produce one brief now
python scripts/teardown.py # drain sessions + delete it all (reset / recover)

These three scripts are the whole harness. This is a local exploration of Managed Agents, so CI, scheduling, and external delivery are intentionally out of scope.
A working multi-agent exploration of Claude Managed Agents, run from a local harness. The
coordinator + researcher, environment, and seeded memory store are applied to a live
workspace; apply → teardown → re-apply is validated; and an end-to-end run confirmed
delegation (coordinator → three researcher threads → results back), each researcher writing
its own provider ledger. CI, scheduling, and external delivery are intentionally out of
scope — see docs/DESIGN.md.
- docs/DESIGN.md: architecture and the reasoning behind it.
- docs/REFERENCES.md: the Managed Agents documentation this derives from, plus a change-watch for tracking the beta.