arodmor/arc-agi-3

arc-agi-3

The interactive, agentic track of ARC Prize 2026 — why it's a genuine departure from static ARC.

License: MIT · Prose: CC BY 4.0 · Hub: arc-agi · Website · Verified

Why this repo

A focused look at the ARC-AGI-3 track of ARC Prize 2026 — the interactive, agentic benchmark, and the most genuinely novel part of the 2026 cycle. Background and the wider story live in the arc-agi hub; this repo zooms in on the interactive track.

Round 1 — public explainer. This first pass is a sourced explainer of the track. A later round will add the working material — an approach write-up, an agent implementation with environment notes, and experimentation notebooks — see the Roadmap.

Dating discipline. Competition details change. Claims below are dated, linked in Sources, and were re-verified on 2026-06-20. Re-check arcprize.org and the ARC-AGI-3 docs before relying on any figure.

What makes ARC-AGI-3 different

ARC-AGI-1 and -2 are static: you see input→output grid examples and produce an output. ARC-AGI-3 is interactive. An agent is dropped into a novel, turn-based environment with no instructions and has to work everything out by acting. It must, on its own:

  • explore — take actions to gather information about how the environment behaves;
  • model — build an internal theory of the environment's dynamics from what it sees;
  • set goals — infer what "success" even means (nothing tells it), then plan toward it.
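The three steps above can be sketched on a toy problem. The snippet below is a minimal illustration under loud assumptions: `ToyEnv` is a hypothetical 1-D corridor, not the ARC-AGI-3 API, and it hands back a completion flag, which is only a guess at the kind of feedback an agent might get. The agent tries every untested action (explore), records each transition in a table (model), remembers where the completion flag fired (set goals), and then plans a route with breadth-first search over what it has learned.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple


class ToyEnv:
    """Hypothetical stand-in environment: a 1-D corridor with a hidden goal cell."""

    ACTIONS = ("a", "b")  # opaque labels; the agent is not told what they do

    def __init__(self, size: int = 5, start: int = 2, goal: int = 4) -> None:
        self.size, self.start, self.goal = size, start, goal
        self.pos = start

    def reset(self) -> int:
        self.pos = self.start
        return self.pos

    def step(self, action: str) -> Tuple[int, bool]:
        delta = -1 if action == "a" else 1
        self.pos = min(max(self.pos + delta, 0), self.size - 1)  # walls clamp movement
        return self.pos, self.pos == self.goal


Model = Dict[Tuple[int, str], int]  # (state, action) -> observed next state


def bfs_path(model: Model, start: int, is_target: Callable[[int], bool]) -> Optional[List[str]]:
    """Shortest action sequence to a target state, using only learned transitions."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if is_target(state):
            return path
        for (s, a), nxt in model.items():
            if s == state and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [a]))
    return None


def explore_and_plan(env: ToyEnv, max_steps: int = 100):
    model: Model = {}
    goal: Optional[int] = None
    start = env.reset()
    state = start

    def has_untried(s: int) -> bool:
        return any((s, a) not in model for a in env.ACTIONS)

    for _ in range(max_steps):
        path = bfs_path(model, state, has_untried)
        if path is None:
            break  # every reachable (state, action) pair has been tried
        if path:
            action = path[0]  # walk toward the nearest state with untried actions
        else:
            action = next(a for a in env.ACTIONS if (state, a) not in model)
        nxt, success = env.step(action)
        model[(state, action)] = nxt  # model: record the observed dynamics
        if success:
            goal = nxt  # set goals: remember where the success signal fired
        state = nxt

    plan = bfs_path(model, start, lambda s: s == goal) if goal is not None else None
    return model, goal, plan


if __name__ == "__main__":
    model, goal, plan = explore_and_plan(ToyEnv())
    print(goal, plan)  # prints: 4 ['b', 'b']
```

Exhaustive exploration only works because this toy has five states and deterministic moves; the point of the track is that real environments are too large for that, so exploration has to be efficient rather than complete.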

That's much closer to a person picking up a video game they've never played than to solving a puzzle from worked examples — and it's where current AI is weakest. A side-by-side with the static track is in the hub's ARC-AGI-2 vs -3 comparison.

The state of play (March 2026 launch)

  • Humans solved 100% of the environments.
  • Frontier LLMs used directly scored below 1%: e.g. Gemini 3.1 Pro ~0.37%, Claude Opus 4.6 ~0.2%.
  • The top preview agent reached ~12.6% — and it was a purpose-built agent, not a frontier language model.

The signal worth sitting with: raw model scale did not buy interactive competence. Efficient exploration of an unknown world is not what next-token pretraining optimises for, and the early lead went to a system designed for the task rather than the biggest model pointed at it.

How the competition is structured

  • Milestone prizes at checkpoints on June 30, 2026 and September 30, 2026 (each: 1st $25K · 2nd $10K · 3rd $2.5K).
  • A public community leaderboard intended to support harness / agent research, not just final standings.
  • Sandboxed & open-source, like every track: no internet / no hosted-API calls during scoring; prize-eligible work must be open-sourced (CC0 / MIT-0 in practice).
  • Key dates (2026): opens Mar 25, submissions due Nov 2, winners announced Dec 4.

The environment is accessed via the ARC-AGI-3 API/SDK (see the official docs); any credentials belong in .env (gitignored), and pulled environment assets should never be committed.
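One way to keep credentials out of the repository is to read them from the process environment, filling it from a local `.env` file when one exists. The sketch below uses only the Python standard library. The variable name `ARC_API_KEY` is a hypothetical placeholder, not a name taken from the official docs, so check the docs for the real one.

```python
import os
from pathlib import Path
from typing import Dict


def load_dotenv_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines; variables already set in the environment win."""
    loaded: Dict[str, str] = {}
    env_file = Path(path)
    if not env_file.is_file():
        return loaded  # no .env present, rely on the real environment
    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key:
            loaded[key] = value
            os.environ.setdefault(key, value)
    return loaded


def require_secret(name: str = "ARC_API_KEY") -> str:
    """Fetch a secret or fail loudly; the default name is a hypothetical placeholder."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to .env or export it")
    return value
```

Calling `load_dotenv_file()` once at start-up and then `require_secret()` wherever the key is needed keeps the secret out of source files. The `.env` file itself still has to be listed in `.gitignore` for this to help.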

Roadmap

The approach I'm exploring is an adaptation and implementation of Yann LeCun's JEPA world-model architecture — from his 2022 position paper A Path Towards Autonomous Machine Intelligence. JEPA (Joint Embedding Predictive Architecture) learns to predict the future in a latent representation space rather than reconstructing raw observations, which gives an agent a compact world model it can use to anticipate dynamics and plan — a natural fit for ARC-AGI-3's explore → model → goal-set loop. Specifically, I'm exploring an object-centric world model (structuring the latent state around discrete objects/entities) adapted to the ARC-AGI-3 environments. This is a direction under active exploration, not a finished result.
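To make "predict in latent space rather than reconstruct raw observations" concrete, here is a deliberately tiny sketch of a JEPA-style prediction loss. Everything in it is an illustrative assumption: the linear encoder and predictor, the one-hot action encoding, and all names are hypothetical, and none of it is the planned implementation or code from the position paper. It has no training loop either, and a real trainer would also need a mechanism to stop the encoder collapsing to a constant representation.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def encode(obs: Vector, enc_w: Matrix) -> Vector:
    """Map a raw observation to a latent vector (toy linear encoder)."""
    return matvec(enc_w, obs)


def predict(latent: Vector, action: Vector, pred_w: Matrix) -> Vector:
    """Predict the next latent from the current latent concatenated with the action."""
    return matvec(pred_w, latent + action)


def latent_prediction_loss(obs: Vector, action: Vector, next_obs: Vector,
                           enc_w: Matrix, pred_w: Matrix) -> float:
    """Mean squared error between predicted and encoded next latent.

    The error is measured between latent vectors, never between raw observations.
    """
    z_pred = predict(encode(obs, enc_w), action, pred_w)
    z_target = encode(next_obs, enc_w)  # would be treated as a fixed target in training
    return sum((p - t) ** 2 for p, t in zip(z_pred, z_target)) / len(z_pred)


# Toy data: a marker at cell 0 of a 2-cell strip moves to cell 1 under action [1, 0].
identity_encoder = [[1.0, 0.0], [0.0, 1.0]]
shift_predictor = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 1.0, 0.0]]  # models the move
static_predictor = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # ignores the action

good = latent_prediction_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                              identity_encoder, shift_predictor)
bad = latent_prediction_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                             identity_encoder, static_predictor)

if __name__ == "__main__":
    print(good, bad)  # prints: 0.0 1.0
```

The predictor that captures the action's effect scores zero and the one that ignores the action scores higher, which is the signal a world model would be trained on. An object-centric variant would replace the flat latent vector with one latent per detected object.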

Planned working material for the next round:

arc-agi-3/
├── docs/approach.md   # the object-centric JEPA world model — design + rationale
├── agent/             # agent implementation + observations on the environment / API
└── notebooks/         # experimentation in the public preview environments

Note for that work: access is via API — keep credentials in .env (gitignored), don't commit private content, and decide how much of the harness to publish during the live competition.

Siblings


Sources

Re-verified 2026-06-20. Re-check before reuse.

Prose and figures in this repo are © 2026 Antonio Rodriguez-Moral, licensed CC BY 4.0; code is MIT.


🌐 arodmor.me · 💻 github.com/arodmor · ✉️ antonio.rodriguez.moral@pm.me

Part of a series: AI/ML Lab · voice-ai-landscape · arc-agi · recursive-reasoning-models
