🤖 Awesome Autonomous AI Agents

A curated list of truly exceptional autonomous AI agent frameworks, tools, benchmarks, and papers.

Frameworks · OpenClaw Ecosystem · Coding Agents · Multi-Agent · Web & Browser · Chinese Contributions · Benchmarks · Papers

Why this list?

The AI agent space exploded in 2023 and hasn't stopped. Hundreds of repos claim to be "autonomous agent frameworks." Most are weekend wrappers around API calls. This list only includes projects that moved the field forward — frameworks with real architectural ideas, benchmarks that changed how we evaluate agents, and papers that introduced paradigms now used everywhere.

Every entry was verified for active development status, genuine technical contribution, and community adoption. Chinese-origin projects are marked with 🇨🇳 — the Chinese AI ecosystem has produced some of the most impactful agent work and deserves first-class visibility.

Selection criteria: Novel architecture OR significant community adoption (1,000+ ★) OR foundational paper (100+ citations) OR production deployment at scale. No vaporware, no thin wrappers.

🏗️ General-Purpose Frameworks
🐾 The OpenClaw Ecosystem
💻 Coding Agents
👥 Multi-Agent Systems
🌐 Web & Browser Agents
🖥️ Computer Use & GUI Agents
🔬 Research & Science Agents
🧠 Memory & Knowledge Systems
🔧 Tool-Use & Function Calling
🧩 Planning & Reasoning Frameworks
🇨🇳 Chinese Contributions
📊 Benchmarks & Evaluation
🔗 Protocols & Standards
📄 Foundational Papers
📚 Surveys & Curated Lists
Contributing

🏗️ General-Purpose Frameworks

Project	Stars	Description	Key Differentiator
AutoGPT	~183k ★	The original autonomous GPT-4 agent (April 2023), now evolved into a visual agent-builder platform with marketplace and cloud deployment.	Pioneered autonomous agents — defined the category; most-starred agent repo on all of GitHub.
LangGraph	~27k ★	Low-level graph-based orchestration for stateful agents with durable execution, human-in-the-loop, and streaming.	Most downloaded agent framework (~34.5M monthly PyPI downloads); Pregel/Apache Beam–inspired execution model.
AutoGen / AG2	~54k ★	Microsoft's multi-agent conversation framework; AG2 (ag2ai/ag2) is the community fork continuing active development.	Best Paper at ICLR 2024 LLM Agents Workshop; defined the multi-agent conversation paradigm. Now merging into Microsoft Agent Framework with Semantic Kernel.
CrewAI	~44k ★	Standalone framework for orchestrating role-playing AI agents with "Crews" (collaborative teams) and "Flows" (event-driven pipelines).	1.4B agentic automations claimed; 60% of Fortune 500 as users. Fully independent of LangChain.
Semantic Kernel	~27k ★	Microsoft's enterprise SDK for .NET, Python, and Java with planners, function-calling, and deep Azure integration.	Enterprise-first — merging with AutoGen into unified Microsoft Agent Framework (GA Q1 2026).
smolagents	~25k ★	HuggingFace's minimal agent library where agents write Python code as actions. ~1,000 lines of core logic.	Radical simplicity — code-as-action paradigm with Hub integrations for sharing tools. Sandboxed execution via E2B/Docker.
Agno	~26k ★	High-performance multi-modal agent runtime (formerly Phidata) with memory, knowledge graphs, and MCP support.	Claims 5,000× faster than LangGraph; levels 0–3 of progressive agent autonomy.
PydanticAI	~15k ★	Type-safe, code-first agent framework using Python type hints — the "FastAPI for agents."	Zero magic — built for senior Python teams wanting explicit control over every agent decision.
Open Interpreter	~62k ★	Natural-language interface that lets LLMs run code locally — open-source alternative to OpenAI Code Interpreter.	Full local access — no internet restrictions, file size limits, or time limits. Supports `--os` mode for computer control.
OpenAI Agents SDK	~19k ★	Lightweight, production-ready Python SDK (evolution from Swarm) with built-in web search, file search, and computer use tools.	Official OpenAI agent primitive — designed as the canonical way to build agents with OpenAI models.
Google ADK	~17k ★	Code-first agent toolkit optimized for Gemini with Agent-to-Agent (A2A) protocol support and MCP Toolbox for Databases.	A2A-native — first framework with built-in support for Google's agent interoperability protocol. Model-agnostic via LiteLLM.
Mastra	~19k ★	TypeScript-first agent framework from the Gatsby team with workflows, RAG, evals, and tool calling. 300k+ weekly npm downloads.	Fills the JS/TS gap — native OpenTelemetry, workflow engine, and first-class TypeScript support.
Strands Agents	Growing	AWS's model-agnostic agent framework with optional deep Bedrock integrations and first-class OpenTelemetry tracing.	AWS-native — production-ready with deep AWS service integrations while remaining model-agnostic.
DSPy	~33k ★	Stanford's framework for programming (not prompting) LLMs — compiles declarative calls into self-improving pipelines with optimizers.	Paradigm-shifting — replaces prompt engineering with compiled, optimizable programs. ICLR 2024 paper. MIPROv2 optimizer.
Haystack	~23k ★	Production-ready orchestration framework for NLP/AI pipelines with retrieval, generation, and agent components.	Mature production tooling — battle-tested since pre-LLM era; comprehensive pipeline abstraction for RAG and agents.

🐾 The OpenClaw Ecosystem

OpenClaw is the fastest-growing open-source project in GitHub history, spawning 52+ derivatives across 9 programming languages as of March 2026. Community directory: shelldex.com.

Core Projects

Project	Stars	Description	Key Differentiator
OpenClaw	~337k ★	Messaging-native autonomous personal AI agent connecting through 20+ platforms (WhatsApp, Telegram, Slack, Discord, Signal, iMessage, WeChat, etc.). 8 core agents, SOUL.md personality config, ClawHub skill marketplace (13,700+ skills).	Defined a new category — always-on, messaging-native autonomous agents. Beat React's 10-year star record in ~60 days. Created by Peter Steinberger; transferred to independent foundation after he joined OpenAI.
NemoClaw	~16k ★	NVIDIA's enterprise security wrapper for OpenClaw — adds OS-level sandboxing (Landlock + seccomp + network namespaces), default-deny networking, YAML policy engine, and privacy routing to local Nemotron models.	Enterprise-grade security layer — not a standalone agent but the missing security infrastructure for OpenClaw. Launched at GTC 2026. Hardware-agnostic despite NVIDIA origin.
Hermes Agent	~14k ★	NousResearch's self-improving personal AI agent with built-in RL training via Atropos, GEPA-based skill evolution, and cross-session memory. Multi-platform gateway (Telegram, Discord, Slack, WhatsApp, Signal).	Only agent with a closed-loop learning system — uses GEPA (ICLR 2026 Oral) + DSPy to optimize its own skills from execution traces. `hermes claw migrate` imports from OpenClaw.

Major Rewrites & Alternatives

Project	Stars	Language	Key Differentiator
Nanobot	~33k ★	Python	Ultra-lightweight (~4,000 lines vs OpenClaw's ~430,000); research-ready with minimal dependencies.
ZeroClaw	~27k ★	Rust	99% less RAM than OpenClaw; 10ms cold boot; targets edge/IoT devices.
NanoClaw	~26k ★	TypeScript	Container-first security — every agent runs in an isolated Linux container by default.
PicoClaw	~13k ★	Go	Runs on $10 hardware with <10MB RAM; built for embedded and edge deployments.
OpenFang	Growing	Rust	"Agent Operating System" with 7 autonomous modules, 38 tools, and 40 messaging channels.
Moltis	~2k ★	Rust	Enterprise Rust — 150K lines with zero unsafe blocks; native Prometheus/Grafana observability.

Ecosystem Tools

Project	Description
awesome-openclaw-agents	187 production-ready agent templates across 24 categories.
hermes-agent-self-evolution	GEPA-based self-improvement module for Hermes Agent.
claw0	Educational 10-section tutorial for building an OpenClaw-compatible agent from scratch.
ClawWork	Agents that earn income from professional tasks and pay for their own token usage — earned $15K in 11 hours.

💻 Coding Agents

Project	Stars	Description	Key Differentiator
OpenHands	~69k ★	Open-source AI software development platform — modifies code, runs terminals, browses the web, executes multi-step dev tasks in Docker sandboxes.	Most-starred coding agent on GitHub. 77.6% on SWE-bench Verified. Model-agnostic. $18.8M Series A (All Hands AI).
Aider	~41k ★	AI pair programming in your terminal — every AI edit auto-committed to git. Builds a map of your entire codebase for context.	Git-native workflow — terminal-first, BYOM, top SWE-bench scores. Aider wrote 21% of its own recent code.
GPT-Engineer	~55k ★	Specify what you want built and it generates the entire codebase. Asks clarifying questions, writes specs, then codes.	Precursor to Lovable.dev — one of the first "prompt-to-codebase" tools. Now mostly archived; commercial product spun out.
GPT-Pilot	~33k ★	AI developer with "95% AI / 5% human" paradigm — writes full features while keeping human-in-the-loop for review. Agent roles: Architect, Developer, etc.	Human-in-the-loop by design — YC W24 backed. Powers the Pythagora VS Code extension.
SWE-agent	~19k ★	Takes a GitHub issue and automatically fixes it. Introduced the Agent-Computer Interface (ACI) concept. Also handles cybersecurity CTFs via EnIGMA mode.	NeurIPS 2024 paper. SoTA on SWE-bench with Claude 3.7. Open-weights SWE-agent-LM-32b. Princeton + Stanford.
Devin	N/A	First widely-publicized fully autonomous AI software engineer — plans, codes, debugs, deploys with its own shell, editor, and browser.	Proprietary but category-defining. 13.86% on SWE-bench at launch (vs 1.96% prior SOTA). Cognition valued at $billions. Acquired Windsurf.
Cursor	N/A	AI-native code editor (VS Code fork) with Tab completion, Composer multi-file edits, and agent mode.	$1B ARR in <24 months, $29.3B valuation. 50%+ of Fortune 500. Proprietary but sets the standard for AI-assisted coding.

👥 Multi-Agent Systems

Project	Stars	Description	Key Differentiator
MetaGPT 🇨🇳	~58k ★	Simulates an entire software company — product managers, architects, and engineers collaborate via SOPs to turn one-line requirements into working code.	ICLR 2024 Oral (top 1.2%). Philosophy: "Code = SOP(Team)." Also produced AFlow (ICLR 2025 oral). DeepWisdom (Chinese company).
ChatDev 🇨🇳	~32k ★	Chat-powered virtual software company. Now ChatDev 2.0: zero-code multi-agent orchestration platform for "Developing Everything."	NeurIPS 2025 paper on evolving orchestration. From Tsinghua/OpenBMB. Bilingual docs (CN/EN).
CAMEL 🇨🇳	~15k ★	Pioneering role-playing framework for multi-agent cooperation — agents assume roles and collaborate via inception prompting.	First multi-agent framework (March 2023). NeurIPS 2023. Training data used in Microsoft Phi and OpenHermes. OWL subproject hit #1 on GAIA benchmark.
AgentVerse 🇨🇳	~5k ★	Dual-framework design: task-solving mode (collaborative problem-solving) and simulation mode (observing emergent behaviors).	Unique simulation capabilities — observe and study emergent multi-agent behaviors. OpenBMB/Tsinghua.
Generative Agents	~21k ★	25 AI agents inhabiting "Smallville" — they form relationships, gossip, coordinate events, all emerging from memory + reflection + planning.	Landmark UIST 2023 paper. Stanford (Park, Liang, Bernstein). Introduced "believable simulacra of human behavior."
Agency Swarm	Growing	Framework for creating collaborative AI agent swarms with role-based organization and inter-agent communication.	Swarm-native — agents self-organize with customizable communication topologies.

🌐 Web & Browser Agents

Project	Stars	Description	Key Differentiator
Browser Use	~55k ★	Open-source framework enabling LLMs to control browsers for web automation. Claims 89% on WebVoyager (SOTA).	Y Combinator backed. Released open-source bu-30b model. MIT licensed. Cloud offering available.
GPT Researcher	~26k ★	Autonomous deep research agent — scrapes and synthesizes 20+ web sources to produce factual, cited research reports.	Original open-source deep research agent (May 2023, predating the "deep research" trend by ~2 years). Created by Tavily founder.
BrowserGym	Growing	Unified gym-like environment for web agent research — standardized observation/action spaces across WebArena, MiniWoB, WorkArena, etc.	TMLR 2025 paper. ServiceNow. De facto standard for web agent evaluation infrastructure.
MindSearch 🇨🇳	~5k ★	LLM-based multi-agent web search engine (like Perplexity Pro). Supports DuckDuckGo, Bing, Brave, Google, Tencent search backends.	Open-source Perplexity alternative from Shanghai AI Lab. Bilingual (CN/EN) query support.

🖥️ Computer Use & GUI Agents

Project	Stars	Description	Key Differentiator
Agent TARS / UI-TARS 🇨🇳	~10k ★	ByteDance's multimodal AI agent stack — CLI and Web UI for browser automation + native GUI automation with remote computer/browser operators.	Pioneering GUI agent paper (arXiv:2501.12326). Ships as both CLI agent and desktop application.
AppAgent 🇨🇳	~5k ★	Multimodal agent that operates smartphone apps through tap/swipe — two-phase approach: exploration (learning) then deployment (executing).	Mobile-first agent. Tencent. Tested on 50 tasks across 10 apps. Supports GPT-4V and Qwen-VL-Max.
Voyager	~7k ★	First LLM-powered embodied lifelong learning agent in Minecraft — continuously explores, acquires skills, and makes discoveries without human intervention.	NeurIPS 2023 paper. NVIDIA (Jim Fan). 3.3× more unique items, 15.3× faster tech tree milestones vs prior SOTA. Code as action space.
OSWorld	Research	First scalable real computer environment benchmark for multimodal agents — 369 tasks across Ubuntu, Windows, macOS.	NeurIPS 2024. Best agents achieve ~30% vs much higher human baselines. The hardest computer-use benchmark.

🔬 Research & Science Agents

Project	Stars	Description	Key Differentiator
BabyAGI	~20k ★	Pioneering AI task management system (140 lines of code) — creates, prioritizes, and executes tasks in a loop. Now on BabyAGI 3 with scheduling.	Defined task-driven agents (March 2023). Created by VC Yohei Nakajima. One of the first viral autonomous agent concepts.
Tongyi DeepResearch 🇨🇳	Growing	Leading open-source deep research agent using on-policy RL with Group Relative Policy Optimization.	End-to-end RL approach to deep research — not prompt-engineered. Alibaba NLP. Compatible with ReAct and IterResearch paradigms.
OpenAgents	~5k ★	Three specialized agents: Data Agent, Plugins Agent (200+ tools), Web Agent — open-source ChatGPT Plus alternative.	COLM 2024 paper. XLANG Lab (HKU). First attempt at replicating full ChatGPT Plus as open source.

🧠 Memory & Knowledge Systems

Project	Stars	Description	Key Differentiator
Letta (MemGPT)	~22k ★	Platform for stateful agents with self-editing memory inspired by OS virtual memory — two-tier architecture: core memory + archival memory.	Foundational memory paper ("MemGPT: Towards LLMs as Operating Systems"). UC Berkeley. $10M seed (Felicis). Transitioning to Letta V1 for GPT-5/Claude 4.5.
Mem0	Growing	Memory layer for AI agents with personalization — short-term + long-term memory via conversation history and user preferences.	Plug-and-play memory — flexible embedding and search configs. Works with any agent framework.
Zep	Growing	Session-based memory with automatic summarization, graph-based knowledge extraction, and fact extraction.	Graph-based knowledge memory — maintains conversation transcripts as memory blocks with automated summarization.

🔧 Tool-Use & Function Calling

Project	Stars	Description	Key Differentiator
Gorilla	~12k ★	LLM fine-tuned for API/function calling. Hosts the Berkeley Function Calling Leaderboard (BFCL V4 Agentic). GoEx runtime for safe execution.	BFCL is the industry standard for evaluating function-calling in LLMs. Apache 2.0. UC Berkeley.
ToolBench / ToolLLM 🇨🇳	Growing	Training framework for LLMs to master 16,000+ real-world APIs with DFSDT (Depth-First Search Decision Tree) reasoning.	ICLR 2024. Introduced backtracking in tool-use reasoning. Three-stage pipeline: API collection → instruction gen → solution annotation.
Composio	Growing	Integration platform: 90+ tool connections for AI agents with managed authentication and hierarchical task execution.	Managed auth + 90 integrations — solves the "connecting agents to real tools" problem out of the box.
AgentLego 🇨🇳	~500 ★	Versatile tool API library extending agents with multimodal capabilities — vision, image gen/edit, speech, VL reasoning.	Multimodal tool library from Shanghai AI Lab. Integrates with LangChain, Transformers Agents, and Lagent.

🧩 Planning & Reasoning Frameworks

Project	Stars	Description	Key Differentiator
DeerFlow 2.0 🇨🇳	~45k ★	ByteDance's "SuperAgent harness" for research, coding, and creative tasks. v2.0 is a complete rewrite with sandboxed execution, persistent memory, sub-agent orchestration.	Hit #1 GitHub Trending and reached 45k stars in weeks. Built on LangGraph. MIT license. Supports Doubao, DeepSeek, Kimi, GPT-4o, Claude.
XAgent 🇨🇳	~8k ★	Autonomous agent with three-component architecture: Dispatcher (routing), Planner (milestones), Actor (execution). Proactive human collaboration via AskForHumanHelp tool.	Dispatcher-Planner-Actor paradigm. OpenBMB/Tsinghua. Docker-sandboxed. ToolServer with file editor, Python notebooks, shell, web browser.
TaskWeaver	~6k ★	Microsoft's code-first agent for data analytics — converts natural language to executable Python with two-layer planning (Planner → Code Generator → Executor).	Data analytics specialist — unique focus on rich data structures (pandas DataFrames). Stateful conversations. ⚠️ Archived.
SuperAGI	~17k ★	Full-featured agent platform with toolkit marketplace, GUI dashboard, Docker deployment, and multi-agent workflows.	Early marketplace approach for agent toolkits and templates. Dev has slowed significantly since mid-2024.
AgentGPT	~36k ★	First browser-based autonomous agent UI — name a custom AI and set it on any goal. YC-backed.	Pioneered browser-based agent UX. ⚠️ Archived — Reworkd team pivoted to data extraction.

🇨🇳 Chinese Contributions

China has produced some of the most impactful agent research and production frameworks. This section highlights projects not already featured above.

Alibaba Ecosystem

Project	Stars	Description	Key Differentiator
AgentScope	~20k ★	Production-ready multi-agent framework with ReAct agents, MCP/A2A support, real-time voice, memory management, and evaluation.	Broadest Chinese agent ecosystem — AgentScope-Java (2.2k ★), ReMe memory kit, CoPaw assistant, Trinity-RFT for agentic RL. Biweekly community meetings.
CoPaw	~13k ★	Personal AI assistant workstation — multi-channel (DingTalk, Feishu, QQ, Discord, iMessage), local LLM support via llama.cpp/MLX, cron scheduling.	Chinese-platform native — first-class support for DingTalk, Feishu, QQ, WeCom. Built on AgentScope.
Qwen-Agent	~13k ★	Official agent framework for the Qwen model family — function calling, MCP, Code Interpreter, RAG (1M+ token context), Chrome extension.	Qwen ecosystem's agent layer. Supports Qwen3, Qwen3-Coder, QwQ. Includes DeepPlanning evaluation benchmark.
ModelScope-Agent	~4k ★	Lightweight framework with MCP support, deep research (55.43 on DeepResearch Bench), code generation, and AgentFabric for custom agent creation.	Alibaba's model-hub-native agent with Anthropic Agent Skills protocol support.
Qwen Code	Growing	Terminal AI agent for coding (CLI), forked from Gemini CLI and optimized for Qwen models.	Qwen's answer to Claude Code — terminal-first coding agent with Qwen optimization.

ByteDance Ecosystem

Project	Stars	Description	Key Differentiator
Trae Agent	~2k ★	LLM-based agent for general software engineering — features "Lakeview" for concise step summarization, trajectory recording, YAML config.	Research-friendly design with trajectory recording for academic analysis. Multi-LLM support including Doubao.

Shanghai AI Lab / InternLM

Project	Stars	Description	Key Differentiator
Lagent	~2k ★	Lightweight, PyTorch-inspired agent framework with imperative Pythonic style, async/sync dual interface, ReAct/ReWOO agents.	PyTorch-like API for agent building. InternLM ecosystem integration. HTTP deployment for distributed multi-agent apps.
HuixiangDou	Growing	Domain-specific QA agent — technical group chat assistant for answering questions in specialized domains.	Domain QA specialist from Shanghai AI Lab's InternLM team.

Tencent

Project	Stars	Description	Key Differentiator
C3-Benchmark	Recent	Comprehensive agent evaluation covering all action spaces — bilingual (CN/EN) data generation, controllable task generation.	Most analysis dimensions among agent eval frameworks. Tencent Hunyuan. Multi-agent data generation.

Community & Education

Project	Description
Hello-Agents	Chinese-language educational tutorial for building AI agents from scratch. Datawhale community.
awesome-hermes-agent	Curated resource list for the Hermes Agent ecosystem.

📊 Benchmarks & Evaluation

Benchmark	GitHub / Link	Paper	Key Differentiator
SWE-bench	arXiv:2310.06770	ICLR 2024 Oral. 2,294 real GitHub issue-PR pairs. Gold standard for coding agents. Variants: Lite, Verified (500 human-validated), Multimodal.	The benchmark every coding agent cites.
AgentBench 🇨🇳	arXiv:2308.03688	ICLR 2024. First comprehensive multi-domain agent benchmark — 8 environments (OS, DB, KG, web, games).	Revealed massive gap between commercial and open-source LLMs on agent tasks.
τ-bench	arXiv:2406.12045	Emulates user-agent-tool conversations with domain-specific APIs and policy guidelines. pass^k reliability metric.	Even GPT-4o succeeds on <50% of tasks. Sierra AI (Shunyu Yao). Expanded to τ²-bench and τ³-bench.
WebArena	arXiv:2307.13854	Realistic web environment with functional website replicas. 812 tasks.	Spawned VisualWebArena, WorkArena, TheAgentCompany. BrowserGym integration.
GAIA	Meta AI	466 real-world questions requiring reasoning, multimodality, web browsing, tool use. Humans: 92%, GPT-4: 15%.	Conceptually simple for humans yet devastatingly hard for AI.
ToolBench 🇨🇳	arXiv:2307.16789	ICLR 2024. 16,000+ real-world APIs from RapidAPI with DFSDT reasoning.	Introduced backtracking reasoning for tool use. ToolLLaMA model.
OSWorld	arXiv:2404.07972	NeurIPS 2024. Real computer environment (Ubuntu/Windows/macOS). 369 tasks. Best agents ~30%.	Most challenging computer-use benchmark. Requires actual VM execution.
MLAgentBench	Stanford	Evaluates agents on ML research tasks: running experiments, analyzing results, writing code.	Tests the "AI researcher" use case end-to-end.
Mind2Web	NeurIPS 2023	2,000+ crowd-sourced web tasks across 137 real websites.	Large-scale real-website benchmark.
BFCL	UC Berkeley / Gorilla	Berkeley Function Calling Leaderboard — V4 Agentic evaluates tool-calling in real-world agentic settings.	Industry standard for function-calling evaluation.
ScienceAgentBench	ICLR 2025	102 tasks for data-driven scientific discovery.	First rigorous benchmark for science agents.

🔗 Protocols & Standards

Protocol	Origin	Description
MCP (Model Context Protocol)	Anthropic (Nov 2024) → Linux Foundation (Dec 2025)	Open standard for connecting AI models to tools and data sources. 97M+ monthly SDK downloads. Supported by ChatGPT, Claude, Gemini, Cursor, VS Code.
A2A (Agent-to-Agent Protocol)	Google → Linux Foundation	Protocol for agent discovery, capability advertisement, and collaboration across frameworks. 150+ supporting organizations.

📄 Foundational Papers

These papers introduced paradigms now embedded in virtually every agent framework.

Paper	Year	Venue	Key Contribution
ReAct: Synergizing Reasoning and Acting	2023	ICLR 2023	The agent paradigm. Interleaved reasoning traces and actions — the foundation of modern LLM agents. Shunyu Yao et al. (Princeton).
Chain-of-Thought Prompting	2022	NeurIPS 2022	Step-by-step reasoning. The foundational technique underlying all agent reasoning. Jason Wei et al. (Google).
Reflexion: Verbal Reinforcement Learning	2023	NeurIPS 2023	Self-correction without weight updates. Agents learn from verbal self-reflection. 91% pass@1 on HumanEval. Noah Shinn et al. (Princeton).
Tree of Thoughts	2023	NeurIPS 2023	Search-based reasoning. Extends CoT to explore multiple reasoning paths with BFS/DFS. Shunyu Yao et al. (Princeton).
Toolformer	2023	NeurIPS 2023	Self-supervised tool learning. LLMs learn when/how to call tools. 6B model outperforms GPT-3 175B on math. Meta AI.
HuggingGPT	2023	NeurIPS 2023	LLM as model orchestrator. ChatGPT plans and dispatches tasks to expert models from HuggingFace Hub.
LATS: Language Agent Tree Search	2024	ICML 2024	Unifies reasoning + acting + planning via MCTS. 92.7% pass@1 on HumanEval with GPT-4. Andy Zhou et al. (UIUC).
GEPA: Reflective Prompt Evolution	2026	ICLR 2026 Oral	Self-improving agents. Outperforms GRPO by 6% with 35× fewer rollouts. Powers Hermes Agent's self-evolution.
AutoGen	2024	ICLR 2024 Workshop (Best Paper)	Multi-agent conversation paradigm. Defined how agents collaborate through structured dialogue. Microsoft.
MetaGPT	2024	ICLR 2024 Oral	SOP-driven multi-agent collaboration. "Code = SOP(Team)" — encodes human workflows as agent prompts.
CAMEL	2023	NeurIPS 2023	Role-playing agents. First framework for multi-agent cooperation via inception prompting.
Generative Agents	2023	UIST 2023	Believable simulacra of human behavior. Memory streams + reflection + planning = emergent social behaviors. Stanford.
OpenHands	2024	—	AI software developer as generalist agent. Defines the platform architecture for coding agents.
SWE-agent (ACI)	2024	NeurIPS 2024	Agent-Computer Interfaces. How interface design changes agent performance. Princeton/Stanford.
Voyager	2023	NeurIPS 2023	Embodied lifelong learning. Auto-curriculum + skill library + self-verification in Minecraft. NVIDIA.
MemGPT	2023	—	LLMs as operating systems. Self-editing memory with virtual context management. UC Berkeley.
A Survey on LLM-based Autonomous Agents	2023	Updated Mar 2025	Most-cited agent survey. Unified framework: Profile + Memory + Planning + Action modules.

📚 Surveys & Curated Lists

Resource	Description
awesome-ai-agents	1,500+ resources on AI agents — the largest general-purpose collection.
awesome-ai-agents	300+ agentic resources with star counts and categorization.
AI Agent Benchmark Compendium	50+ benchmarks categorized and compared.
shelldex.com	Community directory tracking 52+ OpenClaw derivatives across 9 languages. Weekly growth rankings.
Agentic AI Survey	Dual-paradigm framework (Symbolic vs Neural); PRISMA-based review of 90 studies.
LLM Agent Methodology Survey	March 2025 methodology-centered taxonomy of agent architectures.

Contributing

Contributions welcome! Please read the guidelines below before submitting a PR.

What belongs here

Open-source frameworks with a real GitHub repo, >500 stars or a published paper
Academic papers that introduced a technique now widely used in agent systems
Benchmarks that are actively used by the research community
Production deployments that are open source and demonstrate novel architecture

What doesn't belong

Thin API wrappers with no novel architecture
Closed-source products without open-source components (exceptions: category-defining products like Devin and Cursor)
Projects with no activity for 12+ months and <1,000 stars
Tutorials, courses, or blog posts (except as supplementary links)

How to contribute

Fork this repository
Add your entry in the appropriate category with: [Project Name](URL) | stars | One-line description | Key differentiator
Verify the GitHub URL is live and the project is active
Submit a PR with a brief explanation of why this project is notable

⭐ Star this repo if you found it useful!

Last updated: March 2026 · Maintained with care · CC0 1.0 Universal

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
LICENSE		LICENSE
readme.md		readme.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 Awesome Autonomous AI Agents

Why this list?

Contents

🏗️ General-Purpose Frameworks

🐾 The OpenClaw Ecosystem

Core Projects

Major Rewrites & Alternatives

Ecosystem Tools

💻 Coding Agents

👥 Multi-Agent Systems

🌐 Web & Browser Agents

🖥️ Computer Use & GUI Agents

🔬 Research & Science Agents

🧠 Memory & Knowledge Systems

🔧 Tool-Use & Function Calling

🧩 Planning & Reasoning Frameworks

🇨🇳 Chinese Contributions

Alibaba Ecosystem

ByteDance Ecosystem

Shanghai AI Lab / InternLM

Tencent

Community & Education

📊 Benchmarks & Evaluation

🔗 Protocols & Standards

📄 Foundational Papers

📚 Surveys & Curated Lists

Contributing

What belongs here

What doesn't belong

How to contribute

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 1

Folders and files

Latest commit

History

Repository files navigation

🤖 Awesome Autonomous AI Agents

Why this list?

Contents

🏗️ General-Purpose Frameworks

🐾 The OpenClaw Ecosystem

Core Projects

Major Rewrites & Alternatives

Ecosystem Tools

💻 Coding Agents

👥 Multi-Agent Systems

🌐 Web & Browser Agents

🖥️ Computer Use & GUI Agents

🔬 Research & Science Agents

🧠 Memory & Knowledge Systems

🔧 Tool-Use & Function Calling

🧩 Planning & Reasoning Frameworks

🇨🇳 Chinese Contributions

Alibaba Ecosystem

ByteDance Ecosystem

Shanghai AI Lab / InternLM

Tencent

Community & Education

📊 Benchmarks & Evaluation

🔗 Protocols & Standards

📄 Foundational Papers

📚 Surveys & Curated Lists

Contributing

What belongs here

What doesn't belong

How to contribute

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 1

Packages