This repository contains my complete submission for the Hugging Face AI Agent Certification, where I built a fully autonomous AI agent capable of solving real-world reasoning tasks from the GAIA (General AI Assistant) Benchmark.
The Hugging Face AI Agents Course is a free, certified program designed to teach the theory, design, and practical application of AI agents. The course covers:
- Agent Fundamentals: Understanding tools, thoughts, actions, observations, LLMs, messages, special tokens, and chat templates.
- Frameworks for AI Agents: Hands-on experience with popular libraries and frameworks such as `smolagents`, `LlamaIndex`, and `LangGraph`.
- Use Cases: Building real-world applications and contributing to the community.
- Final Project: Developing an AI agent for the GAIA benchmark test and competing on a leaderboard.
The course is structured into units containing written materials, coding notebooks, and interactive quizzes. Completing the full course involves building and evaluating an AI agent using a subset of the GAIA benchmark.
With that context in place, I developed an intelligent agent using:
- 🤖 `smolagents`
- 🔍 Tool-augmented search with DuckDuckGo & Wikipedia
- 🧠 A custom prompt system aligned to GAIA's strict answer format
- 🔄 Task-aware context injection (e.g., file parsing, OCR, YouTube transcription)
- 📜 Submission pipeline for automatic evaluation and scoring
Achievement: The agent was evaluated on 20 Level 1 GAIA benchmark tasks and successfully submitted to Hugging Face's scoring API.
The GAIA benchmark is a rigorous test suite designed to assess the general reasoning, retrieval, and tool-use capabilities of AI agents. Tasks require:
- Real-time web search
- Information synthesis
- Working with auxiliary file data (CSV, Excel, MP3, PNG, etc.)
- Interpreting YouTube links, performing OCR, and more
It is used to evaluate general-purpose AI assistants and is positioned as a stepping stone toward AGI-level capabilities.
| Tool | Purpose |
|---|---|
| `DuckDuckGoSearch` | Web search queries for factual data |
| `WikipediaSearch` | Specific topic lookups |
| `Whisper` | Audio + YouTube transcription |
| `Tesseract OCR` | Extract text from .png images |
| `pandas` | Preview and parse .csv or .xlsx data |
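
As a rough illustration of how task-aware context injection could route an attached file to one of the tools in the table, here is a minimal sketch. The mapping, the handler labels, and the function names `pick_handler` and `build_context` are hypothetical and not taken from this repository; the real agent may route files differently.

```python
from pathlib import Path
from typing import Optional

# Hypothetical mapping from file extension to a tool label, following the
# tool table. The labels are plain strings, not real tool objects.
HANDLERS = {
    ".mp3": "whisper",
    ".png": "tesseract",
    ".csv": "pandas",
    ".xlsx": "pandas",
}


def pick_handler(filename: str) -> Optional[str]:
    """Return the tool label for an attached file, or None if unsupported."""
    return HANDLERS.get(Path(filename).suffix.lower())


def build_context(question: str, filename: Optional[str] = None) -> str:
    """Append a note about the attachment so the agent knows which tool applies."""
    if not filename:
        return question
    handler = pick_handler(filename)
    if handler is None:
        return f"{question}\n\n[Attached file: {filename} (no matching tool)]"
    return f"{question}\n\n[Attached file: {filename}; use tool: {handler}]"


if __name__ == "__main__":
    print(build_context("What is the total in column B?", "sales.xlsx"))
```

Keeping the routing in a plain dictionary makes it easy to see which file types are covered and to add a new extension without touching the prompt logic.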
The agent adheres strictly to GAIA's required output format:
No extra words, units, or explanations — just the direct result, optimized for exact-match evaluation.
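
Because scoring is exact-match, it helps to post-process the model's reply before submitting it. The sketch below shows one way this could look. It assumes the model was prompted to end with a `FINAL ANSWER:` marker, which is a common GAIA-style convention but not confirmed for this agent, and `clean_answer` is a hypothetical helper name.

```python
import re

# Assumed marker: GAIA-style prompts commonly ask the model to finish with
# "FINAL ANSWER: ...". Whether this agent used that marker is an assumption.
_MARKER = re.compile(r"final answer\s*:\s*", re.IGNORECASE)


def clean_answer(raw: str) -> str:
    """Reduce a model reply to a bare answer string for exact-match scoring."""
    # Keep only the text after the last marker, if one is present.
    answer = _MARKER.split(raw)[-1]
    # Use the first non-empty line so trailing explanations are dropped.
    lines = [ln.strip() for ln in answer.splitlines() if ln.strip()]
    answer = lines[0] if lines else ""
    # Remove wrapping quotes and any leftover whitespace.
    return answer.strip("\"'").strip()


if __name__ == "__main__":
    print(clean_answer("I checked two sources.\nFINAL ANSWER: 42"))
```

This only trims formatting noise. It does not strip units or rephrase the answer, so the prompt still has to ask for the bare value.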
I've included screenshots below showing:
- Each GAIA task question
- The agent-generated response
- My final submission and result
The submission pipeline:
- Authenticated via Hugging Face OAuth
- Pulled questions dynamically from the HF API
- Automatically attached auxiliary files
- Posted all answers to the `/submit` endpoint
- Received official score & result breakdown
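
The final step, assembling the request body for the `/submit` endpoint, might look like the sketch below. The field names (`username`, `agent_code`, `answers`, `task_id`, `submitted_answer`) are assumptions about the scoring API's schema, and `build_submission` is a hypothetical helper. The sketch makes no network call; it only builds and serializes the payload.

```python
import json


def build_submission(username: str, agent_code: str, answers: dict) -> dict:
    """Assemble a submission payload from a {task_id: answer} mapping.

    The field names are assumed, not confirmed by this repository.
    """
    return {
        "username": username.strip(),
        "agent_code": agent_code,
        "answers": [
            {"task_id": task_id, "submitted_answer": answer}
            for task_id, answer in answers.items()
        ],
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    payload = build_submission(
        "example-user",
        "https://example.com/agent-code",
        {"task-001": "42", "task-002": "Paris"},
    )
    print(json.dumps(payload, indent=2))
```

Building the payload in one function keeps the answer list in a single, inspectable structure before anything is sent.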
📈 Final Score: [7/20 correct]
🏅 Certification Status: Passed (7/20 = 35%, which meets the required ≥ 30% threshold)
🧾 View: [Certificate Link]
This project pushed me to design a system that combines:
- LLM reasoning
- Real-time retrieval
- Multi-modal input understanding
- Structured output formatting
It simulated real-world agent deployment scenarios and was an excellent hands-on exercise in building tool-augmented agents. More than the satisfaction of earning the certification, I'm glad to have learned the underlying theory and concepts, and to have applied and implemented them.









