Multimodal AI Agent

A production-ready AI agent built with Agno that parses and reasons over user-provided files (PDF, CSV, etc.) in conversational interactions. The project showcases best practices for agent development with the Better Agents framework, including version-controlled prompt management with LangWatch, end-to-end testing with Scenario, and full instrumentation.

Features

  • File Parsing: Supports parsing PDF, CSV, and other file formats provided by users in conversations
  • Conversational AI: Maintains context across multi-turn conversations
  • Structured Responses: Uses Pydantic schemas for consistent output formatting
  • Comprehensive Testing: End-to-end Scenario tests ensure reliability
  • Prompt Management: Version-controlled prompts using LangWatch CLI
  • Instrumentation: Full LangWatch integration for monitoring and analytics

Project Structure

├── app/                    # Main application code
│   └── main.py            # Agent implementation
├── prompts/               # Version-controlled prompt files
│   └── *.yaml
├── tests/
│   ├── evaluations/       # Component evaluation notebooks
│   └── scenarios/         # End-to-end scenario tests
├── .env                   # Environment variables (copy from .env.example)
├── prompts.json           # Prompt registry
└── AGENTS.md              # Development guidelines

Setup

Prerequisites

  • Python 3.8+
  • uv package manager
  • API keys for OpenAI and LangWatch (configured in .env during setup)

Installation

  1. Install uv (if not already installed):

    curl -LsSf https://astral.sh/uv/install.sh | sh
  2. Initialize the project:

    uv init
  3. Install dependencies:

    uv add agno langwatch pytest python-dotenv
    uv add --dev pytest-asyncio
  4. Install LangWatch CLI:

    uv tool install langwatch
  5. Set up environment:

    cp .env.example .env
    # Edit .env with your API keys
  6. Install Scenario for testing:

    uv add scenario-langwatch

Usage

Running the Agent

Execute the main application:

uv run python app/main.py

The agent will start and be ready to handle conversational interactions with file uploads.
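
As a point of reference, a minimal app/main.py could look roughly like the sketch below. This is an illustrative outline, not the actual implementation: it assumes Agno's Agent and OpenAIChat classes, uses the multimodal_agent prompt created later in this README, and passes the managed prompt text as the agent's instructions.

# Illustrative sketch only -- the real app/main.py may differ.
import langwatch
from agno.agent import Agent
from agno.models.openai import OpenAIChat

langwatch.setup()  # picks up LANGWATCH_API_KEY from .env / the environment

# Fetch the version-controlled prompt from LangWatch
prompt = langwatch.prompts.get("multimodal_agent")

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    instructions=prompt.prompt,  # assumption: pass the prompt text as instructions
    markdown=True,
)

if __name__ == "__main__":
    agent.print_response("Hi! What kinds of files can you read?", stream=True)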

Development Server

To run a development server (if the agent is exposed as an ASGI app; see the sketch below):

uv run python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Access the agent at: http://localhost:8000
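
The repository does not have to ship a server, but if you want the uvicorn command above to work, a thin FastAPI wrapper along these lines is enough (you would also need to uv add fastapi uvicorn). The /chat route and request shape here are illustrative assumptions, not part of the existing codebase.

# Illustrative sketch: expose the agent as an ASGI app for uvicorn.
from fastapi import FastAPI
from pydantic import BaseModel

from agno.agent import Agent
from agno.models.openai import OpenAIChat

app = FastAPI()
agent = Agent(model=OpenAIChat(id="gpt-4o"))

class ChatRequest(BaseModel):
    message: str

@app.post("/chat")
def chat(request: ChatRequest) -> dict:
    # Agent.run returns a RunResponse; .content holds the model's reply
    result = agent.run(request.message)
    return {"reply": result.content}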

Testing

Scenario Tests

Run end-to-end scenario tests:

uv run pytest tests/scenarios/ -v
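
A Scenario test wraps the agent in an adapter and lets a simulated user and a judge drive a short conversation. The sketch below is a rough outline assuming the scenario package's AgentAdapter, UserSimulatorAgent, and JudgeAgent APIs; the test name, criteria, and conversation topic are made up for illustration.

# tests/scenarios/test_file_parsing.py -- illustrative sketch
import pytest
import scenario

from agno.agent import Agent
from agno.models.openai import OpenAIChat

scenario.configure(default_model="openai/gpt-4o-mini")

class MultimodalAgentAdapter(scenario.AgentAdapter):
    def __init__(self):
        self.agent = Agent(model=OpenAIChat(id="gpt-4o"))

    async def call(self, input: scenario.AgentInput) -> scenario.AgentReturnTypes:
        result = self.agent.run(input.last_new_user_message_str())
        return result.content

@pytest.mark.asyncio
async def test_csv_summary():
    result = await scenario.run(
        name="csv summary",
        description="The user shares a CSV file and asks for a summary of its columns.",
        agents=[
            MultimodalAgentAdapter(),
            scenario.UserSimulatorAgent(),
            scenario.JudgeAgent(criteria=["The agent summarizes the CSV contents accurately"]),
        ],
    )
    assert result.success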

Evaluations

Run evaluation notebooks in tests/evaluations/ for component-level testing.
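
A component-level evaluation can be as small as a notebook cell that checks the agent's structured output against a Pydantic schema. A minimal sketch, where InvoiceSummary and its fields are hypothetical stand-ins for your actual response schema:

# Illustrative evaluation cell; InvoiceSummary and its fields are hypothetical.
from pydantic import BaseModel

from agno.agent import Agent
from agno.models.openai import OpenAIChat

class InvoiceSummary(BaseModel):
    vendor: str
    total: float

agent = Agent(model=OpenAIChat(id="gpt-4o"), response_model=InvoiceSummary)
result = agent.run("Invoice from ACME Corp. Total due: $1,200.00.")

# With response_model set, result.content is a parsed InvoiceSummary instance
assert isinstance(result.content, InvoiceSummary)
assert result.content.total == 1200.0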

Prompt Management

Creating Prompts

Use LangWatch CLI to create and manage prompts:

langwatch prompt create multimodal_agent
# Edit the created YAML file in prompts/
langwatch prompt sync
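
The generated YAML file holds the model and messages for the prompt. The exact schema depends on your LangWatch CLI version; an illustrative example might look like:

# prompts/multimodal_agent.yaml -- illustrative only; the exact fields
# depend on the LangWatch CLI version that generated the file.
model: openai/gpt-4o
messages:
  - role: system
    content: |
      You are a multimodal assistant. Parse any PDF or CSV files the user
      shares and answer questions about their contents.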

Using Prompts in Code

import langwatch
from agno.agent import Agent

prompt = langwatch.prompts.get("multimodal_agent")
agent = Agent(prompt=prompt.prompt)

File Handling

The agent supports the following file types (see the sketch after this list):

  • PDF: Text extraction and analysis
  • CSV: Data parsing and summarization
  • Images: OCR and description (if supported)
  • Other formats: As needed for specific use cases
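
For example, a PDF can be attached to a message roughly as sketched below. The agno.media.File class and the files= parameter are assumptions about the Agno version in use, and the file path is made up for illustration.

# Illustrative sketch: attaching a user-provided file to a message.
from agno.agent import Agent
from agno.media import File
from agno.models.openai import OpenAIChat

agent = Agent(model=OpenAIChat(id="gpt-4o"))
agent.print_response(
    "Summarize the key figures in this report.",
    files=[File(filepath="reports/q3.pdf")],
)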

Development Guidelines

Follow the guidelines in AGENTS.md for:

  • Prompt management best practices
  • Testing strategies (Scenario tests first)
  • Code organization
  • Performance optimization

Contributing

  1. Follow the development workflow in AGENTS.md
  2. Create Scenario tests for new features
  3. Use LangWatch for prompt versioning
  4. Ensure all tests pass before submitting
