This file provides guidance to LLMs when working with code in this repository.
The rag/ directory contains the Light-RAG system - a lightweight Retrieval-Augmented Generation system inspired by attention mechanisms that can enhance automator agents with contextual document retrieval.
- Document ingestion with OpenAI-powered key generation and summarization
- Persistent storage using FileSystemStore with full serialization
- Intelligent reranking using OpenAI's structured outputs
- Hook integration with automator agents for automatic document retrieval
- Context-aware retrieval based on conversation history
The RAG system has been successfully merged into the main automator repository as a sibling package:
Directory Structure:
```
automator/
├── automator/     # Core Python package
├── rag/           # RAG package (sibling)
├── ui/            # Frontend code
├── examples/rag/  # RAG examples
├── tests/rag/     # RAG tests
└── docs/          # Documentation including rag.md
```
Installation Options:
- `pip install -e .` - Core functionality only
- `pip install -e .[rag]` - Core + RAG capabilities
- `pip install -e .[ui]` - Core + UI build tools
- `pip install -e .[dev]` - Core + development tools
- `pip install -e .[all]` - Everything
RAG Hook Usage:
```python
from automator import Agent
from rag import create_rag_hook  # Only if [rag] installed

agent = Agent(
    model="gpt-4.1",
    hooks=['rag:.knowledge']  # Enable RAG for .knowledge directory
)
```
The RAG system automatically:
- Ingests documents from specified directories
- Retrieves relevant documents based on conversation context
- Adds document content to agent message history
- Tracks document relevance across conversation turns
Output Management & Search Capabilities (Latest):
- Intelligent Output Truncation: tiktoken-based token counting automatically truncates outputs over 4k tokens
  - Keeps the first 1k + last 3k tokens with clear truncation indicators
  - Saves the full output to `./tool_output/` with timestamped filenames
  - Applied to all terminal.py and editor.py functions
- Find Tool: new `find(search_str, path)` tool with glob pattern support
  - Searches through files using patterns like `*.py`, `**/*.js`, `tool_output/*.txt`
  - Case-insensitive search with highlighted matches and line numbers
  - Integrated with the truncated-output workflow for easy access to full content
- Files Added: `output_manager.py`, `find_tool.py`, and a comprehensive test suite
- Files Modified: all major output functions in terminal.py and editor.py now use smart truncation
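The head/tail truncation rule above can be sketched as follows. The repository counts tokens with tiktoken; a plain token list stands in here so the sketch is self-contained, and the exact marker text is illustrative:

```python
def truncate_output(tokens, limit=4000, head=1000, tail=3000):
    """Keep the first `head` and last `tail` tokens of an over-long output.

    Sketch of the repo's truncation policy: outputs over `limit` tokens are
    reduced to head + marker + tail. Returns (tokens, truncated_flag).
    """
    if len(tokens) <= limit:
        return tokens, False
    # Marker stands in for the "clear truncation indicator" described above.
    marker = [f"... [{len(tokens) - head - tail} tokens truncated; "
              f"full output saved to ./tool_output/] ..."]
    return tokens[:head] + marker + tokens[-tail:], True
```

In the real pipeline the full, untruncated output would also be written to `./tool_output/` with a timestamped filename before the truncated version is returned.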
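A minimal re-implementation of what such a find tool might look like, for illustration only (the real `find_tool.py` may differ, e.g. in how it highlights matches):

```python
import re
from pathlib import Path

def find(search_str, path):
    """Case-insensitive search across files matched by a glob pattern.

    Hypothetical sketch of the find tool described above; returns
    (filename, line_number, line) tuples for each matching line.
    """
    pattern = re.compile(re.escape(search_str), re.IGNORECASE)
    hits = []
    for file in sorted(Path(".").glob(path)):
        if not file.is_file():
            continue
        for lineno, line in enumerate(
                file.read_text(errors="ignore").splitlines(), 1):
            if pattern.search(line):
                hits.append((str(file), lineno, line.strip()))
    return hits
```

For example, `find("hello", "**/*.py")` would report every line containing "hello", "Hello", or "HELLO" in any Python file under the current directory.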
This is a Research-Assistant monorepo containing an MCP-based LLM agent system with multiple interconnected components:
- automator/: Main Python SDK for creating LLM agents built on MCP (Model Context Protocol)
- terminal-mcp/: MCP server providing terminal, Jupyter, and codebase interaction tools
- web-mcp/: MCP server for web search and browsing capabilities
- talk-to-model/: MCP server enabling agent-to-agent communication for LLM evaluation
- squiggpy/: Monte Carlo probabilistic modeling library with agent prompts
- MCP Configuration: `~/mcp.json` defines available MCP servers and their environment variables
- Agent Creation: Agents use YAML prompt templates from `automator/prompts/` and specify tool access patterns
- Workspace Management: Agents and conversation threads persist in `~/.automator/workspaces/`
- Tool Access: Agents connect to MCP servers to access tools (terminal, web search, model communication)
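An `~/mcp.json` entry following the common MCP server-definition shape might look like the sketch below; the server name, directory path, and environment values are illustrative, so check the repo's `mcp.json` template for the real schema:

```json
{
  "mcpServers": {
    "terminal": {
      "command": "uv",
      "args": ["--directory", "/path/to/terminal-mcp", "run", "entrypoint.py"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```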
```shell
# Full repository setup, including dependencies and MCP configuration
python install.py

# Individual component setup (run in the component directory)
uv sync

# Frontend development (in automator/ui/frontend/)
npm run dev    # Start development server (localhost:5173)
npm run build  # Build for production
npm run lint   # Run ESLint

# Backend development (in automator/ui)
uvicorn api.main:app --port 8000  # Start FastAPI server
```

```shell
# Navigate to the automator directory
cd automator

# Install with dev dependencies
pip install -e .[dev]

# Run all tests
pytest

# Run specific test categories
pytest tests/test_quickstart.py  # Basic agent and terminal functionality
pytest tests/test_rag_hook.py    # RAG integration tests (requires [rag] dependencies)
pytest tests/test_system.py      # Core system tests

# Run with verbose output
pytest -v

# Exclude RAG tests if their dependencies are not available
pytest -m "not rag"
```

```shell
# Basic agent examples
python automator/examples/quickstart.py
python automator/examples/load_workspace.py

# Advanced examples
python more_examples/coder.py
python more_examples/evaluator.py
```

- `mcp.json`: Template for MCP server definitions (gets copied to `~/mcp.json`)
- Each MCP server runs via: `uv --directory /path/to/server run entrypoint.py`
- API keys are configured through `~/mcp.json` and propagated to `.env`
- Required: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `HF_TOKEN`, `SERP_API_KEY`
- Python 3.11+ with UV package management
- MCP (Model Context Protocol) for tool integration
- FastAPI + Uvicorn for REST API
- Pydantic for data validation
- React 18 + TypeScript + Vite
- React Router for navigation
- React Markdown with KaTeX for math rendering
- Anthropic Claude, OpenAI GPT, Google Gemini models
- Hugging Face and OpenWeights for additional model access
- Define a prompt template in `automator/prompts/agent_name.yaml`
- Specify required tools using glob patterns (e.g., `"terminal.*"`, `"web.*"`)
- Set workspace environment variables for the agent's working context
- Use workspace persistence for long-running agent sessions
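A prompt template following the steps above might look like this sketch; the field names are illustrative, not the confirmed schema, so check the existing files in `automator/prompts/` for the real structure:

```yaml
# Hypothetical automator/prompts/research_helper.yaml
name: research_helper
model: gpt-4.1
system_prompt: |
  You are a research assistant with terminal and web access.
tools:
  - "terminal.*"   # all terminal/codebase tools
  - "web.search"   # a single specific tool
```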
"terminal.*": All terminal/codebase interaction tools"web.*": Web search and browsing tools"talk2model.*": Agent-to-agent communication tools- Specific tools:
"terminal.execute","web.search", etc.
- Workspaces auto-save agent state and conversation threads
- Load existing agents: `workspace.get_agent('agent_name')`
- List available: `workspace.list_agents()`, `workspace.list_threads()`