ATLAS

ATLAS is a multimodal desktop AI assistant with a conversational interface. It understands and processes audio, video, and text, and responds in real time. It has persistent memory and can access the internet for up-to-date information.

Features

Multimodal Interaction: Communicate with ATLAS using your voice, camera, screen share, or text.
Real-time Conversation: ATLAS processes input and responds in real time for a natural conversational experience.
Persistent Memory: ATLAS remembers past conversations, providing context for future interactions.
Internet Access: ATLAS can search the internet to answer questions and provide current information.
Desktop Integration: As a desktop application, ATLAS can access the screen and other system resources.

Architecture

ATLAS is built with a Python backend and an Electron frontend, communicating via WebSockets.

Backend

The backend is a Python application that uses FastAPI for real-time communication and a variety of libraries for AI capabilities.

app.py: The main entry point for the backend, running a FastAPI server and providing a WebSocket endpoint for the frontend.
Brain/RTC.py: The core of the backend, handling real-time communication. It processes audio from the microphone, video from the camera or screen, and text from the user. It uses the Google Gemini API for its multimodal AI capabilities.
Brain/deepagent.py & Brain/subagents.py: Implement a "deep agent" architecture using the deepagents library. This allows for specialized sub-agents, such as a "researcher" that can access the internet.
Brain/RAG.py: Implements Retrieval-Augmented Generation (RAG) using ChromaDB. This gives ATLAS a persistent memory by storing and retrieving chat history.
Tools/: Contains tools that can be used by the agents.
- tavily.py: A tool for searching the internet using the Tavily API.
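
The store-and-retrieve idea behind Brain/RAG.py can be sketched as follows. This is a minimal illustration and not the project's implementation: ChromaDB is swapped for an in-memory list, and its vector embeddings for plain word-count cosine similarity, so the sketch runs with the standard library alone. The names `ChatMemory`, `embed`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a real setup would use a vector model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ChatMemory:
    """In-memory stand-in for a vector store holding past chat turns."""

    def __init__(self):
        self._turns = []  # list of (text, vector) pairs

    def add(self, text):
        """Store one chat turn together with its vector."""
        self._turns.append((text, embed(text)))

    def query(self, question, k=2):
        """Return up to k stored turns that share vocabulary with the question."""
        q = embed(question)
        scored = [(cosine(q, vec), text) for text, vec in self._turns]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[:k]]


def build_prompt(memory, question):
    """Prepend retrieved history to the question before it goes to the model."""
    context = "\n".join(memory.query(question))
    return f"Relevant history:\n{context}\n\nUser: {question}"
```

With a real vector store, `add` and `query` would map onto the store's insert and similarity-search calls; the retrieve-then-prompt flow stays the same.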

Frontend

The frontend is an Electron application that provides the user interface for ATLAS.

frontend/main.js: The main process for the Electron application. It creates the application window and handles system-level interactions like screen capture. It also runs a local HTTP server to receive state updates from the backend.
frontend/renderer.js: The user interface logic. It handles user input from the microphone, camera, and text input. It communicates with the backend via the GeminiClient.
frontend/gemini-client.js: A WebSocket client that handles the real-time communication with the Python backend.
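
The state updates that frontend/main.js receives over its local HTTP server could be sent from the Python side roughly like this. This is a sketch under assumptions: the URL, port, path, and JSON shape are hypothetical, and the state names are taken from the examples in the Usage section (listening, thinking, speaking).

```python
import json
import urllib.request
from enum import Enum

# Hypothetical address of the local HTTP server run by the Electron main process.
DEFAULT_STATE_URL = "http://127.0.0.1:3001/state"


class AssistantState(str, Enum):
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"


def build_state_request(state, url=DEFAULT_STATE_URL):
    """Build a JSON POST request describing the new state.

    Raises ValueError if the state is not one of AssistantState.
    """
    body = json.dumps({"state": AssistantState(state).value}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_state(state, url=DEFAULT_STATE_URL, timeout=2.0):
    """Send the update; return True on a 2xx reply, False if the UI is unreachable."""
    try:
        request = build_state_request(state, url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:  # covers URLError, HTTPError, and connection failures
        return False
```

Returning False instead of raising keeps a closed or restarting UI from interrupting the backend's conversation loop.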

Key Technologies

Backend: Python, FastAPI, WebSockets, Google Generative AI (Gemini), LangChain, deepagents (for agent-based architecture), ChromaDB, Tavily API
Frontend: Electron, JavaScript, HTML, CSS
Database: ChromaDB (for vector storage)

Getting Started

Prerequisites

Python
uv (Python package manager)
Node.js 24+
A .env file with the following keys:
- GEMINI_API_KEY
- TAVILY_API_KEY
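
Before starting the backend, it can help to confirm that both keys are present. The following is a small standard-library check, not part of the project. It assumes the .env file sits in the current working directory and uses simple KEY=VALUE lines; the function names are hypothetical.

```python
from pathlib import Path

REQUIRED_KEYS = ("GEMINI_API_KEY", "TAVILY_API_KEY")


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")  # assumed location
    found = parse_env_file(env_path) if env_path.exists() else {}
    missing = missing_keys(found)
    if missing:
        print("Missing keys in .env:", ", ".join(missing))
    else:
        print("All required keys are set.")
```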

Installation

Backend:
```
uv sync
```
Frontend:
```
cd frontend
npm install
```

Running the Application

Start the backend:
```
uv run app.py
```

Usage

Text Input: Type a message in the input box and press Enter to send it.
Microphone: Click the microphone icon to start and stop recording your voice.
Camera: Click the camera icon to turn your camera on and off.
Screen Share: Click the screen share icon to start and stop sharing your screen.
State Indicator: The sphere in the middle of the screen indicates the application's current state (e.g., listening, thinking, speaking).
Connection Status: The pill in the top right corner shows the connection status to the backend.
