diff --git a/docs/alive/_category_.json b/docs/alive/_category_.json
new file mode 100644
index 0000000..2b6395c
--- /dev/null
+++ b/docs/alive/_category_.json
@@ -0,0 +1,8 @@
+{
+  "label": "Alive",
+  "position": 0.0004,
+  "link": {
+    "type": "generated-index",
+    "description": "Avatar Liveness for Intelligent Virtual Empathy (ALIVE)"
+  }
+}
diff --git a/docs/alive/architecture.md b/docs/alive/architecture.md
new file mode 100644
index 0000000..06233c9
--- /dev/null
+++ b/docs/alive/architecture.md
@@ -0,0 +1,84 @@
+---
+sidebar_position: 0.0006
+---
+# Architecture
+
+**ALIVE** is a modular system consisting of both frontend and backend subsystems, with clear separation of concerns, containerized deployment, and real-time multimodal interaction support.
+
+---
+
+## System Overview
+
+ALIVE is split into two main subsystems:
+
+### EVA – Empathetic Virtual Assistant (Backend)
+
+- Built with **FastAPI** in **Python**
+- Real-time WebSocket chat with LLM and RAG
+- Emotional response tagging and summarization
+- Handles user sessions, message history, and document retrieval
+
+### AVATAR – Emotional Avatar Renderer (Frontend + API)
+
+- Built with **ASP.NET Core** and **C#**
+- Provides a REST API and WebRTC interface for avatar rendering
+- Integrates with:
+  - **D-ID Clips API** for face animation
+  - **Azure Speech Services** for emotional TTS
+- Supports dynamic rendering based on emotion-tagged content
+
+---
+
+## Subsystem Components
+
+### EVA (FastAPI Application)
+
+- `/eva/main.py` – Entrypoint with WebSocket routing and lifecycle hooks
+- `/eva/llm/` – Interfaces for calling OpenAI/Anthropic APIs, emotion segmentation
+- `/eva/rag/` – Local RAG document store using **ChromaDB**
+- `/eva/db/` – Conversation storage and admin setting persistence (SQLite/PostgreSQL)
+- `/eva/admin/` – Secure admin panel for prompt configuration, document uploads, and conversation management
+
+### AVATAR (ASP.NET Core Service)
+
+- `/avatar/` – Main backend orchestrating avatar generation
+- `/avatar/wwwroot/` – Web frontend assets for avatar display
+- `/avatar/Program.cs`, `/avatar/appsettings.json` – Service initialization and configuration
+
+---
+
+## Supporting Services
+
+- **ChromaDB** (embedded) – RAG vector store for semantic search
+- **Azure TTS** – Expressive speech synthesis based on emotion
+- **D-ID API** – Avatar face animation based on transcript and voice
+- **WebRTC** – Low-latency avatar playback
+
+---
+
+## Deployment Stack
+
+| Component      | Tech             | Role                          |
+|----------------|------------------|-------------------------------|
+| EVA            | Python + FastAPI | WebSocket chat, LLM, RAG      |
+| AVATAR         | C# + ASP.NET     | Audio/visual rendering        |
+| Chroma         | Rust (DB engine) | Vector database for RAG       |
+| D-ID API       | SaaS             | Face animation                |
+| Azure Speech   | SaaS             | Emotion-aware TTS             |
+| Frontend       | HTML/CSS/JS      | User interface for avatar     |
+| DB             | PostgreSQL/SQLite| Conversation persistence      |
+
+---
+
+## Communication Flow
+
+![Architecture](images/diagram.svg 'Architecture')
+
+---
+
+## Extensibility
+
+- Swap model providers using `LLM_PROVIDER=openai|anthropic` in `.env`
+- Extend document support by dropping files into `documents/` (or configuring another vector store)
+- Customize the avatar renderer by modifying the AVATAR pipeline or using alternatives to D-ID
+- Replace Azure TTS with ElevenLabs or local TTS via configurable adapters
diff --git a/docs/alive/deployment.md b/docs/alive/deployment.md
new file mode 100644
index 0000000..6e6f441
--- /dev/null
+++ b/docs/alive/deployment.md
@@ -0,0 +1,149 @@
+---
+sidebar_position: 0.0007
+---
+
+# Deployment
+
+This guide outlines how to deploy the ALIVE system using the official Docker images. ALIVE consists of two main services:
+
+- `sermas-eva`: Backend for chat, emotion tagging, RAG, and LLM integration
+- `sermas-avatar`: Avatar rendering and emotional TTS using D-ID and Azure
+
+---
+
+## Prerequisites
+
+Before you begin, ensure you have:
+
+- Docker installed (v27+ recommended) including the compose plugin
+- A valid OpenAI or Anthropic API key
+- An Azure Speech key (for TTS)
+- A D-ID API key (for the 3D avatar)
+
+---
+
+## Docker Images
+
+The official images are hosted on Docker Hub:
+
+| Component | Image |
+|-----------|-------|
+| EVA | [`thingenious/sermas-eva`](https://hub.docker.com/r/thingenious/sermas-eva) |
+| AVATAR | [`thingenious/sermas-avatar`](https://hub.docker.com/r/thingenious/sermas-avatar) |
+
+---
+
+## Quick Start with `docker-compose`
+
+1. Copy and rename the example configuration:
+
+    ```shell
+    cp compose.example.yaml compose.yaml
+    # you might want to comment out / remove the "build" sections
+    cp .env.example .env
+    ```
+
+2. Edit the `.env` file and fill in:
+
+    ```env
+    OPENAI_API_KEY=your-key
+    AZURE_SPEECH_API_KEY=your-key
+    DID_API_KEY=your-did-key
+    CHAT_API_KEY=choose-a-secret
+    # example generation with:
+    # openssl rand -hex 32
+    # or python -c "import secrets; print(secrets.token_hex(32))"
+    ```
+
+3. Start the stack:
+
+    ```shell
+    docker compose -f compose.yaml up -d
+    ```
+
+Once running:
+
+- EVA backend: [http://localhost:8000](http://localhost:8000), WebSocket at [ws://localhost:8000/ws](ws://localhost:8000/ws)
+- AVATAR frontend: [http://localhost:3000](http://localhost:3000)
+
+---
+
+## Reverse Proxy with Nginx and HTTPS (Recommended)
+
+To expose ALIVE services over the internet securely, you can place an Nginx reverse proxy in front of the EVA and AVATAR containers and enable HTTPS using Let's Encrypt and Certbot.
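Before exposing the stack publicly, it may be worth confirming that the `.env` file from the Quick Start no longer contains the example placeholder values. The following is a minimal sketch, not part of the ALIVE codebase: the helper names `parse_env` and `missing_keys` are hypothetical, and the required-key list simply mirrors the `.env` example above. If `LLM_PROVIDER=anthropic` is used, a different provider key presumably applies; its variable name is not given here, so adjust the list accordingly.

```python
from typing import Dict, Iterable, List

# Keys and placeholder values copied from the .env example in the Quick Start.
REQUIRED_KEYS = ("OPENAI_API_KEY", "AZURE_SPEECH_API_KEY", "DID_API_KEY", "CHAT_API_KEY")
PLACEHOLDERS = ("your-key", "your-did-key", "choose-a-secret")


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(
    text: str,
    required: Iterable[str] = REQUIRED_KEYS,
    placeholders: Iterable[str] = PLACEHOLDERS,
) -> List[str]:
    """Return required keys that are absent, empty, or still a placeholder."""
    values = parse_env(text)
    blocked = set(placeholders)
    return [k for k in required if not values.get(k) or values[k] in blocked]
```

A typical use would be `missing_keys(open(".env").read())`; an empty list suggests every key in the assumed list has been filled in.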
+
+### Example Nginx Config (for `/etc/nginx/sites-available/alive`)
+
+```nginx
+server {
+    listen 80;
+    server_name your-domain.com;
+
+    location / {
+        return 301 https://$host$request_uri;
+    }
+}
+
+server {
+    listen 443 ssl;
+    server_name your-domain.com;
+
+    ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem;
+    ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;
+
+    location / {
+        proxy_pass http://localhost:3000; # AVATAR frontend
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+    }
+
+    location /ws/ {
+        proxy_pass http://localhost:8000/ws/; # EVA WebSocket
+        proxy_http_version 1.1;
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection "Upgrade";
+    }
+
+    # optional (admin panel)
+    location /admin/ {
+        proxy_pass http://localhost:8000/admin/;
+    }
+
+    location /healthz {
+        proxy_pass http://localhost:8000/healthz;
+    }
+}
+```
+
+## Advanced Configuration
+
+You can configure behavior via environment variables. Some key options:
+
+| Variable | Description |
+|----------|-------------|
+| `LLM_PROVIDER` | `openai` or `anthropic` |
+| `MAX_HISTORY_MESSAGES` | Number of chat messages retained |
+| `CHROMA_DB_DIR` | Directory for Chroma vector DB |
+| `CHAT_WS_URL` | URL for WebSocket clients |
+| `ADMIN_API_KEY` | Protects admin endpoints |
+
+---
+
+## Health & Debug
+
+- EVA: `GET /healthz` → `{ "status": "ok" }`
+- Admin Panel: `GET /admin/` (requires token)
+
+You can inspect container logs with:
+
+```shell
+docker compose logs -f
+```
+
+---
+
+## Source & Updates
+
+For the latest changes, visit the GitHub repository:
+
+👉 [`thingenious/sermas`](https://github.com/thingenious/sermas)
diff --git a/docs/alive/images/diagram.svg b/docs/alive/images/diagram.svg
new file mode 100644
index 0000000..b1235e2
--- /dev/null
+++ b/docs/alive/images/diagram.svg
@@ -0,0 +1 @@
+

[SVG markup omitted; diagram text labels: Web Browser (JavaScript / WebRTC), ASP.NET Core Backend, D-ID API (Avatar Clips), Microsoft Azure Speech Services, EVE LLM (WebSocket)]

\ No newline at end of file
diff --git a/docs/alive/images/preview.webp b/docs/alive/images/preview.webp
new file mode 100644
index 0000000..b337296
Binary files /dev/null and b/docs/alive/images/preview.webp differ
diff --git a/docs/alive/introduction.md b/docs/alive/introduction.md
new file mode 100644
index 0000000..a966da1
--- /dev/null
+++ b/docs/alive/introduction.md
@@ -0,0 +1,64 @@
+---
+sidebar_position: 0.0005
+---
+
+# Introduction
+
+**ALIVE** (Avatar Liveness for Intelligent Virtual Empathy) is an open, modular system for building emotionally intelligent, multimodal virtual assistants. Designed as part of the [SERMAS Project](https://sermasproject.eu/), ALIVE integrates natural language processing, emotional modeling, real-time WebSocket communication, and 3D avatar animation into a unified experience.
+
+![Preview](images/preview.webp "Preview")
+
+At its core, ALIVE aims to combine **meaningful dialogue** with **human-like presence** by bringing together:
+
+- 🤖 **Conversational AI** – powered by LLMs and RAG
+- 🎭 **Emotional Context** – responses tagged with affective cues
+- 🧍‍♀️ **Avatar Expression** – lifelike visual feedback using D-ID and emotional TTS
+- 🎙️ **Multimodal Input** – support for both typed and spoken interactions
+- 🌐 **Real-Time WebSocket Streaming** – ensures low-latency conversational flow
+
+---
+
+## System Components
+
+ALIVE is composed of two tightly integrated subsystems:
+
+### EVA – Empathetic Virtual Assistant
+
+- Real-time WebSocket chat backend (FastAPI)
+- LLM-driven generation with context memory and summarization
+- Emotionally segmented responses with RAG source attribution
+- Conversation persistence and summarization
+
+### AVATAR – Emotional Avatar Rendering
+
+- ASP.NET Core API + Web frontend
+- Integration with D-ID Clips for animated face generation
+- Azure Speech Services for expressive TTS output
+- WebRTC streaming for real-time avatar playback
+
+---
+
+## Key Capabilities
+
+- **Emotion-Aware Messaging**
+  Every message is tagged with emotional tone (e.g., happy, concerned, thoughtful) to drive both voice and facial expression.
+
+- **Contextual Memory**
+  Conversations are persisted, summarized, and reused to maintain continuity across sessions.
+
+- **Hybrid Input & Output**
+  Supports both voice and text inputs and delivers multimodal avatar responses.
+
+- **Open and Modular**
+  Designed to be composable, Dockerized, and adaptable for research, prototyping, or production deployments.
+
+---
+
+## Source Code and Docker Images
+
+ALIVE is maintained under the open-source [`thingenious/sermas`](https://github.com/thingenious/sermas) repository on GitHub.
+
+You can also run the system using our prebuilt Docker images:
+
+- [`thingenious/sermas-eva`](https://hub.docker.com/r/thingenious/sermas-eva): The FastAPI-based backend with LLM and RAG support
+- [`thingenious/sermas-avatar`](https://hub.docker.com/r/thingenious/sermas-avatar): The ASP.NET Core service for avatar animation and voice rendering
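
Once the prebuilt images are running, the deployment guide's health endpoint (`GET /healthz` on the EVA backend, documented as returning `{ "status": "ok" }`) can be polled from a script. The sketch below uses only the Python standard library; the function names `is_healthy` and `check_eva` are illustrative and not part of ALIVE, and the default base URL assumes the Quick Start port mapping (`http://localhost:8000`).

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True when the payload matches the documented {"status": "ok"}."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_eva(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> bool:
    """Query EVA's /healthz endpoint; any network error counts as unhealthy."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except (OSError, ValueError):  # connection refused, timeout, HTTP error, bad URL
        return False
```

Calling `check_eva()` after `docker compose up -d` gives a quick yes/no answer; behind the Nginx proxy shown in the deployment guide, the same check should work with `base_url` set to the public HTTPS origin, since `/healthz` is forwarded to EVA.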