Swastik-59/HelixAI

HelixAI

A local-first AI assistant with multi-worker orchestration, tool execution, and document retrieval. Runs as a desktop application.


What is HelixAI?

HelixAI is a desktop application that runs AI workloads locally. It routes user requests through a FastAPI orchestrator to specialized workers via Redis Streams, persists state in PostgreSQL, and uses Qdrant for vector storage.

The system decomposes requests into task DAGs. Each task specifies a required capability (llm, tool, rag, voice), and the scheduler routes it to the appropriate worker. Workers execute independently and report results back. Because tasks are scheduled from a dependency graph rather than chained prompt by prompt, this is workflow orchestration, not sequential prompt chaining.
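
A minimal sketch of capability-based routing, for illustration only. The `Task` fields, the `Capability` enum, and the `tasks:<capability>` stream naming are assumptions, not HelixAI's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Capability(str, Enum):
    LLM = "llm"
    TOOL = "tool"
    RAG = "rag"
    VOICE = "voice"


@dataclass
class Task:
    task_id: str
    capability: Capability
    payload: dict
    depends_on: list[str] = field(default_factory=list)


def stream_for(task: Task) -> str:
    """Pick the queue from the task's capability, not from a worker name."""
    return f"tasks:{task.capability.value}"  # hypothetical naming scheme


# A two-step workflow: retrieve context, then answer with it.
workflow = [
    Task("t1", Capability.RAG, {"query": "quarterly report"}),
    Task("t2", Capability.LLM, {"prompt": "Summarize the context"}, depends_on=["t1"]),
]
routes = {t.task_id: stream_for(t) for t in workflow}
```

Here `routes` maps each task to a queue derived only from its capability, so adding a second RAG worker would require no change to the workflow definition.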


Architecture

┌─────────────────────────────────────────────────────────────┐
│                    Electron Desktop App                     │
│              React + TypeScript + Tailwind CSS              │
└─────────────────────────┬───────────────────────────────────┘
                          │ HTTP / SSE
                          ▼
┌─────────────────────────────────────────────────────────────┐
│                   FastAPI Orchestrator                      │
│  ┌──────────────┐  ┌────────────────┐  ┌────────────────┐  │
│  │ Intent       │  │ Workflow       │  │ DAG Scheduler  │  │
│  │ Classifier   │  │ Builder        │  │                │  │
│  └──────────────┘  └────────────────┘  └────────────────┘  │
└─────────────────────────┬───────────────────────────────────┘
                          │
         ┌────────────────┼────────────────┐
         ▼                ▼                ▼
┌─────────────┐   ┌─────────────┐   ┌─────────────┐
│   Redis     │   │ PostgreSQL  │   │   Qdrant    │
│   Streams   │   │             │   │             │
└──────┬──────┘   └─────────────┘   └─────────────┘
       │
       │ Capability-based routing
       ▼
┌─────────────────────────────────────────────────────────────┐
│  ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────────┐ │
│  │LLM Worker │ │Tool Worker│ │RAG Worker │ │Voice Worker │ │
│  │ (Ollama)  │ │           │ │ (Qdrant)  │ │ (Whisper)   │ │
│  └───────────┘ └───────────┘ └───────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────┘

Feature Maturity

Stable

Multi-Worker Architecture
Four worker types (LLM, Tool, RAG, Voice) consume from capability-specific Redis Streams. Workers register, send heartbeats, and can be monitored independently.
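
One way a worker loop of this kind could look, as a sketch. The function takes any client that behaves like `redis.Redis` (`xreadgroup`, `xack`, `hset`); the stream name, the `workers:<name>` heartbeat key, and the field names are assumptions rather than the project's actual layout.

```python
import time


def process_batch(client, capability, group, consumer, handler):
    """Read one batch from a capability stream, handle it, and acknowledge.

    `client` is expected to behave like redis.Redis.
    Returns the number of messages handled.
    """
    stream = f"tasks:{capability}"  # hypothetical naming scheme
    # ">" asks for messages not yet delivered to any consumer in this group.
    batches = client.xreadgroup(group, consumer, {stream: ">"}, count=10, block=1000)
    handled = 0
    for _stream, messages in batches or []:
        for message_id, fields in messages:
            handler(fields)
            # Acknowledge only after the handler succeeds, so a crashed
            # worker leaves the message pending for redelivery.
            client.xack(stream, group, message_id)
            handled += 1
    # Heartbeat: record when this worker was last alive (assumed key layout).
    client.hset(
        f"workers:{consumer}",
        mapping={"capability": capability, "last_seen": str(time.time())},
    )
    return handled
```

A real worker would call this in a loop and would first create the consumer group on its stream.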

DAG-Based Task Orchestration
The scheduler resolves task dependencies, propagates parent outputs to child tasks, and marks workflows complete when all tasks finish. Tasks transition through PENDING → QUEUED → RUNNING → COMPLETED states.
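
The scheduling behavior described above can be sketched in a few lines. This is a simplified, synchronous illustration: `run_workflow` and its arguments are hypothetical names, and a real scheduler would dispatch ready tasks to workers instead of calling a function inline.

```python
from enum import Enum


class State(str, Enum):
    PENDING = "PENDING"
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"


def run_workflow(deps, execute):
    """Run tasks in dependency order.

    deps:    {task_id: [parent_task_ids]}
    execute: callable(task_id, parent_outputs) -> output
    Returns (outputs, history), where history records every state change.
    """
    state = {t: State.PENDING for t in deps}
    outputs, history = {}, []

    def move(task, new_state):
        state[task] = new_state
        history.append((task, new_state.value))

    while any(s is not State.COMPLETED for s in state.values()):
        # A task is ready once every parent has completed.
        ready = [
            t for t in deps
            if state[t] is State.PENDING
            and all(state.get(p) is State.COMPLETED for p in deps[t])
        ]
        if not ready:
            raise ValueError("cycle or unknown dependency in workflow")
        for task in ready:
            move(task, State.QUEUED)
        for task in ready:
            move(task, State.RUNNING)
            # Propagate parent outputs to the child task.
            parent_outputs = {p: outputs[p] for p in deps[task]}
            outputs[task] = execute(task, parent_outputs)
            move(task, State.COMPLETED)
    return outputs, history


deps = {"search": [], "retrieve": [], "answer": ["search", "retrieve"]}
outputs, history = run_workflow(
    deps, lambda task, parents: f"{task}({','.join(sorted(parents))})"
)
```

In this example `search` and `retrieve` have no parents and become ready together, while `answer` waits for both and receives their outputs.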

LLM Integration (Ollama)
Connects to local Ollama for inference. Supports multiple models with role-based routing (e.g., assign a coding model to code tasks). Streams tokens via Redis Pub/Sub.
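
Role-based routing can be as simple as a lookup table. The sketch below is an assumption about how such a mapping might look; the role names and the `codellama` and `llama3` model choices are illustrative, not the project's configuration. The request body follows the shape of Ollama's `/api/generate` endpoint.

```python
DEFAULT_MODEL = "llama3"  # assumed fallback model

# Hypothetical role-to-model table.
ROLE_MODELS = {
    "code": "codellama",
    "vision": "llava",
}


def model_for(role: str) -> str:
    """Return the model assigned to a role, or the default if none is set."""
    return ROLE_MODELS.get(role, DEFAULT_MODEL)


def build_generate_request(role: str, prompt: str) -> dict:
    """Build a streaming request body for Ollama's /api/generate endpoint."""
    return {"model": model_for(role), "prompt": prompt, "stream": True}
```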

Tool Execution
Implements web search (DuckDuckGo), web scraping, sandboxed Python execution, shell commands, and file operations (read/write/list/delete). Input validation and SSRF protection included.
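
As an illustration of what SSRF protection for the scraping tool typically involves, the sketch below rejects URLs that resolve to internal addresses. It is not the project's implementation; `is_internal` and `is_url_allowed` are hypothetical helpers.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_internal(ip: str) -> bool:
    """True for addresses that should never be reachable from a scraper."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private or addr.is_loopback or addr.is_link_local
        or addr.is_reserved or addr.is_multicast or addr.is_unspecified
    )


def is_url_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host resolves to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    # Every resolved address must be public; one internal hit rejects the URL.
    return all(not is_internal(info[4][0]) for info in infos)
```

A check like this runs before the request is made. Because DNS answers can change between the check and the fetch, a stricter version would connect to the validated address directly.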

Document Retrieval (RAG)
Chunks uploaded documents, embeds them with sentence-transformers, and stores the embeddings in Qdrant. Retrieves relevant context for queries.
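
The chunking step could be sketched as a sliding window over words. The window size, the overlap, and word-based splitting are assumptions; the README does not say which strategy the RAG worker uses.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk so context is not cut at chunk boundaries."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample = " ".join(str(i) for i in range(10))
chunks = chunk_text(sample, size=4, overlap=1)
```

Each chunk would then be embedded and written to the vector store together with a reference to its source document.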

Chat Persistence
Saves conversations to PostgreSQL. Sessions can be resumed from the sidebar.

Prometheus Metrics
Exposes task counts, queue depths, worker health, and API latency. Grafana dashboards included in the repo.

Experimental

Long-Term Memory
Stores user facts and preferences in Qdrant. Retrieves them and injects them into prompts. Extraction heuristics are basic.

Vision Analysis
Routes images to vision-capable models (LLaVA). Upload and query flow works, but depends on Ollama vision model availability.

Voice Input/Output
Whisper for speech-to-text, Piper TTS or espeak for synthesis. Functional but latency and model loading are unoptimized.

Multi-Model Discussions
Multiple LLMs can debate or build consensus on a topic. Modes: debate, consensus, review, round-robin, expert panel. Feature is implemented but UI integration is minimal.

Workflow Checkpoints
API to save and restore workflow state exists. Resume logic is implemented but not heavily tested.

Plugin System
Plugins can add tools, workers, or integrations. Two example plugins exist (weather, GitHub). Loading and lifecycle management work.

Self-Correction
Detects task failures and can generate recovery plans. Wired into the system but not consistently triggered.


Technology Stack

Frontend: Electron 28, React 18, TypeScript, Tailwind CSS, React Query, ReactFlow, Zustand
Backend: FastAPI, SQLAlchemy, PostgreSQL, Redis Streams, Pydantic
AI/ML: Ollama, Sentence Transformers, Qdrant, Whisper, Piper TTS
Observability: Prometheus, Grafana, structured JSON logging


What Makes This Non-Trivial

  • Tasks route by capability, not by hardcoded worker assignment
  • Workers are separate processes with their own health endpoints
  • DAG scheduler handles dependency resolution and output propagation
  • Redis Streams provide durable, consumer-group-based task queuing
  • Offline mode auto-detection adjusts available tools
  • Desktop app bundles the full backend stack

How to Run

Prerequisites: Docker, Python 3.11+, Node.js 18+, Ollama with a pulled model

./start.sh

This starts PostgreSQL, Redis, Qdrant, the orchestrator, workers, and the desktop app.

Manual startup:

docker compose up -d
source venv/bin/activate
uvicorn orchestrator.main:app --reload

# In separate terminals:
python -m workers.llm_worker.llm_worker
python -m workers.tools_worker.tool_worker
python -m workers.rag_worker.rag_worker
python -m workers.voice_worker.voice_worker

cd desktop && npm run electron:dev

Project Status

Personal project. Actively developed. Focused on local-first AI agent orchestration as a reference implementation.


License

MIT
