
feat(aura): restore long-term memory that's broken by recent changes, persona stability, and async prewarm #34

Open
Raygama wants to merge 4 commits into main from feat/llm-provider-abstraction

Conversation

Raygama (Collaborator) commented Apr 10, 2026

Summary

Restores long-term memory recall, stabilizes AURA's expressive persona, and implements an asynchronous prewarm mechanism to prevent worker initialization timeouts. This PR also synchronizes all environment and dependency files across Windows and macOS.

Changes

  • Memory Restoration: Corrected agent.py session handling to ensure unique conversation IDs per voice call while correctly injecting persistent user facts (identity/preferences) into the system prompt.
  • Async Prewarm Pattern: Refactored the agent/worker entry point in agent.py to offload heavy TTS model loading to a background thread. This allows the worker to register instantly, bypassing the 10s LiveKit registration timeout and maintaining STT connectivity.
  • Expression Lifecycle Sync: Implemented task tracking and cancellation in aura_tts.py to ensure avatar expressions (like tongue clicks) reliably reset to neutral when speech ends, eliminating async race conditions.
  • VTube Studio Hardening: Updated VTubeController to explicitly wipe directly injected parameters at the end of every response.
  • Cross-Platform Dependency Sync: Unified environment.yml and environment-macos.yml to include missing libraries (anthropic, httpx, pyvts) and updated ai-service/requirements.txt.
  • Monorepo Cleanup: Refactored the root package.json to properly manage dashboard and documentation sub-packages with correct directory prefixes.
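The session-handling fix in the first bullet could look roughly like the sketch below. This is illustrative only: build_session, fetch_user_facts, and BASE_PROMPT are hypothetical names, not the actual agent.py API.

```python
import uuid

# Hypothetical base prompt; the real persona prompt lives in the agent config.
BASE_PROMPT = "You are AURA, an expressive voice assistant."

def build_session(user_id: str, fetch_user_facts) -> dict:
    """One session per voice call: fresh conversation ID, persistent facts."""
    facts = fetch_user_facts(user_id)  # e.g. ["Name: Ray", "Likes: synthwave"]
    fact_block = "\n".join(f"- {fact}" for fact in facts)
    system_prompt = f"{BASE_PROMPT}\n\nKnown facts about the user:\n{fact_block}"
    return {
        "conversation_id": str(uuid.uuid4()),  # unique per call, never reused
        "system_prompt": system_prompt,        # long-term memory injected up front
    }
```

The key property is that the conversation ID changes on every call while the fact block is rebuilt from persistent storage each time.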

Testing

  • Run .\start_aura.bat (or start_aura.sh on macOS) and verify that "TTS warmup complete" appears in the logs without triggering a TimeoutError.
  • Initiate a voice session and ask AURA about yourself (e.g., "What is my name?" or "What do I like?"); verify she uses the injected long-term memory facts.
  • Inspect the VTube Studio avatar during and after speech to confirm expressions (like tongue-out) return to neutral immediately after the audio stream finishes.

Raygama added 4 commits April 9, 2026 20:00
Introduce a Provider Abstraction Layer and settings management: add provider registry and provider adapters (OpenAI-compatible and Anthropic), with retry/fallback logic and provider inference.
- Add settings_service (Supabase-backed) and FastAPI /api/v1/settings routes to get/update settings and API keys; wire settings router into app. Refactor LLMService to delegate to provider_registry.
- Enhance memory_service to support OpenAI, OpenRouter and local Ollama embeddings and detect Ollama runtime.
- Update prompter to respect admin/system_prompt from settings.
- Add dashboard ApiKeys UI, test fixtures, and minor model/asset tweaks. Also add new env variables to .env.example for Anthropic, Groq and Ollama.
Expose real-time streaming and async LLM support across backend and frontend.

Backend:
- chat API: add an SSE StreamingResponse for streaming conversations, include identity and stream flags, emit emotion then incremental text deltas, persist interactions asynchronously.
- make emotion and generation nodes async and integrate immediate user-interaction persistence; scrub bracketed tokens before storing.
- LLMService: convert generate to async and add stream helper that proxies provider streams.
- Provider registry: make generate async, add stream() to route streaming providers, use asyncio.to_thread for sync provider.generate with retry/backoff, and use async sleeps.
- MemoryService: improved embedding provider logging, allow nullable assistant_text on add_interaction, persist user/assistant chunks safely, and add get_long_term_memories for RAG context.
- SettingsService: add simple in-memory caching for settings and API keys with TTL and invalidation on updates.
- Providers/openai_compat: handle streaming chunks more robustly and filter internal 'reasoning' tokens.
- General: better error logging and stream error handling.

Frontend/UI:
- Updated AvatarRenderer with numerous animation/behavior fixes (mouth lock during tongue, blink/expression handling, scale adjustments) and smaller style/format cleanups.
- CallOverlay: refactor to use shared getOrCreateIdentity, accept conversationId, improve LiveKit connect flow, UI redesign for centered avatar and controls, and refactors for robustness.
- ChatFeed: visual redesign for empty state and message list, improved tool display and bubble styling.
- package.json: bump pixi.js to ^8.17.1.
- Added new dashboard components and user utility (Presence, SlideOver, lib/user.js) and other UI tweaks.

Voice agent:
- Multiple voice-agent scripts updated (agent, tts, token server, memory, vtube controller, deps) to align with backend changes and improve runtime behavior.

Why: these changes enable low-latency streaming responses from LLMs to the dashboard, improve persistence and memory retrieval for RAG, reduce blocking sync calls, and refresh the UI/avatar experience for live interactions.
Move TTS warmup off the main init path by running it in a background thread and signaling readiness via an asyncio.Event (_tts_ready_event) set in a thread-safe way. prewarm now schedules a non-blocking background warmup and sets the event even on failure; voice_session awaits the event with a 60s timeout to avoid hanging.
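The prewarm pattern described above could be sketched as follows. TTSWarmup and load_tts_model are stand-in names, not the actual agent.py code; the 0.1s sleep stands in for the real multi-second model load.

```python
import asyncio
import threading
import time

def load_tts_model():
    # Stand-in for the heavy TTS model load (the real one can exceed 10s).
    time.sleep(0.1)
    return object()

class TTSWarmup:
    def __init__(self):
        self.ready = asyncio.Event()  # signaled once the model is usable
        self.model = None

    def _warmup(self, loop: asyncio.AbstractEventLoop) -> None:
        try:
            self.model = load_tts_model()
        finally:
            # asyncio.Event is not thread-safe; set it via the loop,
            # and set it even on failure so sessions fail fast, not hang.
            loop.call_soon_threadsafe(self.ready.set)

    def prewarm(self) -> None:
        # Returns immediately, so the worker registers within its timeout.
        loop = asyncio.get_running_loop()
        threading.Thread(target=self._warmup, args=(loop,), daemon=True).start()

async def voice_session(warmup: TTSWarmup) -> bool:
    # Bounded wait (60s in the PR) instead of blocking forever.
    await asyncio.wait_for(warmup.ready.wait(), timeout=60)
    return warmup.model is not None
```

The worker entry point calls prewarm() and returns; only the first voice session pays (a bounded) wait for the model.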

Track and manage avatar expression tasks in aura_tts: maintain expr_tasks, add done callbacks to remove completed tasks, handle cancellation by resetting neutral, and cancel/await pending tasks at the end to ensure the avatar returns to neutral reliably.
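The task bookkeeping described above might look like this minimal sketch. ExpressionManager is a hypothetical name; the real aura_tts.py presumably tracks tasks on the TTS object itself.

```python
import asyncio

class ExpressionManager:
    def __init__(self, reset_neutral):
        self._reset_neutral = reset_neutral  # coroutine that neutralizes the avatar
        self._expr_tasks: set[asyncio.Task] = set()

    def start_expression(self, coro) -> asyncio.Task:
        task = asyncio.create_task(coro)
        self._expr_tasks.add(task)
        # Done callback drops completed tasks so the set never grows unbounded.
        task.add_done_callback(self._expr_tasks.discard)
        return task

    async def finish_speech(self) -> None:
        # Cancel anything still animating, wait for it to unwind
        # (swallowing CancelledError), then force the avatar to neutral.
        for task in list(self._expr_tasks):
            task.cancel()
        await asyncio.gather(*self._expr_tasks, return_exceptions=True)
        await self._reset_neutral()
```

Cancelling before resetting is what closes the race: no in-flight expression task can re-apply a parameter after the neutral reset runs.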

Tighten VTube reset logic: add debug logging, explicitly toggle off active expressions, reset injected parameters including common 'sticking' params (TongueOut, MouthOpen, EyeOpenLeft/Right), and clear injected state. Also simplify/adjust bilingual emotion keyword lists and clarify comments. These changes reduce blocking, prevent stuck expressions, and make resets more robust.
Raygama (Collaborator, Author) commented Apr 10, 2026

Laid out groundwork for #27
