Self-MoA (ICLR 2025) as a drop-in OpenAI-compatible proxy. Fans each request out to 2 LLM upstreams, then synthesizes verbatim. +3.8% to +6.6% on reasoning. 627 LOC, FastAPI.
ensemble fastapi llamacpp vllm local-llm litellm llm-router mixture-of-agents openai-compatible self-moa
Updated May 16, 2026 - Python
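
The fan-out-then-synthesize flow named in the description can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the `Upstream` type, `build_synthesis_prompt`, `self_moa`, the synthesis prompt wording, and the stub upstreams are all hypothetical names chosen for the example, and real upstreams would be HTTP calls to OpenAI-compatible endpoints rather than in-process stubs.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical type: an upstream takes a prompt and returns a completion string.
Upstream = Callable[[str], Awaitable[str]]


def build_synthesis_prompt(question: str, candidates: List[str]) -> str:
    """Pack the original question and all candidate answers into one prompt."""
    numbered = "\n\n".join(
        f"[Response {i}]\n{text}" for i, text in enumerate(candidates, start=1)
    )
    return (
        "Synthesize the candidate responses below into one high-quality answer.\n\n"
        f"Question:\n{question}\n\n{numbered}"
    )


async def self_moa(question: str, proposers: List[Upstream], aggregator: Upstream) -> str:
    """Fan the question out to every proposer concurrently, then aggregate once."""
    candidates = await asyncio.gather(*(p(question) for p in proposers))
    return await aggregator(build_synthesis_prompt(question, list(candidates)))


# Stub upstreams standing in for real LLM endpoints (assumption for the demo).
async def stub_upstream_a(prompt: str) -> str:
    return "4"


async def stub_upstream_b(prompt: str) -> str:
    return "four"


async def stub_aggregator(prompt: str) -> str:
    # Reports how many candidates it was handed instead of calling a model.
    return f"synthesized from {prompt.count('[Response ')} candidates"


def demo() -> str:
    return asyncio.run(
        self_moa("What is 2 + 2?", [stub_upstream_a, stub_upstream_b], stub_aggregator)
    )
```

Calling `demo()` runs both stub proposers concurrently and passes their outputs to the stub aggregator. In a proxy setting, the same `self_moa` coroutine would presumably sit behind a chat-completions route, with the two proposers pointing at the configured upstreams.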