diff --git a/QUICKSTART.md b/QUICKSTART.md index f2bc6d7..caa9680 100644 --- a/QUICKSTART.md +++ b/QUICKSTART.md @@ -106,6 +106,54 @@ deploy the GTM specialist to production ← Tier 3, asks for YES --- +## Step 5 — Give your agent a persona + +By default Hermes runs as the Super Agent — a generalist that owns workflows and routes tasks. But you can switch it to a specialist persona at any time, right from Telegram. + +**See what personas are available:** +``` +/identity +``` +Hermes will reply with the current persona and a list of every option — like `coo`, `gtm`, `head_of_ops`. + +**Switch to a persona:** +``` +/identity coo +``` +From that point on, every message in that chat goes to Alex, your COO — with Alex's role, voice, and tool access baked in. The persona sticks until you change it or restart. + +**Create your own persona:** + +You don't need to touch any code. Just create a file called `.yaml` in `src/agent_os/orchestrator/config/identities/`. For example, to create a video production agent: + +```yaml +# src/agent_os/orchestrator/config/identities/video_agent.yaml +name: Vex +title: Video Production Agent +system_prompt: | + You are Vex, the video production agent. You own the full video pipeline: + scripting, transcription, editing workflows, thumbnail generation, and + publishing to YouTube/social. You know ffmpeg and have shell access. + You remember every project we've worked on together. +tools_allowed: + - hermes_self + - terminal + - exa +default_tier_ceiling: 2 +``` + +Commit that file, deploy, then send `/identity video_agent` in Telegram. That's it — Vex is live. + +**If you want a persona to be the default** (so it loads on startup without needing `/identity`), set this in your `.env` or Railway environment variables: + +``` +AGENT_IDENTITY=video_agent +``` + +Each machine or Railway service can have its own `AGENT_IDENTITY`. 
One instance is the Super Agent, another is the COO, a third is your video agent — all separate processes, all sharing the same memory vault so they have the same conversation history. + +--- + ## What to do if something breaks | Problem | Fix | diff --git a/SETUP.md b/SETUP.md index 2097302..6d5feeb 100644 --- a/SETUP.md +++ b/SETUP.md @@ -6,6 +6,71 @@ The default deploy is **Railway-managed** — every fabric service runs as its o --- +## Agent personas — giving each agent its own identity + +Every agent in the fleet has a **persona**: a name, a role, a voice, and a list of tools it's allowed to use. Personas are defined in plain YAML files — no code changes needed to create or swap them. + +### How it works + +When an agent starts up, it reads its identity from an environment variable called `AGENT_IDENTITY`. That name maps to a file in `src/agent_os/orchestrator/config/identities/.yaml`. The file contains a `system_prompt` that gets injected into every LLM call, so the agent always knows who it is, what it owns, and what tools it has access to. + +Four personas ship out of the box: + +| Name | Who they are | +|---|---| +| `supersan` | The Super Agent — the primary orchestrator. Owns everything, routes work to the right specialist. | +| `coo` | Alex, the COO — sees the whole org, delegates aggressively, holds everyone accountable. | +| `gtm` | Jordan, the GTM Agent — owns content, leads, and brand. Knows your CRM and email tools. | +| `head_of_ops` | Morgan, the Head of Operations — runs the client pipeline, watches the funnel, catches broken jobs. | + +### Switching personas from Telegram + +You don't need to redeploy to switch personas. In any Telegram conversation with Hermes: + +- `/identity` — shows which persona is active and lists all available options +- `/identity coo` — switches to Alex for the rest of that conversation +- `/identity video_agent` — switches to any custom persona you've created + +The persona you set is remembered for that chat session. 
Every message after the switch goes through that agent's system prompt, memory, and tool rules. + +### Setting a default persona for a deployment + +If you want a service to always start as a specific persona, set this in its environment: + +``` +AGENT_IDENTITY=coo +``` + +On Railway, set it under the service's Variables tab. On a VPS, add it to the `.env` file. On Docker, pass it with `-e AGENT_IDENTITY=coo`. + +### Creating a new persona + +1. Create a YAML file at `src/agent_os/orchestrator/config/identities/.yaml` +2. Give it a `system_prompt` that describes who the agent is, what it owns, and how it should behave +3. Optionally add `tools_allowed`, `tools_denied`, and `default_tier_ceiling` +4. Commit and deploy — then send `/identity ` in Telegram to activate it + +Example — a video production agent: + +```yaml +name: Vex +title: Video Production Agent +system_prompt: | + You are Vex, the video production agent. You own the full video pipeline: + scripting, transcription, editing workflows, thumbnail generation, and + publishing to YouTube and social. You have shell access and know ffmpeg. + You remember every project we've worked on together. +tools_allowed: + - hermes_self + - terminal + - exa +default_tier_ceiling: 2 +``` + +The file name (without `.yaml`) is what you type after `/identity`. That's all there is to it. + +--- + ## Why both Railway and DigitalOcean? - **Railway** runs the **fixed always-on services** (NATS, Temporal, Coordinator, Archon wrapper, Admiral). One Dockerfile per service, auto-restart, public TLS URLs, env vars in a dashboard. You set it up once and forget. 
diff --git a/pyproject.toml b/pyproject.toml index 591324f..00b6adb 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -49,6 +49,7 @@ select = ["E", "F", "W", "I", "B", "UP"] [tool.pytest.ini_options] testpaths = ["tests"] asyncio_mode = "auto" +pythonpath = ["src"] [build-system] requires = ["hatchling"] diff --git a/src/agent_os/channels/telegram/bot.py b/src/agent_os/channels/telegram/bot.py index 6fba5f5..569491a 100644 --- a/src/agent_os/channels/telegram/bot.py +++ b/src/agent_os/channels/telegram/bot.py @@ -38,6 +38,10 @@ _PENDING_APPROVALS: dict[int, dict[str, Any]] = {} _APPROVAL_TTL_SECONDS = 300 +# chat_id → active identity name (set via /identity ). +# Falls back to AGENT_IDENTITY env var when not set. +_ACTIVE_IDENTITY: dict[int, str] = {} + # Telegram message hard cap is 4096 chars; leave headroom for our wrapper text. _MAX_BODY_CHARS = 3500 @@ -111,9 +115,9 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non # Lazy imports keep cold-start cheap and avoid hard deps when running # other parts of the system. - from agent_os.orchestrator import plan_card, tier_classifier, intent_classifier - from agent_os.orchestrator.adapters.job_router import Job + from agent_os.orchestrator import intent_classifier, plan_card from agent_os.orchestrator.adapters import plan_overrides + from agent_os.orchestrator.adapters.job_router import Job from agent_os.orchestrator.tool_planner import plan as plan_fn # Override commands (/cancel, /use, /why, /plan on|off, /tier N, YES) — these @@ -168,6 +172,31 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non f"Forced to tier {pending['plan'].tier}. 
Reply 'yes' to run.") return + if override.kind == "identity": + name = override.identity + if not name: + # List available identities + from pathlib import Path # noqa: PLC0415 + identity_dir = ( + Path(__file__).parents[2] # parents[2] is src/agent_os + / "orchestrator/config/identities" + ) + available = sorted( + p.stem for p in identity_dir.glob("*.yaml") + ) + current = _ACTIVE_IDENTITY.get(chat_id) or os.getenv("AGENT_IDENTITY", "supersan") + await _send( + client, chat_id, + f"Current identity: {current}\n" + f"Available: {', '.join(available)}\n\n" + "Switch with: /identity ", + ) + return + _ACTIVE_IDENTITY[chat_id] = name + await _send(client, chat_id, + f"Identity set to '{name}'. All future messages will use this persona.") + return + if override.kind == "confirm": if not pending: await _send(client, chat_id, "No pending tier-3 plan to confirm.") @@ -210,7 +239,10 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non # outbound intent it can prove from the wording. No fuzziness, no LLM # call, no auto-spawn from ambiguous prompts. intent = intent_classifier.classify(text) - job = Job(prompt=text, tags=set(intent.tags)) + meta: dict[str, str] = {"user_id": str(chat_id)} + if chat_id in _ACTIVE_IDENTITY: + meta["identity"] = _ACTIVE_IDENTITY[chat_id] + job = Job(prompt=text, tags=set(intent.tags), metadata=meta) tool_plan = plan_fn(job, identity="primary_hermes") # tool_plan.tier already came from tier_classifier.classify with the @@ -283,6 +315,8 @@ def _handle_command(text: str) -> str: " /why explain how this plan was picked\n" " /tier <1|2|3> force a tier override\n\n" "Other commands:\n" + " /identity show current persona + available options\n" + " /identity switch persona (e.g. /identity coo)\n" " /status — quick fleet status\n" " /help — this message" ) @@ -329,7 +363,7 @@ async def run_bot() -> None: # Handle each message in its own task so a slow LLM # call doesn't block the poll loop. 
asyncio.create_task(_handle_message(client, msg)) - except (httpx.HTTPError, asyncio.TimeoutError) as exc: + except (TimeoutError, httpx.HTTPError) as exc: logger.warning("Telegram poll error: %s — retrying in 5s", exc) await asyncio.sleep(5) except Exception: diff --git a/src/agent_os/orchestrator/adapters/plan_overrides.py b/src/agent_os/orchestrator/adapters/plan_overrides.py index 6db4013..1608c0b 100644 --- a/src/agent_os/orchestrator/adapters/plan_overrides.py +++ b/src/agent_os/orchestrator/adapters/plan_overrides.py @@ -24,7 +24,7 @@ OverrideKind = Literal[ "cancel", "use", "why", "plan_on", "plan_off", - "tier", "confirm", "unknown", + "tier", "confirm", "identity", "unknown", ] @@ -35,11 +35,13 @@ class Override: tool: str | None = None model: str | None = None tier: int | None = None + identity: str | None = None error: str | None = None _USE_RE = re.compile(r"^/use\s+([A-Za-z0-9_]+)(?:\s+([A-Za-z0-9_.-]+))?\s*$") _TIER_RE = re.compile(r"^/tier\s+([123])\s*$") +_IDENTITY_RE = re.compile(r"^/identity(?:\s+([A-Za-z0-9_]+))?\s*$") def parse(text: str) -> Override | None: @@ -91,6 +93,16 @@ def parse(text: str) -> Override | None: ) return Override(kind="tier", raw=raw, tier=int(m.group(1))) + if lower.startswith("/identity"): + m = _IDENTITY_RE.match(lower) + if not m: + return Override( + kind="unknown", + raw=raw, + error="usage: /identity (e.g. 
/identity coo)", + ) + return Override(kind="identity", raw=raw, identity=m.group(1)) + if raw.startswith("/"): return Override( kind="unknown", diff --git a/src/agent_os/orchestrator/adapters/vault_memory.py b/src/agent_os/orchestrator/adapters/vault_memory.py index 4c9713b..60be51a 100644 --- a/src/agent_os/orchestrator/adapters/vault_memory.py +++ b/src/agent_os/orchestrator/adapters/vault_memory.py @@ -27,3 +27,33 @@ def append_message(canonical_user_id: str, role: str, content: str) -> None: def load_history(canonical_user_id: str) -> str: p = conversation_path(canonical_user_id) return p.read_text() if p.exists() else "" + + +def parse_history(canonical_user_id: str, limit: int = 10) -> list[dict]: + """Return recent conversation as OpenAI-style messages list. + + Parses the markdown log written by append_message() back into + [{role, content}, ...] so runtimes can pass it directly to LLM APIs. + limit is the number of turns (each turn = one user + one assistant message). + """ + raw = load_history(canonical_user_id) + if not raw: + return [] + messages: list[dict] = [] + current_role: str | None = None + current_lines: list[str] = [] + for line in raw.split("\n"): + if line.startswith("## "): + if current_role and current_lines: + content = "\n".join(current_lines).strip() + if content: + messages.append({"role": current_role, "content": content}) + current_role = line[3:].strip() + current_lines = [] + else: + current_lines.append(line) + if current_role and current_lines: + content = "\n".join(current_lines).strip() + if content: + messages.append({"role": current_role, "content": content}) + return messages[-(limit * 2):] diff --git a/src/agent_os/runtimes/hermes_self/invoke.py b/src/agent_os/runtimes/hermes_self/invoke.py index cd05682..90e7290 100644 --- a/src/agent_os/runtimes/hermes_self/invoke.py +++ b/src/agent_os/runtimes/hermes_self/invoke.py @@ -19,11 +19,43 @@ import logging import os import time +from pathlib import Path from 
agent_os.runtimes._base import RuntimeResult, new_job_id, write_run_artifact logger = logging.getLogger(__name__) +# --------------------------------------------------------------------------- +# Identity — loaded once per identity name, keyed by AGENT_IDENTITY env var. +# +# Each deployed agent sets AGENT_IDENTITY= (e.g. coo, gtm, head_of_ops, +# supersan). The name maps directly to a YAML file in +# orchestrator/config/identities/.yaml. +# Defaults to "supersan" if unset. +# --------------------------------------------------------------------------- + +_IDENTITY_ROOT = Path(__file__).parents[2] / "orchestrator/config/identities" +_PROMPT_CACHE: dict[str, str] = {} + + +def _get_system_prompt(identity: str | None = None) -> str: + name = identity or os.getenv("AGENT_IDENTITY", "supersan") + if name in _PROMPT_CACHE: + return _PROMPT_CACHE[name] + try: + import yaml # noqa: PLC0415 + p = _IDENTITY_ROOT / f"{name}.yaml" + data = yaml.safe_load(p.read_text()) + _PROMPT_CACHE[name] = (data.get("system_prompt") or "").strip() + except Exception as exc: + logger.warning("Could not load system prompt for identity %r: %s", name, exc) + _PROMPT_CACHE[name] = "" + return _PROMPT_CACHE[name] + + +# --------------------------------------------------------------------------- +# Model selection +# --------------------------------------------------------------------------- def _default_model() -> str: """Final fallback — must match `default` task_class in config/models.yaml.""" @@ -34,6 +66,10 @@ def _default_model() -> str: ) +# --------------------------------------------------------------------------- +# Entry point +# --------------------------------------------------------------------------- + def invoke(job) -> RuntimeResult: """Run a single LLM call for the job's prompt. 
@@ -52,54 +88,95 @@ def invoke(job) -> RuntimeResult: if not isinstance(meta, dict): meta = {} model = meta.get("model") or meta.get("model_recommendation") or _default_model() + user_id = meta.get("user_id", "default") + identity = meta.get("identity") # optional per-job override; falls back to AGENT_IDENTITY env if not prompt: return _result(job_id, "error", {"error": "empty prompt"}, t0) + # Load identity and conversation history before the LLM call + from agent_os.orchestrator.adapters import vault_memory as _vault # noqa: PLC0415 + system_prompt = _get_system_prompt(identity) + history = _vault.parse_history(user_id, limit=10) + try: - text = _call_llm(model, prompt) + text = _call_llm(model, prompt, system_prompt=system_prompt, history=history) except Exception as exc: logger.warning("hermes_self LLM call failed: %s", exc) return _result(job_id, "error", {"error": str(exc), "model": model}, t0) + # Persist both turns so the next call can load them as context + _vault.append_message(user_id, "user", prompt) + _vault.append_message(user_id, "assistant", text) + return _result(job_id, "completed", {"text": text, "model": model}, t0) -def _call_llm(model: str, prompt: str) -> str: +# --------------------------------------------------------------------------- +# LLM dispatch +# --------------------------------------------------------------------------- + +def _call_llm( + model: str, + prompt: str, + *, + system_prompt: str = "", + history: list | None = None, +) -> str: """Single chat completion. 
Routes by model id prefix.""" if model.startswith("claude-"): - return _call_anthropic(model, prompt) - return _call_openai_compat(model, prompt) + return _call_anthropic(model, prompt, system_prompt=system_prompt, history=history) + return _call_openai_compat(model, prompt, system_prompt=system_prompt, history=history) -def _call_anthropic(model: str, prompt: str) -> str: +def _call_anthropic( + model: str, + prompt: str, + *, + system_prompt: str = "", + history: list | None = None, +) -> str: api_key = os.getenv("ANTHROPIC_API_KEY", "") if not api_key: raise RuntimeError("ANTHROPIC_API_KEY not set — cannot call claude") from anthropic import Anthropic # imported lazily so missing dep doesn't block boot client = Anthropic(api_key=api_key) - msg = client.messages.create( - model=model, - max_tokens=2048, - messages=[{"role": "user", "content": prompt}], - ) + + messages = list(history or []) + messages.append({"role": "user", "content": prompt}) + + kwargs: dict = {"model": model, "max_tokens": 2048, "messages": messages} + if system_prompt: + kwargs["system"] = system_prompt + + msg = client.messages.create(**kwargs) return "".join(b.text for b in msg.content if hasattr(b, "text")) -def _call_openai_compat(model: str, prompt: str) -> str: +def _call_openai_compat( + model: str, + prompt: str, + *, + system_prompt: str = "", + history: list | None = None, +) -> str: base_url, api_key = _resolve_openai_compat(model) if not api_key: raise RuntimeError(f"No API key configured for model {model!r}") from openai import OpenAI - kwargs = {"api_key": api_key} - if base_url: - kwargs["base_url"] = base_url - client = OpenAI(**kwargs) + client = OpenAI(api_key=api_key, base_url=base_url) if base_url else OpenAI(api_key=api_key) + + messages = [] + if system_prompt: + messages.append({"role": "system", "content": system_prompt}) + messages.extend(history or []) + messages.append({"role": "user", "content": prompt}) + resp = client.chat.completions.create( model=model, - 
messages=[{"role": "user", "content": prompt}], + messages=messages, max_tokens=2048, ) return resp.choices[0].message.content or "" @@ -114,10 +191,17 @@ def _resolve_openai_compat(model: str) -> tuple[str | None, str]: if model.startswith(("kimi", "moonshot")): return "https://api.moonshot.ai/v1", os.getenv("MOONSHOT_API_KEY", "") if model.startswith(("gemini", "google/")): - return "https://generativelanguage.googleapis.com/v1beta/openai", os.getenv("GOOGLE_API_KEY", "") + return ( + "https://generativelanguage.googleapis.com/v1beta/openai", + os.getenv("GOOGLE_API_KEY", ""), + ) return "https://openrouter.ai/api/v1", os.getenv("OPENROUTER_API_KEY", "") +# --------------------------------------------------------------------------- +# Result helper +# --------------------------------------------------------------------------- + def _result(job_id: str, status: str, output: dict, t0: float) -> RuntimeResult: result = RuntimeResult( runtime="hermes_self",