jbellsolutions · May 15, 2026 · May 14, 2026
diff --git a/QUICKSTART.md b/QUICKSTART.md
@@ -106,6 +106,54 @@ deploy the GTM specialist to production   ← Tier 3, asks for YES
 
 ---
 
+## Step 5 — Give your agent a persona
+
+By default Hermes runs as the Super Agent — a generalist that owns workflows and routes tasks. But you can switch it to a specialist persona at any time, right from Telegram.
+
+**See what personas are available:**
+```
+/identity
+```
+Hermes will reply with the current persona and a list of every option — like `coo`, `gtm`, `head_of_ops`.
+
+**Switch to a persona:**
+```
+/identity coo
+```
+From that point on, every message in that chat goes to Alex, your COO — with Alex's role, voice, and tool access baked in. The persona sticks until you change it or restart.
+
+**Create your own persona:**
+
+You don't need to touch any code. Just create a file called `<name>.yaml` in `src/agent_os/orchestrator/config/identities/`. For example, to create a video production agent:
+
+```yaml
+# src/agent_os/orchestrator/config/identities/video_agent.yaml
+name: Vex
+title: Video Production Agent
+system_prompt: |
+  You are Vex, the video production agent. You own the full video pipeline:
+  scripting, transcription, editing workflows, thumbnail generation, and
+  publishing to YouTube/social. You know ffmpeg and have shell access.
+  You remember every project we've worked on together.
+tools_allowed:
+  - hermes_self
+  - terminal
+  - exa
+default_tier_ceiling: 2
+```
+
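+Commit that file, deploy, then send `/identity video_agent` in Telegram, and Vex is live. The command is lowercased before it is parsed and accepts only letters, digits, and underscores, so keep the file name lowercase and free of hyphens.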
+
+**If you want a persona to be the default** (so it loads on startup without needing `/identity`), set this in your `.env` or Railway environment variables:
+
+```
+AGENT_IDENTITY=video_agent
+```
+
+Each machine or Railway service can have its own `AGENT_IDENTITY`. One instance is the Super Agent, another is the COO, a third is your video agent — all separate processes, all sharing the same memory vault so they have the same conversation history.
+
+---
+
 ## What to do if something breaks
 
 | Problem | Fix |

diff --git a/SETUP.md b/SETUP.md
@@ -6,6 +6,71 @@ The default deploy is **Railway-managed** — every fabric service runs as its o
 
 ---
 
+## Agent personas — giving each agent its own identity
+
+Every agent in the fleet has a **persona**: a name, a role, a voice, and a list of tools it's allowed to use. Personas are defined in plain YAML files — no code changes needed to create or swap them.
+
+### How it works
+
+When an agent starts up, it reads its identity from an environment variable called `AGENT_IDENTITY`. That name maps to a file in `src/agent_os/orchestrator/config/identities/<name>.yaml`. The file contains a `system_prompt` that gets injected into every LLM call, so the agent always knows who it is, what it owns, and what tools it has access to.
+
+Four personas ship out of the box:
+
+| Name | Who they are |
+|---|---|
+| `supersan` | The Super Agent — the primary orchestrator. Owns everything, routes work to the right specialist. |
+| `coo` | Alex, the COO — sees the whole org, delegates aggressively, holds everyone accountable. |
+| `gtm` | Jordan, the GTM Agent — owns content, leads, and brand. Knows your CRM and email tools. |
+| `head_of_ops` | Morgan, the Head of Operations — runs the client pipeline, watches the funnel, catches broken jobs. |
+
+### Switching personas from Telegram
+
+You don't need to redeploy to switch personas. In any Telegram conversation with Hermes:
+
+- `/identity` — shows which persona is active and lists all available options
+- `/identity coo` — switches to Alex for the rest of that conversation
+- `/identity video_agent` — switches to any custom persona you've created
+
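+The persona you set is held in memory per chat: it stays active until you switch again or the bot process restarts, after which the chat falls back to `AGENT_IDENTITY`. Every message after the switch goes through that agent's system prompt, memory, and tool rules.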
+
+### Setting a default persona for a deployment
+
+If you want a service to always start as a specific persona, set this in its environment:
+
+```
+AGENT_IDENTITY=coo
+```
+
+On Railway, set it under the service's Variables tab. On a VPS, add it to the `.env` file. On Docker, pass it with `-e AGENT_IDENTITY=coo`.
+
+### Creating a new persona
+
+1. Create a YAML file at `src/agent_os/orchestrator/config/identities/<name>.yaml`
+2. Give it a `system_prompt` that describes who the agent is, what it owns, and how it should behave
+3. Optionally add `tools_allowed`, `tools_denied`, and `default_tier_ceiling`
+4. Commit and deploy — then send `/identity <name>` in Telegram to activate it
+
+Example — a video production agent:
+
+```yaml
+name: Vex
+title: Video Production Agent
+system_prompt: |
+  You are Vex, the video production agent. You own the full video pipeline:
+  scripting, transcription, editing workflows, thumbnail generation, and
+  publishing to YouTube and social. You have shell access and know ffmpeg.
+  You remember every project we've worked on together.
+tools_allowed:
+  - hermes_self
+  - terminal
+  - exa
+default_tier_ceiling: 2
+```
+
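+The file name (without `.yaml`) is what you type after `/identity`. Keep it lowercase and limited to letters, digits, and underscores, because the command is lowercased and then matched against `[A-Za-z0-9_]+`.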
+
+---
+
 ## Why both Railway and DigitalOcean?
 
 - **Railway** runs the **fixed always-on services** (NATS, Temporal, Coordinator, Archon wrapper, Admiral). One Dockerfile per service, auto-restart, public TLS URLs, env vars in a dashboard. You set it up once and forget.

diff --git a/pyproject.toml b/pyproject.toml
@@ -49,6 +49,7 @@ select = ["E", "F", "W", "I", "B", "UP"]
 [tool.pytest.ini_options]
 testpaths = ["tests"]
 asyncio_mode = "auto"
+pythonpath = ["src"]
 
 [build-system]
 requires = ["hatchling"]

diff --git a/src/agent_os/channels/telegram/bot.py b/src/agent_os/channels/telegram/bot.py
@@ -38,6 +38,10 @@
 _PENDING_APPROVALS: dict[int, dict[str, Any]] = {}
 _APPROVAL_TTL_SECONDS = 300
 
+# chat_id → active identity name (set via /identity <name>).
+# Falls back to AGENT_IDENTITY env var when not set.
+_ACTIVE_IDENTITY: dict[int, str] = {}
+
 # Telegram message hard cap is 4096 chars; leave headroom for our wrapper text.
 _MAX_BODY_CHARS = 3500
 
@@ -111,9 +115,9 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non
 
     # Lazy imports keep cold-start cheap and avoid hard deps when running
     # other parts of the system.
-    from agent_os.orchestrator import plan_card, tier_classifier, intent_classifier
-    from agent_os.orchestrator.adapters.job_router import Job
+    from agent_os.orchestrator import intent_classifier, plan_card
     from agent_os.orchestrator.adapters import plan_overrides
+    from agent_os.orchestrator.adapters.job_router import Job
     from agent_os.orchestrator.tool_planner import plan as plan_fn
 
     # Override commands (/cancel, /use, /why, /plan on|off, /tier N, YES) — these
@@ -168,6 +172,31 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non
                         f"Forced to tier {pending['plan'].tier}. Reply 'yes' to run.")
             return
 
+        if override.kind == "identity":
+            name = override.identity
+            if not name:
+                # List available identities
+                from pathlib import Path  # noqa: PLC0415
+                identity_dir = (
+                    Path(__file__).parents[2]  # src/agent_os (bot.py is in channels/telegram/)
+                    / "orchestrator/config/identities"
+                )
+                available = sorted(
+                    p.stem for p in identity_dir.glob("*.yaml")
+                )
+                current = _ACTIVE_IDENTITY.get(chat_id) or os.getenv("AGENT_IDENTITY", "supersan")
+                await _send(
+                    client, chat_id,
+                    f"Current identity: {current}\n"
+                    f"Available: {', '.join(available)}\n\n"
+                    "Switch with: /identity <name>",
+                )
+                return
+            _ACTIVE_IDENTITY[chat_id] = name
+            await _send(client, chat_id,
+                        f"Identity set to '{name}'. All future messages will use this persona.")
+            return
+
         if override.kind == "confirm":
             if not pending:
                 await _send(client, chat_id, "No pending tier-3 plan to confirm.")
@@ -210,7 +239,10 @@ async def _handle_message(client: httpx.AsyncClient, msg: dict[str, Any]) -> Non
     # outbound intent it can prove from the wording. No fuzziness, no LLM
     # call, no auto-spawn from ambiguous prompts.
     intent = intent_classifier.classify(text)
-    job = Job(prompt=text, tags=set(intent.tags))
+    meta: dict[str, str] = {"user_id": str(chat_id)}
+    if chat_id in _ACTIVE_IDENTITY:
+        meta["identity"] = _ACTIVE_IDENTITY[chat_id]
+    job = Job(prompt=text, tags=set(intent.tags), metadata=meta)
 
     tool_plan = plan_fn(job, identity="primary_hermes")
     # tool_plan.tier already came from tier_classifier.classify with the
@@ -283,6 +315,8 @@ def _handle_command(text: str) -> str:
             "  /why        explain how this plan was picked\n"
             "  /tier <1|2|3>   force a tier override\n\n"
             "Other commands:\n"
+            "  /identity           show current persona + available options\n"
+            "  /identity <name>    switch persona (e.g. /identity coo)\n"
             "  /status — quick fleet status\n"
             "  /help — this message"
         )
@@ -329,7 +363,7 @@ async def run_bot() -> None:
                         # Handle each message in its own task so a slow LLM
                         # call doesn't block the poll loop.
                         asyncio.create_task(_handle_message(client, msg))
-            except (httpx.HTTPError, asyncio.TimeoutError) as exc:
+            except (TimeoutError, httpx.HTTPError) as exc:
                 logger.warning("Telegram poll error: %s — retrying in 5s", exc)
                 await asyncio.sleep(5)
             except Exception:

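Review note: the `/identity <name>` branch above stores whatever name it receives, so a typo is only discovered later. One possible guard, sketched under assumptions, checks the name against the YAML files on disk before storing it. `identities_dir_for`, `list_identities`, and `switch_identity` are hypothetical helpers, not functions from this PR. The first one also shows that, for a module at `src/agent_os/channels/telegram/bot.py`, `parents[2]` is the `agent_os` package directory.

```python
from pathlib import Path, PurePosixPath


def identities_dir_for(module_file: str) -> PurePosixPath:
    """Resolve the identities directory relative to a module two levels below the package."""
    # parents[0]=telegram, parents[1]=channels, parents[2]=agent_os
    return PurePosixPath(module_file).parents[2] / "orchestrator/config/identities"


def list_identities(identity_dir: Path) -> list[str]:
    """Return persona names, i.e. the stems of the *.yaml files in the directory."""
    return sorted(p.stem for p in identity_dir.glob("*.yaml"))


def switch_identity(
    active: dict[int, str], chat_id: int, name: str, identity_dir: Path
) -> str:
    """Store the persona for a chat only if a matching YAML file exists."""
    available = list_identities(identity_dir)
    if name not in available:
        return f"Unknown identity '{name}'. Available: {', '.join(available)}"
    active[chat_id] = name
    return f"Identity set to '{name}'."
```

The returned string is meant to be passed to the bot's existing send helper; the guard leaves the per-chat map untouched when the name is unknown.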
diff --git a/src/agent_os/orchestrator/adapters/plan_overrides.py b/src/agent_os/orchestrator/adapters/plan_overrides.py
@@ -24,7 +24,7 @@
 
 OverrideKind = Literal[
     "cancel", "use", "why", "plan_on", "plan_off",
-    "tier", "confirm", "unknown",
+    "tier", "confirm", "identity", "unknown",
 ]
 
 
@@ -35,11 +35,13 @@ class Override:
     tool: str | None = None
     model: str | None = None
     tier: int | None = None
+    identity: str | None = None
     error: str | None = None
 
 
 _USE_RE = re.compile(r"^/use\s+([A-Za-z0-9_]+)(?:\s+([A-Za-z0-9_.-]+))?\s*$")
 _TIER_RE = re.compile(r"^/tier\s+([123])\s*$")
+_IDENTITY_RE = re.compile(r"^/identity(?:\s+([A-Za-z0-9_]+))?\s*$")
 
 
 def parse(text: str) -> Override | None:
@@ -91,6 +93,16 @@ def parse(text: str) -> Override | None:
             )
         return Override(kind="tier", raw=raw, tier=int(m.group(1)))
 
+    if lower.startswith("/identity"):
+        m = _IDENTITY_RE.match(lower)
+        if not m:
+            return Override(
+                kind="unknown",
+                raw=raw,
+                error="usage: /identity [name]  (letters, digits, underscores; e.g. /identity coo)",
+            )
+        return Override(kind="identity", raw=raw, identity=m.group(1))
+
     if raw.startswith("/"):
         return Override(
             kind="unknown",

diff --git a/src/agent_os/orchestrator/adapters/vault_memory.py b/src/agent_os/orchestrator/adapters/vault_memory.py
@@ -27,3 +27,33 @@ def append_message(canonical_user_id: str, role: str, content: str) -> None:
 def load_history(canonical_user_id: str) -> str:
     p = conversation_path(canonical_user_id)
     return p.read_text() if p.exists() else ""
+
+
+def parse_history(canonical_user_id: str, limit: int = 10) -> list[dict]:
+    """Return recent conversation as OpenAI-style messages list.
+
+    Parses the markdown log written by append_message() back into
+    [{role, content}, ...] so runtimes can pass it directly to LLM APIs.
+    limit is the number of turns (each turn = one user + one assistant message).
+    """
+    raw = load_history(canonical_user_id)
+    if not raw:
+        return []
+    messages: list[dict] = []
+    current_role: str | None = None
+    current_lines: list[str] = []
+    for line in raw.split("\n"):
+        if line.startswith("## "):
+            if current_role and current_lines:
+                content = "\n".join(current_lines).strip()
+                if content:
+                    messages.append({"role": current_role, "content": content})
+            current_role = line[3:].strip()
+            current_lines = []
+        else:
+            current_lines.append(line)
+    if current_role and current_lines:
+        content = "\n".join(current_lines).strip()
+        if content:
+            messages.append({"role": current_role, "content": content})
+    return messages[-(limit * 2):] if limit > 0 else []
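
Review note: a self-contained sketch of the same parsing idea, operating on a string instead of the vault file so it can be run in isolation. It assumes the log written by `append_message()` uses a `## <role>` heading followed by the message body; that format is inferred from the parser above, not shown in this diff. `parse_markdown_log` and `SAMPLE` are hypothetical names.

```python
def parse_markdown_log(raw: str, limit: int = 10) -> list[dict[str, str]]:
    """Turn a '## role' delimited markdown log into a list of {role, content} messages."""
    messages: list[dict[str, str]] = []
    role: str | None = None
    lines: list[str] = []

    def flush() -> None:
        # Emit the message collected so far, skipping empty bodies.
        if role is not None:
            content = "\n".join(lines).strip()
            if content:
                messages.append({"role": role, "content": content})

    for line in raw.split("\n"):
        if line.startswith("## "):
            flush()
            role = line[3:].strip()
            lines = []
        else:
            lines.append(line)
    flush()
    # A slice of [-0:] would return everything, so treat limit <= 0 as "no history".
    return messages[-(limit * 2):] if limit > 0 else []


# Assumed log shape: one heading per message.
SAMPLE = (
    "## user\nhello\n\n"
    "## assistant\nhi there\n\n"
    "## user\nswitch to coo\n\n"
    "## assistant\ndone\n"
)

if __name__ == "__main__":
    print(parse_markdown_log(SAMPLE, limit=1))
```

With `limit=1` the sketch keeps only the final user and assistant pair; text that appears before the first heading is discarded.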