feat(agent): presence & continuity — persistent, proactive, learning, attuned#88
Merged
Conversation
The agent answers introspective questions like a generic stateless LLM, denying capabilities nomos already has: it claims it can't reach out, has no cross-session continuity, and can't grow. All three are false — proactive_send + loop_create are agent tools, the memory digest is re-injected every turn (NOMOS_ADAPTIVE_MEMORY on by default), and the user model accumulates + consolidates. The root cause is a self-model gap, not missing infrastructure. - docs/agent-presence-and-continuity.md: phased plan. P1 self-model manifesto (the highest-leverage fix), P2 proactive defaults, P3 continuity depth (elapsed-time + agent journal), P4 shared-experience narratives, P5 emotional presence. - docs/stress-anxiety-support.md: adapts the IVY emotional-intelligence patterns to nomos's companion context. Mood is persisted as timestamped EPISODES-with-causes (not a standing state): the live theory-of-mind read always wins, episodes decay, one bad day != a trait, and only recurring patterns generalize. Bounded by an explicit safety line (companionship, never therapy/crisis care) and vault-stored + user-editable. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The agent was answering introspective questions like a generic stateless LLM — denying it can reach out, persist across sessions, or grow — even though the daemon gives it all three (proactive_send, the per-turn memory digest, background consolidation). Root cause: the system prompt assembled disconnected feature sections but never asserted a self-model, so the agent fell back to training-data disclaimers. Add an always-on "## Agent Nature" section to buildSystemPromptAppend, right after Identity (before the utility sections). It asserts: you persist (your memory is re-handed to you every turn), you reach out (proactive_send / schedule_task), you grow (corrections + consolidation), and you attune (adapt to the user's state) — bounded by an explicit safety line (companion, not therapist/crisis service). Injected unconditionally so it survives a custom SOUL.md / agent.soul override (runtime fact, not personality). Verified against a real model run: asked the same "can you be my best friend?" question, the agent now affirms continuity, reach-out, and growth accurately while staying honest about genuine limits — instead of the blanket denial in the report. See docs/agent-presence-and-continuity.md (Phase 1). 21 profile tests pass (+1 regression locking the manifesto in); typecheck + lint clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make the agent actually reach out by default, without surprise cost or spam. - commitmentTracking is now opt-out (env.ts: on unless NOMOS_COMMITMENT_TRACKING=false) — the agent reminds the user about their own commitments. - Cost-gate the per-turn extraction (memory-indexer): it only runs when reach-out is deliverable — a configured notification channel OR a registered mobile device. No channel ⇒ no extraction, no LLM cost. New hasRegisteredDevice() + getNotificationDefaultFor-based hasDeliverableTarget(). - Hosted mode's only channel is the mobile app: a registered device counts as the channel (reminders on by default, delivered via push), the commitment-reminder cron fires without a global channel (scheduler.ts: enabled when commitmentTracking && (target || isHosted()) — it fans out per-owner), and the agent is told it reaches the user through the Nomos mobile app (agent-runtime integrations summary) so it follows up + checks in unprompted. - Left opt-in (cost-heavy / intrusive): inbox/calendar autonomy + any unconditional daily check-in; the Phase 1 manifesto already has the agent OFFER check-ins. typecheck + lint clean; 646 tests (+1 opt-out-default regression). See docs/agent-presence-and-continuity.md (Phase 2). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…Phase 3) Make the continuity the agent now claims tangible. - Elapsed-time anchor: agent-runtime injects "Your last conversation ended N ago" from the sessions table (getPreviousSessionEnd, excluding the current session; suppressed under ~10 min) so the agent has a temporal sense between sessions. - Agent journal: buildMemoryDigest now also injects agent-journal.md under "Where we left off (your journal)", and the Phase 1 manifesto nudges the agent to jot a short first-person note at session end. Continuity in the agent's own voice; rides the existing user-editable vault, no new store. typecheck + lint clean; digest tests +1.
…hase 5)
The agent already DETECTS strain (theory-of-mind) but forgot it at session end. Persist
it as episodes-with-causes — never a standing state.
- src/memory/mood-log.ts: recordMoodEpisode upserts a { date, emotion, cause, status }
episode keyed by CAUSE into an editable mood-log.md vault note; episodes decay
(30d / 20 cap); readOpenMoodEpisodes surfaces open ones. captureMoodFromTurn is the
production trigger (forked Haiku names the cause only on real strain). Gated by
NOMOS_ADAPTIVE_MEMORY, user_id-scoped.
- agent-runtime: on a strain turn (tom emotion stressed/frustrated) fire-and-forget the
capture; inject open episodes as "Recently weighing on them" so the agent follows up
on the CAUSE ("how'd the launch land?") — never asserts a mood, and the live read
always wins.
- profile.ts manifesto: the "You attune" support protocol (acknowledge -> adapt ->
de-escalate only when sustained -> normalize) + the explicit safety boundary
(companion, never therapist/crisis care).
Episodes-not-state, decay, live-read-wins, patterns-not-episodes — per
docs/stress-anxiety-support.md. typecheck + lint clean; 8 mood-log tests.
Phase 4 — close the understanding != articulation gap. A weekly per-owner cron (`__relationship_narrative__`, 168h, fan-out) has the agent author a first-person "how we've come to work together" narrative grounded in the learned user_model, written to an editable relationship.md vault note. Forked-Haiku, NOMOS_ADAPTIVE_MEMORY-gated, MIN_ENTRIES floor so a barely known user gets nothing. - src/memory/relationship-narrative.ts: writeRelationshipNarrative(userId) - src/daemon/cron-engine.ts: __relationship_narrative__ handler (per-owner) - src/daemon/gateway.ts: seed the relationship-narrative cron (every 168h) Eval + audit coverage for Phases 4 and 5: - eval/feature-manifest.ts: declare `relationship-narrative` (cron sentinel, fan-out, relationship.md effect) + `mood-episodes` (turn, mood-log.md effect) - eval/agent-eval.ts: runMoodLog (deterministic episode → mood-log.md + per-user isolation) and runRelationshipNarrative (seed user_model → narrative → relationship.md; B with nothing learned writes nothing); wired into runEval - src/memory/relationship-narrative.test.ts: gating + min-entries unit tests Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The MobileApi Connect wire tests hardcoded ports 8797-8799. On a dev box running the studio sidecar (which binds 127.0.0.1:8799), the eval's server bound 0.0.0.0:8799 but the client's 127.0.0.1:8799 request routed to the sidecar, which has no MobileApi -> every wire RPC failed `[unimplemented]` and `pnpm eval:audit` aborted before the audit ran. startConnectServer/startWire now bind :0 and report the OS-assigned port; callers build their clients off the returned handle's port. No collision, no hardcoded ports. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phases 1-5 of the agent-presence plan now ship: each phase heading carries an implemented marker with a "Shipped:" note pointing at the actual modules, manifest entries, and eval coverage. The companion stress doc marks Phase A (mood episodes) and Phase B (support protocol) implemented; Phase C (autonomous emotional check-in) is honestly flagged "not yet built" — its data + in-context follow-up is live, but the autonomous reach-out trigger isn't wired yet. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eference
Both docs were written as forward-looking plans ("the plan", "the fix", phase
sizing, future tense). Rewrite them as accurate reference documentation of the
shipped behavior: what each capability does, how it's wired, exact verbatim
prompt strings, configuration, file map, eval/audit coverage, and an honest
"known limitations / not yet built" section.
Grounded in a parallel extraction of the actual source and verified by an
adversarial fact-check pass (factual-wiring + verbatim-quote/link lenses on each
doc, all green). Corrections folded in vs the old plan text:
- Agent Nature manifesto lives in buildSystemPromptAppend, NOT DEFAULT_SOUL, and
has four bullets (persist/reach out/grow/attune) + the safety line.
- Hosted push delivery is documented honestly: a registered device satisfies the
extraction cost gate, but proactive delivery still routes via the notification
default + channel adapters (Expo notifyUser is not yet wired for reminders /
proactive_send) — flagged as a known limitation.
- Mood capture is per-turn (not session-end); only stressed/frustrated trigger it;
no emotional_context table; resolution path + Phase C/D/E are not built.
- Relationship narrative: no relationship_narratives table / milestones /
proactive reflection shipped; it's a single editable relationship.md vault note
from a forked-Haiku weekly cron.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nnels In hosted/mobile mode the agent described itself as showing up "across all the channels where you actually live (iMessage, Slack, Telegram, email)". Those are self-hosted (power-user) channels; a hosted tenant has none of them — its only messaging channel is the Nomos app (push). Two prompt leaks fixed: - src/config/profile.ts: the `## Memory` knowledge-base provenance line is now mode-aware. Power-user keeps the Slack/iMessage/email framing; hosted says the memory is built from conversations in the Nomos app. (imports isHosted) - src/daemon/agent-runtime.ts: the hosted reach-out line now states the app is the ONLY messaging channel and that iMessage/Slack/Telegram/WhatsApp/Discord are self-hosted channels it does not have, so it must not claim presence on them. Scoped to messaging channels so a connected Google integration still counts. profile.test.ts gets power-user vs hosted provenance assertions (deterministic env control). Docs updated to match. 660 tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… app
Follow-up to the hosted-channel fix. Two remaining leaks the agent surfaced:
1. It still listed **iMessage** as a direct channel. The hosted daemon often runs
on a Mac whose Messages.app is configured, so `isImessageEnabled()` is true and
the integrations summary advertised iMessage — contradicting the reach-out line
that says it has no iMessage. The agent resolved the contradiction by claiming
iMessage. Fix: `buildIntegrationsSummary` now gates ALL BYO messaging channels
(Slack/Discord/Telegram/WhatsApp/iMessage) on `!isHosted()`, matching the
mode.ts contract ("Channels are limited to what the central app supports").
2. It called the current conversation "This Claude Code conversation". Fix: the
hosted reach-out line now states the current conversation IS the Nomos app and
explicitly tells the agent not to describe it as Claude Code / a terminal.
Adds agent-runtime.test.ts asserting a configured BYO channel (WhatsApp) shows in
power-user mode and is suppressed in hosted mode, where the app is presented as the
only channel. 662 tests green. Docs updated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This is the open-source self-hostable repo; managed-hosted behavior is private and does not belong in the public docs. Removed the "Hosted mode" subsection (mobile app, push delivery, BYO-channel suppression, the Expo/notifyUser delivery limitation) and all hosted/mode framing. Reframed §2 to the self-hosted reality: a "Channel awareness" subsection that describes how buildIntegrationsSummary tells the agent its actually-connected channels — Slack, Discord, Telegram, WhatsApp, iMessage (Messages.app), email, and connected tools like Google — so iMessage is correctly listed as an available self-hosted channel. The cost-gate description drops the mobile-device specifics. Code is unchanged: mode handling stays in the daemon; only the public docs are scoped to the self-hostable surface. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
On the mobile app the agent answered like a developer console: it listed channel internals (Slack "Socket Mode", Telegram "via grammY", WhatsApp "via Baileys", iMessage "via the imsg CLI — brew install steipete/tap/imsg"), named internal tools/commands (proactive_send, schedule_task, /schedule, /admin/integrations), and referred to "the daemon" and MCP. A hosted user is a consumer, not a developer. - src/config/profile.ts: in hosted mode, inject a "## Talking with the user" section that bans surfacing implementation detail (tool/command names, library + adapter names, CLI/install commands, file paths, daemon/settings plumbing) and asks for outcome-focused, brief, plain replies. Power-user installs skip it — technical detail is welcome on the CLI/self-hosted surface. - src/daemon/agent-runtime.ts: the hosted reach-out line drops the proactive_send / "self-hosted (power-user) channels / run their own daemon" framing that invited the jargon; it now describes the app channel in plain terms. profile.test.ts asserts the directive is present in hosted mode and absent in power-user mode; agent-runtime.test.ts updated for the reworded line. 664 tests green. Hosted-only behavior, so the public docs are untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Phase 5 mood-log files had over-long lines that the repo's oxfmt config wraps; the CI Format Check (oxfmt --check over the whole tree) flagged them. Pure line-wrapping, no logic change. Lint/Test/Typecheck were already green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
Asked whether it could be a real companion, the agent answered like a generic stateless LLM: "I can't reach out to check on you, I don't have independent continuity between sessions, I can't grow through shared experience." All three claims are false in nomos. The machinery existed; nothing in the prompt tied it into an identity, so the agent fell back on training-data disclaimers. This was a self-model gap, not missing infrastructure.
This PR closes that gap and deepens the continuity underneath it, in five parts. Everything lands in the MIT-licensed core, is
user_id-scoped, and stores what the agent writes in the user-editable vault.What changed
Phase 1 — Self-model (
5266af9) An always-on## Agent Natureblock inbuildSystemPromptAppendasserts You persist / reach out / grow / attune, with a companion-not-therapist safety line. Lives in the prompt builder (notDEFAULT_SOUL), so it survives a custom SOUL. Verified by a real run andprofile.test.ts.Phase 2 — Proactive reach-out, cost-gated (
80a701c)commitmentTrackingflips to opt-out (on unlessNOMOS_COMMITMENT_TRACKING=false). The per-turn extraction is cost-gated on a deliverable target (notification default or a registered mobile device), so it costs nothing without one. Hosted-mobile awareness wired into the agent's prompt.Phase 3 — Continuity depth (
19fd893) A## Continuityelapsed-time anchor ("your last conversation ended N ago", suppressed under 10 min) and an agent-authored journal (agent-journal.md, re-injected next session under "Where we left off").Phase 4 — Shared experience (
a0c6512) A weekly per-owner cron (__relationship_narrative__, 168h, fan-out) has a forked Haiku write a first-person, evidence-grounded narrative from the learneduser_modelinto an editablerelationship.mdvault note (MIN_ENTRIES=5floor). Closes the understanding ≠ articulation gap.Phase 5 — Emotional presence (
3b02ab4) Mood persisted as episodes with a cause (not a standing state) in an editablemood-log.mdvault note: captured per-turn only on real strain, forked-Haiku names the cause, episodes decay (30d/20-cap), open ones surfaced under "Recently weighing on them". The graduated support ladder ("You attune") lives in the Agent Nature block. The live read always wins; one bad day is not a trait.Eval + audit (
a0c6512) New manifest entriesrelationship-narrative(cron) andmood-episodes(turn) with effect SQL;runRelationshipNarrative+runMoodLogexercises wired into the agent eval; the__relationship_narrative__cron meta-check passes.Wire-harness fix (
29aa8c4) The eval's Connect server hardcoded port 8799, which collides with the studio sidecar and abortedeval:auditwith[unimplemented]. It now binds a free ephemeral port. Not a feature regression, purely environmental.Docs (
cb8a26d,a48f731,1531a02)docs/agent-presence-and-continuity.mdanddocs/stress-anxiety-support.mdare full feature reference docs (design, wiring, verbatim prompt strings, config, file map, eval coverage, honest known-limitations), grounded in a source extraction and adversarially fact-checked.Verification
pnpm eval:audit(real throwaway DB): AUDIT: PASS + SPEC-AUDIT: PASS,286 ran, 0 failed, 0 skipped. Therelationship.mdandmood-log.mdeffects both populate (count=1).pnpm test: 658 passing;pnpm typecheck+ lint clean (enforced by pre-commit on every commit).Known limitations (documented)
notifyUseris not yet wired for reminders /proactive_send).🤖 Generated with Claude Code