[feature] multi-source federation (recall across remote synapto, markdown, obsidian) #41

@ramonlimaramos

Problem

today synapto has tenant isolation but no way to share core/feedback across tenants while keeping working/project separate. real case: i have personal projects in ~/Developer/personal/ and work projects in ~/Developer/podium*/. they should share my preferences and identity memories but not project context. the only escape valves today are one giant default tenant (no separation at all) or fully separate DBs (loses cross-recall entirely).

competitor byterover-cli has a "swarm" subsystem that federates across pluggable adapters (byterover, gbrain, local-markdown, memory-wiki, obsidian) and fuses results via RRF.

Proposed Solution

introduce a MemorySource interface; recall queries multiple sources in parallel and fuses the results via RRF (reciprocal rank fusion).

interface

from typing import Protocol

class MemorySource(Protocol):
    name: str
    weight: float  # source-level RRF weight

    async def search(self, query: str, k: int) -> list[Memory]: ...  # top-k, ranked locally

    @property
    def supports_writes(self) -> bool: ...  # False for read-only sources
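
to make the interface concrete, here is a rough sketch of what a read-only markdown adapter could look like. everything here is illustrative: the `Memory` dataclass is a hypothetical stand-in for synapto's real type, and the term-count scoring is a placeholder assumption, not a proposal for the actual ranking.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Memory:
    # Hypothetical stand-in for synapto's Memory type.
    content: str
    source: str


class MarkdownSource:
    """Read-only adapter over flat markdown files (illustrative sketch)."""

    def __init__(self, name: str, paths: list[Path], weight: float = 0.5) -> None:
        self.name = name
        self.weight = weight
        self._paths = paths

    @property
    def supports_writes(self) -> bool:
        return False  # markdown files are read-only context

    async def search(self, query: str, k: int) -> list[Memory]:
        terms = query.lower().split()
        scored: list[tuple[int, str]] = []
        for path in self._paths:
            if not path.is_file():
                continue  # a missing file should not break recall
            text = path.read_text(encoding="utf-8")
            # Treat blank-line-separated paragraphs as candidate memories.
            for para in text.split("\n\n"):
                para = para.strip()
                low = para.lower()
                # Placeholder scoring: total occurrences of the query terms.
                score = sum(low.count(t) for t in terms)
                if score > 0:
                    scored.append((score, para))
        # Stable sort, highest score first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [Memory(content=p, source=self.name) for _, p in scored[:k]]
```

because `MemorySource` is a `Protocol`, the adapter does not need to inherit from it; matching the attribute and method shapes is enough for a type checker.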

built-in adapters

| adapter | reads | writes | use case |
|---|---|---|---|
| PostgresSynaptoSource (local, default) | yes | yes | self |
| RemoteSynaptoSource (HTTP MCP) | yes | yes | shared team synapto |
| MarkdownSource | yes | no | flat MEMORY.md, CLAUDE.md, AGENTS.md as read-only context |
| ObsidianSource | yes | no | personal vault as background knowledge |
| BrvSource | yes | no | bridge to byterover-cli .brv/ for users migrating |

config

[sources.local]
type = "postgres"
weight = 1.0

[sources.team]
type = "remote_synapto"
url = "https://synapto.team.internal"
weight = 0.7

[sources.obsidian_vault]
type = "obsidian"
path = "~/Documents/Obsidian/Knowledge"
weight = 0.4
read_only = true

recall semantics

  • query all sources in parallel (asyncio.gather)
  • per-source results ranked locally
  • cross-source fusion via RRF, weighted by source weight × source_trust
  • result includes source field so caller knows provenance
  • writes go to first writable source in priority order (or explicit source= arg)
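
the parallel query and weighted fusion steps above could be sketched roughly like this. assumptions, not settled design: `RRF_K = 60` is just the commonly used RRF constant, `Hit`, `StaticSource`, and the `trust` attribute are hypothetical, and results are deduplicated on a hypothetical `memory_id` with provenance taken from the first source that returned the hit.

```python
import asyncio
from dataclasses import dataclass

RRF_K = 60  # assumed smoothing constant; the issue does not specify one


@dataclass(frozen=True)
class Hit:
    memory_id: str  # hypothetical key used to merge duplicates across sources
    content: str
    source: str     # provenance, surfaced to the caller


class StaticSource:
    """Toy source that returns a fixed, already-ranked list."""

    def __init__(self, name: str, weight: float, trust: float, hits: list[Hit]) -> None:
        self.name, self.weight, self.trust = name, weight, trust
        self._hits = hits

    async def search(self, query: str, k: int) -> list[Hit]:
        return self._hits[:k]


def fuse(ranked: dict[str, list[Hit]], weights: dict[str, float], k: int) -> list[tuple[Hit, float]]:
    """Weighted RRF: score(d) = sum over sources of w_s / (RRF_K + rank_s(d))."""
    scores: dict[str, float] = {}
    first_seen: dict[str, Hit] = {}
    for source, hits in ranked.items():
        for rank, hit in enumerate(hits, start=1):
            scores[hit.memory_id] = scores.get(hit.memory_id, 0.0) + weights[source] / (RRF_K + rank)
            first_seen.setdefault(hit.memory_id, hit)
    order = sorted(scores, key=lambda mid: scores[mid], reverse=True)
    return [(first_seen[mid], scores[mid]) for mid in order[:k]]


async def recall(sources: list[StaticSource], query: str, k: int) -> list[tuple[Hit, float]]:
    # Query every source concurrently; each returns its own local ranking.
    results = await asyncio.gather(*(s.search(query, k) for s in sources))
    ranked = {s.name: hits for s, hits in zip(sources, results)}
    # Effective weight is source weight times source trust.
    weights = {s.name: s.weight * s.trust for s in sources}
    return fuse(ranked, weights, k)
```

a memory that shows up in several sources accumulates score from each of them, so agreement between sources pushes it up the fused ranking even when the extra source has a low weight.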

Trade-offs

  • latency: dominated by slowest source. mitigate via per-source timeout (default 500ms) + best-effort merge.
  • trust collisions: source A says X, source B says ¬X. solution: extend find_contradictions to be cross-source aware.
  • schema drift: external sources don't have HRR vectors. graceful fallback to 2-way RRF on those results.
  • complexity: 5 adapters means maintenance load. mitigation: only ship 2 in v1 (postgres + markdown), others as plugins.
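
the latency mitigation could look something like the sketch below: each source gets its own deadline, and a slow or failing source contributes an empty result instead of failing the whole recall. `search_with_timeout` and `gather_best_effort` are hypothetical helper names, and treating every source error as "no results" is an assumption about the desired behavior.

```python
import asyncio

DEFAULT_TIMEOUT_S = 0.5  # the 500ms default proposed above


async def search_with_timeout(source, query: str, k: int, timeout: float = DEFAULT_TIMEOUT_S) -> list:
    """Run one source's search, returning [] on timeout or error."""
    try:
        return await asyncio.wait_for(source.search(query, k), timeout)
    except asyncio.TimeoutError:
        return []  # too slow: drop this source for this query
    except Exception:
        return []  # broken source: assumed to degrade to no results


async def gather_best_effort(sources, query: str, k: int, timeout: float = DEFAULT_TIMEOUT_S) -> dict[str, list]:
    """Query all sources concurrently and keep whatever arrived in time."""
    results = await asyncio.gather(
        *(search_with_timeout(s, query, k, timeout) for s in sources)
    )
    return {s.name: hits for s, hits in zip(sources, results)}
```

with this shape the total recall latency is bounded by the timeout rather than by the slowest source, at the cost of occasionally missing results from a slow remote.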

Out of scope

  • bidirectional sync (write-back to obsidian/markdown)
  • conflict resolution between writable sources (single primary in v1)

Success criteria

  • ramon can configure local + obsidian + remote and recall returns fused results
  • removing a source doesn't break others
  • read-only sources never get write attempts (validated at config load)
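
one way the write-routing rule (first writable source in priority order, or an explicit `source=` arg) and the read-only guarantee could fit together, as a sketch. `SourceInfo` and `pick_write_target` are hypothetical names, and list order standing in for priority order is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceInfo:
    name: str
    supports_writes: bool


def pick_write_target(sources: list[SourceInfo], source: Optional[str] = None) -> SourceInfo:
    """Choose where a write goes; never returns a read-only source."""
    if source is not None:
        by_name = {s.name: s for s in sources}
        if source not in by_name:
            raise KeyError(f"unknown source {source!r}")
        target = by_name[source]
        if not target.supports_writes:
            # Explicitly targeting a read-only source is an error, not a fallback.
            raise PermissionError(f"source {source!r} is read-only")
        return target
    # Default: first writable source, assuming list order is priority order.
    for candidate in sources:
        if candidate.supports_writes:
            return candidate
    raise RuntimeError("no writable source configured")
```

calling `pick_write_target(sources)` once at config load would surface a setup with no writable source before any write is attempted.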

References

  • byterover swarm: src/agent/infra/swarm/ — RRF cross-provider design

Labels: enhancement