pi-mcp-deferred

A deferred context engine for Pi MCP integration. Fork of @spences10/pi-mcp v0.0.37.

Problem

Every MCP server you connect loads ALL tool schemas into the LLM context upfront. With 9 MCP servers and ~140 tools, that's 50K+ tokens of unused schemas every single turn. Factory's telemetry shows 94.6% of sessions never use MCP tools at all.

This causes:

Attention dilution — relevant content competes with unused schemas
Tool-selection noise — similar tool names increase wrong selection rate
Earlier compression — static context fills the window faster

Solution

Three-phase deferred context engine (inspired by Factory's Deferred Context Engine):

Phase 1: Discover (startup)

Register compact stubs instead of full schemas. Each stub contains:

Tool name and first-sentence description
Property names, types, and required arrays (no descriptions/defaults/examples)
Nested object properties one level deep
Enum values (if <= 20 items)

This saves ~60-80% tokens vs full schemas while keeping enough structure for the LLM to generate valid function calls.
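The stub rules above can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the `JsonSchema` and `ToolDefinition` shapes, the `toStub` and `compactSchema` helpers, and the naive first-sentence regex are all assumptions made for the example.

```typescript
// Hypothetical types: a loose subset of JSON Schema, enough for this sketch.
interface JsonSchema {
  type?: string;
  description?: string;
  default?: unknown;
  examples?: unknown[];
  enum?: unknown[];
  required?: string[];
  properties?: Record<string, JsonSchema>;
}

interface ToolDefinition {
  name: string;
  description?: string;
  inputSchema: JsonSchema;
}

const MAX_ENUM_ITEMS = 20; // enums larger than this are dropped from the stub

// Keep only the first sentence (naive: cuts at the first ".", "!" or "?").
function firstSentence(text: string): string {
  const match = text.match(/^[^.!?]*[.!?]/);
  return (match ? match[0] : text).trim();
}

// Keep types, required arrays, and small enums; drop descriptions, defaults,
// and examples. `depth` is how many more levels of `properties` to keep.
function compactSchema(schema: JsonSchema, depth: number): JsonSchema {
  const out: JsonSchema = {};
  if (schema.type !== undefined) out.type = schema.type;
  if (schema.required !== undefined) out.required = schema.required.slice();
  if (schema.enum !== undefined && schema.enum.length <= MAX_ENUM_ITEMS) {
    out.enum = schema.enum.slice();
  }
  const source = schema.properties;
  if (source !== undefined && depth > 0) {
    const props: Record<string, JsonSchema> = {};
    for (const key of Object.keys(source)) {
      props[key] = compactSchema(source[key]!, depth - 1);
    }
    out.properties = props;
  }
  return out;
}

function toStub(tool: ToolDefinition): ToolDefinition {
  return {
    name: tool.name,
    description: firstSentence(tool.description ?? ""),
    // Depth 2: top-level properties plus nested object properties one level deep.
    inputSchema: compactSchema(tool.inputSchema, 2),
  };
}
```

Passing a full tool definition to `toStub` returns a definition with the same name and property structure but without descriptions, defaults, or examples on the parameters.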

Phase 2: Promote (on first use)

When any tool from a server is called for the first time:

Auto-promote ALL tools from that server (fetch full schemas from MCP server)
Re-register with full descriptions, parameter schemas, defaults, examples
Execute the original call immediately — no retry needed

The model sees the full schemas on the next turn.
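A rough sketch of the promote-on-first-use flow, under stated assumptions: `ServerClient`, `DeferredServer`, and the in-memory `registry` are hypothetical stand-ins for the MCP client and Pi's tool registration, which this README does not show.

```typescript
// Hypothetical shapes; the extension's real internals are not documented here.
interface Tool {
  name: string;
  description: string;
}

interface ServerClient {
  listTools(): Promise<Tool[]>;
  callTool(name: string, args: Record<string, unknown>): Promise<unknown>;
}

class DeferredServer {
  readonly registry = new Map<string, Tool>();
  private readonly client: ServerClient;
  private promoted = false;

  constructor(client: ServerClient, stubs: Tool[]) {
    this.client = client;
    // Phase 1: only compact stubs are registered at startup.
    for (const stub of stubs) this.registry.set(stub.name, stub);
  }

  // Phase 2: fetch full definitions once and replace every stub for this server.
  async promote(): Promise<void> {
    if (this.promoted) return;
    for (const tool of await this.client.listTools()) {
      this.registry.set(tool.name, tool);
    }
    this.promoted = true;
  }

  // Promote first, then run the original call with the arguments as passed.
  async call(name: string, args: Record<string, unknown>): Promise<unknown> {
    await this.promote();
    return this.client.callTool(name, args);
  }
}
```

Because `promoted` is never reset, later calls skip the `listTools()` round trip, which corresponds to the reuse behavior of Phase 3.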

Phase 3: Reuse (session lifetime)

Promoted tools keep their full schemas for the rest of the session. They are never re-deferred.

Installation

Add to your Pi settings (~/.pi/agent/settings.json):

{
  "packages": [
    "git:github.com/drvova/pi-mcp-deferred"
  ]
}

Remove any existing npm:@spences10/pi-mcp entry from the packages list. This package is a drop-in replacement for it.

Configuration

Deferred mode (ON by default)

All MCP servers use deferred mode by default. No configuration needed.

To disable deferred mode globally, set this environment variable:

export MY_PI_MCP_DEFERRED=0

To disable deferred mode for a single server, add "deferred": false to that server's entry in your MCP config (~/.pi/agent/mcp.json):

{
  "my-important-server": {
    "command": "npx",
    "args": ["-y", "some-mcp-server"],
    "deferred": false
  }
}
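How the two switches might combine can be sketched as below. The `isDeferred` helper and `ServerConfig` type are hypothetical, and two behaviors are assumptions rather than documented facts: that only the literal value "0" disables deferral, and that the global variable takes precedence over the per-server flag.

```typescript
// Hypothetical config shape, mirroring the mcp.json entry shown above.
interface ServerConfig {
  command: string;
  args?: string[];
  deferred?: boolean;
}

// Decide whether a server should register compact stubs (deferred mode).
function isDeferred(
  env: Record<string, string | undefined>,
  config: ServerConfig
): boolean {
  // Assumed: MY_PI_MCP_DEFERRED=0 turns deferral off for every server.
  if (env["MY_PI_MCP_DEFERRED"] === "0") return false;
  // Per-server opt-out; a missing flag falls back to the default (deferred).
  return config.deferred !== false;
}
```

With an empty environment and no `deferred` key, the function returns `true`, matching the documented default.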

Eager connect (original behavior)

To connect all MCP servers immediately at startup (original pi-mcp behavior):

export MY_PI_MCP_EAGER_CONNECT=1

Idle timeout

Set idle_timeout_ms on a server to disconnect it after that many milliseconds of inactivity. The default is 1800000 (30 minutes); the example below uses 600000 (10 minutes):

{
  "my-server": {
    "command": "npx",
    "args": ["-y", "some-mcp-server"],
    "idle_timeout_ms": 600000
  }
}
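The timeout check itself reduces to a comparison of timestamps. The sketch below is illustrative only: `isIdle` and `IdleConfig` are hypothetical names, and treating a server as idle at exactly the timeout (>=) is an assumption.

```typescript
// 30 minutes, the default stated in this README.
const DEFAULT_IDLE_TIMEOUT_MS = 30 * 60 * 1000;

// Hypothetical slice of a server entry in mcp.json.
interface IdleConfig {
  idle_timeout_ms?: number;
}

// True when the server has gone unused for at least its configured timeout.
function isIdle(lastUsedAtMs: number, nowMs: number, config: IdleConfig): boolean {
  const timeoutMs = config.idle_timeout_ms ?? DEFAULT_IDLE_TIMEOUT_MS;
  return nowMs - lastUsedAtMs >= timeoutMs;
}
```

Passing timestamps in explicitly, instead of reading the clock inside the function, keeps the check easy to test.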

Tools

`mcp__expand`

Explicitly load full schemas from an MCP server without calling a tool first.

mcp__expand({ server: "exa" })        // promote one server
mcp__expand({ server: "all" })        // promote all connected servers
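One way the `server` argument could be resolved is sketched here. `expandTargets` is a hypothetical helper, and throwing on an unknown or disconnected server name is an assumption; the README does not say how that case is handled.

```typescript
// Hypothetical: work out which servers an mcp__expand call should promote.
function expandTargets(server: string, connected: string[]): string[] {
  // "all" promotes every currently connected server.
  if (server === "all") return connected.slice();
  // Assumed behavior: reject names that are not currently connected.
  if (connected.indexOf(server) === -1) {
    throw new Error("Unknown or disconnected server: " + server);
  }
  return [server];
}
```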

`/mcp` command

Manage MCP servers interactively:

/mcp                  // open management modal
/mcp list             // show all servers and status
/mcp connect <name>   // connect a specific server
/mcp enable <name>    // enable a disabled server
/mcp disable <name>   // disable a connected server
/mcp backup           // backup MCP config
/mcp restore          // restore MCP config
/mcp profile list     // list MCP profiles

How it works

Startup:
  mcp.json → connect servers → listTools() → register COMPACT STUBS
  (~60-80% smaller than full schemas)

First tool call from server "exa":
  mcp__exa__web_search_exa({query: "..."})
    → promote_server_tools("exa")
      → listTools() → re-register ALL exa tools with FULL schemas
    → execute tool with passed params (no retry)

Next turn:
  Model sees full schemas for all exa tools

Explicit promotion:
  mcp__expand({server: "exa"}) → full schemas loaded
  mcp__expand({server: "all"}) → all servers promoted

Token savings

| Scenario | Tools | Stub tokens | Full tokens | Savings |
| --- | --- | --- | --- | --- |
| 5 MCP servers, 30 tools | 30 | ~3K | ~15K | ~80% |
| 9 MCP servers, 140 tools | 140 | ~15K | ~50K | ~70% |
| 20 MCP servers, 300+ tools | 300+ | ~30K | ~120K | ~75% |

Actual savings depend on tool schema complexity. Tools with many nested properties, long descriptions, and large enums save the most.
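The savings figures follow from one minus the ratio of stub tokens to full tokens. A small worked check, using a hypothetical `savingsPercent` helper and the approximate token counts from the scenarios above:

```typescript
// Percentage of tokens saved by registering stubs instead of full schemas.
function savingsPercent(stubTokens: number, fullTokens: number): number {
  return Math.round((1 - stubTokens / fullTokens) * 100);
}

savingsPercent(3_000, 15_000);   // 80
savingsPercent(15_000, 50_000);  // 70
savingsPercent(30_000, 120_000); // 75
```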

Compatibility

Drop-in replacement for @spences10/pi-mcp v0.0.37
Same /mcp command, same MCP config format, same trust model
All original features preserved (stdio/HTTP transports, profiles, backup/restore, idle timeout, project configs)
Requires Pi CLI >= 0.76.0

Credits

Original extension: @spences10/pi-mcp by Scott Spence
Deferred context engine concept: Factory
Compact schema compression: mcp-context-proxy

License

MIT
