25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

### Changed

- **The Jupyter MCP integration is now a single MCP server.** `aexp
install --with-jupyter` writes only the `jupyter` server entry to
`.mcp.json` (laptop-side `uvx jupyter-mcp-server` in MCP_SERVER mode,
runtime-retargetable to any node via `connect_to_jupyter`). The second
`jupyter-compute` server — an `npx mcp-remote` proxy to a cluster
`/mcp` endpoint (JUPYTER_SERVER mode) — is no longer emitted. It was a
near-duplicate of `jupyter`, could not retarget to a different node
without a `.mcp.json` edit + MCP restart, and was a standing
"which server do I use?" confusion surface.
- The `/aexp-jupyter-iterate` and `/aexp-promote-nb` slash commands,
`AGENTS.md`, and `docs/setup/jupyter-mcp.md` are updated to the
single-server tool family (`mcp__jupyter__*`).
- **Lost capability:** the two `jupyter-mcp-tools` UI-delegated tools.
`notebook_run-all-cells` was already 404-broken upstream;
`notebook_get-selected-cell` ("which cell is the user looking at")
is genuinely gone — the affected slash commands now ask the user for
the notebook/cell or use `aexp.jupyter.init().attached_notebooks`.
- A consumer `.mcp.json` written by an earlier `--with-jupyter`
install keeps its `jupyter-compute` entry (the merge is
additive-only and never deletes servers); remove the entry by hand
to finish the cleanup. The cluster-side `[jupyter]` extra and the
`aexp jupyter-setup` extension recipe are unchanged.
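
The manual cleanup described in the last bullet can be scripted. The following is a minimal sketch, not part of `aexp`: the helper name `drop_stale_server` is hypothetical, and it assumes the consumer `.mcp.json` keeps its servers under the top-level `mcpServers` key, as the merge code in this PR does.

```python
import json
from pathlib import Path


def drop_stale_server(mcp_json: Path, name: str = "jupyter-compute") -> bool:
    """Remove one server entry from .mcp.json; return True if the file changed.

    Hypothetical helper for illustration only. All other servers are left
    exactly as they were.
    """
    config = json.loads(mcp_json.read_text(encoding="utf-8"))
    servers = config.get("mcpServers", {})
    if name not in servers:
        return False  # nothing to clean up
    del servers[name]
    mcp_json.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Calling `drop_stale_server(Path(".mcp.json"))` a second time is a no-op, so it should be safe to run on repos that were never installed with the older two-server layout. Restart the MCP client afterwards so it stops listing the removed server.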

## [0.4.0] - 2026-05-20

### Added
2 changes: 1 addition & 1 deletion README.md
@@ -124,7 +124,7 @@ The design bet: agents already know how to run experiments. What they need is a
| **Slash commands** | Artifact creation: `/aexp-new-hypothesis`, `/aexp-new-experiment`, `/aexp-new-run`. Threads (forward-looking research concerns broader than a hypothesis): `/aexp-new-thread`, `/aexp-list-threads`, `/aexp-show-thread`, `/aexp-close-thread`. Finding creation (pick by what the finding cites): `/aexp-finding-from-run`, `/aexp-finding-from-batch`, `/aexp-finding-placeholder`. Read / inspect: `/aexp-show-run`, `/aexp-show-batch`, `/aexp-list-runs`, `/aexp-status`, `/aexp-validate`. Queue: `/aexp-queue-add`, `/aexp-queue-list`, `/aexp-queue-materialize`, `/aexp-queue-stop`. Notebook lifecycle (when `--with-jupyter` is configured): `/aexp-jupyter-iterate` (test loop), `/aexp-promote-nb` (promote working cells into a tracked-run script). Sandbox scaffolding: `/aexp-new-sandbox` (create an exploratory notebook subdir under `notebooks/_sandbox/`). 22 total. |
| **CLI** | 22 verbs covering install, artifact creation (H/E/F/T + thread lifecycle), run lifecycle, batch queries, tracker binding, validation, offline sync, optional `jupyter-setup`, the `queue` subcommand group (add/list/remove/stop/clear/materialize/run) + `run-queued`, and sandbox scaffolding (`new-sandbox`). See `aexp --help` for the full list. Python API is a one-line `from aexp import ...`. |
| **Typed JSON contracts** | Pydantic models (`RunLink`, `BatchSelector`, `Issue`, …) back the schema; MCP tools and CLI return the same shapes. |
| **Jupyter MCP integration** (optional, `[jupyter]` extra) | `aexp install --with-jupyter` adds `jupyter` and `jupyter-compute` MCP servers to `.mcp.json` so Claude can read/edit/execute cells in a remote JupyterLab through an existing SSH tunnel — no agent SSH required. `aexp jupyter-setup` applies the verified Jupyter Server extension state on the cluster (disable Datalayer experiments that conflict with the mainstream stack). After install, see `docs/setup/jupyter-mcp.md` for cluster-side recipe + investigation log. The `/aexp-jupyter-iterate` slash command guides the read → propose → execute loop. |
| **Jupyter MCP integration** (optional, `[jupyter]` extra) | `aexp install --with-jupyter` adds the `jupyter` MCP server to `.mcp.json` so Claude can read/edit/execute cells in a remote JupyterLab through an existing SSH tunnel — no agent SSH required. The target Jupyter is set per-session at runtime via `connect_to_jupyter`, so one entry retargets to any node. `aexp jupyter-setup` applies the verified Jupyter Server extension state on the cluster (disable Datalayer experiments that conflict with the mainstream stack). After install, see `docs/setup/jupyter-mcp.md` for cluster-side recipe + investigation log. The `/aexp-jupyter-iterate` slash command guides the read → propose → execute loop. |

### Exploratory surfaces

4 changes: 2 additions & 2 deletions src/aexp/cli.py
@@ -249,8 +249,8 @@ def install(
False,
"--with-jupyter",
help=(
"Opt into the Jupyter MCP integration: writes `jupyter` and "
"`jupyter-compute` server entries to `.mcp.json`, sets "
"Opt into the Jupyter MCP integration: writes the `jupyter` "
"server entry to `.mcp.json`, sets "
"`jupyter_enabled: true` (sticky) in the install marker, and "
"ensures `docs/setup/jupyter-mcp.md` is vendored. Requires "
"`pip install agentic-experiments[jupyter]` for the Python "
83 changes: 28 additions & 55 deletions src/aexp/install.py
@@ -481,11 +481,11 @@ def _merge_or_write_mcp_json(
current interpreter instead — lets editable installs take effect on
the MCP side (at the cost of a machine-specific ``.mcp.json``).

When ``with_jupyter=True``, also writes the ``jupyter`` and
``jupyter-compute`` entries used by the Jupyter MCP integration. The
entries are *additive*: once written, subsequent installs without the
flag leave them in place (matching the "never delete user-defined
servers" pattern). To back out, the user edits ``.mcp.json`` by hand.
When ``with_jupyter=True``, also writes the ``jupyter`` entry used by
the Jupyter MCP integration. The entry is *additive*: once written,
subsequent installs without the flag leave it in place (matching the
"never delete user-defined servers" pattern). To back out, the user
edits ``.mcp.json`` by hand.
"""
rel = _display_relpath(dst)
our_entries: dict[str, Any] = {"aexp": _build_mcp_server_entry(repo_root, dev=dev)}
@@ -517,14 +517,12 @@ def _merge_or_write_mcp_json(
merged.setdefault("mcpServers", {})
# Always refresh our own ``aexp`` entry; preserve any user-defined servers.
merged["mcpServers"]["aexp"] = our_entries["aexp"]
# Jupyter entries: only ever ADD. If the user already has a `jupyter` /
# `jupyter-compute` block (either from a prior --with-jupyter install or
# from a manual setup) leave it alone — they may have hardcoded the
# Windows-stable token there, which we must not clobber.
if with_jupyter:
for key in ("jupyter", "jupyter-compute"):
if key not in merged["mcpServers"]:
merged["mcpServers"][key] = our_entries[key]
# Jupyter entry: only ever ADD. If the user already has a `jupyter`
# block (from a prior --with-jupyter install or a manual setup) leave
# it alone — they may have customized the URL/port or pinned a
# version, which we must not clobber.
if with_jupyter and "jupyter" not in merged["mcpServers"]:
merged["mcpServers"]["jupyter"] = our_entries["jupyter"]

if merged == existing:
return InstallAction("skipped_identical", rel)
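
The add-only rule in this hunk can be exercised in isolation. The sketch below is a standalone illustration, not the project's `_merge_or_write_mcp_json`: the function name `merge_servers` and its signature are assumptions, and file I/O and `InstallAction` reporting are left out.

```python
from typing import Any


def merge_servers(
    existing: dict[str, Any], ours: dict[str, Any], *, with_jupyter: bool
) -> dict[str, Any]:
    """Return a merged config without mutating ``existing`` (illustrative only)."""
    # Copy the server table so the caller's dict is left untouched.
    merged = {**existing, "mcpServers": dict(existing.get("mcpServers", {}))}
    # The `aexp` entry is always refreshed.
    merged["mcpServers"]["aexp"] = ours["aexp"]
    # The `jupyter` entry is only ever added, never overwritten or deleted.
    if with_jupyter and "jupyter" not in merged["mcpServers"]:
        merged["mcpServers"]["jupyter"] = ours["jupyter"]
    return merged
```

Under these assumptions, a pre-existing customized `jupyter` block and a leftover `jupyter-compute` block both survive the merge unchanged, which matches the CHANGELOG note that stale entries must be removed by hand.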
@@ -588,49 +586,25 @@ def _build_mcp_server_entry(repo_root: Path, *, dev: bool = False) -> dict[str,


def _jupyter_mcp_entries() -> dict[str, Any]:
"""MCP server entries for the Jupyter MCP integration.
"""MCP server entry for the Jupyter MCP integration.

Two side-by-side servers, both reaching the same JupyterLab through the
user's existing SSH tunnel:
A single laptop-side server:

- ``jupyter`` — laptop-side ``uvx jupyter-mcp-server`` running in
MCP_SERVER mode (stdio to Claude, HTTP+WS to remote Jupyter). The
token is provided per-session at runtime via the ``connect_to_jupyter``
tool, so no token lives in this entry.
- ``jupyter-compute`` — laptop-side ``npx mcp-remote`` proxy bridging
Claude's stdio to the cluster's ``/mcp`` SSE endpoint, where
``jupyter-mcp-server`` runs as a Jupyter Server extension
(JUPYTER_SERVER mode). Token is interpolated from the
``JUPYTER_TOKEN`` env var by default.

Default port is ``3618`` (matches the verified electricrag deployment).
Consumers using a different port edit ``.mcp.json`` post-install.

On Windows, ``${JUPYTER_TOKEN}`` interpolation is fragile because
``setx`` does not propagate to already-running processes (notably
Explorer, which spawns Start-Menu apps including Claude Desktop). The
documented fix is to hardcode the literal token in ``.mcp.json`` and
set the matching value in ``~/.jupyter/jupyter_server_config.py`` on
the cluster — see ``docs/setup/jupyter-mcp.md`` "Investigation log §4".
The install never auto-rewrites the literal token: token management
stays the consumer's responsibility.
MCP_SERVER mode (stdio to Claude, HTTP+WS to the remote Jupyter).
The target Jupyter URL + token are supplied per-session at runtime
via the ``connect_to_jupyter`` tool, so no token lives in this
entry and the *same* entry retargets to any node — open a tunnel on
a new local port, call ``connect_to_jupyter`` at the new URL, done.
No ``.mcp.json`` edit, no MCP restart. That runtime retargeting is
what makes the multi-node workflow (``/aexp-jupyter-connect`` /
``/aexp-jupyter-discover``) work.
"""
return {
"jupyter": {
"command": "uvx",
"args": ["jupyter-mcp-server"],
},
"jupyter-compute": {
"command": "npx",
"args": [
"-y",
"mcp-remote",
"http://127.0.0.1:3618/mcp",
"--allow-http",
"--header",
"Authorization:token ${JUPYTER_TOKEN}",
],
},
}


@@ -841,14 +815,13 @@ def install_limina(
override (e.g. dogfooding the consumer scaffold against the
dev repo on purpose).
with_jupyter : bool, optional
If ``True``, also write the ``jupyter`` and ``jupyter-compute``
MCP server entries into ``.mcp.json``, vendor
``docs/setup/jupyter-mcp.md`` into the consumer repo, and set
``jupyter_enabled: true`` in the install marker. The marker bit
is sticky — once set, subsequent installs preserve it even if
``with_jupyter=False``. The ``.mcp.json`` entries are additive:
existing user-defined ``jupyter`` / ``jupyter-compute`` blocks
are preserved (so a hardcoded Windows-stable token survives).
If ``True``, also write the ``jupyter`` MCP server entry into
``.mcp.json``, vendor ``docs/setup/jupyter-mcp.md`` into the
consumer repo, and set ``jupyter_enabled: true`` in the install
marker. The marker bit is sticky — once set, subsequent installs
preserve it even if ``with_jupyter=False``. The ``.mcp.json``
entry is additive: an existing user-defined ``jupyter`` block is
preserved (so a customized URL/port survives).
See ``docs/setup/jupyter-mcp.md`` for the full setup recipe.

Returns
84 changes: 45 additions & 39 deletions src/aexp/slash_commands/aexp-jupyter-iterate.md
@@ -2,18 +2,27 @@
description: "Iterate on a JupyterLab cell with the user via the Jupyter MCP bridge (read → propose → execute)."
---

Iterate on whatever cell the user is currently looking at in JupyterLab,
through the Jupyter MCP bridge.
Iterate on a notebook cell with the user, through the Jupyter MCP bridge.

> **Prerequisite.** This command requires the `mcp__jupyter-compute__*`
> tool family. Those tools come from `aexp install --with-jupyter` plus a
> JupyterLab process reachable through the user's SSH tunnel. If the
> tools are missing, run `/mcp` to inspect server status and consult
> `docs/setup/jupyter-mcp.md` for the cluster-side setup recipe.
> **Prerequisite.** This command requires the `mcp__jupyter__*` tool
> family — in particular `connect_to_jupyter`, `execute_code`,
> `read_cell`, and `execute_cell`. Those tools come from
> `aexp install --with-jupyter` plus a JupyterLab process reachable
> through the user's SSH tunnel. If the tools are missing, run `/mcp` to
> inspect server status and consult `docs/setup/jupyter-mcp.md` for the
> setup recipe.

Run through these steps:

0. **Confirm session identity.** Before touching any cells, dispatch:
1. **Check tool availability.** Verify that `mcp__jupyter__execute_cell`
and `mcp__jupyter__read_cell` are present in your tool list. If not,
stop and report:
"Jupyter MCP integration not available in this session. Run
`aexp install --with-jupyter`, ensure the SSH tunnel to the cluster is
open, connect with `/aexp-jupyter-connect`, and restart Claude. See
`docs/setup/jupyter-mcp.md`."

2. **Confirm session identity.** Before touching any cells, dispatch:
```
execute_code(code="from aexp.jupyter import init; import json; print(json.dumps(init().model_dump(), default=str))")
```
@@ -27,46 +36,43 @@ Run through these steps:
shouldn't disturb.

If anything mismatches — wrong SLURM job, wrong host, unexpected GPU
resident — STOP and ask. Do not proceed to step 1.
resident — STOP and ask. Do not proceed to step 3. To switch to a
different Jupyter, use `/aexp-jupyter-connect`.

1. **Check tool availability.** Verify that
`mcp__jupyter-compute__notebook_get-selected-cell` and
`mcp__jupyter-compute__execute_cell` are present in your tool list. If
not, stop and report:
"Jupyter MCP integration not available in this session. Run
`aexp install --with-jupyter`, ensure the SSH tunnel to the cluster is
open, and restart Claude Desktop. See `docs/setup/jupyter-mcp.md`."
3. **Identify the target notebook and cell.** This single-server setup
has no live "what is the user looking at" tool, so ask the user
directly:
"Which notebook should I work in, and which cell — give me the cell
index, or describe it (e.g. 'the training loop')?"
Cross-check the notebook name against the `attached_notebooks` list
from step 2. Open it with `use_notebook` if it isn't already open.

2. **Identify what the user is looking at.** Call
`mcp__jupyter-compute__notebook_get-selected-cell` to read the live UI
selection. Report the notebook path, cell index, and cell type, and
quote the source verbatim so the user can confirm you're targeting the
right cell.
4. **Locate and quote the cell.** Use `read_cell(cell_index=N)` for a
single cell, or `read_notebook` (brief mode) to find the cell the
user described. Report the notebook path, cell index, and cell type,
and quote the source verbatim so the user can confirm you're
targeting the right cell before any edit.

3. **Gather context if needed.** If the user's request references "the
cells above" or relies on prior state, use
`mcp__jupyter-compute__read_cell` on adjacent indices, or
`mcp__jupyter-compute__read_notebook` (brief mode) for an overview.
Don't dump the whole notebook unless asked.
5. **Gather context if needed.** If the user's request references "the
cells above" or relies on prior state, use `read_cell` on adjacent
indices, or `read_notebook` (brief mode) for an overview. Don't dump
the whole notebook unless asked.

4. **Propose a change.** Describe what you intend to modify and why.
6. **Propose a change.** Describe what you intend to modify and why.
Do NOT make the edit until the user confirms.

5. **On approval, apply the edit.** Use:
7. **On approval, apply the edit.** Use:
- `edit_cell_source` for surgical find/replace within one cell.
- `overwrite_cell_source` for full replacement.
- `insert_cell` to add a new cell.

6. **Execute the cell.** Call
`mcp__jupyter-compute__execute_cell(cell_index=N)` with a reasonable
timeout. Paste the actual stdout / errors verbatim — don't paraphrase.
8. **Execute the cell.** Call `execute_cell(cell_index=N)` with a
reasonable timeout. Paste the actual stdout / errors verbatim — don't
paraphrase.

7. **Iterate or wrap up.** If the cell errored, propose the next fix and
loop back to step 4. If it succeeded, ask the user whether to continue
or stop.
9. **Iterate or wrap up.** If the cell errored, propose the next fix and
loop back to step 6. If it succeeded, ask the user whether to
continue or stop.

> **Do NOT use** `notebook_run-all-cells` — it is exposed by the bridge
> but currently returns 404 (asymmetric upstream bug, see
> `docs/setup/jupyter-mcp.md` "Investigation log" §5). Loop
> `execute_cell(cell_index=i)` over indices instead when a multi-cell run
> is needed.
> **Multi-cell runs.** To run a span of cells, loop
> `execute_cell(cell_index=i)` over the indices in order.
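
The multi-cell loop can be pictured as follows. This is a sketch under stated assumptions: `execute_cell` here is a plain Python callable standing in for the MCP tool (the real tool is dispatched by the agent, not imported), and stopping at the first failing cell is an illustrative choice that the command text does not mandate.

```python
from typing import Callable, Iterable


def run_span(
    execute_cell: Callable[[int], str], indices: Iterable[int]
) -> list[tuple[int, str]]:
    """Run cells in order and collect (index, output) pairs.

    ``execute_cell`` is a hypothetical stand-in for the MCP tool call.
    Execution stops at the first cell that raises, since later cells
    usually depend on earlier state.
    """
    results: list[tuple[int, str]] = []
    for i in indices:
        try:
            results.append((i, execute_cell(i)))
        except Exception as exc:  # report the failure verbatim, then stop
            results.append((i, f"ERROR: {exc}"))
            break
    return results
```

For example, `run_span(execute_cell, range(4, 13))` would cover cells 4 through 12 in order and return whatever each call produced, ending early if one of them fails.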
38 changes: 19 additions & 19 deletions src/aexp/slash_commands/aexp-promote-nb.md
@@ -13,11 +13,10 @@ committed experiment.
> The notebook stays as the smoke-test record — it's not edited, only
> read. Outputs land at `<repo_root>/experiments/E<id>-<slug>.py`.

> **Prerequisite.** Best with `mcp__jupyter-compute__*` tools available
> **Prerequisite.** Best with the `mcp__jupyter__*` tools available
> (from `aexp install --with-jupyter`). If they're not, you can still
> promote cells from a `.ipynb` file on disk via the standard Read tool;
> you'll lose the live cell-selection convenience but everything else
> works.
> everything in this command works either way.

> **Invocation note.** The examples below use `python -m aexp` directly.
> If running from a Claude Code session where `python` does not resolve
@@ -33,25 +32,26 @@ committed experiment.

Run through these steps:

1. **Tool availability check.** Verify whether
`mcp__jupyter-compute__notebook_get-selected-cell` and
`mcp__jupyter-compute__read_cell` are present. If yes, use them in
the steps below. If not, ask the user for the path to the `.ipynb`
file on disk and read it via the standard Read tool — JupyterLab
notebooks are JSON; you can extract `cells[i].source` directly.
1. **Tool availability check.** Verify whether the `mcp__jupyter__*`
tools (`read_cell`, `read_notebook`, `use_notebook`) are present. If
yes, use them in the steps below. If not, ask the user for the path
to the `.ipynb` file on disk and read it via the standard Read tool —
JupyterLab notebooks are JSON; you can extract `cells[i].source`
directly.

2. **Identify the source notebook.** With the MCP bridge: call
`mcp__jupyter-compute__notebook_get-selected-cell` to anchor on the
user's current focus and report the notebook path back to them. Without
the bridge: ask the user for the notebook path explicitly.
2. **Identify the source notebook.** Ask the user for the notebook path
explicitly. With the MCP bridge, cross-check it against the open
notebooks (`aexp.jupyter.init().attached_notebooks`) and open it with
`use_notebook` if it isn't already open. Without the bridge, take the
path the user gives you.

3. **Identify the cell range to promote.** Ask:
"promote just the currently-selected cell, or a range? if a range,
give me indices (e.g., `4-12`) or describe the cells (e.g., 'from the
model definition through the training loop')." Read each target cell
(`read_cell(cell_index=N)` via MCP, or by indexing into
`cells[]` from the on-disk JSON). Quote the source verbatim back to
the user and confirm the selection before going further.
"which cells should I promote — give me indices (e.g., `4-12`) or
describe the cells (e.g., 'from the model definition through the
training loop')." Read each target cell (`read_cell(cell_index=N)`
via MCP, or by indexing into `cells[]` from the on-disk JSON). Quote
the source verbatim back to the user and confirm the selection before
going further.

4. **Identify the experiment.** Ask which `E###` this script is being
promoted under. If the user is unsure, suggest checking
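
For the no-bridge path in steps 1 and 3, reading `cells[i].source` from the on-disk JSON might look like the sketch below. The helper name `cell_sources` is hypothetical. The sketch assumes an nbformat 4 notebook, where `cells` is a top-level list and `source` may be stored either as one string or as a list of line strings.

```python
import json
from pathlib import Path


def cell_sources(notebook_path: Path, start: int, stop: int) -> list[tuple[int, str, str]]:
    """Return (index, cell_type, source) for cells start..stop inclusive.

    Hypothetical helper; the inclusive range mirrors a user answer
    such as `4-12`.
    """
    nb = json.loads(notebook_path.read_text(encoding="utf-8"))
    selected = []
    for i, cell in enumerate(nb["cells"][start : stop + 1], start=start):
        src = cell["source"]
        # nbformat allows `source` to be a string or a list of lines.
        text = "".join(src) if isinstance(src, list) else src
        selected.append((i, cell["cell_type"], text))
    return selected
```

The returned source text is unmodified, so it can be quoted verbatim back to the user for confirmation before anything is promoted.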