diff --git a/README.md b/README.md index 081fbf1..1452bf3 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ A repository of guides with examples you can take after a successful [hello-worl | [**gog-demo**](gog-demo/gog-openclaw-guide.md) | Connecting Google Workspace (Gmail, Calendar, Drive) to OpenClaw via the `gog` CLI. | | [**google-workspace-demo**](google-workspace-demo/google-workspace-guide.md) | Full Google Workspace integration (Gmail, Calendar, Drive, Sheets, Contacts, Tasks) with Tier 1 push daemon security. | | [**planet-integration-demo**](planet-integration-demo/planet-integration-guide.md) | Planet satellite imagery catalog, tasking cost estimation, and satellite pass availability with Tier 1 proxy security. | +| [**autoresearch-demo**](autoresearch-demo/autoresearch-hermes-guide.md) | Autonomous AI research agent for Hermes: run experiments on remote GPU/Slurm via MCP, follow the Karpathy autoresearch loop, write papers. | ## Official Resources diff --git a/autoresearch-demo/autoresearch-hermes-guide.md b/autoresearch-demo/autoresearch-hermes-guide.md new file mode 100644 index 0000000..744b861 --- /dev/null +++ b/autoresearch-demo/autoresearch-hermes-guide.md @@ -0,0 +1,222 @@ +# Connecting a Slurm Cluster to Hermes in a NemoClaw Sandbox + +This guide connects a remote Slurm cluster to a Hermes agent running inside a NemoClaw sandbox. By the end, the agent can submit jobs, monitor GPU availability, manage files, and run shell commands on the cluster — all through MCP tools from inside the sandbox. + +The connection uses [slurm-mcp](https://github.com/yidong72/slurm_mcp), a Python MCP server that provides 34 tools for Slurm job management, SSH, and file operations. The agent follows Karpathy's [autoresearch](https://github.com/karpathy/autoresearch) pattern: modify code, run experiment, evaluate, keep or discard, repeat. + +The SSH key **never enters the sandbox** — it stays on the host where the MCP server runs. 
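A rough sketch of why the key can stay on the host: the sandbox only needs an HTTP/SSE endpoint. The snippet mirrors the mcporter config that `bootstrap.sh` writes in Step 4. The helper name `mcporter_config` is hypothetical, and the sample IP and port are the fallback and default values used by that script.

```bash
#!/usr/bin/env bash
# Sketch: build the sandbox-side mcporter config. It carries only an alias
# and an HTTP/SSE URL; no SSH key, username, or password appears in it.
# mcporter_config is a hypothetical helper name, not part of the repo.
mcporter_config() {
  local alias_name="$1" host_ip="$2" mcp_port="$3"
  printf '{"mcpServers":{"%s":{"type":"http","baseUrl":"http://%s:%s/sse"}}}\n' \
    "$alias_name" "$host_ip" "$mcp_port"
}

# Example values (assumed): the fallback bridge IP and the default MCP port.
mcporter_config cluster 172.29.0.254 9878
```

Everything that touches SSH credentials lives in the host-side wrapper script, so this file is the only connection detail the agent can read.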
+ +> **Note:** This guide targets Slurm clusters. If your remote machine does not have Slurm (e.g. a single GPU workstation), the `run_shell_command`, `read_file`, `write_file`, and file tools still work — the Slurm-specific tools (`submit_job`, `get_gpu_availability`, etc.) will simply return errors and can be ignored. + +## Prerequisites + +| Requirement | Details | +|-------------|---------| +| Running NemoClaw sandbox | A working Hermes sandbox. See [NemoClaw hello-world setup](https://github.com/NVIDIA/NemoClaw). | +| Slurm cluster with SSH access | A login node you can SSH into (e.g. `login.hpc.example.com`) | +| SSH key or password | Credentials for the cluster login node | +| uv / uvx | `pip install uv` or `brew install uv` — runs slurm-mcp and mcp-proxy (Python) | +| Node.js | `brew install node` — mcporter inside the sandbox needs it | + +## What's in this directory + +| File | Purpose | +|------|---------| +| `autoresearch-hermes-guide.md` | This guide | +| `setup.sh` | One-command installer: MCP bridge + agent persona + research skills | +| `bootstrap.sh` | Sets up mcp-proxy + slurm-mcp on host, mcporter + policy + skill in sandbox | +| `policy.yaml` | Reference network policy (applied automatically by bootstrap) | +| `autoresearch-skill/SKILL.md` | Teaches the agent how to call mcporter (bash commands, not Python) | +| `autoresearch-skill/SOUL.md` | Agent instructions: autoresearch loop, Slurm safety rules | +| `autoresearch-skill/USER.md` | User profile template (edit for your environment) | +| `autoresearch-skill/MEMORY.md` | Agent memory: cluster info, MCP usage, paper writing reference | + +## How it works + +The connection follows the same MCP-over-HTTP pattern as the [Blender demo](https://github.com/brevdev/nemoclaw-demos/tree/main/blender-demo): + +``` +Host (your Mac or Linux box) +│ +├── mcp-proxy (HTTP/SSE on port 9878) +│ └── slurm-mcp (Python, asyncssh) +│ ├── Holds your SSH key (never enters sandbox) +│ ├── Connects to Slurm login node 
via SSH +│ └── Exposes 34 MCP tools (Slurm, shell, files) +│ +└── NemoClaw sandbox (Hermes agent) + ├── mcporter → HTTP/SSE → mcp-proxy (through L7 proxy) + ├── ssh-remote skill (teaches agent how to call mcporter) + ├── research skills (arxiv, paper writing — from Hermes upstream) + ├── mlops skills (training, inference, eval — from Hermes upstream) + └── SOUL.md (autoresearch loop + Slurm safety rules) +``` + +The sandbox L7 proxy enforces the network policy — the agent can only reach the mcp-proxy endpoint you approved. All other egress is blocked. + +## Part 1: Install NemoClaw (if not already done) + +```bash +curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash +source ~/.bashrc +``` + +Create a Hermes sandbox (note the `--agent hermes` flag — this is required, the default is OpenClaw): + +```bash +nemoclaw onboard --agent hermes +``` + +When prompted, choose your inference provider and name the sandbox (e.g. `research`). + +> **Important:** This demo requires the Hermes agent, not OpenClaw. The skill paths, memory locations, and mcporter setup are all Hermes-specific. If you already have a sandbox running OpenClaw, create a new one with `--agent hermes`. + +## Part 2: Clone This Repo + +```bash +cd /autoresearch-demo +``` + +Edit `autoresearch-skill/USER.md` with your details (name, role, cluster info). + +## Part 3: Run Setup + +The `setup.sh` script does everything in one command: + +1. **Slurm MCP bridge** — starts mcp-proxy + slurm-mcp on the host, installs mcporter in sandbox, applies network policy, uploads the ssh-remote skill +2. **Agent persona** — uploads SOUL.md, USER.md, MEMORY.md into the Hermes agent's memory +3. 
**Research skills** — fetches the latest research and mlops skills from [NousResearch/hermes-agent](https://github.com/NousResearch/hermes-agent/tree/main/skills) + +### Slurm cluster with SSH key: + +```bash +./setup.sh \ + --sandbox research \ + --alias cluster=login.hpc.example.com \ + --key ~/.ssh/id_ed25519 \ + --user jsmith \ + --user-root /lustre/users/jsmith +``` + +### Slurm cluster with password: + +```bash +./setup.sh \ + --sandbox research \ + --alias cluster=login.hpc.example.com \ + --password "your-password" \ + --user jsmith \ + --user-root /lustre/users/jsmith +``` + +Replace: +- `research` — your sandbox name +- `cluster=login.hpc.example.com` — a friendly alias and the login node hostname or IP +- `~/.ssh/id_ed25519` — path to your SSH private key +- `jsmith` — your cluster username +- `/lustre/users/jsmith` — your home/root directory on the cluster (used by slurm-mcp for directory tools) + +You should see output ending with: + +``` +=== Slurm/SSH MCP ready === + + Sandbox: research + Target: cluster (login.hpc.example.com) + MCP proxy: localhost:9878 (pid XXXXX) + Tools: 34 (Slurm + SSH + files) + SSH key: ON HOST ONLY +``` + +## Part 4: Verify + +Test the MCP connection from the host: + +```bash +openshell sandbox exec -n research -- \ + /sandbox/bin/mcporter call cluster.run_shell_command command="hostname" +``` + +Expected output: +```json +{ + "result": "login01.hpc.example.com\n" +} +``` + +Test Slurm access: +```bash +openshell sandbox exec -n research -- \ + /sandbox/bin/mcporter call cluster.get_cluster_status +``` + +## Part 5: Use It + +Open the Hermes chat and try: + +- *"What GPUs are available on the cluster?"* +- *"Submit a training job on partition gpu with 4 GPUs, time limit 2 hours"* +- *"List my running jobs"* +- *"Search arxiv for recent papers on mixture of experts"* +- *"Set up an autoresearch loop to optimize train.py on the cluster"* + +The agent uses `/sandbox/bin/mcporter call cluster.` to execute commands remotely. 
The SOUL.md instructs it to: + +1. **Always ask before Slurm jobs** — which partition(s), how many concurrent jobs, max time limit, GPU count +2. **Follow the autoresearch loop** — modify → run → evaluate → keep/discard → repeat +3. **Never stop** — runs autonomously until you interrupt it +4. **Use the paper-writing skill** when you ask for a writeup (NeurIPS, ICML, ICLR, ACL, AAAI, COLM templates) + +## Available MCP Tools + +The agent has 34 tools for the remote cluster: + +| Category | Tools | +|----------|-------| +| **Shell** | `run_shell_command` | +| **Cluster** | `get_cluster_status`, `get_partition_info`, `get_node_info`, `get_gpu_info`, `get_gpu_availability` | +| **Jobs** | `submit_job`, `list_jobs`, `get_job_details`, `cancel_job`, `hold_job`, `release_job`, `get_job_history` | +| **Interactive** | `start_interactive_session`, `exec_in_session`, `list_interactive_sessions`, `end_interactive_session`, `get_interactive_session_info` | +| **Profiles** | `save_interactive_profile`, `list_interactive_profiles`, `start_session_from_profile` | +| **Files** | `list_directory`, `list_datasets`, `list_model_checkpoints`, `list_job_logs`, `read_file`, `write_file`, `find_files`, `delete_file`, `get_disk_usage`, `get_cluster_directories` | +| **Containers** | `list_container_images`, `validate_container_image` | + +All called as: `/sandbox/bin/mcporter call cluster. ` + +## Re-deploy After Reboot + +The mcp-proxy runs as a host process — it stops when you reboot. 
To restart: + +```bash +cd /autoresearch-demo +./bootstrap.sh --sandbox research \ + --alias cluster=login.hpc.example.com \ + --key ~/.ssh/id_ed25519 \ + --user jsmith \ + --user-root /lustre/users/jsmith +``` + +## Stopping the MCP Proxy + +```bash +kill $(cat ~/.local/state/nemoclaw-ssh-skill/mcp-proxy.pid) +``` + +## Troubleshooting + +| Issue | Fix | +|-------|-----| +| `mcporter: command not found` | Use full path: `/sandbox/bin/mcporter` | +| Agent tries to install MCP libraries | The SKILL.md tells it not to. Tell the agent: "Run `/sandbox/bin/mcporter call cluster.run_shell_command command='hostname'`" | +| `l7_decision=deny` in `openshell logs` | Policy doesn't match. Run `openshell policy get --full research` and check the `ssh_mcp` block has the correct host IP and port. | +| `EHOSTUNREACH` from mcp-proxy | On macOS: Node.js may be blocked by the firewall. slurm-mcp uses Python (asyncssh) which is typically allowed. Check: `python3 -c "import socket; s = socket.create_connection(('', 22), timeout=5); print('OK'); s.close()"` | +| `SSH connection error` | Verify from host: `ssh -i ~/.ssh/id_ed25519 jsmith@login.hpc.example.com hostname`. If that works, restart mcp-proxy. | +| mcp-proxy died | Check log: `tail ~/.local/state/nemoclaw-ssh-skill/mcp-proxy.log`. Re-run `bootstrap.sh`. | +| Slurm tools return errors | If the remote machine has no Slurm, this is expected. The shell and file tools still work. | + +## Security Model + +- **SSH key isolation**: The private key stays on the host inside the mcp-proxy process. The agent calls MCP tools over HTTP — it never sees or handles the key. +- **Network policy**: The sandbox can only reach the mcp-proxy's HTTP endpoint (one IP + port). All other egress is denied by default. +- **L7 proxy**: OpenShell's proxy inspects and enforces all traffic. The `access: full` policy grants HTTP forwarding to the mcp-proxy, not raw TCP to arbitrary hosts. 
+- **Slurm safety**: The SOUL.md instructs the agent to always ask for partition/quota confirmation before submitting jobs. This is an instruction-level guard — the MCP tools themselves do not enforce limits. diff --git a/autoresearch-demo/autoresearch-skill/IDENTITY.md b/autoresearch-demo/autoresearch-skill/IDENTITY.md new file mode 100644 index 0000000..c240d93 --- /dev/null +++ b/autoresearch-demo/autoresearch-skill/IDENTITY.md @@ -0,0 +1,7 @@ +# Identity + +- **Name:** Researcher +- **Role:** Autonomous AI research agent +- **Approach:** Karpathy autoresearch loop — modify, run, evaluate, keep/discard, repeat +- **Execution:** All experiments on remote machine via MCP (never local) +- **Paper writing:** Full pipeline via Hermes research-paper-writing skill diff --git a/autoresearch-demo/autoresearch-skill/MEMORY.md b/autoresearch-demo/autoresearch-skill/MEMORY.md new file mode 100644 index 0000000..5983572 --- /dev/null +++ b/autoresearch-demo/autoresearch-skill/MEMORY.md @@ -0,0 +1,32 @@ +# Memory + +## Infrastructure + +- **Remote machine:** (configured by setup.sh — alias and IP filled in automatically) +- **Access:** MCP via `/sandbox/bin/mcporter call .` +- **Tools:** 34 Slurm/SSH tools (run_shell_command, submit_job, get_gpu_availability, read_file, write_file, etc.) + +## How to run commands on the remote machine + +```bash +/sandbox/bin/mcporter call .run_shell_command command="" +``` + +This is a bash command. Run it in the terminal. Do NOT install any MCP or SSH libraries. 
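The call pattern above can be wrapped in a small shell function so that a remote command containing spaces or `&&` is always passed as a single `command=` argument. This is only a sketch: `remote_run` and `MCPORTER_BIN` are hypothetical names, and the alias `cluster` is an assumed example.

```bash
#!/usr/bin/env bash
# Sketch: wrap the mcporter call so the remote command travels as one
# argument. remote_run and MCPORTER_BIN are hypothetical names.
MCPORTER_BIN="${MCPORTER_BIN:-/sandbox/bin/mcporter}"

remote_run() {
  local alias_name="$1"; shift
  # "$*" joins the remaining words into a single command string.
  "$MCPORTER_BIN" call "${alias_name}.run_shell_command" "command=$*"
}

# Usage inside the sandbox, assuming the alias is "cluster":
#   remote_run cluster nvidia-smi
#   remote_run cluster "cd /workspace && python train.py --lr 0.001"
```

Making the binary path overridable allows a dry run (for example `MCPORTER_BIN=echo`) without touching the cluster.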
+ +## Autoresearch pattern + +Based on Karpathy's autoresearch (https://github.com/karpathy/autoresearch): +- Modify code → run experiment → evaluate metric → keep or discard → repeat +- Fixed time budget per experiment for fair comparison +- Log everything to results.tsv +- Never stop — run autonomously until interrupted + +## Paper writing + +Use the Hermes research-paper-writing skill for the full pipeline: +- Literature review (arxiv search) +- Experiment design and execution (on remote machine) +- Analysis and visualization +- LaTeX drafting (NeurIPS, ICML, ICLR, ACL, AAAI, COLM templates) +- Self-review and revision diff --git a/autoresearch-demo/autoresearch-skill/SKILL.md b/autoresearch-demo/autoresearch-skill/SKILL.md new file mode 100644 index 0000000..5d8f6d5 --- /dev/null +++ b/autoresearch-demo/autoresearch-skill/SKILL.md @@ -0,0 +1,58 @@ +--- +name: ssh-remote +description: "Run commands on a remote GPU server via MCP. Use this skill whenever the user asks to do anything on the remote machine — training jobs, GPU checks, file operations, Slurm jobs." +--- + +# Remote Server Access + +Run shell commands on the remote server using `/sandbox/bin/mcporter`. +This is a bash command — run it in the terminal. Do NOT install any MCP libraries. + +## How to run a command + +```bash +/sandbox/bin/mcporter call .run_shell_command command="" +``` + +That's it. Just run that in the terminal. Examples: + +```bash +/sandbox/bin/mcporter call .run_shell_command command="hostname" +/sandbox/bin/mcporter call .run_shell_command command="nvidia-smi" +/sandbox/bin/mcporter call .run_shell_command command="ls -la /home/user" +/sandbox/bin/mcporter call .run_shell_command command="cd /workspace && python train.py --lr 0.001" +``` + +## Other available tools + +All called the same way — `/sandbox/bin/mcporter call . 
`: + +| Tool | What it does | +|------|-------------| +| `run_shell_command` | Run any shell command | +| `get_gpu_availability` | Check free GPUs | +| `get_cluster_status` | Slurm partitions and nodes | +| `submit_job` | Submit Slurm batch job | +| `list_jobs` | List running/pending jobs | +| `get_job_details` | Job details by ID | +| `cancel_job` | Cancel a job | +| `list_directory` | List remote directory | +| `read_file` | Read a remote file | +| `write_file` | Write a remote file | +| `find_files` | Search for files | + +## Important + +- `mcporter` is already installed at `/sandbox/bin/mcporter` — do NOT install anything +- Run it as a bash command in the terminal — it is NOT a Python library +- Timeout: 120s — for long jobs use `submit_job` instead of `run_shell_command` + +## MANDATORY: Before Slurm jobs + +Before submitting ANY Slurm job, you MUST ask the user: +1. Which partition(s) can I use? +2. How many concurrent jobs am I allowed to launch? +3. What is the maximum time limit per job? +4. What GPU type/count should I request? + +Do NOT assume you have unlimited cluster access. diff --git a/autoresearch-demo/autoresearch-skill/SOUL.md b/autoresearch-demo/autoresearch-skill/SOUL.md new file mode 100644 index 0000000..62ef975 --- /dev/null +++ b/autoresearch-demo/autoresearch-skill/SOUL.md @@ -0,0 +1,73 @@ +# Soul + +You are an autonomous AI research agent running inside a NemoClaw sandbox. + +## How you work + +You follow the autoresearch loop (Karpathy, 2026): modify code → run experiment → evaluate → keep or discard → repeat. You never stop unless the human interrupts you. + +All experiments run on a remote machine called by the alias you configured (e.g. **cluster**) via MCP tools. You do NOT run experiments locally inside the sandbox. Every training run, evaluation, and data operation happens on the remote machine. 
+ +## Experiment execution + +To run any command on the remote machine: +```bash +/sandbox/bin/mcporter call .run_shell_command command="" +``` + +For Slurm clusters, use the Slurm tools: +```bash +/sandbox/bin/mcporter call .submit_job job_name="exp-001" partition="gpu" num_gpus=4 command="python train.py" time_limit="0:10:00" +/sandbox/bin/mcporter call .get_gpu_availability +/sandbox/bin/mcporter call .list_jobs +``` + +These are bash commands. Run them in the terminal. Do NOT install MCP libraries. + +## MANDATORY: Before running experiments + +Before launching ANY experiment, you MUST ask the human: + +1. **Where will this run?** Direct SSH to the remote machine, or Slurm? +2. **If Slurm:** + - Which partition(s) can I use? + - How many concurrent jobs am I allowed to launch? + - What is the maximum time limit per job? + - What GPU type/count should I request? +3. **If direct SSH:** Confirm the working directory and whether the GPU is free. + +Do NOT assume you have unlimited access to a cluster. Do NOT submit Slurm jobs without explicit partition and quota confirmation. + +## The autoresearch loop + +LOOP FOREVER: + +1. Look at current state: what's the best result so far? +2. Propose an experimental idea. Think about what to try — read papers, re-read code, try combining near-misses, try radical changes. +3. Modify the code on the remote machine (via `run_shell_command` or `write_file`). +4. Commit the change on the remote machine. +5. Run the experiment (via `run_shell_command` for SSH, `submit_job` for Slurm). +6. Read results (via `run_shell_command` to grep logs, or `read_file`). +7. If improved: keep. If not: revert. +8. Log the result in results.tsv. +9. Go to 1. + +**NEVER STOP.** Do not pause to ask "should I continue?" The human may be sleeping. Keep running experiments until manually interrupted. If you run out of ideas, think harder. 
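Steps 7 and 8 of the loop (keep or revert, then log) can be sketched in shell as follows. The assumptions here are not from the source: a lower metric is treated as better, and the `results.tsv` columns are experiment id, metric, and decision. The function names `decide` and `log_result` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch of the keep/discard decision and results.tsv logging.
# Assumptions: lower metric is better; columns are id, metric, decision.
RESULTS_FILE="${RESULTS_FILE:-results.tsv}"

# Prints "keep" if the new metric beats the best so far, else "discard".
decide() {
  local best="$1" new="$2"
  awk -v b="$best" -v n="$new" \
    'BEGIN { print ((n + 0 < b + 0) ? "keep" : "discard") }'
}

# Appends one tab-separated row to the results file and echoes the decision.
log_result() {
  local exp_id="$1" best="$2" new="$3" decision
  decision="$(decide "$best" "$new")"
  printf '%s\t%s\t%s\n' "$exp_id" "$new" "$decision" >> "$RESULTS_FILE"
  echo "$decision"
}
```

In a real run, "keep" would mean leaving the commit in place on the remote machine and "discard" would mean reverting it before the next iteration.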
+ +## Paper writing + +When the human asks you to write up results, use the research-paper-writing skill from Hermes. It covers NeurIPS, ICML, ICLR, ACL, AAAI, COLM formats with LaTeX templates. The pipeline is: + +1. Literature review (arxiv skill) +2. Experiment design → execution (autoresearch loop on the remote machine) +3. Analysis (read results from the remote machine) +4. Paper drafting (LaTeX, locally in sandbox workspace) +5. Self-review and revision +6. Submission prep + +## Communication + +- Be direct and technical. No filler. +- Log every experiment result — the human needs to see what you tried and why. +- When proposing an experiment, say what you expect and why in one sentence. +- Keep the results.tsv updated at all times. diff --git a/autoresearch-demo/autoresearch-skill/USER.md b/autoresearch-demo/autoresearch-skill/USER.md new file mode 100644 index 0000000..37dda58 --- /dev/null +++ b/autoresearch-demo/autoresearch-skill/USER.md @@ -0,0 +1,8 @@ +# User + +- **Name:** (your name) +- **Role:** (your role) +- **Expertise:** ML training, inference, GPU systems +- **Hardware:** (your GPU setup, e.g. "GPU server at 10.0.0.1") +- **Communication:** Brief, direct, technical. Expects results, not explanations. +- **Tools:** uv (Python), PyTorch, Triton, vLLM, Slurm diff --git a/autoresearch-demo/bootstrap.sh b/autoresearch-demo/bootstrap.sh new file mode 100755 index 0000000..f7c2361 --- /dev/null +++ b/autoresearch-demo/bootstrap.sh @@ -0,0 +1,380 @@ +#!/usr/bin/env bash +# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. +# SPDX-License-Identifier: Apache-2.0 +# +# Bootstrap Slurm/SSH MCP access for a NemoClaw sandbox. 
+# +# Architecture (follows the Blender MCP demo pattern): +# Host: slurm-mcp (Python, holds SSH key) → mcp-proxy (HTTP/SSE) +# Policy: sandbox allowed to reach host mcp-proxy port +# Sandbox: mcporter connects to mcp-proxy, agent gets 34 Slurm + SSH tools +# +# The SSH private key NEVER enters the sandbox. +# +# Usage: +# ./bootstrap.sh \ +# --sandbox \ +# --alias builder= \ +# [--key ] \ +# [--user ] \ +# [--user-root ] \ +# [--port ] \ +# [--mcp-port ] + +set -euo pipefail + +SKILL_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +# -- Parse flags --------------------------------------------------------------- + +SANDBOX="" +SSH_KEY="" +SSH_PASSWORD="" +SSH_PORT=22 +SSH_USER="" +USER_ROOT="" +MCP_PORT=9878 +ALIAS_NAME="" +ALIAS_HOST="" + +usage() { + echo "Usage: $0 \\" + echo " --sandbox --alias = \\" + echo " [--key ] [--user ] [--user-root ]" + echo " [--port ] [--mcp-port ]" + echo "" + echo " --sandbox OpenShell sandbox name" + echo " --alias SSH host alias (e.g. --alias builder=)" + echo " --key Path to SSH private key (stays on host, never enters sandbox)" + echo " --password SSH password (use --key OR --password, not both)" + echo " --user SSH username" + echo " --user-root Remote home/root directory (default: /home/)" + echo " --port SSH port (default: 22)" + echo " --mcp-port Port for mcp-proxy on host (default: 9878)" + exit 1 +} + +while [[ $# -gt 0 ]]; do + case "$1" in + --sandbox) SANDBOX="$2"; shift 2 ;; + --alias) + ALIAS_NAME="${2%%=*}" + ALIAS_HOST="${2#*=}" + if [[ "$ALIAS_NAME" == "$2" || -z "$ALIAS_HOST" ]]; then + echo "Error: --alias must be name=host (e.g. 
builder=)" + exit 1 + fi + shift 2 ;; + --key) SSH_KEY="$2"; shift 2 ;; + --password) SSH_PASSWORD="$2"; shift 2 ;; + --user) SSH_USER="$2"; shift 2 ;; + --user-root) USER_ROOT="$2"; shift 2 ;; + --port) SSH_PORT="$2"; shift 2 ;; + --mcp-port) MCP_PORT="$2"; shift 2 ;; + -h|--help) usage ;; + *) echo "Unknown option: $1"; usage ;; + esac +done + +[[ -z "$SANDBOX" ]] && echo "Error: --sandbox is required." && usage +[[ -z "$ALIAS_NAME" ]] && echo "Error: --alias is required." && usage + +if [[ -n "$SSH_KEY" && ! -f "$SSH_KEY" ]]; then + echo "Error: SSH key not found: $SSH_KEY" + exit 1 +fi + +[[ -z "$USER_ROOT" && -n "$SSH_USER" ]] && USER_ROOT="/home/$SSH_USER" +[[ -z "$USER_ROOT" ]] && USER_ROOT="/home/\$USER" + +# -- Pre-flight ---------------------------------------------------------------- + +for cmd in openshell uvx; do + if ! command -v "$cmd" &>/dev/null; then + echo "Error: '$cmd' not found." + exit 1 + fi +done + +if ! openshell sandbox get "$SANDBOX" &>/dev/null; then + echo "Error: sandbox '$SANDBOX' not found or not running." + exit 1 +fi + +echo "=== Slurm/SSH MCP bootstrap ===" +echo " Sandbox: $SANDBOX" +echo " Target: $ALIAS_NAME ($ALIAS_HOST)" +echo " MCP port: $MCP_PORT" +echo " Key: ${SSH_KEY:-} (stays on HOST)" +echo "" + +# -- Step 1: Start slurm-mcp via mcp-proxy on host ---------------------------- + +echo "--- Step 1: mcp-proxy + slurm-mcp on host ---" + +LOG_DIR="${XDG_STATE_HOME:-$HOME/.local/state}/nemoclaw-ssh-skill" +mkdir -p "$LOG_DIR" +LOG_FILE="$LOG_DIR/mcp-proxy.log" +PID_FILE="$LOG_DIR/mcp-proxy.pid" + +# Kill existing +if [[ -f "$PID_FILE" ]]; then + OLD_PID=$(cat "$PID_FILE" 2>/dev/null || true) + if [[ -n "$OLD_PID" ]] && kill -0 "$OLD_PID" 2>/dev/null; then + echo " Stopping existing mcp-proxy (pid $OLD_PID)..." 
+ kill "$OLD_PID" 2>/dev/null || true + sleep 1 + fi +fi + +# Also kill anything on our port +EXISTING_PID=$(lsof -ti :$MCP_PORT 2>/dev/null || true) +if [[ -n "$EXISTING_PID" ]]; then + kill $EXISTING_PID 2>/dev/null || true # unquoted on purpose: lsof may print several PIDs + sleep 1 +fi + +# Create wrapper script with baked env (slurm-mcp reads config from env vars) +WRAPPER="$LOG_DIR/run-slurm-mcp.sh" +cat > "$WRAPPER" << EOF +#!/bin/bash +export SLURM_SSH_HOST=$ALIAS_HOST +export SLURM_SSH_PORT=$SSH_PORT +${SSH_USER:+export SLURM_SSH_USER=$SSH_USER} +${SSH_KEY:+export SLURM_SSH_KEY_PATH=$SSH_KEY} +${SSH_PASSWORD:+export SLURM_SSH_PASSWORD=$SSH_PASSWORD} +export SLURM_SSH_KNOWN_HOSTS=$HOME/.ssh/known_hosts +export SLURM_USER_ROOT=$USER_ROOT +export SLURM_COMMAND_TIMEOUT=120 +exec uvx --from "git+https://github.com/yidong72/slurm_mcp.git" slurm-mcp +EOF +chmod +x "$WRAPPER" + +echo " Starting: mcp-proxy :$MCP_PORT → slurm-mcp → $ALIAS_HOST" +nohup uvx mcp-proxy --host 0.0.0.0 --port "$MCP_PORT" "$WRAPPER" \ + > "$LOG_FILE" 2>&1 & +MCP_PID=$! +echo "$MCP_PID" > "$PID_FILE" + +echo -n " Waiting" +for i in $(seq 1 15); do + if lsof -ti :$MCP_PORT >/dev/null 2>&1 && kill -0 "$MCP_PID" 2>/dev/null; then + echo " OK (pid $MCP_PID)" + break + fi + if ! kill -0 "$MCP_PID" 2>/dev/null; then + echo " FAILED" + tail -5 "$LOG_FILE" 2>/dev/null + exit 1 + fi + echo -n "." + sleep 1 +done + +# -- Step 2: Network policy ---------------------------------------------------- + +echo "" +echo "--- Step 2: Network policy ---" + +# Get the Docker bridge IPv4 that reaches the host (where mcp-proxy listens). +# Must be IPv4 — the L7 proxy's SSRF guard and our policy block bracket-less IPv6, +# and `getent hosts` can return IPv6 first on some hosts. 
+HOST_IP=$(openshell sandbox exec -n "$SANDBOX" -- bash -c \ + "getent ahosts host.openshell.internal | awk '/STREAM/ && \$1 !~ /:/ {print \$1; exit}'" 2>/dev/null) +if [[ -z "$HOST_IP" ]]; then + HOST_IP="172.29.0.254" + echo " Warning: could not resolve host.openshell.internal IPv4, using $HOST_IP" +fi + +POLICY_FILE=$(mktemp /tmp/nemoclaw-mcp-policy-XXXXXX.yaml) +CURRENT=$(openshell policy get --full "$SANDBOX" 2>/dev/null | awk '/^---/{found=1; next} found{print}') + +python3 -c " +import re, sys +current = sys.stdin.read() +port = int(sys.argv[1]) +host_ip = sys.argv[2] + +# Remove any previous ssh_mcp or ssh_remote block +current = re.sub(r' ssh_(mcp|remote):.*?(?=\n \w|\Z)', '', current, flags=re.DOTALL) + +block = f''' ssh_mcp: + name: ssh_mcp + endpoints: + - host: \"{host_ip}\" + port: {port} + access: full + binaries: + - path: /usr/local/bin/node* + - path: /usr/bin/node* + - path: /usr/bin/curl* + - path: /bin/bash*''' + +result = current.rstrip() + '\n' + block + '\n' +print(result) +" "$MCP_PORT" "$HOST_IP" <<< "$CURRENT" > "$POLICY_FILE" + +openshell policy set --policy "$POLICY_FILE" --wait "$SANDBOX" +rm -f "$POLICY_FILE" +echo " Allowed: sandbox → $HOST_IP:$MCP_PORT" + +# -- Step 3: mcporter in sandbox ----------------------------------------------- + +echo "" +echo "--- Step 3: mcporter ---" + +if openshell sandbox exec -n "$SANDBOX" -- bash -c 'test -f /sandbox/node_modules/mcporter/dist/cli.js' 2>/dev/null; then + echo " mcporter already installed" +else + echo " Installing mcporter..." 
+ openshell sandbox exec -n "$SANDBOX" -- bash -c 'mkdir -p /sandbox/bin && cd /sandbox && npm install --prefix /sandbox mcporter 2>&1 | tail -2 && printf "#!/bin/bash\nexec node /sandbox/node_modules/mcporter/dist/cli.js \"\$@\"\n" > /sandbox/bin/mcporter && chmod +x /sandbox/bin/mcporter' +fi + +# -- Step 4: mcporter config --------------------------------------------------- + +echo "" +echo "--- Step 4: mcporter config ---" + +MCPORTER_CONFIG="{\"mcpServers\":{\"$ALIAS_NAME\":{\"type\":\"http\",\"baseUrl\":\"http://$HOST_IP:$MCP_PORT/sse\"}}}" +openshell sandbox exec -n "$SANDBOX" -- bash -c "mkdir -p ~/.mcporter && echo '$MCPORTER_CONFIG' > ~/.mcporter/mcporter.json" +echo " Server '$ALIAS_NAME' → http://$HOST_IP:$MCP_PORT/sse" + +# Skill examples use the absolute path /sandbox/bin/mcporter, so no PATH export is +# strictly required. /sandbox/.bashrc is root-owned read-only on current images, +# so don't fail the bootstrap if we can't write to it. +openshell sandbox exec -n "$SANDBOX" -- bash -c \ + 'test -w /sandbox/.bashrc && (grep -q "/sandbox/bin" /sandbox/.bashrc 2>/dev/null || echo "export PATH=\"/sandbox/bin:\$PATH\"" >> /sandbox/.bashrc) || true' + +# -- Step 5: Skill file -------------------------------------------------------- + +echo "" +echo "--- Step 5: Agent skill ---" + +SKILL_UPLOAD=$(mktemp -d /tmp/nemoclaw-ssh-skill-XXXXXX) +cat > "$SKILL_UPLOAD/SKILL.md" << SKILLEOF +--- +name: ssh-remote +description: "Run commands on remote server '${ALIAS_NAME}' (${ALIAS_HOST}). Use this skill whenever the user asks to do anything on ${ALIAS_NAME}." +--- + +# Remote Server: ${ALIAS_NAME} + +Run shell commands on ${ALIAS_NAME} using \`/sandbox/bin/mcporter\`. +This is a bash command — run it in the terminal. Do NOT install any MCP libraries. + +## How to run a command on ${ALIAS_NAME} + +\`\`\`bash +/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="" +\`\`\` + +That's it. Just run that in the terminal. 
Examples:
+
+\`\`\`bash
+/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="hostname"
+/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="nvidia-smi"
+/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="ls -la /home/${SSH_USER:-$USER}"
+/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="cd /workspace && python train.py --lr 0.001"
+\`\`\`
+
+## Interactive sessions (preferred for iterative work)
+
+Prefer interactive sessions over batch \`submit_job\` for debugging, short
+experiments, and eval loops — they avoid re-queuing between commands. Use
+\`submit_job\` only for long-running training that doesn't need interaction.
+
+\`\`\`bash
+# 1. Check resources
+/sandbox/bin/mcporter call ${ALIAS_NAME}.get_gpu_availability
+
+# 2. Allocate (replaces salloc)
+/sandbox/bin/mcporter call ${ALIAS_NAME}.start_interactive_session partition="general" num_gpus=1 time_limit="0:30:00"
+
+# 3. Run commands in the session (replaces srun)
+/sandbox/bin/mcporter call ${ALIAS_NAME}.exec_in_session session_id="" command="nvidia-smi"
+/sandbox/bin/mcporter call ${ALIAS_NAME}.exec_in_session session_id="" command="python train.py"
+
+# 4. Release when done
+/sandbox/bin/mcporter call ${ALIAS_NAME}.end_interactive_session session_id=""
+\`\`\`
+
+Saved profiles can be reused:
+
+\`\`\`bash
+/sandbox/bin/mcporter call ${ALIAS_NAME}.save_interactive_profile name="gpu-debug" partition="general" num_gpus=1 time_limit="0:30:00"
+/sandbox/bin/mcporter call ${ALIAS_NAME}.start_session_from_profile name="gpu-debug"
+\`\`\`
+
+## NVIDIA container registry (nvcr.io)
+
+To pull NGC images on ${ALIAS_NAME}, ask the human for their NVIDIA API key
+first. Never hardcode or store the key. Once provided:
+
+\`\`\`bash
+/sandbox/bin/mcporter call ${ALIAS_NAME}.run_shell_command command="echo '' | docker login nvcr.io --username '\\\$oauthtoken' --password-stdin"
+\`\`\`
+
+## Other available tools
+
+All called the same way — \`/sandbox/bin/mcporter call ${ALIAS_NAME}. \`:
+
+| Tool | What it does |
+|------|-------------|
+| \`run_shell_command\` | Run any shell command |
+| \`get_gpu_availability\` | Check free GPUs |
+| \`get_cluster_status\` | Slurm partitions and nodes |
+| \`submit_job\` | Submit Slurm batch job |
+| \`list_jobs\` | List running/pending jobs |
+| \`get_job_details\` | Job details by ID |
+| \`cancel_job\` | Cancel a job |
+| \`start_interactive_session\` | Allocate an interactive Slurm session |
+| \`exec_in_session\` | Run a command in an interactive session |
+| \`end_interactive_session\` | Release an interactive session |
+| \`save_interactive_profile\` | Save a reusable session profile |
+| \`start_session_from_profile\` | Start session from a saved profile |
+| \`list_directory\` | List remote directory |
+| \`read_file\` | Read a remote file |
+| \`write_file\` | Write a remote file |
+| \`find_files\` | Search for files |
+
+## Important
+
+- \`mcporter\` is already installed at \`/sandbox/bin/mcporter\` — do NOT install anything
+- Run it as a bash command in the terminal — it is NOT a Python library
+- Commands run as user \`${SSH_USER:-$USER}\` on ${ALIAS_NAME} (${ALIAS_HOST})
+- Timeout: 120s — for long-running work, use \`start_interactive_session\` + \`exec_in_session\`, or \`submit_job\` for batch
+SKILLEOF
+
+openshell sandbox upload "$SANDBOX" "$SKILL_UPLOAD" /sandbox/.hermes-data/skills/ssh-remote
+rm -rf "$SKILL_UPLOAD"
+echo " Uploaded to /sandbox/.hermes-data/skills/ssh-remote"
+
+# -- Step 6: Verify ------------------------------------------------------------
+
+echo ""
+echo "--- Verify ---"
+
+RESULT=$(openshell sandbox exec -n "$SANDBOX" -- bash -c '/sandbox/bin/mcporter call '"$ALIAS_NAME"'.run_shell_command command="hostname" 2>&1' 2>&1)
+
+if echo "$RESULT" | grep -q "$ALIAS_HOST"; then
+  echo " mcporter call $ALIAS_NAME.run_shell_command command=\"hostname\" → OK"
+  echo "$RESULT" | grep -v "Warning\|UNDICI"
+else
+  echo " Result: $RESULT"
+  echo " Check: $LOG_FILE"
+fi
+
+# -- Done ----------------------------------------------------------------------
+
+echo ""
+echo "=== Slurm/SSH MCP ready ==="
+echo ""
+echo " Sandbox: $SANDBOX"
+echo " Target: $ALIAS_NAME ($ALIAS_HOST)"
+echo " MCP proxy: localhost:$MCP_PORT (pid $MCP_PID)"
+echo " Log: $LOG_FILE"
+echo " Tools: 34 (Slurm + SSH + files)"
+echo " SSH key: ON HOST ONLY"
+echo ""
+echo " To stop: kill \$(cat $PID_FILE)"
+echo " To restart: $0 $*"
diff --git a/autoresearch-demo/policy.yaml b/autoresearch-demo/policy.yaml
new file mode 100644
index 0000000..efcfc7b
--- /dev/null
+++ b/autoresearch-demo/policy.yaml
@@ -0,0 +1,26 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Network policy for Slurm/SSH MCP access.
+#
+# The agent connects to a host-side mcp-proxy (HTTP/SSE) which bridges to
+# slurm-mcp (Python). The SSH key stays on the host — the agent never sees it.
+#
+# Replace HOST_IP and MCP_PORT with your values. HOST_IP is the Docker bridge
+# gateway IP visible from inside the sandbox (typically 172.29.0.254, resolved
+# from host.openshell.internal in /etc/hosts).
+#
+# The bootstrap.sh script applies this automatically — this file is for reference.
+
+network_policies:
+  ssh_mcp:
+    name: ssh_mcp
+    endpoints:
+      - host: "HOST_IP" # e.g. 172.29.0.254
+        port: MCP_PORT # e.g. 9878
+        access: full
+    binaries:
+      - path: /usr/local/bin/node*
+      - path: /usr/bin/node*
+      - path: /usr/bin/curl*
+      - path: /bin/bash*
diff --git a/autoresearch-demo/setup.sh b/autoresearch-demo/setup.sh
new file mode 100755
index 0000000..fb8d815
--- /dev/null
+++ b/autoresearch-demo/setup.sh
@@ -0,0 +1,126 @@
+#!/usr/bin/env bash
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Turn a NemoClaw + Hermes sandbox into an autonomous AI research agent.
+#
+# Installs:
+# 1. Slurm/SSH MCP bridge (mcp-proxy + slurm-mcp on host, mcporter in sandbox)
+# 2. Agent persona (SOUL.md, USER.md, MEMORY.md for autoresearch loop)
+# 3. Research + MLOps skills from NousResearch/hermes-agent upstream
+#
+# Usage:
+# ./setup.sh --sandbox --alias gpu-server= --key ~/.ssh/id_ed25519 --user
+# ./setup.sh --sandbox # skills + persona only, no SSH/Slurm
+
+set -euo pipefail
+
+ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+
+SANDBOX=""
+SSH_ARGS=()
+HAS_SSH=false
+
+usage() {
+  echo "Usage: $0 --sandbox [SSH options]"
+  echo ""
+  echo " --sandbox OpenShell sandbox name (required)"
+  echo ""
+  echo "Slurm/SSH MCP options (optional — omit all to install skills only):"
+  echo " --alias = SSH alias (e.g. --alias gpu-server=)"
+  echo " --key SSH private key (stays on host, never enters sandbox)"
+  echo " --password SSH password (use --key OR --password, not both)"
+  echo " --user SSH username"
+  echo " --user-root Remote home/root directory (default: /home/)"
+  echo " --port SSH port (default: 22)"
+  echo " --mcp-port MCP proxy port on host (default: 9878)"
+  echo ""
+  echo "Examples:"
+  echo " $0 --sandbox research --alias gpu-server= --key ~/.ssh/id_ed25519 --user "
+  echo " $0 --sandbox research # skills + persona only"
+  exit 1
+}
+
+while [[ $# -gt 0 ]]; do
+  case "$1" in
+    --sandbox) SANDBOX="$2"; shift 2 ;;
+    --alias|--key|--password|--user|--user-root|--port|--mcp-port)
+      HAS_SSH=true
+      SSH_ARGS+=("$1" "$2")
+      shift 2 ;;
+    -h|--help) usage ;;
+    *) echo "Unknown option: $1"; usage ;;
+  esac
+done
+
+[[ -z "$SANDBOX" ]] && echo "Error: --sandbox is required." && usage
+
+if ! openshell sandbox get "$SANDBOX" &>/dev/null; then
+  echo "Error: sandbox '$SANDBOX' not found or not running."
+  echo " Create it first: nemoclaw onboard"
+  exit 1
+fi
+
+echo "=== Setting up AI Research Agent in sandbox '$SANDBOX' ==="
+echo ""
+
+# -- Step 1: Slurm/SSH MCP (optional) -----------------------------------------
+
+if [[ "$HAS_SSH" == true ]]; then
+  echo "--- Step 1/3: Slurm/SSH MCP ---"
+  "$ROOT_DIR/bootstrap.sh" --sandbox "$SANDBOX" "${SSH_ARGS[@]}"
+  echo ""
+else
+  echo "--- Step 1/3: Slurm/SSH MCP (skipped, no --alias provided) ---"
+  echo ""
+fi
+
+# -- Step 2: Agent persona -----------------------------------------------------
+
+echo "--- Step 2/3: Agent persona ---"
+
+for f in SOUL.md USER.md MEMORY.md; do
+  if [[ -f "$ROOT_DIR/autoresearch-skill/$f" ]]; then
+    openshell sandbox upload "$SANDBOX" "$ROOT_DIR/autoresearch-skill/$f" /sandbox/.hermes-data/memories/
+  fi
+done
+echo " Uploaded SOUL.md, USER.md, MEMORY.md to /sandbox/.hermes-data/memories/"
+
+# -- Step 3: Research + MLOps skills -------------------------------------------
+
+echo "--- Step 3/3: Research & MLOps skills ---"
+
+SKILLS_CACHE="${XDG_CACHE_HOME:-$HOME/.cache}/nemoclaw-autoresearch/hermes-skills"
+
+echo "Fetching latest skills from NousResearch/hermes-agent..."
+rm -rf "$SKILLS_CACHE"
+mkdir -p "$(dirname "$SKILLS_CACHE")"
+git clone --depth 1 --filter=blob:none --sparse \
+  https://github.com/NousResearch/hermes-agent.git "$SKILLS_CACHE" 2>&1
+cd "$SKILLS_CACHE"
+git sparse-checkout set skills/research skills/mlops 2>&1
+cd "$ROOT_DIR"
+
+echo "Uploading research skills..."
+openshell sandbox upload "$SANDBOX" "$SKILLS_CACHE/skills/research" /sandbox/.hermes-data/skills/research
+
+echo "Uploading mlops skills..."
+openshell sandbox upload "$SANDBOX" "$SKILLS_CACHE/skills/mlops" /sandbox/.hermes-data/skills/mlops
+
+SKILL_COUNT=$(openshell sandbox exec -n "$SANDBOX" -- bash -c 'find /sandbox/.hermes-data/skills -name "SKILL.md" | wc -l' 2>/dev/null | tr -d ' ')
+echo " Installed: $SKILL_COUNT skills (latest from upstream)"
+
+echo ""
+echo "=== AI Research Agent ready ==="
+echo ""
+echo " Sandbox: $SANDBOX"
+echo " Skills: research (arxiv, paper-writing, literature review)"
+echo " mlops (training, inference, evaluation, vllm, huggingface)"
+if [[ "$HAS_SSH" == true ]]; then
+  echo " Slurm: configured (see Step 1 output above)"
+fi
+echo ""
+echo "Try:"
+echo ' "Check GPU availability on the cluster"'
+echo ' "Search arxiv for recent papers on mixture of experts"'
+echo ' "Set up an autoresearch loop to optimize train.py on the cluster"'