Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion AGENTS.md
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,7 @@ you are running inside of — NOT the automation backend.
- **Production-fidelity launch**: The Playwright config (`playwright.mock-llm.config.ts`) starts the full `agent-canvas` stack via `bin/agent-canvas.mjs` — the same binary that `npx @openhands/agent-canvas` executes when users install the npm package. This means mock-LLM tests exercise the actual production path: pre-built static frontend + static-server.mjs + agent-server via uvx + automation backend via uvx + ingress proxy, all behind a single port.
- A pre-built `build/` directory is required. The Playwright webServer command runs `npm run build:app` when `build/index.html` is absent, but CI should run the build step explicitly for caching (`npm run build:app` in `.github/workflows/mock-llm-e2e.yml`).
- **Single ingress URL**: Tests use one URL for both the browser (`baseURL`) and backend API assertions (`BACKEND_URL`). The ingress proxy routes `/api/*` to the agent-server, `/api/automation/*` to the automation backend, and `/*` to the static frontend. Default ingress port for tests is `18300` (override via `MOCK_LLM_INGRESS_PORT` env var).
- **State isolation**: `OH_CANVAS_SAFE_STATE_DIR=.tmp/mock-llm-state` isolates test state from the user's real `~/.openhands/agent-canvas/` directory. The directory is cleaned before each test run.
- **State isolation**: `OH_CANVAS_SAFE_STATE_DIR=.tmp/mock-llm-state` isolates test state from the user's real `~/.openhands/agent-canvas/` directory. Both `STATE_DIR` (`.tmp/mock-llm-state`) and the automation DB dir (`.tmp/automation/`) are cleaned before each test run — the automation DB now lives outside STATE_DIR at `dirname(STATE_DIR)/automation/automations.db`, mirroring Docker's `~/.openhands/automation/automations.db`.
- **Session API key**: A random key is generated per test run and passed to the stack via `SESSION_API_KEY` / `OH_SESSION_API_KEYS_0` / `VITE_SESSION_API_KEY`. The static server injects it into `index.html` at serve time so the frontend authenticates automatically.
- **Mock LLM server** (`tests/e2e/mock-llm/scripts/mock-llm-server.py`): Python HTTP server using openhands-sdk's `TestLLM` to return scripted tool-call + text trajectories. Supports admin API endpoints for dynamic trajectory management:
- `POST /admin/reset` — reset to the default trajectory (terminal printf + text reply)
Expand Down
2 changes: 1 addition & 1 deletion __tests__/scripts/dev-safe.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -595,7 +595,7 @@ describe("buildSafeDevConfig", () => {
path.join(tmpdir(), "openhands-agent-canvas-tmux"),
);
expect(config.conversationsPath).toBe(
path.join(config.stateDir, "conversations"),
path.join(config.stateDir, "dev_conversations"),
);
expect(config.workspacesPath).toBe(
path.join(config.stateDir, "workspaces"),
Expand Down
16 changes: 10 additions & 6 deletions playwright.mock-llm.config.ts
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@

import { defineConfig, devices } from "@playwright/test";
import { randomBytes } from "node:crypto";
import { resolve } from "node:path";
import { dirname, join, resolve } from "node:path";

// ── Port allocation (separate from live E2E / dev to avoid collisions) ─
const MOCK_LLM_PORT = process.env.MOCK_LLM_PORT ?? "9999";
Expand All @@ -37,11 +37,13 @@ const sessionApiKey =
process.env.MOCK_LLM_SESSION_API_KEY = sessionApiKey;

// ── State directory (isolated per test run) ────────────────────────────
// MUST be absolute — the automation backend's SQLite DB URL is derived from
// this path, and a relative path gets double-nested when the child process
// cwd is also set to stateDir (cwd/stateDir/stateDir/automations.db).
const STATE_DIR = resolve(".tmp/mock-llm-state");

// Automation DB lives at $parent_of_STATE_DIR/automation/automations.db,
// mirroring docker/entrypoint.sh which uses $HOME/.openhands/automation/automations.db.
// Both STATE_DIR and AUTOMATION_DB_DIR must be cleaned between runs to avoid stale data.
const AUTOMATION_DB_DIR = join(dirname(STATE_DIR), "automation");

// ── URLs ───────────────────────────────────────────────────────────────
const INGRESS_URL = `http://localhost:${INGRESS_PORT}/`;
const MOCK_LLM_URL = `http://127.0.0.1:${MOCK_LLM_PORT}`;
Expand Down Expand Up @@ -116,8 +118,10 @@ export default defineConfig({
// kills children via process groups and exits cleanly.
{
command:
// Clean state dir to avoid stale profile/conversation data between runs
`node -e "const fs=require('node:fs'); fs.rmSync('${STATE_DIR}',{recursive:true,force:true});" && ` +
// Clean state dir and automation DB dir to avoid stale data between runs.
// Automation DB is stored outside STATE_DIR (at AUTOMATION_DB_DIR) so both
// must be cleaned; see scripts/dev-with-automation.mjs startAutomationBackend.
`node -e "const fs=require('node:fs'); fs.rmSync('${STATE_DIR}',{recursive:true,force:true}); fs.rmSync('${AUTOMATION_DB_DIR}',{recursive:true,force:true});" && ` +
// Build frontend if not already built (CI should pre-build for caching)
'[ -f build/index.html ] || npm run build:app && ' +
[
Expand Down
4 changes: 3 additions & 1 deletion scripts/dev-safe.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -579,7 +579,7 @@ function buildConfigFromPorts(ports, cwd, env) {
env.OH_CANVAS_SAFE_STATE_DIR ||
path.join(homedir(), ".openhands", "agent-canvas"),
);
const conversationsPath = path.join(stateDir, "conversations");
const conversationsPath = path.join(stateDir, "dev_conversations");
const workspacesPath = path.join(stateDir, "workspaces");
// Use provided secret key, or read/generate one persisted to
// ~/.openhands/agent-canvas/secret-key.txt. Persisting ensures dev mode
Expand Down Expand Up @@ -655,6 +655,8 @@ export function buildAgentServerEnv(config) {
// This is a no-op on Linux/macOS where the locale is already UTF-8.
PYTHONUTF8: "1",
TMUX_TMPDIR: config.tmuxTmpDir,
// Parent of stateDir (= ~/.openhands) so settings/secrets match Docker.
OH_PERSISTENCE_DIR: path.dirname(config.stateDir),
OH_CONVERSATIONS_PATH: config.conversationsPath,
OH_BASH_EVENTS_DIR: config.bashEventsDir,
OH_VSCODE_PORT: String(config.vscodePort),
Expand Down
4 changes: 2 additions & 2 deletions scripts/dev-static.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -557,7 +557,7 @@ async function main() {
const { mkdirSync } = await import("node:fs");
for (const dir of [
config.stateDir,
join(config.stateDir, "conversations"),
join(config.stateDir, "dev_conversations"),
join(config.stateDir, "workspaces"),
join(config.stateDir, "bash_events"),
join(config.stateDir, "storage"),
Expand Down Expand Up @@ -588,7 +588,7 @@ async function main() {
);
process.exit(1);
}
const conversationsPath = join(config.stateDir, "conversations");
const conversationsPath = join(config.stateDir, "dev_conversations");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor — conversations path is duplicated in instead of reading from .

(built in ) already holds . Both here and the identical expression at line 560 will drift silently if the subdirectory name ever changes in . Not a bug today, but worth keeping in mind for a follow-up if is ever refactored to expose a canonical .

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@OpenHands read this comment above, you will notice that it seems to miss words. Clone yourself the automations/ repo in the same org, and any other repo you may need, and try to understand what is the problem. Some quoting issue? Some eaten characters? It’s very likely recent so maybe find the agent that posts this (cloud automation pr review agent) and investigate to find the cause. Respond directly on github in a reply to me, and mention me so I can see it.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on it! enyst can track my progress at all-hands.dev

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@enyst I traced this and it looks like a shell heredoc quoting bug in the review-posting step, not GitHub rendering and not the automation DB/service code.

What I found:

  • The GitHub API already stores the all-hands-bot review/comment bodies with the inline-code text removed (backticks == 0 on this review), so the text was mangled before submission.
  • The removed words are exactly the Markdown spans/fences written with backticks.
  • A local repro with an unquoted heredoc produces the same transformation byte-for-byte for this comment body:
cat > /tmp/review.json << EOF
{"body":"use `DEV_CONVERSATIONS_PATH` built in `buildStaticServerEnv`"}
EOF

becomes:

{"body":"use  built in "}

because bash treats backticks inside an unquoted heredoc as command substitution. The attempted substitutions fail, but cat still exits 0, so the agent can proceed to gh api ... --input /tmp/review.json and post a damaged review.

I cloned OpenHands/automation and related repos; automation is dispatching/running the automation, while the damaged payload is consistent with the Cloud Automation PR review agent creating a JSON review file in the shell before calling the GitHub reviews API. The fix/guardrail is to make that path always use a quoted delimiter (<<'EOF', as the github-pr-review skill already recommends) or generate JSON via Python/jq/raw files instead of an unquoted shell heredoc. A simple regression check would be: review bodies containing `SENTINEL` must still contain that text in /tmp/review.json before posting.

This reply was generated by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final summary:

  • ✅ Investigated PR fix: align dev-stack OH_PERSISTENCE_DIR and automation DB path with Docker #960’s malformed all-hands-bot review comments.
  • ✅ Confirmed via GitHub API that the missing words were already absent in the stored review/comment bodies, so this was not a GitHub UI rendering issue.
  • ✅ Cloned and inspected OpenHands/automation plus related OpenHands review/integration repositories.
  • ✅ Reproduced the exact failure locally: an unquoted shell heredoc (<< EOF) strips Markdown backtick spans because bash treats them as command substitutions before writing the JSON review payload.
  • ✅ Posted a direct GitHub reply in the review thread, mentioning @enyst, with the root-cause analysis and recommended mitigation:
    • use quoted heredocs (<<'EOF'),
    • or generate JSON via Python/jq/raw files,
    • and add a regression check ensuring backtick-wrapped sentinel text survives before posting.
  • ✅ Did not modify or commit any repository files; the working tree remained clean.

GitHub reply: #960 (comment)

const cleared = releaseStaleConversationLeases(conversationsPath);
if (cleared > 0) {
logService(
Expand Down
7 changes: 5 additions & 2 deletions scripts/dev-with-automation.mjs
Original file line number Diff line number Diff line change
Expand Up @@ -381,10 +381,12 @@ function checkPrerequisites({ checkFrontendDependencies = true } = {}) {
function ensureDirectories(config) {
const dirs = [
config.stateDir,
join(config.stateDir, "conversations"),
join(config.stateDir, "dev_conversations"),
join(config.stateDir, "workspaces"),
join(config.stateDir, "bash_events"),
join(config.stateDir, "storage"),
// Automation DB directory — matches docker/entrypoint.sh mkdir -p behaviour.
dirname(join(dirname(config.stateDir), SHARED_DEFAULTS.paths.automationDb)),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — nested is slightly hard to parse at a glance.

Since , the directory is always the fixed segment relative to the parent of . The double-dirname wrapping is correct, but this reads more clearly as:

or simply with a short comment. Non-blocking.

];

for (const dir of dirs) {
Expand Down Expand Up @@ -589,7 +591,8 @@ function startAutomationBackend(config) {
}
: {}),
AUTOMATION_AGENT_SERVER_API_KEY: config.sessionApiKey,
AUTOMATION_DB_URL: `sqlite+aiosqlite:///${join(config.stateDir, "automations.db")}`,
// ~/.openhands/automation/automations.db — matches docker/entrypoint.sh.
AUTOMATION_DB_URL: `sqlite+aiosqlite:///${join(dirname(config.stateDir), SHARED_DEFAULTS.paths.automationDb)}`,
// The automation backend uses this as its publicly-reachable base
// URL: it's appended to callback URLs and injected into each
// sandbox as `AUTOMATION_API_URL` (consumed by setup.sh for
Expand Down
Loading