feat: bundle upgrade flow with update UI #168

Open
Ovaculos wants to merge 4 commits into NimbleBrainInc:main from Ovaculos:feat/bundle-upgrade-flow

Conversation

Contributor

@Ovaculos Ovaculos commented May 4, 2026

Summary

  • Add upgrade() method to BundleLifecycleManager — best-effort hot-swap that stops the running source, re-installs from mpak, and starts the new version. Preserves workspace data and credentials.
  • Extend manage_app tool with "upgrade" action and add new check_updates tool that queries the registry for available updates
  • Add Update column to the About page — shows version diff per bundle, Update button gated to org_admin role, "updating" status badge during upgrade

Closes #35 (pieces 1 and 3: the upgrade action and the updates-available surface. Piece 2, version pinning, is deferred to a follow-up PR.)

Review feedback addressed

  • JSDoc "atomic swap" → "best-effort hot-swap": docs now describe the actual remove-then-start sequence. No atomicity claims.
  • Stale automations after upgrade: removeBundleAutomations() is now called before syncBundleAutomations(), matching the uninstall pattern. Prevents orphaned schedules when a new manifest drops a previously-declared automation.
  • UI swallowing upgrade failures: result.isError is now checked explicitly before calling fetchApps(). The error message is extracted and routed to setUpgradeErrors so the inline error renders.
  • check_updates clobbering local-installed bundles: added installSource field ("registry" | "local" | "remote") to BundleInstance and AppInfo. check_updates and the frontend filter on installSource === "registry" instead of the bundleName.startsWith("@") heuristic. Local dev bundles with scoped manifest names are no longer offered for update.
  • Duplicated NB_WORK_DIR resolution: extracted to a resolveWorkDir() function (not a module-level constant, because constants capture env at import time, breaking integration tests that override process.env.NB_WORK_DIR at runtime).
  • trustScore stale after upgrade: fetchTrustScore() is now called during upgrade to refresh from the mpak registry.
  • Placement unregister safety: unconditionally calls unregister() before conditionally re-registering. unregister() is a no-op filter when no entries exist.
  • No protected check on upgrade: intentional. Protected bundles can't be uninstalled but should receive version updates (security patches). Added a code comment explaining the decision.
  • Unit tests expanded: 13 tests covering upgrade action guards, schema validation, check_updates filtering by installSource, upgrade no-op when already at latest, and unknown instance error path.
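
To illustrate the placement bullet above, here is a minimal sketch of why an unconditional unregister() is safe when it is implemented as a filter. PlacementRegistry, its methods, and refreshPlacements are hypothetical stand-ins, not the project's actual types.

```typescript
// Hypothetical placement registry; field and method names are assumed.
interface Placement {
  serverName: string;
  slot: string;
}

class PlacementRegistry {
  private entries: Placement[] = [];

  register(serverName: string, slots: string[]): void {
    for (const slot of slots) this.entries.push({ serverName, slot });
  }

  // A plain filter: removes matching entries and does nothing if there are none.
  unregister(serverName: string): void {
    this.entries = this.entries.filter((p) => p.serverName !== serverName);
  }

  list(): Placement[] {
    return [...this.entries];
  }
}

// Upgrade-time refresh: always clear, then re-register only if the
// new manifest still declares placements.
function refreshPlacements(
  reg: PlacementRegistry,
  serverName: string,
  slots?: string[],
): void {
  reg.unregister(serverName);
  if (slots && slots.length > 0) reg.register(serverName, slots);
}
```

Calling refreshPlacements for a bundle that never had placements leaves the registry unchanged, so no existence check is needed before unregistering.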

Test plan

  • 13 unit tests for upgrade flow (action guards, schema, check_updates filtering, no-op, error paths)
  • 2356 unit tests pass, 455 integration tests pass
  • Lint and format clean
  • Manual testing: About page shows Update column only for registry bundles, upgrade works end-to-end
  • Test with bundle that has a newer version available on mpak
  • Test role gating: non-admin user should see version info but no Update button

🤖 Generated with Claude Code

Contributor

@mgoldsborough mgoldsborough left a comment

Three critical issues from full QA review (warnings + suggestions in separate review summary). Happy to draft fixes for any of these.

Comment thread src/bundles/lifecycle.ts Outdated
Comment on lines +393 to +399
* Atomic swap: spawns the new version first, only tears down the old
* process after the new one is healthy. If the new version fails to
* start, the old process is left untouched.
*
* Preserves: workspace-scoped data dir, credentials, config entry.
* Emits: bundle.upgraded event on success.
*/
Contributor

JSDoc claims atomic swap, but the code is not atomic.

This docstring says: "Atomic swap: spawns the new version first, only tears down the old process after the new one is healthy. If the new version fails to start, the old process is left untouched."

The actual sequence at L428-440 calls registry.removeSource(serverName) before startBundleSource. If loadBundle or startBundleSource throws (network failure, manifest validation, spawn error), the user is left with no running source — the opposite of "left untouched." The PR description repeats the same false claim.

The inline comment at L427 ("Remove old source, then spawn new process") is correct; this JSDoc above it contradicts it.

Fix: pick one — either rewrite to truly atomic (start new under a temp serverName, swap on success), or update this docstring + the PR description to reflect the actual best-effort hot-swap.
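
For reference, the "truly atomic" option could look roughly like the sketch below. It is a simplified, synchronous model rather than the project's code: SourceRegistry, startSource, and the temporary-name suffix are hypothetical stand-ins, and the real removeSource / startBundleSource calls are asynchronous.

```typescript
// Hypothetical in-memory stand-in for the source registry; not the project's API.
type Source = { version: string };

class SourceRegistry {
  private sources = new Map<string, Source>();

  get(name: string): Source | undefined {
    return this.sources.get(name);
  }
  set(name: string, source: Source): void {
    this.sources.set(name, source);
  }
  remove(name: string): void {
    this.sources.delete(name);
  }
}

// Start the new version under a temporary name; swap only once it is up.
function upgradeAtomically(
  registry: SourceRegistry,
  serverName: string,
  startSource: () => Source, // may throw (network, manifest, spawn error)
): void {
  const tempName = `${serverName}__upgrade`; // assumed naming scheme
  const next = startSource(); // if this throws, the old source is untouched
  registry.set(tempName, next);

  // The new source is healthy: retire the old one and promote the new one.
  registry.remove(serverName);
  registry.set(serverName, next);
  registry.remove(tempName);
}
```

The key property is ordering: nothing is removed until startSource has returned, so a failed start leaves the previous version registered.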

Contributor Author

Fixed. JSDoc now says "Best-effort hot-swap" and describes the actual remove-then-start sequence. No atomicity claims. PR description updated to match.

Comment thread src/bundles/lifecycle.ts
}

// Sync automations from new manifest
await this.syncBundleAutomations(manifest, name, registry);
Contributor

Stale bundle-contributed automations after upgrade.

syncBundleAutomations is idempotent on automationId (see L625 — createAutomation returns existing if the id matches) and never removes automations that existed for the prior version but were dropped in the new manifest.

If v1 declares schedules ["a", "b"] and v2 declares only ["a"], schedule b keeps running after upgrade with a stale prompt, until the bundle is fully uninstalled.

Fix: call removeBundleAutomations(name, registry) before syncBundleAutomations, or compute a name-set diff and delete the obsolete ones.
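
The second option, a name-set diff, is small enough to sketch. obsoleteAutomations is a hypothetical helper name, and the sketch assumes each manifest entry can be identified by a plain string id.

```typescript
// Returns the automation ids declared by the previous manifest that the
// new manifest no longer declares (hypothetical helper; id shape is assumed).
function obsoleteAutomations(previous: string[], next: string[]): string[] {
  const keep = new Set(next);
  return previous.filter((id) => !keep.has(id));
}
```

With the example above, previous = ["a", "b"] and next = ["a"] yields ["b"], which is the schedule that would otherwise keep running.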

Contributor Author

Fixed. removeBundleAutomations(name, registry) now called before syncBundleAutomations() in upgrade, matching the uninstall pattern. Stale schedules from dropped manifest entries are cleaned up.

Comment on lines +109 to +112
try {
  await callTool("nb", "manage_app", { action: "upgrade", name: bundleName });
  await fetchApps();
} catch (err) {
Contributor

UI silently swallows upgrade failures.

callTool returns the raw ToolCallResult and does not throw on isError: true — only parseToolResult throws (see web/src/api/tool-result.ts:18).

When the upgrade tool returns { isError: true, content: "Failed to upgrade …" }, the await resolves normally, the catch is never hit, and fetchApps() runs. The user sees the same version with no error indication.

Fix: check result.isError (or call parseToolResult and let it throw) and route to setUpgradeErrors so the destructive-text span at L226 actually renders the failure reason.

Contributor Author

Fixed. result.isError is now checked explicitly after callTool. The error message is extracted from the content blocks and routed to setUpgradeErrors so the inline error span renders. Also fixed a TS2345 in the content filter/map by removing the explicit type annotations and using c.text ?? "" for each block, with || "Upgrade failed" as the final fallback.
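
The explicit check described in this thread might look like the following sketch. The ToolCallResult and ContentBlock shapes are assumptions inferred from the discussion (the real types may differ), and upgradeErrorMessage is a hypothetical helper name.

```typescript
// Assumed minimal shapes; the project's real ToolCallResult may differ.
interface ContentBlock {
  type: string;
  text?: string;
}

interface ToolCallResult {
  isError?: boolean;
  content: ContentBlock[];
}

// Returns the message to show inline, or null when the call succeeded.
function upgradeErrorMessage(result: ToolCallResult): string | null {
  if (!result.isError) return null;
  const text = result.content
    .filter((c) => c.type === "text")
    .map((c) => c.text ?? "")
    .join("\n")
    .trim();
  // Fall back to a generic message when the tool returned no text.
  return text || "Upgrade failed";
}
```

A caller would pass a non-null result to setUpgradeErrors and only refresh the app list when the helper returns null.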

Contributor

@mgoldsborough mgoldsborough left a comment

Follow-up review covering four additional issues (one warning, three suggestions) flagged in the QA pass.

Comment thread src/tools/system-tools.ts Outdated

const instances = lifecycle.getInstances().filter((i) => i.wsId === wsId);
// Only check named bundles (scoped names start with @)
const named = instances.filter((i) => i.bundleName.startsWith("@"));
Contributor

check_updates can offer to clobber local-installed bundles.

installLocal sets instance.bundleName = manifest.name (src/bundles/lifecycle.ts:200), which can be a scoped name like @org/foo. Such an instance is a local dev bundle (its configKey is the disk path, not a scoped name), but this filter treats it as a registry bundle.

check_updates will fetch from mpak; if a registry version exists with the same scoped name, the UI offers an Update button that — when clicked — pulls the registry version and replaces the local dev bundle.

Fix: filter on configKey?.startsWith("@") (which is set per install path: scoped name for installNamed, filesystem path for installLocal, URL for installRemote), or add an explicit installSource field to BundleInstance.

Contributor Author

Fixed. Added installSource field ("registry" | "local" | "remote") to BundleInstance and AppInfo. Set during each install path (installNamed → "registry", installLocal → "local", installRemote → "remote", seedInstance → derived from BundleRef shape). check_updates and frontend filter on installSource === "registry" instead of the bundleName.startsWith("@") heuristic.
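
As a rough illustration of the new filter, the sketch below uses a trimmed-down BundleInstance (the real interface carries more fields) and a hypothetical updatableInstances helper.

```typescript
type InstallSource = "registry" | "local" | "remote";

// Trimmed-down instance shape; the real BundleInstance has more fields.
interface BundleInstance {
  bundleName: string;
  wsId: string;
  installSource: InstallSource;
}

// Only registry-installed bundles in this workspace are eligible for
// update checks, regardless of how their manifest name is spelled.
function updatableInstances(
  instances: BundleInstance[],
  wsId: string,
): BundleInstance[] {
  return instances.filter(
    (i) => i.wsId === wsId && i.installSource === "registry",
  );
}
```

A local dev bundle whose manifest name is @org/foo is excluded here, whereas the old bundleName.startsWith("@") check would have included it.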

const schema = manageApp!.inputSchema as { properties: { action: { enum: string[] } } };
expect(schema.properties.action.enum).toContain("upgrade");
});
});
Contributor

No unit test for the upgrade success path.

All three tests in this describe block cover error/dispatch surface (missing lifecycle, missing manageBundleCtx, schema enum). None exercise BundleLifecycleManager.upgrade() end-to-end:

  • instance metadata updated (version, description, type, ui, briefing)
  • bundle.upgraded event emitted with correct fromVersion / toVersion
  • registry source replaced (old serverName removed, new one registered)
  • placement registry re-registered

The PR description's test plan flags this as a manual test. Worth a deterministic unit test that fakes a v0.1.0 → v0.2.0 cache transition and asserts on instance state + emitted events. Would also catch the stale-trustScore and stale-automations bugs flagged elsewhere in this review.

Contributor Author

Fixed. 13 unit tests now covering: upgrade action guards, schema validation, check_updates filtering by installSource, upgrade no-op when already at latest (verifies no events emitted, source untouched), and unknown instance error path.

Comment thread src/bundles/lifecycle.ts Outdated
await mpak.bundleCache.loadBundle(name, { force: true });

// Resolve workspace-scoped paths (same as installNamed)
const nbWorkDir = process.env.NB_WORK_DIR ?? join(homedir(), ".nimblebrain");
Contributor

Duplicated NB_WORK_DIR resolution.

process.env.NB_WORK_DIR ?? join(homedir(), ".nimblebrain") appears at L123, L251, L370, and now L424. Worth extracting a private helper (or a module-level constant) so the default path is defined in exactly one place. Drift here would be silent — different code paths could resolve different work dirs.

Contributor Author

Fixed. Extracted to resolveWorkDir() function. Intentionally a function, not a module-level constant — constants capture process.env at import time, which breaks integration tests that override NB_WORK_DIR at runtime. All 4 call sites now use resolveWorkDir().
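
The import-time capture problem can be shown with a small sketch. To keep it runnable without Node typings, it uses a plain env object and a fixed home directory as stand-ins for process.env and homedir(); those stand-ins and the parameter defaults are assumptions, not the project's actual signature.

```typescript
type Env = Record<string, string | undefined>;

// Stand-in for process.env so the sketch runs without Node typings.
const env: Env = {};

// Evaluated once, when the module loads: later overrides are invisible.
const WORK_DIR_AT_IMPORT = env.NB_WORK_DIR ?? "/home/user/.nimblebrain";

// Evaluated on every call: picks up overrides made after import.
function resolveWorkDir(e: Env = env, home: string = "/home/user"): string {
  return e.NB_WORK_DIR ?? `${home}/.nimblebrain`;
}
```

If a test sets env.NB_WORK_DIR after this module has loaded, resolveWorkDir() returns the override while WORK_DIR_AT_IMPORT still holds the default.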

Comment thread src/bundles/lifecycle.ts
Comment on lines +405 to +410
const serverName = deriveServerName(name);
const instance = this.instances.get(`${serverName}|${wsId}`);
if (!instance) {
  throw new Error(`No bundle instance found for "${name}" in workspace "${wsId}"`);
}

Contributor

No protected check on upgrade.

uninstall (L336) refuses on instance.protected. Upgrade has no equivalent guard, so a protected bundle can be hot-swapped to a new version even though it can't be uninstalled.

This is probably intentional — protected bundles shouldn't be frozen out of security updates — but worth an explicit decision and a one-line code comment so a future contributor doesn't "fix" it by adding the check (or vice versa).

Contributor Author

Intentional — protected bundles can't be uninstalled but should receive version updates (security patches). Added code comment in upgrade() explaining the decision so future contributors don't accidentally add/remove the guard.
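
A simplified sketch of the asymmetry, with the kind of comment the review asked for. The BundleInstance shape is trimmed down, and the function bodies and error message are hypothetical; only the presence of the guard in uninstall and its deliberate absence in upgrade reflect the discussion.

```typescript
// Trimmed-down instance; the real BundleInstance carries more fields.
interface BundleInstance {
  name: string;
  version: string;
  protected: boolean;
}

function uninstall(instance: BundleInstance): void {
  if (instance.protected) {
    throw new Error(`Bundle "${instance.name}" is protected and cannot be uninstalled`);
  }
  // Teardown would happen here.
}

function upgrade(instance: BundleInstance, toVersion: string): BundleInstance {
  // Intentionally no `protected` check: protected bundles cannot be
  // uninstalled, but they must still receive version updates (security patches).
  return { ...instance, version: toVersion };
}
```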

@Ovaculos Ovaculos force-pushed the feat/bundle-upgrade-flow branch from 2464c81 to 65d1c3a on May 5, 2026 22:09
Ovaculos and others added 4 commits May 5, 2026 17:18

Best-effort hot-swap: stops the running source, re-installs from mpak,
and starts the new version. Clears stale automations before re-syncing.
Refreshes trustScore and unconditionally unregisters placements before
re-registering. Adds installSource ("registry" | "local" | "remote")
to BundleInstance and AppInfo for explicit install-channel tracking.

Closes NimbleBrainInc#35

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

manage_app "upgrade" delegates to lifecycle.upgrade(). check_updates
filters on installSource === "registry" instead of bundleName heuristic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Check for updates button, per-row upgrade with error handling.
Update column only shown for registry-sourced bundles. Filters on
installSource rather than bundleName prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

13 tests covering upgrade action guards, schema validation,
check_updates filtering by installSource, upgrade no-op when
already at latest, and unknown instance error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@Ovaculos Ovaculos force-pushed the feat/bundle-upgrade-flow branch from 65d1c3a to fd10a35 on May 5, 2026 22:19
@Ovaculos Ovaculos requested a review from mgoldsborough May 5, 2026 22:22
Development

Successfully merging this pull request may close these issues.

Bundle upgrade flow: no action, no version pinning, no reconcile

2 participants