feat: bundle upgrade flow with update UI #168

Ovaculos wants to merge 4 commits into NimbleBrainInc:main
Conversation
mgoldsborough left a comment
Three critical issues from full QA review (warnings + suggestions in separate review summary). Happy to draft fixes for any of these.
```typescript
 * Atomic swap: spawns the new version first, only tears down the old
 * process after the new one is healthy. If the new version fails to
 * start, the old process is left untouched.
 *
 * Preserves: workspace-scoped data dir, credentials, config entry.
 * Emits: bundle.upgraded event on success.
 */
```
JSDoc claims atomic swap, but the code is not atomic.
This docstring says: "Atomic swap: spawns the new version first, only tears down the old process after the new one is healthy. If the new version fails to start, the old process is left untouched."
The actual sequence at L428-440 calls registry.removeSource(serverName) before startBundleSource. If loadBundle or startBundleSource throws (network failure, manifest validation, spawn error), the user is left with no running source — the opposite of "left untouched." The PR description repeats the same false claim.
The inline comment at L427 ("Remove old source, then spawn new process") is correct; this JSDoc above it contradicts it.
Fix: pick one — either rewrite to truly atomic (start new under a temp serverName, swap on success), or update this docstring + the PR description to reflect the actual best-effort hot-swap.
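The truly atomic option could be sketched roughly like this. Everything here is illustrative: `Registry`, its three methods, and `atomicUpgrade` are hypothetical names standing in for the real lifecycle/registry API, not the actual NimbleBrain code.

```typescript
// Hypothetical sketch: start the new version under a temporary
// serverName, and swap only after it is healthy.
interface Registry {
  startSource(serverName: string, version: string): Promise<void>;
  removeSource(serverName: string): Promise<void>;
  renameSource(from: string, to: string): Promise<void>;
}

async function atomicUpgrade(
  registry: Registry,
  serverName: string,
  newVersion: string,
): Promise<void> {
  const temp = `${serverName}__next`;
  // Spawn the new version first; if this throws (network failure,
  // manifest validation, spawn error), the old source is untouched.
  await registry.startSource(temp, newVersion);
  // Only after the new source is up, tear down the old one and
  // promote the temporary name.
  await registry.removeSource(serverName);
  await registry.renameSource(temp, serverName);
}
```

The key property is that the only fallible step before the teardown is the spawn of the new source, so a failure leaves the old process running, matching the docstring's claim.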
Fixed. JSDoc now says "Best-effort hot-swap" and describes the actual remove-then-start sequence. No atomicity claims. PR description updated to match.
```typescript
}

// Sync automations from new manifest
await this.syncBundleAutomations(manifest, name, registry);
```
Stale bundle-contributed automations after upgrade.
syncBundleAutomations is idempotent on automationId (see L625 — createAutomation returns existing if the id matches) and never removes automations that existed for the prior version but were dropped in the new manifest.
If v1 declares schedules ["a", "b"] and v2 declares only ["a"], schedule b keeps running after upgrade with a stale prompt, until the bundle is fully uninstalled.
Fix: call removeBundleAutomations(name, registry) before syncBundleAutomations, or compute a name-set diff and delete the obsolete ones.
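The name-set-diff option amounts to set subtraction over automation ids. A minimal sketch, assuming a manifest shape with an `automations` array of `{ id }` entries (an illustration, not the real manifest type):

```typescript
// Sketch of the name-set diff: ids declared by the old manifest but
// dropped in the new one are the schedules that would otherwise keep
// running with a stale prompt.
interface Manifest {
  automations?: { id: string }[];
}

function staleAutomationIds(oldManifest: Manifest, newManifest: Manifest): string[] {
  const kept = new Set((newManifest.automations ?? []).map((a) => a.id));
  return (oldManifest.automations ?? [])
    .map((a) => a.id)
    .filter((id) => !kept.has(id));
}
```

For the v1 `["a", "b"]` → v2 `["a"]` example above, this yields `["b"]`, the schedule that should be deleted.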
Fixed. removeBundleAutomations(name, registry) now called before syncBundleAutomations() in upgrade, matching the uninstall pattern. Stale schedules from dropped manifest entries are cleaned up.
```typescript
try {
  await callTool("nb", "manage_app", { action: "upgrade", name: bundleName });
  await fetchApps();
} catch (err) {
```
UI silently swallows upgrade failures.
callTool returns the raw ToolCallResult and does not throw on isError: true — only parseToolResult throws (see web/src/api/tool-result.ts:18).
When the upgrade tool returns { isError: true, content: "Failed to upgrade …" }, the await resolves normally, the catch is never hit, and fetchApps() runs. The user sees the same version with no error indication.
Fix: check result.isError (or call parseToolResult and let it throw) and route to setUpgradeErrors so the destructive-text span at L226 actually renders the failure reason.
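The explicit check could look something like the helper below. The `ToolCallResult` content-block shape is assumed from the review comments, not copied from `web/src/api/tool-result.ts`:

```typescript
// Sketch of the suggested isError check after callTool.
interface ToolCallResult {
  isError?: boolean;
  content?: { text?: string }[];
}

// Returns null on success, otherwise a message to route to
// setUpgradeErrors so the inline error span actually renders.
function extractUpgradeError(result: ToolCallResult): string | null {
  if (!result.isError) return null;
  const message = (result.content ?? [])
    .map((c) => c.text ?? "")
    .filter((t) => t.length > 0)
    .join(" ");
  return message || "Upgrade failed";
}
```

With this in place, `fetchApps()` runs only when `extractUpgradeError` returns null.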
Fixed. result.isError checked explicitly after callTool. Error message extracted from content blocks and routed to setUpgradeErrors so the inline error span renders. Also fixed a TS2345 in the content filter/map — removed explicit type annotations and used `c.text ?? ""` with a `|| "Upgrade failed"` fallback.
mgoldsborough left a comment
Follow-up review covering four additional issues (one warning, three suggestions) flagged in the QA pass.
```typescript
const instances = lifecycle.getInstances().filter((i) => i.wsId === wsId);
// Only check named bundles (scoped names start with @)
const named = instances.filter((i) => i.bundleName.startsWith("@"));
```
check_updates can offer to clobber local-installed bundles.
installLocal sets instance.bundleName = manifest.name (src/bundles/lifecycle.ts:200), which can be a scoped name like @org/foo. Such an instance is a local dev bundle (its configKey is the disk path, not a scoped name), but this filter treats it as a registry bundle.
check_updates will fetch from mpak; if a registry version exists with the same scoped name, the UI offers an Update button that — when clicked — pulls the registry version and replaces the local dev bundle.
Fix: filter on configKey?.startsWith("@") (which is set per install path: scoped name for installNamed, filesystem path for installLocal, URL for installRemote), or add an explicit installSource field to BundleInstance.
Fixed. Added installSource field ("registry" | "local" | "remote") to BundleInstance and AppInfo. Set during each install path (installNamed → "registry", installLocal → "local", installRemote → "remote", seedInstance → derived from BundleRef shape). check_updates and frontend filter on installSource === "registry" instead of the bundleName.startsWith("@") heuristic.
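A minimal sketch of what that discriminator buys over the name heuristic; the field names mirror the description above but the types are illustrative, not the real `BundleInstance`:

```typescript
// Sketch of the installSource discriminator.
type InstallSource = "registry" | "local" | "remote";

interface BundleInstance {
  bundleName: string;
  installSource: InstallSource;
}

// Only registry-installed bundles are candidates for registry updates.
// A local dev bundle whose manifest uses a scoped name like "@org/foo"
// is excluded, which the bundleName.startsWith("@") filter got wrong.
function updateCandidates(instances: BundleInstance[]): BundleInstance[] {
  return instances.filter((i) => i.installSource === "registry");
}
```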
```typescript
const schema = manageApp!.inputSchema as { properties: { action: { enum: string[] } } };
expect(schema.properties.action.enum).toContain("upgrade");
  });
});
```
No unit test for the upgrade success path.
All three tests in this describe block cover error/dispatch surface (missing lifecycle, missing manageBundleCtx, schema enum). None exercise BundleLifecycleManager.upgrade() end-to-end:
- instance metadata updated (`version`, `description`, `type`, `ui`, `briefing`)
- `bundle.upgraded` event emitted with correct `fromVersion`/`toVersion`
- registry source replaced (old `serverName` removed, new one registered)
- placement registry re-registered
The PR description's test plan flags this as a manual test. Worth a deterministic unit test that fakes a v0.1.0 → v0.2.0 cache transition and asserts on instance state + emitted events. Would also catch the stale-trustScore and stale-automations bugs flagged elsewhere in this review.
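A deterministic test of that shape could be skeletoned as below. `FakeLifecycle` is a stand-in for a `BundleLifecycleManager` wired to a faked v0.1.0 → v0.2.0 cache transition; it is not the real class, just an illustration of the assertions the review asks for.

```typescript
// Skeleton of the deterministic upgrade test: assert on instance
// state and emitted events after a faked version transition.
type UpgradedEvent = { name: string; fromVersion: string; toVersion: string };

class FakeLifecycle {
  version = "0.1.0";
  events: UpgradedEvent[] = [];

  async upgrade(toVersion: string): Promise<void> {
    const fromVersion = this.version;
    if (fromVersion === toVersion) return; // no-op when already at latest
    this.version = toVersion;
    this.events.push({ name: "bundle.upgraded", fromVersion, toVersion });
  }
}
```

The same harness extends naturally to the registry-source and placement assertions by recording those operations on fakes, as with the events array here.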
Fixed. 13 unit tests now covering: upgrade action guards, schema validation, check_updates filtering by installSource, upgrade no-op when already at latest (verifies no events emitted, source untouched), and unknown instance error path.
```typescript
await mpak.bundleCache.loadBundle(name, { force: true });

// Resolve workspace-scoped paths (same as installNamed)
const nbWorkDir = process.env.NB_WORK_DIR ?? join(homedir(), ".nimblebrain");
```
Duplicated NB_WORK_DIR resolution.
process.env.NB_WORK_DIR ?? join(homedir(), ".nimblebrain") appears at L123, L251, L370, and now L424. Worth extracting a private helper (or a module-level constant) so the default path is defined in exactly one place. Drift here would be silent — different code paths could resolve different work dirs.
Fixed. Extracted to resolveWorkDir() function. Intentionally a function, not a module-level constant — constants capture process.env at import time, which breaks integration tests that override NB_WORK_DIR at runtime. All 4 call sites now use resolveWorkDir().
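A sketch of that helper, with the reasoning in the comment. This reproduces the expression quoted in the review; the surrounding module context is assumed:

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Sketch of resolveWorkDir(): a function, not a module-level constant,
// so process.env.NB_WORK_DIR is read at call time. A constant would
// freeze the import-time value and break integration tests that
// override the env var at runtime.
function resolveWorkDir(): string {
  return process.env.NB_WORK_DIR ?? join(homedir(), ".nimblebrain");
}
```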
```typescript
const serverName = deriveServerName(name);
const instance = this.instances.get(`${serverName}|${wsId}`);
if (!instance) {
  throw new Error(`No bundle instance found for "${name}" in workspace "${wsId}"`);
}
```
No protected check on upgrade.
uninstall (L336) refuses on instance.protected. Upgrade has no equivalent guard, so a protected bundle can be hot-swapped to a new version even though it can't be uninstalled.
This is probably intentional — protected bundles shouldn't be frozen out of security updates — but worth an explicit decision and a one-line code comment so a future contributor doesn't "fix" it by adding the check (or vice versa).
Intentional — protected bundles can't be uninstalled but should receive version updates (security patches). Added code comment in upgrade() explaining the decision so future contributors don't accidentally add/remove the guard.
Force-pushed from 2464c81 to 65d1c3a.
Best-effort hot-swap: stops the running source, re-installs from mpak,
and starts the new version. Clears stale automations before re-syncing.
Refreshes trustScore and unconditionally unregisters placements before
re-registering. Adds installSource ("registry" | "local" | "remote")
to BundleInstance and AppInfo for explicit install-channel tracking.
Closes NimbleBrainInc#35
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
manage_app "upgrade" delegates to lifecycle.upgrade(). check_updates filters on installSource === "registry" instead of bundleName heuristic.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Check for updates button, per-row upgrade with error handling. Update column only shown for registry-sourced bundles. Filters on installSource rather than bundleName prefix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

13 tests covering upgrade action guards, schema validation, check_updates filtering by installSource, upgrade no-op when already at latest, and unknown instance error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Force-pushed from 65d1c3a to fd10a35.
Summary
- `upgrade()` method on `BundleLifecycleManager` — best-effort hot-swap that stops the running source, re-installs from mpak, and starts the new version. Preserves workspace data and credentials.
- Extends the `manage_app` tool with an `"upgrade"` action and adds a new `check_updates` tool that queries the registry for available updates
- `org_admin` role, "updating" status badge during upgrade

Closes #35 (pieces 1 and 3 — upgrade action and updates-available surface. Piece 2, version pinning, deferred to a follow-up PR)
Review feedback addressed
- `removeBundleAutomations()` called before `syncBundleAutomations()`, matching the uninstall pattern. Prevents orphaned schedules when a new manifest drops a previously-declared automation.
- `result.isError` now checked explicitly before calling `fetchApps()`. Error message extracted and routed to `setUpgradeErrors` so the inline error renders.
- `check_updates` clobbering local-installed bundles — added `installSource` field (`"registry" | "local" | "remote"`) to `BundleInstance` and `AppInfo`. `check_updates` and the frontend filter on `installSource === "registry"` instead of the `bundleName.startsWith("@")` heuristic. Local dev bundles with scoped manifest names are no longer offered for update.
- `NB_WORK_DIR` resolution — extracted to a `resolveWorkDir()` function (not a module-level constant, since constants capture env at import time, breaking integration tests that override `process.env.NB_WORK_DIR` at runtime).
- `fetchTrustScore()` called during upgrade to refresh from the mpak registry.
- `unregister()` before conditionally re-registering. `unregister()` is a no-op filter when no entries exist.
- No `protected` check on upgrade — intentional. Protected bundles can't be uninstalled but should receive version updates (security patches). Added a code comment explaining the decision.
- 13 unit tests: `check_updates` filtering by `installSource`, upgrade no-op when already at latest, and unknown instance error path.

Test plan
🤖 Generated with Claude Code