Skip to content

Conversation

@alandau
Copy link
Contributor

@alandau alandau commented Feb 11, 2026

Clean cherry pick of PR #2789

Add vm/nvme_keepalive_mode inspect node shows the current status, but can be updated through ohcldiag-dev:

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u disabled
"disabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"disabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u enabled
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u blah
Error: update error: Invalid keepalive config: blah

Add a test for this in openhcl_servicing.rs that verifies that the keepalive can be disabled, and servicing completes successfully.

…2789)

Add `vm/nvme_keepalive_mode` inspect node shows the current status, but
can be updated through `ohcldiag-dev`:

```
C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u disabled
"disabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"disabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u enabled
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode
"enabled"

C:\openvmm>ohcldiag-dev.exe uhdiag.sock i vm/nvme_keepalive_mode -u blah
Error: update error: Invalid keepalive config: blah
```

Add a test for this in `openhcl_servicing.rs` that verifies that the
keepalive can be disabled, and servicing completes successfully.

(cherry picked from commit 458c128)
Copilot AI review requested due to automatic review settings February 11, 2026 22:43
@alandau alandau requested a review from a team as a code owner February 11, 2026 22:43
@github-actions github-actions bot added the release_1.7.2511 Targets the release/1.7.2511 branch. label Feb 11, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds an OpenHCL inspect node to view/update NVMe keepalive mode at runtime (primarily for diagnostics/testing), and extends servicing tests to validate disabling keepalive via inspect.

Changes:

  • Add mutable inspect handling for vm/nvme_keepalive_mode, including update logging.
  • Extend KeepAliveConfig parsing to accept "enabled" / "disabled" and add a stringifier helper.
  • Add a servicing test that disables NVMe keepalive via inspect and verifies IDENTIFY is observed after servicing restore.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs Adds a new servicing test that toggles NVMe keepalive via inspect and validates expected NVMe behavior post-restore.
openhcl/underhill_core/src/options.rs Updates KeepAliveConfig to support inspect-friendly strings and provides as_str() helper.
openhcl/underhill_core/src/dispatch/mod.rs Exposes vm/nvme_keepalive_mode as a mutable inspect field with logging on updates.

Comment on lines +109 to +116
/// Returns the string representation matching the inspect rename attributes.
pub fn as_str(&self) -> &'static str {
match self {
KeepAliveConfig::EnabledHostAndPrivatePoolPresent => "enabled",
KeepAliveConfig::DisabledHostAndPrivatePoolPresent => "nohost,privatepool",
KeepAliveConfig::Disabled => "disabled",
}
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment for KeepAliveConfig::as_str says it matches “inspect rename attributes”, but KeepAliveConfig doesn’t have any #[inspect(rename = ...)] attributes on its variants. This is misleading for future readers—either add the rename attributes to the enum variants (so Inspect outputs the same strings) or reword the comment to reflect that this is a custom string mapping used by the inspect handler.

Copilot uses AI. Check for mistakes.
Comment on lines +526 to +530
// Enable keepalive, then disable it later via inspect
flags.enable_nvme_keepalive = true;
// We need to disabled MANA KA since if either of the KA flasgs in on, DMA manager will save its state
// which includes NVMe regions and restore verification will fail ("unrestored allocations found"),
// since NVMe KA is off and we don't save anything).
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling/grammar in this comment makes the rationale hard to follow (e.g., “We need to disabled”, “flasgs in on”, and an extra closing parenthesis). Please fix the wording so it clearly explains why MANA keepalive must be disabled for this test.

Suggested change
// Enable keepalive, then disable it later via inspect
flags.enable_nvme_keepalive = true;
// We need to disabled MANA KA since if either of the KA flasgs in on, DMA manager will save its state
// which includes NVMe regions and restore verification will fail ("unrestored allocations found"),
// since NVMe KA is off and we don't save anything).
// Enable NVMe keepalive, then disable it later via inspect.
flags.enable_nvme_keepalive = true;
// Disable MANA keepalive. If either keepalive flag is enabled, the DMA manager will save its state,
// which includes NVMe regions, and restore verification will fail with "unrestored allocations found"
// because NVMe keepalive is off and we do not save any NVMe state in this test.

Copilot uses AI. Check for mistakes.
Comment on lines +363 to +377
"nvme_keepalive_mode",
|value| -> anyhow::Result<&'static str> {
if let Some(new_value) = value {
let old_value = self.nvme_keep_alive.as_str();
self.nvme_keep_alive = new_value.parse()?;
tracing::info!(
CVM_ALLOWED,
old_value,
new_value = self.nvme_keep_alive.as_str(),
"nvme_keepalive_mode updated via inspect"
);
}
Ok(self.nvme_keep_alive.as_str())
},
);
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvme_keepalive_mode can be set to "enabled" via inspect, but this only updates self.nvme_keep_alive and does not update NVMe save/restore support in the NVMe manager (which is computed at startup). This can lead to an inconsistent state where DMA manager state is saved/restored because nvme_keep_alive.is_enabled() is true, but NVMe state is not saved (e.g., NvmeManager::save returns None when save_restore_supported is false), potentially causing servicing restore failures. Consider rejecting/returning an error when switching to "enabled" unless NVMe keepalive save/restore is actually supported for the current VM (e.g., NVMe VFIO is active and private pool is available), or gate the DMA-manager save decision on the same capability check used by NvmeManager::save.

Copilot uses AI. Check for mistakes.
@alandau alandau enabled auto-merge (squash) February 12, 2026 00:07
@alandau alandau merged commit db38865 into microsoft:release/1.7.2511 Feb 12, 2026
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release_1.7.2511 Targets the release/1.7.2511 branch.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants