Description
I haven't debugged this at all, but when trying things out today, bcvk failed in a Codespaces instance with secure boot and hit a KVM error.
Details
# UEFI Secure Boot Fails in Nested Virtualization (Azure/Codespaces)
Problem Description
VMs created with bcvk libvirt run using the default --firmware=uefi-secure fail to boot in nested virtualization environments (GitHub Codespaces, Azure VMs) with the error:
KVM: entry failed, hardware error 0xffffffff
... SMM=1 ...
The VM starts but immediately enters a paused state with "internal-error" status.
Root Cause
UEFI Secure Boot requires SMM (System Management Mode) support. When running KVM inside another hypervisor (nested virtualization), SMM emulation is incomplete and unreliable, particularly on Azure/Microsoft hypervisors. The OVMF firmware tries to enter SMM during Secure Boot initialization, which triggers a KVM hardware error that cannot be recovered.
Environment
- Platform: GitHub Codespaces (Azure nested virtualization)
- Host Hypervisor: Microsoft (detected via systemd-detect-virt)
- Nested KVM: Enabled (/sys/module/kvm_amd/parameters/nested = 1)
- QEMU: 10.1.2
- libvirt: 11.8.0
- Kernel: 6.8.0-1030-azure
Symptoms
- VM domain shows state: paused (internal-error)
- QEMU log shows:
  KVM: entry failed, hardware error 0xffffffff
  EAX=00000000 EBX=b7e03d78 ECX=000000b2 EDX=000000b2
  ...
  EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
- VM XML shows:
  <feature enabled='yes' name='secure-boot'/>
  <smm state='on'/>
  <loader readonly='yes' secure='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE_4M.ms.fd</loader>
Reproduction
# This will fail in nested virtualization:
bcvk libvirt run --name test quay.io/centos-bootc/centos-bootc:stream10
# Check VM state:
virsh domstate test
# Output: paused
# Check QEMU monitor:
virsh qemu-monitor-command test --hmp info status
# Output: VM status: paused (internal-error)
# Check logs:
cat ~/.cache/libvirt/qemu/log/test.log | grep "KVM.*error"
# Output: KVM: entry failed, hardware error 0xffffffff
Solutions
Option 1: UEFI without Secure Boot (Recommended)
Use --firmware=uefi-insecure to get UEFI boot without SMM requirements:
bcvk libvirt run --name test --firmware=uefi-insecure \
quay.io/centos-bootc/centos-bootc:stream10
Result: ✅ Works perfectly. UEFI boot, no Secure Boot, no SMM.
Option 2: Legacy BIOS
Use --firmware=bios for maximum compatibility:
bcvk libvirt run --name test --firmware=bios \
quay.io/centos-bootc/centos-bootc:stream10
Result: ✅ Works perfectly. Legacy BIOS boot, no UEFI.
Attempted Workarounds (All Failed)
The following were tested but did not resolve the SMM issue:
- CPU mode changes: host-passthrough, host-model, custom CPU models all fail
- Migratable flag: Setting migratable=off makes no difference
- KVM parameters: No kernel module parameters can fix architectural limitations
- QEMU machine options: SMM issues are at the KVM level, not QEMU
Detection
We can reliably detect this problematic scenario before attempting to start a VM:
Detection Logic
fn is_nested_virtualization_risky() -> bool {
    // Check if running in a VM: the CPU flags in /proc/cpuinfo include
    // "hypervisor" when the kernel runs as a guest.
    let in_vm = std::fs::read_to_string("/proc/cpuinfo")
        .map(|s| s.contains("hypervisor"))
        .unwrap_or(false);
    // Check if nested KVM is enabled (AMD module first, then Intel).
    let nested_kvm = std::fs::read_to_string("/sys/module/kvm_amd/parameters/nested")
        .or_else(|_| std::fs::read_to_string("/sys/module/kvm_intel/parameters/nested"))
        .map(|s| s.trim() == "1" || s.trim() == "Y")
        .unwrap_or(false);
    in_vm && nested_kvm
}
Detection Signals
| Check | Command | Expected in Nested Virt |
|---|---|---|
| Hypervisor present | grep hypervisor /proc/cpuinfo | Match found |
| Hypervisor vendor | lscpu \| grep "Hypervisor vendor" | Shows "Microsoft" (Azure) |
| systemd-detect-virt | systemd-detect-virt | Returns non-"none" |
| DMI vendor | cat /sys/devices/virtual/dmi/id/sys_vendor | "Microsoft Corporation" |
| Nested KVM | cat /sys/module/kvm_*/parameters/nested | "1" or "Y" |
Recommended Warning
When --firmware=uefi-secure is used and nested virtualization is detected:
WARNING: Nested virtualization detected (Hypervisor: Microsoft)
UEFI Secure Boot (--firmware=uefi-secure) requires SMM which often
fails in nested environments.
Recommended alternatives:
--firmware=uefi-insecure (UEFI without Secure Boot)
--firmware=bios (Legacy BIOS, most compatible)
Continue anyway? [y/N]
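One way the warning above could be produced is sketched below. This is illustrative only: secure_boot_warning and confirmed are hypothetical helpers, not existing bcvk functions, and the hypervisor name is assumed to come from whatever detection step ran earlier. The prompt defaults to "no", matching the [y/N] convention in the message.

```rust
/// Returns the warning text when Secure Boot firmware is requested and a
/// hypervisor was detected; returns None in every other case.
fn secure_boot_warning(firmware: &str, hypervisor: Option<&str>) -> Option<String> {
    let vendor = hypervisor?; // no hypervisor detected: nothing to warn about
    if firmware != "uefi-secure" {
        return None;
    }
    Some(format!(
        "WARNING: Nested virtualization detected (Hypervisor: {vendor})\n\
         UEFI Secure Boot (--firmware=uefi-secure) requires SMM which often\n\
         fails in nested environments.\n\
         Recommended alternatives:\n\
         --firmware=uefi-insecure (UEFI without Secure Boot)\n\
         --firmware=bios (Legacy BIOS, most compatible)"
    ))
}

/// Interprets the answer to "Continue anyway? [y/N]".
/// Anything other than an explicit yes counts as "no".
fn confirmed(answer: &str) -> bool {
    matches!(answer.trim().to_ascii_lowercase().as_str(), "y" | "yes")
}
```

Keeping the message construction separate from the prompt handling means a non-interactive caller (for example CI) can print the warning and skip the question.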
Technical Background
Why SMM Fails in Nested Virtualization
- SMM Architecture: SMM is a special x86 CPU mode for firmware operations
- Nested KVM Limitation: KVM's nested virtualization doesn't fully emulate SMM
- OVMF Requirement: OVMF Secure Boot firmware requires functional SMM
- Azure Specific: Azure's hypervisor is optimized for Hyper-V, not nested KVM
Error Details
The error occurs when:
- libvirt starts QEMU with smm=on and Secure Boot OVMF firmware
- OVMF initializes and attempts to enter SMM
- Nested KVM cannot handle the SMM state transition
- CPU registers show SMM=1 flag set at time of error
- VM pauses with "internal-error" and cannot recover
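The failure signature described in these steps (a KVM entry failure followed by a register dump with SMM=1) can be recognized in QEMU log text. The sketch below assumes the log format shown in the Symptoms section; looks_like_smm_entry_failure is a hypothetical helper name, and the caller is assumed to have already read the log file into a string.

```rust
/// Returns true when the log contains "KVM: entry failed, hardware error"
/// and a register dump token "SMM=1" on the same line or a later one.
fn looks_like_smm_entry_failure(qemu_log: &str) -> bool {
    let mut saw_entry_failure = false;
    for line in qemu_log.lines() {
        if line.contains("KVM: entry failed, hardware error") {
            saw_entry_failure = true;
        }
        // Only count SMM=1 once the entry failure has been seen, so an
        // unrelated register dump earlier in the log does not match.
        if saw_entry_failure && line.split_whitespace().any(|tok| tok == "SMM=1") {
            return true;
        }
    }
    false
}
```

A tool could run this check after a domain ends up paused (internal-error) and, on a match, point the user at the non-Secure-Boot firmware options.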
Research References
- KVM Nested Limitations: https://docs.kernel.org/virt/kvm/x86/running-nested-guests.html
- Libvirt Secure Boot: https://libvirt.org/kbase/secureboot.html
- TianoCore SMM Testing: https://github.com/tianocore/tianocore.github.io/wiki/Testing-SMM-with-QEMU,-KVM-and-libvirt
- Red Hat Bug #1308678: OVMF binary separation for SB+SMM vs non-SMM variants
Related Issues
- Affects all nested KVM environments (not just Azure)
- Proxmox nested virtualization shows similar issues
- Some cloud providers (AWS bare metal) may work better
- Issue has existed in KVM for multiple kernel versions
Testing Notes
Tested configurations:
- ✅ --firmware=uefi-insecure: Works, VM boots successfully
- ✅ --firmware=bios: Works, VM boots successfully
- ❌ --firmware=uefi-secure + host-passthrough: Fails with SMM error
- ❌ --firmware=uefi-secure + host-model: Fails with SMM error
- ❌ --firmware=uefi-secure + migratable=off: Fails with SMM error
Recommendations
For bcvk/bootc Development
- Add detection heuristic to warn users in nested virt scenarios
- Change default to --firmware=uefi-insecure when nested virt detected
- Document limitation in README/docs
- CI/CD: Use --firmware=uefi-insecure in GitHub Actions/Codespaces
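The default-switching recommendation could look roughly like the sketch below. It assumes an explicit --firmware value should always be respected, so only the implicit default changes when nested virtualization is detected. effective_firmware is a hypothetical function, not bcvk's actual option handling.

```rust
/// Picks the firmware to use. `requested` is the value of --firmware if the
/// user passed one; `nested_virt_detected` comes from a detection heuristic.
fn effective_firmware(requested: Option<&str>, nested_virt_detected: bool) -> &str {
    match requested {
        // An explicit choice is never overridden, even in nested virt.
        Some(firmware) => firmware,
        // No explicit choice: avoid the SMM requirement in nested virt.
        None if nested_virt_detected => "uefi-insecure",
        // Otherwise keep the current default.
        None => "uefi-secure",
    }
}
```

With this shape, a user who explicitly asks for uefi-secure in a nested environment still gets it, and a warning can be shown for that case instead of silently changing the request.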
For Users
- In nested environments: Always use --firmware=uefi-insecure or --firmware=bios
- For Secure Boot: Test on bare metal or non-nested KVM hosts
- For development/testing: --firmware=bios is fastest and most compatible
Investigation Date
2025-11-10