Bug 50124 - WinPE does not work if HyperV Enlightenment is activated
Status: CLOSED WONTFIX
Product: UCS
Classification: Unclassified
Component: Virtualization - UVMM
UCS 4.3
Other Linux
Importance: P5 normal
Assigned To: UCS maintainers
Depends on:
Blocks:
Reported: 2019-09-06 12:26 CEST by Christina Scheinig
Modified: 2023-06-28 10:46 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.143
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2019090421000411
Bug group (optional):
Max CVSS v3 score:


Description Christina Scheinig univentionstaff 2019-09-06 12:26:49 CEST
If Hyper-V Enlightenment is activated, Windows PE fails to boot: the PE simply stops. If the server boots into the recovery system due to an error (which is only a "pimped" PE), the PE crashes and the VM hangs in a reboot loop, because the system tries to load the recovery PE again and again.

With Windows Server 2019, the same problem (the system simply hangs during boot) also occurs during normal startup of the VM, not only with the PE. Here, too, removing the Hyper-V Enlightenment helped: afterwards the server started again.

The Hyper-V Enlightenment feature is absolutely necessary for live migration in the customer environment.
Comment 1 Philipp Hahn univentionstaff 2019-09-06 12:39:02 CEST
Hyper-V can be disabled via the UVMM UMC module.

"Windows Server 2019" was never tested by me or, as far as I know, by anyone else here at Univention.

Please try different CPU models: crashing VMs are most often caused by Qemu providing an artificial CPU whose combination of features / CPU model / CPU level does not match any real CPU from Intel / AMD. OSs then often make the wrong choice for the low-level HW modules.

For example: The default Qemu CPU identifies itself as a "Core 2 Duo" CPU with lots of extra CPU features from more modern CPUs. If "Hyper-V Enlightenment" is enabled, this adds even more features. Windows, for example, then assumes that, as Hyper-V is available, the CPU generation must be at least from "201x", all of which have feature X. But if X is not provided by Qemu, the guest OS will crash as soon as it uses feature X.

Someone should first check whether it is really "Hyper-V" or some other missing CPU feature required by Windows (PE).
Comment 2 Ingo Steuwer univentionstaff 2019-09-10 09:54:20 CEST
Sounds to me like this should be analyzed more deeply in the environment where the problem occurs. Or is it easily reproducible in our test environments?
Comment 3 Philipp Hahn univentionstaff 2019-09-11 17:33:37 CEST
(qemu) info status
VM status: paused (shutdown)

(qemu) x/10i 0xfffff8025992b4d0
0xfffff8025992b4d0:  mov    %ecx,0x8(%rsp)
0xfffff8025992b4d4:  push   %rbx
0xfffff8025992b4d5:  sub    $0x50,%rsp
0xfffff8025992b4d9:  mov    %ecx,%ebx
0xfffff8025992b4db:  mov    %ebx,%ecx
0xfffff8025992b4dd:  callq  0xfffff80259852b90

(qemu) x/1i 0xfffff80259852b90
0xfffff80259852b90:  int3   

(qemu) info cpus
* CPU #0: pc=0xfffff80259852b90 thread_id=317


This may be caused by Bug #21860, which added patch patches/libvirt/4.4-0-0-ucs/3.0.0-4+deb9u3-errata4.4-0/0022-Bug-21860-Default-to-kvm32.quilt:
It changes the default CPU model from "qemu{32,64}" to "kvm{32,64}" to get PSE-36 working again for pae kernels.

If "Hyper-V Enlightenment" is _not_ enabled, `libvirt` passes no explicit `-cpu XXX` to `qemu` and thus `qemu` defaults to `qemu64`.
With "Hyper-V Enlightenment" enabled, `libvirt` must pick a CPU to add the Hyper-V-Features on top - it picks 'kvm64' as the base due to the above mentioned patch.

'kvm64' is a "Family 15 Model 5" based CPU, i.e. a pre-"Core" CPU!
'qemu64' is a "Family 6 Model 6" based CPU, i.e. a post-"Core" CPU!
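
For reference, a minimal sketch of how a guest OS derives "Family" and "Model" from the CPUID leaf 1 EAX signature, following the decoding rule documented by Intel and AMD. The function name and the sample signature values are illustrative assumptions, not values taken from the Qemu CPU definitions discussed here.

```python
def decode_cpuid_signature(eax):
    """Return (family, model, stepping) decoded from CPUID leaf 1 EAX."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 15.
    family = base_family + ext_family if base_family == 0xF else base_family

    # The extended model only counts for base family 6 or 15.
    model = base_model
    if base_family in (0x6, 0xF):
        model |= ext_model << 4
    return family, model, stepping


if __name__ == "__main__":
    # Illustrative signatures only (assumption: not read from a real VM).
    for signature in (0x000006FB, 0x00000F61):
        print(hex(signature), decode_cpuid_signature(signature))
```

An OS that keys low-level driver or feature decisions on the family value will treat a family 15 CPU very differently from a family 6 one, which is why the choice of base model matters.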

<https://www.gigxp.com/windows-server-2019-system-requirements/> lists the following minimum requirements:
> Processor requirements:
> A minimum of 1.4 GHz 64-bit EMT64 or AMD64 processor. Quad Core Recommended for production systems.
> Support for security features like NX Bit and DEP (Data Execution Prevention)
> The processor should support CMPXCHG16b, LAHF/SAHF, and PrefetchW
> Needs to Support EPT or NPT (Second Level Address Translation)

So 'kvm64' is too old.
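
One way to verify which of these features a given Qemu CPU model actually exposes is to boot a Linux guest with the same `-cpu` setting and inspect `/proc/cpuinfo`. The following is a minimal sketch under that assumption. The mapping to Linux flag names (`nx`, `cx16`, `lahf_lm`, `3dnowprefetch`) reflects the usual kernel naming, and the helper names are hypothetical.

```python
# Linux /proc/cpuinfo flag names for the requirements quoted above.
REQUIRED_FLAGS = {
    "nx": "NX bit / DEP",
    "cx16": "CMPXCHG16b",
    "lahf_lm": "LAHF/SAHF in 64-bit mode",
    "3dnowprefetch": "PrefetchW",
}


def parse_cpu_flags(cpuinfo_text):
    """Return the flag set of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text):
    """Return the sorted list of required flags the CPU does not expose."""
    flags = parse_cpu_flags(cpuinfo_text)
    return sorted(flag for flag in REQUIRED_FLAGS if flag not in flags)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        for flag in missing_flags(handle.read()):
            print("missing:", flag, "(" + REQUIRED_FLAGS[flag] + ")")
```

Run inside the test guest, an empty output means the virtual CPU exposes all four instruction-level features. EPT/NPT is a host-side capability and is not covered by this check.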


Strangely, trying to start a VM with 'qemu64' fails with:
> virsh # start phahn_qa36-ucs44-32b
> error: Failed to start domain phahn_qa36-ucs44-32b
> error: the CPU is incompatible with host CPU: Host CPU does not provide required features: svm

Please note that 'svm' is the AMD feature, the Intel one is called 'vmx' - and this is an Intel host!
Manually starting the VM with `qemu-system-x86_64 -cpu qemu64 ...` on the other hand works.

See <https://www.redhat.com/archives/libvir-list/2016-May/msg01940.html> for a similar report.

So both "qemu64" and "kvm64" are bad choices as a default and should be changed!
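
As a possible workaround until the default changes, the CPU model can be pinned explicitly in the libvirt domain XML, so that neither 'qemu64' nor 'kvm64' is picked implicitly. The sketch below rewrites the `<cpu>` element of a domain definition (as printed by `virsh dumpxml`). The function name is hypothetical and "Nehalem" is only an example model, not a tested recommendation for this bug.

```python
import xml.etree.ElementTree as ET


def set_cpu_model(domain_xml, model):
    """Return domain XML with a single <cpu> element pinned to `model`."""
    root = ET.fromstring(domain_xml)

    # Drop any existing <cpu> definition so exactly one remains.
    for old in root.findall("cpu"):
        root.remove(old)

    cpu = ET.SubElement(root, "cpu", {"mode": "custom", "match": "exact"})
    # fallback='forbid' makes libvirt refuse to start the VM instead of
    # silently substituting a different model.
    model_element = ET.SubElement(cpu, "model", {"fallback": "forbid"})
    model_element.text = model
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Assumption: a heavily shortened domain definition for illustration.
    example = "<domain type='kvm'><name>demo</name></domain>"
    print(set_cpu_model(example, "Nehalem"))
```

The resulting XML would then be loaded with `virsh define`. Whether Hyper-V Enlightenment works on top of the chosen model still has to be tested per guest OS.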

It might not be the original problem of the customer, but at least something is fishy here, and it complicates reproducing the problem locally here at Univention.
Comment 4 Christina Scheinig univentionstaff 2020-11-25 14:21:02 CET
I am removing the waiting support flag, because the customer did not respond to our request. The customer might not use our virtual environment anymore.
Comment 5 Ingo Steuwer univentionstaff 2021-05-14 13:46:03 CEST
This issue has been filed against UCS 4.3.

UCS 4.3 is out of maintenance and many UCS components have changed in later releases. Thus, this issue is now being closed.

If this issue still occurs in newer UCS versions, please use "Clone this bug" or reopen it and update the UCS version. In this case please provide detailed information on how this issue is affecting you.