Univention Bugzilla – Bug 21386
No migration possible between different CPU generations
Last modified: 2019-05-07 16:26:21 CEST
A problem with migration between systems with different CPUs was noticed in ticket #2011012510001527:

# ssh xenXXXXXXXX0[89] xm info | sed -ne 's/hw_caps.*: //p'
bfebfbff:28100800:00000000:00000340:009ce3bd:00000000:00000001:00000000
bfebfbff:20100800:00000000:00000140:040ce3bd:00000000:00000001:00000000
          ^                     ^    ^^

libvirt <http://libvirt.org/formatdomain.html#elementsCPU> already supports restricting the CPUID <http://www.sandpile.org/ia32/cpuid.htm>. This works with KVM. With Xen it is not usable through libvirt, but it can at least be specified in the classic Xen xm format:

#---------------------------------------------------------------------
# Configure guest CPUID responses:
#
#cpuid=[ '1:ecx=xxxxxxxxxxx00xxxxxxxxxxxxxxxxxxx,
#           eax=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' ]
#
# Each successive character represent a lesser-significant bit:
#  '1' -> force the corresponding bit to 1
#  '0' -> force to 0
#  'x' -> Get a safe value (pass through and mask with the default policy)
#  'k' -> pass through the host bit value
#  's' -> as 'k' but preserve across save/restore and migration
#
# Configure host CPUID consistency checks, which must be satisfied for this
# VM to be allowed to run on this host's processor type:
#cpuid_check=[ '1:ecx=xxxxxxxxxxxxxxxxxxxxxxxxxx1xxxxx' ]
# - Host must have VMX feature flag set
#
# The format is similar to the above for 'cpuid':
#  '1' -> the bit must be '1'
#  '0' -> the bit must be '0'
#  'x' -> we don't care (do not check)
#  's' -> the bit must be the same as on the host that started this VM

It would be desirable for UVMM to read the CPUID from all hosts in a group, list them with an explanation, and, similar to libvirt's "cpu-baseline" (cf. <http://www.libvirt.org/html/libvirt-libvirt.html#virConnectBaselineCPU> and <http://www.libvirt.org/html/libvirt-libvirt.html#virDomainXMLFlags> VIR_DOMAIN_XML_UPDATE_CPU), offer a way to define the default value for new domains.
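The comparison of the two hw_caps lines above can be sketched in a few lines of Python. This is only an illustration, not the attached script: all function names are made up here, and the sketch deliberately does not say which hw_caps word corresponds to which CPUID leaf or register, since that mapping depends on the Xen version. It shows which bits differ, the common baseline (bitwise AND) and how a mask string in the xm 'cpuid' notation could be built for a set of bits to force to 0.

```python
def parse_hw_caps(line):
    """Split an 'xm info' hw_caps value into a list of 32-bit integers."""
    return [int(word, 16) for word in line.strip().split(":")]


def differing_bits(caps_a, caps_b):
    """Return {word_index: [bit numbers]} for all bits that differ."""
    diff = {}
    for index, (a, b) in enumerate(zip(caps_a, caps_b)):
        bits = [bit for bit in range(32) if ((a ^ b) >> bit) & 1]
        if bits:
            diff[index] = bits
    return diff


def baseline(caps_list):
    """Bitwise AND per word: the feature bits common to all hosts."""
    result = list(caps_list[0])
    for caps in caps_list[1:]:
        result = [r & c for r, c in zip(result, caps)]
    return result


def xm_mask(force_zero):
    """Build a 32-character mask, most significant bit first:
    '0' for every bit set in force_zero, 'x' for all other bits."""
    return "".join(
        "0" if (force_zero >> bit) & 1 else "x" for bit in range(31, -1, -1)
    )


# The two hosts from the ticket.
host08 = parse_hw_caps(
    "bfebfbff:28100800:00000000:00000340:009ce3bd:00000000:00000001:00000000")
host09 = parse_hw_caps(
    "bfebfbff:20100800:00000000:00000140:040ce3bd:00000000:00000001:00000000")

if __name__ == "__main__":
    for index, bits in sorted(differing_bits(host08, host09).items()):
        print("word %d differs in bits %s" % (index, bits))
    print(":".join("%08x" % word for word in baseline([host08, host09])))
```

Which register a mask produced by xm_mask() belongs to (for example '1:ecx=...') has to be decided by the administrator. The sketch only produces the 32-character string.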
Created attachment 3000
Parse Xen hw_caps CPUID

To parse the hw_caps information, run the following:

xm info | sed -ne 's/^hw_caps *: //p' | ./2011012510001527.py
1. UVMM should warn before migrating a VM between incompatible CPUs.
2. By default, UVMM should use a restricted CPU feature set to guarantee migration between all hosts of the domain. Changing the default for performance tuning should be documented in an extended document. (Automatically determining the best CPU set is considered too complex for a normal administrator and too error-prone.)

PS: As of 2014-02-12 libvirt-xen still doesn't seem to support VIR_CPU_MODE_*.

The problem appeared again at a different customer.
Asked for again: UVMM should block migration if the CPUs are incompatible.
This issue has been filed against UCS 2.4. UCS 2.4 is out of maintenance and many UCS components have vastly changed in later releases. Thus, this issue is now being closed. If this issue still occurs in newer UCS versions, please use "Clone this bug". In this case please provide detailed information on how this issue is affecting you.
Still a problem with KVM. OpenStack fixed it here: <https://bugs.launchpad.net/nova/+bug/1082414>.
The easiest thing is to set <cpu mode="host-model"/>, for which libvirt will insert its view of the host CPU into the XML while the VM is running. If such a VM is migrated to an incompatible host, migrate() will show an error:

virsh # uri
qemu+tls://lattjo.knut.univention.de/system?pkipath=/home/phahn/.pki/libvirt
virsh # migrate --domain phahn_cpu_migration --desturi qemu+tls://utby.knut.univention.de/system?pkipath=/home/phahn/.pki/libvirt --live --persistent --undefinesource --verbose
error: the CPU is incompatible with host CPU: Host CPU does not provide required features: pclmuldq, smx, fma, pcid, x2apic, movbe, tsc-deadline, aes, xsave, osxsave, avx, f16c, rdrand, arat, fsgsbase, tsc_adjust, bmi1, avx2, smep, bmi2, erms, invpcid, xsaveopt, pdpe1gb, abm

mode="host-model" has one big disadvantage: there are known cases where libvirt will create a virtual CPU which does not exist in reality and which will make the guest OS crash.
Long story short, the set of usable CPU features depends on the host CPU *AND* the Qemu version *AND* the Linux kernel. Only Qemu-2.9 and libvirt-3.2 (or later) query each other to get rid of that problem.

So we need to make it configurable which CPU to use:
- "host-passthrough" for maximum performance
- "host-model" for safe migration
- custom cpu model from `virsh cpu-models x86_64`

Links to read:
* <https://wiki.libvirt.org/page/TodoPreMigrationChecks>
* <https://bugzilla.redhat.com/show_bug.cgi?id=1055002>
* <https://bugzilla.redhat.com/show_bug.cgi?id=824989>
* <https://libvirt.org/formatdomain.html#elementsCPU>
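The three choices could be turned into a libvirt <cpu> element roughly as follows. This is a minimal sketch using only the Python standard library. The function name build_cpu_element is hypothetical, "Haswell" is only an example of a name that `virsh cpu-models x86_64` may list, and the choice of match="exact" with fallback="forbid" for custom models is an assumption, not something decided in this bug.

```python
import xml.etree.ElementTree as ET

# The three modes discussed above.
VALID_MODES = ("host-passthrough", "host-model", "custom")


def build_cpu_element(mode, model=None):
    """Return an ElementTree <cpu> element for the requested mode.

    For mode "custom" a model name is required. match="exact" and
    fallback="forbid" are assumptions chosen for this sketch.
    """
    if mode not in VALID_MODES:
        raise ValueError("unsupported CPU mode: %r" % (mode,))
    cpu = ET.Element("cpu", {"mode": mode})
    if mode == "custom":
        if not model:
            raise ValueError("mode 'custom' requires a model name")
        cpu.set("match", "exact")
        ET.SubElement(cpu, "model", {"fallback": "forbid"}).text = model
    return cpu


if __name__ == "__main__":
    for args in (("host-passthrough",), ("host-model",), ("custom", "Haswell")):
        print(ET.tostring(build_cpu_element(*args), encoding="unicode"))
```

The resulting element would have to be placed into the inactive domain XML before the domain is defined. It takes effect on the next start of the VM.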
Links to read:
* <https://www.berrange.com/posts/2018/06/29/cpu-model-configuration-for-qemu-kvm-on-x86-hosts/>
* <https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08422.html>

Ideas:
- Add a UCRV to disable migration by default
- Add UI in UMC-UVMM to select:
  "host-passthrough" for maximum performance in single-host environments
  "host-model" for 99% in multi-host-environments requiring migration
  "custom": very long list of models and flags depending on Qemu/libvirt/µCode/HW version
  "default": unspecified as currently
- As libvirt might create an invalid "host-model", we may need a mechanism to provide an override?
- Or update libvirt and qemu to a later version, which works "as expected by the customer".
I talked to the customer and UCS-4.3-x is okay for them, as they're currently updating their environment to UCS-4.3 and have waited long enough for this feature, so they are willing to wait some more to get the improved version of UVMMd.

The idea is as follows:
- uvmmd will fetch both the inactive and the active domain XML.
- the active domain XML is only needed to get the VNC port if the VM is running.
- if the inactive domain XML is missing <cpu mode='host-model'/>, uvmmd will automatically add it. It will take effect on the next (shutdown+)start.
- this will be the default behavior, but a UCRV will allow that to be turned off or even to remove that line if it exists. This is for those situations where host-model breaks or is not required (single-host).
- care must be taken not to mix the active and inactive XML, as during run-time the active domain XML contains the concrete model. Using the active domain XML to define an inactive domain XML will not reset the CPU model to "host-model" to be filled in the next time the domain is started.
- We will ignore the new features IBRS, IBPB, STIBP and SSB provided by newer micro-code updates for now. libvirt lists them in neither "capabilities" nor "domcapabilities", so there is no remote mechanism using only libvirtd to detect those and enable them when available.
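The handling of the inactive domain XML described above might look like the following sketch. It is not the uvmmd implementation. The function name is hypothetical, and the policy values "always", "missing" and "remove" (plus None for an unset variable) are taken from the test report for uvmm/vm/cpu/host-model later in this bug. Their exact semantics as coded here are an interpretation of that report. The function must only ever be given the inactive XML, never the active one.

```python
import xml.etree.ElementTree as ET


def apply_host_model_policy(inactive_xml, policy=None):
    """Return the inactive domain XML adjusted according to the policy.

    Assumed semantics (an interpretation, see the lead-in):
      None      -> leave the XML unchanged
      "always"  -> replace any existing <cpu> by <cpu mode="host-model"/>
      "missing" -> add <cpu mode="host-model"/> only if no <cpu> exists
      "remove"  -> drop <cpu> if its mode is "host-model"
    """
    root = ET.fromstring(inactive_xml)
    cpu = root.find("cpu")
    if policy is None:
        pass
    elif policy == "always":
        if cpu is not None:
            root.remove(cpu)
        root.append(ET.Element("cpu", {"mode": "host-model"}))
    elif policy == "missing":
        if cpu is None:
            root.append(ET.Element("cpu", {"mode": "host-model"}))
    elif policy == "remove":
        if cpu is not None and cpu.get("mode") == "host-model":
            root.remove(cpu)
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    plain = "<domain type='kvm'><name>vm1</name></domain>"
    print(apply_host_model_policy(plain, "missing"))
```

In a real daemon the returned XML would be passed back to libvirt to redefine the domain, and the change would become visible in the active XML only after the next start.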
[4.3-2] 7b0a03f869 Bug #21386: Merge branch 'phahn/21386-uvmm-cpu-migrate' into 4.3-2
[4.3-2] 7d448075e1 Bug #45721 UVMM: Handle backup exception
[4.3-2] dafe54d7de Bug #45721 UVMM: Handle broken UVMM connection
[4.3-2] 925f492833 Bug #21386 UVMM: Handle connection close exception
[4.3-2] 803a932581 Bug #21386 UVMM: Close files through context
[4.3-2] e54a02d530 Bug #21386 UVMM: Add more debug
[4.3-2] 7ea8b9ffda Bug #21386 UVMM: Switch to absolute imports
[4.3-2] daf82260bf Bug #21386 UVMM: Switch to EnvironmentError
[4.3-2] b736476893 Bug #21386 UVMM: Use native logger string substitution
[4.3-2] d6adbd49e3 Bug #21386 UVMM: Fix exception printing
[4.3-2] d17927eb93 Bug #21386 UVMM: Convert legacy exception arguments
[4.3-2] c924d39cb2 Bug #21386 UVMM: Exception renaming
[4.3-2] e19284251d Bug #21386 UVMM: Code cleanup
[4.3-2] bba38d99df Bug #21386 UVMM: Fix storage exception
[4.3-2] 55d2b3ceed Bug #21386 UVMM: Assert compatible CPU during live migration
[4.3-2] c50d53dbd4 Bug #21386 UVMM: Use listAll() methods
[4.3-2] 806b90619f Bug #21386 UVMM: Handle transitioned domains
[4.3-2] 8d90c7278b Bug #21386 UVMM: Un-private _update_xml
[4.3-2] 157279c3f2 Bug #21386 UVMM: Split update_expensive into parts
[4.3-2] 81c80e1ca1 Bug #21386 UVMM: Split xml2obj into parts
[4.3-2] 81ce1a12e6 Bug #21386 UVMM: Switch to new event model
[4.3-2] 43fb33e9d2 Bug #21386 UVMM: Unify handler deregistration
[4.3-2] 63725b2dc0 Bug #21386 UVMM: Remove leftover supports_suspend|snapshot flags
[4.3-2] 2e67ce188e Bug #21386 UVMM: Remove import fallback
[4.3-2] d3421471b0 Bug #21386 UVMM: Remove unused Data_StoragePool
[4.3-2] ff69c39f63 Bug #21386 UVMM: Simplify media change detection
[4.3-2] 72f06ed1c0 Bug #21386 UVMM: Document UCRV uvmm/umc/autoupdate/interval

Package: univention-virtual-machine-manager-daemon
Version: 7.0.0-11A~4.3.0.201810021441
Branch: ucs_4.3-0
Scope: errata4.3-2

TODO: Write more documentation
[4.3-2] 17506eb5e4 Bug #21386: univention-virtual-machine-manager-daemon 7.0.0-11A~4.3.0.201810021441
 .../univention-virtual-machine-manager-daemon.yaml | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
[4.3-2] 8c3869ed43 Bug #21386 UVMM: Fix spelling mistakes in UCR variable descriptions.
 .../univention-virtual-machine-manager-daemon/debian/changelog | 6 ++++++
 ...al-machine-manager-daemon.univention-config-registry-variables | 8 ++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

Package: univention-virtual-machine-manager-daemon
Version: 7.0.0-12A~4.3.0.201810051658
Branch: ucs_4.3-0
Scope: errata4.3-2

[4.3-2] 7783f9143a Bug #21386: univention-virtual-machine-manager-daemon 7.0.0-12A~4.3.0.201810051658
 doc/errata/staging/univention-virtual-machine-manager-daemon.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
[4.3-2] 272abf1a0c Bug #21386: univention-virtual-machine-manager-daemon 7.0.0-12A~4.3.0.201810051658
 .../univention-virtual-machine-manager-daemon.yaml | 6 +--
 doc/manual/idm-cloud-en.xml | 2 +-
 doc/manual/uvmm-en.xml | 55 ++++++++++++++++++++++
 3 files changed, 59 insertions(+), 4 deletions(-)
(In reply to Philipp Hahn from comment #12)
> [4.3-2] 272abf1a0c Bug #21386: univention-virtual-machine-manager-daemon 7.0.0-12A~4.3.0.201810051658
>  .../univention-virtual-machine-manager-daemon.yaml | 6 +--
>  doc/manual/idm-cloud-en.xml | 2 +-
>  doc/manual/uvmm-en.xml | 55 ++++++++++++++++++++++
>  3 files changed, 59 insertions(+), 4 deletions(-)

Documentation moved to bug 47923
What I tested:

Migration between different CPUs
  No error if cpu model not in dom description -> OK
  Error if incompatible cpu model in dom description -> OK
  No error if compatible cpu model in dom description -> OK

uvmm/vm/cpu/host-model
  Not set -> No changes -> OK
  always -> overrides changes -> OK
  missing -> doesn't override changes -> OK
  remove -> removes host-model -> OK
  qemu process restarted in case the host-model was set -> OK

Overall UVMM functionality
  snapshots -> OK
  vnc -> OK
  network -> OK
  No regressions noticed -> OK

-> Verified
<http://errata.software-univention.de/ucs/4.3/269.html>