Bug 47934 - [4.2] Migration does not converge
Status: CLOSED INVALID
Product: UCS
Classification: Unclassified
Component: Virtualization - UVMM
Version: UCS 4.2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: UCS maintainers
QA Contact: UCS maintainers
Depends on: 47617
Blocks:
Reported: 2018-10-09 15:42 CEST by Valentin Heidelberger
Modified: 2018-10-23 10:43 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.114
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2018082021000474, 2018090421000967, 2018100821000804
Bug group (optional):
Max CVSS v3 score:


Description Valentin Heidelberger univentionstaff 2018-10-09 15:42:18 CEST
+++ This bug was initially created as a clone of Bug #47617 +++

Most of the customer's virtualization systems are still on UCS 4.2. To be able to update these systems to 4.3 smoothly, they need a backport of this bugfix for 4.2.

The live migration of a VM with 16 GiB RAM and 8 CPUs did not converge; UVMM returned an error which did not contain any specific details, probably just a timeout.

# virsh domjobinfo $DOM
Job type:         Unbounded
Time elapsed:     5499859      ms
Data processed:   530,696 GiB
Data remaining:   170,289 MiB
Data total:       12,009 GiB
Memory processed: 530,696 GiB
Memory remaining: 170,289 MiB
Memory total:     12,009 GiB
Dirty rate:       30740        pages/s
Iteration:        1695
Constant pages:   5252035
Normal pages:     138835932
Normal data:      529,617 GiB
Expected downtime: 1245         ms
Setup time:       167          ms

This is a known (QEMU) problem: <https://wiki.qemu.org/Features/AutoconvergeLiveMigration>
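
The job statistics above already suggest why pre-copy cannot finish: the guest appears to dirty memory faster than it is transferred. The following is a rough sketch of that arithmetic in Python, not part of UVMM. It assumes 4 KiB pages (which agrees with the reported "Normal pages" and "Normal data" values) and compares the momentary dirty rate with the average transfer rate over the whole job, so it is only an approximation. The helper name `parse_job_info` is made up for this illustration.

```python
import re

# Excerpt of the `virsh domjobinfo` output quoted above (decimal commas as printed).
JOB_INFO = """\
Time elapsed:     5499859      ms
Data processed:   530,696 GiB
Memory total:     12,009 GiB
Dirty rate:       30740        pages/s
Normal pages:     138835932
Normal data:      529,617 GiB
"""

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages


def parse_job_info(text):
    """Return {field name: (value, unit)}, converting decimal commas to points."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"\s*([^:]+):\s+([\d,.]+)\s*(\S*)", line)
        if match:
            name, number, unit = match.groups()
            fields[name] = (float(number.replace(",", ".")), unit)
    return fields


info = parse_job_info(JOB_INFO)

elapsed_s = info["Time elapsed"][0] / 1000
# Average transfer rate over the whole job, in MiB/s.
throughput_mib_s = info["Data processed"][0] * 1024 / elapsed_s
# Rate at which the guest dirties memory, in MiB/s.
dirty_mib_s = info["Dirty rate"][0] * PAGE_SIZE_KIB / 1024
# How many times the guest memory has effectively been sent already.
passes = info["Data processed"][0] / info["Memory total"][0]
# Cross-check of the page size assumption against "Normal data".
normal_data_gib = info["Normal pages"][0] * PAGE_SIZE_KIB / 1024**2

print(f"average transfer rate: {throughput_mib_s:.1f} MiB/s")
print(f"dirty rate:            {dirty_mib_s:.1f} MiB/s")
print(f"memory sent:           {passes:.1f} times the total")
print(f"normal data (derived): {normal_data_gib:.3f} GiB")
```

With these numbers the dirty rate comes out at roughly 120 MiB/s against an average transfer rate of about 99 MiB/s, and the guest memory has been sent more than 40 times over. As long as that relation holds, the remaining data never shrinks far enough for the final switch-over.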

Using --postcopy, the VM was migrated in ~1 minute:
# virsh migrate --domain $DOM --live --persistent --undefinesource --postcopy --postcopy-after-precopy --verbose qemu://$DEST/system

UVMMd should use --postcopy by default.
Alternative 1: --auto-converge --auto-converge-initial 20 --auto-converge-increment 10
Alternative 2: Add UCRV to make it configurable.
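
The proposals above could be combined by making the migration strategy selectable. The following Python sketch only builds the `virsh migrate` argument list for each variant; it is not how UVMMd is implemented. The function `build_migrate_command`, the strategy names and the example host name are hypothetical, and no UCR variable name is implied. The virsh options are the ones quoted in this report.

```python
import shlex

# Hypothetical strategy names mapped to the virsh options quoted in this report.
STRATEGY_FLAGS = {
    "postcopy": ["--postcopy", "--postcopy-after-precopy"],
    "auto-converge": [
        "--auto-converge",
        "--auto-converge-initial", "20",
        "--auto-converge-increment", "10",
    ],
    "precopy": [],  # plain pre-copy, no additional options
}


def build_migrate_command(domain, dest_host, strategy="postcopy"):
    """Return the argv list for a live migration using the given strategy."""
    if strategy not in STRATEGY_FLAGS:
        raise ValueError(f"unknown migration strategy: {strategy!r}")
    return [
        "virsh", "migrate",
        "--domain", domain,
        "--live", "--persistent", "--undefinesource",
        *STRATEGY_FLAGS[strategy],
        "--verbose",
        f"qemu://{dest_host}/system",
    ]


# Example with placeholder names; print the command instead of running it.
print(shlex.join(build_migrate_command("vm1", "dest.example.com")))
print(shlex.join(build_migrate_command("vm1", "dest.example.com", "auto-converge")))
```

Returning a list rather than a string keeps the arguments safe to pass to `subprocess.run` without a shell. In a real setup the strategy would come from configuration, for example the UCRV proposed in Alternative 2.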
Comment 1 Valentin Heidelberger univentionstaff 2018-10-16 17:48:52 CEST
The customer is updating manually, using virsh migrate with the necessary --postcopy parameter. They no longer need a backport. If another customer needs this, the bug can be reopened.
Comment 2 Stefan Gohmann univentionstaff 2018-10-23 10:42:43 CEST
OK
Comment 3 Stefan Gohmann univentionstaff 2018-10-23 10:43:01 CEST
Nothing to release