Bug 50092 - Failed migration blocks all future migrations
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Virtualization - KVM
Version: UCS 4.3
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-2-errata
Assigned To: Philipp Hahn
QA Contact: Julia Bremer
Depends on: 47617
Blocks:
Reported: 2019-08-30 16:16 CEST by Philipp Hahn
Modified: 2019-10-30 12:17 CET (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.143
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2019082721000414
Bug group (optional):
Max CVSS v3 score:
hahn: Patch_Available+


Description Philipp Hahn univentionstaff 2019-08-30 16:16:51 CEST
+++ This bug was initially created as a clone of Bug #47617 +++

If a migration fails, UVMM still considers it to be in progress and blocks any future migration attempt, because the check below treats every job type other than VIR_DOMAIN_JOB_NONE as an active job.

src/univention/uvmm/node.py, line 2108:
    if stats['type'] != libvirt.VIR_DOMAIN_JOB_NONE:

<https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainJobType>:
enum virDomainJobType {
  VIR_DOMAIN_JOB_NONE      = 0 (0x0)  // No job is active
  VIR_DOMAIN_JOB_BOUNDED   = 1 (0x1)  // Job with a finite completion time
  VIR_DOMAIN_JOB_UNBOUNDED = 2 (0x2)  // Job without a finite completion time
  VIR_DOMAIN_JOB_COMPLETED = 3 (0x3)  // Job has finished, but isn't cleaned up
  VIR_DOMAIN_JOB_FAILED    = 4 (0x4)  // Job hit error, but isn't cleaned up
  VIR_DOMAIN_JOB_CANCELLED = 5 (0x5)  // Job was aborted, but isn't cleaned up
  VIR_DOMAIN_JOB_LAST      = 6 (0x6)
}

FAILED, CANCELLED and probably COMPLETED should also be treated as "no migration in progress", so that only BOUNDED and UNBOUNDED count as an active job.

Patch (line 2108):
-    if stats['type'] != libvirt.VIR_DOMAIN_JOB_NONE:
+    if stats['type'] in (libvirt.VIR_DOMAIN_JOB_BOUNDED, libvirt.VIR_DOMAIN_JOB_UNBOUNDED):
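
To make the difference between the two conditions concrete, here is a minimal standalone sketch. It is not the UVMM code. The function names are hypothetical, and the job type constants are redefined locally with the values from the enum quoted above so that the snippet runs without the libvirt bindings. The only thing taken from node.py is that the check reads stats['type'].

```python
# Values copied from the virDomainJobType enum quoted above; redefined here
# (assumption for illustration) so the sketch runs without python-libvirt.
VIR_DOMAIN_JOB_NONE = 0
VIR_DOMAIN_JOB_BOUNDED = 1
VIR_DOMAIN_JOB_UNBOUNDED = 2
VIR_DOMAIN_JOB_COMPLETED = 3
VIR_DOMAIN_JOB_FAILED = 4
VIR_DOMAIN_JOB_CANCELLED = 5

# Job types that mean "a job is still running".
ACTIVE_JOB_TYPES = (VIR_DOMAIN_JOB_BOUNDED, VIR_DOMAIN_JOB_UNBOUNDED)


def migration_in_progress_old(stats):
    """Old condition (hypothetical wrapper): anything but NONE counts as running."""
    return stats['type'] != VIR_DOMAIN_JOB_NONE


def migration_in_progress_new(stats):
    """Patched condition (hypothetical wrapper): only BOUNDED/UNBOUNDED count as running."""
    return stats['type'] in ACTIVE_JOB_TYPES


if __name__ == '__main__':
    # Compare both conditions for a job that failed but was not cleaned up.
    failed = {'type': VIR_DOMAIN_JOB_FAILED}
    print('old check blocks new migration:', migration_in_progress_old(failed))
    print('new check blocks new migration:', migration_in_progress_new(failed))
```

With the old condition, a job left in FAILED, CANCELLED or COMPLETED state is reported as still running; with the patched condition it is not, while BOUNDED and UNBOUNDED jobs are still reported as running.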
Comment 1 Philipp Hahn univentionstaff 2019-09-17 16:42:50 CEST
<git:phahn/50092_uvmm-migrate> = <https://git.knut.univention.de/univention/ucs/commits/phahn/50092_uvmm-migrate>
contains some more migration-related fixes/improvements which happened while debugging this in my environment.

[4.4-1] 2b3de5e395 Bug #50092 UVMM: Improve debugging output
[4.4-1] 28723e305b Bug #50092 UVMM: Ignore certain errors
[4.4-1] 987220f317 Bug #50092 UVMM: Allow to migrate VMs again after failure
[4.4-1] c8337fad12 Bug #50092 UVMM: Debug PostCopy switch
[4.4-1] 5cc3751ff1 Bug #35122,Bug #4535,Bug #29965 UVMM: Document ucrv:uvmm/umc/autosearch
Comment 2 Philipp Hahn univentionstaff 2019-10-23 09:37:47 CEST
[4.4-2] abc1b5ccb3 Bug #50092 UVMM: Improve debugging output
 .../src/univention/uvmm/node.py                    | 111 +++++++++++++--------
 1 file changed, 69 insertions(+), 42 deletions(-)

[4.4-2] a1f62d55a0 Bug #50092 UVMM: Ignore certain errors
 .../src/univention/uvmm/node.py                    | 38 ++++++++++++++--------
 1 file changed, 25 insertions(+), 13 deletions(-)

[4.4-2] 9a1b9b8812 Bug #50092 UVMM: Debug PostCopy switch
 .../src/univention/uvmm/node.py                                    | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

[4.4-2] 17feb5e9d2 Bug #50092 uvmm: Handle failed migration
 .../univention-virtual-machine-manager-daemon/debian/changelog      | 6 ++++++
 .../src/univention/uvmm/node.py                                     | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

Package: univention-virtual-machine-manager-daemon
Version: 8.0.1-11A~4.4.0.201910221522
Branch: ucs_4.4-0
Scope: errata4.4-2

[4.4-2] dc5171d000 Bug #50092: univention-virtual-machine-manager-daemon 8.0.1-11A~4.4.0.201910221522
 .../staging/univention-virtual-machine-manager-daemon.yaml     | 10 ++++++++++
 1 file changed, 10 insertions(+)

QA: I have not yet been able to reproduce the FAILED state myself.
Comment 3 Julia Bremer univentionstaff 2019-10-29 14:05:21 CET
Migration still works: OK
Code review: OK
Yaml: OK

Verified

Note: Failed state could not be reproduced.
Comment 4 Erik Damrose univentionstaff 2019-10-30 12:17:34 CET
<http://errata.software-univention.de/ucs/4.4/326.html>