Bug 45117 - qemu-1.1-2 live migration fails - missing 'kvmclock: Ensure time in migration never goes backward'
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Virtualization - KVM
Version: UCS 4.1
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.1-4-errata
Assigned To: Philipp Hahn
QA Contact: Erik Damrose
URL: https://patchwork.ozlabs.org/patch/34...
Depends on:
Blocks:
Reported: 2017-08-01 17:58 CEST by Philipp Hahn
Modified: 2017-08-23 14:59 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 7: Crash: Bug causes crash or data loss
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.400
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2017073121000411
Bug group (optional):
Max CVSS v3 score:
hahn: Patch_Available+


Description Philipp Hahn univentionstaff 2017-08-01 17:58:37 CEST
A wrong kvmclock value is migrated, which leaves the VM stuck after migration while it waits for kvmclock to catch up.
Comment 1 Philipp Hahn univentionstaff 2017-08-01 18:22:51 CEST
r17637 | Bug #45117 kvmclock: Ensure time in migration never goes

Package: qemu-kvm
Version: 1.1.2+dfsg-6.54.201708011801
Branch: ucs_4.1-0
Scope: errata4.1-4

r81665 | Bug #45117 kvmclock: Ensure time in migration never goes YAML

QA: I have no test case myself, but the linked discussion gives some hints:
 start a VM
 keep it running for some time
 save-to-disk
 restore-from-disk
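
A minimal sketch of the steps above, assuming they map to `virsh start`, `virsh save` and `virsh restore` on the host. The helper names, the domain name, the state file path and the wait time are placeholders, not taken from the linked discussion.

```python
import subprocess
import time


def repro_commands(domain, state_file):
    """Return the virsh invocations for one save/restore cycle.

    Assumption: save-to-disk / restore-from-disk correspond to
    `virsh save` and `virsh restore`.
    """
    return [
        ["virsh", "start", domain],
        ["virsh", "save", domain, state_file],
        ["virsh", "restore", state_file],
    ]


def run_repro(domain, state_file, uptime_seconds=600):
    """Start the VM, let it run for a while, then save and restore it."""
    start, save, restore = repro_commands(domain, state_file)
    subprocess.run(start, check=True)
    time.sleep(uptime_seconds)  # "keep it running for some time"
    subprocess.run(save, check=True)
    subprocess.run(restore, check=True)
```

Calling `run_repro("testvm", "/var/tmp/testvm.state")` with a hypothetical domain would run the cycle once. Whether the guest is stuck afterwards still has to be checked by hand, for example on its console.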
From my reading, the bug stems from the KVM clock being stored twice: once inside the guest kernel and once in the KVM device migration stream. If the guest's view is later than the host's view, the VM gets stuck.
I will do some testing tomorrow.
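
To illustrate that reading, here is a toy model in Python. It is not the QEMU code and not the patch itself. The function names and the nanosecond values are made up for illustration, and it assumes the guest simply waits until the restored clock passes the last value it has already seen.

```python
def stall_ns(guest_last_seen_ns, restored_clock_ns):
    """Nanoseconds the guest would wait until the restored clock
    passes the last value it has already observed (toy model)."""
    return max(0, guest_last_seen_ns - restored_clock_ns)


def clock_to_migrate(host_view_ns, guest_view_ns):
    """Pick a clock value that is never behind the guest's own view,
    so time does not go backward across the migration (toy model)."""
    return max(host_view_ns, guest_view_ns)


# Host view lags the guest view by 100 ns: the guest has to wait.
assert stall_ns(1000, 900) == 100

# Migrating a value that is never behind the guest view avoids the wait.
assert stall_ns(1000, clock_to_migrate(900, 1000)) == 0
```

In this model the stall grows with the gap between the two views, which would fit the hint that the VM has to keep running for some time before the save.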
Comment 2 Erik Damrose univentionstaff 2017-08-15 13:27:58 CEST
Reopen:
I fixed a typo in the yaml in r82113

We could not reproduce the problem with the old qemu-kvm version. Updating to the patched version showed no issue with live migration, either. But I found
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786789

There it seems that further patches are required on top of the original one used here. Ultimately it was not fixed in jessie.

We could apply the additional patches, but I do not know the risk of these changes.
Comment 3 Philipp Hahn univentionstaff 2017-08-15 20:47:21 CEST
(In reply to Erik Damrose from comment #2)
> Reopen:
> I fixed a typo in the yaml in r82113

OK

> We could not reproduce the problem with the old qemu-kvm version. Updating
> to the patched version showed no issue with live migration, either. But I
> found
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=786789
> 
> There it seems that further patches are required on top of the
> original one used here. Ultimately it was not fixed in jessie.
> 
> We could apply the additional patches, but I do not know the risk of these
> changes.

r17658 | Bug #45117 kvmclock: add a new function to update
 Backport of <https://git.qemu.org/gitweb.cgi?p=qemu.git;a=commit;h=0fd7e098db30e302d27920487f0afec33be8982a>, which reverts 1154d84dcc + 317b0a6d8b

Package: qemu-kvm
Version: 1.1.2+dfsg-6.55.201708152029
Branch: ucs_4.1-0
Scope: errata4.1-4

r82159 | Bug #45117 kvmclock: add a new function to update env->tsc.
 qemu-kvm.yaml
Comment 4 Erik Damrose univentionstaff 2017-08-16 16:44:59 CEST
Still a bit risky, but better than before. In the internal tests, no issues occurred while migrating between the old and new qemu-kvm versions in either direction, or between two hosts running the new version.
OK: patches applied
OK: yaml
Verified
Comment 5 Arvid Requate univentionstaff 2017-08-23 14:59:11 CEST
<http://errata.software-univention.de/ucs/4.1/472.html>