Bug 33966 - Add a script which periodically checks/restarts libvirt on uvmm-nodes
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Virtualization - UVMM
Version: UCS 3.2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 3.2-1-errata
Assigned To: Erik Damrose
QA Contact: Philipp Hahn
Depends on:
Blocks: 35069 35070 36605
Reported: 2014-01-22 11:51 CET by Erik Damrose
Modified: 2014-11-12 16:08 CET
Description Erik Damrose univentionstaff 2014-01-22 11:51:37 CET
In Bug #33741, the script uvmmd-check, which checks whether uvmmd is running, was fixed. That bug also mentioned developing a script that checks whether libvirt is running on UVMM nodes. We should decide whether to introduce such a script. Original comment:

In addition to that, a similar script might be used on the virtualization hosts to check whether libvirtd is still working:

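#!/bin/bash
# Check for functional libvirtd
# Append all output of this script to the UVMM error log.
exec >>"/var/log/univention/virtual-machine-manager-daemon-errors.log" 2>&1
# Select the libvirt URI for the local hypervisor; exit quietly if there is none.
if [ -f /proc/xen/privcmd ]
then
  uri="xen+unix:///"
elif [ -c /dev/kvm ]
then
  uri="qemu:///system"
else
  exit 0
fi
# Only probe libvirtd while runit reports the service as running.
if sv status /etc/runit/univention-libvirt | grep -q '^run'
then
  tmpfile="$(mktemp)"
  trap "rm -f '$tmpfile'" EXIT
  # Query libvirtd in the background and allow it a limited time to answer.
  virsh -c "$uri" list >"$tmpfile" &
  pid=$!
  sleep "${uvmm_check_timeout:-5}s"
  # No output within the timeout: treat libvirtd as hung and restart it.
  if [ ! -s "$tmpfile" ]
  then
    kill "$pid"
    wait "$pid"
    invoke-rc.d libvirt-bin restart
  fi
fi
exit 0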
Comment 1 Tim Petersen univentionstaff 2014-03-12 14:38:54 CET
In the meantime I have quite often seen the process hang in a futex, so the script should use "kill -9".
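
One possible way to get there without always going straight to SIGKILL is to try SIGTERM first and escalate only if the process is still around after a short grace period. The following is a rough sketch, not the committed code; the function name stop_hung_process and the grace-period argument are made up for illustration.

```shell
#!/bin/bash
# Sketch only: stop_hung_process is a hypothetical helper, not part of the UCS package.
# Usage: stop_hung_process PID [GRACE_SECONDS]
stop_hung_process () {
  local pid grace waited
  pid="$1"
  grace="${2:-2}"
  waited=0
  # Ask politely first; if the process is already gone there is nothing to do.
  kill "$pid" 2>/dev/null || return 0
  # Give it up to $grace seconds to exit on SIGTERM.
  while [ "$waited" -lt "$grace" ] && kill -0 "$pid" 2>/dev/null
  do
    sleep 1
    waited=$((waited + 1))
  done
  # Still alive: SIGKILL cannot be caught or ignored.
  if kill -0 "$pid" 2>/dev/null
  then
    kill -9 "$pid" 2>/dev/null
  fi
  # Reap the child; this only works for children of the calling shell.
  wait "$pid" 2>/dev/null
  return 0
}
```

In the script from the description this would stand in for the "kill $pid; wait $pid" pair, assuming virsh was started in the background by the same shell.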
Comment 2 Erik Damrose univentionstaff 2014-03-14 14:55:34 CET
* Add libvirt-check.sh to virtualization nodes. It periodically tests if
  libvirtd responds and restarts libvirtd if necessary. (Bug #33966)
* Change to libvirt-bin initscript: use sv force-stop and sv force-restart

r48559 univention-virtual-machine-manager-node 2.0.5-3.74.201403141452
r48560 2013-03-14-univention-virtual-machine-manager-node.yaml
Comment 3 Philipp Hahn univentionstaff 2014-03-17 09:00:09 CET
OK: aptitude install '?source-package(univention-virtual-machine-manager-node)?version(2.0.5-2.73.201311011956)?installed'

FAIL: "Restarting UCS libvirt daemon: libvirtdok: run: univention-libvirt: (pid 6123) 0s, normally down" is printed to STDOUT (and mailed to root)

OK: pkill -STOP libvirtd

OK: gdb -p `pgrep libvirtd`

FAIL:
> if [ -f /proc/xen/privcmd ]
Please add the following additional test as the existence of /proc/xen/privcmd is not sufficient:
  && grep -q control_d /proc/xen/capabilities

FAIL:
> echo "libvirt-check.sh: No hypervisor found, exiting" >>"$logfile"
That message is appended every 2 minutes by default. Please remove it.

RFA: please indent the 2nd line of the first item in the list by 2 spaces.
Comment 4 Erik Damrose univentionstaff 2014-03-17 11:00:29 CET
(In reply to Philipp Hahn from comment #3)
> FAIL: "Restarting UCS libvirt daemon: libvirtdok: run: univention-libvirt:
> (pid 6123) 0s, normally down" is printed to STDOUT (and mailed to root)

Fixed: The message is redirected.
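
For reference, cron mails whatever a job writes to stdout or stderr, so the restart has to run with both streams sent to the log. A minimal sketch of that idea follows; run_logged is a hypothetical helper name and not necessarily how the committed script does it.

```shell
#!/bin/bash
# Sketch only: run_logged is a hypothetical helper, not part of the UCS package.
# It runs a command with stdout and stderr appended to a log file, so a cron job
# calling it prints nothing and cron has nothing to mail.
run_logged () {
  local logfile
  logfile="$1"
  shift
  "$@" >>"$logfile" 2>&1
}

# Hypothetical usage inside the check script:
# run_logged /var/log/univention/virtual-machine-manager-daemon-errors.log \
#   invoke-rc.d libvirt-bin restart
```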

> FAIL:
> > if [ -f /proc/xen/privcmd ]
> Please add the following additional test as the existence of
> /proc/xen/privcmd is not sufficient:
>   && grep -q control_d /proc/xen/capabilities

Fix: Check added
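
The combined detection could look roughly like the sketch below. The function name detect_libvirt_uri and its optional path arguments are assumptions for illustration (they make the logic testable outside a real host); the default paths and URIs are the ones quoted in this bug.

```shell
#!/bin/bash
# Sketch only: detect_libvirt_uri is a hypothetical helper, not the committed code.
# Usage: detect_libvirt_uri [PRIVCMD_PATH] [CAPABILITIES_PATH] [KVM_DEVICE]
# Prints the libvirt URI for the local hypervisor, or returns 1 if none is found.
detect_libvirt_uri () {
  local privcmd caps kvm
  privcmd="${1:-/proc/xen/privcmd}"
  caps="${2:-/proc/xen/capabilities}"
  kvm="${3:-/dev/kvm}"
  # privcmd alone is not sufficient: also require control_d in the capabilities.
  if [ -f "$privcmd" ] && grep -q control_d "$caps" 2>/dev/null
  then
    echo "xen+unix:///"
  elif [ -c "$kvm" ]
  then
    echo "qemu:///system"
  else
    # No supported hypervisor; stay silent so nothing is logged on every run.
    return 1
  fi
}
```

A caller would then do something like: uri="$(detect_libvirt_uri)" || exit 0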

> FAIL:
> > echo "libvirt-check.sh: No hypervisor found, exiting" >>"$logfile"
> That message is appended every 2 minutes by default. Please remove it.

Fix: Removed

> RFA: please indent the 2nd line of the first item in the list by 2 spaces.

Indentation corrected. In addition, the filename has been fixed.

r48567 univention-virtual-machine-manager-node 2.0.5-4.75.201403171004
r48568 2014-03-14-univention-virtual-machine-manager-node.yaml
Comment 5 Philipp Hahn univentionstaff 2014-03-17 14:22:32 CET
OK: 2.0.5-4.75.201403171004
OK: gdb
FIXED: yaml r48581
OK: announce_errata -V 2014-03-14-univention-virtual-machine-manager-node.yaml
OK: kvm
OK: xen
Comment 6 Moritz Muehlenhoff univentionstaff 2014-04-03 14:15:24 CEST
http://errata.univention.de/ucs/3.2/79.html