Univention Bugzilla – Bug 31370
connections not closed when 2.4 certificate expired
Last modified: 2023-06-28 10:46:24 CEST
Ticket#: 2013050821001911

Several servers had expired SSL certificates. UVMMd was no longer able to connect to them.

> WARNING - 'xen://XXX/' broken? ... Failed to verify peer's certificate

Some time after a restart, all open file descriptors were used up:

> WARNING - 'xen://XXX/' broken? ... Failed to open file '/etc/libvirt/libvirt.conf': Too many open files

> lsof -p $(</etc/runit/univention-virtual-machine-manager-daemon/supervise/pid)|awk '/TCP.*->/{split($9,a,/->/);print a[2]}'|sort|uniq -cd
      4 XXX85.phahn.dev:16514

showed many open connections to those failed hosts.

I was able to reproduce this locally:

1. The Virt-Server must be a UCS-2.4-4 system, since the newer libvirtd from UCS-3.1 refuses to start when the certificate is expired.

2. A plain loop is not sufficient:

>>> import libvirt
>>> from time import sleep
>>> while True:
...   try:
...     c = libvirt.open('xen://olb85.phahn.dev/')
...     print c
...   except libvirt.libvirtError, ex:
...     print ex
...   sleep(15)

but running a UVMMd reproduces the leak. I think it is related to the event loop implementation which UVMMd uses: there the connection is registered but not released on errors.
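For repeated checks, the per-host count produced by the lsof pipeline can also be computed in Python. The following is only a minimal sketch: the function name `count_remote_peers` is made up for illustration, and it assumes the default lsof column layout, in which column 8 (NODE) reads `TCP` and column 9 (NAME) has the form `local->remote`. That is slightly stricter than the awk pattern `/TCP.*->/`.

```python
from collections import Counter


def count_remote_peers(lsof_lines, only_duplicates=True):
    """Count TCP connections per remote endpoint in `lsof -p PID` output.

    Assumes the default lsof columns: field 8 is NODE ("TCP" for TCP
    sockets) and field 9 is NAME ("local->remote" for connected sockets).
    """
    counts = Counter()
    for line in lsof_lines:
        fields = line.split()
        if len(fields) < 9:
            continue
        # Skip everything that is not a connected TCP socket.
        if fields[7] != "TCP" or "->" not in fields[8]:
            continue
        # Same as awk's split($9, a, /->/); print a[2]
        remote = fields[8].split("->", 1)[1]
        counts[remote] += 1
    if only_duplicates:
        # Mirrors `uniq -d`: report only endpoints seen more than once.
        return {peer: n for peer, n in counts.items() if n > 1}
    return dict(counts)
```

Feeding it the lines of the `lsof -p ...` output (for example from `subprocess.check_output(...).decode().splitlines()`) should give one entry per remote host and port; a steadily growing count for the same endpoint hints at the leak described here.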
Some candidate patches from Git to src/rpc/virnetclient.c:

* e5a1bee07a1a50c1b9819c2ee805294e2affdc80 Ensure client is marked for close in all error paths
  Looks most promising.

* 0f7f4b160b3a568789817ff3e9c1196877cc4fbb Add callback to virNetClient to be invoked on connection close
  That looks promising, since in this error the TLS socket is not closed properly. I think that patch mostly adds infrastructure, which is not used.

* e10e1969d51f07cc2a5d47a59506c73461423ad9 Turn virNetTLSContext and virNetTLSSession into virObject instances
  There was a big rewrite in libvirt, which introduced a scheme for reference counting. Backporting that to 0.9.12 would require much work; I'd recommend an update of libvirt instead.
FYI: /etc/libvirt/libvirtd.conf → max_clients = 20 limits the number of connections to 20, so after one night UVMM only had 15 connections open per server. For our customer with lots of broken servers, that became too much:

  20 expired servers * 15 connections = 300 open TCP connections
  + twice as many pipes for libvirt-internal use
  + regular files
  > 750 max open files
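The estimate can be checked with a few lines of Python. This is a rough sketch only: the function name `estimate_open_fds` is hypothetical, the factor of two pipes per TCP connection is taken from the comment above, and regular files are left out because no number is given for them.

```python
def estimate_open_fds(servers, conns_per_server, pipes_per_conn=2):
    """Rough lower bound on descriptors held for leaked connections.

    pipes_per_conn=2 reflects "twice as many pipes" from the report;
    regular files are not counted here.
    """
    tcp = servers * conns_per_server
    pipes = tcp * pipes_per_conn
    return tcp + pipes


FD_LIMIT = 750  # "max open files" as quoted in the report

total = estimate_open_fds(servers=20, conns_per_server=15)
print(total, "descriptors vs. limit", FD_LIMIT, "->",
      "exceeded" if total > FD_LIMIT else "ok")
```

Even without regular files, the TCP sockets and pipes alone come to more than the quoted limit.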
As UCS-2.4 is out of maintenance, the first part is resolved. The second part was resolved in UVMMd with Bug #33458, where the event loop implementation was switched from the broken plain-Python variant to the fixed C variant.

*** This bug has been marked as a duplicate of bug 33458 ***