Bug 31370

Summary: connections not closed when 2.4 certificate expired
Product: UCS
Reporter: Philipp Hahn <hahn>
Component: Virtualization - UVMM
Assignee: Philipp Hahn <hahn>
Status: CLOSED DUPLICATE
QA Contact:
Severity: normal    
Priority: P5    
Version: UCS 3.1   
Target Milestone: ---   
Hardware: Other   
OS: Linux   
See Also: https://forge.univention.org/bugzilla/show_bug.cgi?id=33458
Ticket number:
Bug group (optional): Troubleshooting
Max CVSS v3 score:
Bug Depends on: 31371    
Bug Blocks:    

Description Philipp Hahn univentionstaff 2013-05-15 22:30:10 CEST
Ticket#: 2013050821001911

Several servers had expired SSL certificates.
UVMMd was no longer able to connect to them.
> WARNING - 'xen://XXX/' broken? ... Failed to verify peer's certificate

Some time after a restart, all available file descriptors were used up:
> WARNING - 'xen://XXX/' broken? ... Failed to open file '/etc/libvirt/libvirt.conf': Too many open files

> lsof -p $(</etc/runit/univention-virtual-machine-manager-daemon/supervise/pid)|awk '/TCP.*->/{split($9,a,/->/);print a[2]}'|sort|uniq -cd
      4 XXX85.phahn.dev:16514
showed many open connections to those failed hosts.
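
The same per-host count can be sketched in Python, for example when the lsof output has been saved to a file. This is only a sketch: the function name and the sample lines are made up for illustration, and it assumes the usual lsof layout where a connected TCP socket is shown as 'local->remote' in the NAME column.

```python
import re
from collections import Counter


def count_remote_endpoints(lsof_lines):
    """Count connected TCP sockets per remote endpoint (hypothetical helper)."""
    counts = Counter()
    for line in lsof_lines:
        # Connected TCP sockets appear as 'TCP local:port->remote:port'.
        match = re.search(r"TCP\s+\S+->(\S+)", line)
        if match:
            counts[match.group(1)] += 1
    # Like 'uniq -cd': report only endpoints that occur more than once.
    return {peer: n for peer, n in counts.items() if n > 1}


# Made-up sample input, not taken from the affected system.
sample = [
    "python 4242 root 10u IPv4 1001 0t0 TCP uvmm.example:40001->XXX85.phahn.dev:16514 (ESTABLISHED)",
    "python 4242 root 11u IPv4 1002 0t0 TCP uvmm.example:40002->XXX85.phahn.dev:16514 (ESTABLISHED)",
    "python 4242 root 12u IPv4 1003 0t0 TCP uvmm.example:40003->XXX85.phahn.dev:16514 (ESTABLISHED)",
    "python 4242 root 13u IPv4 1004 0t0 TCP uvmm.example:40004->XXX85.phahn.dev:16514 (ESTABLISHED)",
    "python 4242 root 14u IPv4 1005 0t0 TCP uvmm.example:40005->other.example:16514 (ESTABLISHED)",
    "python 4242 root 3r REG 8,1 512 77 /etc/libvirt/libvirt.conf",
]
duplicates = count_remote_endpoints(sample)
```

A host that shows up with a steadily growing count between two runs is a candidate for the leak described here.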


I was able to reproduce this locally:

1. The Virt-Server must be a UCS-2.4-4 system, since the newer libvirtd from UCS-3.1 refuses to start when the certificate is expired.

2. A plain loop is not sufficient:
>>> import libvirt
>>> from time import sleep
>>> while True:
...   try:
...     c = libvirt.open('xen://olb85.phahn.dev/')
...     print c
...   except libvirt.libvirtError, ex:
...     print ex
...     sleep(15)
but running UVMMd reproduces the leak. I think it is related to the event loop implementation that UVMMd uses: there the connection is registered but not released on errors.
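
To illustrate the suspected pattern (a handle is registered with the event loop and never released when the handshake fails), here is a toy model. It is not the UVMMd or libvirt code: ToyEventLoop, HandshakeError, connect_leaky and connect_safe are all hypothetical names, and plain integers stand in for file descriptors.

```python
class ToyEventLoop(object):
    """Hypothetical stand-in for an event loop's table of registered handles."""

    def __init__(self):
        self.handles = set()

    def add_handle(self, fd):
        self.handles.add(fd)

    def remove_handle(self, fd):
        self.handles.discard(fd)


class HandshakeError(Exception):
    """Stands in for the 'Failed to verify peer's certificate' error."""


def failing_handshake(fd):
    # Always fails, like a connection to a host with an expired certificate.
    raise HandshakeError("Failed to verify peer's certificate")


def connect_leaky(loop, fd):
    loop.add_handle(fd)
    failing_handshake(fd)  # raises, and fd stays registered


def connect_safe(loop, fd):
    loop.add_handle(fd)
    try:
        failing_handshake(fd)
    except HandshakeError:
        loop.remove_handle(fd)  # release the handle on the error path
        raise


def leftover_handles(connect, attempts):
    """Run failing connection attempts and count handles still registered."""
    loop = ToyEventLoop()
    for fd in range(attempts):
        try:
            connect(loop, fd)
        except HandshakeError:
            pass
    return len(loop.handles)
```

In this model every failed attempt through connect_leaky leaves one handle behind, while connect_safe leaves none. That matches the symptom of slowly accumulating connections, but it does not prove where the real leak is.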
Comment 1 Philipp Hahn univentionstaff 2013-05-16 08:27:11 CEST
Some candidate patches from GIT to src/rpc/virnetclient.c:

* e5a1bee07a1a50c1b9819c2ee805294e2affdc80
  Ensure client is marked for close in all error paths
  Looks most promising.

* 0f7f4b160b3a568789817ff3e9c1196877cc4fbb
  Add callback to virNetClient to be invoked on connection close
  That looks promising, since in this error case the TLS socket is not closed properly.
  However, I think that patch mostly adds infrastructure which is not yet used.

* e10e1969d51f07cc2a5d47a59506c73461423ad9
  Turn virNetTLSContext and virNetTLSSession into virObject instances
  There was a big rewrite in libvirt which introduced a scheme for reference counting. Backporting that to 0.9.12 would require a lot of work; I'd recommend an update of libvirt instead.
Comment 2 Philipp Hahn univentionstaff 2013-05-16 08:31:22 CEST
FYI: /etc/libvirt/libvirtd.conf → max_clients = 20
limits the number of connections to 20, so after one night UVMM had only 15 connections open per server. For our customer with lots of broken servers, that became too much: 20 expired servers * 15 connections = 300 open TCP connections + twice as many pipes for libvirt internal usage + regular files > 750 max open files
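
The estimate above, written out as a small calculation. The figures are the ones quoted in this comment; the variable names are only for illustration, and regular files are left out because no number is given for them.

```python
# Figures quoted in the comment above.
expired_servers = 20
connections_per_server = 15
pipes_per_connection = 2   # "twice as many" pipes as TCP connections
max_open_files = 750

tcp_connections = expired_servers * connections_per_server
pipes = tcp_connections * pipes_per_connection

# Regular files would come on top of this; no count is given for them.
descriptors_without_regular_files = tcp_connections + pipes
over_limit = descriptors_without_regular_files > max_open_files
```

Even without counting regular files, TCP connections and pipes together already exceed the quoted maximum.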
Comment 3 Philipp Hahn univentionstaff 2014-03-12 14:43:30 CET
As UCS-2.4 is out of maintenance, the first part is resolved.
The second part was resolved in UVMMd with Bug #33458, where the event loop implementation was switched from the broken plain-Python variant to the fixed C variant.

*** This bug has been marked as a duplicate of bug 33458 ***