Univention Bugzilla – Bug 39909
UMC-Server: crashes if file limits are reached
Last modified: 2020-11-05 19:07:05 CET
12.11.15 13:30:06.297 ACL ( PROCESS ) : Allowed UMC operations: 12.11.15 13:30:06.297 ACL ( PROCESS ) : User | Host | Flavor | Command | Options 12.11.15 13:30:06.297 ACL ( PROCESS ) : ****************************************************************************** 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | appcenter/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | apps/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | diagnostic/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | ip/change | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | join/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | lib/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | lib/server/* | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | lib/server/reboot | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | lib/server/shutdown | {} 12.11.15 13:30:06.297 ACL ( PROCESS ) : False | zeus | * | lib/sso/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | mrtg/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | passwordchange/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | pkgdb/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | printers/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | quota/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | reboot/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | services/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | setup/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | sysinfo/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | top/* | {} 12.11.15 13:30:06.298 ACL ( PROCESS ) : False | zeus | * | ucr/* | {} 12.11.15 13:30:06.299 ACL ( PROCESS ) : False | zeus | * | udm/* | {} 12.11.15 13:30:06.299 ACL ( PROCESS ) : False | zeus | * | udm/request_new_license | {} 12.11.15 13:30:06.299 ACL ( PROCESS ) : False | zeus | * | updater/* | {} 12.11.15 13:30:06.299 ACL ( PROCESS ) : True | zeus | * | udm/* | {} 12.11.15 13:30:06.299 ACL ( PROCESS ) : 12.11.15 13:30:06.376 MAIN ( PROCESS ) : running: ['/usr/sbin/univention-management-console-module', '-m', 'udm', '-s', '/var/run/univention-management-console/24495-1447331406376.socket', '-d', '2', '-l', 'de_DE.UTF-8'] 12.11.15 13:30:06.426 MODULE ( PROCESS ) : Die Ausführung des Kommandos udm/get users/user ist fehlgeschlagen: Traceback (most recent call last): File "/usr/lib/pymodules/python2.7/univention/management/console/base.py", line 282, in execute function(self, request) File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/session.py", line 785, in handle_request_command mod_proc = ModuleProcess(module_name, debug=MODULE_DEBUG_LEVEL, locale=self.i18n.locale) File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/session.py", line 153, in __init__ self.__pid = self.__process.start() File "/usr/lib/pymodules/python2.7/notifier/popen.py", line 135, in start self.pid = subprocess.Popen( cmd, shell = self._shell ).pid File "/usr/lib/python2.7/subprocess.py", line 679, in __init__ errread, errwrite) File "/usr/lib/python2.7/subprocess.py", line 1145, in _execute_child errpipe_read, errpipe_write = self.pipe_cloexec() File "/usr/lib/python2.7/subprocess.py", line 1101, in pipe_cloexec r, w = os.pipe() OSError: [Errno 24] Zu viele offene Dateien 12.11.15 13:30:06.829 MAIN ( ERROR ) : Traceback (most recent call last): File "/usr/sbin/univention-management-console-server", line 212, in <module> File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 186, in do_action File "/usr/sbin/univention-management-console-server", line 144, in _restart File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 131, in _start File "/usr/sbin/univention-management-console-server", line 194, in run File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 283, in loop File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 270, in step File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 152, in _receive File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 193, in _handle File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 426, in reload File "/usr/lib/pymodules/python2.7/univention/management/console/module.py", line 354, in load for filename in os.listdir(Manager.DIRECTORY): OSError: [Errno 24] Zu viele offene Dateien: '/usr/share/univention-management-console/modules'
Das Problem tritt erst auf wenn mehrere hundert oder tausend Verbindungen nacheinander hergestellt werden.
Danke für das Feedback. Wenn ich das richtig sehe stürzt der UMC-Dienst komplett ab in dem Fall? Oder werden die Anfragen nur nicht beantwortet?
Der Dienst ist danach komplett weg. Wenn er sich wenigsten neu starten würde...
(In reply to Andreas Lemke from comment #3) > Der Dienst ist danach komplett weg. Wenn er sich wenigsten neu starten > würde... This has also been requested at Bug 33427.
Move to 4.0-5-errata.
Tagging to UCS 4.1-1 errata. Playing around with some different max open file limits reveals crashes of the UMC-Server in various places. By default the UMC-Server has 8 opened file descriptors. PID=$(pgrep -f 'python.*console-server'); ls "/proc/$PID/fd/" | wc -l; 8 grep files "/proc/$PID/limits" Max open files 1024 4096 files With a limit of the current number of opened fds a crash during accepting new socket connections occur. echo 'import resource; resource.setrlimit(resource.RLIMIT_NOFILE, (8, 8))' > /tmp/payload; pyrasite "$PID" /tmp/payload Traceback (most recent call last): File "/usr/sbin/univention-management-console-server", line 235, in <module> File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 186, in do_action File "/usr/sbin/univention-management-console-server", line 153, in _restart File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 131, in _start File "/usr/sbin/univention-management-console-server", line 203, in run File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 286, in loop File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 273, in step File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 404, in _connection File "/usr/lib/python2.7/socket.py", line 202, in accept error: [Errno 24] Zu viele offene Dateien With a limit of 2 + the number of opened file descriptors PAM fails to authenticate the user in a new socket connection: echo 'import resource; resource.setrlimit(resource.RLIMIT_NOFILE, (10, 10))' > /tmp/payload; pyrasite "$PID" /tmp/payload Traceback (most recent call last): File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 151, in _receive File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 202, in _handle File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/session.py", line 134, in authenticate File "/usr/lib/pymodules/python2.7/univention/management/console/auth.py", line 99, in authenticate File "/usr/lib/pymodules/python2.7/univention/management/console/pam.py", line 118, in __init__ File "/usr/lib/pymodules/python2.7/univention/management/console/pam.py", line 165, in init error: ('Critical error - immediate abort', 26) The signal handler for SIGHUP and SIGUSR1 are causing a crash during reloading modules/categories: kill -10 "$PID"; kill -1 "$PID" Traceback (most recent call last): File "/usr/sbin/univention-management-console-server", line 226, in <module> File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 186, in do_action File "/usr/lib/pymodules/python2.7/daemon/runner.py", line 131, in _start File "/usr/sbin/univention-management-console-server", line 203, in run File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 286, in loop File "/usr/lib/pymodules/python2.7/notifier/nf_generic.py", line 219, in step File "/usr/sbin/univention-management-console-server", line 212, in signal_user1 File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 428, in reload File "/usr/lib/pymodules/python2.7/univention/management/console/module.py", line 390, in load OSError: [Errno 24] Zu viele offene Dateien: '/usr/share/univention-management-console/modules' The same is the case when having limit = open_fd + 1 and a new connection with authentication comes in: Traceback (most recent call last): File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 151, in _receive File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 200, in _handle File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/server.py", line 436, in reload File "/usr/lib/pymodules/python2.7/univention/management/console/module.py", line 390, in load OSError: [Errno 24] Zu viele offene Dateien: '/usr/share/univention-management-console/modules The hard limit of maximum opened files has been raised to 64512. All places where the server could crash it now either sends the traceback to the client or closes the socket connection. univention-management-console.yaml: r68871 | YAML Bug #39909 univention-management-console (8.0.28-14): r68870 | Bug #39909: autopep8 r68869 | Bug #39909: enhance robustness of UMC-Server during request handling. Raise limit of max open files to a higher value.
Still get a traceback in umc after PID=$(pgrep -f 'python.*console-server'); echo 'import resource; resource.setrlimit(resource.RLIMIT_NOFILE, (8, 8))' > /tmp/payload; pyrasite "$PID" /tmp/payload kill -10 "$PID"; kill -1 "$PID" 29.04.16 15:16:55.064 MODULE ( PROCESS ) : Die Ausführung des Kommandos setup/load network ist fehlgeschlagen: Traceback (most recent call last): File "/usr/lib/pymodules/python2.7/univention/management/console/base.py", line 283, in execute File "/usr/lib/pymodules/python2.7/univention/management/console/protocol/session.py", line 791, in handle_request_command File "/usr/lib/pymodules/python2.7/univention/management/console/module.py", line 390, in load OSError: [Errno 24] Zu viele offene Dateien: '/usr/share/univention-management-console/modules' OK - new limit of umc server
(In reply to Felix Botner from comment #7) > Still get a traceback in umc after This is okay, as the traceback is presented to the user in a dialog but doesn't cause the UMC-Server to crash anymore. The error handling in general will be improved in the future but we can't add a try-except around every line in UMC :)
(In reply to Florian Best from comment #6) > The hard limit of maximum opened files has been raised to 64512. > All places where the server could crash it now either sends the traceback to > the client or closes the socket connection. OK > univention-management-console.yaml: > r68871 | YAML Bug #39909 OK > univention-management-console (8.0.28-14): > r68870 | Bug #39909: autopep8 > r68869 | Bug #39909: enhance robustness of UMC-Server during request > handling. Raise limit of max open files to a higher value. OK
<http://errata.software-univention.de/ucs/4.1/170.html>