Univention Bugzilla – Bug 50598
Stuck univention-cli-server process after update from UCS 4.4-2 errata350 to errata385
Last modified: 2020-01-08 12:17:56 CET
After updating from UCS 4.4-2 errata350 to errata385 there is a stuck univention-cli-server process, which refuses to terminate by normal means: root@ucs-2009:~# ps ax | grep cli 19725 ? Ss 0:00 /usr/bin/python2.7 /usr/share/univention-directory-manager-tools/univention-cli-server 19728 ? Z 0:00 [univention-cli-] <defunct> 25456 pts/1 S+ 0:00 grep cli root@ucs-2009:~# kill 19725 root@ucs-2009:~# ps ax | grep cli 19725 ? Ss 0:00 /usr/bin/python2.7 /usr/share/univention-directory-manager-tools/univention-cli-server 19728 ? Z 0:00 [univention-cli-] <defunct> 25459 pts/1 S+ 0:00 grep cli root@ucs-2009:~# This causes issues e.g. for joinscripts that install new udm modules and cannot use them afterwards. Note: While writing this Bug report the process terminated itself apparently. Jürn just told me that he recently observed a similar situation, that's why I think we should check this.
Some previous discussion can be found on bug 50403. I closed that one because I thought it was an isolated incident, seems I was wrong.
#0 0x00007f77c9bf65e3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84 #1 0x000055ee6804e119 in select_select.lto_priv () at ../Modules/selectmodule.c:267 #2 0x000055ee67f7a84a in call_function (oparg=<optimized out>, pp_stack=0x7ffd81072148) at ../Python/ceval.c:4352 #3 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #4 0x000055ee67f789f5 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #5 0x000055ee67f80b78 in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7ffd81072358, func=<function at remote 0x7f77c9099c80>) at ../Python/ceval.c:4447 #6 call_function (oparg=<optimized out>, pp_stack=0x7ffd81072358) at ../Python/ceval.c:4372 #7 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #8 0x000055ee67f789f5 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #9 0x000055ee67f80b78 in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7ffd81072568, func=<function at remote 0x7f77c9099ed8>) at ../Python/ceval.c:4447 #10 call_function (oparg=<optimized out>, pp_stack=0x7ffd81072568) at ../Python/ceval.c:4372 #11 PyEval_EvalFrameEx () at ../Python/ceval.c:2989 #12 0x000055ee67f789f5 in PyEval_EvalCodeEx () at ../Python/ceval.c:3584 #13 0x000055ee67f787b9 in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:669 #14 0x000055ee67fa8bff in run_mod.lto_priv () at ../Python/pythonrun.c:1376 #15 0x000055ee67fa3b52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362 #16 0x000055ee67fa369e in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:948 #17 0x000055ee67f54771 in Py_Main () at ../Modules/main.c:640 #18 0x00007f77c9b352e1 in __libc_start_main (main=0x55ee67f540a0 <main>, argc=2, argv=0x7ffd810729a8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffd81072998) at ../csu/libc-start.c:291 #19 0x000055ee67f53f9a in _start ()
Broken socket? lsof -p $PID output would be nice. I will investigate deeper somewhen. We should fix Bug #34836 and Bug #34812.
VM: arequate_o365multi.8.100 (spoka) Snapshot: office365-app-connection-e385-udm-cli ======================================================================= root@ucs-2009:~# ps ax | grep cli 19777 ? Ss 0:00 /usr/bin/python2.7 /usr/share/univention-directory-manager-tools/univention-cli-server 19780 ? Z 0:00 [univention-cli-] <defunct> ======================================================================= In this situation the update of the office365 App fails: ======================================================================= root@ucs-2009:~# univention-app update root@ucs-2009:~# univention-app upgrade office365 [...] Die folgenden Pakete werden aktualisiert (Upgrade): univention-management-console-module-office365 univention-office365 [...] Waiting for file /usr/share/univention-office365/syntax.d/office365.py: OK Object exists: cn=udm_module,cn=univention,dc=o365multi,dc=net Object created: cn=office365/profile,cn=udm_module,cn=univention,dc=o365multi,dc=net Waiting for activation of the extension object office365/profile: OK Waiting for file /usr/share/univention-office365/handler.d/profile.py: OK Terminating running univention-cli-server processes. Object exists: cn=udm_module,cn=univention,dc=o365multi,dc=net Object created: cn=office365/ad-connection,cn=udm_module,cn=univention,dc=o365multi,dc=net Waiting for activation of the extension object office365/ad-connection: OK Waiting for file /usr/share/univention-office365/handler.d/ad-connection.py: OK Terminating running univention-cli-server processes. Object exists: cn=office365,dc=o365multi,dc=net Object exists: cn=profiles,cn=office365,dc=o365multi,dc=net Object created: cn=ad-connections,cn=office365,dc=o365multi,dc=net unknown module office365/profile. Available Modules are: appcenter/app [...] root@ucs-2009:~# univention-check-join-status Warning: 'univention-office365' is not configured. Error: Not all install files configured: 1 missing =======================================================================
Created attachment 10251 [details] lsof.log
Created attachment 10252 [details] /proc/$PID/status SigBlk: 0000000000015027 >>> bin(0x5027) '0b101000000100111' ^ SIGTERM (bit 15) blocked.
lsof shows the following: univentio 19777 root 5u IPv4 198036 0t0 TCP ucs-2009.o365multi.net:59286->ucs-2009.o365multi.net:6669 (ESTABLISHED) univentio 19777 root 6u REG 253,0 163360 5758749 /var/log/univention/directory-manager-cmd.log univentio 19777 root 7u REG 253,0 1372160 5770779 /var/lib/univention-directory-listener/cache/data.mdb univentio 19777 root 8u unix 0xffff92b63bd43800 0t0 198340 /tmp/admincli_0/sock type=STREAM univentio 19777 root 10w REG 253,0 0 5770781 /var/log/univention/listener_modules/portal_groups.log univentio 19777 root 11w REG 253,0 644 5756199 /var/log/univention/listener_modules/app_attributes.log.2019-11-04 univentio 19777 root 12r CHR 1,9 0t0 6508 /dev/urandom Bug #50403 had the same problem, as well when installing the office356 app. There we see, that the signal handling ignores the kill: (In reply to Jürn Brodersen from comment #3) > Turns out this was not the problem and is not really a problem generally. > > ucs_registerLDAPExtension --udm_syntax does wait for the activation > attribute to be set to true, which is set only after the listener wrote out > the syntax. > > The problem was that the univention-cli-server was blocking signals and was > therefore not restarted! > > ''' > root@qa442:~# ps -eo blocked,pid,cmd | egrep -v '^0+ ' > BLOCKED PID CMD > 7be3c0fe28014a03 1 /sbin/init splash > 0000000400004a02 238 /lib/systemd/systemd-journald > 0000000000014003 282 /lib/systemd/systemd-udevd > 0000000200000001 402 /lib/systemd/systemd-logind > 0000000000010000 451 runsvdir -P /etc/service log: > fffffffe3bfa2800 700 /usr/bin/containerd > fffffffe3bfa2800 701 /usr/bin/dockerd -H fd:// > --containerd=/run/containerd/containerd.sock > 0000000000015027 17363 [univention-cli-] <defunct> > 0000000000010000 17495 -bash > 0000000000010000 25224 /bin/bash /usr/bin/univention-welcome-screen > 0000000000015027 25574 /usr/bin/python2.7 > /usr/share/univention-directory-manager-tools/univention-cli-server > ''' > > Keyword: SIGPROCMASK > > I have no clue how that happened and I can't reproduce this. (I do have a > snapshot of this though) (In reply to Sönke Schwardt-Krummrich from comment #5) > (In reply to Jürn Brodersen from comment #4) > > The univention-cli-server just ignored that signal completely :( see comment > > 3. > > AFAIR signals are ignored by the listener, if the listener is currently > initializing/resync a listener module. We also know that due to Bug #34812 file descriptors are inherited instead of being closed. The process has at least some listener logfiles inherited. → Can a process also inherit some signal handling settings?
> → Can a process also inherit some signal handling settings? man 7 signal (http://man7.org/linux/man-pages/man7/signal.7.html) says: A child created via fork(2) inherits a copy of its parent's signal dispositions. During an execve(2), the dispositions of handled signals are reset to the default; the dispositions of ignored signals are left unchanged.
(In reply to Arvid Requate from comment #8) > > → Can a process also inherit some signal handling settings? > > man 7 signal (http://man7.org/linux/man-pages/man7/signal.7.html) says: > > A child created via fork(2) inherits a copy of its parent's signal > dispositions. During an execve(2), the dispositions of handled > signals are reset to the default; the dispositions of ignored signals > are left unchanged. The univention-cli-client makes a fork() to create the univention-cli-server process. But the listener probably only calls UDM-CLI without fork().
From code review I found a bug/crash, which might occur very seldom: diff --git management/univention-directory-manager-modules/univention-cli-server management/univention-directory-manager-modules/univention-cli-server index 8c036e807d..ede4655e19 100755 --- management/univention-directory-manager-modules/univention-cli-server +++ management/univention-directory-manager-modules/univention-cli-server @@ -60,7 +60,7 @@ class MyRequestHandler(SocketServer.BaseRequestHandler): sarglist = '' while True: buf = self.request.recv(1024) - if buf[-1] == '\0': + if buf.endswith('\0'): buf = buf[:-1] sarglist += buf break Not sure, if it's the reason.
Is this bug reproducable / does it happen frequently when installing the O365 App or in other situations?