Univention Bugzilla – Bug 48002
UMC system diagnostic module hangs for a long time
Last modified: 2022-02-21 13:38:59 CET
A situation similar to Bug #45343 just happened on a customer system running 4.3-2 errata257. Any workaround or further troubleshooting steps? Not sure if it is helpful but pstree -achls shows: =================================================== systemd splash ├─acpid ├─agetty --noclear tty1 linux ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ ├─apache2 -k start │ └─apache2 -k start ├─atd -f ├─blkmapd ├─cron -f ├─dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation ├─dockerd -H fd:// --storage-driver=overlay --live-restore --bip=172.17.42.1/16 │ ├─docker-containe -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ ├─{docker-containe} │ │ └─{docker-containe} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ ├─{dockerd} │ └─{dockerd} ├─inetd ├─irqbalance --foreground ├─lvmetad -f ├─master -w │ ├─pickup -l -t unix -u │ ├─qmgr -l -t unix -u │ └─tlsmgr -l -t unix -u ├─memcached -m 64 -s /var/run/univention-saml/memcached.socket -u samlcgi │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ └─{memcached} ├─memcached -m 64 -p 11211 -u memcache -l 127.0.0.1 │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ ├─{memcached} │ └─{memcached} ├─named -c /etc/bind/named.conf.samba4 -f -d 0 │ ├─{named} │ ├─{named} │ ├─{named} │ └─{named} ├─nmbd -D │ └─nmbd -D ├─nrpe -c /etc/nagios/nrpe.cfg -f ├─nscd │ ├─{nscd} │ ├─{nscd} │ ├─{nscd} │ ├─{nscd} │ ├─{nscd} │ ├─{nscd} │ └─{nscd} ├─ntpd -p 
/var/run/ntpd.pid -g -u 114:123 │ └─{ntpd} ├─python2.7 -W ignore /usr/lib/pymodules/python2.7/univention/s4connector/s4/main.py ├─rpc.gssd ├─rpc.idmapd ├─rpc.mountd --manage-gids --port 32767 ├─rpcbind -f -w ├─rsyslogd -n │ ├─{in:imklog} │ ├─{in:imuxsock} │ └─{rs:main Q:Reg} ├─runsvdir -P /etc/service log: ........................................................................................................................................................................................................................................................................................................................................................................................................... │ ├─runsv univention-directory-notifier │ │ └─univention-dire -o -d 1 -F │ └─runsv univention-directory-listener │ └─univention-dire -F -d 2 -b dc=domain,dc=com -m /usr/lib/univention-directory-listener/system -c /var/lib/univention-directory-listener -ZZ -x -D cn=admin,dc=domain,dc=com -y /etc/ldap.secret ├─samba -D │ ├─samba -D │ │ └─samba -D │ │ └─smbd -D --option=server role check:inhibit=yes --foreground │ │ ├─cleanupd -D --option=server role check:inhibit=yes --foreground │ │ ├─lpqd -D --option=server role check:inhibit=yes --foreground │ │ └─smbd-notifyd -D --option=server role check:inhibit=yes --foreground │ ├─samba -D │ ├─samba -D │ ├─samba -D │ │ └─samba -D │ ├─samba -D │ ├─samba -D │ ├─samba -D │ ├─samba -D │ │ └─samba -D │ │ └─winbindd -D --option=server role check:inhibit=yes --foreground │ │ └─winbindd -D --option=server role check:inhibit=yes --foreground │ ├─samba -D │ ├─samba -D │ └─samba -D ├─slapd -h ldapi:/// ldap://:7389/ ldaps://:7636/ │ ├─{slapd} │ ├─{slapd} │ ├─{slapd} │ ├─{slapd} │ └─{slapd} ├─sshd -D │ ├─sshd │ │ └─bash │ │ └─pstree -alhcs │ └─sshd │ └─sftp-server ├─stunnel4 /etc/stunnel/univention_saml.conf ├─systemd-journal ├─systemd-logind ├─systemd-udevd ├─univention-mana /usr/sbin/univention-management-console-server start │ └─univention-mana 
/usr/sbin/univention-management-console-module -m diagnostic -s /var/run/univention-management-console/29243-1538126911743.socket -d 2 -l en_US.UTF-8 │ ├─{univention-mana} │ ├─{univention-mana} │ └─{univention-mana} ├─univention-mana /usr/sbin/univention-management-console-web-server start │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ ├─{univention-mana} │ └─{univention-mana} └─univention-welc /usr/bin/univention-welcome-screen └─hexdump -n 96 /dev/input/event0 ================================================ +++ This bug was initially created as a clone of Bug #45343 +++ Soemtimes the UMC system diagnostic module hangs for a long time, several minutes (or infinitely?). See attached screenshot. All tests seem to run but finally it hangs at 96% with the last message "Diagnosis of "Gateway is not reachable" was successful". It is not really clear if the module is still doing something.
Can you provide UMC logfiles with debug level 4?
If I remember correctly, the problem was caused by samba-tool dbcheck, which takes longer than 10 minutes on the system and is terminated by the timeout set via umc/module/timeout. This causes the 502 in the UMC diagnostic module in Ticket 2018092721000521.
Patch in branch git:fbest/48002-asynchronous-diagnostic-checks. The plugins are now executed asynchronously, so that the diagnostic/run request finishes immediately and the progress can be fetched with a diagnostic/progress request. The CLI tool has been adjusted accordingly. The module still runs for 10 minutes or longer, but it now outputs the correct state after the plugin finishes (that was the solution we wanted in our sprint discussions).
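For readers unfamiliar with the pattern, here is a minimal sketch of the run/progress split described above. It is not the code from the patch: the class name DiagnosticRun, its methods and the shape of the returned dictionaries are hypothetical, and a plain thread stands in for whatever mechanism the module actually uses.

```python
import threading
import time


class DiagnosticRun:
    """Hypothetical stand-in for a run/progress request pair (not the UMC API)."""

    def __init__(self, plugins):
        self._plugins = list(plugins)  # list of (name, callable) pairs
        self._lock = threading.Lock()
        self._done = 0
        self._results = {}
        self._finished = False
        self._thread = None

    def run(self):
        """Start all plugins in the background and return immediately."""
        self._thread = threading.Thread(target=self._work, daemon=True)
        self._thread.start()

    def _work(self):
        for name, plugin in self._plugins:
            try:
                outcome = {"success": True, "result": plugin()}
            except Exception as exc:  # a failing plugin must not stop the run
                outcome = {"success": False, "error": str(exc)}
            with self._lock:
                self._results[name] = outcome
                self._done += 1
        with self._lock:
            self._finished = True

    def progress(self):
        """Return a snapshot of the current state without blocking on plugins."""
        with self._lock:
            total = len(self._plugins)
            percentage = 100 * self._done // total if total else 100
            return {
                "finished": self._finished,
                "percentage": percentage,
                "results": dict(self._results),
            }

    def wait(self, timeout=None):
        """Block until the background run ends (only useful for scripts and tests)."""
        if self._thread is not None:
            self._thread.join(timeout)


def slow_check():
    time.sleep(0.05)  # stands in for a long-running check
    return "ok"


demo = DiagnosticRun([("slow_check", slow_check)])
demo.run()                      # returns at once
snapshot = demo.progress()      # can be called while the check is still running
demo.wait()
print(demo.progress()["percentage"])
```

The point of the split is that no single request has to stay open for the full duration of a slow plugin; the caller only ever issues short requests and reads the final state once the run reports that it has finished.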
Patch has been applied:

ucs-test (9.0.3-3)
585d2c7e48a7 | Bug #48002: execute plugins asynchronously

univention-management-console-module-diagnostic.yaml
585d2c7e48a7 | Bug #48002: execute plugins asynchronously

univention-management-console-module-diagnostic (5.0.1-10)
585d2c7e48a7 | Bug #48002: execute plugins asynchronously
As the command now uses a progress bar and issues many requests, I reduced the interval to 600 ms. Error handling with progress bars is now possible as well and behaves like any other regular UMC command.

univention-web (3.0.5-23)
c94a98fdc2aa | Bug #48002: make progress interval configurable

univention-management-console (11.0.4-31)
c94a98fdc2aa | Bug #48002: make progress interval configurable

univention-management-console-module-diagnostic (5.0.1-11)
c94a98fdc2aa | Bug #48002: make progress interval configurable
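A rough sketch of what a polling client with a configurable interval could look like follows. This is an illustration only, not the univention-web implementation: the function poll_until_finished, the fetch_progress callback and the "finished"/"error" keys are assumptions made for the example; only the 600 ms default is taken from the comment above.

```python
import time


def poll_until_finished(fetch_progress, interval=0.6, timeout=None,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_progress() every `interval` seconds until it reports completion.

    fetch_progress is a hypothetical callable returning a dict with an optional
    "error" string and a "finished" flag. An error is raised to the caller the
    same way a failing regular request would be.
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        state = fetch_progress()
        if state.get("error"):
            raise RuntimeError(state["error"])
        if state.get("finished"):
            return state
        if deadline is not None and clock() >= deadline:
            raise TimeoutError("progress did not finish in time")
        sleep(interval)


# Usage with a canned sequence of progress replies instead of real requests.
replies = iter([
    {"finished": False, "percentage": 50},
    {"finished": True, "percentage": 100},
])
final_state = poll_until_finished(lambda: next(replies), interval=0.01)
print(final_state["percentage"])
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; a shorter interval makes the progress bar more responsive at the cost of more requests.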
What I tested:
11min diagnostic test -> No more timeouts -> OK
diagnostic without saml -> OK
diagnostic with saml -> OK
closing running diagnostic -> Progress checks are stopped -> OK
Rerunning diagnostic after a module timeout -> Only one module process is created -> OK
yaml -> OK

-> verified
<http://errata.software-univention.de/ucs/4.4/202.html> <http://errata.software-univention.de/ucs/4.4/203.html> <http://errata.software-univention.de/ucs/4.4/207.html>
*** Bug 48125 has been marked as a duplicate of this bug. ***
*** Bug 45162 has been marked as a duplicate of this bug. ***