Bug 48002 - UMC system diagnostic module hangs for a long time
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UMC - System diagnostic
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-1-errata
Assigned To: Florian Best
QA Contact: Jürn Brodersen
Duplicates: 45162 48125
Depends on: 37032 45343 47106 49929
Blocks:
Reported: 2018-10-16 13:57 CEST by Arvid Requate
Modified: 2022-02-21 13:38 CET (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.091
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2018101521000694, 2018092721000521, 2019072321000192
Bug group (optional):
Max CVSS v3 score:
best: Patch_Available+


Description Arvid Requate univentionstaff 2018-10-16 13:57:49 CEST
A situation similar to Bug #45343 just happened on a customer system running 4.3-2 errata257.

Any workaround or further troubleshooting steps?

Not sure if it is helpful but pstree -achls shows:
===================================================
systemd splash
├─acpid
├─agetty --noclear tty1 linux
├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ ├─apache2 -k start
│ └─apache2 -k start
├─atd -f
├─blkmapd
├─cron -f
├─dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
├─dockerd -H fd:// --storage-driver=overlay --live-restore --bip=172.17.42.1/16
│ ├─docker-containe -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --shim docker-containerd-shim --runtime docker-runc
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ ├─{docker-containe}
│ │ └─{docker-containe}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ ├─{dockerd}
│ └─{dockerd}
├─inetd
├─irqbalance --foreground
├─lvmetad -f
├─master -w
│ ├─pickup -l -t unix -u
│ ├─qmgr -l -t unix -u
│ └─tlsmgr -l -t unix -u
├─memcached -m 64 -s /var/run/univention-saml/memcached.socket -u samlcgi
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ └─{memcached}
├─memcached -m 64 -p 11211 -u memcache -l 127.0.0.1
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ ├─{memcached}
│ └─{memcached}
├─named -c /etc/bind/named.conf.samba4 -f -d 0
│ ├─{named}
│ ├─{named}
│ ├─{named}
│ └─{named}
├─nmbd -D
│ └─nmbd -D
├─nrpe -c /etc/nagios/nrpe.cfg -f
├─nscd
│ ├─{nscd}
│ ├─{nscd}
│ ├─{nscd}
│ ├─{nscd}
│ ├─{nscd}
│ ├─{nscd}
│ └─{nscd}
├─ntpd -p /var/run/ntpd.pid -g -u 114:123
│ └─{ntpd}
├─python2.7 -W ignore /usr/lib/pymodules/python2.7/univention/s4connector/s4/main.py
├─rpc.gssd
├─rpc.idmapd
├─rpc.mountd --manage-gids --port 32767
├─rpcbind -f -w
├─rsyslogd -n
│ ├─{in:imklog}
│ ├─{in:imuxsock}
│ └─{rs:main Q:Reg}
├─runsvdir -P /etc/service log: ...........................................................................................................................................................................................................................................................................................................................................................................................................
│ ├─runsv univention-directory-notifier
│ │ └─univention-dire -o -d 1 -F
│ └─runsv univention-directory-listener
│ └─univention-dire -F -d 2 -b dc=domain,dc=com -m /usr/lib/univention-directory-listener/system -c /var/lib/univention-directory-listener -ZZ -x -D cn=admin,dc=domain,dc=com -y /etc/ldap.secret
├─samba -D
│ ├─samba -D
│ │ └─samba -D
│ │ └─smbd -D --option=server role check:inhibit=yes --foreground
│ │ ├─cleanupd -D --option=server role check:inhibit=yes --foreground
│ │ ├─lpqd -D --option=server role check:inhibit=yes --foreground
│ │ └─smbd-notifyd -D --option=server role check:inhibit=yes --foreground
│ ├─samba -D
│ ├─samba -D
│ ├─samba -D
│ │ └─samba -D
│ ├─samba -D
│ ├─samba -D
│ ├─samba -D
│ ├─samba -D
│ │ └─samba -D
│ │ └─winbindd -D --option=server role check:inhibit=yes --foreground
│ │ └─winbindd -D --option=server role check:inhibit=yes --foreground
│ ├─samba -D
│ ├─samba -D
│ └─samba -D
├─slapd -h ldapi:/// ldap://:7389/ ldaps://:7636/
│ ├─{slapd}
│ ├─{slapd}
│ ├─{slapd}
│ ├─{slapd}
│ └─{slapd}
├─sshd -D
│ ├─sshd
│ │ └─bash
│ │ └─pstree -alhcs
│ └─sshd
│ └─sftp-server
├─stunnel4 /etc/stunnel/univention_saml.conf
├─systemd-journal
├─systemd-logind
├─systemd-udevd
├─univention-mana /usr/sbin/univention-management-console-server start
│ └─univention-mana /usr/sbin/univention-management-console-module -m diagnostic -s /var/run/univention-management-console/29243-1538126911743.socket -d 2 -l en_US.UTF-8
│ ├─{univention-mana}
│ ├─{univention-mana}
│ └─{univention-mana}
├─univention-mana /usr/sbin/univention-management-console-web-server start
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ ├─{univention-mana}
│ └─{univention-mana}
└─univention-welc /usr/bin/univention-welcome-screen
└─hexdump -n 96 /dev/input/event0

================================================

+++ This bug was initially created as a clone of Bug #45343 +++

Sometimes the UMC system diagnostic module hangs for a long time: several minutes, or possibly indefinitely. See attached screenshot. All tests seem to run, but the module finally hangs at 96% with the last message "Diagnosis of "Gateway is not reachable" was successful". It is not clear whether the module is still doing something.
Comment 1 Stefan Gohmann univentionstaff 2018-10-16 16:53:01 CEST
Can you provide UMC logfiles with debug level 4?
Comment 2 Christina Scheinig univentionstaff 2019-01-17 13:54:04 CET
If I remember correctly, the problem was caused by samba-tool dbcheck, which takes longer than 10 minutes on that system and is terminated by the timeout set via umc/module/timeout. This causes the 502 in the UMC diagnostic module in Ticket 2018092721000521.
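
For illustration only, the general failure mode (a synchronous check that outlives a fixed time budget is killed and its result is lost) can be sketched as follows. This is not how UMC enforces umc/module/timeout; `run_check` and `SLOW_CHECK` are hypothetical names, and a sleeping Python child process stands in for the real samba-tool call.

```python
import subprocess
import sys


def run_check(command, timeout_seconds):
    """Run a check synchronously and return (finished, output)."""
    try:
        completed = subprocess.run(
            command,
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before re-raising,
        # so whatever the check would have reported is lost.
        return False, None
    return True, completed.stdout


# Stand-in for a slow check; NOT the real samba-tool invocation.
SLOW_CHECK = [sys.executable, "-c", "import time; time.sleep(5); print('done')"]

if __name__ == "__main__":
    print(run_check(SLOW_CHECK, timeout_seconds=0.2))
```

With a budget shorter than the check's runtime, the caller only learns that the check did not finish, which is roughly what the user sees as a hang followed by an error.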
Comment 3 Florian Best univentionstaff 2019-07-16 15:09:29 CEST
Patch in branch git:fbest/48002-asynchronous-diagnostic-checks.

This makes the plugins execute asynchronously, so that the diagnostic/run request finishes immediately and the progress can be fetched with a diagnostic/progress request. The CLI tool has been adjusted accordingly.

The module can therefore still run for 10 minutes or longer, but it will report the correct state once the plugin finishes (this was the solution we wanted in our sprint discussions).
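
A minimal sketch of the run/progress split described in this comment, assuming a plain background thread. It is not the code from the patch: `AsyncRunner`, `poll_until_finished` and the shape of the progress dictionary are hypothetical, and the 0.6 s default interval is only an example value.

```python
import threading
import time


class AsyncRunner:
    """Hypothetical stand-in for a run/progress request pair (not the UMC API)."""

    def __init__(self, plugins):
        self._plugins = plugins  # mapping: plugin name -> callable
        self._lock = threading.Lock()
        self._results = {}
        self._finished = False

    def run(self):
        """Start all plugins in a background thread and return immediately."""
        thread = threading.Thread(target=self._work, daemon=True)
        thread.start()
        return thread

    def _work(self):
        for name, plugin in self._plugins.items():
            try:
                outcome = {"success": True, "result": plugin()}
            except Exception as exc:  # record the failure instead of losing it
                outcome = {"success": False, "result": str(exc)}
            with self._lock:
                self._results[name] = outcome
        with self._lock:
            self._finished = True

    def progress(self):
        """Return a snapshot of the current state without blocking on plugins."""
        with self._lock:
            done = len(self._results)
            return {
                "finished": self._finished,
                "percentage": 100 * done // max(len(self._plugins), 1),
                "results": dict(self._results),
            }


def poll_until_finished(runner, interval=0.6):
    """Client side: ask for progress repeatedly until the run has finished."""
    while True:
        state = runner.progress()
        if state["finished"]:
            return state
        time.sleep(interval)
```

Because each progress call returns quickly, no single request has to stay open for as long as the slowest plugin runs; the overall runtime is unchanged, but the final state is still delivered.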
Comment 4 Florian Best univentionstaff 2019-07-16 20:20:47 CEST
Patch has been applied:

ucs-test (9.0.3-3)
585d2c7e48a7 | Bug #48002: execute plugins asynchronously

univention-management-console-module-diagnostic.yaml
585d2c7e48a7 | Bug #48002: execute plugins asynchronously

univention-management-console-module-diagnostic (5.0.1-10)
585d2c7e48a7 | Bug #48002: execute plugins asynchronously
Comment 5 Florian Best univentionstaff 2019-07-29 14:42:27 CEST
Since the command now uses a progress bar and issues many requests, I reduced the progress interval to 600 ms.
Error handling with progress bars is now also possible and behaves like that of any other regular UMC command.

univention-web (3.0.5-23)
c94a98fdc2aa | Bug #48002: make progress interval configurable

univention-management-console (11.0.4-31)
c94a98fdc2aa | Bug #48002: make progress interval configurable

univention-management-console-module-diagnostic (5.0.1-11)
c94a98fdc2aa | Bug #48002: make progress interval configurable
Comment 7 Jürn Brodersen univentionstaff 2019-07-31 12:02:27 CEST
What I tested:
11min diagnostic test -> No more timeouts -> OK
diagnostic without saml -> OK
diagnostic with saml -> OK
closing running diagnostic -> Progress checks are stopped -> OK
Rerunning diagnostic after a module timeout -> Only one module process is created -> OK

yaml -> OK

-> verified
Comment 9 Jürn Brodersen univentionstaff 2019-11-28 10:38:23 CET
*** Bug 48125 has been marked as a duplicate of this bug. ***
Comment 10 Florian Best univentionstaff 2022-02-21 13:38:59 CET
*** Bug 45162 has been marked as a duplicate of this bug. ***