Univention Bugzilla – Bug 46860
4.3: Listener module for generating netlogon scripts is too slow
Last modified: 2018-06-04 15:27:43 CEST
Port changes to UCS@school 4.3 +++ This bug was initially created as a clone of Bug #45802 +++ A large customer reported (2017112021000371) that the regeneration of netlogon script takes now 2 hours instead of ~20 minutes. Also the new daemon throws tracebacks from time to time which (intentionally) slows down the daemon each time by ~5 seconds. +++ This bug was initially created as a clone of Bug #42997 +++ Currently the listener module for generating netlogon scripts is triggered by LDAP modifications of user, group and share objects. For each event the corresponding netlogon scripts are rebuilt. There are some situations when this approach takes too much time: a) when rejoining slave systems, the rebuild o all netlogon scripts takes minutes up to hours b) when starting the exam mode, there the replication might be too slow and hits the current timeout of 300 seconds for replication of all exam users to the corresponding school server. Idea to speed up the rebuild: Affected DNs are not immediately handled in handler() but collected until a certain number of changes is reached, e.g. handler() has been called 100 times (UCR variable!), or the listener module's postrun() method is called. DNs might be collected by creating/touching the file /var/lib/ucs-school-netlogon-user-logonscripts/${DN}. This way recurring changes to the same LDAP object are automatically merged and there is no information loss if the listener gets stopped. Log excerpts from a customer system: 15.11.16 13:47:03.619 LISTENER (INFO): updating 'cn=exam-someuser1,cn=uid,cn=temporary,cn=univention,dc=example,dc=com' command a 15.11.16 13:47:03.733 LISTENER (INFO): updating 'cn=xxxxxx,cn=uidNumber,cn=temporary,cn=univention,dc=example,dc=com' command a 15.11.16 13:47:03.831 LISTENER (INFO): updating 'cn=xxxxxx,cn=gidNumber,cn=temporary,cn=univention,dc=example,dc=com' command a 15.11.16 13:47:03.925 LISTENER (INFO): updating 'cn=xxxxxx,cn=gidNumber,cn=temporary,cn=univention,dc=example,dc=com' command d 15.11.16 13:47:03.993 LISTENER (INFO): updating 'cn=S-1-5-21-xxxxxxxxxx-xxxxxxxxxx-xxxxxxxxxx-xxxxxxx,cn=sid,cn=temporary,cn=univention,dc=example,dc=com' command a 15.11.16 13:47:04.073 LISTENER (INFO): updating 'uid=exam-someuser1,cn=examusers,ou=schoolA,dc=example,dc=com' command a 15.11.16 13:47:04.318 LISTENER (INFO): updating 'cn=Domain Users schoolA,cn=groups,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:04.653 LISTENER (INFO): updating 'cn=schuelerschoolA,cn=groups,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:04.872 LISTENER (INFO): updating 'cn=schoolA-Q II,cn=klassen,cn=schueler,cn=groups,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.002 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-mat 10,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.152 LISTENER (INFO): updating 'cn=schoolA-LKurs-13-BIO 4,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.290 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-che 30,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.382 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-kun 15,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.640 LISTENER (INFO): updating 'cn=schoolA-LKurs-13-GEG 2,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.812 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-spopr 12,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:05.952 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-deu 65,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:06.137 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-ges 35,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:06.443 LISTENER (INFO): updating 'cn=schoolA-GKurs-13-eng 50,cn=kurse,ou=schoolA,dc=example,dc=com' command m 15.11.16 13:47:06.654 LISTENER (INFO): updating 'cn=OUschoolA-Klassenarbeit,cn=ucsschool,cn=groups,dc=example,dc=com' command m 15.11.16 13:47:06.788 LISTENER (INFO): updating 'cn=exam-someuser1,cn=uid,cn=temporary,cn=univention,dc=example,dc=com' command d 15.11.16 13:47:06.869 LISTENER (INFO): updating 'cn=S-1-5-21-xxxxxxxxxx-xxxxxxxxxx-xxxxxxxxxx-xxxxxxx,cn=sid,cn=temporary,cn=univention,dc=example,dc=com' command d
Have a look at Bug #45802 for further details: Several optimizations have been performed to improve performance of the listener module. 1) a missing SQL index has been added 2) the traceback of comment 1 has been fixed 3) groups (resp. the group's members) are only processed if the group has been referenced by at least one share 4) a much simpler LDAP query is used which reduced the initial LDAP query time by factor of ~4. In a customer environment the searchtime jumped from 15m56s to 4m36s. The found objects are verified later on against the UDM modules. So there should be no behaviour change. 5) As Daniel pointed out, the setup of a LDAP connection for each handled object is very expensive. The connection is now set up once and in case of an LDAP exception, and teared down in postrun(). 6) The LDAP connection is not performed against the UCS master anymore but to the local LDAP server. As shown below, this improved the processing speed of the listener module remarkably, so the LDAP replication is no longer block for longer periods. A test (module resync) has been performed with the old and the new code. The created logon scripts have been compared from both test runs. No difference has been found. * old code with benchmarking: 24.03.18 18:33:58.041 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript 24.03.18 18:35:24.889 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 24.03.18 18:35:39.908 LISTENER ( PROCESS ) : ucs-school-user-logonscript: Timer total: handler start__: 0.000000 lo ready_______: 59.822722 db ready_______: 1.116696 handlers called: 12.993600 handler end____: 0.262067 ** …daemon.log 2018-03-24 18:33:58,261 [INFO] INIT 2018-03-24 18:33:58,307 [INFO] got global links {'Marktplatz': {'letter': 'M', 'server': 'slave111'}} 2018-03-24 18:34:08,434 [INFO] signal received - reduced wait time by 162.9 seconds […] 2018-03-24 18:34:41,062 [INFO] all items processed... sleeping... 2018-03-24 18:34:41,080 [INFO] signal received - reduced wait time by 173.0 seconds […] 2018-03-24 18:34:49,676 [INFO] all items processed... sleeping... 2018-03-24 18:34:49,682 [INFO] signal received - reduced wait time by 173.0 seconds […] 2018-03-24 18:34:58,653 [INFO] all items processed... sleeping... 2018-03-24 18:34:58,662 [INFO] signal received - reduced wait time by 173.0 seconds […] 2018-03-24 18:35:07,420 [INFO] all items processed... sleeping... 2018-03-24 18:35:07,438 [INFO] signal received - reduced wait time by 173.0 seconds […] 2018-03-24 18:35:16,118 [INFO] all items processed... sleeping... 2018-03-24 18:35:16,128 [INFO] signal received - reduced wait time by 173.0 seconds […] 2018-03-24 18:35:31,794 [INFO] all items processed... sleeping... → Start: 18:33:58 → End: 18:35:31 → 93 seconds # egrep '^2018-04-09 15:0(4|5).*updating logon script for' /var/log/univention/ucs-school-user-logonscript-daemon.log | wc -l 2398 → Listener: 2398 users / 86 seconds → 27.8 users/s → Daemon: 2398 users / 83 seconds → 28.9 users/s * new code with benchmarking ** listener.log 24.03.18 21:41:48.343 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript 24.03.18 21:41:59.219 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 24.03.18 21:42:14.236 LISTENER ( PROCESS ) : ucs-school-user-logonscript: Timer total: handler start__: 0.000000 lo ready_______: 0.054209 db ready_______: 0.620861 handlers called: 3.995481 handler end____: 0.192979 ** …daemon.log 2018-03-24 21:41:48,614 [INFO] INIT 2018-03-24 21:41:48,665 [INFO] got global links {'Marktplatz': {'letter': 'M', 'server': 'slave111'}} 2018-03-24 21:41:53,086 [INFO] signal received - reduced wait time by 168.6 seconds […] 2018-03-24 21:42:09,424 [INFO] all items processed... sleeping... → Start: 21:41:48 → End: 21:42:09 → 27 seconds # egrep '^2018-04-09 15:0(4|5).*updating logon script for' /var/log/univention/ucs-school-user-logonscript-daemon.log | wc -l 2398 → Listener: 2398 users / 11 seconds → 218 users/s → Daemon: 2398 users / 16 seconds → 150 users/s Test run in customer environment: * old package 12.04.18 14:45:09.031 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript 12.04.18 14:45:09.263 LISTENER ( INFO ) : call handler_initialize for module ucs-school-user-logonscript 12.04.18 14:45:12.859 LISTENER ( INFO ) : initialize schema for module ucs-school-user-logonscript 12.04.18 17:09:54.627 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 → ~2h25m → ~145min * new package 20.04.18 16:02:14.116 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript 20.04.18 16:02:14.356 LISTENER ( INFO ) : call handler_initialize for module ucs-school-user-logonscript 20.04.18 16:02:17.799 LISTENER ( INFO ) : initialize schema for module ucs-school-user-logonscript 20.04.18 16:11:53.043 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 → ~9m30s → ~10min ucs-school-netlogon-user-logonscripts (14.0.0-3) b04806f38504 | Bug #46860/#45802: add changelog entry f512ada97dcc | Bug #46860/#45802: use local LDAP connection instead of a LDAP connection to DC master 81dd0efff78d | Bug #46860/#45802: reuse LDAP connection between handler() calls c7fd8761806a | Bug #46860/#45802: make mypy happy 675f33db024f | Bug #46860/#45802: cleanup 8fa6df62b39d | Bug #46860/#45802: use much simpler LDAP query to reduce query time 0dba7f35140a | Bug #46860/#45802: only process groups that are referenced by at least one share 63128aa5ad4d | Bug #46860/#45802: add mypy comments 98c814eb86d6 | Bug #46860/#45802: fix traceback by skipping users that have no gidNumer set 3b1b181cff87 | Bug #46860/#45802: add comment for mypy a452b8b4f9da | Bug #46860/#45802: improve performance by adding a missing index ucs-school-netlogon-user-logonscripts.yaml 7dd74fe646cc | Bug #46860/#45802: add advisory entry Package: ucs-school-netlogon-user-logonscripts Version: 14.0.0-3A~4.3.0.201804201654 Branch: ucs_4.3-0 Scope: ucs-school-4.3 (Hint: package does not work until Bug 46696 is fixed).
OK: code change OK: functional change OK: tests OK: changelog entry OK: advisory OK: package built
(In reply to Sönke Schwardt-Krummrich from comment #2) > OK: code change > OK: functional change > OK: tests > OK: changelog entry > OK: advisory > OK: package built → Wrong bug → back to RESOLVED
OK: code changes, fixed in mypy hints: [4.3] 78becbf6 Bug #46860: comment/mypy-only fix: class name of univention.admin.uldap.access is lower case [4.3] 679c95f2 Bug #46860: fix mypy hint, enhance docstring OK: manual tests: ================================================================ root@m61:~# /usr/share/ucs-school-import/scripts/ucs-school-testuser-import --staff 100 --students 700 --teachers 150 --staffteachers 50 --classes 30 --schools 1 SchuleM62 ---------------------------------------------------------------- root@s62:~# dpkg -l ucs-school-netlogon-user-logonscripts ii ucs-school-netlogon-user 14.0.0-1A~4.3.0.2 all root@s62:~# univention-ldapsearch -LLL '(&(uid=*)(!(uid=*$)))' dn | grep dn: | wc -l 919 root@s62:~# univention-directory-listener-ctrl resync ucs-school-user-logonscript ==> /var/log/univention/listener.log <== 02.05.18 13:42:56.415 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript […] 02.05.18 13:43:27.204 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 ==> /var/log/univention/ucs-school-user-logonscript-daemon.log <== 2018-05-02 13:42:56,546 [INFO] INIT 2018-05-02 13:42:56,570 [INFO] got global links {'Marktplatz': {'letter': 'M', 'server': 's62'}} 2018-05-02 13:42:58,066 [INFO] signal received - reduced wait time by 171.5 seconds […] 2018-05-02 13:43:32,249 [INFO] all items processed... sleeping... root@s62:~# egrep '2018-05-02 13:4.*updating logon script for' /var/log/univention/ucs-school-user-logonscript-daemon.log | wc -l 2241 root@s62:~# find /var/lib/samba/netlogon/ -name '*.vbs' -ls | grep 'Mai 2 13:4' | wc -l 908 --- listener : 919 user / 31s = 29,65 user/s daemon : 919 user / 36s = 25,53 user/s (incl. sleeps) wall clock resync -> files written: 13:42:56 -> 13:43:32 = 36s wall clock: 919 user / 36s = 25,53 user/s ================================================================ root@s62:~# univention-upgrade --ignoressh --ignoreterm root@s62:~# dpkg -l ucs-school-netlogon-user-logonscripts ii ucs-school-netlogon-user 14.0.0-4A~4.3.0.2 all root@s62:~# service univention-directory-listener restart root@s62:~# service ucs-school-netlogon-user-logonscripts restart root@s62:~# univention-directory-listener-ctrl resync ucs-school-user-logonscript ==> /var/log/univention/listener.log <== 02.05.18 14:00:14.536 LISTENER ( WARN ) : initializing module ucs-school-user-logonscript 02.05.18 14:00:17.786 LISTENER ( WARN ) : finished initializing module ucs-school-user-logonscript with rv=0 ==> /var/log/univention/ucs-school-user-logonscript-daemon.log <== 2018-05-02 14:00:14,845 [INFO] INIT 2018-05-02 14:00:14,896 [INFO] got global links {'Marktplatz': {'letter': 'M', 'server': 's62'}} 2018-05-02 14:00:15,657 [INFO] signal received - reduced wait time by 172.2 seconds 2018-05-02 14:00:26,277 [INFO] all items processed... sleeping... root@s62:~# egrep '2018-05-02 14:.*updating logon script for' /var/log/univention/ucs-school-user-logonscript-daemon.log | wc -l 910 root@s62:~# find /var/lib/samba/netlogon/ -name '*.vbs' -ls | grep 'Mai 2 14:' | wc -l 908 --- listener : 919 user / 3s = 306,33 user/s daemon : 919 user / 11s = 83,55 user/s (no sleeps) wall clock resync -> files written: 14:00:14 -> 14:00:26 = 12s wall clock: 919 user / 12s = 76,58 user/s ================================================================ root@s121:~# cd /var/lib/samba/netlogon/ root@s121:/var/lib/samba/netlogon# diff -Naur user.14.0.0-1/ user.14.0.0-4/ → no difference 3 x faster on wall clock, same files generated.
UCS@school 4.3 v3 has been released. https://docs.software-univention.de/changelog-ucsschool-4.3v3-de.html If this error occurs again, please clone this bug.