Univention Bugzilla – Bug 49862
The initialization of the listener module portal_groups takes too long
Last modified: 2021-11-29 16:43:15 CET
The initialization of the listener module portal_groups takes up to 12 hours in large environments (36,000 groups). During the first run the listener builds its cache group by group, so the cache is written 36,000 times instead of being calculated once at the end. We have to optimize that. In this particular environment there is a workaround: the module can be deactivated by setting the UCR variable listener/module/<name>/deactivate. This might not work in other environments.
A small update: the customer is concerned that the listener module will cause a lot of trouble during the next import of new user data at the end of the school year. Several tens of thousands of groups may then be modified hundreds of times, so the listener module could also stall other modules (S4-Connector). That would be a big problem. If these concerns are justified, the customer would like a short-term performance optimization for the module, if possible. Maybe it is enough to deactivate the module with the UCR variable again?
Is there a reason why the cache is rebuilt with every group change and not just once in postrun()?
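To illustrate the question, here is a minimal sketch of a module that only updates in-memory state per group change and writes the cache file once in the post-run step. This is not the actual portal_groups code: the class name DeferredGroupCache, the JSON layout, and the use of uniqueMember as the membership attribute are assumptions made for illustration.

```python
import json
import os
import tempfile


class DeferredGroupCache:
    """Hypothetical stand-in for a listener module with a deferred cache write."""

    def __init__(self, path):
        self.path = path
        self.groups = {}      # group DN -> sorted list of member DNs
        self.dirty = False    # True if memory differs from the file on disk
        self.write_count = 0  # number of cache writes, for illustration only

    def handle(self, dn, new, old):
        """Called once per changed group; touches memory only, not the disk."""
        if new:
            members = new.get("uniqueMember", [])
            self.groups[dn] = sorted(m.decode("utf-8") for m in members)
        else:
            # An empty "new" means the object was removed.
            self.groups.pop(dn, None)
        self.dirty = True

    def post_run(self):
        """Called after a batch of changes; writes the cache at most once."""
        if not self.dirty:
            return
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w") as fd:
            json.dump(self.groups, fd)
        os.replace(tmp_path, self.path)  # atomic replace on POSIX
        self.dirty = False
        self.write_count += 1


if __name__ == "__main__":
    cache = DeferredGroupCache(os.path.join(tempfile.mkdtemp(), "groups.json"))
    for i in range(36000):
        cache.handle("cn=group%d,dc=example" % i,
                     {"uniqueMember": [b"uid=user,dc=example"]}, {})
    cache.post_run()
    print("cache writes:", cache.write_count)
```

With this structure the number of disk writes no longer grows with the number of groups. The trade-off is that the in-memory state is lost if the process exits before the post-run step, so a real module would have to be able to rebuild it.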
Even in environments that are not that large this is a problem, especially since it takes place during an upgrade and massively extends the downtime and maintenance window. 6500 groups on an 8-core, 32 GB RAM UCS Backup:
25.10.19 15:55:25.273 LISTENER ( WARN ) : initializing module portal_groups
25.10.19 16:37:56.854 LISTENER ( WARN ) : finished initializing module portal_groups with rv=0
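For comparing such measurements, the duration can be computed directly from the two listener log timestamps. The following sketch assumes the timestamps are in day.month.year format with fractional seconds; the helper name init_duration_seconds is made up for this example.

```python
from datetime import datetime

# Timestamp layout as it appears in the listener log lines above
# (assumption: day.month.two-digit-year, then time with fractional seconds).
LOG_TIME_FORMAT = "%d.%m.%y %H:%M:%S.%f"


def init_duration_seconds(start, end):
    """Return the seconds elapsed between two listener log timestamps."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


if __name__ == "__main__":
    seconds = init_duration_seconds("25.10.19 15:55:25.273",
                                    "25.10.19 16:37:56.854")
    groups = 6500
    print("%.3f s total, %.3f s per group" % (seconds, seconds / groups))
```

For the log above this gives about 2552 seconds (roughly 42.5 minutes), or about 0.39 seconds per group.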
Performance of json implementations: https://artem.krylysov.com/blog/2015/09/29/benchmark-python-json-libraries/ https://medium.com/dataweave/json-vs-simplejson-vs-ujson-2887b2c128b2 Maybe msgpack is a faster solution: https://www.benfrederickson.com/dont-pickle-your-data/
Untested patch in git:fbest/49862-remove-portal-group-cache: The patch removes the listener module and replaces the logic with raw LDAP calls using the memberOf attribute of groups. The patch could be fine-tuned to make it even faster by iterating only over the groups. TODO: make the LDAP DN comparison case-insensitive.
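For the TODO, a rough sketch of a case-insensitive DN comparison follows. It is not taken from the patch. The helper names are hypothetical, and the normalization is deliberately simple: it lowercases the whole DN and strips whitespace around the RDN separators, but does not handle escaped commas or multi-valued RDNs correctly.

```python
def normalize_dn(dn):
    """Simplified DN normalization (assumption: no escaped commas in the DN)."""
    return ",".join(part.strip() for part in dn.split(",")).lower()


def is_member(member_of, allowed_groups):
    """Check whether any memberOf value matches one of the allowed group DNs,
    ignoring case and whitespace around the RDN separators."""
    allowed = {normalize_dn(dn) for dn in allowed_groups}
    return any(normalize_dn(dn) in allowed for dn in member_of)


if __name__ == "__main__":
    member_of = ["CN=Domain Users,CN=groups,DC=example,DC=com"]
    allowed = ["cn=domain users, cn=groups, dc=example, dc=com"]
    print(is_member(member_of, allowed))
```

A production version would rather rely on a proper DN parser than on string splitting.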
While joining a DC Backup to a domain with 52000 users and 11400 groups it took more than an hour to initialize the portal_groups module:
21.08.20 15:32:36.250 LISTENER ( WARN ) : initializing module portal_groups
21.08.20 16:37:21.806 LISTENER ( WARN ) : finished initializing module portal_groups with rv=0
might be fixed with the new portal, needs check
(In reply to Ingo Steuwer from comment #8)
> might be fixed with the new portal, needs check

The latest code does:

class PortalGroups(ListenerModuleHandler):
    def post_run(self):
        with self.as_root():
            subprocess.call(['/usr/sbin/univention-portal', 'update', '--reason', 'ldap:group'])

and nothing else anymore.
So it is still a blocking call, but only in postrun. → Much faster.
(In reply to Erik Damrose from comment #7)
> While joining a DC Backup to a domain with 52000 users and 11400 groups it
> took more than an hour to initialize the portal_groups module
>
> 21.08.20 15:32:36.250 LISTENER ( WARN ) : initializing module portal_groups
> 21.08.20 16:37:21.806 LISTENER ( WARN ) : finished initializing module portal_groups with rv=0

29.11.21 11:26:41.977 LISTENER ( WARN ) : initializing module portal_groups
29.11.21 14:16:15.668 LISTENER ( WARN ) : finished initializing module portal_groups with rv=0

91,000 users + 19,800 groups

(In reply to Florian Best from comment #9)
> (In reply to Ingo Steuwer from comment #8)
> > might be fixed with the new portal, needs check
>
> The latest code does:
>
> class PortalGroups(ListenerModuleHandler):
>     def post_run(self):
>         with self.as_root():
>             subprocess.call(['/usr/sbin/univention-portal', 'update', '--reason', 'ldap:group'])
>
> and nothing else anymore.
> So it is still a blocking call, but only in postrun.
> → Much faster.

Btw: what happens if the Listener is shut down/restarted before the module's postrun has been executed?
(In reply to Sönke Schwardt-Krummrich from comment #10)
> 29.11.21 11:26:41.977 LISTENER ( WARN ) : initializing module portal_groups
> 29.11.21 14:16:15.668 LISTENER ( WARN ) : finished initializing module portal_groups with rv=0
>
> 91,000 users + 19,800 groups

This was a DC slave with UCS@school 4.4 / UCS 4.4-8
(In reply to Sönke Schwardt-Krummrich from comment #10)
> Btw: what happens if the Listener is shut down/restarted before the module's
> postrun has been executed?

UDL implements a "poor man's locking mechanism": it disables UNIX signals while modules are run, so SIGTERM et al. are ignored until signals are re-enabled. If you kill UDL anyway, for example via SIGKILL, the UDL cache will be in a half-initialized state, where *some objects* have already been handled by *some of their modules* and others have not. In that case the module will not be flagged as "fully initialized" in the file "/var/lib/univention-directory-listener/handlers/$name", and UDL will simply call it again for all objects. What happens then depends on the module implementation: whether it is idempotent and can handle processing the objects a second time without "clean()" and/or "initialize()" being called explicitly before the re-run.
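To make the idempotency point concrete, here is a small sketch contrasting two handler styles when the same objects are replayed after an interrupted initialization. Both functions and the dict-based cache are hypothetical and only illustrate the property, not any real module.

```python
def handle_idempotent(cache, dn, members):
    """Overwrites the entry, so a replay leaves the cache unchanged."""
    cache[dn] = sorted(set(members))


def handle_not_idempotent(cache, dn, members):
    """Appends to the entry, so a replay duplicates the members."""
    cache.setdefault(dn, []).extend(members)


def replay(handler, objects, runs):
    """Feed the same objects to a handler several times, as after a re-run."""
    cache = {}
    for _ in range(runs):
        for dn, members in objects:
            handler(cache, dn, members)
    return cache


if __name__ == "__main__":
    objects = [("cn=g1,dc=example", ["uid=a,dc=example", "uid=b,dc=example"])]
    # Same result after one run and after two runs.
    print(replay(handle_idempotent, objects, 1) == replay(handle_idempotent, objects, 2))
    # Result differs between one run and two runs.
    print(replay(handle_not_idempotent, objects, 1) == replay(handle_not_idempotent, objects, 2))
```

A module written in the first style tolerates being called again for all objects; one written in the second style needs its state cleaned before the re-run.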