Univention Bugzilla – Bug 51233
[4.4] fast_member_add and fast_member_remove don't scale
Last modified: 2021-10-28 18:15:53 CEST
The functions fast_member_add and fast_member_remove create a CPU bottleneck when changing very big groups. Adding 100.000 users to a group takes fast_member_add() 819s (13m39s). Removing 100.000 users from a group takes fast_member_remove() s (ms). Improve run time of both functions.
Fix commited in git branch dtroeder/51233_fast_member_add_remove: 3adb671b88 Bug #51233: remove nested for loops in fast_member_add and fast_member_remove Please reopen when QAed for merge and build.
My benchmark results: Before: fast_member_add(100.000 user): 819s (13m39s) fast_member_remove(100.000 user): after 70min still not finished Fater: fast_member_add(100.000 user): 15s fast_member_remove(100.000 user): 190s (3m10s)
How I benchmarked: ---------------------------------------------------- root@m70:~# udm groups/group create --set name=testgr01 WARNING: The object is not going to be created underneath of its default containers. Object created: cn=testgr01,dc=uni,dc=dtr root@m70:~# ipython In [1]: from univention.admin.uldap import getAdminConnection ...: lo, pos = getAdminConnection() In [2]: users = [("user{:03d}".format(num), "uid=user{:03d},cn=users,dc=uni,dc=dtr".format(num)) for num in range(1000)] In [5]: ml = [("memberUid", [], [u[0] for u in users]), ("uniqueMember", [], [u[1] for u in users])] In [7]: import time In [8]: t0 = time.time(); lo.modify("cn=testgr01,dc=uni,dc=dtr", ml); t1 = time.time(); print(t1 - t0) 0.323982000351 In [10]: t0 = time.time(); lo.modify("cn=testgr01,dc=uni,dc=dtr", [("memberUid", [u[0] for u in users], [u[0] for u in users]+["user1000"]), ("uniqueMember", [u[1] for ...: u in users], [u[1] for u in users]+["uid=user1000,cn=users,dc=uni,dc=dtr"])]); t1 = time.time(); print(t1 - t0) 0.590006113052 In [11]: import univention.admin.modules In [12]: module_groups = univention.admin.modules.get("groups/group") In [15]: univention.admin.modules.init(lo, pos, module_groups) In [16]: grp_obj = module_groups.object(None, lo, pos, "cn=testgr01,dc=uni,dc=dtr") In [17]: len(grp_obj["users"]) Out[17]: 0 In [18]: t0 = time.time(); grp_obj.open(); t1 = time.time(); print(t1 - t0) 0.0287919044495 In [19]: len(grp_obj["users"]) Out[19]: 1001 In [20]: t0 = time.time(); grp_obj.fast_member_remove(["uid=user1001,cn=users,dc=uni,dc=dtr"], ["user1001"]); t1 = time.time(); print(t1 - t0) 0.0296738147736 root@m70:~# time udm groups/group list --filter cn=testgr01 | grep users: | wc -l 1001 real 0m1,432s user 0m0,088s sys 0m0,008s In [23]: users = [("user{:03d}".format(num), "uid=user{:03d},cn=users,dc=uni,dc=dtr".format(num)) for num in range(1001, 100000)] In [28]: t0 = time.time(); grp_obj.fast_member_add([u[1] for u in users], [u[0] for u in users]); t1 = time.time(); print(t1 - t0) 819.554710865 # 13m39s In [32]: grp_obj = module_groups.object(None, lo, pos, "cn=testgr01,dc=uni,dc=dtr") In [33]: t0 = time.time(); grp_obj.open(); t1 = time.time(); print(t1 - t0) 3.02851104736 In [34]: len(grp_obj["users"]) Out[34]: 100000 root@m70:~# cp /usr/lib/python2.7/dist-packages/univention/admin/handlers/groups/group.py /usr/lib/python2.7/dist-packages/univention/admin/handlers/groups/group.py.oriroot@m70:~# cp /sync/univention-directory-manager-modules/modules/univention/admin/handlers/groups/group.py /usr/lib/python2.7/dist-packages/univention/admin/handlers/groups/group.py root@m70:~# rm /usr/lib/python2.7/dist-packages/univention/admin/handlers/groups/group.pyc root@m70:~# pkill -f univention-cli-server root@m70:~# ipython In [1]: from univention.admin.uldap import getAdminConnection In [2]: lo, pos = getAdminConnection() In [3]: import time In [4]: users = [("user{:03d}".format(num), "uid=user{:03d},cn=users,dc=uni,dc=dtr".format(num)) for num in range(100000)] In [5]: import univention.admin.modules In [6]: module_groups = univention.admin.modules.get("groups/group") In [7]: univention.admin.modules.init(lo, pos, module_groups) In [8]: grp_dn = "cn=testgr02,dc=uni,dc=dtr" In [9]: grp_obj = module_groups.object(None, lo, pos, grp_dn) In [10]: t0 = time.time(); grp_obj.open(); t1 = time.time(); print(t1 - t0) 0.00237202644348 In [11]: len(grp_obj["users"]) Out[11]: 0 In [12]: len(users) Out[12]: 100000 In [13]: t0 = time.time(); grp_obj.fast_member_add([u[1] for u in users], [u[0] for u in users]); t1 = time.time(); print(t1 - t0) 15.1061120033 In [14]: grp_obj = module_groups.object(None, lo, pos, grp_dn) In [15]: t0 = time.time(); grp_obj.open(); t1 = time.time(); print(t1 - t0) 2.84513306618 In [16]: len(grp_obj["users"]) Out[16]: 100000
The patch is already part of UCS 5 (during the Python 3 migration) - slightly adapted.
(In reply to Florian Best from comment #5) > The patch is already part of UCS 5 (during the Python 3 migration) - > slightly adapted. git:db0d44f0ef5a56542293538c641e4de50d07f6de
I have backported some changes from the UCS 5.0 branch: [dtroeder/51233_fast_member_add_remove] bb9e99d3dc Bug #51233: backport changes from ucs 5.0 A typo was fixed: [dtroeder/51233_fast_member_add_remove] f9735ee23d fixup! Bug #51233: remove nested for loops in fast_member_add and fast_member_remove I added a benchmark script to show the performance difference: [dtroeder/51233_fast_member_add_remove] e8e14a29a0 Bug #51233: benchmark script, do not merge to main
The benchmark script mocks the LDAP access. So it only measures the time used by the UDM Python code in fast_member_add/( and fast_member_remove(). The results are:
Current UDM in 4.4 (univention-directory-manager 14.0.20-17A~4.4.0.202108311301): root@m70:/sync/univention-directory-manager-modules# time ./add_remove_lots_of_users.py No group deleted, nothing found with name='testgr20210902'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 10000 users... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 165 chars long ==========> Removed a user from a group with 10000 users in 0.050 seconds (198834/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 50000 users... FakeLo.get() -> returning 50000 users FakeLo.modify() -> args repr 166 chars long ==========> Removed a user from a group with 50000 users in 0.249 seconds (201048/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 100000 users... FakeLo.get() -> returning 100000 users FakeLo.modify() -> args repr 166 chars long ==========> Removed a user from a group with 100000 users in 0.543 seconds (183993/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 10000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 130070 chars long ==========> Added 10000 users to the group in 181.090 seconds (55/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 50000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 2250094 chars long ==========> Added 50000 users to the group in 1172.931 seconds (42/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 100000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 4900094 chars long ==========> Added 100000 users to the group in 2411.696 seconds (41/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. real 62m47,996s user 62m23,800s sys 0m1,200s
Patched UDM from dtroeder/51233_fast_member_add_remove: root@m70:/sync/univention-directory-manager-modules# time ./add_remove_lots_of_users.py No group deleted, nothing found with name='testgr20210902'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 10000 users... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 165 chars long ==========> Removed a user from a group with 10000 users in 0.024 seconds (421284/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 50000 users... FakeLo.get() -> returning 50000 users FakeLo.modify() -> args repr 166 chars long ==========> Removed a user from a group with 50000 users in 0.122 seconds (408417/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Removing a user from the group that has 100000 users... FakeLo.get() -> returning 100000 users FakeLo.modify() -> args repr 166 chars long ==========> Removed a user from a group with 100000 users in 0.260 seconds (384080/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 10000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 130070 chars long ==========> Added 10000 users to the group in 0.046 seconds (217557/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 50000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 2250094 chars long ==========> Added 50000 users to the group in 0.141 seconds (354303/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Created group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. Adding 100000 users to the group that has already 10000 users with UDM... FakeLo.get() -> returning 10000 users FakeLo.modify() -> args repr 4900094 chars long ==========> Added 100000 users to the group in 0.272 seconds (367151/s). <========== Deleted group 'cn=testgr20210902,cn=groups,dc=uni,dc=dtr'. real 0m2,275s user 0m1,748s sys 0m0,256s
The code is already in production use in UCS 5.0. Looking for someone to QA this.
Merged and build in 4.4-8: [4.4-8] e0ccea694e Bug #51233: remove nested for loops in fast_member_add and fast_member_remove [4.4-8] 0759e6aaf3 Bug #51233: advisory update univention-directory-manager-modules (14.0.20-24) Thanks for the QA.
Created attachment 10845 [details] benchmark script (add_remove_lots_of_users.py) Benchmark script for posterity, as the git branch will be deleted.
Your patch to make the UDM code in UCS 4.4 and 5.0 more similar was applied. [4.4-8] d44fb1eba9 Bug #51233: make ucs4 and ucs5 code more alike [4.4-8] 2fecc47975 Bug #51233: advisory update
OK: performance increase in fast_member_add() and fast_member_remove() OK: mixed unicode and str works correctly OK: YAML
<https://errata.software-univention.de/#/?erratum=4.4x1077>