Bug 48545 - replication.py: very slow for groups with lots of members (domain users)
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: LDAP
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-4-errata
Assigned To: Philipp Hahn
QA Contact: Arvid Requate
Depends on: 51061 51093
Blocks:
Reported: 2019-01-30 10:16 CET by Felix Botner
Modified: 2020-10-02 10:12 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.429
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2020032321000375
Bug group (optional):
Max CVSS v3 score:


Description Felix Botner univentionstaff 2019-01-30 10:16:06 CET
Setup: UCS Master with Slave; I already have 10000 user objects.

1. Now create even more users on the master:
@master-> for i in $(seq 1 10000); do udm users/user create --set username=dcbtest1$i --set lastname=test2 --set password=univention; done

2. Because the "Domain Users" group object is now constantly updated on the
   master, replication.py on the slave has to update this object as well.

listener.log

... 22:46:24.389  ... updating 'uid=cbtest1461,dc=four,dc=three' command a
... 22:46:24.440  ... updating 'uid=cbtest1461,dc=four,dc=three' command m
... 22:46:24.446  ... updating 'uid=cbtest1461,dc=four,dc=three' command m
... 22:46:24.450  ... updating 'cn=Domain Users,cn=groups,dc=four,dc=three' command m
... 22:46:50.029  ... updating 'uid=cbtest1462,dc=four,dc=three' command a
... 22:46:50.113  ... updating 'uid=cbtest1462,dc=four,dc=three' command m
... 22:46:50.117  ... updating 'uid=cbtest1462,dc=four,dc=three' command m
... 22:46:50.122  ... updating 'cn=Domain Users,cn=groups,dc=four,dc=three' command m
... 22:47:13.793  ... updating 'uid=cbtest1463,dc=four,dc=three' command a
... 22:47:13.823  ... updating 'uid=cbtest1463,dc=four,dc=three' command m
... 22:47:13.827  ... updating 'uid=cbtest1463,dc=four,dc=three' command m
... 22:47:13.832  ... updating 'cn=Domain Users,cn=groups,dc=four,dc=three' command m

Each update of the group takes more than 20 s on the slave (for as long as the object keeps being modified on the master).
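
The delay per group update can be read off the excerpt above by subtracting the timestamp of each "Domain Users" line from the timestamp of the following log line. A minimal sketch of that arithmetic, using only the timestamps quoted above; the helper name gap_seconds is made up for illustration:

```python
from datetime import datetime


def gap_seconds(start, end):
    """Return the seconds elapsed between two HH:MM:SS.mmm timestamps."""
    fmt = "%H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


# (timestamp of the "Domain Users" update, timestamp of the next log line)
pairs = [
    ("22:46:24.450", "22:46:50.029"),
    ("22:46:50.122", "22:47:13.793"),
]
gaps = [gap_seconds(start, end) for start, end in pairs]
for gap in gaps:
    print("%.3f s" % gap)
```

For the two complete cycles in the excerpt this gives about 25.6 s and 23.7 s, compared with a few milliseconds for the user object updates.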

This is not a bug as such (once the master stops modifying "Domain Users", the slave finishes replicating the object), but it could be improved with something like fast_member_add from the group UDM handler.
Comment 1 Philipp Hahn univentionstaff 2020-04-15 17:41:43 CEST
Happens in a customer environment with hourly updates of groups.
This is caused by
- replication.py doing a (REPLACE, "uniqueMember", […])
- this triggers the LDAP overlay module "memberOf"
- it iterates over all those users
- and does a dummy update of each user
- slapd with LMDB does a fdatasync() for each such update
- this is very slow on systems without fast persistent storage

In some cases this takes longer than 5 minutes, which leads to the LDAP connection to the master (or local slapd?) being closed in the meantime. The next transaction then aborts with TIMEOUT(). This was hidden by Bug #51061.

openldap/servers/slapd/overlays/memberof.c works faster if it gets an incremental update using ADD/DELETE instead of the REPLACE.
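
The idea of replacing the single REPLACE with an incremental DELETE/ADD can be sketched as follows. This is only an illustration under stated assumptions, not the code committed to replication.py: the helper name member_modlist and the example DNs are hypothetical, values are plain strings for readability, and the operation codes are defined locally with the same numeric values as python-ldap's ldap.MOD_ADD, ldap.MOD_DELETE and ldap.MOD_REPLACE.

```python
# Operation codes with the same numeric values as python-ldap's
# ldap.MOD_ADD, ldap.MOD_DELETE and ldap.MOD_REPLACE.
MOD_ADD, MOD_DELETE, MOD_REPLACE = 0, 1, 2


def member_modlist(old, new, attr="uniqueMember"):
    """Build an incremental modlist for a multi-valued member attribute.

    Hypothetical helper: only values that actually changed are sent,
    so the memberOf overlay only has to touch the affected users.
    """
    old_set, new_set = set(old), set(new)
    removed = sorted(old_set - new_set)
    added = sorted(new_set - old_set)
    modlist = []
    if removed:
        modlist.append((MOD_DELETE, attr, removed))
    if added:
        modlist.append((MOD_ADD, attr, added))
    return modlist


old = ["uid=a,cn=users,dc=example,dc=org", "uid=b,cn=users,dc=example,dc=org"]
new = old + ["uid=c,cn=users,dc=example,dc=org"]
print(member_modlist(old, new))
# prints [(0, 'uniqueMember', ['uid=c,cn=users,dc=example,dc=org'])]
```

With a REPLACE the whole member list would be sent again for every new user; here the modlist contains only the one added DN, and an unchanged member list yields an empty modlist.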
Comment 2 Philipp Hahn univentionstaff 2020-04-15 18:06:54 CEST
A similar bug has been fixed in ADC Bug #50630 and is pending for S4C Bug #50629
Comment 3 Philipp Hahn univentionstaff 2020-04-15 18:29:00 CEST
[4.4-4] 5d0a78710c Bug #48545 repl: Do incremental updates for group.uniqueMember
 .../debian/changelog                               |  6 ++++++
 .../replication.py                                 | 24 ++++++++++++++++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

Package: univention-directory-replication
Version: 12.0.0-7A~4.4.0.202004151819
Branch: ucs_4.4-0
Scope: errata4.4-4

[4.4-4] 9f29f95f2b Bug #51093: univention-directory-replication 12.0.0-7A~4.4.0.202004151819
 doc/errata/staging/univention-directory-replication.yaml | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)


QA: replication.py:modlist() augmented with @ud.trace(with_return=True,repr=str)

…  DEBUG_END    : /usr/lib/univention-directory-listener/system/replication.py.modlist(...): […(0, 'uniqueMember', ['uid=nscd07cf,cn=users,dc=phahn,dc=dev'])]
…  DEBUG_END    : /usr/lib/univention-directory-listener/system/replication.py.modlist(...): […(1, 'uniqueMember', ['uid=nscd07cf,cn=users,dc=phahn,dc=dev'])]
Comment 4 Arvid Requate univentionstaff 2020-04-15 21:17:46 CEST
Verified:

* Code change works
* Strictly speaking 'uniqueMember' is just the default for the 'memberof-member-ad' option, which can be overridden via the UCR variable ldap/overlay/memberof/member (see /etc/univention/templates/files/etc/ldap/slapd.conf.d/41univention-ldap-overlay-memberof), but your current change is OK for me for now.
* Advisory Ok
Comment 5 Philipp Hahn univentionstaff 2020-04-16 11:13:23 CEST
PS1: We decided to compare the DNs case-sensitively: even if a DN changes only in case, a DELETE/ADD is still performed. This is desired because we want the Backup/Slave LDAPs to have the same content as the Master. See Bug #46590, where this is already a problem.
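
The effect of the case-sensitive comparison can be sketched like this. It is an illustration only: the helper name changed_members and the example DNs are hypothetical and not taken from replication.py.

```python
def changed_members(old, new):
    """Return (removed, added) using exact, case-sensitive string comparison."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


old = ["uid=Alice,cn=users,dc=example,dc=org"]
new = ["uid=alice,cn=users,dc=example,dc=org"]

removed, added = changed_members(old, new)
print(removed)  # the old spelling is deleted ...
print(added)    # ... and the new spelling is added

# A case-insensitive comparison would see no difference and skip the update.
same_ignoring_case = {dn.lower() for dn in old} == {dn.lower() for dn in new}
print(same_ignoring_case)
```

Comparing the exact strings means a change that only affects case still produces a DELETE of the old value and an ADD of the new one, so the replica ends up with the same spelling as the master.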

PS2: As already documented in <https://help.univention.com/t/memberof-attribute-group-memberships-of-user-and-computer-objects/6439>, "/usr/share/univention-ldap-overlay-memberof/univention-update-memberof" must be called on all roles after the overlay module "memberof" has been enabled. The new behavior filters out the null-change, so running the script on the master no longer triggers the overlay module on backups/slaves. The script already works on Backups/Slaves as it uses uldap.getRootDnConnection(), which uses either "cn=admin" or "cn=update" depending on the server role.
Comment 6 Philipp Hahn univentionstaff 2020-04-16 12:16:42 CEST
[4.4-4] 929762b894 Bug #48545: Fix incremental updates for group.uniqueMember
 management/univention-directory-replication/debian/changelog | 6 ++++++
 management/univention-directory-replication/replication.py   | 6 +++---
 2 files changed, 9 insertions(+), 3 deletions(-)

[4.4-4] e8e1580828 Bug #48545: Fix incremental updates for group.uniqueMember
 .../univention-directory-replication/debian/changelog       |  6 ++++++
 management/univention-directory-replication/replication.py  | 13 ++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)

Package: univention-directory-replication
Version: 12.0.0-9A~4.4.0.202004161212
Branch: ucs_4.4-0
Scope: errata4.4-4

[4.4-4] 0dde5024c7 Bug #51093: univention-directory-replication 12.0.0-9A~4.4.0.202004161212
 doc/errata/staging/univention-directory-replication.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 7 Arvid Requate univentionstaff 2020-04-16 12:36:24 CEST
Verified:
* Jenkins tests of last night Ok on Backup & Slave
* Artificial case changes in group.uniqueMember are replicated correctly
* The UCR variable ldap/overlay/memberof/member is considered correctly (tested by setting it to foo and restarting the listener)
* Advisory
Comment 8 Arvid Requate univentionstaff 2020-04-16 14:10:52 CEST
<http://errata.software-univention.de/ucs/4.4/528.html>