Bug 47885 - Performance issues in umc schoolusers module
Performance issues in umc schoolusers module
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Ucsschool-lib
UCS@school 4.3
Other Linux
: P5 normal (vote)
: UCS@school 4.3 v6-errata
Assigned To: Sönke Schwardt-Krummrich
Daniel Tröder
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2018-10-01 11:28 CEST by Ole Schwiegert
Modified: 2019-03-16 19:20 CET (History)
5 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 4: A User would return the product
User Pain: 0.343
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments
Profiling output (19.63 KB, text/plain)
2018-10-04 18:30 CEST, Sönke Schwardt-Krummrich
Details
open(loadGroups=False) (1.50 KB, patch)
2018-10-05 08:56 CEST, Daniel Tröder
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Ole Schwiegert univentionstaff 2018-10-01 11:28:02 CEST
It was reported, that for some customers the retrieval of users in environments with a number of users in the magnitude of 10^3 is very slow (>30 sec to load the page).

It was determined already that the problem is most likely not in the frontend code (users are only fetched once at page load, so no hiccup here), but in the ucsschool-lib in the schoolldap module if users are retrieved without specifying a group/class:

line 486ff:

else:
			# be aware that this search opens all user objects of specified type and may take some time!
			users = []
			for cls in classes:
				_users = cls.get_all(ldap_connection, school, LDAP_Filter.forUsers(pattern))
				users.extend(user.get_udm_object(ldap_connection) for user in _users)
Comment 1 Sönke Schwardt-Krummrich univentionstaff 2018-10-02 10:34:57 CEST
To be more precise regarding the issue:

the password reset module now uses (in)directly the code referenced by Ole above.
So customers with around 1500 user have to wait around 30-40 seconds when opening the UCS@school password reset modules before the initial search completes and the module can be used.

I see several possible points where we should have a look at:
1) do we have to use this "slow" function within the password reset module? Or are there faster/more effective versions?

2) can we improve the listed code?
Currently it fetches with *one* LDAP search request e.g. 1500 users and afterwards it fetches with *1500* LDAP requests the UDM object data from the LDAP server. Which takes a lot of time.
It is possible to pass a dict with LDAP object attributes to an UDM object __init__ method. In this case, the UDM object does not fetch the LDAP data on its own. So in theory, the whole thing is doable with only *ONE* LDAP search request that fetches all LDAP data for all 1500 users and then it passes smaller chunks to each __init__ method of the UDM user objects.

If you have any questions about that, please consult me.

3) As a last resort: if the UDM objects are always too slow, we should consider to ask the LDAP directly and process the password expiration date on our own without the use of UDM objects.
Comment 2 Daniel Tröder univentionstaff 2018-10-02 11:56:51 CEST
The above code searches once in LDAP and does 2 * <num uses> open() on each UDM object:

def get_all(filter):
-------------
ret = []
for udm_obj in cls.lookup(filter):
    udm_obj.open()
    ret.append(cls.from_udm_obj(udm_obj))
return ret
-------------

def from_udm_obj(udm_obj):
-------------
udm_obj.open()
obj = cls(**deepcopy(attrs))
obj._udm_obj_searched = True
obj._udm_obj = udm_obj
return obj
-------------

def get_udm_object():
-------------
if self._udm_obj_searched is False:
    # we don't go here
    self._udm_obj = ...
    self._udm_obj.open()
    self._udm_obj_searched = True
return self._udm_obj
Comment 3 Daniel Tröder univentionstaff 2018-10-02 11:58:25 CEST
The function _users() in the schoolldap module can easily be changed into a generator.
If the UMC module can handle it, the problem would be solved.
Comment 4 Daniel Tröder univentionstaff 2018-10-04 12:13:49 CEST
Created 2000 students, 500 teachers and 500 teachersandstaff (VM on lattjo):

root@m126:~# time /usr/share/ucs-school-import/scripts/ucs-school-testuser-import --students 2000 --classes 100 SchoolA

Then waited some minutes for listener and s4 connector to catch up...

Opening the "Passwörter (Schüler)" module took 12 seconds of which "query" used 8s alone.
The response was 270 KB of JSON:

{"status": 200, "result": [{"passwordexpiry": -1, "id": "uid=eyvette.stau,cn=schueler,cn=users,ou=SchoolA,dc=uni,dc=dtr", "name": "Stauvermann, Eyvette (eyvette.stau)"}, {"passwordexpiry": -1, "id": "uid=edmond.lueke,cn=schueler,cn=users,ou=SchoolA,dc=uni,dc=dtr", "name": "Lueke, Edmond (edmond.lueke)"}, ...]}

That originates from Instance.query() in ucs-school-umc-users/umc/python/schoolusers/__init__.py.

The Student.get_all() takes 6.5 seconds alone.
The user.get_udm_object() for each user from above command takes 0.01 seconds.

The problem is not really solvable in the backend. It must be fixed in the UMC fronend: No pagination is done!!

The frontend relies on getting the complete data all at once. Instead, when scrolling down, it should keep loading objects from the backend.

A workaround to lessen the waiting time is possible though:
If using a direct LDAP query (not using ucsschool.lib or UDM) the query for the required data takes only 0.14 seconds.

So for now I suggest to do just that: workaround for this module. The same problem exists in all other user modules too ofc...

Implemented said workaround in a branch:

[dtroeder/47885_performance_password_reset a1c091318] Bug #47885: improve performance of password reset module using direct LDAP queries


Opening the password reset module for 2000 students now takes 800ms of which the "query" call uses 455ms (of which 300ms are overhead: the LDAP query alone takes only 160ms).
Comment 5 Jürn Brodersen univentionstaff 2018-10-04 13:28:12 CEST
A bit of context, because I have looked into this in the past as well.

There is no pagination in the backend. That would need to be implemented from the ground up.

Using udm is slow because the whole object is fetched, instead of just the needed attribute. The backend than throws all attributes away that the frontend didn't asked for...

Implementing a way to only fetch selected attributes through udm should yield a similar performance as the ldap query.
Comment 6 Sönke Schwardt-Krummrich univentionstaff 2018-10-04 18:29:53 CEST
(In reply to Jürn Brodersen from comment #5)
> A bit of context, because I have looked into this in the past as well.
> 
> There is no pagination in the backend. That would need to be implemented
> from the ground up.
> 
> Using udm is slow because the whole object is fetched, instead of just the
> needed attribute. The backend than throws all attributes away that the
> frontend didn't asked for...

I don't think that the attributes are the problem. I profiled the code and tracked it down. 

1 LDAP with 1468 users:

1 LDAP search is done for searching all users inclusive LDAP attributes, which returns 1468 users.
Now for each user obj.open() is called. The obj._load_groups() does 4 LDAP lookups on its own → 1468 * 4 LDAP requests → 5872 LDAP request

Each LDAP request takes ~5ms → 5872 * 0.005s → ~30 seconds

see attached output.txt
Comment 7 Sönke Schwardt-Krummrich univentionstaff 2018-10-04 18:30:17 CEST
Created attachment 9691 [details]
Profiling output
Comment 8 Daniel Tröder univentionstaff 2018-10-05 08:56:22 CEST
Created attachment 9692 [details]
open(loadGroups=False)

(In reply to Sönke Schwardt-Krummrich from comment #6)
> 1 LDAP search is done for searching all users inclusive LDAP attributes,
> which returns 1468 users.
> Now for each user obj.open() is called. The obj._load_groups() does 4 LDAP
> lookups on its own → 1468 * 4 LDAP requests → 5872 LDAP request
> 
> Each LDAP request takes ~5ms → 5872 * 0.005s → ~30 seconds
That's strange. In _load_groups() there can be only 1 to 3 LDAP queries. It should usually make 2.

Could you profile the code by patching ucs-school-lib/python/models/base.py to use "open(loadGroups=False)" ? Patch for that is attached.
Comment 9 Sönke Schwardt-Krummrich univentionstaff 2018-12-20 22:04:13 CET
(In reply to Daniel Tröder from comment #8)
> That's strange. In _load_groups() there can be only 1 to 3 LDAP queries. It
> should usually make 2.
> 
> Could you profile the code by patching ucs-school-lib/python/models/base.py
> to use "open(loadGroups=False)" ? Patch for that is attached.

I tested with a variant of your patch (udm_obj.open(False) → udm_obj is not always a users/user object, so I had to catch this). But there was no noticable performance gain.
So you were right, that this does not seem to be the correct approach.

(In reply to Daniel Tröder from comment #4)
> A workaround to lessen the waiting time is possible though:
> If using a direct LDAP query (not using ucsschool.lib or UDM) the query for
> the required data takes only 0.14 seconds.
> 
> So for now I suggest to do just that: workaround for this module. The same
> problem exists in all other user modules too ofc...
> 
> Implemented said workaround in a branch:
> 
> [dtroeder/47885_performance_password_reset a1c091318] Bug #47885: improve
> performance of password reset module using direct LDAP queries

I tested your test branch successfully on my test machine.

> Opening the password reset module for 2000 students now takes 800ms of which
> the "query" call uses 455ms (of which 300ms are overhead: the LDAP query
> alone takes only 160ms).

Test setup: UCS@school with 2264 users in one OU (shown in the grid)

schoolusers/query with original code:
3 runs: ~66s, ~60s, 66s → avg. 64s
Please note: chromes resolution is 0.1min if runtime is larger than 60s!

schoolusers/query with test branch:
5 runs: 7000ms, 7870ms, 9510ms, 7530ms, 7610ms → avg. 7900ms

I noticed that the UDM is using quite a huge LDAP filter:
(&(objectClass=posixAccount)(objectClass=shadowAccount)(objectClass=sambaSamAccount)(objectClass=krb5Principal)(objectClass=krb5KDCEntry)(!(uidNumber=0))(!(uid=*$))(!(univentionObjectFlag=functional))(&(ucsschoolSchool=gsmitte)(&(|(sn=*)(uid=*)(givenName=*)))(&(objectClass=ucsschoolStudent)(!(objectClass=ucsschoolExam)))))

This seems to be a performance issue. Since univentionObjectType is reliably set at all objects, the much simpler query reduced the search time by 50%! For testing, I patched univention/uldap.py to use the following shorter filter string if the filter string above was passed to access.search():
(&(univentionObjectType=users/user)(&(ucsschoolSchool=gsmitte)(&(|(sn=*)(uid=*)(givenName=*)))(&(objectClass=ucsschoolStudent)(!(objectClass=ucsschoolExam)))))

schoolusers/query with special uldap.py and test branch:
3 runs: 4330ms, 4650ms, 4260ms → avg. 4414ms
Comment 10 Sönke Schwardt-Krummrich univentionstaff 2018-12-20 22:45:50 CET
8872b4ba3 Bug #47885: update advisories
5895692b7 Bug #47885: Merge branch 'dtroeder/47885_performance_password_reset' into 4.3
db39812b4 Bug #47885: add advisories
8467350b6 Bug #47885: add changelog entry
04bc85873 Bug #47885: add changelog entry
0e2876c0d Bug #47885: improve performance of password reset module using direct LDAP queries

Package: ucs-school-lib
Version: 11.0.2-1A~4.3.0.201812202240
Branch: ucs_4.3-0
Scope: ucs-school-4.3

Package: ucs-school-umc-users
Version: 14.0.3-1A~4.3.0.201812202240
Branch: ucs_4.3-0
Scope: ucs-school-4.3

I think the load times are okay now and will become much better, once bug 48390 is fixed ("Bug 48390 – UDM performance can be improved by optimizing LDAP filters").

Merged test branch to 4.3 and created changelog/advisory entries. Can you please check them?
Comment 11 Daniel Tröder univentionstaff 2019-01-07 08:49:32 CET
(In reply to Sönke Schwardt-Krummrich from comment #10)
> Merged test branch to 4.3 and created changelog/advisory entries. Can you
> please check them?
OK: merge
OK: changelog
OK: advisory
Comment 12 Daniel Tröder univentionstaff 2019-01-09 17:30:12 CET
Merge to 4.4:

[4.4] 76cf6f0fe Bug #47885: improve performance of password reset module using direct LDAP queries
[4.4] 36e5201b8 Bug #47885: add changelog entry
[4.4] 04c2fce10 Bug #47885: add changelog entry
[4.4] ba002b0b9 Bug #47885: add advisories
[4.4] 03fd4d873 Bug #47885: update advisories
Comment 13 Sönke Schwardt-Krummrich univentionstaff 2019-01-16 15:51:41 CET
UCS@school 4.3 v6 has been released.

https://docs.software-univention.de/changelog-ucsschool-4.3v6-de.html

If this error occurs again, please clone this bug.