Bug 30991 - Performance problem with many groups
Performance problem with many groups
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UMC - Groups
UCS 3.1
All Linux
: P3 normal (vote)
: UCS 3.1-1-errata
Assigned To: Dirk Wiesenthal
Stefan Gohmann
:
Depends on:
Blocks: 32096 31560
  Show dependency treegraph
 
Reported: 2013-04-05 18:09 CEST by Philipp Hahn
Modified: 2013-07-26 14:28 CEST (History)
5 users (show)

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): Large environments, Usability
Max CVSS v3 score:


Attachments
do not open() when not necessary (1.91 KB, patch)
2013-05-10 20:11 CEST, Dirk Wiesenthal
Details | Diff
Output of syntax inspection (6.93 KB, text/plain)
2013-05-10 20:50 CEST, Dirk Wiesenthal
Details
Speedup syntax lookups (13.65 KB, patch)
2013-05-12 23:14 CEST, Dirk Wiesenthal
Details | Diff
Non-invasive, moderate speedup (4.75 KB, patch)
2013-05-13 12:49 CEST, Dirk Wiesenthal
Details | Diff
Speedup syntax lookups (23.37 KB, patch)
2013-05-14 20:24 CEST, Dirk Wiesenthal
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Philipp Hahn univentionstaff 2013-04-05 18:09:32 CEST
Opening the UDM users or computers module takes "very" long.

groups/group.py#open() does not seem to be the culprit, since a "udm users/user list --filter uid=XXXXXX" from the command line is instant (<0.5s).

"udm groups/group list" on the other hand takes 11 seconds, which matches roughly the time UMC UDM takes.

1213 groups.
No groups in groups.
No mixed hosts / users groups.
The largest group has 3572 users.
The user is member of 26 groups with 16,17²,21²,22,25,39,67,78,88,112,119³,121²,137,144,176,251,1945,2814,2872,3296,3572 members.
Comment 1 Stefan Gohmann univentionstaff 2013-05-08 17:46:57 CEST
After changing the syntax for the primary group to string or set the size limit the situation is a little bit better. Opening a windows computer takes 12 seconds instead of 45 seconds:

# Syntax
ucr set
directory/manager/web/modules/computers/windows/properties/primaryGroup/syntax=string
/etc/init.d/univention-management-console-server restart

# Limit
ucr set directory/manager/web/sizelimit=500
/etc/init.d/univention-management-console-web-server restart


The real problem is the change from the LDAP syntax to the UDM syntax and specially the open of the object. Removing the open of the object increases the performance:

dev/branches/ucs-3.1/ucs-3.1-2/management/univention-management-console-module-udm$ svn diff
Index: umc/python/udm/udm_ldap.py
===================================================================
--- umc/python/udm/udm_ldap.py  (Revision 39934)
+++ umc/python/udm/udm_ldap.py  (Arbeitskopie)
@@ -1068,7 +1068,7 @@
                def map_choices( obj_list ):
                        result = []
                        for obj in obj_list:
-                               obj.open()
+                               # obj.open()
 
                                if syn.key == 'dn':
                                        key = obj.dn

The open is only necessary if the requested key is set in the open and not in __init__.
Comment 2 Dirk Wiesenthal univentionstaff 2013-05-10 20:11:31 CEST
Created attachment 5220 [details]
do not open() when not necessary

This patch does not open the object when getting key and label is trivial i.e. if it is the DN or if the key/label-string can be built with the info gathered in __init__.
Comment 3 Dirk Wiesenthal univentionstaff 2013-05-10 20:39:32 CEST
The problem is more complex. The patch from Comment#2 cannot prevent obj.open() in general and especially not for a syntax like GroupDN.

The patch has to open() the object when udm_objects.description(obj) is used to preserve compatibility (it was done before).

The first step to improve the situation is to apply the patch to UDM. But at two more steps are needed:

1. udm_objects.description(obj) is used when syn.label is None. This happens quite frequently (it is default). In this case obj.description() is called which returns the first part of the DN:
cn=Domain Admins,cn=groups,dc=ucs31,dc=local,dc=test -> Domain Admins
Many syntax classes rely upon this fallback. Simple solution: Find them and add a label:

class GroupDN(UDM_Objects):
  label = '%(name)s' # obj.info['name'] is set in __init__

Better solution: Add something like:
class UDM_Objects(ISyntax):
  label_from_dn = True

if syn.label is None and syn.label_from_dn:
  label = obj.dn.do_some_formatting()

This way we catch more syntax classes and even future ones. One downside: When this is released with 3.1-2 we have to be very backward-compatible: We have to check which objects defined a custom description() and would have to use this function (and open() the object before calling it).

This is feasible for our own classes but not really for 3rd-party extensions. It can be dynamically checked with (ugly but actually works...) inspect.getsource(handlers.simpleLdap.description) == inspect.getsource(obj.description)

2. Test which attributes are used in syn.label and if they are not set by __init__ check whether they could be added there. Example: gidNumber, uidNumber.
Comment 4 Dirk Wiesenthal univentionstaff 2013-05-10 20:50:58 CEST
Created attachment 5221 [details]
Output of syntax inspection

Attached the output of a simple script I write listing all UDM_Object sytax classes.

Also listing its modules and the attribute names used in syn.key and syn.label. Note that this is not completely correct. I compared the strings with the variable named "mapping". "username" for example is set explicitely in users/user.__init__. And I do not know what happens if one attribute in mapping is not set in LDAP itself...

I did not investigate every attribute in detail but "uidNumber" and "gidNumber" could be added to mapping.
Comment 5 Dirk Wiesenthal univentionstaff 2013-05-10 20:52:02 CEST
(In reply to comment #4)
> Created an attachment (id=5221) [details]
> Output of syntax inspection
> 
> Attached the output of a simple script I write listing all UDM_Object sytax
> classes.
> 
> Also listing its modules and the attribute names used in syn.key and syn.label.

Oh and showing the source of obj.description in case label is None.
Comment 6 Stefan Gohmann univentionstaff 2013-05-10 21:21:34 CEST
(In reply to comment #1)
> After changing the syntax for the primary group to string or set the size limit
> the situation is a little bit better. Opening a windows computer takes 12
> seconds instead of 45 seconds:
> 
> # Syntax
> ucr set
> directory/manager/web/modules/computers/windows/properties/primaryGroup/syntax=string
> /etc/init.d/univention-management-console-server restart

It is better to use the old primaryGroup syntax:

for c in domaincontroller_master domaincontroller_backup domaincontroller_slave memberserver windows
do
ucr set directory/manager/web/modules/computers/$c/properties/primaryGroup/syntax=primaryGroup
done
ucr set directory/manager/web/modules/users/user/properties/primaryGroup/syntax=primaryGroup

My customer test environment is much faster with these settings. Maybe there is another problem in the customer environment.
Comment 7 Stefan Gohmann univentionstaff 2013-05-10 21:49:50 CEST
(In reply to comment #6)
> (In reply to comment #1)
> > After changing the syntax for the primary group to string or set the size limit
> > the situation is a little bit better. Opening a windows computer takes 12
> > seconds instead of 45 seconds:
> > 
> > # Syntax
> > ucr set
> > directory/manager/web/modules/computers/windows/properties/primaryGroup/syntax=string
> > /etc/init.d/univention-management-console-server restart
> 
> It is better to use the old primaryGroup syntax:
> 
> for c in domaincontroller_master domaincontroller_backup domaincontroller_slave
> memberserver windows
> do
> ucr set
> directory/manager/web/modules/computers/$c/properties/primaryGroup/syntax=primaryGroup
> done
> ucr set
> directory/manager/web/modules/users/user/properties/primaryGroup/syntax=primaryGroup
> 
> My customer test environment is much faster with these settings. Maybe there is
> another problem in the customer environment.


This helps while opening a user because in the environment are a lot of UCS servers:
ucr set directory/manager/web/modules/users/user/properties/homeShare/syntax=string
Comment 8 Dirk Wiesenthal univentionstaff 2013-05-12 23:00:09 CEST
(In reply to comment #6)
> It is better to use the old primaryGroup syntax:
> My customer test environment is much faster with these settings. Maybe there is
> another problem in the customer environment.

primaryGroup is not UDM_Objects syntax and thus does not go the long way using the handlers but operates directly on on LDAP DNs.

This is not beatable with UDM_Objects. On the other hand using these syntax classes give a more consistent, user(developer)-friendly interface for defining the syntax. There are a couple of ways to optimize the UDM_Object-syntax classes present.

I will attach a patch that greatly improves performance of syntax lookups. It works by inspecting the well-defined syntax class and not uses the handlers when not needed. If a syntax class uses for example key='dn' and label='dn' one does not need to build the objects.

On my very small environment with only 3000 empty groups and a couple of hosts I iterated over all UDM_Objects syntax classes and compared the new with the old behavior. No differences in my environment.

root@m250:~# time python test.py 1 > syntax-slow.output 2> /dev/null

real    0m11.486s
user    0m8.885s
sys     0m0.392s
root@m250:~# time python test.py > syntax-fast.output 2> /dev/null

real    0m4.937s
user    0m4.212s
sys     0m0.168s

This should be far more impressive in a larger environment.
Comment 9 Dirk Wiesenthal univentionstaff 2013-05-12 23:14:03 CEST
Created attachment 5222 [details]
Speedup syntax lookups

Some caveats:
 * udm_filter is not supported (i.e. it falls back to the slow operations)
 * I had to change some handlers so that they can benefit from the new code. Some more in-depth-udm-dev will have to look at it
 * obj.info changes in handler.__init__() are not recognized. obj._post_unmap() is not recognized
 * LDAP-entries are search with univentionObjectType=users/user. The far more complex filter in users/user.lookup() is not used. It _seems_ to work (which leads to the question why it is not done the way I do it now...

UDM_Objects' key and label is scanned for LDAP-Attributes (handler.mapping is used for that). If everything is already present in LDAP, only uses ldap_connection.search() or even ldap_connection.searchDn().
If more flexibility is needed (see caveats) one may specify always_use_objects = True. For example obj.description() is not supported anymore as the label of a syntax. Always falls back to getting label from DN.

For 3.1-2 one could default always_use_objects to False (safer in terms of backward compatibility) and change it for 3.2 if the overall idea pleases.
Comment 10 Dirk Wiesenthal univentionstaff 2013-05-12 23:18:06 CEST
(In reply to comment #9)
> Some caveats:

If forgot: Handlers like computers/computer do not work anymore. computers/computer dynamically used all computers/* handlers; this is currently impossible because of the way the syntax' udm_modules is used: LDAP is search for univentionObjectType=computers/computer which returns nothing. Here always_use_objects needs to be set.
Comment 11 Stefan Gohmann univentionstaff 2013-05-13 06:21:01 CEST
After applying the patch the DNS and DHCP entries are no longer shown while opening a windows computer.
Comment 12 Dirk Wiesenthal univentionstaff 2013-05-13 12:49:47 CEST
Created attachment 5223 [details]
Non-invasive, moderate speedup

Used a not-so-invasive patch (more or less the obsolete patch):

These are time critical and were optimized by setting the syntax' label explicitely.
DNS_ForwardZone never opened
DNS_ReverseZone never opened
GroupDN never opened
nfsShare never opened
UserDN never opened
WritableShare never opened

These were not critical and not optimized but will receive a speedup for free because of the way the patch works.
network never opened
PrinterProducerList never opened
Service never opened
uccImage never opened
UMC_OperationSet never opened

No diffs between old and new version found so far. I can think of only one situation where this could happen:

Something like this:
class GroupDN(UDM_Objects):
  label = '%(name)s' # solvable without open()

class object(simpleLdap):
  def open(self):
    self.dn = 'cn=messingwithdninopen,$ldap_base'
    self.info['name'] = 'Muhaha, I am changing the name fetched from LDAP in open() for no known reason'

Changes made in syntax.py are somehow code duplications because the result is the same without these new lines.
Comment 13 Dirk Wiesenthal univentionstaff 2013-05-14 20:24:27 CEST
Created attachment 5224 [details]
Speedup syntax lookups

This patch speeds up 5 critical syntax classes by using "raw" lo.search() instead of initializing and opening hander objects.

Can be activated by setting use_objects = False in a UDM_Objects syntax (default for errata: True).

The handler has to define a function def lookup_filter(filter_s=None). See patch for five (similar) examples.
Comment 14 Dirk Wiesenthal univentionstaff 2013-05-14 20:25:52 CEST
Fixed in (3.1-2):
  univention-directory-manager-modules 8.0.155-1.1009.201305141941
  univention-management-console-module-udm 3.0.79-1.315.201305141943
and
  univention-directory-manager-modules 8.0.143-14.1010.201305142002
  univention-management-console-module-udm 3.0.78-2.316.201305142005

Changelog updated, YAMLs updated resp created.
Comment 15 Stefan Gohmann univentionstaff 2013-05-22 07:33:13 CEST
3.1-1-errata:
 - OK: Performance for opening a windows computer, the CPU usage is much lower.
    OLD: 1. opening of a search result object: 17,5
    OLD: 2. opening of a search result object: 14,3
    OLD: 3. opening of a search result object: 11,7
    NEW: 1. opening of a search result object:  4,2
    NEW: 2. opening of a search result object:  4,3
    NEW: 3. opening of a search result object:  4,2
 - FAIL: Opening a share takes about 27 seconds, after setting  the following UCR variables it takes less than 2 seconds:
  directory/manager/web/modules/shares/share/properties/host/syntax=string 
  directory/manager/web/modules/shares/share/properties/group/syntax=string
  directory/manager/web/modules/shares/share/properties/owner/syntax=string
 Can you change these syntax definitions as well? If it takes too much time or if it is too complicate we should fix it later.
 - OK: Sizelimit considered
 - OK: Drop-Downs complete
   - OK: Groups
   - OK: ForwardZone
   - OK: ReverseZone
   - OK: nfsShare
   - OK: UserDN
   - OK: WritableShare

3.1-2:
 - OK: Code comparison

Changelog:
 - FAIL: Can you remove "aggressive" from the changelog entry?

YAML:
 - OK: 2013-05-14-univention-management-console-module-udm.yaml
 - OK: 2013-04-30-univention-directory-manager-modules.yaml
Comment 16 Dirk Wiesenthal univentionstaff 2013-05-22 12:59:06 CEST
(In reply to comment #15)
>  - FAIL: Opening a share takes about 27 seconds, after setting  the following
> UCR variables it takes less than 2 seconds:
>   directory/manager/web/modules/shares/share/properties/host/syntax=string 
>   directory/manager/web/modules/shares/share/properties/group/syntax=string
>   directory/manager/web/modules/shares/share/properties/owner/syntax=string
>  Can you change these syntax definitions as well? If it takes too much time or
> if it is too complicate we should fix it later.

owner: UserID, optimization added, needed to extend mapping
group: GroupID, optimization added, needed to extend mapping
host: UCS_Server, quite complicated to optimize because of some magic applied to the filter in computers.domaincontroller_master.lookup() (and similar modules). I suggest using the UCR workaround for now.

>  - OK: Drop-Downs complete
>    - OK: nfsShare

nfsShare was not optimized. Would be possible although it leads to "code dupication" because the syntax' label has to be the same format as printablename in shares.share.object.open(). As this does not seem to be critical for this bug, it was not done.

> Changelog:
>  - FAIL: Can you remove "aggressive" from the changelog entry?

Done

Fixed in:
  univention-directory-manager-modules 8.0.143-16.1014.201305221243
and:
  univention-directory-manager-modules 8.0.157-1.1015.201305221250
Comment 17 Stefan Gohmann univentionstaff 2013-05-22 13:55:21 CEST
(In reply to comment #16)
> (In reply to comment #15)
> >  - FAIL: Opening a share takes about 27 seconds, after setting  the following
> > UCR variables it takes less than 2 seconds:
> >   directory/manager/web/modules/shares/share/properties/host/syntax=string 
> >   directory/manager/web/modules/shares/share/properties/group/syntax=string
> >   directory/manager/web/modules/shares/share/properties/owner/syntax=string
> >  Can you change these syntax definitions as well? If it takes too much time or
> > if it is too complicate we should fix it later.
> 
> owner: UserID, optimization added, needed to extend mapping
> group: GroupID, optimization added, needed to extend mapping
> host: UCS_Server, quite complicated to optimize because of some magic applied
> to the filter in computers.domaincontroller_master.lookup() (and similar
> modules). I suggest using the UCR workaround for now.

OK
 
> >  - OK: Drop-Downs complete
> >    - OK: nfsShare
> 
> nfsShare was not optimized. Would be possible although it leads to "code
> dupication" because the syntax' label has to be the same format as
> printablename in shares.share.object.open(). As this does not seem to be
> critical for this bug, it was not done.

OK
 
> > Changelog:
> >  - FAIL: Can you remove "aggressive" from the changelog entry?
> 
> Done
> 
> Fixed in:
>   univention-directory-manager-modules 8.0.143-16.1014.201305221243
> and:
>   univention-directory-manager-modules 8.0.157-1.1015.201305221250


OK
Comment 18 Moritz Muehlenhoff univentionstaff 2013-05-22 16:17:48 CEST
http://errata.univention.de/ucs/3.1/109.html
Comment 19 Moritz Muehlenhoff univentionstaff 2013-05-22 16:19:03 CEST
http://errata.univention.de/ucs/3.1/111.html