Bug 56420 - Increased memory usage/leak using custom querying for users/user
Increased memory usage/leak using custom querying for users/user
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UDM - REST API
UCS 5.0
Other Linux
: P5 normal (vote)
: UCS 5.0-6-errata
Assigned To: Florian Best
Marius Meschter
https://git.knut.univention.de/univen...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2023-08-11 12:21 CEST by Timo Hollwedel
Modified: 2024-01-14 09:49 CET (History)
5 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Timo Hollwedel univentionstaff 2023-08-11 12:21:34 CEST
Customer has an environment with ~60000 users and is using UDM REST as part of a overarching project for a custom user import and querying.

Snipped of the custom code that is used:
from udm_rest_client.udm import UDM

def udm_client():
    """Return the Univention UDM API client."""
    return UDM(
        config.settings["api"]["auth"]["username"],
        config.settings["api"]["auth"]["password"],
        config.settings["api"]["link"]["udm_url"],
        verify_ssl=config.settings["api"]["link"]["verify_certs"],
    )

async def read_persons():
    users_in_udm = []
    async with ucs_client.udm_client() as udm:
        users_user = udm.get("users/user")
        async for user in users_user.search("uid=*"):
            users_in_udm.append(user.to_dict())

    return users_in_udm 

They noticed that with multiple subsequent queries the memory usage kept increasing and would only reset after restarting the service manually.

Florian already suggested a patch using the garbage collection from udm_ldap.py and adding an additional gargabe collection after the request is responded, so the defined dictionaries/objects are also cleaned up.

diff --git management/univention-directory-manager-rest/src/univention/admin/rest/module.py management/univention-directory-manager-rest/src/univention/admin/rest/module.py
index 63a62231bc..7c28492c1f 100755
--- management/univention-directory-manager-rest/src/univention/admin/rest/module.py
+++ management/univention-directory-manager-rest/src/univention/admin/rest/module.py
@@ -2873,6 +2873,12 @@ class Objects(FormBase, ReportingBase, _OpenAPIBase, Resource):
         self.add_caching(public=False, no_cache=True, no_store=True, max_age=1, must_revalidate=True)
         self.content_negotiation(result)

+        if search:
+            del objects, result
+            obj = None
+            import gc
+            gc.collect()
+
     async def search(self, module, container, ldap_filter, superordinate, scope, hidden, items_per_page, page, by, reverse):
         ctrls = {}
         serverctrls = []


Customer tried the patch but could not see a change in the behaviour.
Test procedure:

1) REST service restart
2) Measure memory usage (baseline)
3) Loop 3x:
3a) API call
3b) Measure memory usage

The whole once without the patch, as well as once with patch.

After spending some more time today on the memory leak they finally came across a line of code in the i18n module that uses "weakref" but does not discard the weakref references themselves. With the following patch, the problem no longer occurs, and only absolutely minimal memory changes are found between requests.
Patch:
[SPW4] root@ucs-pri-01:/usr/lib/python3/dist-packages/univention# diff -u lib/i18n.py lib/i18n.py.new
--- lib/i18n.py 2023-08-10 19:05:38.412799393 +0200
+++ lib/i18n.py.new     2023-08-10 19:05:23.580799393 +0200
@@ -241,6 +241,7 @@
     def __new__(cls, *args, **kwargs):
         self = object.__new__(cls)
         cls._instances.append(weakref.ref(self))
+        # drop expired weakrefs
+        cls._instances = [ ref for ref in cls._instances if ref() ]
         return self


Without this line, `cls._instances` grows unchecked, even though the targets of the weakrefs are cleanly cleaned up (at most, the cleanup line would not need to be done every time)

    They used the following code to track the memory increase between requests:
       if 'tracker__' not in globals():
           # 1st run
           from pympler import tracker
           globals()['tracker__'] = tracker.SummaryTracker()
       else:
           # nth run
           import os
           print("-------------")
           print("PID", os.getpid())
           globals()['tracker__'].print_diff()
           print("-------------")


They put these lines at the beginning of the `async def get()` from the REST module, the same method where the previous patch had placed the gc.collect() calls. This way, for every request (except the first one), you can see how many objects from the previous request are still "lying around".

The output from the SummaryTracker was what gave the idea that there were a number of "weakrefs" left after each request, which corresponded to the number of users processed.

For better traceability they started the REST service as follows:

/usr/bin/python3 -m univention.admin.rest.server -c 1
Comment 2 Florian Best univentionstaff 2024-01-08 11:13:51 CET
We could not reproduce the bug by just searching for users but had to call the additional /add endpoint which gives a empty user representation:
curl -k https://Administrator:univention@localhost/univention/udm/users/user/add -H 'Accept: application/json'
That reloads UDM modules and the leak gets obvious. But we don't get a leaking weakref per found user.
Nevertheless the code is more correct and prevents small leaks.

univention-lib.yaml
b1fa1ff9dbde | fix(univention-lib): fix memory leak for not existing Translation instances

univention-lib (9.0.17-1)
b1fa1ff9dbde | fix(univention-lib): fix memory leak for not existing Translation instances
Comment 3 Marius Meschter univentionstaff 2024-01-09 09:06:39 CET
QA:
YAML/Changelog: OK
Code review: OK
No more leaking weakrefs: OK