Univention Bugzilla – Bug 47114
UMC Memory Leaks with UVMM
Last modified: 2021-09-21 14:11:21 CEST
Customer's UMC uses an increasing amount of memory. During one hour the usage of "univention-management-console-server" grew from 600MB to 2.3GB! At this stage users trying to open UVMM get the error: ======================================= UCS Version: 4.2-3 errata310 (Lesum) Konnte das Modul nicht öffnen. Auf dem Server ist nicht genügend freier Arbeitsspeicher verfügbar. Bitte später erneut versuchen. None Request: handle_request_command Konnte das Modul nicht öffnen. Auf dem Server ist nicht genügend freier Arbeitsspeicher verfügbar. Bitte später erneut versuchen. ======================================================== A restart of the service cleans up and the system works as expected. It happens approximately once a day. Here output of "top": =================================== PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 18857 root 20 0 1885552 1,322g 25096 S 0,0 34,2 0:51.19 /usr/bin/python2.7 /usr/sbin/univention-management-console-module -m uvmm -s /var/run/univention-management-console/4415-1527726624735.socket -d 2 -l de_DE.UTF-8 4415 root 20 0 2771332 678012 13928 S 0,3 16,7 224:42.58 /usr/bin/python2.7 /usr/sbin/univention-management-console-server start 13075 root 20 0 862764 170908 25176 S 0,3 4,2 0:10.46 /usr/bin/python2.7 /usr/sbin/univention-management-console-module -m uvmm -s /var/run/univention-management-console/4415-1527744643097.socket -d 2 -l de_DE.UTF-8 28211 root 20 0 755972 116332 24704 S 0,0 2,9 0:05.89 /usr/bin/python2.7 /usr/sbin/univention-management-console-module -m uvmm -s /var/run/univention-management-console/4415-1527747733269.socket -d 2 -l de_DE.UTF-8 32734 root 20 0 2008344 49632 0 S 7,3 1,2 27:58.19 /usr/bin/python2.7 /usr/sbin/univention-virtual-machine-manager-daemon -l /var/log/univention/virtual-machine-manager-daemon-errors.log 1062 root 20 0 3690100 24192 5272 S 0,3 0,6 3890:32 /usr/sbin/slapd -h ldapi:/// ldap://:7389/ ldap://:389/ ldaps://:7636/ ldaps://:636/ =================================== Appears it is not limited to UVMM module.
Possibly related: See also https://forge.univention.org/bugzilla/show_bug.cgi?id=36640
Probably related Ticket#2018062321000016.
This is not caused by pickle. The leak is easy to see with collectd. I can re-produce the leak with: from univention.lib.umc import Client from pprint import pprint umc = Client('dc0', 'Administrator', 'univention') while True: pprint(umc.umc_command('uvmm/query', dict(nodePattern='*', type='node')).data) 'service/query' on the other hand does not show that leak, so it's UVMM-UMC only. Using the technique from <http://tech.labs.oliverwyman.com/blog/2008/11/14/tracing-python-memory-leaks/> I found this: UVMM_ConnectionThread 14480 Message 14480 The problem seems to be a dependency loop involving UVMM_ConnectionThread, which is a sub-class of Python-Notifier, which implements a __del__ method, which is very bad: <https://docs.python.org/2/library/gc.html#gc.garbage> Combined with <https://docs.python.org/2/library/sys.html#sys.exc_info> this leads to this leak.
Created attachment 9587 [details] collectd VirtualMemorySize graph of UVMM-UMC
As we have our own fork of python-notifier <https://git.knut.univention.de/univention/python-notifier/tree/ucs>, the workflow was as following: 1. git clone -b ucs -o gitlab git.knut.univention.de:univention/python-notifier 2. apply patches - please note that the fix for Bug #44222 was applied as a repo-ng Patch; I merged the patch into that repository and told repo-ng to skip copying the patch. 3. git push gitlab 4. switch to UCS repo ./management/python-notifier/ 5. git submodule iniit python-notifier 6. git submodule update python-notifier 7. ( cd python-notifier && git pull --rebase ) 8. git add python-notifier 9. dch -i 'Bug #47114: Fix memory leak' 10. git add debian/changelog 11. git commit -m 'Bug #47114: Fix memory leak' 12. git pull --rebase 13. git push ... Package: python-notifier Version: 0.9.7-5A~4.3.0.201807041038 Version: 0.9.7-5A~4.3.0.201807041139 Branch: ucs_4.3-0 Scope: errata4.3-1 [4.3-1] 3a266ced6c Bug #47114: python-notifier 0.9.7-5A~4.3.0.201807041139 doc/errata/staging/python-notifier.yaml | 11 +++++++++++ 1 file changed, 11 insertions(+) OK: zless /usr/share/doc/python-notifier/changelog.Debian.gz FYI: <https://github.com/crunchy-github/python-notifier/pull/14> TODO: After QA clone the bug for UCS-4.2-4 and apply the same logic.
(In reply to Philipp Hahn from comment #6) > TODO: After QA clone the bug for UCS-4.2-4 and apply the same logic. As an alternative announce the same binary also for that older release as it contains also the fix for Bug #44222 . The test is running on xen1 and the VM size no longer increases: <http://xen1.knut.univention.de/collectd/bin/index.cgi?hostname=dc0.phahn.dev&plugin=processes×pan=86400&action=show_selection&ok_button=OK>
I saw the pull request in python-notifier. The changes there will cause that some tracebacks are broken and not displayed in UMC anymore. See: management/univention-management-console/src/univention/management/console/auth.py: » » » AUTH.error(''.join(traceback.format_exception(*thread.exc_info))) management/univention-management-console/src/univention/management/console/base.py: » » self.__error_handling(request, method, *thread.exc_info)
(In reply to Florian Best from comment #8) > I saw the pull request in python-notifier. The changes there will cause that > some tracebacks are broken and not displayed in UMC anymore. See: > > management/univention-management-console/src/univention/management/console/ > auth.py: > » » » AUTH.error(''.join(traceback.format_exception(*thread.exc_info))) > > management/univention-management-console/src/univention/management/console/ > base.py: > » » self.__error_handling(request, method, *thread.exc_info) Better then a crash due OOM; UMC should be fixed: <https://git.knut.univention.de/univention/ucs/commit/15621de896819a686d593c360b309eedef5475e2>
--- mirror/ftp/4.1/unmaintained/4.1-1/source/python-notifier_0.9.7-4.30.201601271815.dsc +++ apt/ucs_4.3-0-errata4.3-1/source/python-notifier_0.9.7-5A~4.3.0.201807041139.dsc @@ -1,6 +1,10 @@ -0.9.7-4.30.201601271815 [Wed, 27 Jan 2016 18:15:13 +0100] Univention builddaemon <buildd@univention.de>: +0.9.7-5A~4.3.0.201807041139 [Wed, 04 Jul 2018 11:39:26 +0200] Univention builddaemon <buildd@univention.de>: * UCS auto build. No patches were applied to the original source package + +0.9.7-5 [Wed, 04 Jul 2018 10:36:25 +0200] Philipp Hahn <hahn@univention.de>: + + * Bug #47114: Fix memory leak 0.9.7-4 [Wed, 27 Jan 2016 18:12:51 +0100] Florian Best <best@univention.de>: <http://10.200.17.11/4.2-4/#7474818989236411566>
--- mirror/ftp/4.3/unmaintained/component/4.3-1-errata/source/python-notifier_0.9.7-4.A~4.3.0.201806071253.dsc +++ apt/ucs_4.3-0-errata4.3-1/source/python-notifier_0.9.7-5A~4.3.0.201807041139.dsc @@ -1,7 +1,10 @@ -0.9.7-4.A~4.3.0.201806071253 [Tue, 26 Jun 2018 13:25:55 +0200] Univention builddaemon <buildd@univention.de>: +0.9.7-5A~4.3.0.201807041139 [Wed, 04 Jul 2018 11:39:26 +0200] Univention builddaemon <buildd@univention.de>: - * UCS auto build. The following patches have been applied to the original source package - 10_bug44222_overflowerror + * UCS auto build. No patches were applied to the original source package + +0.9.7-5 [Wed, 04 Jul 2018 10:36:25 +0200] Philipp Hahn <hahn@univention.de>: + + * Bug #47114: Fix memory leak 0.9.7-4 [Wed, 27 Jan 2016 18:12:51 +0100] Florian Best <best@univention.de>: <http://10.200.17.11/4.3-1/#7474818989236411566>
Created attachment 9589 [details] reproduce memory leak How to reproduce: Download mem_eater.py Add: ''' def _big_exception(): raise Exception(['foo'] * 1000000) return lambda _1, _2: _big_exception() ''' to univention/management/console/modules/top/__init__.py:55 (inside at the top of the query function) python mem_eater.py Watch the memory consumption
As an alternative to changing the api, it seems adding a manual garbage collection then the thread is removed from the global _threads variable has an comparable effect. As a drawback that wouldn't help if some other code holds on to the thread object.
I think Comment 8 / Comment 9 need to be addressed before releasing this.
(In reply to Arvid Requate from comment #14) > I think Comment 8 / Comment 9 need to be addressed before releasing this. As discussed lets fix UMC and Co: [4.3-1] 4d9287adf2 Bug #47114: Fix notifyer traceback memory leaks YAML [4.3-1] dc3023661c Bug #47114: Fix notifyer traceback memory leaks [4.3-1] fe1c5308cd Bug #47114 USS: Fix notifier traceback memory leak [4.3-1] 1d3e78dc87 Bug #47114 UAC: Fix notifier traceback memory leak [4.3-1] 96965b711c Bug #47114 join: Fix notifier traceback memory leak [4.3-1] d381fa1933 Bug #47114 UMC: Fix notifier traceback memory leak After successful ucs-test and QA cherry-pick to 4.2-4 is needed.
Package: univention-system-setup Version: 11.0.5-11A~4.3.0.201807311709 Branch: ucs_4.3-0 Scope: errata4.3-1 Package: univention-join Version: 10.0.0-18A~4.3.0.201807311721 Branch: ucs_4.3-0 Scope: errata4.3-1 Package: univention-appcenter Version: 7.0.2-18A~4.3.0.201807311723 Branch: ucs_4.3-0 Scope: errata4.3-1 Package: univention-management-console Version: 10.0.6-9A~4.3.0.201807311732 Branch: ucs_4.3-0 Scope: errata4.3-1 [4.3-1] 90c1cd94e8 Bug #47114: Fix notifyer traceback memory leaks YAML 2
Tracebacks from inside threads don't print the context from inside the thread anymore but instead the context during which they were reraised. __error_handling management/univention-management-console/src/univention/management/console/base.py:307 L292 calls __error_handling if a traceback occurred inside a threads L258 calls __error_handling for "normal" tracebacks I suggest instead of reraising the exception we raise a more generic exception like "UmcOperationsError" with the already formatted traceback inside the message.
(In reply to Jürn Brodersen from comment #17) > I suggest instead of reraising the exception we raise a more generic > exception like "UmcOperationsError" with the already formatted traceback > inside the message. No please not, it was hard work to get rid of such an exception handling.
(In reply to Florian Best from comment #18) > (In reply to Jürn Brodersen from comment #17) > > I suggest instead of reraising the exception we raise a more generic > > exception like "UmcOperationsError" with the already formatted traceback > > inside the message. > > No please not, it was hard work to get rid of such an exception handling. Any Ideas? :) The traceback object is gone at that point and I would really like to see the context from inside the thread.
Created attachment 9622 [details] patch Maybe this?
(In reply to Florian Best from comment #20) > Created attachment 9622 [details] > Maybe this? Yes, which one small fix. Thanks: [4.3-1] 51e79b6b3e Bug #47114 UMC: Fix traceback after fixing memory leak .../univention-management-console/debian/changelog | 6 ++++ .../src/univention/management/console/base.py | 32 ++++++++++++++++++++-- 2 files changed, 36 insertions(+), 2 deletions(-) Package: univention-management-console Version: 10.0.6-10A~4.3.0.201808071548 Branch: ucs_4.3-0 Scope: errata4.3-1 [4.3-1] 2c4274e7fa Bug #47114 UMC: Fix traceback after fixing memory leak YAML doc/errata/staging/univention-management-console.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) OK: errata-announce -V --only univention-management-console.yaml
Looks good now :) I will check on Tuesday if no other packages block the release and set this to verified if possible.
(In reply to Jürn Brodersen from comment #22) > Looks good now :) > > I will check on Tuesday if no other packages block the release and set this > to verified if possible. OK jenkins will check this -> Verified
<http://errata.software-univention.de/ucs/4.3/211.html> <http://errata.software-univention.de/ucs/4.3/213.html> <http://errata.software-univention.de/ucs/4.3/214.html> <http://errata.software-univention.de/ucs/4.3/215.html> <http://errata.software-univention.de/ucs/4.3/216.html>
This broke logging of tracebacks in ucs-school-umc-installer and ucs-school-umc-import.