50531 – UMC module Admin Diary consumes too much RAM leaving UCS Master unresponsive

Bug 50531 - UMC module Admin Diary consumes too much RAM leaving UCS Master unresponsive

Summary: UMC module Admin Diary consumes too much RAM leaving UCS Master unresponsive

Status:	CLOSED FIXED

Alias:	None

Product:	UCS
Classification:	Unclassified
Component:	Admin Diary
Version:	UCS 4.4
Hardware:	Other Linux

Importance:	P5 normal
Target Milestone:	UCS 4.4-3-errata
Assignee:	Johannes Keiser
QA Contact:	Jürn Brodersen

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-11-21 18:52 CET by Arvid Requate
Modified:	2020-08-24 21:04 CEST (History)
CC List:	6 users (show)

See Also:
What kind of report is it?:	Bug Report
What type of bug is this?:	7: Crash: Bug causes crash or data loss
Who will be affected by this bug?:	1: Will affect a very few installed domains
How will those affected feel about the bug?:	5: Blocking further progress on the daily work
User Pain:	0.200
Enterprise Customer affected?:
School Customer affected?:	Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:	2019110421000291
Bug group (optional):
Customer ID:	27658
Max CVSS v3 score:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Arvid Requate

2019-11-21 18:52:43 CET

Ticket #2019110421000291 shows a case where the UMC module Admin Diary process consumes a large amount of RAM memory and causes the a totally unresponsive system. Details about the impact are noted at the ticket.


========================================================================
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     106961 11.0 71.7 6385600 5858076 ?     D    07:07   0:45 /usr/bin/python2.7 /usr/sbin/univention-management-console-module -m admindiary -s /var/run/univention-management-console/106689-1574316420648.socket -d 4 -l de_DE.UTF-8
========================================================================

Apparently the lack of memory causes termination of the docker container process. AFAICS the system is a UCS@school Master. This is the Admin diary config in UCR:

admin/diary/backend/port: 10514
admin/diary/backend:
admin/diary/dbhost: <empty>
admin/diary/dbms: postgresql
appcenter/apps/admindiary-backend/status: installed
appcenter/apps/admindiary-backend/ucs: 4.4
appcenter/apps/admindiary-backend/version: 1.0
appcenter/apps/admindiary-frontend/status: installed
appcenter/apps/admindiary-frontend/ucs: 4.4
appcenter/apps/admindiary-frontend/version: 1.0
appcenter/prudence/docker/admindiary-backend: yes
appcenter/prudence/docker/admindiary-frontend: yes
security/packetfilter/package/univention-admin-diary/tcp/10514/all/en: rsyslog relp+TLS
security/packetfilter/package/univention-admin-diary/tcp/10514/all: ACCEPT

Comment 1 Florian Best

2019-12-12 08:45:31 CET

Do we have information about the size of the mysql/postgres database?
Maybe the admin diary libraries are reading the whole database into memory?

Comment 2 Florian Best

2019-12-13 12:25:05 CET

(In reply to Florian Best from comment #1)
> Do we have information about the size of the mysql/postgres database?
> Maybe the admin diary libraries are reading the whole database into memory?
This theory is true!

With the following patch, one can see the SQL statements:
diff --git services/univention-admin-diary/python/admindiary/backend.py services/univention-admin-diary/python/admindiary/backend.py
index c60d2d5a02..138dba5f77 100644
--- services/univention-admin-diary/python/admindiary/backend.py
+++ services/univention-admin-diary/python/admindiary/backend.py
@@ -59,7 +59,7 @@ def get_engine():
        db_url = '%s://admindiary:%s@%s/admindiary' % (dbms, password, dbhost)
        if dbms == 'mysql':
                db_url = db_url + '?charset=utf8mb4'
-       return sqlalchemy.create_engine(db_url)
+       return sqlalchemy.create_engine(db_url, echo=True)


import univention.admindiary.backend
with univention.admindiary.backend.get_client(version=1) as client:
    q = client.query()
^C
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "/usr/lib/python2.7/dist-packages/univention/admindiary/backend.py", line 280, in query
    for entry in entries:
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/loading.py", line 67, in instances
    fetch = cursor.fetchall()
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 964, in fetchall
    l = self.process_rows(self._fetchall_impl())
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/engine/result.py", line 915, in _fetchall_impl
    return self.cursor.fetchall()
  File "/usr/lib/python2.7/dist-packages/psycopg2/tz.py", line 59, in __new__
    def __new__(cls, offset=None, name=None):
KeyboardInterrupt

→ The used select statement is the following:

SELECT entries.id AS entries_id, entries.username AS entries_username, entries.hostname AS entries_hostname, entries.message AS entries_message, entries
.timestamp AS entries_timestamp, entries.context_id AS entries_context_id, entries.event_id AS entries_event_id, entries.main_id AS entries_main_id 
FROM entries WHERE entries.event_id IS NOT NULL

The number of selected rows are:
SELECT COUNT(entries.id) FROM entries WHERE entries.event_id IS NOT NULL
→ 4114017L

over 4 million entries!

Let's set a limit of 10 000 entries!

diff --git services/univention-admin-diary/python/admindiary/backend.py services/univention-admin-diary/python/admindiary/backend.py
index c60d2d5a02..a40d59858e 100644
--- services/univention-admin-diary/python/admindiary/backend.py
+++ services/univention-admin-diary/python/admindiary/backend.py
@@ -273,9 +273,9 @@ class Client(object):
                        else:
                                ids.intersection_update(pattern_ids)
                if ids is None:
-                       entries = self._session.query(Entry).filter(Entry.event_id != None)  # noqa: E711
+                       entries = self._session.query(Entry).filter(Entry.event_id != None).limit(10000)  # noqa: E711
                else:
-                       entries = self._session.query(Entry).filter(Entry.id.in_(ids), Entry.event_id != None)  # noqa: E711
+                       entries = self._session.query(Entry).filter(Entry.id.in_(ids), Entry.event_id != None).limit(10000)  # noqa: E711
                res = []
                for entry in entries:
                        event = entry.event

→ The raw select is fast then and doesn't consume to much memory!

But: Afterwards for every of this query a COUNT() of comments is done:

SELECT count(*) AS count_1                                                                                                                              
FROM (SELECT entries.id AS entries_id, entries.username AS entries_username, entries.hostname AS entries_hostname, entries.message AS entries_message, entries.timestamp AS entries_timestamp, entries.context_id A
S entries_context_id, entries.event_id AS entries_event_id, entries.main_id AS entries_main_id                                                                                                                     
FROM entries 
WHERE entries.context_id = %(context_id_1)s AND entries.message IS NOT NULL) AS anon_1

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/univention/admindiary/backend.py", line 287, in query
    comments = self._session.query(Entry).filter(Entry.context_id == entry.context_id, Entry.message != None).count()  # noqa: E711
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/operators.py", line 310, in __ne__
    return self.operate(ne, other)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 175, in operate
    return op(self.comparator, *other, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/operators.py", line 310, in __ne__
    return self.operate(ne, other)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/properties.py", line 269, in operate
    return op(self.__clause_element__(), *other, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/operators.py", line 310, in __ne__
    return self.operate(ne, other)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 742, in operate
    return op(self.comparator, *other, **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/operators.py", line 310, in __ne__
    return self.operate(ne, other)
  File "<string>", line 1, in <lambda>
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/type_api.py", line 60, in operate
    return o[0](self.expr, op, *(other + o[1:]), **kwargs)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/default_comparator.py", line 50, in _boolean_compare
    negate=operators.is_)
  File "/usr/lib/python2.7/dist-packages/sqlalchemy/sql/elements.py", line 2742, in __init__
    if isinstance(operator, util.string_types):
KeyboardInterrupt

This slows down things a lot as well! And it adds 2 MB RAM usage per second (why?).

sqlalchemy is also not using iterators here. We should have a look if we can use its `connection.execution_options(stream_results=True)`

Wr should add pagination to the UMC module.

Client.query() should proably use yield instead of `return ret`.

Comment 3 Johannes Keiser

2020-02-17 12:15:57 CET

e274eb8b74 Bug #50531: debian changelog
317d60167f Bug #50531: 87_admindiary/01_query_limit file permissions
16033fa975 Bug #50531: debian changelog
2d5b2ac870 Bug #50531: add uct-test-admindiary package
640f452066 Bug #50531: Merge branch 'jkeiser/4.4-3/admindiary_limit' into 4.4-3
d0cf3f3aa7 Bug #50531: Debian changelog
4be6f6849f Bug #50531: yaml
6effd659c1 Bug #50531: add test case
eac0e8686e Bug #50531: ucr variable and apply limit

Successful build
Package: univention-admin-diary
Version: 1.0.0-49A~4.4.0.202002171141

Successful build
Package: ucs-test
Version: 9.0.3-155A~4.4.0.202002171213


Added the ucr variable admin/diary/query/limit that, if set, add an limit to the sql query. If the variable is set to 0 then no limit is added. If the variable is unset or can't be converted to an int then 1000 is used as limit.

Comment 4 Jürn Brodersen

2020-02-17 23:06:46 CET

What I tested:
Default is 1000 -> OK
admin/diary/query/limit is used as a limit -> ok
A warning is shown on the umc, if the limit is reached -> OK

Reopen:
using a negative value results in a traceback
Could you format the name of the ucr variable on the umc a little bit different?
Version is missing in yaml

Comment 5 Johannes Keiser

2020-02-18 12:14:23 CET

> Reopen:
> using a negative value results in a traceback
> Could you format the name of the ucr variable on the umc a little bit
> different?
> Version is missing in yaml

fdc3170c93 Bug #50531: yaml
21ad58acab Bug #50531: debian changelog
a1af353a8f Bug #50531: negative limits are ignored and the default is used

Successful build
Package: univention-admin-diary
Version: 1.0.0-50A~4.4.0.202002181208(In reply to Jürn Brodersen from comment #4)

Comment 6 Johannes Keiser

2020-02-18 15:56:03 CET

2dfd1fdec2 Bug #50531: debian changelog
73d512fe55 Bug #50531: add univention-admin-diary-backend package for autotest master

Successful build
Package: ucs-test
Version: 9.0.3-156A~4.4.0.202002181553

Comment 7 Jürn Brodersen

2020-02-19 10:43:12 CET

No traceback for negative values anymore -> OK
Test: https://jenkins.knut.univention.de:8181/job/UCS-4.4/job/UCS-4.4-3/job/AutotestJoin/SambaVersion=s4,Systemrolle=master/lastCompletedBuild/testReport/87_admindiary/01_query_limit/master091/ -> OK

-> Verified

Comment 8 Erik Damrose

2020-02-19 16:12:17 CET

<http://errata.software-univention.de/ucs/4.4/444.html>