Bug 50583 - Memory Leak in univention-management-console-server
Memory Leak in univention-management-console-server
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UMC (Generic)
UCS 4.4
Other Linux
: P5 normal (vote)
: UCS 4.4-5-errata
Assigned To: Florian Best
Arvid Requate
:
: 51047 51402 (view as bug list)
Depends on:
Blocks: 51803
  Show dependency treegraph
 
Reported: 2019-12-02 16:20 CET by Florian Best
Modified: 2020-11-13 12:35 CET (History)
12 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.286
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2019112221000373
Bug group (optional):
Max CVSS v3 score:


Attachments
memory.png (81.09 KB, image/png)
2019-12-02 16:34 CET, Florian Best
Details
memory1.py (359 bytes, text/x-python)
2020-07-09 17:50 CEST, Florian Best
Details
univention-management-console-server memory usage over time (6.13 KB, image/png)
2020-07-09 19:46 CEST, Bernhard M. Wiedemann
Details
patch (2.07 KB, patch)
2020-07-10 11:21 CEST, Florian Best
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Florian Best univentionstaff 2019-12-02 16:20:56 CET
We have several reports about a memory leak in the univention-management-console-server.
Comment 1 Florian Best univentionstaff 2019-12-02 16:34:28 CET
Created attachment 10244 [details]
memory.png

In a customer dump I saw a lot of univention.management.console.protocol.message.Message and univention.management.console.protocol.session.SessionHandler objects.
See attached memory graph.

univention.management.console.protocol.session.SessionHandler has a __del__() method, which is suspicious.
Comment 2 Florian Best univentionstaff 2019-12-02 16:37:34 CET
When restarting the UMC-Server and there are several people logged in (via SAML?) I see in the logfiles (on laiva):

…
02.12.19 15:30:46.593  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704433107-805293') does not exists anymore
02.12.19 15:30:46.687  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704438399-805294') does not exists anymore
02.12.19 15:30:46.862  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704443234-805295') does not exists anymore
02.12.19 15:30:46.889  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704447359-805296') does not exists anymore
02.12.19 15:30:46.946  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704454510-805297') does not exists anymore
02.12.19 15:30:46.977  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:46.977  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704466857-805298') does not exists anymore
02.12.19 15:30:47.014  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704469122-805299') does not exists anymore
02.12.19 15:30:47.110  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704476062-805300') does not exists anymore
02.12.19 15:30:47.133  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704478649-805301') does not exists anymore
02.12.19 15:30:47.187  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:47.187  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704484444-805302') does not exists anymore
02.12.19 15:30:47.240  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704488191-805303') does not exists anymore
02.12.19 15:30:47.298  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:47.298  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704496432-805304') does not exists anymore
02.12.19 15:30:47.362  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:47.362  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704501203-805305') does not exists anymore
02.12.19 15:30:47.565  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:47.565  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704507844-805306') does not exists anymore
02.12.19 15:30:47.629  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704513899-805307') does not exists anymore
02.12.19 15:30:47.684  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704522571-805308') does not exists anymore
02.12.19 15:30:47.705  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704528484-805309') does not exists anymore
02.12.19 15:30:47.728  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704534979-805310') does not exists anymore
02.12.19 15:30:47.753  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:47.753  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704539947-805311') does not exists anymore
02.12.19 15:30:47.782  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704546310-805312') does not exists anymore
02.12.19 15:30:48.023  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704549073-805313') does not exists anymore
02.12.19 15:30:48.047  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704562081-805314') does not exists anymore
02.12.19 15:30:48.338  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704566931-805315') does not exists anymore
02.12.19 15:30:48.372  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704570947-805316') does not exists anymore
02.12.19 15:30:48.398  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.398  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704577035-805317') does not exists anymore
02.12.19 15:30:48.419  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704582421-805318') does not exists anymore
02.12.19 15:30:48.443  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704588237-805319') does not exists anymore
02.12.19 15:30:48.620  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.620  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704603943-805322') does not exists anymore
02.12.19 15:30:48.656  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704611744-805323') does not exists anymore
02.12.19 15:30:48.692  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.692  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704631838-805324') does not exists anymore
02.12.19 15:30:48.731  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.731  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704649977-805325') does not exists anymore
02.12.19 15:30:48.806  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.806  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704592994-805320') does not exists anymore
02.12.19 15:30:48.837  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.837  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704598664-805321') does not exists anymore
02.12.19 15:30:48.858  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.859  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704653571-805326') does not exists anymore
02.12.19 15:30:48.902  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704660657-805327') does not exists anymore
02.12.19 15:30:48.937  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:48.937  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704665046-805328') does not exists anymore
02.12.19 15:30:48.964  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704680481-805329') does not exists anymore
02.12.19 15:30:48.988  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704690642-805330') does not exists anymore
02.12.19 15:30:49.014  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704692743-805331') does not exists anymore
02.12.19 15:30:49.088  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:49.089  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704700241-805332') does not exists anymore
02.12.19 15:30:49.121  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704703836-805333') does not exists anymore
02.12.19 15:30:49.160  MAIN        ( WARN    ) : Module None (command='uvmm/cloud/query', id='157529704707079-805334') does not exists anymore
02.12.19 15:30:49.225  MAIN        ( WARN    ) : Module None (command='uvmm/query', id='157529704715262-805335') does not exists anymore
02.12.19 15:30:49.304  LOCALE      ( WARN    ) : Could not find translation file: 'umc-core.mo'
02.12.19 15:30:49.304  MAIN        ( WARN    ) : Module None (command='uvmm/node/query', id='157529704716907-805336') does not exists anymore
…

→ Is this related to the SessionHandler objects?
The messages indicate that there is a connection downgrade from being authenticated to being unauthenticated. Unauthenticated sessions still have anonymous SessionHandler, they might not be cleaned up correctly?
Comment 4 Florian Best univentionstaff 2019-12-03 10:48:18 CET
Suspections, which I currently check on laiva:
* MagicBucket.states[].requests
* SessionHandler._ProcessorBase__processes

Maybe worth adding:
diff --git management/univention-management-console/src/univention/management/console/protocol/session.py management/univention-management-console/src/univention/management/console/protocol/session.py
index 168f1ab3e8..0b8edb5b08 100644
--- management/univention-management-console/src/univention/management/console/protocol/session.py
+++ management/univention-management-console/src/univention/management/console/protocol/session.py
@@ -979,6 +979,7 @@ class SessionHandler(ProcessorBase):
                        CORE.info('Renewing session')
                        if self.processor:
                                self.__locale = str(self.processor.locale)
+                               self.processor.__del__()
                        self.processor = None
                        self.finished(request.id, None)
                else
Comment 5 Florian Best univentionstaff 2019-12-18 08:34:54 CET
(In reply to Florian Best from comment #4)
> Suspections, which I currently check on laiva:
> * MagicBucket.states[].requests
> * SessionHandler._ProcessorBase__processes
These are fine → the size is decreasing back to a few instances (~20) over time.
Comment 7 Ingo Steuwer univentionstaff 2020-06-29 09:17:27 CEST
we fixed several "close filedescriptor" issues for Bug #51047 - might those fixes help here, too?
Comment 8 Bernhard M. Wiedemann 2020-07-03 11:05:04 CEST
We are also experiencing memory leaks with
univention-management-console-server 11.0.4-63A~4.4.0.202004211448

I found the amount of leakage is proportional to how much it is used.
Possibly via web-ui.
One instance is leaking 269MB/d

output of
pid=$(ps ax|grep univention-management-console-server.[Ss]tart|cut -c1-5)
lsof -p $pid 
remains at around 300, so it is not a file-descriptor leakage.

diff log/proc-status-2020-07-0[23]T08\:13\:*
17,18c17,18
< VmPeak:        1303056 kB
> VmPeak:        1565412 kB
< VmSize:        1303052 kB
> VmSize:        1565412 kB
< VmHWM:          698504 kB
> VmHWM:          967964 kB
< VmRSS:          698488 kB
> VmRSS:          967964 kB
< RssAnon:        672460 kB
> RssAnon:        941936 kB
< VmData:         683744 kB
> VmData:         954332 kB
< VmPTE:            2336 kB
> VmPTE:            2860 kB
< voluntary_ctxt_switches:      4606586
> voluntary_ctxt_switches:      6569428
< nonvoluntary_ctxt_switches:   333442
> nonvoluntary_ctxt_switches:   482551


After
systemctl restart univention-management-console-server
RssAnon starts at 37028 and it is the smallest of the growing values in /proc/$pid/status
Comment 9 Florian Best univentionstaff 2020-07-03 11:11:43 CEST
(In reply to Bernhard M. Wiedemann from comment #8)
> We are also experiencing memory leaks with
> univention-management-console-server 11.0.4-63A~4.4.0.202004211448
Thank you for the feedback. Could you upgrade the system to the latest errata updates (univention-management-console-server 11.0.4-85A~4.4.0.202006091536) and re-do these tests?
Comment 10 Bernhard M. Wiedemann 2020-07-03 11:49:36 CEST
The leak is still there with 11.0.4-85A~4.4.0.202006091536
Here is how I reproduce it:

# find relevant PID:
pid=$(ps ax|grep univention-management-console-server.[Ss]tart|cut -c1-5)
# watch memory usage continuously:
while sleep 2 ; do grep RssAnon /proc/$pid/status ; done


Then use a browser to log into https://YOURDOMAIN/univention/portal/

and most times that page is reloaded, memory usage grows a bit (~200kB):

load /univention/portal/
RssAnon:           41832 kB
RssAnon:           42252 kB

again
RssAnon:           42252 kB
RssAnon:           42416 kB

again
RssAnon:           42416 kB
RssAnon:           42624 kB
Comment 11 Bernhard M. Wiedemann 2020-07-03 11:59:43 CEST
found another simple leak-trigger:
while sleep 8 ; do umc-get -P $PASS -U $USER ; done

These lines were output every 2 seconds
And one can you see where the 8s trigger above ran:
RssAnon:           44976 kB
RssAnon:           44976 kB
RssAnon:           44976 kB
RssAnon:           45080 kB
RssAnon:           44976 kB
RssAnon:           44976 kB
RssAnon:           44976 kB
RssAnon:           44976 kB
RssAnon:           45000 kB
RssAnon:           45000 kB
RssAnon:           45000 kB
RssAnon:           45000 kB
RssAnon:           45000 kB
RssAnon:           45080 kB
RssAnon:           45080 kB
RssAnon:           45080 kB
RssAnon:           45080 kB
RssAnon:           45308 kB
RssAnon:           45308 kB
RssAnon:           45308 kB
RssAnon:           45308 kB
RssAnon:           45308 kB
RssAnon:           45332 kB
RssAnon:           45332 kB
RssAnon:           45332 kB
Comment 12 Florian Best univentionstaff 2020-07-09 17:48:57 CEST
(In reply to Bernhard M. Wiedemann from comment #10)
> The leak is still there with 11.0.4-85A~4.4.0.202006091536
> Here is how I reproduce it:
Thanks for the explanation. All info is good info. 
 
> Then use a browser to log into https://YOURDOMAIN/univention/portal/
Do you just access /univention/portal/ or are you doing a complete log in after accessing the page?

Accessing /univention/portal causes the following request:
* static HTML files
* /univention/portal/portal.json → This is handeled by univention-portal-server which makes a request to /univention/get/session-info
* /univention/get/session-info → This is handled by univention-management-console-web-server and just returns session information. it's not passed to the UMC-Server
* /univention/get/meta
→ This is the handled by the UMC-Server. It just returns some file contents (and if logged in UCR variable values). So if that would leak it would be the univention.management.console.protocol.message.Message() object. unlikely.
* univention/saml/iframe
→ When being logged in via SAML this would redirect to simplesamlphp and back to /univention/saml/.
This then does a AUTH request to the UMC-Server.
The UMC-Server authenticates with the SAML message with PAM. Then a Server.reload() is called which reloads resources like UCR variables, module and category XML definitions.

> # find relevant PID:
> pid=$(ps ax|grep univention-management-console-server.[Ss]tart|cut -c1-5)
> # watch memory usage continuously:
> while sleep 2 ; do grep RssAnon /proc/$pid/status ; done
> 
> and most times that page is reloaded, memory usage grows a bit (~200kB):
> 
> load /univention/portal/
> RssAnon:           41832 kB
> RssAnon:           42252 kB
> 
> again
> RssAnon:           42252 kB
> RssAnon:           42416 kB

I am not convinced that this indicates already a memory leak?!
Every univention.management.console.resources.moduleManager.load() increases the RssAnon value.
Even if I call gc.collect() and implement a umc.module.XML_Definition.__del__() method, which shows that the object is getting cleaned up, the value does never de-crease.
I think this is Pythons memory management. It might keeps the memory allocated for use in future objects(?!).
Comment 13 Florian Best univentionstaff 2020-07-09 17:50:56 CEST
Created attachment 10423 [details]
memory1.py
Comment 14 Florian Best univentionstaff 2020-07-09 18:04:53 CEST
Note: we have sessions for each client and a session timeout.
For debugging purposes I set this session timeout to 10 seconds:
ucr set umc/http/session/timeout='10'; systemctl restart univention-management-console-server univention-management-console-web-server.
Comment 15 Bernhard M. Wiedemann 2020-07-09 19:46:30 CEST
Created attachment 10424 [details]
univention-management-console-server memory usage over time

You can see the 2 times we restarted the service, but if we don't,
it goes up to 10GB when UCS stops to do certain actions.

> Do you just access /univention/portal/ or are you doing a complete log in after
accessing the page?

Not sure, if I understand the question correctly, but I meant just hitting Ctrl-R or pressing Ctrl-L and Enter
Comment 16 Bernhard M. Wiedemann 2020-07-09 20:22:35 CEST
Can you try this leak-trigger?
while sleep 0.1 ; do umc-get -P $PASS -U $USER ; done >/dev/null 2>&1

Here it even leaks with incorrect USER+PASS.

I also tested your memory1.py with 10000 instead of 100 and without the gc call and while usage grew in the beginning, it never went above 11MB.
So that one looks fine.
Comment 17 Florian Best univentionstaff 2020-07-10 11:21:14 CEST
Created attachment 10425 [details]
patch

I identified the root of the leak:
The starting point of everything is the the umcp.server.Server which has one umcp.server.MagicBucket instance which contains all umcp.server.State objects for every incoming socket connection.
These umcp.server.State objects are never cleaned up and they contain a lot of other references to all other objects we possibly have.

For example after 15 requests:
protocol.server.State: 15
auth.AuthenticationResult: 1
auth.AuthHandler: 15
base.Base: 0
category.XML_Definition: 7
error.UMC_Error: 0
locales.I18N: 25
locales.I18N_Manager: 25
module.Command: 8510
module.Flavor: 250
module.Module: 170
module.Link: 0
module.XML_Definition: 19
module.Manager: 1
pam.AuthenticationError: 0
pam.AuthenticationFailed: 0
pam.AuthenticationInformationMissing: 0
pam.AccountExpired: 0
pam.PasswordExpired: 0
pam.PasswordChangeFailed: 0
pam.PamAuth: 18
protocol.client.Client: 0
protocol.message.ParseError: 0
protocol.message.IncompleteMessageError: 0
protocol.message.Message: 29
protocol.message.Request: 10
protocol.message.Response: 0
protocol.modserver.ModuleServer: 0
protocol.server.MagicBucket: 1
protocol.server.Server: 1
protocol.session.ModuleProcess: 10
protocol.session.ProcessorBase: 0
protocol.session.Processor: 10
protocol.session.SessionHandler: 15
tools.JSON_Object: 0
tools.JSON_List: 360
tools.JSON_Dict: 0

The umcp.server.State object is not cleaned up because we have a python-notifier signal handler ( with a notifier.Callback(state) ) which is never disconnected.

The attached patch solves the issue for me.
Still a gc.collect() is necessary so that I see that old objects are cleaned up. I have to analyze if this is really necessary of if python somewhen cleans them up automatically of if there is another issue.
Comment 19 Florian Best univentionstaff 2020-07-10 13:35:16 CEST
(In reply to Florian Best from comment #17)
> The attached patch solves the issue for me.
> Still a gc.collect() is necessary so that I see that old objects are cleaned
> up. I have to analyze if this is really necessary of if python somewhen
> cleans them up automatically of if there is another issue.

The reason is that there exists cyclic references between the objects.
Another reason is that __del__() methods aren't ensured to be called. We do multiple __del__() calls manually. We need to make things more explicit.
Comment 20 Florian Best univentionstaff 2020-07-10 18:24:15 CEST
I started to work on a fix in git:fbest/50583-umc-server-memory-leaks:
https://github.com/univention/univention-corporate-server/tree/fbest/50583-umc-server-memory-leaks
Comment 21 Florian Best univentionstaff 2020-07-15 15:00:40 CEST
Several leaks have been identified and fixed:
Python __del__() methods preventing cleanup of objects with circular references. All __del__ methods have been replaced with explicit cleanup/exit/shutdown calls.
The notifier.Callback instance for the success signal forever hold a State reference. Therefore nothing was ever cleaned up.
The PamAuth.pam instance never free()d the PAM_CONV function, therefore leaking a AuthHandler instance forever.
The AuthHandler was also leaking by a notifier.Callback instance for the authenticated signal.
The LDAP_ACLs instance, which is renewed on every "AUTH" and every "GET modules" referenced a ldap connection, which was also cached by univention.management.console.ldap.


univention-management-console.yaml
cfcdb8600249 | YAML Bug #50583
ab1db1dbcfb3 | Bug #50583: Merge branch 'fbest/50583-umc-server-memory-leaks' into 4.4-5
af07dc58f26b | Changelog Bug #50583

univention-management-console (11.0.4-86)
ab1db1dbcfb3 | Bug #50583: Merge branch 'fbest/50583-umc-server-memory-leaks' into 4.4-5
6dcdd91c01d4 | Bug #50583: add debug code helpful for memory analysis
f547e1435a1f | Bug #50583: Bug #51402: Bug #51047: fix leaking open LDAP connections
3ac991144455 | Bug #50583: enhance log messages
85b3aef040dc | Bug #50583: fix leaking AuthHandler object
5c5436e8a7c5 | Bug #50583: fix leaking PamAuth object
769826051413 | Bug #50583: fix leaking State objects
6b6c75511d27 | Bug #50583: Replace {Base,SessionHandler,ProcessorBase}.__del__()
9ee6c7c548c8 | Bug #50583: stop ModuleProcess explicit instead of using __del__()
4b76a6a5560a | Bug #50583: add SessionHandler.close_session()
10fc8b03b674 | Bug #50583: Server: remove the need for __del__() by calling exit functions explicit
5ac573656b66 | Bug #50583: Make Server.exit() robust
b8df2d5a2b46 | Bug #50583: shutdown old processor when re-creating session
043a1aaa8701 | Bug #50583: Server exit should use the same cleanup functionality as deletion
21524ef1e680 | Bug #50583: rename function for readability
c0a4438b45c1 | Bug #50583: add missing Python 3 bool method
af07dc58f26b | Changelog Bug #50583
Comment 22 Florian Best univentionstaff 2020-07-16 09:50:58 CEST
REOPEN: I see this traceback in the logs and many selenium tests failed this night because the UMC-Server doesn't run:

Traceback (most recent call last):
  File "/usr/sbin/univention-management-console-server", line 248, in <module>
    umc_daemon.do_action()
  File "/usr/lib/python2.7/dist-packages/daemon/runner.py", line 267, in do_action
    func(self)
  File "/usr/lib/python2.7/dist-packages/daemon/runner.py", line 186, in _start
    self.app.run()
  File "/usr/sbin/univention-management-console-server", line 215, in run
    notifier.loop()
  File "/usr/lib/python2.7/dist-packages/notifier/nf_generic.py", line 296, in loop
    step()
  File "/usr/lib/python2.7/dist-packages/notifier/nf_generic.py", line 251, in step
    if not timer[CALLBACK]():
  File "/usr/lib/python2.7/dist-packages/notifier/__init__.py", line 104, in __call__
    return self._function(*tmp, **self._kwargs)
  File "/usr/lib/python2.7/dist-packages/univention/management/console/protocol/session.py", line 563, in _mod_connect
    'acls': self.acls.json(),
AttributeError: 'NoneType' object has no attribute 'json'
Comment 23 Felix Botner univentionstaff 2020-07-16 09:52:55 CEST
Various selenium test failed with "UMC internal server error"
Comment 24 Florian Best univentionstaff 2020-07-16 10:37:53 CEST
The timer which does the connection to the module processes was not removed in the case that the module process ends in parallel. The shutdown code now consider this as well.

Hopefully fixed in:

univention-management-console (11.0.4-87)
14fca1520a96 | Bug #50583: remove connect_timer when shutting down module process

New tests are running.
Comment 25 Florian Best univentionstaff 2020-07-17 18:20:03 CEST
(In reply to Florian Best from comment #24)
> The timer which does the connection to the module processes was not removed
> in the case that the module process ends in parallel. The shutdown code now
> consider this as well.
> 
> Hopefully fixed in:
This seems to work and is necessary but doesn't resolve the failing Selenium Tests.

One issue was also that "86_selenium/130_miscellaneous" in test_module_process_timeout() did not restore the original UCR variable leading to that all further tests ran with module-timeout=30.

Fixed in git:a2320734275c50c7c79ad19a3ee58dcb9204d85a.

(In reply to Felix Botner from comment #23)
> Various selenium test failed with "UMC internal server error"

We can see error messages in the management-console-web-server.log:

> 17.07.20 03:39:52.431  MAIN        ( PROCESS ) : SessionClient(0x7f0dac356090): _authenticated: success=True  status=200  message=None result={u'username': u'Administrator'}
> 17.07.20 03:39:52.431  MAIN        ( PROCESS ) : auth_type=None
> 17.07.20 03:40:36.392  MAIN        ( PROCESS ) : SessionClient(0x7f0dac161110): _authenticated: success=True  status=200  message=None result={u'username': u'Administrator'}
> 17.07.20 03:40:36.392  MAIN        ( PROCESS ) : auth_type=None
> 17.07.20 03:40:49.064  MAIN        ( PROCESS ) : SessionClient(0x7f0dac284dd0): _authenticated: success=True  status=200  message=None result={u'username': u'Administrator'}
> 17.07.20 03:40:49.064  MAIN        ( PROCESS ) : auth_type=None
> 17.07.20 03:41:02.462  MAIN        ( PROCESS ) : SessionClient(0x7f0dac27d610): _authenticated: success=True  status=200  message=None result={u'username': u'Administrator'}
> 17.07.20 03:41:02.463  MAIN        ( PROCESS ) : auth_type=None
> 17.07.20 03:41:09.173  MAIN        ( PROCESS ) : CPGet (10.207.158.75:56492) response status code: 401
> 17.07.20 03:41:09.173  MAIN        ( PROCESS ) : CPGet (10.207.158.75:56492) response message: For using this request a login is required.
> 17.07.20 03:41:09.173  MAIN        ( PROCESS ) : CPGet (10.207.158.75:56492) response result: None
> 17.07.20 03:41:09.173  MAIN        ( PROCESS ) : CPGet (10.207.158.75:56492) response error: {u'traceback': None, u'command': u'handle_request_get'}
> 17.07.20 03:41:14.166  MAIN        ( PROCESS ) : SessionClient(0x7f0dac27dcd0): _authenticated: success=True  status=200  message=None result={u'username': u'Administrator'}
> 17.07.20 03:41:14.166  MAIN        ( PROCESS ) : auth_type=None
> 17.07.20 03:46:17.847  MAIN        ( PROCESS ) : CPCommand (10.207.158.75:56572) response status code: 503
> 17.07.20 03:46:17.847  MAIN        ( PROCESS ) : CPCommand (10.207.158.75:56572) response message: The connection to the Univention Management Console Server broke up unexpectedly. 
> If you have root permissions on the system you can restart UMC by executing the following commands:
>  * service univention-management-console-server restart
>  * service univention-management-console-web-server restart
> Otherwise please contact an administrator or try again later.

So, the UMC-Server closes the connection from the UMC-Web-Server client.
Unfortionately we don't collect the apache access.log in the tests so we don't see the exact requests which are done.
( Not yet sure if relevant, but we can't reproduce it easily:
 It seems to be some kind of AUTH request and afterwards some other request with a unauthorized session.
 Maybe related to something like password changing (10_product_test_password_change)
)

The explicit socket shutdown maybe was never done before, so that the connection kept alive forever only if the UMC-Web-Server stopped it (The UMC-Web-Server has own session timeout handling so that it does this somewhen).

I am testing my theory with this example commit:

univention-management-console (11.0.4-88)
266d3fe05579 | Bug #50583: wait before closing socket

I hope this works. If yes, revert the "or True".
Happy holiday!
Comment 26 Julia Bremer univentionstaff 2020-07-20 23:22:59 CEST
bbcb183923 Revert "Bug #50583: temporary skip test"

The commits made for this bug id have been reverted since the amount of failing tests was too disrupting.
The rest of the fix should be developed in a feature branch. The branch tests can show the problem as well.
Comment 27 Florian Best univentionstaff 2020-08-03 09:53:00 CEST
(In reply to Florian Best from comment #21)
> b8df2d5a2b46 | Bug #50583: shutdown old processor when re-creating session
→ This commit was responsible for all failing test cases after 137_appcenter.
It has not been taken again.

The 130_miscellaneous test case was failing because of "pkill -f univention-management-console-module" and a freezing/hanging Chromium afterwards. As these module processes don't run, we can ignore the killing of every other currently running module process.

ucs-test (9.0.4-24)
196fdc7a6e51 | Bug #50583: Merge branch 'fbest/50583-umc-server-memory-leaks' into 4.4-5
1591a7b6cd50 | Bug #50583: do not kill module processes

univention-management-console (11.0.4-97)
196fdc7a6e51 | Bug #50583: Merge branch 'fbest/50583-umc-server-memory-leaks' into 4.4-5
79184b424d4e | Bug #50583: add debug code helpful for memory analysis
df63001b81b3 | Bug #50583: remove connect_timer when shutting down module process
209e93519140 | Bug #50583: Bug #51402: Bug #51047: fix leaking open LDAP connections
2f4db6ff386f | Bug #50583: enhance log messages                                         
b2848109c1e5 | Bug #50583: fix leaking AuthHandler object
9eb715982a00 | Bug #50583: fix leaking PamAuth object  
252c65948d28 | Bug #50583: fix leaking State objects
7d4d4c9f3498 | Bug #50583: Replace {Base,SessionHandler,ProcessorBase}.__del__()        
70d87022d4ae | Bug #50583: stop ModuleProcess explicit instead of using __del__()
10ff8b177458 | Bug #50583: add SessionHandler.close_session()
2aa9c06ce42e | Bug #50583: Server: remove the need for __del__() by calling exit functions explicit
6bf1dc20b110 | Bug #50583: Make Server.exit() robust   
d5942f3f26a1 | Bug #50583: Server exit should use the same cleanup functionality as deletion
536673580dc4 | Bug #50583: rename function for readability
ea437210d7ed | Bug #50583: add missing Python 3 bool method
d3eab27267b3 | Changelog Bug #50583
Comment 28 Florian Best univentionstaff 2020-08-03 13:15:57 CEST
*** Bug 51402 has been marked as a duplicate of this bug. ***
Comment 29 Arvid Requate univentionstaff 2020-08-05 12:38:35 CEST
Verified:

* Patches merged
* Jenkins UMC Tests look good
* Package update & UMC login Ok
* Advisory Ok

b2e89a1ee0 | Advisory wording
Comment 31 Florian Best univentionstaff 2020-09-02 09:47:51 CEST
*** Bug 51047 has been marked as a duplicate of this bug. ***