Bug 38402 - UMC-Webserver: memory leak due to no restart of UMC-Server
Status: CLOSED FIXED
Alias: None
Product: UCS
Classification: Unclassified
Component: UMC (Generic)
Version: UCS 4.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.1
Assignee: Florian Best
QA Contact: Alexander Kramer
URL:
Keywords: interim-2
Duplicates: 38052 38053
Depends on:
Blocks:
 
Reported: 2015-04-30 14:41 CEST by Florian Best
Modified: 2020-11-27 18:04 CET

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): Cleanup, Error handling, Usability
Customer ID:
Max CVSS v3 score:


Description Florian Best univentionstaff 2015-04-30 14:41:47 CEST
If setup-join.sh is executed by hand without the cleanup scripts, the UMC-Server is not restarted. The UMC-Server then still holds the old SSL certificates internally while the UMC-Webserver has the new ones. When the UMC-Webserver now connects to the UMC-Server, it gets a socket.error with errno "Connection refused".
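For readers who want to see what the "Connection refused" error looks like from Python, here is a minimal sketch. It is not UMC code: the helper names try_connect and unused_local_port are made up for illustration, and it uses Python 3, where socket.error is an alias of OSError.

```python
import errno
import socket


def try_connect(host, port, timeout=1.0):
    """Return None if the TCP connect succeeds, else the raised exception."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
    except OSError as exc:  # socket.error is an alias of OSError in Python 3
        return exc
    finally:
        sock.close()
    return None


def unused_local_port():
    """Bind to port 0 to get a free port, then release it so nothing listens."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.bind(("127.0.0.1", 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


def is_connection_refused(exc):
    """True if the exception carries errno ECONNREFUSED."""
    return isinstance(exc, OSError) and exc.errno == errno.ECONNREFUSED
```

Calling is_connection_refused(try_connect("127.0.0.1", unused_local_port())) exercises the refused case locally, because nothing is listening on the released port.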

strace shows a lot of socket error checks on the socket file descriptors. It seems the cleanup is not performed correctly internally (in python-notifier?):
> getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0

This is followed by "Cannot allocate memory".
getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
gettimeofday({1430396851, 831716}, NULL) = 0
poll([{fd=10, events=POLLIN}, {fd=12, events=POLLIN}, {fd=14, events=POLLIN}], 3, 100) = 3 ([{fd=10, revents=POLLIN|POLLHUP}, {fd=12, revents=POLLIN|POLLHUP}, {fd=14, revents=POLLIN|POLLHUP}])
gettimeofday({1430396851, 831969}, NULL) = 0
gettimeofday({1430396851, 832115}, NULL) = 0
gettimeofday({1430396851, 832214}, NULL) = 0
mmap(NULL, 432189440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 432189440, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x1c617000)                         = 0x2a10000
mmap(NULL, 432324608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
mmap(NULL, 243105792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 243105792, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0x111c3000)                         = 0x2a10000
mmap(NULL, 243236864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
getsockopt(12, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
mmap(NULL, 136744960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 136744960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
brk(0xac55000)                          = 0x2a10000
mmap(NULL, 136880128, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0

The socket symlinks are all dead:

# ls -l /proc/2807/fd
total 0
lr-x------ 1 root root 64 Apr 30 08:29 0 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 1 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 10 -> socket:[44532]
lrwx------ 1 root root 64 Apr 30 08:29 11 -> socket:[46289]
lrwx------ 1 root root 64 Apr 30 08:29 12 -> socket:[46290]
lrwx------ 1 root root 64 Apr 30 08:29 13 -> socket:[48884]
lrwx------ 1 root root 64 Apr 30 08:29 14 -> socket:[48886]
lrwx------ 1 root root 64 Apr 30 08:29 15 -> socket:[50761]
lrwx------ 1 root root 64 Apr 30 08:29 18 -> socket:[49846]
lrwx------ 1 root root 64 Apr 30 08:29 2 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 21 -> socket:[49865]
lrwx------ 1 root root 64 Apr 30 08:29 3 -> /var/log/univention/management-console-web-server.log
lr-x------ 1 root root 64 Apr 30 08:29 4 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 5 -> socket:[12439]
lrwx------ 1 root root 64 Apr 30 08:29 6 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 7 -> /dev/null
lrwx------ 1 root root 64 Apr 30 08:29 8 -> socket:[44531]
lr-x------ 1 root root 64 Apr 30 08:29 9 -> /dev/urandom
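The listing above can also be collected from a script, which helps when watching whether the number of socket descriptors keeps growing. The following is a minimal sketch; list_fd_targets and socket_fds are hypothetical helper names, and the code only reads symlink targets, so it cannot tell by itself whether a socket is dead.

```python
import os


def list_fd_targets(fd_dir):
    """Map each entry name in fd_dir (e.g. /proc/<pid>/fd) to its symlink target."""
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[name] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor may have been closed between listdir and readlink.
            continue
    return targets


def socket_fds(targets):
    """Return the sorted descriptor numbers whose target looks like 'socket:[inode]'."""
    return sorted(
        int(name)
        for name, target in targets.items()
        if name.isdigit() and target.startswith("socket:[")
    )
```

On a Linux system, socket_fds(list_fd_targets("/proc/2807/fd")) would give the socket descriptors of the process from this report, provided the caller is allowed to read that directory.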
Comment 1 Erik Damrose univentionstaff 2015-04-30 14:44:39 CEST
(In reply to Florian Best from comment #0)
> If setup-join.sh is executed by hand without the cleanup scripts the
> UMC-Server is not restarted. 

That is Bug #38332
Comment 2 Florian Best univentionstaff 2015-08-25 22:11:56 CEST
The UMC-webserver (umcp.Client()) goes into an endless recursion loop which causes 100% CPU usage. The cause is that polling of the socket is never stopped in one specific error scenario: the SSL connection on the socket can't be established, so the client tries to establish a non-SSL fallback connection. The old socket was not cleaned up in that case, so the notifier loop kept calling the socket callback _read().
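The mechanism can be illustrated with a small sketch. This is not the actual patch and not the real umcp.Client: it uses the standard selectors module in place of python-notifier, and SketchClient and count_reads are hypothetical names. It shows that a socket whose peer has gone away stays readable forever, so the loop keeps firing the read callback unless the socket is unregistered and closed.

```python
import selectors
import socket


class SketchClient:
    """Hypothetical stand-in for a notifier-driven client (not umcp.Client)."""

    def __init__(self, selector, cleanup):
        self.selector = selector
        self.cleanup = cleanup  # True models the fixed behaviour
        self.sock = None
        self.read_calls = 0

    def attach(self, sock):
        self.sock = sock
        self.selector.register(sock, selectors.EVENT_READ, self._read)

    def _read(self, sock):
        self.read_calls += 1
        if sock.recv(4096) == b"" and self.cleanup:
            # Peer is gone: stop polling the old socket before any fallback.
            self.selector.unregister(sock)
            sock.close()
            self.sock = None


def count_reads(cleanup, max_iterations=50):
    """Return how often _read() fires after the peer has closed its end."""
    with selectors.DefaultSelector() as selector:
        ours, peer = socket.socketpair()
        client = SketchClient(selector, cleanup)
        client.attach(ours)
        peer.close()  # simulates the server side dropping the connection
        for _ in range(max_iterations):
            if not selector.get_map():
                break  # nothing left to poll
            for key, _events in selector.select(timeout=0.1):
                key.data(key.fileobj)
        if client.sock is not None:
            # Tidy up the deliberately leaked socket of the unfixed variant.
            selector.unregister(client.sock)
            client.sock.close()
        return client.read_calls
```

With cleanup enabled the callback should fire once and the loop ends. Without cleanup it fires on every iteration until the iteration cap is reached, which corresponds to the busy loop described above.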

I fixed this with a simple patch and tested it intensively in the following ways:
* make 1000 requests while restarting the UMC-server at some point during these requests
* produce an SSL error due to an outdated SSL certificate (set the time 6 years in the future on the system)
* require a verified connection in the UMC-server while the client doesn't send a certificate at all
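The first test case can be scripted roughly as follows. This is only a sketch under assumptions: hammer is a hypothetical helper, and send_request stands for whatever callable issues one request against the UMC-webserver, which this report does not specify.

```python
def hammer(send_request, total=1000):
    """Call send_request(i) `total` times and tally successes and socket errors."""
    outcomes = {"ok": 0, "failed": 0}
    for i in range(total):
        try:
            send_request(i)
            outcomes["ok"] += 1
        except OSError:
            # Expected while the server is restarting; later requests should
            # succeed again once the server is back.
            outcomes["failed"] += 1
    return outcomes
```

A run where failures occur only around the restart and the process does not stay at 100% CPU afterwards would match the behaviour expected from the fix.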

univention-management-console (8.0.6-1):
r63249 | Bug #38402: fix recursion error in the umcp client which allowed DoS by causing 100% CPU load
Comment 3 Florian Best univentionstaff 2015-08-27 18:13:18 CEST
*** Bug 38052 has been marked as a duplicate of this bug. ***
Comment 4 Alexander Kramer univentionstaff 2015-11-06 07:16:23 CET
> I fixed this with a simple patch and tested it intensively in the following
> ways:
> * make 1000 requests while restarting the UMC-server at some point during
> these requests
> * produce an SSL error due to an outdated SSL certificate (set the time 6
> years in the future on the system)
> * require a verified connection in the UMC-server while the client doesn't
> send a certificate at all

I also reproduced the cases from above, looks good.

OK - no memory leak due to no restart of UMC-Server
OK - changelog xml
Comment 5 Stefan Gohmann univentionstaff 2015-11-17 12:11:30 CET
UCS 4.1 has been released:
 https://docs.software-univention.de/release-notes-4.1-0-en.html
 https://docs.software-univention.de/release-notes-4.1-0-de.html

If this error occurs again, please use "Clone This Bug".
Comment 6 Florian Best univentionstaff 2020-11-27 18:04:34 CET
*** Bug 38053 has been marked as a duplicate of this bug. ***