Bug 40134 - atd hangs using 100% CPU in docker
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Docker
Version: UCS 4.1
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.1-0-errata
Assigned To: Daniel Tröder
QA Contact: Felix Botner
Depends on: 39482
Reported: 2015-11-30 19:43 CET by Dirk Wiesenthal
Modified: 2015-12-22 16:05 CET

Attachments
skip_atjob_creation_in_postinst.patch (733 bytes, patch)
2015-12-07 11:57 CET, Arvid Requate

Description Dirk Wiesenthal univentionstaff 2015-11-30 19:43:26 CET
Feedback from an ISV using the released docker image:

Multiple atd processes keep appearing on the server, each consuming 100% CPU time. A quick strace suggests that they are trying to access an at job file that does not exist. Unfortunately, performance suffers noticeably as a result.

+++ This bug was initially created as a clone of Bug #39482 +++

Whenever a docker.software-univention.de/ucs-appbox-amd64:4.1-0 is started, the atd process inside the container hangs using a full core.
Comment 1 Daniel Tröder univentionstaff 2015-11-30 19:48:39 CET
Those must be old images. Please check with Arvid how to make sure they exchange them for fresh ones.

As a workaround until solved, they can simply run
# atrm 1
With "1" being the jobid. The jobid can be seen using
# atq
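When more than one job is queued, the two commands above can be combined. The following is only a sketch: the helper name at_job_ids is made up for illustration, and it assumes that each line of atq output starts with the job id. Note that the usage line removes every queued job, not only the offending one.

```shell
#!/bin/sh
# at_job_ids (hypothetical helper): read `atq` output on stdin and print
# the first whitespace-separated field of each non-empty line, which is
# assumed to be the job id.
at_job_ids() {
    awk 'NF { print $1 }'
}

# Possible usage inside an affected container, as root.
# WARNING: this removes ALL queued at jobs.
#   atq | at_job_ids | while read -r id; do atrm "$id"; done
```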
Comment 2 Dirk Wiesenthal univentionstaff 2015-12-04 01:06:53 CET
They use the new ones. New feedback:

It appears that a file for an at job, dated November 11, is baked into the image under /var/spool/cron/atjobs and causes the problem. After deleting the file, the problem no longer seemed to occur.
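To check whether an image ships such a leftover file, the spool directory can be listed directly. This is a minimal sketch: the function name list_at_spool_files is hypothetical, and the only assumption beyond the report is that the .SEQ file in that directory is at's own job counter and not a job.

```shell
#!/bin/sh
# list_at_spool_files (hypothetical helper): print every regular file in
# the at spool directory except the .SEQ job counter, sorted by name.
# The default path is the one named in this report.
list_at_spool_files() {
    spool_dir=${1:-/var/spool/cron/atjobs}
    find "$spool_dir" -maxdepth 1 -type f ! -name '.SEQ' | sort
}

# Possible usage inside the container, as root:
#   list_at_spool_files | xargs -r ls -l
```

Any file listed in a freshly started container, before anything has scheduled a job, would be a candidate for the stale job described above.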
Comment 3 Daniel Tröder univentionstaff 2015-12-04 09:11:18 CET
The version of univention-mail-postfix installed at image creation time is 10.0.0-6.270.201511051213 - that includes the fix for Bug #39482 (which is in 10.0.0-5).

The creation date of the atjobs file, the dates of the postfix configuration files, and the installation date in dpkg.log are all the same. That means the join script was run at installation time.

I had understood that join scripts would be suppressed at image creation time and run at container creation time. Is that not so?
Comment 4 Arvid Requate univentionstaff 2015-12-07 11:57:34 CET
Created attachment 7350
skip_atjob_creation_in_postinst.patch

The ucs-appbox-amd64:4.1-0 contains the recent version of univention-mail-postfix:

root@f2b3658ccc9f:/# COLUMNS=200 dpkg -l univention-mail-postfix
ii  univention-mail-postfix                      10.0.0-6.270.201511051213

The attached patch shows the location in univention-mail-postfix.postinst where this at job is created. The join script mentioned above is not involved, as it is in a different package (univention-mail-server).
Comment 5 Daniel Tröder univentionstaff 2015-12-07 14:12:24 CET
Arg! Code removal in commit 65166 was incomplete: "create-dh-parameter-files.sh" was in two places!

Commit 66125 (+ yaml 66126) fixes it.
Comment 6 Stefan Gohmann univentionstaff 2015-12-08 07:39:56 CET
I think the at job should be removed during the upgrade. Otherwise running containers won't be fixed. Furthermore, the job should be killed during the upgrade.
Comment 7 Daniel Tröder univentionstaff 2015-12-08 09:32:57 CET
(In reply to Stefan Gohmann from comment #6)
> I think the at job should be removed during the upgrade. Otherwise running
> containers won't be fixed.
Commit 66147 (yaml 66148) adds a check for the at job and removes it if found (during an upgrade in a docker container).

> Furthermore, the job should be killed during the upgrade.
It is not the job that hangs, just atd. After removing the offending job, atd settles down.
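For illustration only, such a check could look roughly like the sketch below. It is not the code from commit 66147. The function name is hypothetical, and it assumes that the offending job file contains the string "create-dh-parameter-files.sh" (the script named in comment 5) and that deleting the spool file is sufficient, as the ISV feedback in comment 2 suggests. Removing the job with atrm would be the more conventional route.

```shell
#!/bin/sh
# remove_jobs_referencing (hypothetical helper): delete every at spool
# file whose content contains the fixed string given as first argument.
# Second argument: spool directory (default as named in this report).
remove_jobs_referencing() {
    pattern=$1
    spool_dir=${2:-/var/spool/cron/atjobs}
    find "$spool_dir" -maxdepth 1 -type f ! -name '.SEQ' |
    while IFS= read -r job_file; do
        # -F: match literally, -q: only report via exit status
        if grep -q -F -- "$pattern" "$job_file"; then
            rm -f -- "$job_file"
            echo "removed $job_file"
        fi
    done
}

# Possible usage, as root:
#   remove_jobs_referencing create-dh-parameter-files.sh
```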
Comment 8 Felix Botner univentionstaff 2015-12-15 15:24:03 CET
OK, this works, but we should probably limit it to the update case. Please add something like this

if [ "$1" = "configure" -a -n "$2" ] && dpkg --compare-versions "$2" lt 10.0.0-8; then

to the postinst.
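The condition can be read as follows: dpkg calls the postinst with "configure" as $1 and, on an upgrade, the previously configured version as $2, while $2 is empty on a fresh installation. A sketch of the same guard wrapped in a function is shown below. The function name needs_atjob_cleanup is made up for illustration, and the threshold 10.0.0-8 is taken from the suggestion above.

```shell
#!/bin/sh
# needs_atjob_cleanup (hypothetical helper): succeed only when the
# postinst is configuring an upgrade from a version older than 10.0.0-8.
#   $1: maintainer script action (e.g. "configure")
#   $2: previously configured version, empty on a fresh install
needs_atjob_cleanup() {
    action=$1
    old_version=$2
    [ "$action" = "configure" ] || return 1
    [ -n "$old_version" ] || return 1
    dpkg --compare-versions "$old_version" lt 10.0.0-8
}

# Possible usage in a postinst:
#   if needs_atjob_cleanup "$1" "$2"; then
#       : # remove the stale at job here
#   fi
```

Splitting the test into separate [ ... ] calls avoids the -a operator, which POSIX marks as obsolescent, without changing the behaviour of the suggested condition.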
Comment 9 Daniel Tröder univentionstaff 2015-12-15 15:52:10 CET
The check for the version was added to the postinst in commit 66364, YAML 66365.
Comment 10 Felix Botner univentionstaff 2015-12-17 10:26:56 CET
OK
Comment 11 Arvid Requate univentionstaff 2015-12-22 16:05:02 CET
<http://errata.software-univention.de/ucs/4.1/43.html>