Bug 51104 - cron jobs schedules multiple ldap-group-to-file.py in parallel
Status: NEW
Product: UCS
Classification: Unclassified
Component: PAM
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: UCS maintainers
Depends on:
Blocks: 39824
Reported: 2020-04-14 15:43 CEST by Philipp Hahn
Modified: 2020-08-06 13:45 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 2: Improvement: Would be a product improvement
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 1: Nuisance – not a big deal but noticeable
User Pain: 0.023
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2020080621000415
Bug group (optional):
Max CVSS v3 score:


Description Philipp Hahn univentionstaff 2020-04-14 15:43:01 CEST
987 ?        Ss     0:10 /usr/sbin/cron -f
 3346 ?        S      0:00  \_ /usr/sbin/CRON -f
 3350 ?        Ss     0:00  |   \_ /bin/bash /usr/sbin/jitter 1800 /usr/lib/univention-pam/ldap-group-to-file.py
 3353 ?        S      0:00  |       \_ sleep 1773
 4845 ?        S      0:00  \_ /usr/sbin/CRON -f
 4858 ?        Ss     0:00      \_ /bin/bash /usr/sbin/jitter 1800 /usr/lib/univention-pam/ldap-group-to-file.py
 4859 ?        S      0:00          \_ sleep 668

# grep ldap /etc/cron.d/univention-pam
*/15 * * * *   root   [ -x /usr/lib/univention-pam/ldap-group-to-file.py ] && /usr/sbin/jitter 1800 /usr/lib/univention-pam/ldap-group-to-file.py
# ucr get nss/group/cachefile/invalidate_interval
*/15 * * * *

`jitter 1800` is hard-coded to 30m, which clashes with the default update interval of 15m!
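
To make the clash concrete, here is a minimal Python sketch that counts how many `jitter` sleepers are alive at the same time. It assumes that `jitter N` sleeps for a random delay below N seconds before starting the command, which matches the `sleep 1773` and `sleep 668` entries in the process list above; the function name `overlapping_sleepers` is made up for illustration and is not part of UCS.

```python
CRON_INTERVAL = 15 * 60   # old default schedule: */15 * * * *
JITTER_MAX = 1800         # hard-coded argument in /etc/cron.d/univention-pam


def overlapping_sleepers(delays, interval=CRON_INTERVAL):
    """Return the peak number of jitter processes sleeping at the same time.

    delays[i] is the sleep duration (seconds) of the job that cron
    starts at i * interval. Hypothetical helper, for illustration only.
    """
    events = []
    for i, delay in enumerate(delays):
        start = i * interval
        events.append((start, 1))           # sleeper appears
        events.append((start + delay, -1))  # sleeper finishes
    # At equal timestamps, process the end (-1) before the start (+1),
    # so back-to-back sleepers are not counted as overlapping.
    events.sort()
    peak = current = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Delays taken from the process list above: the first sleeper is still
# waiting when cron starts the second one 900 seconds later.
print(overlapping_sleepers([1773, 668]))  # 2
# Any delay of at most 900 seconds finishes before the next cron run.
print(overlapping_sleepers([800, 800]))   # 1
```

Under these assumptions, an overlap occurs whenever a delay exceeds the 900 second cron interval, and at most two sleepers coexist because every delay is below `JITTER_MAX`.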

See also the generic Bug #39824.
Comment 1 Ingo Steuwer univentionstaff 2020-04-14 17:52:19 CEST
Which Errata is this?

There should be only one running process since #35173 and the default for the Cron Job has been changed to "once per day".
Comment 2 Philipp Hahn univentionstaff 2020-04-14 18:09:20 CEST
(In reply to Ingo Steuwer from comment #1)
> Which Errata is this?

version/erratalevel: 525
version/patchlevel: 4
version/releasename: Blumenthal
version/security-patchlevel: 9
version/version: 4.4
 
> There should be only one running process since #35173 and the default for
> the Cron Job has been changed to "once per day".

Please write that as Bug #35173 next time so that Bugzilla detects it and renders it as a link to the other bug. Thanks.
Comment 3 Erik Damrose univentionstaff 2020-04-14 18:21:11 CEST
(In reply to Ingo Steuwer from comment #1)
> There should be only one running process since #35173 and the default for
> the Cron Job has been changed to "once per day".

Default for the cronjob was changed in bug 50191, but only for new installations.
Comment 4 Christian Völker univentionstaff 2020-05-06 14:46:21 CEST
According to Bug #35173, ldap-group-to-file now has a locking mechanism.

I assume a new process sleeps when another process is running? 

Wouldn't it be better just to exit gracefully (as someone else is doing my job)?

Multiple processes should not run in parallel, as they hinder each other.
Comment 5 Ingo Steuwer univentionstaff 2020-05-06 14:59:30 CEST
(In reply to Christian Völker from comment #4)
> According to Bug #35173, ldap-group-to-file now has a locking mechanism.
> 
> I assume a new process sleeps when another process is running? 
> 
> Wouldn't it be better just to exit gracefully (as someone else is doing my
> job)?
> 
> Multiple processes should not run in parallel, as they hinder each
> other.

That is exactly what the script does:


----
        lock = univention.lib.locking.get_lock('ldap-group-to-file', nonblocking=True)
        try:
                if not lock:
                        print('Abort: Process is locked, another instance is already running.')
                        sys.exit(2)

----

The process list is misleading: "jitter" waits up to 30 minutes before calling "ldap-group-to-file". The old default was to run the cron job every 15 minutes, so if the first "jitter" waits longer than 15 minutes, there are two parallel tasks. As these are only "sleep" processes, they put no load on LDAP or other services.

Once one of them actually executes "ldap-group-to-file", it checks whether there is a lock and won't run a second time in parallel.
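
The internals of `univention.lib.locking` are not shown in this report, so the following is only a sketch of the same non-blocking pattern, using `fcntl.flock` from the Python standard library as a stand-in. The names `try_lock`, `release` and `main`, and the lock file path passed in, are assumptions made for illustration.

```python
import fcntl


def try_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken.

    Stand-in for get_lock(..., nonblocking=True); not the UCS implementation.
    """
    handle = open(path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def release(handle):
    """Drop the lock and close the lock file."""
    fcntl.flock(handle, fcntl.LOCK_UN)
    handle.close()


def main(lock_path):
    lock = try_lock(lock_path)
    if lock is None:
        print('Abort: Process is locked, another instance is already running.')
        return 2
    try:
        pass  # placeholder: the actual group export would run here
    finally:
        release(lock)
    return 0
```

A second caller that finds the lock held returns immediately with exit status 2 instead of waiting, which is the behaviour described above.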

As this timing is no longer the default, this only occurs in "old" installations, and even then it is no real issue.
Comment 6 Christina Scheinig univentionstaff 2020-08-06 13:45:29 CEST
So this process is scheduled via cron. If the process takes too long and the second process is aborted, a customer who has configured mail for cron jobs gets an email about it. This caused a support ticket, so I added the ticket to this bug.