Bug 56585 - Fetchmail restart doesn't work
Summary: Fetchmail restart doesn't work
Status: CLOSED FIXED
Alias: None
Product: UCS
Classification: Unclassified
Component: Mail
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 5.0-10-errata
Assignee: Christian Castens
QA Contact: Arvid Requate
URL: https://git.knut.univention.de/univen...
Keywords:
Depends on: 56586 56587
Blocks: 58532
Reported: 2023-09-14 06:33 CEST by Stefan Gohmann
Modified: 2025-08-27 16:01 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 4: A User would return the product
User Pain: 0.229
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2023062721000197
Bug group (optional):
Customer ID:
Max CVSS v3 score:
troeder: Patch_Available+


Attachments
fetchmail.py.patch (433 bytes, patch)
2023-09-21 07:36 CEST, Stefan Gohmann

Description Stefan Gohmann univentionstaff 2023-09-14 06:33:28 CEST
The fetchmail process hangs (or does not restart) every time you change a user object in the UMC.

Jun 27 14:14:45 server systemd[1]: fetchmail.service: Found left-over process 1680 (fetchmail) in control group while starting unit. Ignoring.
Jun 27 14:14:45 server systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jun 27 14:14:45 server systemd[1]: Starting LSB: init-Script for system wide fetchmail daemon...
Jun 27 14:14:45 server fetchmail[1680]: beendet mit Signal 15
Jun 27 14:14:45 server fetchmail[31072]: fetchmail already started; not starting. ... failed!
Jun 27 14:14:45 server systemd[1]: Started LSB: init-Script for system wide fetchmail daemon.


univention-app info
UCS: 5.0-4 errata794
Installed: adconnector=12.0 fetchmail=6.3.26 mailserver=12.0 open-xchange-guard=2.10.6-ucs1 open-xchange-text=7.10.6-ucs2 ox-connector=2.2.6 oxseforucs=7.10.6-ucs9

dpkg -l | grep fetchmail
ii  fetchmail                                           6.4.0~beta4-3+deb10u1A~5.0.0.202104091504         amd64        SSL enabled POP3, APOP, IMAP mail gatherer/forwarder
ii  univention-fetchmail                                13.0.6-5                                          all          UCS fetchmail integration for UDM
ii  univention-fetchmail-schema                         13.0.6-5                                          all          UCS schema package for univention-fetchmail

Dev Q&A card:
https://wekan.knut.univention.de/b/DSf93wFtTAyvGCW3u/development-q-and-a/fJRfYMmjeh4Zg8Suu
Comment 2 Daniel Tröder univentionstaff 2023-09-14 09:24:04 CEST
In the OTRS ticket the customer writes that after changing _any_ attribute of a user - not necessarily a fetchmail attribute - fetchmail stops polling emails, although the fetchmail process is still running.

First I thought: That seems highly unlikely, because the listener wouldn't trigger for just any change.

But then I took a look at the listener (fetchmailrc.py) and it has only an LDAP »filter='(objectClass=univentionFetchmail)'« but no »attribute=[...]«.

So it does indeed rewrite the fetchmail configuration after _any_ change to a user  with UDM fetchmail data.

That is not only inefficient but also potentially problematic.

That means, that in a time of change with high frequency (e.g. school import, school year change etc.), the fetchmailrc will be overwritten again and again.
Currently the writes are not atomic (create temp. file and do a mv), but the fetchmailrc is written to directly.
So it is possible, that during a restart of the daemon the file is in an inconsistent state.

It seems unlikely to me, that this is the problem of the referenced customer, but two changes should be done:

1. Add »attribute=[...]« to the listener, so it is only triggered, when a relevant LDAP attribute changes.
2. Make writes to the fetchmailrc atomic.

I will create separate bugs for that, because while those fixes will lower the probability of the customer's problem occurring, I don't think they will solve it.
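
A minimal sketch of point 2 (atomic write via a temporary file plus rename), assuming Python 3 and a POSIX filesystem. The function name write_atomically and the temp-file prefix are hypothetical; this is not the code from the listener module or from Bug #56587.

```python
import os
import tempfile


def write_atomically(path, content, mode=0o600):
    """Replace `path` so readers see either the old or the new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be on the same filesystem, otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".fetchmailrc.")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)  # atomic rename on POSIX
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A daemon that restarts while the file is being regenerated then reads either the complete old configuration or the complete new one.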
Comment 3 Stefan Gohmann univentionstaff 2023-09-19 14:55:33 CEST
Do we have any workaround?
Comment 4 Daniel Tröder univentionstaff 2023-09-19 18:09:23 CEST
While the things listed in the previous comment should be fixed, it's very unlikely that they should cause a problem _every_ time.
It is more likely, that the customer's problem is with a broken configuration.

But to debug the customer's problem, they could run:

systemctl stop fetchmail.service

sudo -u fetchmail fetchmail --verbose --nodetach --nosyslog --fetchmailrc /etc/fetchmailrc

If there is a problem with the configuration, it should now be printed to the terminal.
Comment 5 Stefan Gohmann univentionstaff 2023-09-21 07:32:36 CEST
I don't see any configuration error. It looks more like a timing issue. If I change a description of a user the restart works in 50% of the cases.

I changed the listener script from the initscript to services. Afterwards I wasn't able to reproduce the problem:

--- fetchmailrc.py.orig	2023-09-21 07:10:24.843304181 +0200
+++ fetchmailrc.py	2023-09-21 07:10:53.623293111 +0200
@@ -237,6 +237,6 @@
     ud.debug(ud.LISTENER, ud.INFO, 'Restarting fetchmail-daemon')
     listener.setuid(0)
     try:
-        listener.run(initscript, ['fetchmail', 'restart'], uid=0)
+        listener.run('/usr/sbin/service', ['service', 'fetchmail', 'restart'], uid=0)
     finally:
         listener.unsetuid()
Comment 6 Stefan Gohmann univentionstaff 2023-09-21 07:36:17 CEST
Created attachment 11129 [details]
fetchmail.py.patch

Workaround:

patch -p0 -d /usr/lib/univention-directory-listener/system/ <fetchmail.py.patch 
service univention-directory-listener restart
Comment 7 Daniel Tröder univentionstaff 2023-09-21 09:22:04 CEST
Oh... looks like using an old-style init script directly prevents systemd from doing its thing, leading to "Found left-over process <pid> (<process>) in control group while starting unit."...
I wonder if we have more cases like this... Most calls have been converted to "systemctl restart <..>.service" or "service <..> restart", but some remain in UCS...
Comment 8 Daniel Tröder univentionstaff 2023-09-21 09:30:03 CEST
To the implementer: Just talked to Philipp about what our standards are in UCS5: When implementing, please modify the patch to use "systemctl" instead of "service". (And in Debian maintainer scripts (like postinst) use "deb-systemd-invoke".)
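
A rough sketch of what a systemctl-based restart could look like, for illustration only. It uses subprocess as a stand-in for the listener's own listener.run() call, the helper name restart_unit and its runner parameter are hypothetical, and the unit name fetchmail.service is taken from the log lines in the description.

```python
import subprocess


def restart_unit(unit="fetchmail.service", runner=subprocess.run):
    """Restart a systemd unit via systemctl; return True if systemctl exited with 0."""
    argv = ["systemctl", "restart", unit]
    # `runner` is injectable so the call can be exercised without touching systemd.
    result = runner(argv, check=False)
    return result.returncode == 0
```

Going through systemctl lets systemd track the process in the unit's control group instead of bypassing it with a direct init script call.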
Comment 9 Stefan Gohmann univentionstaff 2023-09-22 16:24:35 CEST
(In reply to Stefan Gohmann from comment #6)
> Created attachment 11129 [details]
> fetchmail.py.patch
> 
> Workaround:
> 
> patch -p0 -d /usr/lib/univention-directory-listener/system/
> <fetchmail.py.patch 
> service univention-directory-listener restart

Unfortunately, it didn't work. So next try is to change the listener script in the following way:

        # listener.run(initscript, ['fetchmail', 'restart'], uid=0)
        listener.run('/usr/sbin/service', ['service', 'fetchmail', 'stop'], uid=0)
        time.sleep(5)
        listener.run('/usr/sbin/service', ['service', 'fetchmail', 'start'], uid=0)
Comment 10 Stefan Gohmann univentionstaff 2023-10-02 11:13:23 CEST
(In reply to Stefan Gohmann from comment #9)
> (In reply to Stefan Gohmann from comment #6)
> > Created attachment 11129 [details]
> > fetchmail.py.patch
> > 
> > Workaround:
> > 
> > patch -p0 -d /usr/lib/univention-directory-listener/system/
> > <fetchmail.py.patch 
> > service univention-directory-listener restart
> 
> Unfortunately, it didn't work. So next try is to change the listener script
> in the following way:
> 
>         # listener.run(initscript, ['fetchmail', 'restart'], uid=0)
>         listener.run('/usr/sbin/service', ['service', 'fetchmail', 'stop'],
> uid=0)
>         time.sleep(5)
>         listener.run('/usr/sbin/service', ['service', 'fetchmail', 'start'],
> uid=0)

The partner / customer gave feedback that it works now.
Comment 12 Arvid Requate univentionstaff 2025-08-18 10:18:11 CEST
To summarize the above, things changed in the meantime:

> 1. Add »attribute=[...]« to the listener, so it is only triggered, when a relevant LDAP attribute changes.
> 2. Make writes to the fetchmailrc atomic.

This has been done via

* Bug #56586 already adjusted the fetchmail listener so it only runs for changes of specific relevant attributes
* Bug #56587 made the way the fetchmailrc is generated more robust against race conditions (which we should not have here)

Regarding the question if calling /etc/init.d/fetchmail could be worse than "systemctl restart fetchmail.service":

* The /var/run/systemd/generator.late/fetchmail.service is automatically generated for /etc/init.d/fetchmail by systemd-sysv-generator
* "/etc/init.d/fetchmail restart" and "systemctl restart fetchmail" do the same thing via /lib/lsb/init-functions.d/40-systemd
* Comment 9 already says that the theory of Comment 6 didn't work out
* univention-fetchmail could attempt to ship a dedicated fetchmail.service, but if upstream doesn't, why should we?

That /etc/init.d/fetchmail is from the fetchmail upstream package (version 6.4.0~beta4-3+deb10u1 in UCS 5.0-x and 6.4.37-1 since UCS 5.2-0).
So either we are lucky and that fixed something (still didn't find the debian maintainer repo to check the git history) or we may want to
follow the advice of Comment 10, just to be sure. But the time.sleep(5) is not so great - if every module does things like this, we kill
(synchronous) replication performance.
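
One possible way to limit the cost of a fixed time.sleep(5) is a bounded poll that returns as soon as the old process is gone. This is only a sketch under assumptions: wait_until and pid_exited are hypothetical helpers, and how the listener would learn the old PID is left open.

```python
import os
import time


def pid_exited(pid):
    """Return True once no process with this PID exists any more."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return True
    except PermissionError:
        return False  # the process exists but belongs to another user
    return False


def wait_until(condition, timeout=5.0, interval=0.1,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `condition` until it is true or `timeout` seconds have passed."""
    deadline = clock() + timeout
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Between stop and start, a call such as wait_until(lambda: pid_exited(old_pid)) would block for at most the timeout and usually much less, where old_pid is whatever PID the caller recorded before stopping the service.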
Comment 13 Arvid Requate univentionstaff 2025-08-18 13:13:32 CEST
Ah Julia found https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=981464
which shows that /usr/lib/systemd/user/fetchmail.service exists.
But it's a user unit, not sure yet if/how we can use that.
Comment 14 Arvid Requate univentionstaff 2025-08-20 16:14:22 CEST
[5.0-10] b01c5692c6d | fix(univention-fetchmail): manage fetchmail service via systemd instead of using SysV init script

Package: univention-fetchmail
Version: 13.0.12-2
Release: errata5.0-10
Scope: errata5.0-10
Comment 15 Christian Castens univentionstaff 2025-08-25 09:58:15 CEST
Package: univention-fetchmail
Version: 13.0.12-3
Release: errata5.0-10
Scope: errata5.0-10

QA:
  OK: switch to systemd instead of SysV init script
  OK: advisories
  OK: manual tests - fetchmail no longer enters the described failed state
Comment 16 Christian Castens univentionstaff 2025-08-27 16:01:00 CEST
<https://errata.software-univention.de/#/?erratum=5.0x1314>