Univention Bugzilla – Bug 56585
Fetchmail restart doesn't work
Last modified: 2023-11-13 08:46:39 CET
The fetchmail process hangs (or does not restart) every time you change a user object in the UMC.

Jun 27 14:14:45 server systemd[1]: fetchmail.service: Found left-over process 1680 (fetchmail) in control group while starting unit. Ignoring.
Jun 27 14:14:45 server systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jun 27 14:14:45 server systemd[1]: Starting LSB: init-Script for system wide fetchmail daemon...
Jun 27 14:14:45 server fetchmail[1680]: beendet mit Signal 15
Jun 27 14:14:45 server fetchmail[31072]: fetchmail already started; not starting. ... failed!
Jun 27 14:14:45 server systemd[1]: Started LSB: init-Script for system wide fetchmail daemon.

("beendet mit Signal 15" is German for "terminated with signal 15".)

univention-app info
UCS: 5.0-4 errata794
Installed: adconnector=12.0 fetchmail=6.3.26 mailserver=12.0 open-xchange-guard=2.10.6-ucs1 open-xchange-text=7.10.6-ucs2 ox-connector=2.2.6 oxseforucs=7.10.6-ucs9

dpkg -l | grep fetchmail
ii  fetchmail                    6.4.0~beta4-3+deb10u1A~5.0.0.202104091504  amd64  SSL enabled POP3, APOP, IMAP mail gatherer/forwarder
ii  univention-fetchmail         13.0.6-5                                   all    UCS fetchmail integration for UDM
ii  univention-fetchmail-schema  13.0.6-5                                   all    UCS schema package for univention-fetchmail

Dev Q&A card: https://wekan.knut.univention.de/b/DSf93wFtTAyvGCW3u/development-q-and-a/fJRfYMmjeh4Zg8Suu
In the OTRS ticket the customer writes that after changing _any_ attribute of a user (not necessarily a fetchmail attribute) fetchmail stops polling emails, although the fetchmail process is still running.

At first I thought that this was highly unlikely, because the listener wouldn't trigger for just any change. But then I took a look at the listener (fetchmailrc.py): it has only an LDAP »filter='(objectClass=univentionFetchmail)'« but no »attribute=[...]«. So it does indeed rewrite the fetchmail configuration after _any_ change to a user with UDM fetchmail data.

That is not only inefficient but also potentially problematic. During periods of frequent changes (e.g. school import, school year change etc.), the fetchmailrc is overwritten again and again. Currently the writes are not atomic (that would be: create a temporary file and mv it into place); instead, the fetchmailrc is written to directly. So it is possible that the file is in an inconsistent state while the daemon restarts.

It seems unlikely to me that this is the problem of the referenced customer, but two changes should be made:

1. Add »attribute=[...]« to the listener, so it is only triggered when a relevant LDAP attribute changes.
2. Make writes to the fetchmailrc atomic.

I will create separate bugs for that, because while those fixes will lower the probability of the customer's problem occurring, I don't think they will solve it.
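For illustration, the second proposed change (atomic writes) could look roughly like the sketch below. It is not the code from fetchmailrc.py. The function name `write_atomically` and the default mode are assumptions made for this example, and file ownership (e.g. for the fetchmail user) is deliberately left out.

```python
import os
import tempfile


def write_atomically(path, content, mode=0o600):
    """Replace `path` so that readers see the old or the new file, never a partial one.

    Hypothetical helper for illustration; ownership handling is omitted.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".fetchmailrc.")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.chmod(tmp_path, mode)
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this approach a daemon that starts while the listener is writing would read either the complete previous configuration or the complete new one.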
Do we have any workaround?
While the things listed in the previous comment should be fixed, it's very unlikely that they would cause a problem _every_ time. It is more likely that the customer's problem is a broken configuration.

To debug the customer's problem, they could run:

systemctl stop fetchmail.service
sudo -u fetchmail fetchmail --verbose --nodetach --nosyslog --fetchmailrc /etc/fetchmailrc

If there is a problem with the configuration, it should now be printed to the terminal.
I don't see any configuration error. It looks more like a timing issue. If I change the description of a user, the restart works in about 50% of the cases.

I changed the listener script to call service instead of the initscript. Afterwards I wasn't able to reproduce the problem:

--- fetchmailrc.py.orig 2023-09-21 07:10:24.843304181 +0200
+++ fetchmailrc.py      2023-09-21 07:10:53.623293111 +0200
@@ -237,6 +237,6 @@
         ud.debug(ud.LISTENER, ud.INFO, 'Restarting fetchmail-daemon')
         listener.setuid(0)
         try:
-            listener.run(initscript, ['fetchmail', 'restart'], uid=0)
+            listener.run('/usr/sbin/service', ['service', 'fetchmail', 'restart'], uid=0)
         finally:
             listener.unsetuid()
Created attachment 11129
fetchmail.py.patch

Workaround:

patch -p0 -d /usr/lib/univention-directory-listener/system/ <fetchmail.py.patch
service univention-directory-listener restart
Oh... looks like using an old-style init script directly prevents systemd from doing its thing, leading to "Found left-over process <pid> (<process>) in control group while starting unit."... I wonder if we have more cases like this... Most calls have been converted to "systemctl restart <..>.service" or "service <..> restart", but some remain in UCS...
To the implementer: Just talked to Philipp about what our standards are in UCS5: When implementing, please modify the patch to use "systemctl" instead of "service". (And in Debian maintainer scripts (like postinst) use "deb-systemd-invoke".)
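A minimal sketch of what a restart through systemctl could look like, written with the standard library rather than the listener API so that it is self-contained. The helper names `systemctl_restart_argv` and `restart_unit` are hypothetical and not part of UCS. In the listener the same argv would presumably be passed to listener.run().

```python
import subprocess


def systemctl_restart_argv(unit):
    """Build the argv for restarting a unit through systemd (hypothetical helper)."""
    return ["systemctl", "restart", unit]


def restart_unit(unit, run=subprocess.run):
    """Restart `unit` and return the exit status of systemctl.

    `run` is injectable so the function can be exercised without systemd.
    """
    # check=False: return the exit status instead of raising on failure
    result = run(systemctl_restart_argv(unit), check=False)
    return result.returncode
```

Going through systemd lets it track the processes in the unit's control group itself.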
(In reply to Stefan Gohmann from comment #6)
> Created attachment 11129
> fetchmail.py.patch
>
> Workaround:
>
> patch -p0 -d /usr/lib/univention-directory-listener/system/ <fetchmail.py.patch
> service univention-directory-listener restart

Unfortunately, it didn't work. So the next try is to change the listener script in the following way:

    # listener.run(initscript, ['fetchmail', 'restart'], uid=0)
    listener.run('/usr/sbin/service', ['service', 'fetchmail', 'stop'], uid=0)
    time.sleep(5)
    listener.run('/usr/sbin/service', ['service', 'fetchmail', 'start'], uid=0)
(In reply to Stefan Gohmann from comment #9)
> (In reply to Stefan Gohmann from comment #6)
> > Created attachment 11129
> > fetchmail.py.patch
> >
> > Workaround:
> >
> > patch -p0 -d /usr/lib/univention-directory-listener/system/ <fetchmail.py.patch
> > service univention-directory-listener restart
>
> Unfortunately, it didn't work. So next try is to change the listener script
> in the following way:
>
>     # listener.run(initscript, ['fetchmail', 'restart'], uid=0)
>     listener.run('/usr/sbin/service', ['service', 'fetchmail', 'stop'], uid=0)
>     time.sleep(5)
>     listener.run('/usr/sbin/service', ['service', 'fetchmail', 'start'], uid=0)

The partner / customer gave feedback that it works now.
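The stop, wait, start sequence from the workaround can be sketched as a standalone function. This is an illustration only. It uses systemctl instead of service (following the UCS 5 standard mentioned earlier in this bug), whereas the workaround the customer tested used service. The function name `stop_wait_start` and its parameters are assumptions made for this example.

```python
import subprocess
import time


def stop_wait_start(unit, delay=5.0, run=subprocess.call, sleep=time.sleep):
    """Stop `unit`, pause for `delay` seconds, then start it again.

    Returns the exit status of the start command. `run` and `sleep`
    are injectable so the sequence can be checked without systemd.
    """
    run(["systemctl", "stop", unit])
    # Give the old process time to exit before the new one is started.
    sleep(delay)
    return run(["systemctl", "start", unit])
```

The fixed delay mirrors the time.sleep(5) from the workaround. Whether five seconds is always enough is not established in this bug.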