Bug 45984 - Amavis fails reinserting mail to postfix
Amavis fails reinserting mail to postfix
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Mail
UCS 4.3
Other Linux
: P5 normal (vote)
: UCS 4.3
Assigned To: Daniel Tröder
Sönke Schwardt-Krummrich
: interim-2
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2018-01-08 10:24 CET by Daniel Tröder
Modified: 2018-03-14 14:38 CET (History)
1 user (show)

See Also:
What kind of report is it?: Development Internal
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Tröder univentionstaff 2018-01-08 10:24:33 CET
40_mail/39_postfix_accepts_mails_on_ports_25_465_587 regularly fails (http://jenkins.knut.univention.de:8080/job/UCS-4.3/job/UCS-4.3-0/job/AutotestJoin/lastBuild/SambaVersion=s3,Systemrolle=slave/testReport/junit/40_mail/39_postfix_accepts_mails_on_ports_25_465_587/test/).

In a production environment the problem is not big, because a later retry will succeed.
Comment 1 Daniel Tröder univentionstaff 2018-01-08 10:57:44 CET
There is little to be found on the internet about the error. The most likely problem seems to be a connection caching error in amavis.

Disabling it will not cost much performance, as the relevant sockets connect to a process that is very long running compared to the time needed for a TCP handshake.

Let's try this and wait for tonights Jenkins run:

[4.3-0 62b0964d7c] Bug #45984: disable connection caching for amavis socket
[4.3-0 431bcb7458] Bug #45984: disable connection caching

univention-mail-postfix 12.0.0-7A~4.3.0.201801081037
univention-antivir-mail 9.0.0-4A~4.3.0.201801081055
Comment 2 Daniel Tröder univentionstaff 2018-01-12 09:15:47 CET
40_mail/39_postfix_accepts_mails_on_ports_25_465_587 is stable since 3 Jenkins runs.

QA: the same test is effected in 4.2-3 as well. Should this be backported? If yes, please clone this bug.
Comment 3 Sönke Schwardt-Krummrich univentionstaff 2018-02-25 20:55:00 CET
Is there a reason why you disabled the connection cache in master.cf AND
60-univention?
IIRC the amavis config is aggregated over all files in /etc/amavis/conf.d/. If we disable the connection cache in 60-univention only, the user/support/... would be able to reenable it by creating a small config file /etc/amavis/conf.d/99-hotfix
with the entries
$smtp_connection_cache_enable = 1;
$smtp_connection_cache_on_demand = 1;

So if it is sufficient, I would prefer to remove the change in master.cf.

Next question: would it be sufficient to only set smt

→ REOPEN


Important information from 
https://sourceforge.net/p/amavis/mailman/message/19427075/:

From [amavis] release notes:

- smtp client connection caching is a new feature which allows smtp client
  code in amavisd to keep a SMTP session to MTA open after forwarding a
  message or a notification, so that a next mail message that needs to be
  sent by this child process can avoid re-establishing a session and the
  initial greeting/EHLO (and TLS) handshake.

  A current value of a global settings $smtp_connection_cache_enable
  controls whether a session will be retained after forwarding a message
  or not. Its default initial value is true.

  A global setting $smtp_connection_cache_on_demand controls whether amavisd
  is allowed to dynamically change the $smtp_connection_cache_enable setting
  according to its estimate of the message frequency. The heuristics is
  currently very simple: if time interval between a previous task completion
  by this child process and the arrival of a current message is 5 seconds
  or less, the $smtp_connection_cache_enable is turned on (which will affect
  the next message); if the interval is 15 seconds or more, it is turned off.
  The default value of the $smtp_connection_cache_on_demand is true, thus
  enabling the adaptive behaviour.

  On a busy server the connection caching can save some processing time.
  Savings are substantial if client-side TLS is enabled, otherwise just a
  few milliseconds are saved. On an idle server the feature may unnecessarily
  keep sessions to MTA open (until MTA times them out), so one can disable
  the feature by setting both controls to false (to 0 or undef).

  To monitor the connection caching effectiveness, some SNMP-like counters
  were added, so amavisd-agent may display something like:

    OutConnNew                      2764    319/h    98.2 % (OutMsgs)
    OutConnQuit                     2521    291/h    89.5 % (OutMsgs)
    OutConnReuseFail                   7      1/h     0.2 % (OutMsgs)
    OutConnReuseRecent                21      2/h     0.7 % (OutMsgs)
    OutConnReuseRefreshed             31      4/h     1.1 % (OutMsgs)
    OutConnTransact                 2816    325/h   100.0 % (OutMsgs)

(In reply to Daniel Tröder from comment #0)
> 40_mail/39_postfix_accepts_mails_on_ports_25_465_587 regularly fails
> (http://jenkins.knut.univention.de:8080/job/UCS-4.3/job/UCS-4.3-0/job/
> AutotestJoin/lastBuild/SambaVersion=s3,Systemrolle=slave/testReport/junit/
> 40_mail/39_postfix_accepts_mails_on_ports_25_465_587/test/).

Please note that linking to jenkins is useless. Jenkins keeps currently only the last 6-10 build logs → after 10 days, the link becomes invalid.
Please also note that if the URL contains "/lastBuild/" the URL always refers to the latest build and is therefore no static reference to a specific build/problem.
Please always copy at least the error message to bugzilla.

The original error message was:
[2018-02-24 22:38:50.316323] Feb 24 22:37:20 master090 39_postfix_accepts_mails_on_ports_25_465_587: Sending to port 587.
[2018-02-24 22:38:50.316334] Feb 24 22:37:20 master090 postfix/smtpd[4730]: connect from localhost[127.0.0.1]
[2018-02-24 22:38:50.316344] Feb 24 22:37:20 master090 postfix/smtpd[4730]: 2328DC5195: client=localhost[127.0.0.1]
[2018-02-24 22:38:50.316354] Feb 24 22:37:20 master090 postfix/cleanup[4725]: 2328DC5195: message-id=2ef08322-19dd-11e8-9a31-0260acd61b74.tjbgzprh2t@autotest090.local
[2018-02-24 22:38:50.316366] Feb 24 22:37:20 master090 postfix/qmgr[4666]: 2328DC5195: from=<jvydzbvjnv@autotest090.local>, size=764, nrcpt=1 (queue active)
[2018-02-24 22:38:50.316376] Feb 24 22:37:20 master090 postfix/smtpd[4730]: disconnect from localhost[127.0.0.1]
[2018-02-24 22:38:50.316399] Feb 24 22:37:20 master090 amavis[4175]: (04175-02) (!)No SMTP resp. to XFORWARD, dt: 0.000 s
[2018-02-24 22:38:50.316416] Feb 24 22:37:20 master090 amavis[4175]: (04175-02) (!)mail_via_smtp: error during QUIT: Error writing 146 bytes to socket: Broken pipe at /usr/sbin/amavisd-new line 8263.
[2018-02-24 22:38:50.316433] Feb 24 22:37:20 master090 amavis[4175]: (04175-02) (!)ZVKTtJIoJoMi FWD from <jvydzbvjnv@autotest090.local> -> <mx3j5bgjci@autotest090.local>, BODY=7BIT 451 4.5.0 From MTA(smtp:[127.0.0.1]:10025) during fwd-data-cmd (Error writing 140 bytes to socket: Broken pipe at /usr/sbin/amavisd-new line 8263.): id=04175-02
[2018-02-24 22:38:50.316451] Feb 24 22:37:20 master090 amavis[4175]: (04175-02) Blocked MTA-BLOCKED {TempFailedInternal}, LOCAL [127.0.0.1]:59384 <jvydzbvjnv@autotest090.local> -> <mx3j5bgjci@autotest090.local>, Queue-ID: 2328DC5195, Message-ID: <2ef08322-19dd-11e8-9a31-0260acd61b74.tjbgzprh2t@autotest090.local>, mail_id: ZVKTtJIoJoMi, Hits: 3.368, size: 764, 186 ms
[2018-02-24 22:38:50.316487] Feb 24 22:37:20 master090 postfix/smtp[4726]: 2328DC5195: to=<mx3j5bgjci@autotest090.local>, relay=127.0.0.1[127.0.0.1]:10024, delay=0.23, delays=0.04/0/0/0.18, dsn=4.5.0, status=deferred (host 127.0.0.1[127.0.0.1] said: 451 4.5.0 id=04175-02 - Temporary MTA failure on relaying, From MTA(smtp:[127.0.0.1]:10025) during fwd-data-cmd (Error writing 140 bytes to socket: Broken pipe at /usr/sbin/amavisd-new line 8263.): id=04175-02 (in reply to end of DATA command))
Comment 4 Sönke Schwardt-Krummrich univentionstaff 2018-02-25 21:11:16 CET
> Next question: would it be sufficient to only set smt
→ please ignore that line
Comment 5 Daniel Tröder univentionstaff 2018-02-26 13:08:02 CET
As decided, we'll try to not unnecessarily modify the Postfix configuration. The connection from Postfix to Amavis does not seem to suffer from connection caching problems.

[4.3-0 ce9701a8b7] Bug #45984: don't disable connection caching on postfix' amavis port

univention-mail-postfix (12.0.0-16)
Comment 6 Sönke Schwardt-Krummrich univentionstaff 2018-02-28 22:33:31 CET
(In reply to Daniel Tröder from comment #5)
> As decided, we'll try to not unnecessarily modify the Postfix configuration.
> The connection from Postfix to Amavis does not seem to suffer from
> connection caching problems.
> 
> [4.3-0 ce9701a8b7] Bug #45984: don't disable connection caching on postfix'
> amavis port

The commit does not contain the discussed change:

-- mail/univention-mail-postfix/conffiles/etc/postfix/master.cf.d/10_services --
index a0681f27ab..e40f021d6f 100644
@@ -74,6 +74,6 @@ local     unix  -       n       n       -       -       local
 #virtual   unix  -       n       n       -       -       virtual
 lmtp      unix  -       -       n       -       -       lmtp
 anvil     unix  -       -       n       -       1       anvil
-scache    unix  -       -       -       -       1       scache
+scache    unix  -       -       y       -       1       scache

and a new changelog entry. That's all the commit contains.
→ REOPEN
Comment 7 Daniel Tröder univentionstaff 2018-03-01 09:15:54 CET
Confusion: [4.3-0 ce9701a8b7] was meant for Bug #46102.

New commit:

[4.3-0 4cbbf0a884] Bug #45984: don't disable connection caching on postfix' amavis port (previous commit was for Bug #46102)

univention-mail-postfix (12.0.0-17)
Comment 8 Daniel Tröder univentionstaff 2018-03-01 09:21:37 CET
Actually ce9701a8b7 was meant for Bug #45963.
Comment 9 Sönke Schwardt-Krummrich univentionstaff 2018-03-01 17:37:01 CET
OK: code change
OK: functional change
OK: tests
OK: changelog entry
OK: package built
Comment 10 Stefan Gohmann univentionstaff 2018-03-14 14:38:49 CET
UCS 4.3 has been released:
 https://docs.software-univention.de/release-notes-4.3-0-en.html
 https://docs.software-univention.de/release-notes-4.3-0-de.html

If this error occurs again, please use "Clone This Bug".