Bug 40061 - Apache fails to restart while an app is updated
Apache fails to restart while an app is updated
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Apache
UCS 4.1
Other Linux
: P5 normal (vote)
: UCS 4.1-0-errata
Assigned To: Felix Botner
Daniel Tröder
:
: 24951 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2015-11-21 23:55 CET by Dirk Wiesenthal
Modified: 2015-12-09 16:49 CET (History)
4 users (show)

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dirk Wiesenthal univentionstaff 2015-11-21 23:55:58 CET
Seen this twice now. There is a rogue apache claiming 443 but it is not recognized as a real apache:
root@master200:~# ps fax | grep apache
  854 pts/0    S+     0:00          \_ grep apache
27090 ?        S      0:01 /usr/sbin/apache2 -k start


Enabling site owncloud.
To activate the new configuration, you need to run: 
service apache2 reload
Restarting web server: apache2(98)Address already in use: make_sock: could not bind to address [::]:443
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:443 
no listening sockets available, shutting down
Unable to open logs
Action 'start' failed.
The Apache error log may have more information.
failed!
invoke-rc.d: initscript apache2, action "restart" failed.
owncloud: Unterprozess installiertes post-installation-Skript gab den Fehlerwert 1 zurück
dpkg: Fehler beim Bearbeiten von owncloud (--configure):
Unterprozess installiertes post-installation-Skript gab den Fehlerwert 1 zurück
dpkg: Abhängigkeitsprobleme verhindern Konfiguration von owncloud-meta-8.2:
owncloud-meta-8.2 hängt ab von owncloud (= 2015.8.2.1-enterprise-0); aber:
Paket owncloud ist noch nicht konfiguriert.

dpkg: Fehler beim Bearbeiten von owncloud-meta-8.2 (--configure):
Abhängigkeitsprobleme - verbleibt unkonfiguriert
Fehler traten auf beim Bearbeiten von:
owncloud
owncloud-meta-8.2
Error in function:
SystemError: E:Sub-process /usr/bin/dpkg returned an error code (1)
owncloud-meta-8.2: Abhängigkeitsprobleme - verbleibt unkonfiguriert 
owncloud-meta-8.2: Installation fehlgeschlagen
Comment 1 Felix Botner univentionstaff 2015-11-24 15:54:37 CET
Seems that "/etc/init.d/apache2 force-reload" is the problem.

During the force-reload "apache2ctl graceful" is started, which kills all worker processes and starts new worker process.

Now the status of the running apache is checked with "timeout 3 apache2ctl status". On the test machine this command hits the timeout (although everything is fine) and therefore all processes are killed. But this seems to fail. The master process is killed but not all new worker processes and the following apache restart fails, because the left over worker processes are blocking port 443:


-> invoke-rc.d apache2 apache2 force-reload


root      5685  6.6  9.3 397288 96008 ?        Ss   14:51   0:00 /usr/sbin/apache2 -k start
www-data  5705  0.0  8.3 399412 85608 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5706  0.0  8.3 399412 85608 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5707  0.0  8.3 399412 85608 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5708  0.0  8.3 399412 85608 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5709  0.0  8.3 399412 85608 ?        S    14:51   0:00 /usr/sbin/apache2 -k start

+ env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin /usr/sbin/apache2ctl graceful

root      5685  7.0  9.3 397288 96008 ?        Rs   14:51   0:00 /usr/sbin/apache2 -k start

+ timeout 3 apache2ctl status
+ pkill -9 -f '^/usr/sbin/apache2 -k'

www-data  5815  0.0  8.6 399576 88816 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5816  0.0  8.6 399576 88816 ?        S    14:51   0:00 /usr/sbin/apache2 -k start
www-data  5817  0.0  8.5 399396 87032 ?        S    14:51   0:00 /usr/sbin/apache2 -k start

The main process is killed by pkill, but some of the new workers still running, best guess: 

the "apache2ctl graceful" took longer than the 3s timeout
pkill sees/kills only the main process send
but the main process started some workers before it got killed

+ /etc/init.d/apache2 restart
...
+ env -i LANG=C PATH=/usr/local/bin:/usr/bin:/bin /usr/sbin/apache2ctl start
(98)Address already in use: make_sock: could not bind to address [::]:443
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:443

and the next restart fails
Comment 2 Felix Botner univentionstaff 2015-11-24 17:06:25 CET
i suggest (timeout 5, apache2ctl stop before pkill):

if ! timeout 5 apache2ctl status ; then
    $APACHE2CTL stop
    pkill -9 -f '^/usr/sbin/apache2 -k'
    "$0" restart
fi


worked in my test setup, kolab app installation/deinstallation also worked.


One alternative would be to always restart apache, even on reload, force-reload, graceful.
Comment 3 Felix Botner univentionstaff 2015-11-26 13:52:28 CET
(In reply to Felix Botner from comment #2)
> i suggest (timeout 5, apache2ctl stop before pkill):
> 
> if ! timeout 5 apache2ctl status ; then
>     $APACHE2CTL stop
>     pkill -9 -f '^/usr/sbin/apache2 -k'
>     "$0" restart
> fi

cp apache2 from 4.1-0 to 4.1-0-errata4.1-0 and modified patch 10-apache2-reload.patch (timeout 5, $APACHE2CTL stop before pkill)

apache2.yaml
Comment 4 Florian Best univentionstaff 2015-12-02 17:27:56 CET
apache2.2-bin 2.2.22-13.94.201511261335

I can still trigger this when calling 'apt-get remove univention-self-service'.

Restarting web server: apache2(98)Address already in use: make_sock: could not bind to address [::]:443
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:443
no listening sockets available, shutting down
Unable to open logs
Action 'start' failed.
The Apache error log may have more information.
 failed!
invoke-rc.d: initscript apache2, action "restart" failed.
Restarting web server: apache2 ... waiting .
Comment 5 Florian Best univentionstaff 2015-12-02 17:31:09 CET
*** Bug 24951 has been marked as a duplicate of this bug. ***
Comment 6 Felix Botner univentionstaff 2015-12-04 14:26:30 CET
(In reply to Florian Best from comment #4)
> apache2.2-bin 2.2.22-13.94.201511261335
> 
> I can still trigger this when calling 'apt-get remove
> univention-self-service'.
> 
> Restarting web server: apache2(98)Address already in use: make_sock: could
> not bind to address [::]:443
> (98)Address already in use: make_sock: could not bind to address 0.0.0.0:443
> no listening sockets available, shutting down
> Unable to open logs
> Action 'start' failed.
> The Apache error log may have more information.
>  failed!
> invoke-rc.d: initscript apache2, action "restart" failed.
> Restarting web server: apache2 ... waiting .

We haven't changed the reload function, so this is pure debian. But it is true, apache2 restart started multiple times can lead to this error, but i will fix that in univention-self-service as a restart in prerm and postrm makes no sense to me.
Comment 7 Felix Botner univentionstaff 2015-12-04 14:33:50 CET
But we have another problem, i had kolab-enterprise installed and with it comes a php-kolab module. With this module apache reload can lead to a segfault in apache (after the reload, all apache processes are gone). 

Our patch
                     if $APACHE2CTL graceful $2 ; then
                         log_end_msg 0
+                        if ! timeout 5 apache2ctl status >/dev/null; then
+                            $APACHE2CTL stop
+                            pkill -9 -f '^/usr/sbin/apache2 -k'
+                            "$0" restart
should fix this, but unfortunately it doesn't work. apache2ctl status also returns 0 if apache is not running.
Comment 8 Felix Botner univentionstaff 2015-12-07 11:28:29 CET
Felix Botner from comment #7)
> But we have another problem, i had kolab-enterprise installed and with it
> comes a php-kolab module. With this module apache reload can lead to a
> segfault in apache (after the reload, all apache processes are gone). 
> 
> Our patch
>                      if $APACHE2CTL graceful $2 ; then
>                          log_end_msg 0
> +                        if ! timeout 5 apache2ctl status >/dev/null; then
> +                            $APACHE2CTL stop
> +                            pkill -9 -f '^/usr/sbin/apache2 -k'
> +                            "$0" restart
> should fix this, but unfortunately it doesn't work. apache2ctl status also
> returns 0 if apache is not running.

Can be reproduced on a UCS 4.1 + kolab-enterprise App with the following script:

a2dissite univention-saml
a2dismod actions
a2dismod suexec
invoke-rc.d apache2 restart
sleep 3
a2ensite univention-saml
a2enmod actions
a2enmod suexec
invoke-rc.d apache2 reload


1. univention-self-service.yaml

removed apache restart from prerm, restart apache (once) in postrm

2. apache2.yaml

reload: restart apache if apache2ctl status or pidof_apache fails(In reply to
Comment 9 Daniel Tröder univentionstaff 2015-12-09 09:32:50 CET
OK: code review
OK: advisory
OK: manual tests:

root@Test35:~# udm mail/folder create --position cn=folder,cn=mail,$ldap_base --set name=pub1 --set mailDomain=$domainname --set mailHomeServer=$hostname.$domainname --append sharedFolderUserACL="test1m@uni.dtr write" --append sharedFolderUserACL="test2m@uni.dtr all" --append sharedFolderUserACL="test3m@uni.dtr read"

root@Test35:~# udm mail/folder create --position cn=folder,cn=mail,$ldap_base --set name=pub2 --set mailDomain=$domainname --set mailHomeServer=$hostname.$domainname --append sharedFolderUserACL="test1m@uni.dtr write" --append sharedFolderUserACL="test2m@uni.dtr all" --append sharedFolderUserACL="test3m@uni.dtr read" --set mailPrimaryAddress=pub2m@uni.dtr

root@Test35:~# cp /var/spool/dovecot/public/Uni.Dtr/pub1/.INBOX/dovecot-acl pub1-before
root@Test35:~# cp /var/spool/dovecot/private/uni.dtr/pub2m/Maildir/dovecot-acl pub2m-before

root@Test35:~# cat pub1-before
user=test1m@uni.dtr ilprwts
user=test2m@uni.dtr ailprwts
user=test3m@uni.dtr lrws

root@Test35:~# diff pub1-before pub2m-before 

root@Test35:~# univention-upgrade 

root@Test35:~# grep univention-mail-dovecot /var/univention-join/status
univention-mail-dovecot v1 successful
univention-mail-dovecot v2 successful

root@Test35:~# cat /var/log/univention/reapply_shared_folder_acls.log
09.12.15 09:17:40.217  DEBUG_INIT
09.12.15 09:17:40.224  MAIN        ( INFO    ) : Initialising reapply_shared_folder_acls...
09.12.15 09:17:40.268  MAIN        ( INFO    ) : Looking for objects matching to following LDAP filter:
   (&(objectClass=univentionMailSharedFolder)(univentionMailHomeServer=Test35.Uni.Dtr))
09.12.15 09:17:40.270  MAIN        ( PROCESS ) : DN: 'cn=pub1@Uni.Dtr,cn=folder,cn=mail,dc=Uni,dc=Dtr'
09.12.15 09:17:41.464  LISTENER    ( PROCESS ) : reapply_shared_folder_acls: Updated shared mailbox configuration.
09.12.15 09:17:41.623  LISTENER    ( PROCESS ) : reapply_shared_folder_acls: Set ACLs on 'pub1@Uni.Dtr'.
09.12.15 09:17:41.623  MAIN        ( PROCESS ) : ACLs updated
09.12.15 09:17:41.623  MAIN        ( PROCESS ) : DN: 'cn=pub2@Uni.Dtr,cn=folder,cn=mail,dc=Uni,dc=Dtr'
09.12.15 09:17:41.714  LISTENER    ( PROCESS ) : reapply_shared_folder_acls: Set ACLs on 'pub2m@uni.dtr'.
09.12.15 09:17:41.715  MAIN        ( PROCESS ) : ACLs updated
09.12.15 09:17:41.715  MAIN        ( PROCESS ) : Done

root@Test35:~# diff pub1-before /var/spool/dovecot/public/Uni.Dtr/pub1/.INBOX/dovecot-acl
1,2c1,2
< user=test1m@uni.dtr ilprwts
< user=test2m@uni.dtr ailprwts
---
> user=test1m@uni.dtr eilprwts
> user=test2m@uni.dtr aeilprwts

root@Test35:~# diff pub2m-before /var/spool/dovecot/private/uni.dtr/pub2m/Maildir/dovecot-acl
1,2c1,2
< user=test1m@uni.dtr ilprwts
< user=test2m@uni.dtr ailprwts
---
> user=test1m@uni.dtr eilprwts
> user=test2m@uni.dtr aeilprwts

root@Test35:~# diff /var/spool/dovecot/public/Uni.Dtr/pub1/.INBOX/dovecot-acl /var/spool/dovecot/private/uni.dtr/pub2m/Maildir/dovecot-acl


* Manual tests with Horde webmail worked as expected.
Comment 10 Daniel Tröder univentionstaff 2015-12-09 09:34:49 CET
wrong bug
Comment 11 Daniel Tröder univentionstaff 2015-12-09 11:26:05 CET
OK: can reproduce error
OK: advisory
OK: manual test:

invoke-rc.d apache2 start
a2dissite univention-saml
a2dismod actions
a2dismod suexec
invoke-rc.d apache2 restart
sleep 3
a2ensite univention-saml
a2enmod actions
a2enmod suexec
invoke-rc.d apache2 reload
www-browser -dump http://localhost:80/server-status > /dev/null && echo OK || echo FAIL
ELinks: Verbindungsaufbau abgelehnt
FAIL

univention-upgrade

invoke-rc.d apache2 start
a2dissite univention-saml
a2dismod actions
a2dismod suexec
invoke-rc.d apache2 restart
sleep 3
a2ensite univention-saml
a2enmod actions
a2enmod suexec
invoke-rc.d apache2 reload
www-browser -dump http://localhost:80/server-status > /dev/null && echo OK || echo FAIL
OK
Comment 12 Janek Walkenhorst univentionstaff 2015-12-09 16:44:24 CET
<http://errata.software-univention.de/ucs/4.1/20.html>
Comment 13 Janek Walkenhorst univentionstaff 2015-12-09 16:49:21 CET
<http://errata.software-univention.de/ucs/4.1/24.html>