Bug 40338 - Configuration error: host is unresolvable
Summary: Configuration error: host is unresolvable
Status: CLOSED FIXED
Alias: None
Product: UCS
Classification: Unclassified
Component: UMC - App-Center
Version: UCS 4.1
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.1-1-errata
Assignee: Philipp Hahn
QA Contact: Stefan Gohmann
URL:
Keywords:
Duplicates: 31245 40045
Depends on: 38807
Blocks: 41237 41238 41496
Reported: 2015-12-23 13:29 CET by Stefan Gohmann
Modified: 2016-12-08 12:56 CET (History)

Description Stefan Gohmann univentionstaff 2015-12-23 13:29:16 CET
See Bug #40045. The test case 80_docker/65_app_container_upgrade_dockerallowedimage fails because univention-upgrade aborts with an error directly after the upgrade from 4.0-4 to 4.1-0. The error message claims that the repository server can't be reached:

-----------------------------------------------------------------------------
Starting univention-upgrade. Current UCS version is 4.0-4 errata377

Checking for local repository:                          none
Checking for release updates:                           found: UCS 4.1-0

Starting update to UCS version 4.1-0

HINT:
Please check the release notes carefully BEFORE updating to UCS 4.1-0:
 English version: https://docs.software-univention.de/release-notes-4.1-0-en.html
 German version:  https://docs.software-univention.de/release-notes-4.1-0-de.html

Please also consider documents of following release updates and
3rd party components.


Checking for space on /var/cache/apt/archives: OK
Checking for space on /boot: OK
Checking for space on /: OK
Checking for package status: OK

Starting update process, this may take a while.
Check /var/log/univention/updater.log for more information.
Running postup.sh script:done.
The connection to the repository server failed: Configuration error: host is unresolvable. Please check the repository configuration and the network connection.

Starting univention-upgrade. Current UCS version is 4.1-0 errata0

Checking for local repository:                          none
Checking for release updates:                           none
Checking for package updates:                           found

The following packages will be installed:
 dictionaries-common,sudo,univention-sudo,wngerman
The following packages will be upgraded:
 libc-bin,libc6,libwbclient0,libtalloc2,python-ldb,python-tdb,libtdb1,libtevent0,python-talloc,python-samba,samba-common-bin,smbclient,samba-common,libsmbclient,samba-libs,libldb1,ntpdate,multiarch-support,locales,rpcbind,apache2-mpm-prefork,apache2.2-common,apache2.2-bin,apache2-utils,nscd,univention-directory-manager-tools,python-univention-directory-manager,python-univention-directory-manager-cli,python-univention-appcenter,univention-apache,univention-appcenter,univention-directory-listener,univention-errata-level,univention-system-setup,univention-updater,univention-mail-postfix,univention-pam
Starting package upgrade                                done
Checking for app updates:                               none
Checking for release updates:                           none
Checking for package updates:                           none

-----------------------------------------------------------------------------

The repository server seems to have been unreachable for only a short time. From my point of view it is wrong to abort with an error in this case: the upgrade works immediately if I start it again.

It is reproducible with this test case.

+++ This bug was initially created as a clone of Bug #40045 +++

The test case fails on all 4.1 roles:

For example:
http://jenkins.knut.univention.de:8080/job/UCS-4.1/job/UCS-4.1-0/job/Autotest%20MultiEnv/SambaVersion=s3,Systemrolle=master/116/testReport/80_docker/65_app_container_upgrade_dockerallowedimage/test/

(2015-11-19 22:55:58.133293)dh_pysupport: This program is deprecated, you should use dh_python2 instead. Migration guide: http://deb.li/dhs2p
(2015-11-19 22:56:00.397597) dpkg-genchanges -b >../wvy6cedcin_2.1.0.3_amd64.changes
(2015-11-19 22:56:01.053791)dpkg-genchanges: binary-only upload - not including any source code
(2015-11-19 22:56:01.067242) dpkg-source --after-build wvy6cedcin
(2015-11-19 23:04:34.066805)The connection to the repository server failed: Configuration error: host is unresolvable. Please check the repository configuration and the network connection.
(2015-11-19 23:04:34.342293)Release upgrade script failed
(2015-11-19 23:04:34.342321)Aborting...
(2015-11-19 23:04:40.500917)sh: 0: getcwd() failed: No such file or directory
(2015-11-19 23:04:41.834261)sh: 0: getcwd() failed: No such file or directory
(2015-11-19 23:04:42.965774)sh: 0: getcwd() failed: No such file or directory
(2015-11-19 23:04:44.632326)sh: 0: getcwd() failed: No such file or directory
(2015-11-19 23:04:45.801540)sh: 0: getcwd() failed: No such file or directory
(2015-11-19 23:04:46.909183)Traceback (most recent call last):
(2015-11-19 23:04:46.909241)  File "65_app_container_upgrade_dockerallowedimage", line 73, in <module>
(2015-11-19 23:04:46.909319)    app.upgrade()
(2015-11-19 23:04:46.909357)  File "/usr/share/ucs-test/80_docker/dockertest.py", line 166, in upgrade
(2015-11-19 23:04:46.923197)    raise UCSTest_DockerApp_UpgradeFailed()
(2015-11-19 23:04:46.923220)dockertest.UCSTest_DockerApp_UpgradeFailed
Comment 1 Stefan Gohmann univentionstaff 2015-12-23 13:32:30 CET
I've disabled the test case. Please re-enable it after fixing this bug: r66537
Comment 2 Stefan Gohmann univentionstaff 2015-12-23 13:33:25 CET
*** Bug 40045 has been marked as a duplicate of this bug. ***
Comment 3 Philipp Hahn univentionstaff 2016-01-19 17:30:05 CET
1. /etc/resolv.conf inside the Container contains this:
>nameserver  10.200.17.29
>nameserver  10.200.17.28
>nameserver  8.8.4.4
The Google name server (8.8.4.4) will never be able to resolve my local DNS names, so adding it should be disabled.
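
A minimal sketch of such a check, assuming a hypothetical helper `public_nameservers()` (not part of UCS): it scans resolv.conf-style text and reports name servers outside the private and loopback ranges, which cannot know internal host names.

```python
import ipaddress


def public_nameservers(resolv_conf_text):
    """Return nameserver addresses that are neither private nor loopback.

    Hypothetical helper for illustration only; not part of UCS.
    """
    public = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Only "nameserver <address>" lines are of interest.
        if len(fields) >= 2 and fields[0] == "nameserver":
            address = ipaddress.ip_address(fields[1])
            if not (address.is_private or address.is_loopback):
                public.append(fields[1])
    return public


# The three entries quoted from the container's /etc/resolv.conf.
RESOLV_CONF = """\
nameserver  10.200.17.29
nameserver  10.200.17.28
nameserver  8.8.4.4
"""

print(public_nameservers(RESOLV_CONF))
```

Only 8.8.4.4 is flagged; the two 10.200.17.x servers fall into a private range and are kept.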


2. That looks like an error in *resolving* the name to an IP address via a DNS server.
It does *not* look like a failure to reach the repository server itself, but read on.

(In reply to Stefan Gohmann from comment #0)
> The connection to the repository server failed: Configuration error: host is
> unresolvable. Please check the repository configuration and the network
> connection.

As you yourself have stated in Bug #40045 comment 1, `ping` has no problem reaching the server, as `ping` resolves the address once on startup and then reuses that resolved address for each ICMP-ECHO-REQUEST.


3. I get this:
>Traceback (most recent call last):
>  File "65_app_container_upgrade_dockerallowedimage", line 70, in <module>
>    app.verify_basic_modproxy_settings()
>  File "/usr/share/ucs-test/80_docker/dockertest.py", line 358, in verify_basic_modproxy_settings
>    response = urllib2.urlopen('http://%s/%s/index.txt' % (fqdn, self.app_name))
>  File "/usr/lib/python2.7/urllib2.py", line 127, in urlopen
>    return _opener.open(url, data, timeout)
>  File "/usr/lib/python2.7/urllib2.py", line 407, in open
>    response = meth(req, response)
>  File "/usr/lib/python2.7/urllib2.py", line 520, in http_response
>    'http', request, response, code, msg, hdrs)
>  File "/usr/lib/python2.7/urllib2.py", line 445, in error
>    return self._call_chain(*args)
>  File "/usr/lib/python2.7/urllib2.py", line 379, in _call_chain
>    result = func(*args)
>  File "/usr/lib/python2.7/urllib2.py", line 528, in http_error_default
>    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
>urllib2.HTTPError: HTTP Error 502: Proxy Error
This looks as if the services inside the container have already been stopped while they are still being used.

Looking at /var/log/apache2/error.log on the host confirms this:
>[Tue Jan 19 15:42:44 2016] [notice] Graceful restart requested, doing restart
>[Tue Jan 19 15:42:44 2016] [error] (9)Bad file descriptor: apr_socket_accept: (client socket)
>[Tue Jan 19 15:42:44 2016] [warn] Init: Name-based SSL virtual hosts only work for clients with TLS server name indication support (RFC 4366)
>[Tue Jan 19 15:42:45 2016] [notice] Apache/2.2.22 (Univention) PHP/5.4.45-0.216.201510081527 mod_ssl/2.2.22 OpenSSL/1.0.2d configured -- resuming normal operations
>[Tue Jan 19 15:42:50 2016] [error] [client 10.200.17.30] (104)Connection reset by peer: proxy: error reading status line from remote server 127.0.0.1:40002
>[Tue Jan 19 15:42:50 2016] [error] [client 10.200.17.30] proxy: Error reading from remote server returned by /gxul944yym/index.txt

# lsof -i :40002
>COMMAND   PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
>docker  15498 root    4u  IPv6 427938      0t0  TCP *:40002 (LISTEN)

# ps www 15498
>  PID TTY      STAT   TIME COMMAND
>15498 ?        Sl     0:00 docker-proxy -proto tcp -host-ip 0.0.0.0 -host-port 40002 -container-ip 172.17.0.3 -container-port 80

Inside the container there is *no* Apache running!
# tail /var/log/apache2/error.log
>[Tue Jan 19 15:38:53 2016] [notice] Apache/2.2.22 (Univention) PHP/5.4.45-0.216.201510081527 mod_ssl/2.2.22 OpenSSL/1.0.2d configured -- resuming normal operations
>[Tue Jan 19 15:40:37 2016] [notice] caught SIGTERM, shutting down

# tail /var/log/univention/join.log
>RUNNING 08univention-apache.inst
>2016-01-19 15:42:41.384768894+01:00 (in joinscript_init)
>Module ssl disabled.
>To activate the new configuration, you need to run:
>  service apache2 restart
>Enabling module ssl.
>See /usr/share/doc/apache2.2-common/README.Debian.gz on how to configure SSL and create self-signed certificates.
>To activate the new configuration, you need to run:
>  service apache2 restart
>Site default-ssl disabled.
>To activate the new configuration, you need to run:
>  service apache2 reload
>Enabling site default-ssl.
>To activate the new configuration, you need to run:
>  service apache2 reload
>Reloading web server config: apache2 not running.
>Setting ucs/web/overview/entries/admin/ldap-master/label
>Setting ucs/web/overview/entries/admin/ldap-master/label/de
>Setting ucs/web/overview/entries/admin/ldap-master/description
>Setting ucs/web/overview/entries/admin/ldap-master/description/de
>Setting ucs/web/overview/entries/admin/ldap-master/link
>Setting ucs/web/overview/entries/admin/ldap-master/link/de
>Setting ucs/web/overview/entries/admin/ldap-master/priority
>File: /var/www/ucs-overview/entries.json
>2016-01-19 15:42:42.137202315+01:00 (in joinscript_save_current_version)
>EXITCODE=0

# less /var/log/univention/updater.log
...
>Preparing to replace apache2-mpm-prefork 2.2.22-13.93.201510081517 (using .../apache2-mpm-prefork_2.2.22-13.95.201512071109_amd64.deb) ...
>Stopping web server: apache2
>The apache2 configtest failed, so we are trying to kill it manually. This is almost certainly suboptimal, so please make sure your system is working as you'd expect now! ... (warning).
> ... waiting .
...
>Setting up apache2-mpm-prefork (2.2.22-13.95.201512071109) ...
>Starting web server: apache2apache2: Syntax error on line 268 of /etc/apache2/apache2.conf: Could not open configuration file /etc/apache2/sites-enabled/ssl: No such file or directory
>Action 'start' failed.
>The Apache error log may have more information.
> failed!
>invoke-rc.d: initscript apache2, action "start" failed.
...
>Setting up univention-apache (8.0.1-5.255.201601121222) ...
>Installing new version of config file /etc/univention/templates/files/etc/apache2/mods-available/ssl.conf ...


This is caused by r64198 of Bug #38807, which renamed "ssl" to "default-ssl" but did not backport that change to "ucs-4.0/component/docker/", and which removed this line:
>a2dissite ssl || true # TODO: remove, dev version

# apachectl configtest
>apache2: Syntax error on line 268 of /etc/apache2/apache2.conf: Could not open configuration file /etc/apache2/sites-enabled/ssl: No such file or directory
>Action 'configtest' failed.

# readlink /etc/apache2/sites-enabled/ssl    
>../sites-available/ssl

# grep -n site.*ssl /var/lib/dpkg/info/*.p*
>/var/lib/dpkg/info/univention-apache.postinst:50:       a2dissite default-ssl || true
>/var/lib/dpkg/info/univention-apache.postinst:51:       a2ensite default-ssl || true
>/var/lib/dpkg/info/univention-apache.postinst:101:      a2ensite default-ssl || true
>/var/lib/dpkg/info/univention-apache.postinst:104:      a2dissite default-ssl || true

# docker run docker.software-univention.de/ucs-appbox-amd64:4.0-3 /bin/bash -c 'univention-install -qq -y --force-yes univention-apache &>/dev/null && grep -n site.*ssl /var/lib/dpkg/info/*.p*'
/var/lib/dpkg/info/univention-apache.postinst:47:a2ensite ssl || true

That line comes from "ucs-4.0/component/docker/univention-apache/", which creates the "ssl" site for testing purposes.
After the update to the univention-apache package from 4.1-0, the installation is then broken.


4. There are two places where "host is unresolvable" is thrown. The relevant one is in /usr/lib/pymodules/python2.7/univention/updater/tools.py
>            if res.code in (httplib.BAD_GATEWAY, httplib.GATEWAY_TIMEOUT):  # 502 504
>                self.failed_hosts.add(req.get_host())
>                raise ConfigurationError(uri, 'host is unresolvable')

So the error actually comes from Apache, which fails to contact the Apache inside the container.
When the *outside* Apache forwards that request as a proxy, a "BAD_GATEWAY" error is returned when the updater checks for additional updates just after it has upgraded to 4.1-0.
This explains the "strange" error message.
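
The mapping quoted above can be modelled in a few lines. This is a simplified sketch, not the updater's code: `ConfigurationError` here is a stand-in class, and `check_status()` and `describe()` are hypothetical names used only to show how a proxy failure ends up worded as a name-resolution failure.

```python
from http import HTTPStatus


class ConfigurationError(Exception):
    """Stand-in for the updater's exception; simplified for illustration."""


def check_status(uri, code):
    """Model of the quoted check: gateway errors are reported as DNS failures."""
    if code in (HTTPStatus.BAD_GATEWAY, HTTPStatus.GATEWAY_TIMEOUT):  # 502, 504
        raise ConfigurationError("%s: host is unresolvable" % uri)
    return code


def describe(uri, code):
    """Return the message a caller would see for a given HTTP status."""
    try:
        check_status(uri, code)
    except ConfigurationError as exc:
        return str(exc)
    return "ok"


# repo.example is a placeholder host, not a real repository server.
print(describe("http://repo.example/", 502))
print(describe("http://repo.example/", 200))
```

In this model a 502 from a proxy in front of a perfectly resolvable host still yields "host is unresolvable", which matches the confusing message seen in the test.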
Comment 4 Philipp Hahn univentionstaff 2016-01-20 14:15:04 CET
r66895 | Bug #40338 docker: Rename ssl to default-ssl following Bug #38807
 Merge relevant part of r64268,r64198 to rename 'ssl' to 'default-ssl' in docker component for test image.

Package: univention-apache
Version: 7.0.17-3.257.201601201205
Branch: ucs_4.0-0
Scope: docker


announce_ucs_scope.py -s docker -r 4.0-0 -k /etc/archive-keys/ucs4.0.txt -K 6B8BFD3C -u --skip-tag -n
sudo update_mirror.sh ftp/4.0/unmaintained/component/docker


No YAML file needed as it's unmaintained:
# apt-cache policy univention-apache
...
     7.0.17-2.241.201507141426 0
        500 http://updates.software-univention.de/4.0/unmaintained/component/ docker/all/ Packages
     7.0.16-16.248.201510131207 0
        500 http://updates.software-univention.de/4.0/maintained/component/ 4.0-3-errata/all/ Packages


The real problem is this in univention-updater/python/univention-upgrade:
        # check for new updates after updating ; update UCR variable
        #
        # BUG: After an release upgrade this process MUST NOT continue to use old python, ucr, u…
        try:
            update_available = performUpdate(options, checkForUpdates=True, silent=True)

The old updater from 4.0-4 is still running, but the updater.postinst from 4.1-0 changes 'repository/online/server' to '*https://*updates.software-univention.de/' if and only if it was 'updates.software-univention.de' before.
performUpdate() internally reloads the UCR and then tries to use that value as a *plain* host name, which fails.
Our tests did not detect this, as they mostly use "univention-repository.knut.univention.de" and "updates-test.software-univention.de", which the postinst script does not modify. (The docker images already use the public FTP; switching our KVM images to it before doing the update to 4.1-0 also shows the bug; docker-4.1 has Bug #40430.)

Configuring logging in the old updater confirms this:
> Running postup.sh script:done.
> 2016-01-20 12:24:48,668 INFO:updater.UCSHttp:Requesting http://https://updates.software-univention.de//univention-repository/
> 2016-01-20 12:24:48,705 ERROR:updater.UCSHttp:Failed HEAD http://https://updates.software-univention.de//univention-repository/: <urlopen error [Errno -2] Name or service not known>
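
The malformed URL in the log can be reproduced with a small sketch. `old_style_url()` models the assumption that the UCR value is a bare host name; `split_server()` is a hypothetical normalisation that accepts both forms. Neither is taken from the updater sources.

```python
from urllib.parse import urlsplit


def old_style_url(server):
    """Model of the old behaviour: the value is assumed to be a bare host name."""
    return "http://%s/univention-repository/" % server


def split_server(value, default_scheme="http"):
    """Hypothetical normalisation: accept either a bare host name or a URL."""
    if "://" not in value:
        value = "%s://%s" % (default_scheme, value)
    parts = urlsplit(value)
    return parts.scheme, parts.hostname


broken = old_style_url("https://updates.software-univention.de/")
print(broken)                     # same URL as in the log above
print(urlsplit(broken).hostname)  # the part that would be handed to the resolver
print(split_server("https://updates.software-univention.de/"))
print(split_server("updates.software-univention.de"))
```

Parsed this way, the host part of the broken URL is just "https", which is presumably why the resolver answers "Name or service not known".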


We need to fix that in 4.0-4, so that users hopefully already have the fixed version when they start updating to 4.1-0.
The fix also needs to be forward-ported to 4.1-0.
Comment 5 Philipp Hahn univentionstaff 2016-02-20 16:51:20 CET
r67590 | Bug #40338 up: Re-indent nested code
r67589 | Bug #40338 up: Invert logic checks
r67588 | Bug #40338 up: Re-execute update executable after update
r67587 | Bug #40338 up: Re-execute after release upgrade
r67586 | Bug #40338 up: Allow check without locking
r67585 | Bug #40338 up: Union check and upgrade
r67584 | Bug #40338 up: Print message once
r67583 | Bug #40338 up: Remove impossible case
r67582 | Bug #40338 up: Improve logging

Package: univention-updater
Version: 11.0.9-3.1446.201602201639
Branch: ucs_4.1-0
Scope: errata4.1-1

r67591 | Bug #40338 up: Re-execute after update YAML
 univention-updater.yaml

There are some semantic changes:
 The old updater did:
 1. release updates (until the next minor/major version)
 2. package updates
 3. App updates
 4. repeat above with next minor/major release

 The new updater does:
 1. package updates, only then exec()
 2. App updates, only then exec()
 3. update to next patch-level/minor/major, then exec()
 Each exec() starts that loop again with 1.

This allows us to
1. provide errata updates first
2. fix apps before update
3. only then do updates

The problem is passing the state between re-executed updater processes: they share nothing and have to re-derive their state from the system. As such, the updater always starts again with 1. It only reaches 2. if there are no package updates, and it only reaches 3. when neither package nor App updates exist.
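
The ordering described above can be sketched as follows. All names (`next_step()`, `reexec()`, `simulate()`) are hypothetical and the three booleans stand in for whatever the real updater reads from the system; this is a dry-run model under those assumptions, not the univention-upgrade implementation.

```python
import os
import sys


def next_step(package_updates, app_updates, release_update):
    """Pick the single step this process should run, in the order listed above."""
    if package_updates:
        return "packages"
    if app_updates:
        return "apps"
    if release_update:
        return "release"
    return None  # nothing left to do


def reexec():
    """Replace the current process with a fresh copy of the same program.

    Shown for completeness only; the dry run below never calls it.
    """
    os.execv(sys.executable, [sys.executable] + sys.argv)


def simulate(state):
    """Dry run: each loop iteration stands for one re-executed updater process."""
    state = dict(state)
    keys = {"packages": "package_updates", "apps": "app_updates", "release": "release_update"}
    order = []
    while True:
        step = next_step(**state)  # state is re-derived on every "process start"
        if step is None:
            return order
        order.append(step)
        state[keys[step]] = False  # the step ran; a real updater would exec() here


print(simulate({"package_updates": True, "app_updates": True, "release_update": True}))
```

Because every fresh process decides from scratch, no state has to survive the exec(); the cost is that a pending package update always wins over App and release updates.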

Another problem is that we need to fix the updater of 4.0-4 and 4.1-0 with errata. The update would still install the old (broken) updater from the non-errata repository. I therefore propose releasing the updates for 4.0-4e and 4.1-0e with a version number higher than 4.1-1 (but lower than 4.1-1e): the new updater will then have the higher version until 4.1-1e and will not get replaced by the broken ones from 4.1-0 and 4.1-1.
Or at least the update for 4.0-4e should be 4.1-0 < x < 4.1-0e and the one for 4.1-0e should be x > 4.1-1.

After QA and deciding on the above this bug needs to be cloned for 4.0-4e and 4.1-0e (moving this to 4.1-1e)

Another minor change: --setucr can now be used without --check.
Comment 6 Philipp Hahn univentionstaff 2016-02-23 11:56:56 CET
r67625 | Bug #40338 up: Remove traceback from default debug output

Package: univention-updater
Version: 11.0.9-4.1447.201602231153
Branch: ucs_4.1-0
Scope: errata4.1-1

r67626 | Bug #40338 up: Remove traceback from default debug output YAML
 univention-updater.yaml
Comment 7 Stefan Gohmann univentionstaff 2016-04-11 08:44:43 CEST
Code review: OK

YAML: OK, I've removed 4.1-0 from YAML file.

Tests: OK
Comment 8 Janek Walkenhorst univentionstaff 2016-04-13 14:52:49 CEST
<http://errata.software-univention.de/ucs/4.1/152.html>
Comment 9 Philipp Hahn univentionstaff 2016-12-08 12:56:17 CET
*** Bug 31245 has been marked as a duplicate of this bug. ***