Univention Bugzilla – Bug 39792
Joins in Amazon EC2 environments fail from time to time
Last modified: 2015-11-17 12:12:18 CET
The join of slaves or docker memmber servers fails from time to time with different error messages, for example: [2015-11-05 03:28:14.115062]Calling app join scripts... (2015-11-05 03:28:14.839737)ssh: Could not resolve hostname master092.autotest092.local: Name or service not known (2015-11-05 03:28:14.951566)ssh: Could not resolve hostname master092.autotest092.local: Name or service not known (2015-11-05 03:28:15.114579)ssh: Could not resolve hostname master092.autotest092.local: Name or service not known (2015-11-05 03:28:15.243847)Something went wrong while installing the app [2015-11-05 03:28:15.243962]Search LDAP binddn [2015-11-05 03:28:15.244097] [2015-11-05 03:28:15.244223]************************************************************************** [2015-11-05 03:28:15.244350]* Running join scripts failed! * [2015-11-05 03:28:15.244470]************************************************************************** [2015-11-05 03:28:15.244590]* Message: binddn for user Administrator not found [2015-11-05 03:28:15.244709]**************************************************************************
The syslog and daemon.log are now copied: r65222
It seems to be a problem in the bind listener module. bind9 is restarted too often: Nov 5 09:37:48 backup092 named[1875]: received control channel command 'reload autotest092.local' Nov 5 09:37:48 backup092 named[1576]: LDAP sdb zone 'autotest092.local': ldap_result returned -1 Nov 5 09:37:48 backup092 named[1875]: zone autotest092.local/IN: Transfer started. Nov 5 09:37:48 backup092 named[1875]: transfer of 'autotest092.local/IN' from 127.0.0.1#7777: connected using 127.0.0.1#34412 Nov 5 09:37:48 backup092 named[1576]: client 127.0.0.1#34412: transfer of 'autotest092.local/IN': AXFR-style IXFR started Nov 5 09:37:48 backup092 named[1875]: zone autotest092.local/IN: transferred serial 25 Nov 5 09:37:48 backup092 named[1875]: transfer of 'autotest092.local/IN' from 127.0.0.1#7777: Transfer completed: 1 messages, 20 records, 918 bytes, 0.012 secs (76500 bytes/sec) Nov 5 09:37:48 backup092 named[1875]: zone autotest092.local/IN: sending notifies (serial 25) Nov 5 09:37:48 backup092 named[1576]: received control channel command 'reload autotest092.local' Nov 5 09:37:48 backup092 named[1576]: zone autotest092.local/IN: _kerberos._udp.autotest092.local/SRV 'dy3a3rzffh-1446734083265519.autotest092.local' has no address records (A or AAAA) Nov 5 09:37:48 backup092 named[1576]: zone autotest092.local/IN: _kerberos._tcp.autotest092.local/SRV 'dy3a3rzffh-1446734083265519.autotest092.local' has no address records (A or AAAA) Nov 5 09:37:48 backup092 named[1576]: zone autotest092.local/IN: _ldap._tcp.autotest092.local/SRV 'dy3a3rzffh-1446734083265519.autotest092.local' has no address records (A or AAAA) Nov 5 09:37:48 backup092 named[1576]: zone autotest092.local/IN: zone serial (25) unchanged. zone may fail to transfer to slaves. Nov 5 09:37:48 backup092 named[1576]: client 127.0.0.1#34412: transfer of 'autotest092.local/IN': AXFR-style IXFR ended Nov 5 09:37:48 backup092 named[1576]: zone autotest092.local/IN: sending notifies (serial 25) Nov 5 09:37:48 backup092 named[1875]: client 127.0.0.1#9602: received notify for zone 'autotest092.local' Nov 5 09:37:48 backup092 named[1875]: zone autotest092.local/IN: notify from 127.0.0.1#9602: zone is up to date Nov 5 09:37:48 backup092 named[1576]: shutting down Nov 5 09:37:48 backup092 named[1576]: stopping command channel on 127.0.0.1#55555 Nov 5 09:37:48 backup092 named[1576]: no longer listening on ::#7777 Nov 5 09:37:48 backup092 named[1576]: no longer listening on 127.0.0.1#7777 Nov 5 09:37:48 backup092 named[1576]: no longer listening on 10.210.53.115#7777 Nov 5 09:37:48 backup092 named[1576]: no longer listening on 172.17.42.1#7777 Nov 5 09:37:48 backup092 named[1576]: exiting Nov 5 09:37:48 backup092 named[1875]: client 10.210.182.236#32986: received notify for zone 'autotest092.local' Nov 5 09:37:48 backup092 named[1875]: zone autotest092.local/IN: refused notify from non-master: 10.210.182.236#32986 Nov 5 09:37:54 backup092 named[1875]: shutting down Nov 5 09:37:54 backup092 named[1875]: stopping command channel on 127.0.0.1#953 Nov 5 09:37:54 backup092 named[1875]: no longer listening on ::#53 Nov 5 09:37:54 backup092 named[1875]: no longer listening on 127.0.0.1#53 Nov 5 09:37:54 backup092 named[1875]: no longer listening on 10.210.53.115#53 Nov 5 09:37:54 backup092 named[1875]: no longer listening on 172.17.42.1#53 Nov 5 09:37:54 backup092 named[1875]: exiting Nov 5 09:37:54 backup092 named[26896]: starting BIND 9.8.4-rpz2+rl005.12-P1 -c /etc/bind/named.conf -p 7777 -u bind -f -d 0 Nov 5 09:37:54 backup092 named[26896]: built with '--prefix=/usr' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--sysconfdir=/etc/bind' '--localstatedir=/var' '--enable-threads' '--enable-largefile' '--with-libtool' '--enable-shared' '--enable-static' '--with-openssl=/usr' '--with-gssapi=/usr' '--with-gnu-ld' '--with-geoip=/usr' '--enable-ipv6' '--with-dlz-dlopen' 'CFLAGS=-fno-strict-aliasing -DDIG_SIGCHASE -O2'
From my test machine: root@backup092:~# ps -ef | grep -i named bind 5138 2790 0 14:56 ? 00:00:00 /usr/sbin/named -c /etc/bind/named.conf -p 7777 -u bind -f -d 0 bind 5141 2786 0 14:56 ? 00:00:00 /usr/sbin/named -c /etc/bind/named.conf.proxy -u bind -f -d 0 root 24719 23842 0 15:08 pts/0 00:00:00 grep -i named root@backup092:~# ls -la /var/cache/bind/ insgesamt 24 drwxrwxr-x 2 root bind 4096 Nov 5 15:04 . drwxr-xr-x 25 root root 4096 Nov 5 14:37 .. -rw-r--r-- 1 bind bind 484 Nov 5 14:47 210.10.in-addr.arpa.zone -rw-r--r-- 1 bind bind 1082 Nov 5 14:47 autotest092.local.zone -rw-r--r-- 1 bind bind 221 Nov 5 14:47 managed-keys.bind -rw-r--r-- 1 bind bind 512 Nov 5 14:47 managed-keys.bind.jnl root@backup092:~# ls -la /var/cache/univention-bind-proxy/ insgesamt 8 drwxr-xr-x 2 root root 4096 Nov 5 14:47 . drwxr-xr-x 25 root root 4096 Nov 5 14:37 .. root@backup092:~# From daemon.log: Nov 5 15:04:22 backup092 named[5138]: received control channel command 'reload' Nov 5 15:04:22 backup092 named[5138]: loading configuration from '/etc/bind/named.conf' Nov 5 15:04:22 backup092 named[5138]: reading built-in trusted keys from file '/etc/bind/bind.keys' Nov 5 15:04:22 backup092 named[5138]: using default UDP/IPv4 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5138]: using default UDP/IPv6 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5138]: listening on IPv4 interface AExBG0BjpXz3mo2, 192.0.2.1#7777 Nov 5 15:04:22 backup092 named[5138]: sizing zone task pool based on 2 zones Nov 5 15:04:22 backup092 named[5138]: Warning: 'empty-zones-enable/disable-empty-zone' not set: disabling RFC 1918 empty zones Nov 5 15:04:22 backup092 named[5138]: reloading configuration succeeded Nov 5 15:04:22 backup092 named[5138]: LDAP sdb zone '210.10.in-addr.arpa': ldap_result returned -1 Nov 5 15:04:22 backup092 named[5138]: zone 210.10.in-addr.arpa/IN: zone serial (4) unchanged. zone may fail to transfer to slaves. Nov 5 15:04:22 backup092 named[5138]: zone autotest092.local/IN: zone serial (15) unchanged. zone may fail to transfer to slaves. Nov 5 15:04:22 backup092 named[5138]: reloading zones succeeded Nov 5 15:04:22 backup092 named[5138]: zone 210.10.in-addr.arpa/IN: sending notifies (serial 4) Nov 5 15:04:22 backup092 named[5141]: received control channel command 'reload' Nov 5 15:04:22 backup092 named[5141]: loading configuration from '/etc/bind/named.conf.proxy' Nov 5 15:04:22 backup092 named[5141]: reading built-in trusted keys from file '/etc/bind/bind.keys' Nov 5 15:04:22 backup092 named[5141]: using default UDP/IPv4 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5141]: using default UDP/IPv6 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5141]: listening on IPv4 interface AExBG0BjpXz3mo2, 192.0.2.1#53 Nov 5 15:04:22 backup092 named[5141]: sizing zone task pool based on 7 zones Nov 5 15:04:22 backup092 named[5141]: Warning: 'empty-zones-enable/disable-empty-zone' not set: disabling RFC 1918 empty zones Nov 5 15:04:22 backup092 named[5141]: reloading configuration succeeded Nov 5 15:04:22 backup092 named[5138]: zone autotest092.local/IN: sending notifies (serial 15) Nov 5 15:04:22 backup092 named[5141]: reloading zones succeeded Nov 5 15:04:22 backup092 named[5141]: client 127.0.0.1#39801: received notify for zone '210.10.in-addr.arpa' Nov 5 15:04:22 backup092 named[5141]: zone 210.10.in-addr.arpa/IN: notify from 127.0.0.1#39801: zone is up to date Nov 5 15:04:22 backup092 named[5138]: received control channel command 'reload' Nov 5 15:04:22 backup092 named[5138]: loading configuration from '/etc/bind/named.conf' Nov 5 15:04:22 backup092 named[5138]: reading built-in trusted keys from file '/etc/bind/bind.keys' Nov 5 15:04:22 backup092 named[5138]: using default UDP/IPv4 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5138]: using default UDP/IPv6 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5138]: no longer listening on 192.0.2.1#7777 Nov 5 15:04:22 backup092 named[5138]: sizing zone task pool based on 2 zones Nov 5 15:04:22 backup092 named[5138]: Warning: 'empty-zones-enable/disable-empty-zone' not set: disabling RFC 1918 empty zones Nov 5 15:04:22 backup092 named[5138]: reloading configuration succeeded Nov 5 15:04:22 backup092 named[5138]: zone 210.10.in-addr.arpa/IN: zone serial (4) unchanged. zone may fail to transfer to slaves. Nov 5 15:04:22 backup092 named[5138]: zone autotest092.local/IN: zone serial (15) unchanged. zone may fail to transfer to slaves. Nov 5 15:04:22 backup092 named[5138]: reloading zones succeeded Nov 5 15:04:22 backup092 named[5141]: received control channel command 'reload' Nov 5 15:04:22 backup092 named[5141]: loading configuration from '/etc/bind/named.conf.proxy' Nov 5 15:04:22 backup092 named[5141]: reading built-in trusted keys from file '/etc/bind/bind.keys' Nov 5 15:04:22 backup092 named[5141]: using default UDP/IPv4 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5141]: using default UDP/IPv6 port range: [1024, 65535] Nov 5 15:04:22 backup092 named[5141]: no longer listening on 192.0.2.1#53 Nov 5 15:04:22 backup092 named[5141]: sizing zone task pool based on 7 zones Nov 5 15:04:22 backup092 named[5141]: Warning: 'empty-zones-enable/disable-empty-zone' not set: disabling RFC 1918 empty zones Nov 5 15:04:22 backup092 named[5141]: reloading configuration succeeded Nov 5 15:04:22 backup092 named[5141]: reloading zones succeeded Nov 5 15:04:22 backup092 named[5141]: client 10.210.86.252#43370: received notify for zone '210.10.in-addr.arpa' Nov 5 15:04:22 backup092 named[5141]: zone 210.10.in-addr.arpa/IN: refused notify from non-master: 10.210.86.252#43370 Nov 5 15:04:22 backup092 named[5141]: client 127.0.0.1#43370: received notify for zone 'autotest092.local' Nov 5 15:04:22 backup092 named[5141]: zone autotest092.local/IN: notify from 127.0.0.1#43370: zone is up to date Nov 5 15:04:22 backup092 named[5141]: client 10.210.86.252#43370: received notify for zone 'autotest092.local' Nov 5 15:04:22 backup092 named[5141]: zone autotest092.local/IN: refused notify from non-master: 10.210.86.252#43370 Nov 5 15:04:27 backup092 named[5138]: zone 210.10.in-addr.arpa/IN: sending notifies (serial 4) Nov 5 15:04:27 backup092 named[5141]: client 127.0.0.1#63372: received notify for zone '210.10.in-addr.arpa' Nov 5 15:04:27 backup092 named[5141]: zone 210.10.in-addr.arpa/IN: notify from 127.0.0.1#63372: zone is up to date Nov 5 15:04:27 backup092 named[5138]: zone autotest092.local/IN: sending notifies (serial 15) Nov 5 15:04:27 backup092 named[5141]: client 10.210.86.252#9281: received notify for zone '210.10.in-addr.arpa' Nov 5 15:04:27 backup092 named[5141]: zone 210.10.in-addr.arpa/IN: refused notify from non-master: 10.210.86.252#9281 Nov 5 15:04:27 backup092 named[5141]: client 127.0.0.1#9281: received notify for zone 'autotest092.local' Nov 5 15:04:27 backup092 named[5141]: zone autotest092.local/IN: notify from 127.0.0.1#9281: zone is up to date Nov 5 15:04:27 backup092 named[5141]: client 10.210.86.252#9281: received notify for zone 'autotest092.local' Nov 5 15:04:27 backup092 named[5141]: zone autotest092.local/IN: refused notify from non-master: 10.210.86.252#9281
In my case the restarted is triggered because a reverse zone for the Docker interface is created. This is triggered by the 80_docker/52_app_install_dc_slave test case: ------------------------------------------------------------------------------- ==> /var/log/univention/listener.log <== 05.11.15 09:59:29.003 LISTENER ( PROCESS ) : DNS: /var/cache/bind/17.172.in-addr.arpa.zone does not exist. Triggering a bind9 restart. ------------------------------------------------------------------------------- This is the normal DC slave behaviour. So, I'll disable the test case until Bug #39801 has been fixed.
(In reply to Stefan Gohmann from comment #4) > This is the normal DC slave behaviour. So, I'll disable the test case until > Bug #39801 has been fixed. Done via r65258. Now, let's wait for the test results.
(In reply to Stefan Gohmann from comment #5) > (In reply to Stefan Gohmann from comment #4) > > This is the normal DC slave behaviour. So, I'll disable the test case until > > Bug #39801 has been fixed. > > Done via r65258. Now, let's wait for the test results. Looks good.
OK, test disabled
UCS 4.1 has been released: https://docs.software-univention.de/release-notes-4.1-0-en.html https://docs.software-univention.de/release-notes-4.1-0-de.html If this error occurs again, please use "Clone This Bug".