Bug 46206 - 90univention-bind-post.inst fails caused by a traceback in univention-fix-ucr-dns
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Join (univention-join)
Version: UCS 4.2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.3-0-errata
Assigned To: Philipp Hahn
QA Contact: Jürn Brodersen
Depends on:
Blocks:
Reported: 2018-01-31 15:37 CET by Christina Scheinig
Modified: 2018-06-06 16:16 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 2: Improvement: Would be a product improvement
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.023
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2018013021000244
Bug group (optional):
Max CVSS v3 score:


Description Christina Scheinig univentionstaff 2018-01-31 15:37:49 CET
Configure 90univention-bind-post.inst Mon Jan 29 13:14:23 CET 2018
2018-01-29 13:14:23.447437654+01:00 (in joinscript_init)
Create dns/backend
2018-01-29 13:14:23,844 INFO    __main__.ucr/fwd  Found forwarder 192.168.100.6 from UCRV dns/forwarder1
2018-01-29 13:14:23,845 INFO    __main__.ucr/ns   Found server 192.168.100.4 from UCRV nameserver1
2018-01-29 13:14:23,845 INFO    __main__.ucr/ns   Found server 192.168.100.6 from UCRV nameserver2
2018-01-29 13:14:23,849 INFO    __main__.val      Validated UCS domain server: 192.168.100.4
2018-01-29 13:14:23,886 WARNING __main__.val      UCS master SRV record is unknown at 192.168.100.6, converting into forwarder
2018-01-29 13:14:23,886 INFO    __main__.xor      Skip removing nameservers from forwarders
2018-01-29 13:14:23,893 INFO    __main__.ucr/self Default IP address configured in UCR: 192.168.100.231
2018-01-29 13:14:23,894 INFO    __main__.ns       Skip adding NS
2018-01-29 13:14:23,894 INFO    __main__.ldap     Skip adding master
2018-01-29 13:14:23,894 INFO    __main__.ucr      Updating 'nameserver1': '192.168.100.4' -> '192.168.100.231'
2018-01-29 13:14:23,895 INFO    __main__.ucr      Updating 'nameserver2': '192.168.100.6' -> '192.168.100.4'
2018-01-29 13:14:24,350 INFO    __main__.ucr      Reloading BIND
rndc: connect failed: 127.0.0.1#953: connection refused
File: /etc/bind/named.conf.proxy
File: /etc/bind/named.conf.samba4
File: /etc/resolv.conf
Traceback (most recent call last):
  File "/usr/share/univention-server/univention-fix-ucr-dns", line 406, in <module>
    main()
  File "/usr/share/univention-server/univention-fix-ucr-dns", line 87, in main
    update_ucr(ucr, nameservers, forwarders)
  File "/usr/share/univention-server/univention-fix-ucr-dns", line 340, in update_ucr
    check_call(('rndc', 'reconfig'))
  File "/usr/lib/python2.7/subprocess.py", line 540, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '('rndc', 'reconfig')' returned non-zero exit status 1
Job for bind9.service failed. See 'systemctl status bind9.service' and 'journalctl -xn' for details.
invoke-rc.d: initscript bind9, action "restart" failed.
Wait for bind9:  done
 done
Object exists: cn=services,cn=univention,dc=example,dc=de
Object exists: cn=DNS,cn=services,cn=univention,dc=example,dc=de
Object modified: cn=heart,cn=dc,cn=computers,dc=example,dc=de
2018-01-29 13:14:32.529488172+01:00 (in joinscript_save_current_version)
Configure 92univention-management-console-web-server.inst Mon Jan 29 13:14:32 CET 2018

This causes 92univention-management-console-web-server.inst to fail.

This happens on a DC Slave.
Possibly relevant information from the USI:
______________________________________________________________________________
tcp        0      0 127.0.0.1:953           0.0.0.0:*               LISTEN      115        16709       1079/named       off (0.00/0/0)
______________________________________________________________________________

nameserver1: 192.168.100.4
nameserver2: 192.168.100.6
dns/backend: ldap
dns/debug/level: 0
dns/dlz/debug/level: 0
dns/forwarder1: 192.168.100.6
dns/ipv6: yes
dns/master/address: 127.0.0.1
dns/master/port: 7777

interfaces/eth0/address: 192.168.100.231
interfaces/eth0/broadcast: 192.168.100.255
interfaces/eth0/ipv6/acceptRA: false
interfaces/eth0/netmask: 255.255.255.0
interfaces/eth0/network: 192.168.100.0
interfaces/eth0/start: true
interfaces/eth0/type: static
interfaces/handler: ifplugd
interfaces/primary: eth0

______________________________________________________________________________
Comment 1 BC 2018-04-21 10:19:13 CEST
I encountered a similar problem, also on a DC Slave.
(My UCS complains that 05bind something is pending...)
It seems that `/etc/bind/named.conf.proxy` was missing a semicolon at line 14. Adding it back worked for me.
```
controls{
        inet 127.0.0.1;   # <--- missed here
        allow { 127.0.0.1; };
};
```
Comment 2 Richard Ulmer univentionstaff 2018-05-16 10:32:31 CEST
There were two independent problems visible in the log from the description. It seems to me that the suggestion from comment #1 solves neither of them, because the additional semicolon is syntactically wrong.

The first problem was the traceback, where `rndc reconfig` failed in univention-fix-ucr-dns. I solved this by moving the call to univention-fix-ucr-dns after the start of BIND9.
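
For illustration, the failure mode in the traceback can be reproduced without BIND: `subprocess.check_call` raises `CalledProcessError` whenever the command exits non-zero, which is what `rndc reconfig` does while named is not yet reachable on its control port. The following is a minimal Python 3 sketch, not the code from univention-fix-ucr-dns. `run_reconfig` is a hypothetical helper, and a stand-in command replaces `rndc` so the example runs on any machine.

```python
import subprocess
import sys


def run_reconfig(cmd):
    """Run cmd; return True on success, False if it exits non-zero.

    Hypothetical helper for illustration only. The real script calls
    check_call(('rndc', 'reconfig')) without catching the exception.
    """
    try:
        subprocess.check_call(cmd)
    except subprocess.CalledProcessError as exc:
        # check_call raises for any non-zero exit status.
        sys.stderr.write(
            "command %r failed with exit status %d\n" % (exc.cmd, exc.returncode)
        )
        return False
    return True


# Stand-ins (assumptions): a command that exits with status 1, as `rndc reconfig`
# did in the log, and one that exits with status 0.
FAILING = (sys.executable, "-c", "import sys; sys.exit(1)")
SUCCEEDING = (sys.executable, "-c", "pass")
```

Catching the exception only hides the symptom. The actual fix was the reordering described above, so that BIND9 is already running when `rndc reconfig` is called.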

The second problem was that BIND9 was not always started by the join script. This was caused by an insufficient delay between making changes in /etc/runit/ and restarting BIND9 (see `man runsvdir` for more info).
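
A general way to avoid racing against a service that is still starting is to poll until it accepts connections instead of sleeping for a fixed time. The sketch below is Python 3 and is not what the committed fix does (that fix lives in the join script and waits for /etc/runit to be read again). `wait_for_port` and its parameters are hypothetical names.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connect to (host, port) succeeds, False on timeout.

    Hypothetical helper: polls instead of relying on a fixed delay.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the service is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With this helper, a caller could check `wait_for_port("127.0.0.1", 953)` before invoking `rndc reconfig`. Port 953 is the control port named in the `rndc: connect failed` message in the description.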

If this fix is desired for 4.2-x, please clone this bug.

univention-bind.yaml
37c4b58e877f | Bug #46206: Add yaml file

univention-bind (12.0.1-3)
257974298591 | Bug #46206: Add changelog entry
03d476d34e5d | Bug #46206: Fix traceback, where rndc was called before bind9 was set up
03a057000bac | Bug #46206: Wait for /etc/runit to be read again before restarting bind9
Comment 3 Philipp Hahn univentionstaff 2018-05-16 10:51:06 CEST
(In reply to Richard Ulmer from comment #2)
> There were two independent problems visible in the log from the description.
> It seems to me that the suggestion from comment #1 solves neither of them,
> because the additional semicolon is syntactically wrong.

Yes, it is wrong as `man 5 named.conf` shows this syntax:
>           controls {
>                inet ( ipv4_address | ipv6_address | * )
>                     [ port ( integer | * ) ]
>                     allow { address_match_element; ... }
>                     [ keys { string; ... } ];
>                unix unsupported; // not implemented
>           };

It configures an IPvX control channel; the allow rule belongs directly to this channel and not to all controls. So *no* semicolon there.
The formatting in our UCR template may be confusing.
Comment 4 Philipp Hahn univentionstaff 2018-05-30 12:28:43 CEST
[4.3-0] 80bca0661e Bug #46206 DNS: Fix wait_for_dns
 services/univention-bind/90univention-bind-post.inst | 4 ++--
 services/univention-bind/debian/changelog            | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

Package: univention-bind
Version: 12.0.1-4A~4.3.0.201805301225
Branch: ucs_4.3-0
Scope: errata4.3-0

[4.3-0] 68cfe2a268 Bug #46206 DNS: Fix wait_for_dns YAML
 doc/errata/staging/univention-bind.yaml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
Comment 5 Jürn Brodersen univentionstaff 2018-05-31 16:50:04 CEST
What I tested:
Forced execution of joinscript -> OK
Forced execution of joinscript with broken resolv.conf -> OK
Forced execution of joinscript with stopped bind -> OK
Forced execution of joinscript with self as dns/forwarder -> OK

YAML -> OK

The joinscript version wasn't changed, is that ok? I don't think that is a problem since these were setup problems anyways.

Everything else is good.

Reopened because of the joinscript version. Just set the bug to resolved again if that is no problem.
Comment 6 Philipp Hahn univentionstaff 2018-06-01 14:03:17 CEST
(In reply to Jürn Brodersen from comment #5)
> The joinscript version wasn't changed, is that ok? I don't think that is a
> problem since these were setup problems anyways.

If the VERSION changes, the join script needs to be re-executed. This is not required in this case, as no *additional* step was added:
  git diff 2e576ffbb510bddfd8aa05e48f2f456b96b7f452..80bca0661e8dd72a484ae0e7914efa585453ed7a -- services/univention-bind

If the script previously failed, it is still marked as pending, so no version bump is required to force a re-execution.

Incrementing VERSION generally has the drawback that on DC Slaves and Member Servers the admin must type in her password, as there is no /etc/ldap.secret for automatic execution. We try to avoid this interactive intervention if at all possible.
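
The rule described above can be sketched in a few lines of Python. This only restates the logic of this comment. The function and parameter names are hypothetical, and it does not reflect how univention-join actually stores or compares its state.

```python
def needs_execution(script_version, recorded_version):
    """Decide whether a join script has to run.

    script_version: VERSION declared in the join script.
    recorded_version: last version that completed successfully, or None
        if the script never succeeded (for example because it failed).
    Both names are assumptions made for this illustration.
    """
    if recorded_version is None:
        # Never completed: still pending, no version bump needed.
        return True
    # Completed before: only a higher VERSION forces a re-execution.
    return script_version > recorded_version
```

For this bug, a system where the script failed corresponds to `needs_execution(1, None)`, which is why leaving VERSION unchanged still results in a re-run there.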
Comment 7 Jürn Brodersen univentionstaff 2018-06-04 09:56:18 CEST
OK -> Verified
Comment 8 Erik Damrose univentionstaff 2018-06-06 16:16:23 CEST
<http://errata.software-univention.de/ucs/4.3/92.html>