Univention Bugzilla – Bug 37491
Installing KVM during system setup breaks external DNS resolution
Last modified: 2019-01-18 12:02:19 CET
I just installed a fresh UCS 4.0-0 from DVD/ISO and specified '220.127.116.11' as nameserver in the installer (d-i). I then configured the system to become a DC Master. After the installation finished, I ended up with these DNS settings:
> dns/forwarder1: 18.104.22.168
> nameserver1: 10.200.30.22
> nameserver2: 22.214.171.124
I didn't expect that 126.96.36.199 would also be added as nameserver2, since there is little chance 188.8.131.52 will be able to resolve anything in my domain.
Task #6732 UCS Technical Training
I experienced this on all 6 setups:
- I installed UCS-4.2-1 on Wednesday
- I upgraded successfully to UCS-4.2-2 on Thursday morning using updates.software-univention.de, so DNS was working then
- later I tried to set up a Windows VM, which complained about a missing "Internet Connection". This was caused by the DC Master not doing external DNS resolution, as 172.1..0.1 was configured as UCRV "nameserver2", not as UCRV "dns/forwarder1".
- After running /usr/share/univention-server/univention-fix-ucr-dns manually, the nameserver configuration was fixed by moving the external DNS server to the forwarder.
I was able to reproduce the bug after the training by setting up a new UCS-4.2-1 system:
- after a fresh install the external DNS server is configured as UCRV "nameserver2", not as UCRV "dns/forwarder1".
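To make the expected split concrete: the "nameserverN" variables should only point at DNS servers of the UCS domain, while external resolvers belong in "dns/forwarderN". The following is a minimal sketch of a check for that, not the logic of univention-fix-ucr-dns. The function name `misplaced_nameservers` and the plain dict standing in for UCR are assumptions made for illustration; the addresses are those of the reproduction system.

```python
def misplaced_nameservers(ucr, domain_dns_servers):
    """Return (key, address) pairs for nameserverN entries that are not
    DNS servers of the domain and so probably belong in dns/forwarderN.

    `ucr` is a plain dict standing in for the UCR variables (assumption).
    """
    misplaced = []
    for key in sorted(ucr):
        suffix = key[len("nameserver"):]
        if key.startswith("nameserver") and suffix.isdigit():
            address = ucr[key]
            if address and address not in domain_dns_servers:
                misplaced.append((key, address))
    return misplaced


# State observed after the fresh install: the external resolver sits in
# nameserver2 and dns/forwarder1 is empty.
settings = {
    "dns/forwarder1": "",
    "nameserver1": "172.16.1.50",
    "nameserver2": "172.16.0.1",
}
print(misplaced_nameservers(settings, {"172.16.1.50"}))
```

With the DC Master itself (172.16.1.50) as the only domain DNS server, the sketch reports nameserver2 as the entry that should have become a forwarder.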
/var/log/univention/setup.log has this:
>=== 30_net/16forwarder (2017-09-15 18:11:56) ===
>__NAME__:30_net/16forwarder Setting external name servers
>Restarting bind9 Domain Name Server (DNS): Unknown DNS backend failed!
>run-parts: executing /usr/lib/univention-system-setup/scripts/30_net/18proxy --network-only --appliance-mode
>2017-09-15 18:16:05.564326527+02:00 (in joinscript_init)
>2017-09-15 18:16:06,090 INFO __main__.ucr/ns Found server 172.16.0.1 from UCRV nameserver1
>2017-09-15 18:16:36,106 WARNING __main__.val Connection check to 172.16.0.1 (Timeout) failed, maybe down?!
>2017-09-15 18:16:36,106 INFO __main__.val Leaving it configured as nameserver anyway
>2017-09-15 18:16:36,106 INFO __main__.xor Skip removing nameservers from forwarders
>2017-09-15 18:16:36,110 INFO __main__.ucr/self Default IP address configured in UCR: 172.16.1.50
>2017-09-15 18:16:36,110 INFO __main__.ns Skip adding NS
>2017-09-15 18:16:36,110 INFO __main__.ldap Skip adding master
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver1': '172.16.0.1' -> '172.16.1.50'
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver2': None -> '172.16.0.1'
>2017-09-15 18:16:36,333 INFO __main__.ucr Reloading BIND
>Restarting bind9 Domain Name Server (DNS): samba4 ldap proxy failed!
>invoke-rc.d: initscript bind9, action "restart" failed.
>Wait for bind9: .Restarting bind9 Domain Name Server (DNS): samba4 ldap proxy.
>Object modified: cn=default-settings,cn=dns,cn=dhcp,cn=policies,dc=schulung5-ucs,dc=intranet
>Object exists: cn=services,cn=univention,dc=schulung5-ucs,dc=intranet
>Object created: cn=DNS,cn=services,cn=univention,dc=schulung5-ucs,dc=intranet
>Object modified: cn=dc0,cn=dc,cn=computers,dc=schulung5-ucs,dc=intranet
>2017-09-15 18:16:51.729830310+02:00 (in joinscript_save_current_version)
>=== 90_postjoin/20upgrade (2017-09-15 18:17:21) ===
>__NAME__:90_postjoin/20upgrade Upgrading the system
>__MSG__:This might take a while depending on the number of pending updates.
>Running upgrade on DC Master: univention-upgrade --noninteractive --updateto 4.2-99
>Starting univention-upgrade. Current UCS version is 4.2-1 errata52
>Checking for local repository: none
>The connection to the repository server failed: Configuration error: host is unresolvable. Please check the repository configuration and the network connection.
>=== DONE (2017-09-15 18:17:29) ===
>=== done (2017-09-15 18:17:38) ===
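The relevant part of this log is the pair of "Updating" lines, which show the external server being moved to nameserver2 instead of dns/forwarder1. As a rough sketch for pulling those lines out of a setup.log when comparing several systems, the snippet below matches the message format seen above. The regular expression and the function name `ucr_updates` are assumptions derived from this excerpt only, not from the script that writes the log.

```python
import re

# Matches e.g.:  Updating 'nameserver2': None -> '172.16.0.1'
UPDATE_RE = re.compile(r"Updating '([^']+)': (None|'[^']*') -> (None|'[^']*')")


def _value(token):
    """Turn the logged repr back into None or the unquoted string."""
    return None if token == "None" else token[1:-1]


def ucr_updates(log_text):
    """Return (variable, old_value, new_value) for every logged UCR update."""
    return [
        (m.group(1), _value(m.group(2)), _value(m.group(3)))
        for m in UPDATE_RE.finditer(log_text)
    ]


sample = """\
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver1': '172.16.0.1' -> '172.16.1.50'
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver2': None -> '172.16.0.1'
>2017-09-15 18:16:36,333 INFO __main__.ucr Reloading BIND
"""
for variable, old, new in ucr_updates(sample):
    print(variable, old, "->", new)
```

A healthy run would be expected to show an update of dns/forwarder1 here; in the failing case only the two nameserver variables change.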
This only happens when "KVM" is selected during system setup, which configures the network bridge in the chroot environment, which breaks networking:
> $ ip r
> default via 172.16.1.1 dev eth0
> 172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
> 172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
> $ ip a
> 2: eth0:
> inet 172.16.1.50/24 ...
> 3: br0:
> inet 172.16.1.50/24 ...
Pinging 172.16.0.1 no longer works.
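The symptom in the routing table is that the same directly connected prefix is listed on both eth0 and br0. A small sketch for spotting this in captured "ip r" output follows. It only parses the text format shown above; the function name `duplicate_link_routes` is hypothetical and the sketch does not query the kernel itself.

```python
from collections import defaultdict


def duplicate_link_routes(ip_route_output):
    """Map each link-scope prefix to its devices, keeping only prefixes
    that appear on more than one device."""
    devices_by_prefix = defaultdict(list)
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Only directly connected routes ("scope link") with a device name.
        if "scope link" not in line or "dev" not in fields[:-1]:
            continue
        device = fields[fields.index("dev") + 1]
        devices_by_prefix[fields[0]].append(device)
    return {p: d for p, d in devices_by_prefix.items() if len(d) > 1}


sample = """\
default via 172.16.1.1 dev eth0
172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
"""
print(duplicate_link_routes(sample))
```

For the output captured in this report the sketch flags 172.16.1.0/24 as being present on both eth0 and br0.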
Looking in /var/log/univention/config-registry.replog shows this:
~2017-09-17 09:45:49 interfaces/eth0/* is configured
~2017-09-17 10:07:30 ucs-kvm-setup-bridge transferred the settings from eth0 to br0
Bug #36085 comment 3 (ucs-4.0-0@55526) moved the code for unsetting "interfaces/restart/auto" earlier, so the code now gets executed while still in the chroot environment.
It works when I set interfaces/restart/auto=no manually on the text console as soon as USS is started.
Short-term we should prevent "ucs-kvm-setup-bridge" from updating the interface until the next reboot.
Long-term we should make "interfaces/restart/auto=no" the default.
This also explains why none of our tests detected this, as we don't test nested virtualization in EC2!
Task #9985 UCS Technical Training (again)
Task #10198 UCS Technical Training (again)
Task #10200 UCS Technical Training (again):
> dns/forwarder1: <empty>
> nameserver1: 172.16.1.10 <- UCS Master
> nameserver2: 172.16.0.1 <- External DNS server, should be dns/forwarder1
Philipp pointed out the underlying reason for this in Comment 1.
The tl;dr is:
Installing KVM during system setup breaks external DNS resolution.
Does this issue still happen?
(In reply to Stefan Gohmann from comment #5)
> Does this issue still happen?
Yes: I just tried UCS-4.3-2
(In reply to Philipp Hahn from comment #1)
> > $ ip r
> > default via 172.16.1.1 dev eth0
> > 172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
> > 172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
If I do "ip addr flush eth0" inside the chroot, the default route is removed, but after that I can ping the gateway fine.
"ifup -v br0" fails as that address is already configured:
> /bin/ip addr add 10.200.17.6/184.108.40.206 broadcast 10.200.17.255 dev br0 label br0
> RTNETLINK answers: File exists
> /bin/ip route add default via 10.200.17.1 dev br0 onlink
> RTNETLINK answers: File exists
Test VM "pmhahn_bug37491" @ "utby" with snapshots available for further testing.
(In reply to Philipp Hahn from comment #6)
> (In reply to Stefan Gohmann from comment #5)
> > Does this issue still happen?
> Yes: I just tried UCS-4.3-2
> If I do "ip addr flush eth0" inside the chroot, the default route is
> removed, but after that I can ping the gateway fine.
Is that not the issue that was fixed with 4.3-2e305, bug 47767? I am asking because you used a 4.3-2 install medium, for which the issue was identified. IMHO we need to re-check this with 4.3-3
(In reply to Erik Damrose from comment #7)
> (In reply to Philipp Hahn from comment #6)
> > (In reply to Stefan Gohmann from comment #5)
> > > Does this issue still happen?
> > Yes: I just tried UCS-4.3-2
> Is that not the issue that was fixed with 4.3-2e305, bug 47767? I am asking
> because you used a 4.3-2 install medium, for which the issue was identified.
> IMHO we need to re-check this with 4.3-3
I already repeated the test with UCS-4.3-3 as well, and it still failed.
(I forgot to update the version number in comment #6)