Univention Bugzilla – Bug 37491
Installing KVM during system setup breaks external DNS resolution
Last modified: 2019-08-07 09:44:55 CEST
I just installed a fresh UCS 4.0-0 from DVD/ISO and specified '22.214.171.124' as nameserver in the installer (d-i). I then configured the system to become a DC Master. After the installation finished, I ended up with these DNS settings:
> dns/forwarder1: 126.96.36.199
> nameserver1: 10.200.30.22
> nameserver2: 188.8.131.52
I didn't expect that 184.108.40.206 would also be added as nameserver2, since there is little chance 220.127.116.11 will be able to resolve anything in my domain.
Task #6732 UCS Technical Training
I experienced this on all 6 setups:
- I installed UCS-4.2-1 on Wednesday
- I upgraded successfully to UCS-4.2-2 on Thursday morning using updates.software-univention.de, so DNS was working then
- later I tried to set up a Windows VM, which complained about a missing "Internet Connection". This was caused by the DC Master not doing external DNS resolution, as 172.16.0.1 was configured as UCRV "nameserver2", not as UCRV "dns/forwarder1".
- After running /usr/share/univention-server/univention-fix-ucr-dns manually, DNS was fixed by moving the external DNS server to the forwarder.
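The repair from the last step can be sketched roughly as follows. This is not the code of univention-fix-ucr-dns: the function name move_external_to_forwarders, the plain dict standing in for UCR, and the rule "every name server that is not a known domain DNS server becomes a forwarder" are assumptions made for illustration only.

```python
def move_external_to_forwarders(ucr, domain_dns_servers):
    """Return a copy of `ucr` with external name servers moved to dns/forwarderN.

    `ucr` is a plain dict standing in for UCR (hypothetical, not the UCR API).
    `domain_dns_servers` is the set of IPs assumed to serve the UCS domain.
    """
    slots = (1, 2, 3)
    nameservers = [ucr.get("nameserver%d" % i) for i in slots]
    forwarders = [ucr.get("dns/forwarder%d" % i) for i in slots]
    nameservers = [ns for ns in nameservers if ns]
    forwarders = [fw for fw in forwarders if fw]

    # Split the configured name servers into domain-internal and external ones.
    internal = [ns for ns in nameservers if ns in domain_dns_servers]
    external = [ns for ns in nameservers if ns not in domain_dns_servers]
    for ns in external:
        if ns not in forwarders:
            forwarders.append(ns)

    # Write both lists back; unused slots are cleared, surplus entries dropped.
    result = dict(ucr)
    for i in slots:
        result["nameserver%d" % i] = internal[i - 1] if i <= len(internal) else None
        result["dns/forwarder%d" % i] = forwarders[i - 1] if i <= len(forwarders) else None
    return result


broken = {"nameserver1": "172.16.1.50", "nameserver2": "172.16.0.1"}
fixed = move_external_to_forwarders(broken, {"172.16.1.50"})
print(fixed["nameserver1"], fixed["nameserver2"], fixed["dns/forwarder1"])
```

With the values from this report, the external server 172.16.0.1 ends up in "dns/forwarder1" and "nameserver2" is cleared.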
I was able to reproduce the bug after the training by setting up a new UCS-4.2-1 system:
- after a fresh install the external DNS server is configured as UCRV "nameserver2", not as UCRV "dns/forwarder1".
/var/log/univention/setup.log has this:
>=== 30_net/16forwarder (2017-09-15 18:11:56) ===
>__NAME__:30_net/16forwarder Setting external name servers
>Restarting bind9 Domain Name Server (DNS): Unknown DNS backend failed!
>run-parts: executing /usr/lib/univention-system-setup/scripts/30_net/18proxy --network-only --appliance-mode
>2017-09-15 18:16:05.564326527+02:00 (in joinscript_init)
>2017-09-15 18:16:06,090 INFO __main__.ucr/ns Found server 172.16.0.1 from UCRV nameserver1
>2017-09-15 18:16:36,106 WARNING __main__.val Connection check to 172.16.0.1 (Timeout) failed, maybe down?!
>2017-09-15 18:16:36,106 INFO __main__.val Leaving it configured as nameserver anyway
>2017-09-15 18:16:36,106 INFO __main__.xor Skip removing nameservers from forwarders
>2017-09-15 18:16:36,110 INFO __main__.ucr/self Default IP address configured in UCR: 172.16.1.50
>2017-09-15 18:16:36,110 INFO __main__.ns Skip adding NS
>2017-09-15 18:16:36,110 INFO __main__.ldap Skip adding master
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver1': '172.16.0.1' -> '172.16.1.50'
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver2': None -> '172.16.0.1'
>2017-09-15 18:16:36,333 INFO __main__.ucr Reloading BIND
>Restarting bind9 Domain Name Server (DNS): samba4 ldap proxy failed!
>invoke-rc.d: initscript bind9, action "restart" failed.
>Wait for bind9: .Restarting bind9 Domain Name Server (DNS): samba4 ldap proxy.
>Object modified: cn=default-settings,cn=dns,cn=dhcp,cn=policies,dc=schulung5-ucs,dc=intranet
>Object exists: cn=services,cn=univention,dc=schulung5-ucs,dc=intranet
>Object created: cn=DNS,cn=services,cn=univention,dc=schulung5-ucs,dc=intranet
>Object modified: cn=dc0,cn=dc,cn=computers,dc=schulung5-ucs,dc=intranet
>2017-09-15 18:16:51.729830310+02:00 (in joinscript_save_current_version)
>=== 90_postjoin/20upgrade (2017-09-15 18:17:21) ===
>__NAME__:90_postjoin/20upgrade Upgrading the system
>__MSG__:This might take a while depending on the number of pending updates.
>Running upgrade on DC Master: univention-upgrade --noninteractive --updateto 4.2-99
>Starting univention-upgrade. Current UCS version is 4.2-1 errata52
>Checking for local repository: none
>The connection to the repository server failed: Configuration error: host is unresolvable. Please check the repository configuration and the network connection.
>=== DONE (2017-09-15 18:17:29) ===
>=== done (2017-09-15 18:17:38) ===
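To spot the same pattern in other setup logs, the "Updating" lines can be extracted with a small helper. This is only a sketch: the regular expression is derived from the two "Updating" lines quoted above, and the names UPDATE_RE and ucr_updates are hypothetical.

```python
import re

# Matches lines such as: Updating 'nameserver1': '172.16.0.1' -> '172.16.1.50'
UPDATE_RE = re.compile(r"Updating '([^']+)': ('[^']*'|None) -> ('[^']*'|None)")


def _value(token):
    """Turn a logged token (a quoted string or None) into a Python value."""
    return None if token == "None" else token.strip("'")


def ucr_updates(log_text):
    """Return (variable, old, new) tuples for every logged UCR update."""
    return [(key, _value(old), _value(new))
            for key, old, new in UPDATE_RE.findall(log_text)]


sample = """\
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver1': '172.16.0.1' -> '172.16.1.50'
>2017-09-15 18:16:36,111 INFO __main__.ucr Updating 'nameserver2': None -> '172.16.0.1'
"""
for key, old, new in ucr_updates(sample):
    print(key, old, "->", new)
```

An external server showing up as the new value of "nameserver2" while no "dns/forwarder*" variable is touched is the symptom described in this bug.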
This only happens when "KVM" is selected during system setup: the network bridge is then configured while still in the chroot environment, which breaks networking:
> $ ip r
> default via 172.16.1.1 dev eth0
> 172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
> 172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
> $ ip a
> 2: eth0:
> inet 172.16.1.50/24 ...
> 3: br0:
> inet 172.16.1.50/24 ...
Pinging 172.16.0.1 no longer works.
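The broken state is visible in the routing table: the same directly connected prefix appears on both eth0 and br0. A minimal sketch for detecting this from "ip r" output follows; the function name duplicated_link_routes is hypothetical, and the parsing assumes the line format shown above.

```python
from collections import defaultdict


def duplicated_link_routes(ip_route_output):
    """Map each directly connected prefix to its devices, keeping only
    prefixes that are reachable through more than one device."""
    devices = defaultdict(list)
    for line in ip_route_output.splitlines():
        fields = line.split()
        # Connected routes look like: "<prefix> dev <ifname> ... scope link ..."
        if len(fields) >= 3 and fields[1] == "dev" and "link" in fields:
            devices[fields[0]].append(fields[2])
    return {prefix: devs for prefix, devs in devices.items() if len(devs) > 1}


sample = """\
default via 172.16.1.1 dev eth0
172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
"""
print(duplicated_link_routes(sample))
```

For the routing table quoted above this reports 172.16.1.0/24 on both eth0 and br0.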
Looking in /var/log/univention/config-registry.replog shows this:
~2017-09-17 09:45:49 interfaces/eth0/* is configured
~2017-09-17 10:07:30 ucs-kvm-setup-bridge transferred the settings from eth0 to br0
Bug #36085 comment 3 (ucs-4.0-0@55526) moved the code for unsetting "interfaces/restart/auto" earlier, so the code now gets executed while still in the chroot environment.
It works when I set interfaces/restart/auto=no manually on the text console as soon as USS is started.
Short-term we should prevent "ucs-kvm-setup-bridge" from updating the interface until the next reboot.
Long-term we should make "interfaces/restart/auto=no" the default.
This also explains why none of our tests detected this, as we don't test nested virtualization in EC2!
Task #9985 UCS Technical Training (again)
Task #10198 UCS Technical Training (again)
Task #10200 UCS Technical Training (again):
> dns/forwarder1: <empty>
> nameserver1: 172.16.1.10 <- UCS Master
> nameserver2: 172.16.0.1 <- External DNS server, should be dns/forwarder1
Philipp pointed out the underlying reason for this in Comment 1.
The tl;dr is:
Installing KVM during system setup breaks external DNS resolution.
Does this issue still happen?
(In reply to Stefan Gohmann from comment #5)
> Does this issue still happen?
Yes: I just tried UCS-4.3-2
(In reply to Philipp Hahn from comment #1)
> > $ ip r
> > default via 172.16.1.1 dev eth0
> > 172.16.1.0/24 dev eth0 proto kernel scope link src 172.16.1.50
> > 172.16.1.0/24 dev br0 proto kernel scope link src 172.16.1.50
If I do "ip addr flush eth0" inside the chroot, the default route is removed, but after that I can ping the gateway fine.
"ifup -v br0" fails as that address is already configured:
> /bin/ip addr add 10.200.17.6/18.104.22.168 broadcast 10.200.17.255 dev br0 label br0
> RTNETLINK answers: File exists
> /bin/ip route add default via 10.200.17.1 dev br0 onlink
> RTNETLINK answers: File exists
Test VM "pmhahn_bug37491" @ "utby" with snapshots available for further testing.
(In reply to Philipp Hahn from comment #6)
> (In reply to Stefan Gohmann from comment #5)
> > Does this issue still happen?
> Yes: I just tried UCS-4.3-2
> If I do "ip addr flush eth0" inside the chroot, the default route is
> removed, but after that I can ping the gateway fine.
Is that not the issue that was fixed with 4.3-2e305, bug 47767? I am asking because you used a 4.3-2 install medium, for which the issue was identified. IMHO we need to re-check this with 4.3-3
(In reply to Erik Damrose from comment #7)
> (In reply to Philipp Hahn from comment #6)
> > (In reply to Stefan Gohmann from comment #5)
> > > Does this issue still happen?
> > Yes: I just tried UCS-4.3-2
> Is that not the issue that was fixed with 4.3-2e305, bug 47767? I am asking
> because you used a 4.3-2 install medium, for which the issue was identified.
> IMHO we need to re-check this with 4.3-3
I already repeated the test with UCS-4.3-3 as well, and it still failed.
(I forgot to update the version number in comment #6)
Again at 4 customer environments: As soon as uvmm-node-kvm gets installed during the PXE installation, the network is broken afterwards. This especially breaks un-setting the "reinstall" option on the computer account at the end of the PXE installation.
Also there is no option to really disable the bridge creation:
- If ucrv:uvmm/kvm/bridge/autostart is set to 'yes', debian/univention-virtual-machine-manager-node-kvm.init creates 'eth0' as a bridge with the original 'eth0' being renamed to 'peth0'.
- If ucrv:uvmm/kvm/bridge/autostart is set to 'no' or 'manually', debian/univention-virtual-machine-manager-node-kvm.postinst creates the 'br0' bridge with 'eth0' enslaved through UCRVs.
- There is no option to disable both mechanisms, which is required at the customer site as they have several bridges to set up and the interface names are not stable (8-12 interfaces to different networks).
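The three points above can be summarized as a small decision function. It only models the behaviour as described in this comment and is not taken from the package scripts; the name bridge_mechanism and the returned strings are made up for illustration.

```python
def bridge_mechanism(autostart):
    """Return which mechanism sets up a bridge for a given value of
    uvmm/kvm/bridge/autostart, following the description in this comment."""
    if autostart == "yes":
        return "init script: eth0 becomes the bridge, original eth0 renamed to peth0"
    if autostart in ("no", "manually"):
        return "maintainer script: br0 is created via UCRVs with eth0 enslaved"
    # Behaviour for any other value is not described in this report.
    return "unknown"


for value in ("yes", "no", "manually"):
    print(value, "->", bridge_mechanism(value))
```

Every value mentioned here leads to some bridge being created, which is why diverting the script was the only way to suppress it.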
I had to divert the script to get the work done:
dpkg-divert --local --rename --divert /usr/lib/univention-virtual-machine-manager-node-kvm/ucs-kvm-setup-bridge.XXX --add /usr/lib/univention-virtual-machine-manager-node-kvm/ucs-kvm-setup-bridge
That was further complicated by the fact that loading the "bridge" Linux kernel module failed due to Bug #48123