Bug 56911 - EC2: /usr/share/univention-server/univention-fix-ucr-dns stalls
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: DNS
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 5.0-6-errata
Assigned To: Philipp Hahn
QA Contact: Florian Best
URL: https://git.knut.univention.de/univen...
Keywords:
Duplicates: 56750
Depends on:
Blocks:
Reported: 2023-12-13 00:33 CET by Philipp Hahn
Modified: 2024-03-07 13:07 CET

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 6: Setup Problem: Issue for the setup process
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.171
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Description Philipp Hahn univentionstaff 2023-12-13 00:33:01 CET
For some yet unknown reason, querying the "Amazon provided DNS" stalls:

# /usr/share/univention-server/univention-fix-ucr-dns -v -d
2023-12-13 00:00:56,016 DEBUG   __main__.cli/ns   Reading UCS domain servers from CLI...
2023-12-13 00:00:56,016 DEBUG   __main__.ucr/fwd  Reading external DNS forwarders from UCR...
2023-12-13 00:00:56,016 DEBUG   __main__.ucr/ns   Reading UCS domain servers from UCR...
2023-12-13 00:00:56,016 INFO    __main__.ucr/ns   Found server 10.210.241.22 from UCRV nameserver1
2023-12-13 00:00:56,016 INFO    __main__.ucr/ns   Found server 10.210.0.2 from UCRV nameserver2
2023-12-13 00:00:56,016 DEBUG   __main__.val      Validating UCS domain servers...
2023-12-13 00:00:56,017 DEBUG   __main__.dns/srv  Querying 10.210.241.22 for SRV _domaincontroller_master._tcp.qa506.intranet.
2023-12-13 00:00:56,019 DEBUG   __main__.dns/srv  header={'id': 26693, 'qr': 1, 'opcode': 0, 'aa': 1, 'tc': 0, 'rd': 0, 'ra': 1, 'z': 0, 'rcode': 0, 'qdcount': 1, 'ancount': 1, 'nscount': 1, 'arcount': 1, 'opcodestr': 'QUERY', 'status': 'NOERROR'}
2023-12-13 00:00:56,019 INFO    __main__.val      Validated UCS domain server: 10.210.241.22
2023-12-13 00:00:56,020 DEBUG   __main__.dns/srv  Querying 10.210.0.2 for SRV _domaincontroller_master._tcp.qa506.intranet.
>>>>>
2023-12-13 00:01:26,050 WARNING __main__.val      Connection check to 10.210.0.2 (Timeout) failed, maybe down?!
2023-12-13 00:01:26,051 INFO    __main__.val      Leaving it configured as nameserver anyway
2023-12-13 00:01:26,051 INFO    __main__.xor      Skip removing nameservers from forwarders
2023-12-13 00:01:26,051 INFO    __main__          No action required.

`dig` on the other hand is instantaneous:

# time dig +short @10.210.0.2 _domaincontroller_master._tcp.qa506.intranet. srv
real    0m0.013s
user    0m0.009s
sys     0m0.003s

`strace` yields:

# strace -s 128 python3 -c 'import DNS;DNS.DnsRequest("_domaincontroller_master._tcp.qa506.intranet.", qtype="SRV", server=["10.210.0.2"], aa=1, rd=0, protocol="udp").req()'

socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC, IPPROTO_IP) = 3
clock_gettime(CLOCK_REALTIME, {tv_sec=1702423462, tv_nsec=367591919}) = 0
getrandom("\xd4\x55", 2, 0)             = 2
bind(3, {sa_family=AF_INET, sin_port=htons(55381), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("10.210.0.2")}, 16) = 0
sendto(3, "G\324\0\0\0\1\0\0\0\0\0\0\30_domaincontroller_master\4_tcp\5qa506\10intranet\0\0!\0\1", 62, 0, NULL, 0) = 62
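
For reference, the datagram in the `sendto()` call above can be rebuilt by hand, which makes the header easier to read. This is only a stdlib sketch, not py3dns code; `build_query` and its parameters are names made up for this illustration. With the `RD` flag cleared and the ID taken from the trace it yields the same 62 bytes, so the flags word on the wire was 0x0000.

```python
import struct


def build_query(name: str, qtype: int, query_id: int, rd: bool) -> bytes:
    """Build a minimal DNS query (one question, class IN, no EDNS)."""
    # Flags word: only the RD bit (0x0100) is ever set in this sketch.
    flags = 0x0100 if rd else 0x0000
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # QNAME: length-prefixed labels, terminated by the empty root label
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(label)]) + label.encode("ascii") for label in labels) + b"\x00"
    # QTYPE, then QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", qtype, 1)


SRV = 33
NAME = "_domaincontroller_master._tcp.qa506.intranet."
# 0x47D4 is the ID seen in the strace output ("G\324")
packet = build_query(NAME, SRV, 0x47D4, rd=False)
print(len(packet), packet[2:4].hex())
```

Passing `rd=True` changes nothing but the flags word, which becomes 0x0100.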
Comment 1 Philipp Hahn univentionstaff 2023-12-13 02:08:00 CET
# dpkg -l python3-dnslib python3-dnspython python3-dnsq python3-dns     
dpkg-query: no packages found matching python3-dnslib
dpkg-query: no packages found matching python3-dnsq
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name              Version          Architecture Description
+++-=================-================-============-=================================
ii  python3-dns       3.2.0-2          all          DNS client module for Python 3
ii  python3-dnspython 1.16.0-1+deb10u1 all          DNS toolkit for Python 3

# python3
>>> import dns.resolver
>>> resolver = dns.resolver.Resolver(configure=False)
>>> resolver.nameservers = ["10.210.0.2"]
>>> answer = resolver.query("_domaincontroller_master._tcp.qa506.intranet.", "SRV")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/dns/resolver.py", line 1002, in query
    raise NXDOMAIN(qnames=qnames_to_try, responses=nxdomain_responses)
dns.resolver.NXDOMAIN: The DNS query name does not exist: _domaincontroller_master._tcp.qa506.intranet.
>>> resolver.nameservers = ["10.210.45.46"]
>>> answer = resolver.query("_domaincontroller_master._tcp.qa506.intranet.", "SRV")
>>> for rr in answer: print(rr)
0 0 0 ucs-8524.qa506.intranet.
Comment 2 Philipp Hahn univentionstaff 2023-12-14 16:00:20 CET
# apt install tcpdump tshark

# tcpdump -i eth0 -w /tmp/dns2.pcap 'udp port 53' &
# python3 -c 'import DNS;DNS.DnsRequest("_domaincontroller_master._tcp.qa506.intranet.", qtype="SRV", server=["10.210.0.2"], aa=1, rd=0, timeout=3).req()'
# tshark -r /tmp/dns2.pcap 
    1   0.000000 10.210.105.81 → 10.210.0.2   DNS 104 Standard query 0x25bb SRV _domaincontroller_master._tcp.qa506.intranet

# tcpdump -i eth0 -w /tmp/dns1.pcap 'udp port 53' &
# dig @10.210.0.2 _domaincontroller_master._tcp.qa506.intranet. srv
# tshark -r /tmp/dns1.pcap 
    1   0.000000 10.210.105.81 → 10.210.0.2   DNS 127 Standard query 0x9e5a SRV _domaincontroller_master._tcp.qa506.intranet OPT
    2   0.001184   10.210.0.2 → 10.210.105.81 DNS 190 Standard query response 0x9e5a No such name SRV _domaincontroller_master._tcp.qa506.intranet SOA a.root-servers.net OPT

# tshark -V -O dns -c 1 -r /tmp/dns1.pcap 
Frame 1: 127 bytes on wire (1016 bits), 127 bytes captured (1016 bits)
Ethernet II, Src: 02:db:cb:a3:89:2b (02:db:cb:a3:89:2b), Dst: 0a:c6:a9:c0:00:01 (0a:c6:a9:c0:00:01)
Internet Protocol Version 4, Src: 10.210.105.81, Dst: 10.210.0.2
User Datagram Protocol, Src Port: 36125, Dst Port: 53
Domain Name System (query)
    Transaction ID: 0x9e5a
    Flags: 0x0120 Standard query
        0... .... .... .... = Response: Message is a query
        .000 0... .... .... = Opcode: Standard query (0)
        .... ..0. .... .... = Truncated: Message is not truncated
        .... ...1 .... .... = Recursion desired: Do query recursively
        .... .... .0.. .... = Z: reserved (0)
        .... .... ..1. .... = AD bit: Set
        .... .... ...0 .... = Non-authenticated data: Unacceptable
    Questions: 1
    Answer RRs: 0
    Authority RRs: 0
    Additional RRs: 1
    Queries
        _domaincontroller_master._tcp.qa506.intranet: type SRV, class IN
            Name: _domaincontroller_master._tcp.qa506.intranet
            [Name Length: 44]
            [Label Count: 4]
            Type: SRV (Server Selection) (33)
            Class: IN (0x0001)
    Additional records
        <Root>: type OPT
            Name: <Root>
            Type: OPT (41)
            UDP payload size: 4096
            Higher bits in extended RCODE: 0x00
            EDNS0 version: 0
            Z: 0x0000
                0... .... .... .... = DO bit: Cannot handle DNSSEC security RRs
                .000 0000 0000 0000 = Reserved: 0x0000
            Data length: 12
            Option: COOKIE
                Option Code: COOKIE (10)
                Option Length: 8
                Option Data: 94c74ec9127356e3
                Client Cookie: 94c74ec9127356e3
                Server Cookie: <MISSING>
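
To compare headers without reading the bit diagram, the flags word can be decoded with a few shifts. This is a small stdlib sketch following the standard DNS header layout (RFC 1035, with the AD/CD bits added later); `decode_flags` is a hypothetical helper, not part of tshark or any DNS library. For the `dig` query above (0x0120) it reports `rd` and `ad` as set, whereas a request sent with `rd=0` and no other flags decodes to all zeros.

```python
def decode_flags(flags: int) -> dict:
    """Split the 16-bit DNS header flags word into its fields."""
    return {
        "qr": (flags >> 15) & 1,        # 0 = query, 1 = response
        "opcode": (flags >> 11) & 0xF,  # 0 = standard query
        "aa": (flags >> 10) & 1,        # authoritative answer
        "tc": (flags >> 9) & 1,         # truncated
        "rd": (flags >> 8) & 1,         # recursion desired
        "ra": (flags >> 7) & 1,         # recursion available
        "z": (flags >> 6) & 1,          # reserved
        "ad": (flags >> 5) & 1,         # authentic data
        "cd": (flags >> 4) & 1,         # checking disabled
        "rcode": flags & 0xF,           # response code
    }


dig_query = decode_flags(0x0120)    # flags from the tshark dump of the dig query
norecurse = decode_flags(0x0000)    # flags word with RD cleared
print(dig_query["rd"], norecurse["rd"])
```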

# git grep python3-dns
base/univention-errata-level/maintained-packages.txt:python3-dns
base/univention-python/debian/control: python3-dns,
services/univention-pkgdb/debian/control: python3-dns,
# gpi DNS
base/univention-server/univention-fix-ucr-dns:from DNS import DnsRequest, SocketError, TimeoutError
services/univention-pkgdb/pyshared/univention/pkgdb.py:import 


I have a branch which replaces `py3dns` with `dnspython`, but that was not the issue.

The real problem seems to be related to `RD` (recursion desired): 

# dig +norecurse @10.210.0.2 _domaincontroller_master._tcp.qa506.intranet. SRV
;; connection timed out; no servers could be reached


https://docs.aws.amazon.com/vpc/latest/userguide/vpc-dns.html
> The Amazon Route 53 Resolver only supports recursive DNS queries.
Comment 4 Philipp Hahn univentionstaff 2023-12-14 17:53:09 CET
All UCS 5.0-x AWS-EC2 images are probably affected. The symptom is that even after provisioning, the configuration is

nameserver1: $localhost
nameserver2: 10.210.0.3

but it should be

nameserver1: $localhost
dns/forwarder1: 10.210.0.3

This leads to https://updates.software-univention.de/ not being resolvable.

I have also seen the VM lose its network connection after provisioning.
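
The intended reclassification can be sketched roughly as follows. This is a simplified model and not the code of `univention-fix-ucr-dns`: `reclassify`, the `probe` callback and its three outcomes ("ucs", "not_ucs", "timeout") are assumptions made for illustration, and `127.0.0.1` stands in for `$localhost`. It only shows why a probe that times out leaves the Amazon resolver configured as a nameserver, while a definite negative answer would move it to the forwarders.

```python
from typing import Callable, Dict, List

NS_KEYS = ["nameserver1", "nameserver2", "nameserver3"]
FWD_KEYS = ["dns/forwarder1", "dns/forwarder2", "dns/forwarder3"]


def reclassify(ucr: Dict[str, str], probe: Callable[[str], str]) -> Dict[str, str]:
    """Return a copy of `ucr` with non-UCS name servers moved to the forwarders.

    `probe(addr)` is a hypothetical check returning "ucs", "not_ucs" or "timeout".
    """
    nameservers: List[str] = []
    forwarders: List[str] = [ucr[key] for key in FWD_KEYS if ucr.get(key)]
    for key in NS_KEYS:
        addr = ucr.get(key)
        if not addr:
            continue
        if probe(addr) == "not_ucs":
            # Definitely not a UCS domain server: use it as a forwarder instead
            if addr not in forwarders:
                forwarders.append(addr)
        else:
            # "ucs" or "timeout": keep it as a nameserver (as in the log above)
            nameservers.append(addr)
    result = {k: v for k, v in ucr.items() if k not in NS_KEYS + FWD_KEYS}
    # zip() silently drops entries beyond the three slots of each kind
    result.update(zip(NS_KEYS, nameservers))
    result.update(zip(FWD_KEYS, forwarders))
    return result


ucr = {"nameserver1": "127.0.0.1", "nameserver2": "10.210.0.3"}
stalled = reclassify(ucr, lambda a: "ucs" if a == "127.0.0.1" else "timeout")
fixed = reclassify(ucr, lambda a: "ucs" if a == "127.0.0.1" else "not_ucs")
print(stalled)
print(fixed)
```

In this model the first call reproduces the observed state (nothing changes), and the second one yields the expected `nameserver1` plus `dns/forwarder1` layout.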
Comment 5 Philipp Hahn univentionstaff 2023-12-15 11:04:26 CET
(In reply to Philipp Hahn from comment #4)
> All UCS 5.0-x AWS-EC2 Images are probably affected: the symptom is that even
> after provisioning
> 
> nameserver1: $localhost
> nameserver2: 10.210.0.3
> 
> but it should be
> 
> nameserver1: $localhost
> dns/forwarder1: 10.210.0.3
> 
> This leads to https://updates.software-univention.de/ not being resolvable.

Actually, that is not entirely true: without a forwarder, BIND9 will recurse itself, i.e. do all the lookups from the root down to the final DNS name.
- For PUBLIC names that still works, but it creates more network traffic and more load on the involved DNS servers because less is cached.
- For INTERNAL names (for example .knut.univention.de) it might break, as in many cases there are no "delegation NS records" to follow.

This breaks completely with `ucr set dns/fakeroot=yes`, as BIND9 then no longer does the recursion itself and requires an external DNS server to do the recursion for it.

Domains where DNS traffic is blocked or restricted to dedicated resolvers are also affected.
Comment 6 Philipp Hahn univentionstaff 2024-03-04 20:05:47 CET
*** Bug 56750 has been marked as a duplicate of this bug. ***
Comment 7 Philipp Hahn univentionstaff 2024-03-04 22:10:17 CET
[5.0-6] 6ff656cdbf doc(UCR): PEP-484 type annotations
 base/univention-config-registry/python/univention/config_registry/backend.py    |  4 ++--
 base/univention-config-registry/python/univention/config_registry/frontend.py   |  4 ++--
 base/univention-config-registry/python/univention/config_registry/interfaces.py |  9 +++++----
 base/univention-config-registry/python/univention/config_registry_info.py       |  6 +++---
 base/univention-config-registry/python/univention/info_tools.py                 | 10 +++++-----
 base/univention-config-registry/python/univention/service_info.py               |  8 ++++----
 6 files changed, 21 insertions(+), 20 deletions(-)

[5.0-6] 434b5d6a2d fix(UCR): Interfaces(ReadOnlyConfigRegistry())
 base/univention-config-registry/debian/changelog                                |  3 ++-
 base/univention-config-registry/python/univention/config_registry/interfaces.py |  4 ++--
 doc/errata/staging/univention-config-registry.yaml                              | 10 ++++++++++
 3 files changed, 14 insertions(+), 3 deletions(-)

[5.0-6] c851d3ca30 fix(server): Switch from py3dns to dnspython
 base/univention-server/univention-fix-ucr-dns | 98 +++++++++++++++++++++++------------------------
 1 file changed, 48 insertions(+), 50 deletions(-)

[5.0-6] 2744e52bb0 refactor(server): Modernize Python 3 code
 base/univention-server/univention-fix-ucr-dns | 32 +++++++++++++++-----------------
 1 file changed, 15 insertions(+), 17 deletions(-)

[5.0-6] cdd3e47387 refactor(server): Improve IP address handling
 base/univention-server/univention-fix-ucr-dns | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

[5.0-6] 44d2cc67a0 style(server): Rename `query_master_sr{c -> v}_record()`
 base/univention-server/univention-fix-ucr-dns | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

[5.0-6] 0c36b33487 fix(server): Remove duplicate timestamp
 base/univention-server/debian/changelog       |  6 ++++++
 base/univention-server/debian/control         |  2 +-
 base/univention-server/univention-fix-ucr-dns |  2 +-
 doc/errata/staging/univention-server.yaml     | 18 ++++++++++++++++++
 4 files changed, 26 insertions(+), 2 deletions(-)

[5.0-6] 86479483d4 fix(pkgdb): Switch from py3dns to dnspython
 doc/errata/staging/univention-pkgdb.yaml               | 11 +++++++++++
 services/univention-pkgdb/debian/changelog             |  5 +++--
 services/univention-pkgdb/debian/control               |  2 +-
 services/univention-pkgdb/pyshared/univention/pkgdb.py | 25 ++++++++++++++-----------
 4 files changed, 29 insertions(+), 14 deletions(-)

[5.0-6] 2fb45d0379 fix(python): Remove unused py3dns
 base/univention-python/debian/changelog   | 6 ++++++
 base/univention-python/debian/control     | 2 --
 doc/errata/staging/univention-python.yaml | 5 +++--
 3 files changed, 9 insertions(+), 4 deletions(-)

[5.0-6] 6ad30215de fix(USS): Adjust code to python3-dnspython 2.3.0
 base/univention-system-setup/debian/changelog         |  6 ++++++
 base/univention-system-setup/umc/python/setup/util.py |  8 ++++----
 doc/errata/staging/univention-system-setup.yaml       | 10 ++++++++++
 3 files changed, 20 insertions(+), 4 deletions(-)

Package: univention-config-registry
Version: 15.0.11-1
Branch: ucs_5.0-0
Scope: errata5.0-6

Package: univention-pkgdb
Version: 13.0.5-1
Branch: ucs_5.0-0
Scope: errata5.0-6

Package: univention-python
Version: 13.0.5-4
Branch: ucs_5.0-0
Scope: errata5.0-6

Package: univention-server
Version: 15.0.8-2
Branch: ucs_5.0-0
Scope: errata5.0-6

Package: univention-system-setup
Version: 13.0.10-5
Branch: ucs_5.0-0
Scope: errata5.0-6
Comment 8 Florian Best univentionstaff 2024-03-05 12:38:08 CET
OK: changed DNS python library
OK: advisory