Bug 49493 - bind9: Make "max-socks" configurable via UCR and increase the default limits
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: DNS
Version: UCS 4.4
Hardware: Other other
Importance: P5 normal
Target Milestone: UCS 4.4-1-errata
Assigned To: Daniel Tröder
QA Contact: Felix Botner
URL:
Depends on:
Blocks:
Reported: 2019-05-16 15:02 CEST by Michael Grandjean
Modified: 2019-09-04 15:48 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.286
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2019071821000817
Bug group (optional):
Max CVSS v3 score:


Comment 1 Michael Grandjean univentionstaff 2019-05-16 15:05:20 CEST
Comment #0 was set to private because it contains customer information. This is the redacted version:

During a customer workshop we observed the following behaviour (UCS 4.4-0, UCS@school edu school slave):

After some time, bind9 would stop operating:

> root@dcschule01:~# service bind9 status
> ● bind9.service - BIND Domain Name Server with samba4 backend
>   Loaded: loaded (/lib/systemd/system/bind9.service; enabled; vendor preset: enabled)
>   Drop-In: /etc/systemd/system/bind9.service.d
>            └─10-configure-backend.conf
>    Active: active (running) since Sat 2019-02-23 09:50:52 CET; 1 weeks 4 days ago
>      Docs: man:named(8)
>   Process: 1271 ExecStartPost=/usr/lib/univention-bind/samba4 wait-for-startup (code=exited, status=0/SUCCESS)
>   Process: 1267 ExecStartPre=/bin/systemctl stop univention-bind-ldap.service (code=exited, status=0/SUCCESS)
>  Main PID: 1270 (named)
>     Tasks: 7 (limit: 4915)
>    Memory: 58.1M
>       CPU: 21min 41.726s
>    CGroup: /system.slice/bind9.service
>            └─1270 /usr/sbin/named -c /etc/bind/named.conf.samba4 -f -d 0
> 
> Mär 06 11:42:02 dcschule01 named[1270]: samba_dlz: cancelling transaction on zone schulen-cb.local
> Mär 06 11:42:02 dcschule01 named[1270]: accept: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)
> Mär 06 11:42:03 dcschule01 named[1270]: socket: file descriptor exceeds limit (5739/4096)

The corresponding excerpt from the manpage:

<man:named8>

[...]
        -S #max-socks
            Allow named to use up to #max-socks sockets. The default value is 4096 on systems built with default configuration options, and 21000 on systems built with "configure --with-tuning=large".

            Warning: This option should be unnecessary for the vast majority of users. The use of this option could even be harmful because the specified value may exceed the limitation of the underlying system API. It is therefore set only when the default configuration causes exhaustion of file descriptors and the operational environment is known to support the specified number of sockets. Note also that the actual maximum number is normally a little fewer than the specified value because named reserves some file descriptors for its internal use.
[...]
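
The log lines quoted above report the overrun as "(used/limit)". To see how far past the limit named got over time, those pairs can be pulled out of the log. This is only a minimal sketch: the function name parse_fd_limit is made up for illustration and is not part of bind9 or UCS, and the message format is assumed to match the lines quoted in this comment.

```shell
#!/bin/sh
# Hypothetical helper (not part of bind9 or UCS): read named log lines on
# stdin and print "used limit" for every message of the form
#   "... file descriptor exceeds limit (used/limit)"
# Lines that do not match are silently skipped.
parse_fd_limit () {
    sed -n 's|.*file descriptor exceeds limit (\([0-9][0-9]*\)/\([0-9][0-9]*\)).*|\1 \2|p'
}
```

Possible usage, assuming named logs to /var/log/syslog: `grep 'named\[' /var/log/syslog | parse_fd_limit | sort -u`.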
Comment 2 Stefan Gohmann univentionstaff 2019-07-19 19:46:05 CEST
Happened again, the system stopped working due to named problems:

named[6745]: socket: file descriptor exceeds limit (4097/4096)

I am raising the "feel" rating, since it is critical when bind stops working on a DC.
Comment 3 Daniel Tröder univentionstaff 2019-09-03 09:37:51 CEST
The three init scripts have been adapted to append "-S #max-socks" when starting named if the new UCR variable "dns/max-socks" is set.

The default is "unset", in which case nothing changes.

Using "LimitNOFILE=..." in the systemd unit is not necessary, because the named process raises the limit on its own.

Using a value below "30" for dns/max-socks prevents named from starting.
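
For illustration, the behaviour described in this comment could look roughly like the following in an init script. This is a sketch under assumptions, not the code that was committed: the function name named_max_socks_opt is hypothetical, and both the warning for values below 30 and the silent skipping of non-numeric values are choices made for this example only.

```shell
#!/bin/sh
# Hypothetical helper (not the shipped univention-bind code): print the extra
# named option for a given dns/max-socks value. Prints nothing if the value
# is empty (UCR variable unset) or not a plain number.
named_max_socks_opt () {
    max_socks="$1"
    case "$max_socks" in
        ''|*[!0-9]*) return 0 ;;  # unset or not numeric: leave named defaults alone
    esac
    if [ "$max_socks" -lt 30 ]; then
        # Values below 30 were observed to keep named from starting.
        echo "warning: dns/max-socks=$max_socks is below 30" >&2
    fi
    printf '%s\n' "-S $max_socks"
}

# Intended use in an init script (assumption):
#   opts="$(named_max_socks_opt "$(ucr get dns/max-socks)")"
#   /usr/sbin/named -c /etc/bind/named.conf.samba4 -f -d 0 $opts
```

With the variable unset the helper prints nothing, so the named command line stays exactly as before.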

$ ucr set dns/max-socks=65000
→ Sep  3 09:17:40 m66 named[20909]: using up to 65000 sockets
→ Sep  3 09:17:42 m66 named[20937]: using up to 65000 sockets

$ ucr unset dns/max-socks
→ Sep  3 09:29:31 m66 named[22488]: using up to 4096 sockets
→ Sep  3 09:29:32 m66 named[22517]: using up to 4096 sockets
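
To confirm which limit is in effect after a restart, the value can be read back from the "using up to N sockets" lines shown above. This is a minimal sketch: the function name effective_max_socks is hypothetical, and the log format is assumed to match these examples.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print the socket limit
# from the most recent "named[PID]: using up to N sockets" message.
effective_max_socks () {
    sed -n 's|.*named\[[0-9]*]: using up to \([0-9][0-9]*\) sockets.*|\1|p' | tail -n 1
}
```

Possible usage: `effective_max_socks < /var/log/syslog`.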


[4.4-1] 4544400f79 Bug #49493: reorder code to reduce diff between init scripts
[4.4-1] c5348f17d9 Bug #49493: allow setting the maximum number of open sockets via UCR
[4.4-1] 46efeb32d0 Bug #49493: advisory update

univention-bind (13.0.1-7)
Comment 4 Daniel Tröder univentionstaff 2019-09-03 10:09:40 CEST
SDB article (made invisible for now): https://help.univention.com/t/bind9-stops-to-operate-socket-file-descriptor-exceeds-limit/12905
Comment 5 Felix Botner univentionstaff 2019-09-03 10:20:58 CEST
OK - univention-bind dns/max-socks
OK - yaml

SDB

  There are always two named processes running in UCS. When restarted, they write 
  their new configuration to /var/log/syslog, including a line using up to _____ 
  sockets.

Remove that sentence; on a Samba DC there is only one named process, and that is too complicated to explain ...
Comment 6 Daniel Tröder univentionstaff 2019-09-03 10:34:25 CEST
(In reply to Felix Botner from comment #5)
> SDB
>
>   There are always two named processes running in UCS. When restarted, they write
>   their new configuration to /var/log/syslog, including a line using up to _____
>   sockets.
>
> remove that sentence, in case of a samba DC there is only one named process, too
> complicated to explain ...
Done.
Comment 7 Felix Botner univentionstaff 2019-09-03 10:44:27 CEST
OK
Comment 8 Arvid Requate univentionstaff 2019-09-04 15:48:13 CEST
<http://errata.software-univention.de/ucs/4.4/249.html>