+++ This bug was initially created as a clone of Bug #56914 +++

With Bug #56914 we added some logic to the samba init.d script, so that hanging processes are terminated during the stop of the server.

In our tests we saw the following:

The join of a replica server failed because the joinscript RUNNING 98univention-samba4-dns.inst was waiting for the service account dns-slave to become visible in its local Samba. This never happened, because the samba service was dead.

We can see in the logs that the service was restarted, the code from Bug #56914 was run but still, the service couldn't start because of NT_STATUS_ADDRESS_ALREADY_ASSOCIATED.

journalctl.log
May 22 02:56:51 slave samba-ad-dc[7716]: Stopping Samba AD DC server: samba
May 22 02:56:51 slave samba-ad-dc[7716]: Samba did not terminate in time. Killing remaining processes.
May 22 02:56:51 slave samba-ad-dc[7716]: S PID PGRP TIME COMMAND.
May 22 02:56:51 slave samba-ad-dc[7716]: S 3376 3376 00:00:00 /usr/sbin/winbindd -D.
May 22 02:56:51 slave samba-ad-dc[7716]: S 3384 3376 00:00:00 winbindd: domain child [SLAVE].
May 22 02:56:51 slave samba-ad-dc[7716]: S 3627 3376 00:00:00 winbindd: idmap child.
May 22 02:56:51 slave SAMBA[7857]: ERROR: Stuck process after service stop:
May 22 02:56:51 slave SAMBA[7857]: S PID PGRP TIME COMMAND
May 22 02:56:51 slave SAMBA[7857]: S 3376 3376 00:00:00 /usr/sbin/winbindd -D
May 22 02:56:51 slave SAMBA[7857]: S 3384 3376 00:00:00 winbindd: domain child [SLAVE]
May 22 02:56:51 slave SAMBA[7857]: S 3627 3376 00:00:00 winbindd: idmap child
May 22 02:56:51 slave SAMBA[7857]: PIDFILE: 7551
May 22 02:56:51 slave systemd[1]: Stopping LSB: Samba NetBIOS nameserver (nmbd)...
May 22 02:56:52 slave nmbd[7871]: Stopping NetBIOS name server: nmbd.
May 22 02:56:52 slave systemd[1]: nmbd.service: Succeeded.
May 22 02:56:52 slave samba-ad-dc[7716]: Stopping nmbd (via systemctl): nmbd.service.
May 22 02:56:52 slave systemd[1]: Stopped LSB: Samba NetBIOS nameserver (nmbd).
May 22 02:56:52 slave samba-ad-dc[7716]: .
May 22 02:56:52 slave systemd[1]: samba-ad-dc.service: Succeeded.
May 22 02:56:52 slave systemd[1]: Stopped LSB: Samba daemons for the AD DC.
May 22 02:56:53 slave systemd[1]: Starting LSB: Samba NetBIOS nameserver (nmbd)...
May 22 02:56:53 slave nmbd[7904]: Starting NetBIOS name server: nmbd.
May 22 02:56:53 slave systemd[1]: Started LSB: Samba NetBIOS nameserver (nmbd).
May 22 02:56:53 slave systemd[1]: Starting LSB: Samba daemons for the AD DC...
May 22 02:56:53 slave samba-ad-dc[7926]: Starting nmbd (via systemctl): nmbd.service.
May 22 02:56:54 slave samba-ad-dc[7926]: Starting Samba AD DC server: samba.
May 22 02:56:54 slave systemd[1]: Started LSB: Samba daemons for the AD DC.

log.samba:
[2024/05/22 02:56:55.227036, 0, pid=7962] ../../source4/samba/service_stream.c:373(stream_setup_socket)
  stream_setup_socket: Failed to listen on 127.0.0.1:49153 - NT_STATUS_ADDRESS_ALREADY_ASSOCIATED
[2024/05/22 02:56:55.227130, 0, pid=7962] ../../source4/rpc_server/dcerpc_server.c:513(add_socket_rpc_tcp_iface)
  service_setup_stream_socket(address=127.0.0.1,port=49153) for dnsserver backupkey eventlog6 browser unixinfo dssetup drsuapi lsarpc mgmt failed - NT_STATUS_ADDRESS_ALREADY_ASSOCIATED
[2024/05/22 02:56:55.227158, 0, pid=7962] ../../source4/samba/service_task.c:36(task_server_terminate)
  task_server_terminate: task_server_terminate: [dcerpc: Failed to initialise end points]
Happened again in the tests:

journald.log
May 23 00:16:14 master091 samba-ad-dc[20366]: Stopping Samba AD DC server: samba
May 23 00:16:14 master091 samba-ad-dc[20366]: Samba did not terminate in time. Killing remaining processes.
May 23 00:16:14 master091 samba-ad-dc[20366]: S PID PGRP TIME COMMAND.
May 23 00:16:14 master091 samba-ad-dc[20366]: S 5699 5699 00:00:00 /usr/sbin/winbindd -D.
May 23 00:16:14 master091 samba-ad-dc[20366]: S 5700 5699 00:00:00 winbindd: domain child [MASTER091].
May 23 00:16:14 master091 samba-ad-dc[20366]: S 5941 5699 00:00:00 winbindd: idmap child.
May 23 00:16:14 master091 SAMBA[20605]: ERROR: Stuck process after service stop:
May 23 00:16:14 master091 SAMBA[20605]: S PID PGRP TIME COMMAND
May 23 00:16:14 master091 SAMBA[20605]: S 5699 5699 00:00:00 /usr/sbin/winbindd -D
May 23 00:16:14 master091 SAMBA[20605]: S 5700 5699 00:00:00 winbindd: domain child [MASTER091]
May 23 00:16:14 master091 SAMBA[20605]: S 5941 5699 00:00:00 winbindd: idmap child
May 23 00:16:14 master091 SAMBA[20605]: PIDFILE: 20125
May 23 00:16:14 master091 systemd[1]: Stopping LSB: Samba NetBIOS nameserver (nmbd)...
May 23 00:16:14 master091 nmbd[20619]: Stopping NetBIOS name server: nmbd.
May 23 00:16:14 master091 systemd[1]: nmbd.service: Succeeded.
May 23 00:16:14 master091 samba-ad-dc[20366]: Stopping nmbd (via systemctl): nmbd.service.
May 23 00:16:14 master091 systemd[1]: Stopped LSB: Samba NetBIOS nameserver (nmbd).
May 23 00:16:14 master091 samba-ad-dc[20366]: .
May 23 00:16:14 master091 systemd[1]: samba-ad-dc.service: Succeeded.
May 23 00:16:14 master091 systemd[1]: Stopped LSB: Samba daemons for the AD DC.
May 23 00:16:15 master091 systemd[1]: Starting LSB: Samba NetBIOS nameserver (nmbd)...
May 23 00:16:16 master091 nmbd[20652]: Starting NetBIOS name server: nmbd.
May 23 00:16:16 master091 systemd[1]: Started LSB: Samba NetBIOS nameserver (nmbd).
May 23 00:16:16 master091 systemd[1]: Starting LSB: Samba daemons for the AD DC...
May 23 00:16:16 master091 samba-ad-dc[20676]: Starting nmbd (via systemctl): nmbd.service.
May 23 00:16:18 master091 samba-ad-dc[20676]: Starting Samba AD DC server: samba.
May 23 00:16:18 master091 systemd[1]: Started LSB: Samba daemons for the AD DC.

log.samba
[2024/05/23 00:16:18.189325, 0, pid=20695] ../../source4/samba/server.c:623(binary_smbd_main)
  samba version 4.18.3-Univention started.
  Copyright Andrew Tridgell and the Samba Team 1992-2023
[2024/05/23 00:16:18.788086, 0, pid=20696] ../../source4/samba/server.c:896(binary_smbd_main)
  binary_smbd_main: samba: using 'prefork' process model
[2024/05/23 00:16:25.532500, 0, pid=20709] ../../source4/samba/service_stream.c:373(stream_setup_socket)
  stream_setup_socket: Failed to listen on ::1:389 - NT_STATUS_ADDRESS_ALREADY_ASSOCIATED
[2024/05/23 00:16:25.532604, 0, pid=20709] ../../source4/ldap_server/ldap_server.c:1186(add_socket)
  add_socket: ldapsrv failed to bind to ::1:389 - NT_STATUS_ADDRESS_ALREADY_ASSOCIATED
[2024/05/23 00:16:25.532620, 0, pid=20709] ../../source4/samba/service_task.c:36(task_server_terminate)
  task_server_terminate: task_server_terminate: [Failed to startup ldap server task]
[2024/05/23 00:16:25.539240, 0, pid=20696] ../../source4/samba/server.c:392(samba_terminate)
  samba_terminate: samba_terminate of samba 20696: Failed to startup ldap server task
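
For anyone trying to catch this in a test run: both failures above are bind errors (127.0.0.1:49153 and ::1:389), so it may help to check whether something is still listening on the port right before samba is started again. The following is only a sketch of such a check, not part of any patch here. The helper name listening_on is made up for illustration, and it assumes the output format of iproute2's `ss -H -l -t -n`, where the fourth column is the local address and port.

```shell
#!/bin/sh
# listening_on PORT (hypothetical helper): reads `ss -H -l -t -n` style lines
# on stdin and succeeds if any local address (4th column) uses PORT.
listening_on() {
	awk -v port="$1" '
		{
			# The port is the part after the last ":", which covers
			# both 127.0.0.1:389 and [::1]:389.
			n = split($4, parts, ":")
			if (parts[n] == port) found = 1
		}
		END { exit(found ? 0 : 1) }
	'
}

# Intended use on a live system (not executed here):
#   ss -H -l -t -n | listening_on 389 && echo "port 389 is still bound"
```

Reading from stdin keeps the helper independent of the live system, so it can also be fed a saved `ss` dump from a failed test machine.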
I am looking into this and, although I have not yet been able to reproduce the NT_STATUS_ADDRESS_ALREADY_ASSOCIATED error, I would like to share my preliminary findings with you.

The solution to Bug 56914 added the assumption to the stop-branch of /etc/init.d/samba-ad-dc that the processes /usr/sbin/smbd and /usr/sbin/winbindd are always part of the main samba process tree. This assumption is the result of removing the following lines from the stop-branch of /etc/init.d/samba-ad-dc (for details see the description of Bug 56914):

-	## check for smbd and winbindd as well, in case ADDS has just been configured
-	for service in smbd winbindd; do
-		pid=$(pgrep -x "$service")
-		if [ -n "$pid" ]; then
-			start-stop-daemon --stop --quiet --oknodo \
-				--name "$service" --retry "TERM/3/KILL/1" -v \
-				| sed -rn 's/.*, (retry #|refused to die)/\1/p' \
-				| while read line; do log_action_cont_msg "$line"; done
-		fi

This assumption is generally true after successful provisioning of Samba (AD DC), as /usr/sbin/smbd and /usr/sbin/winbindd are then indeed forked by the main samba process tree started by /etc/init.d/samba-ad-dc. During the provisioning of Samba (AD DC), however, the assumption does not hold: by the time the join scripts from package univention-samba4 run, the postinst scripts of the packages winbind and samba will already have started /usr/sbin/winbindd (through 'invoke-rc.d --skip-systemd-native winbind $_dh_action', with $_dh_action="start") and /usr/sbin/smbd (through 'invoke-rc.d --skip-systemd-native smbd $_dh_action', with $_dh_action="start") respectively.

In my tests I could reproduce that the order of the commands in the function stop_conflicting_services() from /usr/lib/univention-install/96univention-samba4.inst _always_ causes the call '/etc/init.d/samba-ad-dc stop' to complain that the process '/usr/sbin/winbindd -D' (and its children) is still running and then to terminate it, exactly as the excerpts from journalctl.log in the description of this bug and in comment 2 show.
This behaviour of the call '/etc/init.d/samba-ad-dc stop' is a direct result of the assumption that /usr/sbin/winbindd is always part of the main samba process tree. That is not true when stop_conflicting_services() gets called, as the process '/usr/sbin/winbindd -D' was started by 'invoke-rc.d --skip-systemd-native winbind $_dh_action', with $_dh_action="start", as explained above.

# excerpt from /usr/lib/univention-install/96univention-samba4.inst
stop_conflicting_services() {
	## stop samba3 services and heimdal-kdc if present
	if [ -x /etc/init.d/samba ]; then
		if [ -n "$(pgrep -f '/usr/sbin/(smbd|nmbd)')" ]; then
			/etc/init.d/samba stop
			## the smbd init script might refuse to run if it detects ADDC config in smb.conf
			start-stop-daemon --stop --quiet --retry 2 --exec /usr/sbin/smbd
		fi
	fi
	if [ -x /etc/init.d/winbind ]; then
		if [ -n "$(pgrep -xf /usr/sbin/winbindd)" ]; then
			/etc/init.d/winbind stop
			# Bug #35600: Really stop all winbind processes
			start-stop-daemon --stop --quiet --retry 2 --exec /usr/sbin/winbindd
		fi
	fi

To be precise, calling stop_conflicting_services() will detect the running /usr/sbin/smbd, then call '/etc/init.d/samba stop', which calls '/etc/init.d/samba-ad-dc stop' with the results explained above.

None of this makes the provisioning of Samba (AD DC) fail, because we did not hit the NT_STATUS_ADDRESS_ALREADY_ASSOCIATED error here. However, IMHO this situation should be avoided, and I suggest the following two patches to do so:

1. In stop_conflicting_services(), change the order of calls to stop winbind first, then samba (see attachment 'patch to /usr/lib/univention-install/96univention-samba4.inst').
2. In the stop-branch of /etc/init.d/samba-ad-dc, also send the TERM signal to /usr/sbin/smbd and /usr/sbin/winbindd (see attachment 'patch to /etc/init.d/samba-ad-dc').
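
To make the "part of the main samba process tree" assumption checkable, something like the following could be used. It is a minimal sketch only: the helper is_descendant is hypothetical and not taken from the init script, and it assumes a Linux /proc filesystem. The stand-in demo at the end uses a sleep process instead of winbindd so that it runs anywhere.

```shell
#!/bin/sh
# is_descendant PID ANCESTOR (hypothetical helper): succeed if ANCESTOR
# appears in the parent chain of PID. Assumes Linux /proc/<pid>/status.
is_descendant() {
	d_anc=$2
	# Start with the direct parent of PID.
	d_pid=$(awk '/^PPid:/ { print $2 }' "/proc/$1/status" 2>/dev/null) || return 1
	while [ -n "$d_pid" ] && [ "$d_pid" -ge 1 ]; do
		[ "$d_pid" -eq "$d_anc" ] && return 0
		# Walk one level up; stop if the process vanished meanwhile.
		d_pid=$(awk '/^PPid:/ { print $2 }' "/proc/$d_pid/status" 2>/dev/null) || return 1
	done
	return 1
}

# Stand-in demo: a background sleep is a child of this shell.
sleep 30 &
demo_child=$!
is_descendant "$demo_child" "$$" && echo "process $demo_child is below shell $$"
kill "$demo_child"
```

In the stop branch, the same check could be applied to each PID reported by 'pgrep -x winbindd', with the PID from "$PIDFILE" as the ancestor. A winbindd started by the winbind postinst would then show up as not being below the main samba process.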
Created attachment 11214 [details] patch to /usr/lib/univention-install/96univention-samba4.inst
Created attachment 11215 [details] patch to /etc/init.d/samba-ad-dc
Thanks for your efforts and the time spent to troubleshoot this, that's really helpful! We will review your patches and see how to proceed.
Comment on attachment 11214 [details]
patch to /usr/lib/univention-install/96univention-samba4.inst

--- a/usr/lib/univention-install/96univention-samba4.inst 2024-02-05 16:53:54.000000000 +0100
+++ b/usr/lib/univention-install/96univention-samba4.inst 2024-07-02 15:37:14.044000000 +0200
@@ -207,6 +207,13 @@
 
 stop_conflicting_services() {
 	## stop samba3 services and heimdal-kdc if present
+	if [ -x /etc/init.d/winbind ]; then
+		if [ -n "$(pgrep -f /usr/sbin/winbindd)" ]; then
+			/etc/init.d/winbind stop
+			# Bug #35600: Really stop all winbind processes
+			start-stop-daemon --stop --quiet --retry 2 --exec /usr/sbin/winbindd
+		fi
+	fi
 	if [ -x /etc/init.d/samba ]; then
 		if [ -n "$(pgrep -f '/usr/sbin/(smbd|nmbd)')" ]; then
 			/etc/init.d/samba stop
@@ -214,13 +221,6 @@
 			start-stop-daemon --stop --quiet --retry 2 --exec /usr/sbin/smbd
 		fi
 	fi
-	if [ -x /etc/init.d/winbind ]; then
-		if [ -n "$(pgrep -xf /usr/sbin/winbindd)" ]; then
-			/etc/init.d/winbind stop
-			# Bug #35600: Really stop all winbind processes
-			start-stop-daemon --stop --quiet --retry 2 --exec /usr/sbin/winbindd
-		fi
-	fi
 	if [ -x /etc/init.d/heimdal-kdc ]; then
 		if [ -n "$(pgrep -f '/usr/lib/heimdal-servers/(kdc|kpasswdd)')" ]; then
 			/etc/init.d/heimdal-kdc stop
Comment on attachment 11215 [details]
patch to /etc/init.d/samba-ad-dc

--- a/etc/init.d/samba-ad-dc 2024-03-05 13:13:26.000000000 +0100
+++ b/etc/init.d/samba-ad-dc 2024-07-02 15:50:59.616000000 +0200
@@ -77,10 +77,15 @@
 		log_daemon_msg "Stopping $DESC" $NAME
 		## sometimes samba takes a long time to terminate,
 		## which would make starting new samba processes fail.
-		pids=$(pgrep -F "$PIDFILE"; pgrep --exact '(smbd|winbindd)')
+		pid_samba=$(pgrep -F "$PIDFILE")
+		pid_smbd=$(pgrep --exact 'smbd')
+		pid_winbindd=$(pgrep --exact 'winbindd')
 		start-stop-daemon --stop --quiet --pidfile $PIDFILE --name samba
 		ret="$?"
-		if [ -n "$pids" ]; then
+		[ -n "$pid_smbd" ] && start-stop-daemon --stop --quiet --pid $pid_smbd --name smbd
+		[ -n "$pid_winbindd" ] && start-stop-daemon --stop --quiet --pid $pid_winbindd --name winbindd
+		pids="$pid_samba $pid_smbd $pid_winbindd"
+		if [ -n "${pids// /}" ]; then
 			unset pgids kgids
 			for pid in $pids; do
 				pgids="$pgids -g $pid"
Created attachment 11224 [details]
revised patch to /etc/init.d/samba-ad-dc

cosmetic changes
Created attachment 11225 [details]
revised patch to /usr/lib/univention-install/96univention-samba4.inst

"pgrep -xf /usr/sbin/winbindd" will never match, as both winbind.service and samba-ad-dc.service start winbind with at least the "-D" option.
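
The -x/-f behaviour can be tried out without touching winbindd. The sketch below uses a 'sleep 300' process as a stand-in for '/usr/sbin/winbindd -D'. That stand-in and the variable names are my own choice for illustration, and it assumes pgrep from procps. With -f the pattern is matched against the full command line, and -x additionally requires the pattern to match that whole line, so a pattern without the argument should not match.

```shell
#!/bin/sh
# Stand-in for "/usr/sbin/winbindd -D": any process started with an argument.
sleep 300 &
child=$!
sleep 1   # give the child time to exec, so its command line is "sleep 300"

# -f: match against the full command line; -x: the pattern must match all of it.
if pgrep -xf 'sleep' >/dev/null; then bare=match; else bare=nomatch; fi
if pgrep -xf 'sleep 300' >/dev/null; then full=match; else full=nomatch; fi
# Without -x the pattern only has to match somewhere in the command line.
if pgrep -f 'sleep' >/dev/null; then loose=match; else loose=nomatch; fi

kill "$child"
wait "$child" 2>/dev/null || true
echo "exact without args: $bare; exact with args: $full; substring: $loose"
```

This mirrors why the revised patch uses 'pgrep -f /usr/sbin/winbindd' instead of 'pgrep -xf /usr/sbin/winbindd'.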
The suggested changes affect two parts:

1. /etc/init.d/samba-ad-dc shipped by source package samba
2. Joinscript /usr/lib/univention-install/96univention-samba4.inst

The first part has been done via ucs-patches and that package has been built via repo-ng:

1.a) For UCS 5.2-0

ucs-patches@1fbc3549 | Make samba-ad-dc stop smbd and winbindd explicitly
ucs-patches:samba/ucs_5.2-0/2:4.21.1-1/15_samba4_stop.patch

Package: samba
Version: 2:4.21.1-1A~5.2.0.202411191702
Branch: 5.2-0

1.b) Backport for 5.0-9:

ucs-patches@d9fe48bf | Make samba-ad-dc stop smbd and winbindd explicitly
ucs-patches:samba/ucs_5.0-0-errata5.0-9/2:4.18.3-1/15_samba4_stop.patch

Package: samba
Version: 2:4.18.3-1A~5.0.0.202411191740
Branch: 5.0-0
Scope: errata5.0-9

The second part has been done normally via the ucs repo:

2.a) For UCS 5.2-0

ucs@8ee0c6a8b46 | Let stop_conflicting_services stop winbindd first

Package: univention-samba4
Version: 11.0.7
Branch: 5.2-0

2.b) Backport for 5.0-9:

ucs@97b95a86470 | Let stop_conflicting_services stop winbindd first
5460dd9a93f | Advisories

Package: univention-samba4
Version: 9.0.18-3
Branch: 5.0-0
Scope: errata5.0-9
OK, looks good
<https://errata.software-univention.de/#/?erratum=5.0x1183>
<https://errata.software-univention.de/#/?erratum=5.0x1184>