51742 – reboot/halt hangs forever on shutdown

Bug 51742 - reboot/halt hangs forever on shutdown

Summary: reboot/halt hangs forever on shutdown

Status:	CLOSED FIXED

Alias:	None

Product:	UCS
Classification:	Unclassified
Component:	General
Version:	UCS 5.0
Hardware:	Other Linux

Importance:	P5 normal
Target Milestone:	UCS 4.4-8-errata
Assignee:	Philipp Hahn
QA Contact:	Erik Damrose

URL:
Keywords:

Depends on:	46208
Blocks:
	Show dependency tree / graph

Reported:	2020-07-29 13:56 CEST by Arvid Requate
Modified:	2021-07-23 13:16 CEST (History)
CC List:	3 users (show)

See Also:
What kind of report is it?:	Bug Report
What type of bug is this?:	5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?:	1: Will affect a very few installed domains
How will those affected feel about the bug?:	5: Blocking further progress on the daily work
User Pain:	0.143
Enterprise Customer affected?:	Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:	2020121821000411
Bug group (optional):
Customer ID:	00026, 00424
Max CVSS v3 score:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Arvid Requate

2020-07-29 13:56:14 CEST

I'm pretty sure I've filed a bug about this about 4 years ago, but here we go:

reboot/halt hangs forever at "Stopping ldap server(s): slapd"

Didn't find a way to debug this yet, because there is no console login left at that point. UCS 4.4 but currently observed on a UCS 5 alpha VM (currently still basically running the UCS 4 slapd/univention-ldap-server packages).

Comment 1 Arvid Requate

2021-06-21 12:15:13 CEST

Still happens with UCS 5.0-0.

First it hung a while with "Stopping plymouth-halt.service" and then it switched to "Stopping ldap server(s): slapd".

Comment 2 Philipp Hahn

2021-06-22 16:24:24 CEST

This is not restricted to "slapd.service", which just is the last service which successfully shut down. The real culprit is some other system unit.

To debug this:
1. dmesg -n 8
2. /usr/share/doc/systemd/README.Debian → Debugging boot/shutdown problems

For customer 00026 it was their "/etc/init.d/tomcat*" missing "stop" actions, so systemd did not stop them on reboot. That kept those processes in memory AND swap, which prevented swap from being turned off. systemd was then stuck waiting for swap to get unmounted.
Solution there: terminate the process and do "swapoff -a" before "reboot".

I observed a similar problem on "lagan" and "soft-test02" multiples times today, where I just captured this though Serial-over-LAN debug:
MEI (Intel(R) Management Engine Interface) might not be the culprit in all cases.
with 4.10-0 debug kernel

[ 1331.549176] systemd-shutdown[1]: Deactivating swaps.
[ 1332.367887] kvm: exiting hardware virtualization
[ 1457.842202] INFO: task systemd-shutdow:1 blocked for more than 120 seconds.
[ 1457.851217]       Not tainted 4.10.0 #2
[ 1457.857072] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1457.866960] systemd-shutdow D    0     1      0 0x00000000
[ 1457.874525] Call Trace:
[ 1457.879045]  __schedule+0x233/0x6f0
[ 1457.884603]  ? __perf_remove_from_context+0x3e/0xb0
[ 1457.891564]  schedule+0x36/0x80
[ 1457.896759]  schedule_preempt_disabled+0xe/0x10
[ 1457.903340]  __mutex_lock_slowpath+0x18c/0x2b0
[ 1457.909832]  mutex_lock+0x2f/0x40
[ 1457.915181]  mei_cl_bus_remove_devices+0x27/0x60 [mei]
[ 1457.922347]  mei_stop+0x23/0x80 [mei]
[ 1457.928038]  mei_me_remove+0x46/0xe0 [mei_me]
[ 1457.934432]  pci_device_shutdown+0x36/0x70
[ 1457.940569]  device_shutdown+0x12d/0x1b0
[ 1457.946528]  kernel_restart_prepare+0x36/0x40
[ 1457.952931]  kernel_restart+0x12/0x60
[ 1457.958623]  SYSC_reboot+0x1c8/0x1f0
[ 1457.964235]  ? vfs_writev+0x3c/0x50
[ 1457.969761]  ? do_writev+0x52/0xd0
[ 1457.975202]  SyS_reboot+0xe/0x10
[ 1457.980460]  entry_SYSCALL_64_fastpath+0x1e/0xad
[ 1457.987085] RIP: 0033:0x7fb15b8a98c6
[ 1457.992630] RSP: 002b:00007ffd44691618 EFLAGS: 00000206 ORIG_RAX: 00000000000000a9
[ 1458.002219] RAX: ffffffffffffffda RBX: 0000559f3a000040 RCX: 00007fb15b8a98c6
[ 1458.011395] RDX: 0000000001234567 RSI: 0000000028121969 RDI: fffffffffee1dead
[ 1458.020548] RBP: 0000000000000001 R08: 0000000000000028 R09: 0000000000000005
[ 1458.029673] R10: 0000000000000002 R11: 0000000000000206 R12: 00007fb15bebb7d4
[ 1458.038795] R13: 0000000000000001 R14: 0000559f3a002c40 R15: 00007ffd44691998

Comment 3 Philipp Hahn

2021-07-02 12:07:06 CEST

The customer 00026 problem is different:
- They are using 3 NFS shares
- They had services *started* by SysV init scripts, but no corresponding *stop* actions
- Some processes heavily pushed the system to SWAP
On shutdown systemd was unable to umount everything respectively turn off swap correctly
- Even after adding *stop* actions the system did not reboot
- a manual `swapoff -a` before `reboot` only helped in some cases

Further investigation shows systemd-shutdown doing a sync() system call.
> Linux kernel hung check at sys_sync()
It looks like NFS shared were still mounted but the network was already down.
> NFS not responding


Similar shutdown problems overlayed our last pullcord "30 Networking issues", where our KVM servers did not reboot in all cases - they also use NFS and SWAP. This distracted me from working efficiently on the underlying problem.

Comment 4 Philipp Hahn

2021-07-13 17:53:24 CEST

# Reproducer:
ucr set grub/append='systemd.log_level=debug systemd.log_target=console console=ttyS0,115200 console=tty0'
dmesg -n 8
systemctl start debug-shell  # on tty9
mount -t nfs omar.knut.univention.de:/mnt/omar/sde/buildsystem2 /mnt
yes > /mnt/temp/reboot &  # this will write data in the background to have dirt data for writeback
ip link set enp1s0 down
sync &  # this will force a flush, which fill not succeed as the network is down
# INFO: task sync:... blocked for more than 120 seconds.
…
#  [<ffffffff97......>] ? sys_sync
#  [<ffffffff97......>] ? do_syscall_64
#  [<ffffffff97......>] ? entry_SYSCALL_64_after_swapgs
…
# nfs: server: 192.168.0.10 not responding, still trying
…
reboot
…
[Failed] Umount /mnt
…
[  OK  ] Reached target Shutdown.

<man:systemd-shutdown(8)> also uses <man:sync(2)> and blocks if the NFS server is not reachable.

<https://github.com/systemd/systemd/issues/6115> has some useful information on that:
- some services do not have "stop" actions
  - this includes "havaged", "lvmetad", "nmbd", "rpcbind", "systemd-jurndald", "systemd-udevd"
- these services keeps running and may block umount()ing some file systems
  - same for `swapoff -a` which fails if used>ram
- some services MUST NOT be killed, e.g. `iscsid`
  - so killing ALL processes is not a generic option
- the network services and interfaces are stopped
  - this makes any NFS server unreachable
  - pending NFS data cannot be flushed any longer
- systemd[1] does replace itself via `exec(systemd-shutdown)`, which does a <man:sync(2)>
- the last process PID 1 will be blocked forever then.
(my VM was only rebooted because the WATCHDOG timer kicked in and reset the VM after 10m)

* v236~259 added `MNT_FORCE` to "umount()", which helps with unreachable NFS.
* v236~2^2~1 moved the blocking "umount()" into a forked child process.
* v237~173^2~9 also moved the "sync()" by "systemd-shutdown" into a forked child process.

<https://packages.debian.org/search?keywords=systemd&searchon=sourcenames&suite=all&section=all>
* UCS-4.4 has systemd 232-25+deb9u12
* UCS-5.0 has systemd 241-7~deb10u7
* "stretch-backports" has 241-5~bpo9+1 from Debian-9-Stretch, which is UCS-4.4, bus is missing many later fixes


Fix
===
backport the following packages from Buster to Stretch
- bash-completion-2.8 (util-linux Breaks)
- shadow-4.5 (util-linux Depends)
- util-linux-2.33 (systemd Build-Depends)
- systemd-241 (Drop python3-evdev)
Either for errata4.4-x or customer scope only.

Variant
-------
Replace the dependency on "bin:libmount-dev (>= 2.30)" from "src:utils-linux" by a packport of the patch required by systemd to break the dependency chain there: see systemd:d6e8096669fef4aa5132843fb488fa35de349901, e.g. 
systemd is already very low-level, but pulling in other low-level dependencies just for that tiny functionality is a risk to all other programs running on the system as well.

Alternative
===========
See if we can introduce a "shutdown dependency", which
- kills all processes having files opened on NFS *before* the network is shut down.
  ```
  for ((i=0;i<3;i+=1))
  do
    awk '$9=="nfs"||$9=="nfs4"{print $5}' /proc/self/mountinfo | xargs -r fuser --kill -SIGTERM --mount
    sleep 3
  done
  awk '$9=="nfs"||$9=="nfs4"{print $5}' /proc/self/mountinfo | xargs -r fuser --kill -SIGKILL --mount
  ```
- forcefully umount all NFS file systems
  ```
  umount --force --read-only --lazy --no-canonicalize --all --types nfs,nfs4 # --fake -verbose
  ```

The system would sill be affected by `systemd-shutdown` doing a `sync()` itself, so the system would still hang if not all processes can be killed.
(`umount --no-canonicalize` is broken << 2.30 and might block - this is why systems wants the newer version)

Comment 5 Philipp Hahn

2021-07-16 12:53:10 CEST

r19396 | Bug #51742: systemd-241-7~deb10u7~bpo9
r19397 | Bug #51742: systemd-241-7~deb10u7~bpo9
r19398 | Bug #51742: systemd-241-7~deb10u7~bpo9

repo-copy-dsc -cp /mnt/build-storage/upstream/debian/pool/main/d/debhelper/debhelper_12.1.1~bpo9+1.dsc

repo-copy-dsc -cp /mnt/build-storage/upstream/debian/pool/main/d/dh-autoreconf/dh-autoreconf_17~bpo9+1.dsc

repo-copy-dsc -cp /mnt/build-storage/upstream/debian/pool/main/m/meson/meson_0.49.0-1~bpo9+1.dsc

Package: bash-completion
Version: 1:2.8-6~bpo9+1A~4.4.0.202107160950
Branch: ucs_4.4-0
Scope: systemd44

Package: shadow
Version: 1:4.5-1.1~bpo9+1A~4.4.0.202107160952
Branch: ucs_4.4-0
Scope: systemd44

Package: util-linux
Version: 2.33.1-0.1~bpo9+1A~4.4.0.202107161004
Branch: ucs_4.4-0
Scope: systemd44

Package: systemd
Version: 241-7~deb10u7~bpo9+1A~4.4.0.202107161017
Branch: ucs_4.4-0
Scope: systemd44


ucr set repository/online/component/systemd44{=yes,/server=service.univention.de,/prefix=apt/00026,/parts=maintained,/username=00026,/password=REDACTED}

Comment 6 Philipp Hahn

2021-07-16 14:27:24 CEST

Now also available as a UCS component:
> ucr set repository/online/component/systemd44{=yes,/version=4.4,/description=systemd-reboot-fix}

(The customer repository can be removed)

Also available: <https://help.univention.com/t/reboot-shutdown-sometimes-fails-and-hangs/18253>

Comment 7 Philipp Hahn

2021-07-22 14:13:59 CEST

Debian-10-Buster Security Update: <https://www.debian.org/security/2021/dsa-4942>

Package: systemd
Version: 241-7~deb10u8~bpo9+1A~4.4.0.202107221410
Branch: ucs_4.4-0
Scope: systemd44

Comment 8 Erik Damrose

2021-07-23 12:34:05 CEST

OK: Backport of required packages
OK: ucr set repository/online/component/systemd44{=yes,/version=4.4,/description=systemd-reboot-fix}; univention-upgrade
OK: Boot/reboot/shutdown with systemd 241 on UCS 4.4

I added the following note to the help article:

When booting with the updated `systemd` package, `systemd` tries to activate a kernel feature that is not yet present in the kernel used in *UCS 4.4*. Thus, the following message can be seen in the `dmesg` output. It can be ignored, as [a fallback is done if the option is not supported](https://github.com/systemd/systemd/pull/6294).
`kernel: cgroup: cgroup2: unknown option "nsdelegate"`

-> Verified + Closed, nothing more to release.