Univention Bugzilla – Bug 39775
libnss-extrausers not thread safe? - prevents libvirt from starting
Last modified: 2016-08-03 15:56:32 CEST
libnss-extrausers 0.6-3.12.201409252135

# cat libvirtd.gdb
file /usr/sbin/libvirtd
set args -f /root/libvirtd.conf
set environment MALLOC_CHECK_ 2
run
# gdb -x libvirtd.gdb
...
Program received signal SIGABRT, Aborted.
Program received signal SIGSEGV, Segmentation fault.
$ bt
#0 0x00007ffff40c6165 in *__GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1 0x00007ffff40c93e0 in *__GI_abort () at abort.c:92
#2 0x00007ffff4109c10 in malloc_printerr (action=2, str=0x7ffff41e0b92 "free(): invalid pointer", ptr=0x6547) at malloc.c:6317
#3 0x00007ffff40fab3d in _IO_new_fclose (fp=0x5555559e55b0) at iofclose.c:88
#4 0x00007fffeb38d7d5 in _nss_extrausers_endgrent () from /usr/lib/libnss_extrausers.so.2
#5 0x00007fffeb38e67b in _nss_extrausers_initgroups_dyn () from /usr/lib/libnss_extrausers.so.2
#6 0x00007ffff413c362 in internal_getgrouplist (user=<optimized out>, group=110, size=0x7fffe8383868, groupsp=0x7fffe8383860, limit=<optimized out>) at initgroups.c:101
#7 0x00007ffff413c644 in getgrouplist (user=user@entry=0x5555558e96b0 "libvirt-qemu", group=group@entry=110, groups=groups@entry=0x5555559d69b0, ngroups=ngroups@entry=0x7fffe83838bc) at initgroups.c:153
#8 0x00007ffff76b732d in mgetgroups (username=0x5555558e96b0 "libvirt-qemu", gid=110, groups=0x7fffe83839b0) at ../../../../gnulib/lib/mgetgroups.c:90
#9 0x00007ffff75553e4 in virGetGroupList (uid=uid@entry=109, gid=25970, gid@entry=1009, list=list@entry=0x7fffe83839b0) at ../../../src/util/virutil.c:1057
#10 0x00007ffff7511759 in virFileAccessibleAs (
...

The exact signal differs: in most of my test runs it is SIGABRT, sometimes SIGSEGV, sometimes others.

Removing "extrausers" for "group" in /etc/nsswitch.conf and adding "Tech" to "/etc/group" made the problem go away.

libnss-extrausers does not seem to be thread-safe!

Upstream: <http://anonscm.debian.org/cgit/users/brlink/libnss-extrausers.git/>
Backport of patch: <https://forge.univention.org/bugzilla/show_bug.cgi?id=29915>
NSS debugging: <https://ldpreload.com/blog/testing-glibc-nsswitch>
GDB: <https://sourceware.org/gdb/onlinedocs/gdb/Thread-Stops.html>
glibc: <http://www.gnu.org/software/libc/manual/html_node/Name-Service-Switch.html#Name-Service-Switch>
Ticket #2015110421000306
Happened again. A temporary fix for libvirt is to disable the NSS module extrausers:
/etc/init.d/libvirtd stop
pkill -9 libvirtd
sed -e '/^group/s/extrausers//' -i /etc/nsswitch.conf
/etc/init.d/libvirtd start
ucr commit /etc/nsswitch.conf
Ticket #2016061421000386
Ticket #2016071921000231
This also happens in the Univention production environment.
Created attachment 7816
Test program

"Good" news: I can trigger the bug with a test program other than libvirtd. It crashes every time with 100 threads calling getgrouplist() in parallel.

getgrouplist() internally uses the "initgroups_dyn" implementation, which extrausers provides as "_nss_extrausers_initgroups_dyn()". This function calls "_nss_extrausers_setgrent()", which is not thread-safe because it overwrites the "static FILE *groupsfile".
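
For readers without access to the attachment, here is a minimal sketch of such a reproducer. It is not the attached test program; the names "struct job", "worker" and "stress_getgrouplist" are made up for illustration. It starts a number of threads that each call getgrouplist() once and counts how many calls failed. With a thread-unsafe NSS module in the "group" line of /etc/nsswitch.conf, the process would be expected to crash rather than return.

```c
#define _DEFAULT_SOURCE
#include <grp.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical per-thread argument block (not taken from the attachment). */
struct job {
    const char *user;
    gid_t gid;
    int failed;
};

/* Each thread resolves the group list once; this goes through the
 * initgroups_dyn entry point of every NSS module listed for "group". */
static void *worker(void *arg)
{
    struct job *job = arg;
    int ngroups = 16;
    gid_t *groups = malloc(ngroups * sizeof(*groups));

    if (groups == NULL) {
        job->failed = 1;
        return NULL;
    }
    if (getgrouplist(job->user, job->gid, groups, &ngroups) == -1) {
        /* Buffer was too small; ngroups now holds the required size. */
        gid_t *bigger = realloc(groups, ngroups * sizeof(*groups));
        if (bigger == NULL) {
            free(groups);
            job->failed = 1;
            return NULL;
        }
        groups = bigger;
        if (getgrouplist(job->user, job->gid, groups, &ngroups) == -1)
            job->failed = 1;
    }
    free(groups);
    return NULL;
}

/* Run nthreads parallel lookups; returns the number of failed lookups,
 * or -1 if the arguments are invalid or memory could not be allocated. */
int stress_getgrouplist(const char *user, gid_t gid, int nthreads)
{
    pthread_t *tids;
    struct job *jobs;
    int started = 0, failures = 0;

    if (user == NULL || nthreads <= 0)
        return -1;
    tids = calloc(nthreads, sizeof(*tids));
    jobs = calloc(nthreads, sizeof(*jobs));
    if (tids == NULL || jobs == NULL) {
        free(tids);
        free(jobs);
        return -1;
    }
    for (int i = 0; i < nthreads; i++) {
        jobs[i].user = user;
        jobs[i].gid = gid;
        if (pthread_create(&tids[i], NULL, worker, &jobs[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        failures += jobs[i].failed;
    }
    failures += nthreads - started; /* threads that could not be started */
    free(tids);
    free(jobs);
    return failures;
}
```

Built with "gcc -pthread" and called from a small main() as, for example, stress_getgrouplist("libvirt-qemu", 110, 100), this mirrors the user, gid and thread count mentioned in this report. Running it under "MALLOC_CHECK_=2" should make heap corruption abort early, as in the gdb session above.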
# repo_admin.py --cherrypick ...
Cherry picked libnss-extrausers[58795] from 4.0-0-0[75]/None[0] to 4.1[76]/errata4.1-2[446]

r16615 | Bug #39775 nss-extrausers: Fix threading
Package: libnss-extrausers
Version: 0.6-3.13.201607201843
Branch: ucs_4.1-0
Scope: errata4.1-2

r71120 | Bug #39775 nss-extrausers: Fix threading YAML
libnss-extrausers.yaml

QA: The test-program needs to be C-compiled, which currently prevents it from being added to ucs-test.

YAML: OK
Code review: OK
Tests: OK

I was able to reproduce the issue with the test program and the old package. With the new package it works. The output of 'getent group' in my environment is identical between the old and the new version.
<http://errata.software-univention.de/ucs/4.1/221.html>