Univention Bugzilla – Bug 52163
slapd segfaults on ppolicy
Last modified: 2021-01-06 16:53:38 CET
Same environment as in #37915 (comment 3), 4.4-4 errata698 (contains fix for issue described there)

new segfaults

Sep 29 09:28:43 server kernel: [1718146.046322] traps: slapd[4483] general protection ip:7f31020c34ab sp:7f2ed5ffb3c0 error:0
Sep 29 09:28:43 server kernel: [1718146.046329] in liblber-2.4.so.2.10.8[7f31020bb000+d000]
Sep 29 09:28:50 server kernel: [1718153.316897] traps: auth[746] general protection ip:7fe21dba24fc sp:7ffddef78e70 error:0
Sep 29 09:28:50 server kernel: [1718153.316903] in libc-2.24.so[7fe21db29000+195000]
Sep 29 13:49:38 server kernel: [1733801.154092] traps: slapd[30238] general protection ip:7ff0988bc4ab sp:7fee23ffe3c0 error:0
Sep 29 13:49:38 server kernel: [1733801.154099] in liblber-2.4.so.2.10.8[7ff0988b4000+d000]

2 of 3 comparable machines affected
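Each kernel "traps:" line gives the faulting instruction pointer (ip), and the following line gives the load base of the mapped library. Subtracting the two yields the offset inside the library, which makes crashes comparable across processes. A minimal sketch of that arithmetic; the helper name trap_offset is hypothetical and not part of any UCS tooling:

```shell
#!/bin/sh
# trap_offset IP BASE
# Both arguments are hex values without a 0x prefix, exactly as printed
# in the kernel "traps:" lines. Hypothetical helper name.
trap_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

trap_offset 7f31020c34ab 7f31020bb000   # slapd crash at 09:28:43
trap_offset 7ff0988bc4ab 7ff0988b4000   # slapd crash at 13:49:38
```

Both slapd crashes resolve to offset 0x84ab in liblber-2.4.so.2.10.8, which suggests they hit the same code location. With matching debug symbols installed, such an offset could be handed to a symbolizer such as addr2line; whether that is practical on the affected systems is not known from this report.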
Without having looked at the coredump: Upstream had one or two iterations of refinement of the patch, so maybe we should consider taking the most recent upstream patch.
There has not been any recent activity on this bug. Has the problem been seen somewhere else as well in the meantime or has its assessment changed?
The crash still happens 3-4 times a day at the linked customer. A workaround (checking every 5 minutes whether slapd is working and restarting it if needed) is implemented. Further work, especially investigating whether the upstream patch would solve the problem, is highly welcome.
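The customer's actual workaround script is not included in this report. As a rough sketch of what such a check could look like, assuming the rootDSE is readable with an anonymous simple bind and that a 30 second limit is acceptable (the function names and the timeout are assumptions):

```shell
#!/bin/sh
# Hypothetical watchdog sketch; not the script used at the customer.

# Probe: base-scope search on the rootDSE, aborted by timeout(1) if slapd hangs.
# "1.1" requests no attributes, so only the entry itself is returned.
probe_slapd() {
    timeout 30 ldapsearch -x -LLL -s base -b "" \
        "(objectClass=OpenLDAProotDSE)" 1.1 >/dev/null 2>&1
}

# Restart via the init script that is also used elsewhere in this bug.
restart_slapd() {
    /etc/init.d/slapd restart
}

# One check: restart only when the probe fails or times out.
check_slapd_once() {
    if probe_slapd; then
        echo "slapd ok"
    else
        echo "slapd unresponsive, restarting" >&2
        restart_slapd
    fi
}
```

A script that ends with a call to check_slapd_once could then be run from cron every 5 minutes. Keeping the probe and the restart in separate functions allows the decision logic to be exercised without touching a running slapd.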
We should just replace our ppolicy patch by the improved upstream one. Nobrainer.
r19245 | Adjust patches to upstream
r19246 | Adjust patch to upstream
r19247 | Adjust patch
7ac85d12dd | Advisory

Package: openldap
Version: 2.4.45+dfsg-1~bpo9+1A~4.4.0.202012162101
Branch: ucs_4.4-0
Scope: errata4.4-7
OK - 99_Bug37915_avoid_deadlock_and_race_condition.quilt (update to current upstream)
OK - 70_ppolicy_udm_lock.quilt
OK - Jenkins Tests

But I am not sure if we fixed the problem. I can't see a segfault, but with a small change to 10_ldap/56ppolicy_account_lockout_concurrent I can reproduce the slapd "deadlock".

----------
--- a/test/ucs-test/tests/10_ldap/56ppolicy_account_lockout_concurrent
+++ b/test/ucs-test/tests/10_ldap/56ppolicy_account_lockout_concurrent
@@ -29,6 +29,7 @@ restart_slapd_if_it_hangs() {
  if pgrep -lf objectClass=OpenLDAProotDSE | grep -qe '\<ldapsearch\>'; then
   msg="slapd deadlock"
   echo "ERROR: slapd hangs, attempting kill and start" >&2
+  exit 0
   pid=$(lslocks -n -o PID,PATH | sed -n 's| /var/lib/univention-ldap/listener/listener.lock||p')
   if [ -n "$pid" ]; then
    echo "INFO: Process locking /var/lib/univention-ldap/listener/listener.lock: " >&2
@@ -59,7 +60,7 @@ deactivate_ppolicy() {
  ucr unset ldap/ppolicy ldap/ppolicy/enabled; /etc/init.d/slapd restart
 }
-ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes; /etc/init.d/slapd restart; undo deactivate_ppolicy
+#ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes; /etc/init.d/slapd restart; undo deactivate_ppolicy
 default_ppolicy_ldif=$(univention-ldapsearch -LLL -b "cn=default,cn=ppolicy,cn=univention,$ldap_base")
@@ -94,7 +95,7 @@ pwdFailureCountInterval: $new_pwdFailureCountInterval
 test_username_list=()
 test_userdn_list=()
-num_testusers=5
+num_testusers=50
 for ((i=0;i<num_testusers;i++)); do
  test_username=$(user_randomname)
  user_create "$test_username" &&
----------

-> ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes
-> /etc/init.d/slapd restart
-> /usr/share/ucs-test/10_ldap/56ppolicy_account_lockout_concurrent -f

modifying entry "cn=default,cn=ppolicy,cn=univention,dc=new,dc=test"
info 2020-12-18 11:45:12 create user a7vrrdsc using udm-test users/user create --position=cn=users,dc=new,dc=test --set username=a7vrrdsc --set firstname=Max --set lastname=Muster --set organisation=firma.de_GmbH --set password=univention
...
Check if slapd is still responsive:
ERROR: slapd hangs, attempting kill and start
info 2020-12-18 11:47:03 remove user gxs5fzzy

Now slapd hangs. I don't know how to debug this situation, but it happens every time.

Should we clone the bug for this issue, so that we can proceed with this one here?
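One common way to inspect a hung (not crashed) daemon is to attach gdb non-interactively and dump the stack of every thread, which usually shows which threads are waiting on which lock. A minimal sketch under the assumption that gdb is available on the test system; the function name and the default output path are hypothetical, and without debug symbols the backtraces will be of limited use:

```shell
#!/bin/sh
# Hypothetical helper; not part of ucs-test.
# dump_slapd_threads [OUTFILE]
dump_slapd_threads() {
    out=${1:-/tmp/slapd-threads.txt}
    # pidof -s prints a single PID and fails if no slapd is running.
    pid=$(pidof -s slapd) || { echo "slapd is not running" >&2; return 1; }
    # -batch makes gdb exit after the given commands and detach again;
    # "thread apply all bt" prints a backtrace for every thread.
    gdb -p "$pid" -batch -ex "thread apply all bt" >"$out" 2>&1
    echo "backtrace written to $out"
}
```

Calling dump_slapd_threads right after the test prints "ERROR: slapd hangs" (and before slapd is killed) would capture the state of the hang, so the output could be attached to the cloned bug.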
Cool, thanks for checking that!

> Should we clone the bug for this issue, so that we can proceed with this one here?

I'm unsure if this is a regression. If not then we can split that off.

I wonder if this is a variation of Bug #51722 or something entirely different.
(In reply to Arvid Requate from comment #9)
> Cool, thanks for checking that!
>
> > Should we clone the bug for this issue, so that we can proceed with this one here?
>
> I'm unsure if this is a regression. If not then we can split that off.

No regression - also happens with the current released version 2.4.45+dfsg-1~bpo9+1A~4.4.0.202007242143 (4.4-6)

> I wonder if this is a variation of Bug #51722 or something entirely
> different.

No, I stopped u-d-n and u-d-l before the test (forgot to mention here).

So I will clone this bug, so we can at least release the updated patches?
> No, stopped u-d-n and u-d-l before the test (forgot to mention here).

Ok, good. Sigh, at least we are not stumbling over our own feet here.

> So i will clone this bug, so we can at least release the updated patches?

Ok, if you are fine with that.
Sorry, forgot one thing: please merge to UCS 5.0-0.
r19260 | merged patches

Package: openldap
Version: 2.4.47+dfsg-3+deb10u4A~5.0.0.202012231410
Branch: ucs_5.0-0
You changed the patches in openldap/5.0-0-0-ucs/2.4.47+dfsg-3+deb10u2, but we already have 2.4.47+dfsg-3+deb10u4 for 5.0-0, so please merge these changes to 2.4.47+dfsg-3+deb10u4.
Ah, damn, thanks:

r19261 | merged patches

Package: openldap
Version: 2.4.47+dfsg-3+deb10u4A~5.0.0.202012231957
Branch: ucs_5.0-0
Still valid from comment 8:
OK - 99_Bug37915_avoid_deadlock_and_race_condition.quilt (update to current upstream)
OK - 70_ppolicy_udm_lock.quilt
OK - Jenkins Tests

Remaining QA:
OK: Remaining issues moved to bug 52515
OK: Merge to UCS 5, 5.0-0-0-ucs/2.4.47+dfsg-3+deb10u4/
    diff 99_Bug37915_avoid_deadlock_and_race_condition.quilt
    diff 70_ppolicy_udm_lock.*
OK: Packages built with patches in 4.4 and 5.0
OK: ucs-test
OK: yaml

Verified
<https://errata.software-univention.de/#/?erratum=4.4x856>