Bug 52163 - slapd segfaults on ppolicy
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: LDAP
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-7-errata
Assigned To: Arvid Requate
QA Contact: Erik Damrose
Depends on:
Blocks: 52515
Reported: 2020-09-29 17:41 CEST by Dirk Ahrnke
Modified: 2021-01-06 16:53 CET

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 7: Crash: Bug causes crash or data loss
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.200
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Description Dirk Ahrnke univentionstaff 2020-09-29 17:41:50 CEST
Same environment as in #37915 (comment 3), 4.4-4 errata698 (contains fix for issue described there)

New segfaults:

Sep 29 09:28:43 server kernel: [1718146.046322] traps: slapd[4483] general protection ip:7f31020c34ab sp:7f2ed5ffb3c0 error:0
Sep 29 09:28:43 server kernel: [1718146.046329]  in liblber-2.4.so.2.10.8[7f31020bb000+d000]
Sep 29 09:28:50 server kernel: [1718153.316897] traps: auth[746] general protection ip:7fe21dba24fc sp:7ffddef78e70 error:0
Sep 29 09:28:50 server kernel: [1718153.316903]  in libc-2.24.so[7fe21db29000+195000]
Sep 29 13:49:38 server kernel: [1733801.154092] traps: slapd[30238] general protection ip:7ff0988bc4ab sp:7fee23ffe3c0 error:0
Sep 29 13:49:38 server kernel: [1733801.154099]  in liblber-2.4.so.2.10.8[7ff0988b4000+d000]

2 of 3 comparable machines are affected.
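
The kernel trap lines above give both the faulting instruction pointer (ip) and the load address of the mapped library, so the offset of the crash inside liblber can be computed by subtraction. The following is a minimal sketch; the helper name fault_offset is hypothetical and only the hex values are taken from the log.

```shell
#!/bin/sh
# Hypothetical helper: compute the offset of a faulting instruction
# pointer relative to the load address of the library it falls into.
# $1 = ip, $2 = mapping base, both as hex digits without "0x" prefix.
fault_offset() {
    printf '0x%x\n' $(( 0x$1 - 0x$2 ))
}

# Values from the two slapd traps in the log above.
fault_offset 7f31020c34ab 7f31020bb000   # 09:28:43, slapd[4483]
fault_offset 7ff0988bc4ab 7ff0988b4000   # 13:49:38, slapd[30238]
```

Both calls print 0x84ab, which suggests that the two slapd crashes hit the same instruction in liblber-2.4.so.2.10.8. With matching debug symbols installed, that offset could be resolved to a function name, for example with addr2line.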
Comment 2 Arvid Requate univentionstaff 2020-09-29 18:20:14 CEST
Without having looked at the coredump: Upstream had one or two iterations of refinement of the patch, so maybe we should consider taking the most recent upstream patch.
Comment 4 Nico Gulden univentionstaff 2020-12-01 13:07:05 CET
There has not been any recent activity on this bug. Has the problem been seen somewhere else as well in the meantime or has its assessment changed?
Comment 5 Dirk Ahrnke univentionstaff 2020-12-01 13:45:49 CET
The crash still happens 3-4 times a day at the linked customer.
A workaround (checking every 5 minutes whether slapd is working and restarting it if needed) is implemented.
Further work, especially investigating whether the upstream patch would solve the problem, is highly welcome.
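
A workaround of this kind could look roughly like the sketch below. This is not the script deployed at the customer; the function names are hypothetical, and it assumes that an anonymous base search on the root DSE is permitted and that ldapsearch finds the server through the local LDAP client configuration.

```shell
#!/bin/sh
# Hypothetical watchdog sketch; NOT the script deployed at the customer.

# Assumption: an anonymous base search on the root DSE is allowed and
# ldapsearch takes the server URI from the local LDAP client config.
probe_slapd() {
    # timeout(1) turns a hanging query into a failure after 10 seconds.
    timeout 10 ldapsearch -x -LLL -s base -b "" \
        '(objectClass=OpenLDAProotDSE)' >/dev/null 2>&1
}

restart_slapd() {
    /etc/init.d/slapd restart
}

# Probe once; restart slapd only if the probe fails or times out.
watchdog_once() {
    if probe_slapd; then
        return 0
    fi
    echo "WARNING: slapd not responding, restarting" >&2
    restart_slapd
}

# Only act when called as "<script> run", e.g. from cron.
if [ "${1:-}" = "run" ]; then
    watchdog_once
fi
```

Installed under a path of your choice and called with the argument run from a cron entry with a */5 minute field, this would check slapd every 5 minutes. A crashed slapd is simply started again by the init script; a slapd that hangs rather than crashes may additionally need to be killed before it can be restarted.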
Comment 6 Arvid Requate univentionstaff 2020-12-01 17:36:26 CET
We should just replace our ppolicy patch with the improved upstream one. No-brainer.
Comment 7 Arvid Requate univentionstaff 2020-12-16 22:57:17 CET
r19245 | Adjust patches to upstream
r19246 | Adjust patch to upstream
r19247 | Adjust patch

7ac85d12dd | Advisory

Package: openldap
Version: 2.4.45+dfsg-1~bpo9+1A~4.4.0.202012162101
Branch: ucs_4.4-0
Scope: errata4.4-7
Comment 8 Felix Botner univentionstaff 2020-12-18 12:49:05 CET
OK - 99_Bug37915_avoid_deadlock_and_race_condition.quilt (update to current upstream)
OK - 70_ppolicy_udm_lock.quilt
OK - Jenkins Tests

But I am not sure if we fixed the problem. I can't see a segfault, but with a small change to the 10_ldap/56ppolicy_account_lockout_concurrent test I can reproduce the slapd "deadlock".

----------

--- a/test/ucs-test/tests/10_ldap/56ppolicy_account_lockout_concurrent
+++ b/test/ucs-test/tests/10_ldap/56ppolicy_account_lockout_concurrent
@@ -29,6 +29,7 @@ restart_slapd_if_it_hangs() {
        if pgrep -lf objectClass=OpenLDAProotDSE | grep -qe '\<ldapsearch\>'; then
                msg="slapd deadlock"
                echo "ERROR: slapd hangs, attempting kill and start" >&2
+               exit 0
                pid=$(lslocks -n -o PID,PATH | sed -n 's| /var/lib/univention-ldap/listener/listener.lock||p')
                if [ -n "$pid" ]; then
                        echo "INFO: Process locking /var/lib/univention-ldap/listener/listener.lock: " >&2
@@ -59,7 +60,7 @@ deactivate_ppolicy() {
        ucr unset ldap/ppolicy ldap/ppolicy/enabled; /etc/init.d/slapd restart
 }
 
-ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes; /etc/init.d/slapd restart; undo deactivate_ppolicy
+#ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes; /etc/init.d/slapd restart; undo deactivate_ppolicy
 
 default_ppolicy_ldif=$(univention-ldapsearch -LLL -b "cn=default,cn=ppolicy,cn=univention,$ldap_base")
 
@@ -94,7 +95,7 @@ pwdFailureCountInterval: $new_pwdFailureCountInterval
 test_username_list=()
 test_userdn_list=()
 
-num_testusers=5
+num_testusers=50
 for ((i=0;i<num_testusers;i++)); do
        test_username=$(user_randomname)
        user_create "$test_username" &&
----------

-> ucr set ldap/ppolicy=yes ldap/ppolicy/enabled=yes
-> /etc/init.d/slapd restart
-> /usr/share/ucs-test/10_ldap/56ppolicy_account_lockout_concurrent  -f
modifying entry "cn=default,cn=ppolicy,cn=univention,dc=new,dc=test"

info 2020-12-18 11:45:12	 create user a7vrrdsc using udm-test users/user create --position=cn=users,dc=new,dc=test --set username=a7vrrdsc --set firstname=Max --set lastname=Muster --set organisation=firma.de_GmbH --set password=univention
...
Check if slapd is still responsive: ERROR: slapd hangs, attempting kill and start
info 2020-12-18 11:47:03	 remove user gxs5fzzy

Now slapd hangs. I don't know how to debug this situation, but it happens every time.

Should we clone the bug for this issue, so that we can proceed with this one here?
Comment 9 Arvid Requate univentionstaff 2020-12-18 13:56:06 CET
Cool, thanks for checking that!

> Should we clone the bug for this issue, so that we can proceed with this one here?

I'm unsure if this is a regression. If not then we can split that off.

I wonder if this is a variation of Bug #51722 or something entirely different.
Comment 10 Felix Botner univentionstaff 2020-12-18 14:12:30 CET
(In reply to Arvid Requate from comment #9)
> Cool, thanks for checking that!
> 
> > Should we clone the bug for this issue, so that we can proceed with this one here?
> 
> I'm unsure if this is a regression. If not then we can split that off.

No regression - also happens with the current released version 2.4.45+dfsg-1~bpo9+1A~4.4.0.202007242143 (4.4-6)
 
> I wonder if this is a variation of Bug #51722 or something entirely
> different.

No, I stopped univention-directory-notifier (u-d-n) and univention-directory-listener (u-d-l) before the test (forgot to mention that here).

So I will clone this bug, so that we can at least release the updated patches?
Comment 11 Arvid Requate univentionstaff 2020-12-18 14:14:43 CET
> No, stopped u-d-n and u-d-l before the test (forgot to mention here).

Ok, good. Sigh, at least we are not stumbling over our own feet here.


> So i will clone this bug, so we can at least release the updated patches?

Ok, if you are fine with that.
Comment 12 Felix Botner univentionstaff 2020-12-22 12:18:11 CET
Sorry, forgot one thing: please merge to UCS 5.0-0.
Comment 13 Arvid Requate univentionstaff 2020-12-23 15:21:48 CET
r19260 | merged patches

Package: openldap
Version: 2.4.47+dfsg-3+deb10u4A~5.0.0.202012231410
Branch: ucs_5.0-0
Comment 14 Felix Botner univentionstaff 2020-12-23 15:33:44 CET
You changed the patches in openldap/5.0-0-0-ucs/2.4.47+dfsg-3+deb10u2,
but we already have 2.4.47+dfsg-3+deb10u4 for 5.0-0, so please merge these changes to 2.4.47+dfsg-3+deb10u4.
Comment 15 Arvid Requate univentionstaff 2020-12-23 21:07:04 CET
Ah, damn, thanks:

r19261 | merged patches

Package: openldap
Version: 2.4.47+dfsg-3+deb10u4A~5.0.0.202012231957
Branch: ucs_5.0-0
Comment 16 Erik Damrose univentionstaff 2021-01-05 18:27:45 CET
Still valid from comment 8:
OK - 99_Bug37915_avoid_deadlock_and_race_condition.quilt (update to current upstream)
OK - 70_ppolicy_udm_lock.quilt
OK - Jenkins Tests

Remaining QA:
OK: Remaining issues moved to bug 52515
OK: Merge to UCS 5, 5.0-0-0-ucs/2.4.47+dfsg-3+deb10u4/

diff 99_Bug37915_avoid_deadlock_and_race_condition.quilt
diff 70_ppolicy_udm_lock.*

OK: Packages built with patches in 4.4 and 5.0
OK: ucs-test
OK: yaml
Verified