Univention Bugzilla – Full Text Bug Listing |
Summary: | Test 25reconnect_uldap failed in jenkins | ||
---|---|---|---|
Product: | UCS Test | Reporter: | Alexander Kramer <kramer> |
Component: | General | Assignee: | Philipp Hahn <hahn> |
Status: | CLOSED FIXED | QA Contact: | Stefan Gohmann <gohmann> |
Severity: | normal | ||
Priority: | P5 | CC: | best, gohmann |
Version: | unspecified | Keywords: | interim-2 |
Target Milestone: | UCS 4.0 | ||
Hardware: | Other | ||
OS: | Linux | ||
See Also: |
https://forge.univention.org/bugzilla/show_bug.cgi?id=34292 https://forge.univention.org/bugzilla/show_bug.cgi?id=35148 |
||
What kind of report is it?: | --- | What type of bug is this?: | --- |
Who will be affected by this bug?: | --- | How will those affected feel about the bug?: | --- |
User Pain: | Enterprise Customer affected?: | ||
School Customer affected?: | ISV affected?: | ||
Waiting Support: | Flags outvoted (downgraded) after PO Review: | ||
Ticket number: | Bug group (optional): | ||
Max CVSS v3 score: | |||
Bug Depends on: | 35841, 35852 | ||
Bug Blocks: | |||
Attachments: |
verbose strace
verbose stdout |
Description
Alexander Kramer
2014-08-28 13:34:52 CEST
The strace for the process give the following endless output: select(0, NULL, NULL, NULL, {0, 4286}) = 0 (Timeout) select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout) select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout) ... If you run the script from the shell it also gets stuck after printing: ************************************************************************ Test: connect, stop, - start after 9 seconds, search ************************************************************************ [ ok ] Stopping ldap server(s): slapd ...done. [ ok ] Starting ldap server(s): slapd ...done. [ ok ] Checking Schema ID: ...done. And the endless strace for this process looks the same: select(0, NULL, NULL, NULL, {0, 5463}) = 0 (Timeout) select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout) select(0, NULL, NULL, NULL, {0, 10000}) = 0 (Timeout) ... Created attachment 6077 [details]
verbose strace
Created attachment 6078 [details]
verbose stdout
r53174: I added a missing wait to a subprocess call. "gdb -p %1" shows the process with a single thread hanging in /usr/lib/python2.7/dist-packages/ldap/ldapobject.py:766 def reconnect(self,uri): while self._pending_reconnect: time.sleep(0.01) else: self._pending_reconnect = 1 "timeval.tv_usec = 10_000" equals those 0.01s. Adding that .wait()ing for the restart terminates the stall, but the test fails then for the negative check (delaying 11s while only 10s are allowed). python-ldap_2.4.9 broke it: * ldapobject.ReconnectLDAPObject.reconnect() now does kind of an internal locking to pause other threads while reconnecting is pending. UCS-4.0 uses 2.4.10, which is the latest version in Debian. Debian-RFP 2.4.15: <https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=742487> May-be fixed by 2.4.13: <https://mail.python.org/pipermail/python-ldap/2014q1/003327.html> * ldapobject.ReconnectLDAPObject now uses internal class-wide lock for serializing reconnects * Method signature of ReconnectLDAPObject.reconnect() changed to be able to call it with separate retry_max and retry_delay values See <https://mail.python.org/pipermail/python-ldap/2013q3/003265.html> for a patch; this alone does not seem to fix the bug. A quick test with a self-compiled 2.4.15 also failed:The process is then multi-threaded and hangs in some lock. The thread links to <https://bugs.launchpad.net/ldapuserfolder/+bug/650371>. See Bug #35841 for a patch for python-ldap. The patch is not enough to fix the test case, as some internal working seems to have changes. (In reply to Stefan Gohmann from comment #4) > r53174: I added a missing wait to a subprocess call. this patch needs to be reverted, as it breaks the whole idea of the test case: slapd is (re-)started *while* the main test case is running to test the reconnect behavior. By adding the .wait() the tests runs only after the slapd is fully working again. r53510 | Bug #35741 test: 10_ldap/25reconnect_uldap: revert subprocess.call.wait r53174 was reverted as Bug #35841 and Bug #35852 have been fixed. Package: ucs-test Version: 5.0.9-1.794.201409100922 Branch: ucs_4.0-0 No changelog as this is only a revert, so no visible source code change. OK, looks good. UCS 4.0-0 has been released: http://docs.univention.de/release-notes-4.0-0-en.html http://docs.univention.de/release-notes-4.0-0-de.html If this error occurs again, please use "Clone This Bug". |