Univention Bugzilla – Bug 34648
Invalid pickle files
Last modified: 2017-09-20 15:03:32 CEST
After creating an invalid connector pickle file, the connector stops working. This could happen if the listener was killed: echo "foobar" >/var/lib/univention-connector/s4/1392031001.854120 The connector should write a debug message and the corrupted pickle files should be ignored.
Traceback from status.log: /s4connector/__init__.py", line 948, in poll_ucs self._generate_dn_list_from(files) File "/usr/lib/pymodules/python2.6/univention/s4connector/__init__.py", line 917, in _generate_dn_list_from dn,new,old,old_dn=cPickle.load(f) UnpicklingError: invalid load key, 'f'.
Traceback (most recent call last): File "/usr/lib/pymodules/python2.6/univention/s4connector/s4/main.py", line 266, in main connect() File "/usr/lib/pymodules/python2.6/univention/s4connector/s4/main.py", line 175, in connect s4.initialize_ucs() File "/usr/lib/pymodules/python2.6/univention/s4connector/__init__.py", line 854, in initialize_ucs self.poll_ucs() File "/usr/lib/pymodules/python2.6/univention/s4connector/__init__.py", line 948, in poll_ucs self._generate_dn_list_from(files) File "/usr/lib/pymodules/python2.6/univention/s4connector/__init__.py", line 917, in _generate_dn_list_from dn,new,old,old_dn=cPickle.load(f) UnpicklingError: invalid load key, 'f'.
This happened to us several times when a school server probably crashed while the listener was creating pickle files for the connector, resulting in 0 byte files. In one case this wasn't even noticed until months later.
Created attachment 9005 [details] 34648-s4c-pickle-error-handling-421.patch With this patch the S4C ignores invalid pickle files and saves them as rejected. Unpickling can throw an `UnpicklingError` or an `EOFError`. In both cases, a message is logged, the object ignored and saved as rejected. It is saved as rejected, to only process the invalid file once. Deleting the invalid file would do the trick aswell, but might make debugging harder.
Committed in r81410 (advisory r81412).
*** Bug 41450 has been marked as a duplicate of this bug. ***
Created attachment 9148 [details] bug34648_qa.diff > It is saved as rejected, to only process the invalid file once. Rejects are retried every connector/s4/retryrejected sync runs. See attached diff for a suggestion: ============================================================================= 24.08.2017 23:33:32,150 LDAP (PROCESS): sync from ucs: Resync rejected file: /var/lib/univention-connector/s4/1503610232.484407 24.08.2017 23:33:32,150 LDAP (WARNING): __sync_file_from_ucs: invalid pickle file /var/lib/univention-connector/s4/1503610232.484407: unpickling stack underflow 24.08.2017 23:34:27,495 LDAP (PROCESS): sync from ucs: Resync rejected file: /var/lib/univention-connector/s4/1503610232.484407 24.08.2017 23:34:27,495 LDAP (WARNING): __sync_file_from_ucs: invalid pickle file /var/lib/univention-connector/s4/1503610232.484407: unpickling stack underflow ============================================================================= I'd rather see the message at log level "Error", also in the diff.
Created attachment 9166 [details] 34648-invalid-pickle-s4-421.patch (In reply to Arvid Requate from comment #7) > Rejects are retried every connector/s4/retryrejected sync runs. See attached > diff for a suggestion: > > if dn in ('ERROR: broken file',): > continue This is really bad in terms of maintainability. We should introduce a field that marks if a resync should be attempted. A somewhat a little more general (but still not optimal and untested) proposal is attached.
Your proposal reminds me of extended DNs in AD, so maybe we can mimic that format: <ERROR=broken file>;$dn Rationale for my approach: 1. If we keep the reject, it should be visible in uinvention-s4connector-list-rejected 2. I'd like to convey a bit more info about the nature of the reject than just "unknown"
Created attachment 9167 [details] 34648-invalid-pickle-s4-421.patch The attached patch refines the `<NORESYNC>` marker: An optional reason can be given. In the invalid pickle file case `reason='broken file'`. This introduces a new optional parameter to `list_rejected_ucs()` to filter `<NORESYNC>` marked DNs. The connector will not attempt to resync files marked with `<NORESYNC>`, `univention-s4connector-list-rejected` still works as expected: root@ucs-master40:~# univention-s4connector-list-rejected UCS rejected 1: UCS DN: <NORESYNC=broken file>;unknown S4 DN: <not found> Filename: /var/lib/univention-connector/s4/1502791035.726657 S4 rejected last synced USN: 3821 4.2-1: r82611/82612, YAML: r82613 4.2-2: r82614/r82615, YAML: r82616
I've also reverted unreleased errata4.2-1 changes (r81361:82612) since Erratum 114 from the ucs-4.2-1 SVN branch and merged the unpublished errata4.2-1 advisory into the errata4.2-2 advisory. So everything's up to date in the ucs-4.2-2 branch.
Ok, now I just have one grievance, I cannot address which reject I want to remove, as the DNs are all the same: root@master10:~# /usr/share/univention-s4-connector/remove_ucs_rejected.py \ '<NORESYNC=broken file>;unknown' The rejected UCS object <NORESYNC=broken file>;unknown has been removed. That's not much of an issue, but maybe we can add the filename like: '<NORESYNC=broken file: 1502791035.726657>;unknown' I guess using a colon as separator would be ok here (see http://www.rlmueller.net/CharactersEscaped.htm ).
(In reply to Arvid Requate from comment #12) > That's not much of an issue, but maybe we can add the filename like: > '<NORESYNC=broken file: 1502791035.726657>;unknown' Implemented in 6f7ea6a, YAML 219619d. root@ucs-master40:~# /usr/share/univention-s4-connector/remove_ucs_rejected.py "<NORESYNC=broken file:1502796706.650481>;unknown" The rejected UCS object <NORESYNC=broken file:1502796706.650481>;unknown has been removed.
Nice.
<http://errata.software-univention.de/ucs/4.2/167.html>