Univention Bugzilla – Bug 33520
UDM module process dies when response contains binary data
Last modified: 2017-08-08 07:09:41 CEST
I opened a univention-bittorrent file in the LDAP directory tree and the UMC-UDM module process died with the following traceback:

21.11.13 16:14:06.866 MODULE ( ERROR ) : Traceback (most recent call last):
  File "/usr/sbin/univention-management-console-module", line 112, in <module>
    notifier.loop()
  File "/usr/lib/pymodules/python2.6/notifier/nf_generic.py", line 284, in loop
    step()
  File "/usr/lib/pymodules/python2.6/notifier/nf_generic.py", line 276, in step
    __min_timer = dispatch.dispatcher_run()
  File "/usr/lib/pymodules/python2.6/notifier/dispatch.py", line 72, in dispatcher_run
    if not disp():
  File "/usr/lib/pymodules/python2.6/notifier/threads.py", line 154, in _simple_threads_dispatcher
    task.announce()
  File "/usr/lib/pymodules/python2.6/notifier/threads.py", line 135, in announce
    self._callback( self, self._result )
  File "/usr/lib/pymodules/python2.6/notifier/__init__.py", line 104, in __call__
    return self._function( *tmp, **self._kwargs )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/udm/__init__.py", line 132, in _thread_finished
    self.finished( request.id, result )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/__init__.py", line 271, in finished
    self.result( res )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/__init__.py", line 278, in result
    self.signal_emit( 'success', response )
  File "/usr/lib/pymodules/python2.6/notifier/signals.py", line 75, in signal_emit
    self.__signals[ signal ].emit( *args )
  File "/usr/lib/pymodules/python2.6/notifier/signals.py", line 41, in emit
    cb( *args )
  File "/usr/lib/pymodules/python2.6/notifier/__init__.py", line 104, in __call__
    return self._function( *tmp, **self._kwargs )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/modserver.py", line 109, in _reply
    self.response( msg )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/modserver.py", line 292, in response
    data = str( msg )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 315, in __str__
    return Message._formattedMessage(self._id, self._type, self.mimetype, self.command, body, self.arguments)
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 119, in _formattedMessage
    data = json.dumps( body )
  File "/usr/lib/pymodules/python2.6/simplejson/__init__.py", line 261, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/pymodules/python2.6/simplejson/encoder.py", line 214, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/pymodules/python2.6/simplejson/encoder.py", line 282, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 197: invalid continuation byte
The problem is

> File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 119, in _formattedMessage
>     data = json.dumps( body )

trying to encode data of a type (bytes) that JSON cannot store. Put differently, it gives a <str> object instead of a <unicode> object to the json library,

> File "/usr/lib/pymodules/python2.6/simplejson/__init__.py", line 261, in dumps
>     return _default_encoder.encode(obj)

leading to the json library interpreting it as encoded text, that is, having to guess/assume an encoding for some byte string,

> UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 197: invalid continuation byte

and thus failing.
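The decode step that fails can be reproduced in isolation. This is only a sketch written for Python 3: the sample bytes are made up, and the explicit decode('utf-8') call stands in for the implicit decoding that simplejson performs on Python 2 <str> objects. (On Python 3, json.dumps() rejects bytes outright with a TypeError instead of guessing an encoding.)

```python
# Made-up sample: ASCII text followed by 0xc6, which starts a two-byte
# UTF-8 sequence but is followed here by a space instead of a
# continuation byte.
raw = b'd8:announce\xc6 rest'

try:
    raw.decode('utf-8')
    failure = None
except UnicodeDecodeError as exc:
    # exc.reason is the textual cause, exc.start the offending offset
    failure = (exc.reason, exc.start)

print(failure)
```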
Created attachment 5814
fix encoding recursively if decoding fails

IMHO this is not a UMC problem: json.dumps(b'\xe4') will of course always fail. The attached patch assumes that bytes which cannot be decoded as UTF-8 are ISO8859-1 and transforms them to UTF-8 (this, at least, cannot fail). JSON is not able to store binary data.
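The fallback described in this comment could look roughly like the following. This is not the attached patch, only a minimal Python 3 sketch of the idea; the function name to_text and the sample body are assumptions made for illustration.

```python
import json


def to_text(value):
    """Recursively convert byte strings to text so json.dumps() accepts them."""
    if isinstance(value, bytes):
        try:
            return value.decode('utf-8')
        except UnicodeDecodeError:
            # Assumption from the comment: anything that is not valid
            # UTF-8 is treated as ISO8859-1, which maps every byte.
            return value.decode('iso8859-1')
    if isinstance(value, dict):
        return dict((to_text(k), to_text(v)) for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return [to_text(v) for v in value]
    return value


# Hypothetical response body with one text and one binary value
body = {'result': [b'plain', b'\xc6\x00binary']}
encoded = json.dumps(to_text(body))
```

The call no longer raises, but the binary value reaches the client as arbitrary-looking text, which is the trade-off discussed in the following comments.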
We can add an intelligent mechanism to detect this by using the following:

ldap.schema.subentry.NOT_HUMAN_READABLE_LDAP_SYNTAXES
    Dictionary where the keys are the OIDs of LDAP syntaxes known to be not human-readable when displayed to a console without conversion and which cannot be decoded to a types.UnicodeType.

When mapping the attribute from self.oldattr into self.info we can detect those attributes and decode them as latin-1 (ISO8859-1). Decoding bytes as latin-1 can never fail and preserves the original bytes.
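A rough Python 3 sketch of that detection follows. To keep it runnable without python-ldap, a small hand-written dictionary stands in for NOT_HUMAN_READABLE_LDAP_SYNTAXES; the attribute-to-syntax table, the attribute name exampleBinaryAttr and the function map_attributes are all hypothetical, and real code would look the syntax up in the LDAP schema.

```python
# Stand-in for the python-ldap dictionary (assumption): keys are the OIDs
# of the standard Binary, JPEG and Octet String syntaxes.
BINARY_SYNTAX_OIDS = {
    '1.3.6.1.4.1.1466.115.121.1.5': None,
    '1.3.6.1.4.1.1466.115.121.1.28': None,
    '1.3.6.1.4.1.1466.115.121.1.40': None,
}

# Hypothetical attribute -> syntax OID table
ATTRIBUTE_SYNTAX = {
    'exampleBinaryAttr': '1.3.6.1.4.1.1466.115.121.1.40',  # Octet String
    'description': '1.3.6.1.4.1.1466.115.121.1.15',        # Directory String
}


def map_attributes(oldattr):
    """Decode raw LDAP values, using latin-1 for non-human-readable syntaxes."""
    info = {}
    for name, values in oldattr.items():
        binary = ATTRIBUTE_SYNTAX.get(name) in BINARY_SYNTAX_OIDS
        codec = 'iso8859-1' if binary else 'utf-8'
        info[name] = [value.decode(codec) for value in values]
    return info


oldattr = {
    'exampleBinaryAttr': [b'\x00\x08\xc6'],
    'description': [b'caf\xc3\xa9'],
}
info = map_attributes(oldattr)
```

Encoding the decoded value back with 'iso8859-1' yields the original bytes, which is the "preserves the original bytes" property mentioned above.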
I'd much rather see some kind of encoding that makes the binary/opaque nature of the value apparent, for example base64 as in LDIF. Otherwise we just push the problem down the line. Decoding to ISO8859-1 seems conceptually wrong, and also probably confusing to the users. (In the same way, replacing the value with u'' may remove the traceback but is not a "real" solution.) I'm also not sure that ISO8859-1 helps in all cases (e.g. 00₁₆, 08₁₆, &c.)
(In reply to Janek Walkenhorst from comment #4) Yes, base64 would be better of course.
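The base64 variant might be sketched like this in Python 3. The wrapper key '__base64__' and the helper names are made up for illustration; nothing in this report specifies how UMC would mark such values.

```python
import base64
import json


def wrap_binary(raw):
    """Represent opaque bytes as a JSON-safe, clearly marked base64 value."""
    return {'__base64__': base64.b64encode(raw).decode('ascii')}


def unwrap_binary(wrapped):
    """Recover the original bytes from a wrapped value."""
    return base64.b64decode(wrapped['__base64__'])


# Includes control bytes (0x00, 0x08) as well as non-UTF-8 bytes
raw = b'\x00\x08\xc6\xff'
payload = json.dumps({'data': wrap_binary(raw)})
restored = unwrap_binary(json.loads(payload)['data'])
```

Unlike the ISO8859-1 approach, the receiver can tell that the value is binary and gets the exact bytes back, including control characters.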
This issue has been filed against UCS 3. UCS 3 is out of the normal maintenance and many UCS components have vastly changed in UCS 4. If this issue is still valid, please change the version to a newer UCS version otherwise this issue will be automatically closed in the next weeks.
This issue has been filed against UCS 3.1. UCS 3.1 is out of maintenance and many UCS components have vastly changed in later releases. Thus, this issue is now being closed. If this issue still occurs in newer UCS versions, please use "Clone this bug" or reopen this issue. In this case please provide detailed information on how this issue is affecting you.