Bug 33520

Summary:	UDM module process dies when response contains binary data
Product:	UCS	Reporter:	Florian Best <best>
Component:	UMC - Domain management (Generic)	Assignee:	UMC maintainers <umc-maintainers>
Status:	RESOLVED WONTFIX	QA Contact:
Severity:	normal
Priority:	P5	CC:	gohmann, walkenhorst
Version:	UCS 3.1
Target Milestone:	UCS 3.2-x
Hardware:	Other
OS:	Linux
See Also:	https://forge.univention.org/bugzilla/show_bug.cgi?id=28070
Attachments:	fix encoding recursively if decoding fails

Description Florian Best

2013-11-21 16:26:09 CET

I opened a univention-bittorrent file in the LDAP directory tree and the UMC-UDM module process died with the following traceback:

21.11.13 16:14:06.866  MODULE      ( ERROR   ) : Traceback (most recent call last):
  File "/usr/sbin/univention-management-console-module", line 112, in <module>
    notifier.loop()
  File "/usr/lib/pymodules/python2.6/notifier/nf_generic.py", line 284, in loop
    step()
  File "/usr/lib/pymodules/python2.6/notifier/nf_generic.py", line 276, in step
    __min_timer = dispatch.dispatcher_run()
  File "/usr/lib/pymodules/python2.6/notifier/dispatch.py", line 72, in dispatcher_run
    if not disp():
  File "/usr/lib/pymodules/python2.6/notifier/threads.py", line 154, in _simple_threads_dispatcher
    task.announce()
  File "/usr/lib/pymodules/python2.6/notifier/threads.py", line 135, in announce
    self._callback( self, self._result )
  File "/usr/lib/pymodules/python2.6/notifier/__init__.py", line 104, in __call__
    return self._function( *tmp, **self._kwargs )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/udm/__init__.py", line 132, in _thread_finished
    self.finished( request.id, result )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/__init__.py", line 271, in finished
    self.result( res )
  File "/usr/lib/pymodules/python2.6/univention/management/console/modules/__init__.py", line 278, in result
    self.signal_emit( 'success', response )
  File "/usr/lib/pymodules/python2.6/notifier/signals.py", line 75, in signal_emit
    self.__signals[ signal ].emit( *args )
  File "/usr/lib/pymodules/python2.6/notifier/signals.py", line 41, in emit
    cb( *args )
  File "/usr/lib/pymodules/python2.6/notifier/__init__.py", line 104, in __call__
    return self._function( *tmp, **self._kwargs )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/modserver.py", line 109, in _reply
    self.response( msg )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/modserver.py", line 292, in response
    data = str( msg )
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 315, in __str__
    return Message._formattedMessage(self._id, self._type, self.mimetype, self.command, body, self.arguments)
  File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 119, in _formattedMessage
    data = json.dumps( body )
  File "/usr/lib/pymodules/python2.6/simplejson/__init__.py", line 261, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/pymodules/python2.6/simplejson/encoder.py", line 214, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/pymodules/python2.6/simplejson/encoder.py", line 282, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 197: invalid continuation byte

Comment 1 Janek Walkenhorst

2013-11-21 17:51:38 CET

The problem is
>   File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 119, in _formattedMessage
>     data = json.dumps( body )
trying to encode data of a type (bytes) that JSON cannot store
||
giving a <str> object instead of a <unicode> object to the json library

>   File "/usr/lib/pymodules/python2.6/simplejson/__init__.py", line 261, in dumps
>     return _default_encoder.encode(obj)
leading to the json library interpreting it as encoded text
||
leading to the json library having to guess/assume an encoding of some byte string

> UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 197: invalid continuation byte
and thus failing.
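
The failure can be reproduced outside UMC. The following is a minimal sketch, assuming Python 3 and the standard-library json module rather than the simplejson on Python 2.6 from the traceback. On Python 3 the same mistake surfaces as a TypeError from json.dumps, while decoding the bytes by hand shows the "invalid continuation byte" reason. The payload and the attribute name are made up for illustration.

```python
import json

# Hypothetical response body: one attribute holding raw bytes.
# 0xc6 starts a two-byte UTF-8 sequence, but 0x41 is not a continuation byte.
body = {"result": {"someBinaryAttribute": b"\xc6A"}}


def try_dump(obj):
    """Return the JSON text, or the name of the exception json.dumps raised."""
    try:
        return json.dumps(obj)
    except (TypeError, UnicodeDecodeError) as exc:
        return type(exc).__name__


outcome = try_dump(body)

# Decoding the same bytes as UTF-8 by hand gives the reason seen in the traceback.
try:
    b"\xc6A".decode("utf-8")
    decode_reason = None
except UnicodeDecodeError as exc:
    decode_reason = exc.reason
```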

Comment 2 Florian Best

2014-03-05 15:02:53 CET

Created attachment 5814
fix encoding recursively if decoding fails

IMHO this is not a UMC problem: json.dumps(b'\xe4') will of course always fail.

The attached patch assumes that any bytes which cannot be decoded as UTF-8 are ISO8859-1 and converts them to UTF-8 (this, at least, cannot fail).

JSON is not able to store binary data.
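
The approach described here might look roughly like the sketch below. This is not the attached patch, whose contents are not shown in this report. The function name fix_encoding and the sample data are hypothetical, and the code assumes Python 3.

```python
import json


def fix_encoding(value, fallback="iso8859-1"):
    """Recursively turn byte strings into text so json.dumps accepts them."""
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Every byte value maps to a code point in ISO8859-1,
            # so this fallback decode cannot raise.
            return value.decode(fallback)
    if isinstance(value, dict):
        return {
            fix_encoding(key, fallback): fix_encoding(item, fallback)
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [fix_encoding(item, fallback) for item in value]
    return value


# Hypothetical response body mixing valid UTF-8 with an undecodable byte.
body = {"cn": b"M\xc3\xbcller", "blob": [b"\xe4", 1, None]}
fixed = fix_encoding(body)
encoded = json.dumps(fixed)
```

The result serializes without an exception, but the undecodable value is now indistinguishable from ordinary text, so the receiver cannot tell that it was binary.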

Comment 3 Florian Best

2016-05-27 06:45:13 CEST

We can add an intelligent mechanism to detect this by using the following:

ldap.schema.subentry.NOT_HUMAN_READABLE_LDAP_SYNTAXES
 Dictionary where the keys are the OIDs of LDAP syntaxes known to be not human-readable when displayed to a console without conversion and which cannot be decoded to a types.UnicodeType.

When mapping the attribute from self.oldattr into self.info we can detect those attributes and decode them as latin-1 (ISO8859-1) (decoding bytes in latin-1 can never fail and preserves the original bytes).
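
A rough sketch of that idea, assuming Python 3. To stay runnable without python-ldap it uses a small stand-in dictionary instead of the real ldap.schema.subentry.NOT_HUMAN_READABLE_LDAP_SYNTAXES. The attribute-to-syntax table, the attribute name exampleBlob and the function map_attribute are hypothetical as well; in practice the syntax OID would be looked up in the LDAP schema.

```python
# Stand-in for ldap.schema.subentry.NOT_HUMAN_READABLE_LDAP_SYNTAXES
# (assumption: only two well-known syntax OIDs are listed here).
NOT_HUMAN_READABLE = {
    "1.3.6.1.4.1.1466.115.121.1.5": None,   # Binary
    "1.3.6.1.4.1.1466.115.121.1.40": None,  # Octet String
}

# Hypothetical attribute -> syntax OID table for illustration only.
ATTRIBUTE_SYNTAX = {
    "cn": "1.3.6.1.4.1.1466.115.121.1.15",           # Directory String
    "exampleBlob": "1.3.6.1.4.1.1466.115.121.1.40",  # Octet String
}


def map_attribute(name, values):
    """Decode raw LDAP values, using latin-1 for non-human-readable syntaxes."""
    binary = ATTRIBUTE_SYNTAX.get(name) in NOT_HUMAN_READABLE
    codec = "latin-1" if binary else "utf-8"
    return [value.decode(codec) for value in values]


text_values = map_attribute("cn", [b"M\xc3\xbcller"])
blob_values = map_attribute("exampleBlob", [b"\xc6\x00"])

# latin-1 maps each byte to the code point of the same number,
# so encoding the text again restores the original bytes.
all_bytes = bytes(range(256))
round_trip_ok = all_bytes.decode("latin-1").encode("latin-1") == all_bytes
```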

Comment 4 Janek Walkenhorst

2016-05-27 11:50:56 CEST

I'd much rather see some kind of encoding that makes the binary/opaque nature of the value apparent.
Like for example in LDIF where it is base64.
Otherwise we just push the problem down the line.

Decoding to ISO8859-1 seems conceptually wrong, and also probably confusing to the users. (In the same way that replacing the value with u'' may remove the Traceback but is not a "real" solution)

I'm also not sure that ISO8859-1 helps in all cases (e.g. 00₁₆, 08₁₆, &c.)
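
A minimal sketch of the base64 alternative, assuming Python 3. The wrapper object with a single "base64" key is a made-up convention for illustration, not something UMC or LDIF defines for JSON, and the helper names are hypothetical.

```python
import base64
import json


def encode_value(value):
    """Wrap bytes in a tagged base64 object; pass anything else through."""
    if isinstance(value, bytes):
        return {"base64": base64.b64encode(value).decode("ascii")}
    return value


def decode_value(value):
    """Reverse encode_value for objects carrying only the 'base64' key."""
    if isinstance(value, dict) and set(value) == {"base64"}:
        return base64.b64decode(value["base64"])
    return value


raw = b"\x00\x08\xc6"
wire = json.dumps(encode_value(raw))
restored = decode_value(json.loads(wire))

# For comparison: latin-1 decoding of the same bytes yields control characters.
as_latin1 = raw.decode("latin-1")
```

With this shape the receiver can see that the value is opaque and recover the exact bytes, whereas the latin-1 variant hands it a string that begins with NUL and backspace characters.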

Comment 5 Florian Best

2016-05-27 11:54:29 CEST

(In reply to Janek Walkenhorst from comment #4)
Yes, base64 would be better of course.

Comment 6 Stefan Gohmann

2017-06-16 20:40:50 CEST

This issue has been filed against UCS 3. UCS 3 is out of normal maintenance and many UCS components have vastly changed in UCS 4.

If this issue is still valid, please change the version to a newer UCS version; otherwise this issue will be closed automatically in the next few weeks.

Comment 7 Stefan Gohmann

2017-08-08 07:09:41 CEST

This issue has been filed against UCS 3.1.

UCS 3.1 is out of maintenance and many UCS components have vastly changed in later releases. Thus, this issue is now being closed.

If this issue still occurs in newer UCS versions, please use "Clone this bug" or reopen this issue. In this case please provide detailed information on how this issue is affecting you.