Univention Bugzilla – Full Text Bug Listing |
Summary: | UDM module process dies when response contains binary data | ||
---|---|---|---|
Product: | UCS | Reporter: | Florian Best <best> |
Component: | UMC - Domain management (Generic) | Assignee: | UMC maintainers <umc-maintainers> |
Status: | RESOLVED WONTFIX | QA Contact: | |
Severity: | normal | ||
Priority: | P5 | CC: | gohmann, walkenhorst |
Version: | UCS 3.1 | ||
Target Milestone: | UCS 3.2-x | ||
Hardware: | Other | ||
OS: | Linux | ||
See Also: | https://forge.univention.org/bugzilla/show_bug.cgi?id=28070 | ||
Attachments: | fix encoding recursively if decoding fails |
Description
Florian Best
2013-11-21 16:26:09 CET
The problem is
> File "/usr/lib/pymodules/python2.6/univention/management/console/protocol/message.py", line 119, in _formattedMessage
>   data = json.dumps( body )
trying to encode data of a type (bytes) that JSON cannot store, i.e. giving a <str> object instead of a <unicode> object to the json library
> File "/usr/lib/pymodules/python2.6/simplejson/__init__.py", line 261, in dumps
>   return _default_encoder.encode(obj)
leading to the json library interpreting it as encoded text, i.e. having to guess/assume an encoding of some byte string
> UnicodeDecodeError: 'utf8' codec can't decode byte 0xc6 in position 197: invalid continuation byte
and thus failing.
Created attachment 5814
fix encoding recursively if decoding fails
IMHO this is not a UMC problem: json.dumps(b'\xe4') will of course always fail.
The attached patch assumes that bytes which cannot be decoded as UTF-8 are ISO8859-1 and transforms them to UTF-8 (this, at least, cannot fail).
JSON is not able to store binary data.
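The fallback described for the attached patch can be sketched roughly as follows. This is not the patch itself: it is a minimal illustration written for Python 3 (the report concerns Python 2.6), and the helper name to_text is hypothetical.

```python
import json


def to_text(value):
    """Recursively turn byte strings into text so json.dumps() accepts them.

    Hypothetical helper, not taken from the attached patch.
    """
    if isinstance(value, bytes):
        try:
            return value.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption from the patch description: anything that is not
            # valid UTF-8 is treated as ISO8859-1, which cannot fail.
            return value.decode("iso8859-1")
    if isinstance(value, dict):
        return {to_text(key): to_text(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_text(item) for item in value]
    return value


# Illustrative response body: valid UTF-8, a lone 0xe4 byte, and a number.
body = {"result": [b"caf\xc3\xa9", b"\xe4", 42]}
encoded = json.dumps(to_text(body))
```

With this fallback the lone byte 0xe4 ends up as the character U+00E4, so the serialisation succeeds, but the receiver can no longer tell that the value was originally opaque bytes.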
We can add an intelligent mechanism to detect this by using the following:
ldap.schema.subentry.NOT_HUMAN_READABLE_LDAP_SYNTAXES: Dictionary where the keys are the OIDs of LDAP syntaxes known to be not human-readable when displayed to a console without conversion and which cannot be decoded to a types.UnicodeType.
When mapping the attribute from self.oldattr into self.info we can detect those attributes and decode them as latin-1 (ISO8859-1) (decoding bytes in latin-1 can never fail and preserves the original bytes).

I'd much rather see some kind of encoding that makes the binary/opaque nature of the value apparent, for example base64 as in LDIF. Otherwise we just push the problem down the line.
Decoding to ISO8859-1 seems conceptually wrong, and also probably confusing to the users. (In the same way that replacing the value with u'' may remove the Traceback but is not a "real" solution.)
I'm also not sure that ISO8859-1 helps in all cases (e.g. 00₁₆, 08₁₆, &c.)

(In reply to Janek Walkenhorst from comment #4)
Yes, base64 would be better of course.

This issue has been filed against UCS 3.
UCS 3 is out of the normal maintenance and many UCS components have vastly changed in UCS 4.
If this issue is still valid, please change the version to a newer UCS version; otherwise this issue will be automatically closed in the next weeks.

This issue has been filed against UCS 3.1.
UCS 3.1 is out of maintenance and many UCS components have vastly changed in later releases. Thus, this issue is now being closed.
If this issue still occurs in newer UCS versions, please use "Clone this bug" or reopen this issue. In this case please provide detailed information on how this issue is affecting you.
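
As a rough illustration of the base64 idea raised in the comments above, byte values could be wrapped in a marker object before serialisation so that their opaque nature stays visible and the original bytes can be restored. This is only a sketch in Python 3: the marker key "__base64__", the helper names and the sample attribute are assumptions for the example, not part of UMC or UDM.

```python
import base64
import json

MARKER = "__base64__"  # hypothetical key name, chosen for this example only


def wrap_binary(value):
    """Recursively replace bytes with a {MARKER: <base64 text>} object."""
    if isinstance(value, bytes):
        return {MARKER: base64.b64encode(value).decode("ascii")}
    if isinstance(value, dict):
        return {key: wrap_binary(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [wrap_binary(item) for item in value]
    return value


def unwrap_binary(value):
    """Inverse of wrap_binary() for data coming back from json.loads()."""
    if isinstance(value, dict):
        if set(value) == {MARKER}:
            return base64.b64decode(value[MARKER])
        return {key: unwrap_binary(item) for key, item in value.items()}
    if isinstance(value, list):
        return [unwrap_binary(item) for item in value]
    return value


# Illustrative payload containing the bytes mentioned in the discussion.
raw = b"\x00\x08\xc6\xe4"
wire = json.dumps(wrap_binary({"jpegPhoto": raw}))
restored = unwrap_binary(json.loads(wire))
```

Unlike the ISO8859-1 fallback, this keeps bytes such as 00₁₆ and 08₁₆ out of the text stream entirely, at the cost of requiring the receiving side to understand the marker.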