Bug 50617 - Replace repr() and ast.literal_eval() as wire protocol for UDM-CLI
Replace repr() and ast.literal_eval() as wire protocol for UDM-CLI
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UDM - CLI
UCS 4.4
Other Linux
: P5 major (vote)
: UCS 5.0
Assigned To: Florian Best
Arvid Requate
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2019-12-10 11:43 CET by Owen Synge
Modified: 2021-05-25 16:00 CEST (History)
2 users (show)

See Also:
What kind of report is it?: Release Management
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Owen Synge univentionstaff 2019-12-10 11:43:44 CET
Using `repr` and `ast.literal_eval` as wire protocol casues issues as these for the python 2 -> python3 transition as these will show strings (binary and unicode) differently on different version.
Comment 1 Owen Synge univentionstaff 2019-12-10 11:45:51 CET
This bug must be tested for backwards compatability.
Comment 2 Florian Best univentionstaff 2019-12-10 11:53:10 CET
ùnivention-cli-client transmits the repr() of the CLI arguments (udm users/user modify --dn ... --set ...).
→ This should be easily replaceable with JSON.

univention-cli-server transmits the repr() of the object representation.
→ This cannot easily be replaced with JSON as JSON can only contain UTF-8 and the python json library can only consume unicode-strings (everything which is bytestrings is decoded as UTF-8). Using json.dumps() arround this will either transform data wronly or will crash for any sequence which is not UTF-8-able.
So, we have to fix UDM's encoding in general:
WIP Idea: Every property must be decoded to a unicode string. Regular properties should be decoded as UTF-8 and every binary attribute can be decoded as ISO8859-1.
Comment 3 Owen Synge univentionstaff 2019-12-11 11:48:45 CET
I made a Branch origin/owen/json with this hash in it.
 
510950602697667ba9e9509e570a542fcef42f74

It seems to pass my testing in python2. I would be interested in finding where it breaks things.
Comment 4 Florian Best univentionstaff 2019-12-12 09:58:21 CET
Maybe instead of JSON we should use msgpack (https://msgpack.org/index.html). It's faster and binary-data instead of plain-text/json/repr (https://github.com/msgpack/msgpack/blob/master/spec.md).
But we must be cautious here, because it expects byte-strings or encode()s everything, which is a unicode string, as UTF-8

And then we also should transmit length information, so create some kind of simple protocol (using struct.pack()), instead of splitting at a null-byte.

Also I don't think we need backwards compatibility. The CLI-server is only a local process, which we can kill in the package's pre+postinst.
Comment 5 Owen Synge univentionstaff 2019-12-12 14:00:51 CET
These are all opinions of mine and are for discussion.

Performance/msg size does not look like a major issue in this case.

msgpack is new to me, and superficially looks like a good protocol. Have not yet had time to deep dive into it. But it does look more complex to use than JSON.

I think the issues of unicode string / UTF-8 is encoding is orthogonal as we will need to send all text in a format that is utf-8 / unicode format anyway.

With JSON, binary data `could` be encoded with base64. Not needing to this is main advantage of msgpack.
Comment 6 Florian Best univentionstaff 2019-12-20 12:29:43 CET
(In reply to Owen Synge from comment #5)
> These are all opinions of mine and are for discussion.
> 
> Performance/msg size does not look like a major issue in this case.
Yes.

> msgpack is new to me, and superficially looks like a good protocol. Have not
> yet had time to deep dive into it. But it does look more complex to use than
> JSON.
Okay, maybe this was an overhead idea.

> I think the issues of unicode string / UTF-8 is encoding is orthogonal as we
> will need to send all text in a format that is utf-8 / unicode format anyway.
Yep.

> With JSON, binary data `could` be encoded with base64. Not needing to this
> is main advantage of msgpack.
For most UDM modules having binary data we already use base64.
Comment 7 Arvid Requate univentionstaff 2020-03-12 19:33:05 CET
Fun fact: FIDO2/Webauthn uses the CBOR protocol (RFC 7049 - https://cbor.io/ - "made in Bremen"™ :-)), for bandwith-constrained communication with authenticator tokens. The protrocol RFC compares it to msgpack.

Thinking about this, I vaguely remember that we have a somewhat related issue in the context of the REST API, isn't it? If we need base64 encoded values there, then maybe there is some simplicity gained if we stick to this simple recipe and also use it for other socket communication like UDM-CLI (and UMCP or whatever that will be replaced with?).
Comment 8 Florian Best univentionstaff 2020-09-22 10:45:10 CEST
This has been fixed in:

univention-directory-manager-modules (15.0.0-1)
52fff92cccbc | Bug #50617 UDM: use JSON instead of repr/eval for communication

changelog-5.0-0.xml
60fb5d6c62e5 | Changelog Bug #50617
Comment 9 Arvid Requate univentionstaff 2021-03-02 15:56:36 CET
Ok
Comment 10 Florian Best univentionstaff 2021-05-25 16:00:23 CEST
UCS 5.0 has been released:
 https://docs.software-univention.de/release-notes-5.0-0-en.html
 https://docs.software-univention.de/release-notes-5.0-0-de.html

If this error occurs again, please use "Clone This Bug".