Bug 56403 - Recommended file format is encoded to utf-8 instead of utf-16
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: UMC - Class lists
Version: UCS@school 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS@school 5.0 v4-errata
Assigned To: Alexander Steffen
QA Contact: Ole Schwiegert
Depends on:
Blocks:
Reported: 2023-08-04 17:02 CEST by Jan-Luca Kiok
Modified: 2023-10-26 11:22 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 3: Simply Wrong: The implementation doesn't match the docu
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.103
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): Regression
Max CVSS v3 score:


Description Jan-Luca Kiok univentionstaff 2023-08-04 17:02:10 CEST
For compatibility reasons, the recommended file format for class list exports is tab-separated (\t) and UTF-16 (little-endian) encoded, with a BOM at the start.

Since https://forge.univention.org/bugzilla/show_bug.cgi?id=55102, however, the file has been encoded as UTF-8 instead.
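
For illustration, the recommended format could be produced roughly as follows. This is only a minimal sketch using the Python standard library, not the module's actual code; the function name encode_class_list, the column names, and the "\r\n" line terminator are assumptions made for the example.

```python
import codecs
import csv
import io


def encode_class_list(fieldnames, rows):
    """Return the rows as tab-separated CSV bytes in UTF-16 LE with a BOM.

    Hypothetical helper for illustration; not part of ucs-school-umc-lists.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\r\n")
    writer.writerow(fieldnames)
    writer.writerows(rows)
    # The "utf-16-le" codec never writes a BOM itself, so prepend it explicitly.
    return codecs.BOM_UTF16_LE + buffer.getvalue().encode("utf-16-le")


data = encode_class_list(["Firstname", "Lastname"], [["Zoë", "Müller"]])
```

A file written this way starts with the bytes FF FE, which is how a consumer can tell it apart from a UTF-8 export.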
Comment 2 Florian Best univentionstaff 2023-08-04 17:09:06 CEST
I assume a patch could be:

diff --git ucs-school-umc-lists/umc/python/schoollists/__init__.py ucs-school-umc-lists/umc/python/schoollists/__init__.py
index 6e478cafa..1deefa9a5 100644
--- ucs-school-umc-lists/umc/python/schoollists/__init__.py
+++ ucs-school-umc-lists/umc/python/schoollists/__init__.py
@@ -73,7 +73,7 @@ class Instance(SchoolBaseModule):
         filename = os.path.join(path, os.path.basename(classlist))
         try:
             with open(filename, "rb") as fd:
-                self.finished(request.id, fd.read(), mimetype="text/csv")
+                self.finished(request.id, fd.read(), mimetype='text/csv; charset="UTF-16"')
         except EnvironmentError:
             raise UMC_Error(
                 _("The class list does not exists. Please create a new one."),
@@ -122,9 +122,9 @@ class Instance(SchoolBaseModule):
         timestamp = datetime.now().strftime("%Y-%m-%d_%H_%M_%S")
         filename = "%s_%s-%s.csv" % (classlistname.replace("/", "_"), timestamp, uuid.uuid4())
         path = os.path.join("/usr/share/ucs-school-umc-lists/classlists/", filename)
-        with open(path, "w") as fd:
+        with open(path, "wb") as fd:
             os.chmod(path, 0o600)
-            fd.write(write_classlist_csv(fieldnames, rows, separator))
+            fd.write(write_classlist_csv(fieldnames, rows, separator).encode('utf-16'))
 
         url = "/univention/command/schoollists/csvlistget?classlist=%s" % (quote(filename),)
         self.finished(
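
One detail that may matter for the patch above: Python's generic "utf-16" codec writes a BOM followed by data in the byte order of the machine it runs on, so it only yields little-endian output on little-endian hosts. The sketch below compares it with an explicit little-endian encoding; the sample text is made up for the example.

```python
import codecs
import sys

text = "Name\tClass\n"

# Generic codec: BOM plus data in the host's native byte order.
generic = text.encode("utf-16")

# Explicit variant: always little-endian, with the BOM added by hand.
explicit = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

native_bom = codecs.BOM_UTF16_LE if sys.byteorder == "little" else codecs.BOM_UTF16_BE
starts_with_native_bom = generic.startswith(native_bom)

# Both results are identical only on little-endian machines.
same_on_this_machine = (generic == explicit) == (sys.byteorder == "little")
```

On typical x86 servers both variants give the same bytes, so the difference would only show up on big-endian hardware.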
Comment 3 Jan-Luca Kiok univentionstaff 2023-08-04 17:35:04 CEST
Yes, sounds promising. Without having looked into this further: we have two possible file formats, and the other one should remain UTF-8 encoded. Does your proposal take this into account, or will every file be UTF-16?
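
One way to keep the two formats apart would be to pick the encoding based on the separator. The following is only a sketch under the assumption that the tab-separated format is the one that needs UTF-16 and that every other separator stays UTF-8; the helper name encode_for_separator is hypothetical and does not exist in the module.

```python
import codecs


def encode_for_separator(text, separator):
    """Encode CSV text depending on the chosen separator.

    Assumption for this example: tab-separated exports use UTF-16 LE
    with a BOM, every other separator keeps plain UTF-8.
    """
    if separator == "\t":
        return codecs.BOM_UTF16_LE + text.encode("utf-16-le")
    return text.encode("utf-8")


tab_bytes = encode_for_separator("a\tb\n", "\t")
comma_bytes = encode_for_separator("a,b\n", ",")
```

If the export were written this way, the mimetype charset in the download handler would need to follow the same decision.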
Comment 4 Alexander Steffen univentionstaff 2023-10-10 08:01:45 CEST
Fixed with:

Package: ucs-school-umc-lists
Version: 3.0.9
Branch: ucs_5.0-0
Scope: ucs-school-5.0
Comment 5 Tobias Wenzel univentionstaff 2023-10-24 15:11:05 CEST
Everything looks OK and the issue is closed; set to verified.

https://git.knut.univention.de/univention/ucsschool/-/issues/1079#note_223994
Comment 6 Tobias Wenzel univentionstaff 2023-10-26 11:22:14 CEST
Errata updates for UCS@school 5.0 v4 have been released.

https://docs.software-univention.de/ucsschool-changelog/5.0v4/en/changelog.html
https://docs.software-univention.de/ucsschool-changelog/5.0v4/de/changelog.html

If this error occurs again, please clone this bug.