Bug 41934 - Regression: legacy import suddenly does normalization for first name and last name
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Import scripts
Version: UCS@school 4.1 R2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS@school 4.1 R2 vXXX
Assigned To: Daniel Tröder
QA Contact: Florian Best
Keywords: interim-2
Depends on:
Blocks:
Reported: 2016-08-09 13:18 CEST by Sönke Schwardt-Krummrich
Modified: 2016-11-10 16:00 CET (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 5: Will affect all installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.429
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): External feedback
Max CVSS v3 score:


Attachments
no name normalization + ucs-test (6.49 KB, patch)
2016-09-28 12:21 CEST, Daniel Tröder

Description Sönke Schwardt-Krummrich univentionstaff 2016-08-09 13:18:26 CEST
Regression: The legacy import script suddenly does normalization for first name and last name. At least one customer uses correct UTF8-coded names in import files that are now converted back to ASCII.

The legacy import should not use normalization for first names and last names.
Comment 1 Florian Best univentionstaff 2016-08-15 12:22:53 CEST
(In reply to Sönke Schwardt-Krummrich from comment #0)
> At least one customer uses correct UTF8-coded names in
> import files that are now converted back to ASCII.

What does this mean? UTF-8 is ASCII compatible.
Comment 2 Sönke Schwardt-Krummrich univentionstaff 2016-08-15 12:39:45 CEST
(In reply to Florian Best from comment #1)
> (In reply to Sönke Schwardt-Krummrich from comment #0)
> > At least one customer uses correct UTF8-coded names in
> > import files that are now converted back to ASCII.
>
> What does this mean? UTF-8 is ASCII compatible.

The legacy import converts umlauts (UTF-8) back to ASCII characters (ö → oe).

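Btw: UTF-8 is only backward compatible with ASCII. ASCII is a subset of UTF-8, so characters outside that subset (such as umlauts) cannot be represented in ASCII without being replaced.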
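
To illustrate the point above, here is a minimal Python sketch. It is not taken from the import script: the mapping table and the helper names `transliterate` and `is_ascii` are assumptions made up for this example. It shows that an umlaut is valid UTF-8 but cannot be encoded as ASCII, and what an "ö → oe" style replacement does to a name.

```python
# Hypothetical illustration only; this is not the code used by the legacy import.
# The mapping covers German umlauts and sharp s, as in the "ö -> oe" example above.
UMLAUT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def transliterate(name):
    """Replace each mapped character with its ASCII spelling."""
    return "".join(UMLAUT_MAP.get(ch, ch) for ch in name)


def is_ascii(text):
    """Return True if the text can be encoded as ASCII without loss."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    name = "Mäyer"
    # Pure ASCII text has the same bytes in ASCII and UTF-8 ...
    print("Meyer".encode("utf-8") == "Meyer".encode("ascii"))
    # ... but an umlaut needs two bytes in UTF-8 and has no ASCII encoding.
    print(len(name.encode("utf-8")), is_ascii(name))
    print(transliterate(name))
```

The replacement is lossy: once "Mäyer" has been stored as "Maeyer", the original spelling cannot be recovered reliably, which is why the normalization matters for customers who supply correct UTF-8 names.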
Comment 3 Daniel Tröder univentionstaff 2016-09-28 12:21:31 CEST
Created attachment 8040 [details]
no name normalization + ucs-test

Attached patch:
* do not normalize given name and family name in legacy import
* ucs-test
Comment 4 Daniel Tröder univentionstaff 2016-10-10 10:48:40 CEST
r73024:
* LegacyImportUser overrides make_firstname() and make_lastname() so that they no longer normalize the names
* added a test to 34_import-users-legacy that imports users with German umlauts and French accented characters in given names and family names, to verify that the names are not normalized
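
The approach described in r73024 can be sketched roughly as follows. This is a simplified, self-contained illustration and not the code from the commit: the classes `ImportUserSketch` and `LegacyImportUserSketch`, the `normalize()` helper, and `build()` are hypothetical stand-ins. Only the method names make_firstname() and make_lastname() come from the comment above.

```python
# Hypothetical sketch of "subclass overrides the name hooks to skip normalization".
# None of these classes are the real ucsschool.importer classes.

class ImportUserSketch(object):
    """Stand-in for a base import user whose name hooks normalize the values."""

    # Assumed replacement table, for illustration only.
    _REPLACEMENTS = (("ä", "ae"), ("ö", "oe"), ("ü", "ue"), ("ß", "ss"))

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    @classmethod
    def normalize(cls, value):
        for src, dst in cls._REPLACEMENTS:
            value = value.replace(src, dst)
        return value

    def make_firstname(self):
        self.firstname = self.normalize(self.firstname)

    def make_lastname(self):
        self.lastname = self.normalize(self.lastname)


class LegacyImportUserSketch(ImportUserSketch):
    """Stand-in for the legacy user: names are kept exactly as read from the input."""

    def make_firstname(self):
        # Intentionally no normalization: keep the UTF-8 name unchanged.
        pass

    def make_lastname(self):
        # Intentionally no normalization: keep the UTF-8 name unchanged.
        pass


def build(cls, firstname, lastname):
    """Create a user, run both name hooks, and return the resulting names."""
    user = cls(firstname, lastname)
    user.make_firstname()
    user.make_lastname()
    return user.firstname, user.lastname


if __name__ == "__main__":
    print(build(ImportUserSketch, "Antön", "Mäyer"))
    print(build(LegacyImportUserSketch, "Antön", "Mäyer"))
```

Overriding only the two hooks keeps the rest of the import pipeline shared between the new and the legacy import, so the change stays local to the legacy user class.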
Comment 5 Florian Best univentionstaff 2016-11-02 18:29:38 CET
# cat test.csv 
A       canton1 Mäyer   Antön   oldschool       oldschool-1A            Anton1c2@school.local   0       1       0

# /usr/share/ucs-school-import/scripts/import_user test.csv
…
{u'activate_new_users': {u'default': True},
 u'classes': {},
 u'csv': {u'delimiter': u'\t',
          u'header_lines': 0,
          u'incell-delimiter': {u'default': u','},
          u'mapping': {u'0': u'__action',
                       u'1': u'name',
                       u'10': u'__is_staff',
                       u'11': u'password',
                       u'12': u'__ignore',
                       u'13': u'__ignore',
                       u'14': u'__ignore',
                       u'15': u'__ignore',
                       u'16': u'__ignore',
                       u'17': u'__ignore',
                       u'18': u'__ignore',
                       u'19': u'__ignore',
                       u'2': u'lastname',
                       u'20': u'__ignore',
                       u'21': u'__ignore',
                       u'22': u'__ignore',
                       u'23': u'__ignore',
                       u'24': u'__ignore',
                       u'25': u'__ignore',
                       u'26': u'__ignore',
                       u'27': u'__ignore',
                       u'28': u'__ignore',
                       u'29': u'__ignore',
                       u'3': u'firstname',
                       u'30': u'__ignore',
                       u'4': u'school',
                       u'5': u'school_classes',
                       u'6': u'__ignore',
                       u'7': u'email',
                       u'8': u'__is_teacher',
                       u'9': u'__activate'}},
 u'dry_run': False,
 u'factory': u'ucsschool.importer.legacy.legacy_csv_user_import_factory.LegacyCsvUserImportFactory',
 u'input': {u'filename': 'test.csv', u'type': u'csv'},
 u'logfile': u'/var/log/univention/ucs-school-import.log',
 u'maildomain': None,
 u'mandatory_attributes': [u'firstname', u'lastname', u'name', u'school'],
 u'no_delete': False,
 u'output': {u'new_user_passwords': None,
             u'user_import_summary': u'/var/lib/ucs-school-import/user_import_summary_%Y-%m-%d_%H:%M:%S.csv'},
 u'password_length': 8,
 u'scheme': {u'email': u'<email>',
             u'recordUID': u'<name>',
             u'username': {u'allow_rename': False,
                           u'default': u'<name>[COUNTER2]'}},
 u'school': None,
 u'sourceUID': u'LegacyDB',
 u'tolerate_errors': -1,
 u'user_deletion': {u'delete': True, u'expiration': 0},
 u'user_role': None,
 u'verbose': True}
…

# univention-ldapsearch -LLLb 'uid=canton1,cn=schueler,cn=users,ou=oldschool,dc=school,dc=local' gecos sn cn uid | ldapsearch-wrapper  | ldapsearch-decode64 
dn: uid=canton1,cn=schueler,cn=users,ou=oldschool,dc=school,dc=local
uid: canton1
cn: Antön Mäyer
gecos: Antoen Maeyer
sn: Mäyer

→ The gecos value is still normalized (Antoen Maeyer), while cn and sn keep the umlauts.
Comment 6 Daniel Tröder univentionstaff 2016-11-03 12:04:32 CET
gecos is not set by the import script but by UDM itself. UDM transliterates special characters because they are not allowed in that field (ancient UNIX stuff).
This should be the same as it was with the legacy import.
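
The difference between cn/sn and gecos in the ldapsearch output above can be reproduced with a small sketch. It assumes, purely for illustration, that the gecos value is derived from the display name by replacing umlauts and stripping any remaining non-ASCII characters. The helpers `to_gecos` and `build_entry` are hypothetical and do not reflect how UDM implements this.

```python
# Hypothetical sketch; UDM's actual gecos handling is not reproduced here.
import unicodedata

# Assumed replacement table for German umlauts and sharp s.
UMLAUT_MAP = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
    "ß": "ss",
}


def to_gecos(text):
    """Return an ASCII-only rendering of text, suitable for a gecos-like field."""
    # First apply the explicit umlaut replacements ...
    text = "".join(UMLAUT_MAP.get(ch, ch) for ch in text)
    # ... then decompose remaining accented characters and drop the accents.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def build_entry(firstname, lastname):
    """Build a dict with UTF-8 cn/sn and an ASCII-only gecos value."""
    display_name = "%s %s" % (firstname, lastname)
    return {
        "cn": display_name,            # stored as given, umlauts kept
        "sn": lastname,                # stored as given, umlauts kept
        "gecos": to_gecos(display_name),
    }


if __name__ == "__main__":
    for key, value in sorted(build_entry("Antön", "Mäyer").items()):
        print("%s: %s" % (key, value))
```

Under these assumptions the names imported from the CSV stay untouched in cn and sn, and only the derived gecos value is reduced to ASCII.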
Comment 7 Florian Best univentionstaff 2016-11-03 13:42:08 CET
Then OK.
OK: YAML
Comment 8 Sönke Schwardt-Krummrich univentionstaff 2016-11-10 16:00:45 CET
UCS@school 4.1 R2 v7 has been released.

http://docs.software-univention.de/changelog-ucsschool-4.1R2v7-de.html

If this error occurs again, please clone this bug.