Univention Bugzilla – Bug 41934
Regression: legacy import suddenly does normalization for first name and last name
Last modified: 2016-11-10 16:00:45 CET
Regression: The legacy import script suddenly normalizes first names and last names. At least one customer uses correct UTF8-coded names in import files that are now converted back to ASCII. The legacy import should not normalize first names and last names.
(In reply to Sönke Schwardt-Krummrich from comment #0)
> At least one customer uses correct UTF8-coded names in
> import files that are now converted back to ASCII.

What does this mean? UTF-8 is ASCII compatible.
(In reply to Florian Best from comment #1)
> (In reply to Sönke Schwardt-Krummrich from comment #0)
> > At least one customer uses correct UTF8-coded names in
> > import files that are now converted back to ASCII.
>
> What does this mean? UTF-8 is ASCII compatible.

The legacy import converts umlauts (UTF-8) back to ASCII characters (ö → oe).
Btw: the compatibility only goes one way. ASCII is a subset of UTF-8, so UTF-8 text that contains umlauts is not valid ASCII.
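The subset relationship discussed above can be checked directly. The following is a minimal Python 3 sketch, not part of the import code; the helper name decodes_as_ascii is made up for illustration.

```python
# Pure-ASCII text yields identical bytes in ASCII and in UTF-8.
ascii_is_valid_utf8 = "Mayer".encode("ascii") == "Mayer".encode("utf-8")

# "Mäyer": the umlaut is written as an escape so the example does not
# depend on the encoding of this file.
utf8_bytes = "M\u00e4yer".encode("utf-8")


def decodes_as_ascii(data):
    """Return True if the byte string is valid 7-bit ASCII (illustrative helper)."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


print(ascii_is_valid_utf8)           # ASCII text is also valid UTF-8
print(len(utf8_bytes))               # the umlaut takes two bytes in UTF-8
print(decodes_as_ascii(utf8_bytes))  # UTF-8 with an umlaut is not ASCII
```

This is why a name such as Mäyer cannot be stored in an ASCII-only field without being rewritten in some way.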
Created attachment 8040
no name normalization + ucs-test

Attached patch:
* do not normalize given name and family name in legacy import
* ucs-test
r73024:
* LegacyImportUser overrides make_firstname() and make_lastname() so that they no longer normalize the names
* added a test to 34_import-users-legacy that imports users with German and French umlauts in given names and family names to verify that no normalization takes place
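The shape of this fix can be sketched as follows. This is a hypothetical illustration, not the code from r73024: the classes ImportUserSketch and LegacyImportUserSketch, the normalize() helper, and the umlaut table are assumptions. Only the method names make_firstname() and make_lastname() come from the commit message.

```python
# Assumed normalization table (ö -> oe etc.); the real importer may differ.
UMLAUT_TABLE = str.maketrans({
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue",
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",
    "\u00df": "ss",
})


class ImportUserSketch:
    """Hypothetical stand-in for a base import user that normalizes names."""

    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def normalize(self, value):
        # Replace umlauts with their ASCII transliteration.
        return value.translate(UMLAUT_TABLE)

    def make_firstname(self):
        self.firstname = self.normalize(self.firstname)

    def make_lastname(self):
        self.lastname = self.normalize(self.lastname)


class LegacyImportUserSketch(ImportUserSketch):
    """Hypothetical legacy variant: names are kept exactly as imported."""

    def make_firstname(self):
        pass  # intentionally no normalization

    def make_lastname(self):
        pass  # intentionally no normalization


def build(cls):
    """Create a user from the thread's example names and run both hooks."""
    user = cls("Ant\u00f6n", "M\u00e4yer")
    user.make_firstname()
    user.make_lastname()
    return user.firstname, user.lastname


print(build(ImportUserSketch))
print(build(LegacyImportUserSketch))
```

Overriding only the two hooks keeps the rest of the base class behaviour untouched, which matches the narrow scope described in the commit message.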
# cat test.csv
A canton1 Mäyer Antön oldschool oldschool-1A Anton1c2@school.local 0 1 0

# /usr/share/ucs-school-import/scripts/import_user test.csv
…
{u'activate_new_users': {u'default': True},
 u'classes': {},
 u'csv': {u'delimiter': u'\t',
          u'header_lines': 0,
          u'incell-delimiter': {u'default': u','},
          u'mapping': {u'0': u'__action', u'1': u'name', u'10': u'__is_staff', u'11': u'password',
                       u'12': u'__ignore', u'13': u'__ignore', u'14': u'__ignore', u'15': u'__ignore',
                       u'16': u'__ignore', u'17': u'__ignore', u'18': u'__ignore', u'19': u'__ignore',
                       u'2': u'lastname', u'20': u'__ignore', u'21': u'__ignore', u'22': u'__ignore',
                       u'23': u'__ignore', u'24': u'__ignore', u'25': u'__ignore', u'26': u'__ignore',
                       u'27': u'__ignore', u'28': u'__ignore', u'29': u'__ignore', u'3': u'firstname',
                       u'30': u'__ignore', u'4': u'school', u'5': u'school_classes', u'6': u'__ignore',
                       u'7': u'email', u'8': u'__is_teacher', u'9': u'__activate'}},
 u'dry_run': False,
 u'factory': u'ucsschool.importer.legacy.legacy_csv_user_import_factory.LegacyCsvUserImportFactory',
 u'input': {u'filename': 'test.csv', u'type': u'csv'},
 u'logfile': u'/var/log/univention/ucs-school-import.log',
 u'maildomain': None,
 u'mandatory_attributes': [u'firstname', u'lastname', u'name', u'school'],
 u'no_delete': False,
 u'output': {u'new_user_passwords': None,
             u'user_import_summary': u'/var/lib/ucs-school-import/user_import_summary_%Y-%m-%d_%H:%M:%S.csv'},
 u'password_length': 8,
 u'scheme': {u'email': u'<email>',
             u'recordUID': u'<name>',
             u'username': {u'allow_rename': False, u'default': u'<name>[COUNTER2]'}},
 u'school': None,
 u'sourceUID': u'LegacyDB',
 u'tolerate_errors': -1,
 u'user_deletion': {u'delete': True, u'expiration': 0},
 u'user_role': None,
 u'verbose': True}
…

# univention-ldapsearch -LLLb 'uid=canton1,cn=schueler,cn=users,ou=oldschool,dc=school,dc=local' gecos sn cn uid | ldapsearch-wrapper | ldapsearch-decode64
dn: uid=canton1,cn=schueler,cn=users,ou=oldschool,dc=school,dc=local
uid: canton1
cn: Antön Mäyer
gecos: Antoen Maeyer
sn: Mäyer

→ The gecos are still changed.
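To make the column mapping in the configuration dump easier to follow, here is a minimal Python 3 sketch that applies the first eleven mapping entries to the test row. It is not the importer's own parser. The function map_row is hypothetical, and the row is reconstructed under the assumption that column 6 (mapped to __ignore) is empty, since the tab characters are not visible in the output above.

```python
import csv
import io

# Column index -> attribute, copied from the 'mapping' section of the dump
# (columns 11 and up are omitted; the test row does not use them).
MAPPING = {
    "0": "__action", "1": "name", "2": "lastname", "3": "firstname",
    "4": "school", "5": "school_classes", "6": "__ignore", "7": "email",
    "8": "__is_teacher", "9": "__activate", "10": "__is_staff",
}

# Assumed reconstruction of the tab-separated test row, with an empty
# column 6 so that the e-mail address lands in column 7.
ROW = ("A\tcanton1\tM\u00e4yer\tAnt\u00f6n\toldschool\toldschool-1A\t"
       "\tAnton1c2@school.local\t0\t1\t0\n")


def map_row(line, mapping=MAPPING):
    """Split one tab-delimited line and name its columns (hypothetical helper)."""
    fields = next(csv.reader(io.StringIO(line), delimiter="\t"))
    record = {}
    for index, value in enumerate(fields):
        key = mapping.get(str(index), "__ignore")
        if key != "__ignore":
            record[key] = value
    return record


record = map_row(ROW)
print(record["firstname"], record["lastname"], record["email"])
```

With this reading, the umlauts arrive in firstname and lastname unchanged, which is the input the test in this comment relies on.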
gecos is not set by the import script, but by UDM itself. UDM translates special characters, because they are not allowed there (ancient UNIX stuff). This should be the same as it was with the legacy import.
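The expected end state (names keep their umlauts, gecos is ASCII only) can be checked mechanically. The following Python 3 sketch uses the attribute values from the ldapsearch output in this thread; the helper non_ascii_attributes is made up for illustration and does not query LDAP or UDM.

```python
# Attribute values as shown in the ldapsearch output; umlauts are written
# as escapes so the example does not depend on the file encoding.
ENTRY = {
    "uid": "canton1",
    "cn": "Ant\u00f6n M\u00e4yer",
    "gecos": "Antoen Maeyer",
    "sn": "M\u00e4yer",
}


def non_ascii_attributes(entry):
    """Return the sorted names of attributes whose value is not pure ASCII."""
    return sorted(name for name, value in entry.items() if not value.isascii())


print(non_ascii_attributes(ENTRY))
```

An entry passes this check when cn and sn are reported as non-ASCII and gecos is not.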
Then OK. OK: YAML
UCS@school 4.1 R2 v7 has been released. http://docs.software-univention.de/changelog-ucsschool-4.1R2v7-de.html If this error occurs again, please clone this bug.