Univention Bugzilla – Bug 45626
User is removed and recreated on every import
Last modified: 2017-12-21 12:23:04 CET
Created attachment 9269 [details] Test import data If this file is imported multiple times, the user "hanna.venckh" from the attached CSV file is removed and recreated with a new username during every import starting with the second import. /usr/share/ucs-school-import/scripts/ucs-school-user-import \ --conffile /usr/share/ucs-school-import/configs/user_import_http-api.json \ --user_role student --school gsost \ --infile test_users_2017-10-31_02:48:54.csv \ --sourceUID gsost-student --verbose "gsost","Hanna","Venckhuß","1c,1a","A teacher.","+16-990-25713" May be related to the umlaut in the last name. Config /usr/share/ucs-school-import/configs/user_import_http-api.json: { "factory": "ucsschool.importer.default_user_import_factory.DefaultUserImportFactory", "input": { "type": "csv" }, "csv": { "mapping": { "Schule": "school", "Vorname": "firstname", "Nachname": "lastname", "Klassen": "school_classes", "Beschreibung": "description", "Telefon": "phone", "EMail": "mailPrimaryAddress" } }, "scheme": { "email": "<firstname>.<lastname>@<maildomain>", "recordUID": "<firstname>.<lastname>", "username": { "default": "<:umlauts><firstname>.<lastname><:lower>[COUNTER2]" } }, "verbose": false }
It is a configuration error. The problem ist with the ucsschoolRecordUID that gets saved as SGFubmEuVmVuY2todcOf because of the ß. When the configuration is changed, it works: 19c19 < "recordUID": "<firstname>.<lastname>", --- > "recordUID": "<:umlauts><firstname>.<lastname>", I guess we should either make it a default in the configuration or in the code.
(In reply to Daniel Tröder from comment #1) > It is a configuration error. I used the default config/example → bug in the MVP example > The problem ist with the ucsschoolRecordUID that gets saved as > SGFubmEuVmVuY2todcOf because of the ß. When the configuration is changed, it > works: I think, the recordUID is not saved as "SGFubmEuVmVuY2todcOf". This string is base64 encoded: $ echo "SGFubmEuVmVuY2todcOf" | base64 -d Hanna.Venckhuß $ univention-ldapsearch returns this value base64 encoded and indicates this by two colons "::" between LDAP attribute name and value. > 19c19 > < "recordUID": "<firstname>.<lastname>", > --- > > "recordUID": "<:umlauts><firstname>.<lastname>", > > I guess we should either make it a default in the configuration or in the > code. Is there a problem with UTF-8/string vs. unicode encoding? If yes, we should fix it. If no, we should fix the default config.
(In reply to Sönke Schwardt-Krummrich from comment #2) > (In reply to Daniel Tröder from comment #1) > > It is a configuration error. > > I used the default config/example → bug in the MVP example Yes - fixed in commit ed05fd74. > > The problem ist with the ucsschoolRecordUID that gets saved as > > SGFubmEuVmVuY2todcOf because of the ß. When the configuration is changed, it > > works: > > I think, the recordUID is not saved as "SGFubmEuVmVuY2todcOf". This string > is base64 encoded: > $ echo "SGFubmEuVmVuY2todcOf" | base64 -d > Hanna.Venckhuß > $ > univention-ldapsearch returns this value base64 encoded and indicates this > by two colons "::" between LDAP attribute name and value. > > > 19c19 > > < "recordUID": "<firstname>.<lastname>", > > --- > > > "recordUID": "<:umlauts><firstname>.<lastname>", > > > > I guess we should either make it a default in the configuration or in the > > code. > > Is there a problem with UTF-8/string vs. unicode encoding? > If yes, we should fix it. If no, we should fix the default config. Yes - the problem is not with the storage, but with the search: ldap.filter.filter_format() decodes 'ß' to '\xc3\x9f', which doesn't match the saved value. We have to either always use "<:umlauts>" or base64-encode all recordUIDs and LDAP-searches.
RFC in branch dtroeder/45626_rUID_filter: Commit 3ed35f11 (https://git.knut.univention.de/univention/ucsschool/commit/3ed35f11a53b1fa588f5a674143e52800afb65d9) stores recordUID and sourceUID as already-escaped strings, so that an LDAP search with the same names should match. If the solution is OK, I'll write a test, advisory etc. Please comment.
I think values should be escaped where they are inserted into a filter, not in self.record_uid. Also: For escaping single values there is ldap.filter.escape_filter_chars().
@Daniel: Which ldap search is failing? I can't find the code.
Found it. Fixed in 91758e34: https://git.knut.univention.de/univention/ucsschool/commit/91758e3438df1f35cd98de2eb681830e30bfbee7 The problem is the comparison of a unicode object and a string containing encoded unicode. The {str,unicode}.__eq__ can handle it: In [11]: u'ß' == 'ß', u'\xdf' == 'ß', u'\xdf' == '\xc3\x9f', u'\xdf' == u'ß', '\xc3\x9f' == 'ß', '\xc3\x9f' == u'ß' Out[11]: (True, True, True, True, True, True) But the hashes are different: In [14]: hash('ß'), hash('\xc3\x9f'), hash(u'ß'), hash(u'\xdf') Out[14]: (24960149699224340, 24960149699224340, 28544085598, 28544085598) This comes into play, when the sets were compared: In [17]: 'ß' == u'ß', set('ß') == set(u'ß') Out[17]: (True, False) With 91758e34 both sets contain unicode strings, and the comparison works again even when "<:umlauts>" is not used in the configuration scheme.
[4.2 3f15130a] Bug #45626: handle non-ASCII in recordUID and sourceUID [4.2 b6629533] Bug #45626: advisory update ucs-school-import 15.0.3-11
(In reply to Daniel Tröder from comment #3) > (In reply to Sönke Schwardt-Krummrich from comment #2) > > (In reply to Daniel Tröder from comment #1) > > > It is a configuration error. > > > > I used the default config/example → bug in the MVP example > Yes - fixed in commit ed05fd74. I don't think that this is correct in all cases: "recordUID": "<:umlauts><firstname>.<lastname>", The following users will create a recordUID collision with this scheme: - Martin Dräger → Martin.Draeger - Martin Draeger → Martin.Draeger I suggest to remove the "<:umlauts>" in user_import_http-api.json to make collisions less likely. → REOPEN Otherwise: OK: code change for the actual fix OK: functional test OK: advisory
81cd8678: Bug #45626: remove "<:umlauts>" from default configuration to lower the chance of recordUID collisions ucs-school-import (15.0.3-17) Package version in advisory will be adapted when Bug #45577 is fixed.
OK: code change OK: functional test > Package version in advisory will be adapted when Bug #45577 is fixed. OK →VERIFIED
UCS@school 4.2 v6 has been released. http://docs.software-univention.de/changelog-ucsschool-4.2v6-de.html If this error occurs again, please clone this bug.