Bug 42313 - ucsschool-import: remove illegal characters from username
ucsschool-import: remove illegal characters from username
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Import scripts
UCS@school 4.1 R2
Other Linux
: P5 normal (vote)
: UCS@school 4.1 R2 vXXX
Assigned To: Daniel Tröder
Florian Best
: interim-1
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2016-09-08 11:11 CEST by Daniel Tröder
Modified: 2016-10-04 13:24 CEST (History)
2 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 4: Will affect most installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.343
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): API change
Max CVSS v3 score:


Attachments
doctests (2.68 KB, patch)
2016-09-21 15:47 CEST, Florian Best
Details | Diff
patch (did not consider case insensitivity), attachment 8027 on Bug #42478 is a rewrite (982 bytes, patch)
2016-09-22 13:28 CEST, Florian Best
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Daniel Tröder univentionstaff 2016-09-08 11:11:16 CEST
Illegal characters must be revmoved from username, before creating/modifying a user.

Found in Bug #42105 Comment 2  https://forge.univention.org/bugzilla/show_bug.cgi?id=42105#c2 :

ValidationError when adding ImportStudent(name='e.brinkmann-', school='School2', dn='uid=e.brinkmann-,cn=schueler,cn=users,ou=School2,dc=autotest300,dc=local', old_dn=None) (source_uid:NewDB record_uid: Elftrud.Brinkmann-Hartmann2): {'name': ["Username must only contain numbers, letters and dots, and may not be 'admin'!"]}
Comment 1 Daniel Tröder univentionstaff 2016-09-08 11:19:15 CEST
Usernames are now stripped of all characters but numbers, letters and dots.

r72415: add missing arguments to exception constructor call
r72416: remove illegal characters from usernames
Comment 2 Daniel Tröder univentionstaff 2016-09-08 15:44:28 CEST
r72449: remove dot from beginning of username
Comment 3 Daniel Tröder univentionstaff 2016-09-08 17:56:47 CEST
The check on illegal characters is only done, if the username has to be calculated. If it is given in the CSV input, it it not checked. A bad character in a passed username will lead to the import rejecting to add/modify that user (non-fatal exception).

If it is desired, that usernames passed from input should also be checked (and modified), please reopen.
Comment 4 Daniel Tröder univentionstaff 2016-09-08 18:43:43 CEST
r72452: added test for this to 90_ucsschool/34_import-users_via_cli_v2 -> test_create_with_illegal_chars_in_username()
Comment 5 Stefan Gohmann univentionstaff 2016-09-09 19:13:07 CEST
I guess I found another one:

Removing disallowed characters '-' from username 'g.dueker-gr.'.
Adding ImportStudent(name='g.duekergr.', school='School3', dn='uid=g.duekergr.,cn=schueler,cn=users,ou=School3,dc=autotest300,dc=local', old_dn=None) (source_uid:NewDB record_uid:Gulja.Düker-Gr.) attributes={'$dn$': 'uid=g.duekergr.,cn=schueler,cn=users,ou=School3,dc=autotest300,dc=local', 'display_name': 'Gulja Dueker-Gr.', 'record_uid': u'Gulja.D\xfcker-Gr.', 'firstname': 'Gulja', 'lastname': 'Dueker-Gr.', 'type_name': 'Student', 'school': 'School3', 'name': 'g.duekergr.', 'disabled': 'none', 'email': u'gulja.dueker-gr.@autotest300.local', 'birthday': None, 'type': 'importStudent', 'schools': ['School3'], 'password': 'vz=(3=x%+:HHk2E', 'source_uid': u'NewDB', 'school_classes': {'School3': ['School3-3', 'School3-m']}, 'objectType': 'users/user'} udm_properties={u'phone': [u'+47-609-968302'], 'overridePWHistory': '1', u'description': u'A student.', 'overridePWLength': '1'}...
Creating ImportStudent(name='g.duekergr.', school='School3', dn='uid=g.duekergr.,cn=schueler,cn=users,ou=School3,dc=autotest300,dc=local', old_dn=None)
Entry #0: ValidationError when adding ImportStudent(name='g.duekergr.', school='School3', dn='uid=g.duekergr.,cn=schueler,cn=users,ou=School3,dc=autotest300,dc=local', old_dn=None) (source_uid:NewDB record_uid: Gulja.Düker-Gr.): {'name': ["Username must only contain numbers, letters and dots, and may not be 'admin'!"]}
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 131, in create_and_modify_users
    success = user.create(lo=self.connection)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 149, in create
    return super(ImportUser, self).create(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/lib/models/base.py", line 417, in create
    success = self.create_without_hooks(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 152, in create_without_hooks
    success = super(ImportUser, self).create_without_hooks(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/lib/models/base.py", line 430, in create_without_hooks
    raise ValidationError(self.errors.copy())
UserValidationError: ValidationError when adding ImportStudent(name='g.duekergr.', school='School3', dn='uid=g.duekergr.,cn=schueler,cn=users,ou=School3,dc=autotest300,dc=local', old_dn=None) (source_uid:NewDB record_uid: Gulja.Düker-Gr.): {'name': ["Username must only contain numbers, letters and dots, and may not be 'admin'!"]}
Starting univention-directory-notifier

For more details Build #18:
http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Large%20Environment/
Comment 6 Daniel Tröder univentionstaff 2016-09-12 08:31:06 CEST
Fixed in r72474.
Comment 7 Florian Best univentionstaff 2016-09-16 14:58:46 CEST
ucs-school-import/modules/ucsschool/importer/utils/username_handler.py is missing python-version and copyright header.
Comment 8 Daniel Tröder univentionstaff 2016-09-16 15:23:34 CEST
r72655: added copyright header
Comment 9 Florian Best univentionstaff 2016-09-20 14:22:10 CEST
Using command line arguments: {'input': {'filename': 'users.csv'}, 'verbose': False}
{u'activate_new_users': {u'default': True},
 u'classes': {},
 u'csv': {u'header_lines': 1,
          u'incell-delimiter': {u'default': u','},
          u'mapping': {u'Benutzername': u'username',
                       u'Klassen': u'school_classes',
                       u'Mailadresse': u'email',
                       u'Nachname': u'lastname',
                       u'Schulen': u'schools',
             u'recordUID': u'<email>',
             u'username': {u'allow_rename': False,
                           u'default': u'<:umlauts><firstname>[0].<lastname><:lower>[COUNTER2]'}},
 u'school': None,
 u'sourceUID': u'NewDB',
 u'tolerate_errors': 0,
 u'user_deletion': {u'delete': True, u'expiration': 0},
 u'user_role': u'student',
 u'verbose': False}

CSV:
"Schulen","Nachname","Klassen","Mailadresse","description"
"gsmitte","Anton.Mayer.asdfasfd","1A","baz0@localhost.de","Dies ist ein Schüler"

Causes Traceback:
Creating ImportStudent(name='anton.mayer.', school='gsmitte', dn='uid=anton.mayer.,cn=schueler,cn=users,ou=gsmitte,dc=school,dc=local', old_dn=None)
Entry #0: ValidationError when adding ImportStudent(name='anton.mayer.', school='gsmitte', dn='uid=anton.mayer.,cn=schueler,cn=users,ou=gsmitte,dc=school,dc=local', old_dn=None) (source_uid:NewDB record_uid: baz0@localhost.de): {'name': ["Username must only contain numbers, letters and dots, and may not be 'admin'!"]}
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 131, in create_and_modify_users
    success = user.create(lo=self.connection)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 150, in create
    return super(ImportUser, self).create(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/lib/models/base.py", line 417, in create
    success = self.create_without_hooks(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 153, in create_without_hooks
    success = super(ImportUser, self).create_without_hooks(lo, validate)
  File "/usr/lib/pymodules/python2.7/ucsschool/lib/models/base.py", line 430, in create_without_hooks
    raise ValidationError(self.errors.copy())
UserValidationError: ValidationError when adding ImportStudent(name='anton.mayer.', school='gsmitte', dn='uid=anton.mayer.,cn=schueler,cn=users,ou=gsmitte,dc=school,dc=local', old_dn=None) (source_uid:NewDB record_uid: baz0@localhost.de): {'name': ["Username must only contain numbers, letters and dots, and may not be 'admin'!"]}


FYI: I could add a user with the username '_' or '0', '5' or 'aA' (I think UDM prevented once to add users with uppercase characters not in the beginning).
I guess this is okay.
Comment 10 Florian Best univentionstaff 2016-09-20 14:38:33 CEST
The method 'UsernameHandler.format_username' is really complex (has a mccabe of 13!) and contains a lot of redundancy.

If I understand correctly the constraints of the functions are:
* the name must not contain illegal characters
* the name must not start or end with a dot
* the name must not be longer than X chars
* things in [ ] brackets must be replaced by some other things

Please rename self.username_pattern as the name does not reflect what it does/is.

What does [ALWAYSCOUNTER] and [COUNTER2] mean?

Where does the "3" come from?
self.username_max_length - 3  # numbers >999 are not supported

I would suggest to first make the [] replacements and afterwards ensure the constraints.
Comment 11 Daniel Tröder univentionstaff 2016-09-21 08:11:19 CEST
(In reply to Florian Best from comment #10)
> The method 'UsernameHandler.format_username' is really complex (has a mccabe
> of 13!) and contains a lot of redundancy.
The function is complex, because things have to be done in a certain order.
I don't see any redundancy - please elaborate.

> If I understand correctly the constraints of the functions are:
> * the name must not contain illegal characters
> * the name must not start or end with a dot
> * the name must not be longer than X chars
> * things in [ ] brackets must be replaced by some other things
> 
> Please rename self.username_pattern as the name does not reflect what it
> does/is.
That's true. But that would be an API change, as that property is already published in the manual. As any developer that wants to work on this, will read the manual, it should be fine.
Ofc such a small API change should be no problem, but please first ask professional services, if they have not yet subclassed UsernameHandler.

> What does [ALWAYSCOUNTER] and [COUNTER2] mean?
Manual: http://docs.software-univention.de/ucsschool-import-handbuch-4.1R2.html#configuration:unique_usernames

> Where does the "3" come from?
> self.username_max_length - 3  # numbers >999 are not supported
Comment: len(str(999)) == 3

> I would suggest to first make the [] replacements and afterwards ensure the
> constraints.
Not possible: When shortening the username, that would cut of the counter number.
Comment 12 Florian Best univentionstaff 2016-09-21 14:04:19 CEST
Please also read comment #9 which is the reason why this was REOPENED.
Comment 13 Florian Best univentionstaff 2016-09-21 14:54:19 CEST
The function should also never return an empty string imho!
Comment 14 Florian Best univentionstaff 2016-09-21 15:16:59 CEST
I think 'Max[Muster]Mann' should be replaced to MaxMusterMann instead of raising a FormatError?!
We know all possible values for between the brackets [].
Comment 15 Florian Best univentionstaff 2016-09-21 15:27:05 CEST
I it on purpose that 3 chars are always stripped from the username max length even if no ALWAYSCOUNTER / COUNTER2 is used?

>>> UsernameHandler(4).format_username('Max.Mustermann')
Username 'Max.Mustermann' to long, shortened to 'M'.
'M'
>>> UsernameHandler(3).format_username('Max.Mustermann')
Username 'Max.Mustermann' to long, shortened to ''.
''
Comment 16 Florian Best univentionstaff 2016-09-21 15:47:39 CEST
Created attachment 8025 [details]
doctests

Some suggestions for doctests.
Comment 17 Florian Best univentionstaff 2016-09-21 17:01:56 CEST
When I add a lastname of test[ALWAYSCOUNTER] I get this traceback:

 u'scheme': {u'email': u'<firstname>[0].<lastname>@<maildomain>',
             u'recordUID': u'<email>',
             u'username': {u'allow_rename': False,
                           u'default': u'<:umlauts><firstname>[0].<lastname><:lower>[COUNTER2]'}},

Shouldn't the value be escaped?

Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 106, in create_and_modify_users
    user = self.determine_add_modify_action(imported_user)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 195, in determine_add_modify_action
    imported_user.prepare_all(new_user=True)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 227, in prepare_all
    self.prepare_attributes(new_user)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 240, in prepare_attributes
    self.make_username()
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 483, in make_username
    self.name = self.username_handler.format_username(self.name)
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/utils/username_handler.py", line 125, in format_username
    raise FormatError("More than one counter variable found in username scheme '{}'.".format(name), name, name)
FormatError: More than one counter variable found in username scheme '.test[alwayscounter][counter2]'.
Comment 18 Sönke Schwardt-Krummrich univentionstaff 2016-09-22 12:10:03 CEST
(In reply to Florian Best from comment #15)
> I it on purpose that 3 chars are always stripped from the username max
> length even if no ALWAYSCOUNTER / COUNTER2 is used?

No. The constraint is currently that Windows is not able to handle usernames longer than 20 characters. In exam mode, the exam username gets the prefix "exam-" which reduces the available username length to 15 characters.

I think we should cut off the username length at 12 characters if COUNTER2 or ALWAYSCOUNTER is used, so the first 1000 users of each username will fit into the 15-character-limit.

If the username is shorter (e.g. format: "<firstname>[0:3]<lastname>[0:3]" → length is 6) then the username should not get shortened if still in range (max_username_length=6).

Any objections?
Comment 19 Florian Best univentionstaff 2016-09-22 12:28:01 CEST
(In reply to Sönke Schwardt-Krummrich from comment #18)
> (In reply to Florian Best from comment #15)
> > I it on purpose that 3 chars are always stripped from the username max
> > length even if no ALWAYSCOUNTER / COUNTER2 is used?
> 
> I think we should cut off the username length at 12 characters if COUNTER2
> or ALWAYSCOUNTER is used, so the first 1000 users of each username will fit
> into the 15-character-limit.
Yes, that is currently done. But what I actually asked was if that should also be done if COUNTER2 or ALWAYSCOUNTER is NOT used?
Comment 20 Sönke Schwardt-Krummrich univentionstaff 2016-09-22 12:37:47 CEST
(In reply to Sönke Schwardt-Krummrich from comment #18)
> Any objections?

Yes, by myself after thinking a 3rd and 4th time about this topic.

When using this pattern:

>>> UsernameHandler(4).format_username('Max.Mustermann')
Username 'Max.Mustermann' to long, shortened to 'M'.
'M'
>>> UsernameHandler(3).format_username('Max.Mustermann')
Username 'Max.Mustermann' to long, shortened to ''.

the username schema definition is wrong. If you use ALWAYSCOUNTER / COUNTER2 you need space for the counter. So defining a max length of 3 or 4 is simply wrong.

If we cut off only the required number of characters for the counter value, we would get the following usernames (max_username_length=4):
Max1 … Max9
Ma10 … Ma99
M100 … M999
counter >= 1000 → Error message
So this will result in different usernames depending on the counter → irritating.

So please leave it as it is. We will wait for change requests by customer.




(In reply to Florian Best from comment #19)
> > I think we should cut off the username length at 12 characters if COUNTER2
> > or ALWAYSCOUNTER is used, so the first 1000 users of each username will fit
> > into the 15-character-limit.
> Yes, that is currently done. But what I actually asked was if that should
> also be done if COUNTER2 or ALWAYSCOUNTER is NOT used?

Discussed that a few seconds ago with Daniel. If the counter is NOT used, the username is cut off at max_username_length characters.
Comment 21 Florian Best univentionstaff 2016-09-22 12:41:57 CEST
(In reply to Sönke Schwardt-Krummrich from comment #20)
> (In reply to Sönke Schwardt-Krummrich from comment #18)
> > Any objections?
> 
> Yes, by myself after thinking a 3rd and 4th time about this topic.
> 
> When using this pattern:
> 
> >>> UsernameHandler(4).format_username('Max.Mustermann')
> Username 'Max.Mustermann' to long, shortened to 'M'.
> 'M'
> >>> UsernameHandler(3).format_username('Max.Mustermann')
> Username 'Max.Mustermann' to long, shortened to ''.
> 
> the username schema definition is wrong. If you use ALWAYSCOUNTER / COUNTER2
> you need space for the counter. So defining a max length of 3 or 4 is simply
> wrong.
> 
> If we cut off only the required number of characters for the counter value,
> we would get the following usernames (max_username_length=4):
> Max1 … Max9
> Ma10 … Ma99
> M100 … M999
> counter >= 1000 → Error message
> So this will result in different usernames depending on the counter →
> irritating.
> 
> So please leave it as it is. We will wait for change requests by customer.
> 
> 
> 
> 
> (In reply to Florian Best from comment #19)
> > > I think we should cut off the username length at 12 characters if COUNTER2
> > > or ALWAYSCOUNTER is used, so the first 1000 users of each username will fit
> > > into the 15-character-limit.
> > Yes, that is currently done. But what I actually asked was if that should
> > also be done if COUNTER2 or ALWAYSCOUNTER is NOT used?
> 
> Discussed that a few seconds ago with Daniel. If the counter is NOT used,
> the username is cut off at max_username_length characters.
This is currently not the case as you can see in 

> >>> UsernameHandler(4).format_username('Max.Mustermann')
> Username 'Max.Mustermann' to long, shortened to 'M'.

which has not ALWAYSCOUNTER.
Comment 22 Daniel Tröder univentionstaff 2016-09-22 13:14:15 CEST
r72746:
* rename username_patterns to counter_variable_to_function
* adapt docs
* adapt other ucs-test
* add unittest (incl your test data) to 34_import-users_via_cli_v2 → test_create_with_illegal_chars_in_username()
* if no [VAR] is used, usernames get cut off at max_username_length
* [VAR]s in <property> input problem will be handled in separate Bug #42472
Comment 23 Daniel Tröder univentionstaff 2016-09-22 13:20:12 CEST
(In reply to Daniel Tröder from comment #22)
> * rename username_patterns to counter_variable_to_function
Prof. services was asked, if they have used the property, before renaming it.
Comment 24 Florian Best univentionstaff 2016-09-22 13:28:39 CEST
Created attachment 8026 [details]
patch (did not consider case insensitivity), attachment 8027 [details] on Bug #42478 is a rewrite
Comment 25 Florian Best univentionstaff 2016-09-22 18:02:38 CEST
OK: Everything else at Bug #42478
Comment 26 Sönke Schwardt-Krummrich univentionstaff 2016-10-04 13:24:50 CEST
UCS@school 4.1 R2 v5 has been released.

http://docs.software-univention.de/changelog-ucsschool-4.1R2v5-de.html

If this error occurs again, please clone this bug.