Univention Bugzilla – Bug 54589
Wrong regex for UDM syntax gid allows wrong characters
Last modified: 2022-05-25 09:51:56 CEST
The regular expression for the UDM syntax gid does not represent what the author really meant or what is useful. " -." does not only allow the characters space, dash and dot, but all characters in the ASCII range " " (32) up to "." (46) → see the list of wrongly allowed characters below.

IIRC the correct regex would be:
    u"(?u)^\\w([\\w .-]*\\w)?$"

ALSO: Please check why single ticks are currently allowed in group names!

Also note: this is a breaking change. We have to find an appropriate release for this change and have to announce it beforehand (in case a customer used a lot of e.g. plus signs in group names).

    class gid(simple):
        min_length = 1  # TODO: not enforced here
        max_length = 32  # TODO: not enforced here
        regex = re.compile(u"(?u)^\\w([\\w -.’]*\\w)?$")  # FIXME: The " -." in "[\w -.]" matches the ASCII character range(ord(' '), ord('.')+1) == range(32, 47)
        error_message = _(
            "A group name must start and end with a letter, number or underscore. In between additionally spaces, dashes "
            "and dots are allowed."
        )

    $ python3
    >>> for i in range(ord(' '), ord('.')+1): print(i, repr(chr(i)))
    ...
    32 ' '
    33 '!'
    34 '"'
    35 '#'
    36 '$'
    37 '%'
    38 '&'
    39 "'"
    40 '('
    41 ')'
    42 '*'
    43 '+'
    44 ','
    45 '-'
    46 '.'
    >>>

    root@master:~# udm groups/group create --position cn=groups,$(ucr get ldap/base) --set name="Group (name) + cool2"
    Object created: cn=Group (name) \+ cool,cn=groups,dc=dev,dc=nstx,dc=de
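To make the difference between the two patterns from the description easy to check, here is a small sketch that compares them outside of UDM. The helper name `middle_chars` is made up for this illustration and is not part of the UCS code; the two patterns are copied from the report (the U+2019 character is written as an escape).

```python
import re

# Current pattern from the report: inside the class, " -." is a range
# from ' ' (32) to '.' (46), not three separate characters.
current = re.compile(u"(?u)^\\w([\\w -.\u2019]*\\w)?$")

# Pattern suggested in the report: the dash comes last, so it is a literal.
proposed = re.compile(u"(?u)^\\w([\\w .-]*\\w)?$")


def middle_chars(pattern):
    """Hypothetical helper: list the printable ASCII non-word characters
    that the pattern accepts between two word characters."""
    return [
        chr(i)
        for i in range(32, 127)
        if not chr(i).isalnum() and chr(i) != "_"
        and pattern.match("a" + chr(i) + "b")
    ]


print(middle_chars(current))   # everything from ' ' (32) to '.' (46)
print(middle_chars(proposed))  # only space, dash and dot

# The group name from the udm call in the report:
name = "Group (name) + cool2"
print(bool(current.match(name)), bool(proposed.match(name)))
```

With the proposed pattern, the group name from the udm example would be rejected, which is the breaking change mentioned in the description.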
*** Bug 39776 has been marked as a duplicate of this bug. ***
*** Bug 33656 has been marked as a duplicate of this bug. ***
(In reply to Florian Best from comment #1)
> *** Bug 39776 has been marked as a duplicate of this bug. ***

Same for the syntax classes `uid_umlauts` and `uid_umlauts_lower_except_first_letter`.
(In reply to Florian Best from comment #2)
> *** Bug 33656 has been marked as a duplicate of this bug. ***

(In reply to Frank Greif from comment #7)
> Currently (4.4.1.239) the regexp of class 'gid' is somewhat weird:
>
> 1334 regex = re.compile(ur"(?u)^\w([\w -.’]*\w)?$")
>
> The bracketed character class would match (left to right):
>
> * anything deemed a 'word character' in Unicode
> * the range of characters between 0x20 (space) and 0x2E (dot)
> * the Unicode char with codepoint U+2019 (right single quotation mark)
>
> I'd propose to change:
>
> * position the minus at the right end (so it can't be misunderstood as a range)
> * remove the spurious U+2019 char
>
> This would not solve this bug, but at least it would make the regexp really
> match what the description says.

We probably have to keep the U+2019 (’) for French installations.
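For illustration only, a variant that combines both remarks (dash moved to the end of the class, U+2019 kept as an explicit literal) could look like the sketch below. This pattern is an assumption made for this comment and is not taken from the UCS source; the sample names are made up as well.

```python
import re

# Hypothetical variant, not the pattern in UCS: the dash is the last
# character of the class (literal), U+2019 stays allowed explicitly.
variant = re.compile("(?u)^\\w([\\w .\u2019-]*\\w)?$")

samples = [
    "Domain Users",        # space in between: accepted
    "l\u2019\u00e9quipe",  # U+2019 in between: accepted
    "a+b",                 # plus sign: rejected
    "a'b",                 # ASCII apostrophe (U+0027): rejected
]

for name in samples:
    print(repr(name), bool(variant.match(name)))
```

Note that in this variant the ASCII apostrophe is no longer accepted, since it was only matched through the accidental range.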
> We probably have to keep the U+2019 (’) for french installations.

Really? U+2019 is a punctuation character. (If I understood the issue correctly, then space, dot and dash are the only punctuation chars intended to be allowed?)

French accented vowels are already covered by \w in Unicode context.
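Both statements can be checked quickly in a Python 3 shell. This is only a sketch using the standard `re` and `unicodedata` modules; the sample letters are chosen arbitrarily.

```python
import re
import unicodedata

quote = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# General category of U+2019: "Pf" (punctuation, final quote).
print(unicodedata.category(quote))

# Accented letters (here: e-acute, a-grave, u-grave, c-cedilla) match \w
# for str patterns in Python 3.
accented = "\u00e9\u00e0\u00f9\u00e7"
print(bool(re.match(r"^\w+$", accented)))

# U+2019 itself is not a word character.
print(bool(re.match(r"^\w$", quote)))
```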
*** Bug 24137 has been marked as a duplicate of this bug. ***
*** Bug 18332 has been marked as a duplicate of this bug. ***