Bug 52309 - validate UCS@school objects when loading from LDAP and safely log errors, object data and traceback
validate UCS@school objects when loading from LDAP and safely log errors, obj...
Product: UCS@school
Classification: Unclassified
Component: Ucsschool-lib
Other Linux
: P5 normal (vote)
: UCS@school 4.4 v7-errata
Assigned To: Tobias Wenzel
Daniel Tröder
Depends on: 52783 52784
  Show dependency treegraph
Reported: 2020-11-03 15:36 CET by Ole Schwiegert
Modified: 2021-02-23 16:14 CET (History)
4 users (show)

See Also:
What kind of report is it?: Feature Request
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Note You need to log in before you can comment on or make changes to this bug.
Description Ole Schwiegert univentionstaff 2020-11-03 15:36:40 CET
Very often we get confronted with errors and problems that are due to corrupt or wrong data our code does not handle and then throws an exception.

The ucsschool.lib should intercept those exceptions and always log the dn of the object that causes the exception.

This change should be implemented for ucsschool.lib and kelvin.
Comment 1 Patrick Ziegler univentionstaff 2020-11-17 09:19:51 CET
Added to backlog: https://taiga.knut.univention.de/project/oschwieg-ucs-5/us/3873
Comment 2 Tobias Wenzel univentionstaff 2021-01-25 09:12:43 CET
I implemented the validation for ucsschool 4.4 in

[twenzel/52309_improve_ucsschool_logging] 5ab999657 Bug #52309: add package dependencies
[twenzel/52309_improve_ucsschool_logging] 2adc3466c fixup! Bug #52309: simplefy validation texts & some fixes
[twenzel/52309_improve_ucsschool_logging] 578372a13 Bug #52309: simplefy validation texts & some fixes
[twenzel/52309_improve_ucsschool_logging] f26b1690f Bug #52309: integrate validate_udm in school lib
[twenzel/52309_improve_ucsschool_logging] eb186820e Bug #52309: group validation
[twenzel/52309_improve_ucsschool_logging] 5dacbc21a Bug #52309: add backupcount to get_file_handler
[twenzel/52309_improve_ucsschool_logging] ee448eb66 Bug #52309: user validation and obj_to_dict

merge request: https://git.knut.univention.de/univention/ucsschool/-/merge_requests/13

After loading the UDM objects Users, Groups and Share Objects are validated and validation errors 
are logged to a logger which is passed to the validation function.
The complete udm dict as well as the stacktrace is logged to /var/log/univention/ucs-school-validation.log
with a uuid in each log to make grep easier.

During the validation lo is not used, only tests like 'A student must have at least one school group at a school they are
part of'. 

The UDM objects which are passed to the validation function are converted to match the format of dicts of the UDM Rest API. 
In Kelvin we can use the to_dict() method to get the same format.

The Kelvin part has yet to be implemented, so I'm leaving this bug open.
Comment 3 Daniel Tröder univentionstaff 2021-01-26 16:56:58 CET
Code review of code in branch 4.4 (Kelvin later):

* missing argument "when" in docstring of "get_file_handler()"

* missing call to "validate()" in "get_udm_object()" in kelvin branch

* Please don't put a comma at the end of lists/dicts/func-signatures if they fit into 1 line. It forces a newer black to explode the list over multiple lines.

* Rebase branch on 4.4, run "make format-all; make lint-all"; fix unnecessary list expansions from trailing comma.

* in 408_ucsschool_lib_validation_to_dict.py → test_udm_user_to_dict():
  - use @pytest.mark.parametrize instead of "for cls in [..]".
  - use existing fixture "schoolenv" instead of "with utu.UCSTestSchool() as schoolenv:".
  - to get the corresponding UDM object, use school_obj.get_udm_object() instead of "udm_modules.lookup(..)".
  - use udm_obj.dn instead of udm_obj.position.getDn()
  - assert udm_obj.position == dict_obj["position"]
  - for option in udm_obj.options: → the if then and else parts are all the same.

* in 408_ucsschool_lib_validation_to_dict.py → test_udm_group_to_dict() and test_udm_share_to_dict(): same as test_udm_user_to_dict()

* 407_ucsschool_lib_validation_users.py:
  - random_logger(): the temporary file is not removed.
  - random_logger(): return the logger directly. No test_function calls it twice, so there is no need to return a function. This will save one line per test.
  - mock_logger_file(): the temporary file is not removed.
  - mock_logger_file(): use of builtin name "file" as variable
  - test_correct_ldap_position(): please add argument "ids=...()" to parametrize()
  - test_*(): don't change container while iterating though them, make a copy first and iterate through that. This code will fail in Python 3.

* validator.py:
  - no need to cache access to ucr.get("ldap/base", ""), 1. it is always set, so ucr["ldap/base] can be used, 2. ucr makes a dictionary access, it cannot be made faster.
  - instead of "role.split(":")" use roles.get_role_info()
  - test_test_missing_role(): one "test" to much :)
  - user_role_mapping() does not detect teacher_and_staff and thus validate_user_roles() won't check on both roles.
  - no '{}' in "is missing a Teacher or Staff group".format(...)
  - validate_group_membership(): missing return type in type annotation
  - I don't understand the sentence "User is with r x is not part of schools x,y".
  - "does not have {}-role." → "does not have role {!r}."
  - "Students must not any other roles..." → missing verb
  - in validate_user_roles() it is only checked for students if they have a student:school:$OU role per OU, not for the other 3 roles.
  - For better readability, when printing a list to a string using str.join(), either add a blank to ",".join() → ", ".join() or use "{!r}".format(a_list).
  - If regex is already compiled, instead of "re.match(regex, string)" write "regex.match(string)".
  - There is a lot of code like this:
         roles = [
              for group in groups
              if re.match(),
         if not roles:
    That code has always O(n) and is difficult to understand.
    But it can be faster and better readable if it is written like this:
         for group in groups:
              if re.match():
  - The type annotation for a function has to be like this:
         def func():  # type: (Dict[str, List[Any]]) -> List[str]
    Dict and List are generics and are upper case and show the type of their content. The general form is: # type: (..) -> ..
  - validate_user_required_attributes(): not necessary to check on "groups" and "primaryGroup". That is already done by UDM and can never be empty.
  - validate_group_position(): """Validate user position..." → """Validate position..."
  - validate_user_roles(): repeats code from validate_obligatory_roles()
  - validate_user_position(), validate_group_position() and validate_share_position() all repeat the same code.
  - is_student() checks two different things (roles and options).


Generally I think the code will be hard to maintain. Most test function are for multiple object types / roles. So they will grow and become more complex the more obejct types / roles we have. Also because of this there are a lot of checks (like "is_student()") done multiple times more than necessary.
I suggest to convene and discuss a different architecture.
Comment 4 Daniel Tröder univentionstaff 2021-02-05 16:53:38 CET
* Please see my remarks in gitlab.
* Please run "make lint-all" to see problems found by the linters "black" and "flake8".
* "lru_cache" is not used but it is imported and a package dependency exists.
-> "python-backports.functools-lru-cache" is in unmaintained, so luckily it can be removed from the dependency list.
* Please add a tests that checks that private data goes into the private logfile and the general log file contains only non-sensitive data.
* Please ask the ProfS about the desired retention time for the private_data_logger. Currently it is "1000 files". That will 1. be to many and 2. could be a privacy issue (data may have to be deleted after 3 months). Possible, that this has to be configurable.
* All static type hints are syntactically wrong → https://mypy.readthedocs.io/en/latest/cheat_sheet.html

OK: code structure... mostly... see gitlab about passing data by (inherited) methods instead of kwargs
OK: objects are thoroughly validated - very nice!
OK: tests 407_* and 408_* pass

TODO QA: manual test in 4.4
TODO QA: code review and tests in kelvin
Comment 5 Tobias Wenzel univentionstaff 2021-02-08 16:43:29 CET
Thanks a lot for the review!

You'll find my changes in 

[twenzel/52309_improve_ucsschool_logging] e42750807 Bug #52309: qa feedback part 2
[twenzel/52309_improve_ucsschool_logging] 9f303b9c5 fixup! Bug #52309: add test for uuids
[twenzel/52309_improve_ucsschool_logging] f1a1fd1c3 fixup! Bug #52309: split logging
[twenzel/52309_improve_ucsschool_logging] 1a99ead5d Bug #52309: add test for uuids
[twenzel/52309_improve_ucsschool_logging] 7dc2acd83 Bug #52309: split logging
Comment 6 Tobias Wenzel univentionstaff 2021-02-09 10:52:54 CET
A new UCR-V was added:

Description[de]=Diese Variable steuert, wie viele Kopien der Log-Datei zum Loggen vertraulicher Daten während der Validierung von UCS@school Objekten in Rotation gehalten werden. Diese Log-Dateien sind nur von root lesbar (Standard: 60)
Description[en]=This variable controls how many copies of the log file, which is used for logging sensitive data, are kept in rotation during the validation of UCS@school objects. These log files can only be read by root (Default: 60)

[twenzel/52309_improve_ucsschool_logging] acad8f3a2 Bug #52309: add ucr-v for backupcount
Comment 7 Tobias Wenzel univentionstaff 2021-02-09 14:29:44 CET
Cherry picked the changes of 4.4 to kelvin branch in 

[twenzel/52309_validation] 98eea0f9c fixup! Bug #52309: qa feedback part 2
[twenzel/52309_validation] cab7cb3f1 Bug #52309: qa feedback part 2

The changes now include:

1) Validation-Classes are determined by UDM Options and implement the methods which
yield the needed groups and roles.
2) To reduce calls to UCR and to make the code more public, regexes, groupnames etc.
were moved to SchoolSearchBase.
3) Logging
- The logging is split: validation errors, dn and an uuid for the event are logged to the logger passed to the validation, e.g. to http.log or the schoolwizzard.log. 
- The stacktrace as well as a dict representation of the udm object and the
event uuid are logged to /var/log/univention/ucs-school-validation.log or /var/log/univention/ucsschool-kelvin-rest-api/ucs-school-validation.log
- The UCR-V ucsschool/validation/logging/backupcount was introduced (see Comment 6)
Comment 8 Tobias Wenzel univentionstaff 2021-02-10 16:55:15 CET
The UCR-V ucsschool/validation/logging/backupcount is now synced into the kelvin-docker container 

[twenzel/52309_validation] 5e2fcb881 Bug #52309: sync ucr-v to container
Comment 9 Tobias Wenzel univentionstaff 2021-02-11 12:02:19 CET
Documentation for ucsschool was added in

[twenzel/52309_improve_ucsschool_logging] 14ecec29d fixup! Bug #52309: add documentation for validation logging
[twenzel/52309_improve_ucsschool_logging] f56e71f3a Bug #52309: add documentation for validation logging

Documentation for kelvin was added in 

[twenzel/52309_validation] 297e8cadd fixup! fixup! Bug #52309: add documentation
[twenzel/52309_validation] 9aeb1f618 fixup! Bug #52309: add documentation
[twenzel/52309_validation] 41ba484cb Bug #52309: add documentation
Comment 10 Tobias Wenzel univentionstaff 2021-02-17 15:32:19 CET
As discussed, the commits were squashed, merged & the packages were build.

[4.4] 99ced2b1a Bug #52309: update advisory
[4.4] c43a43dc6 Bug #52309: changelog and advisory
[4.4] d09513771 Bug #52309: Merge branch 'twenzel/52309_improve_ucsschool_logging' into 4.4
[4.4] 03272ca9d Bug #52309: add documentation for validation logging
[4.4] 90d42d37e Bug #52309: add ucr-v for backupcount
[4.4] ac65d7aaf Bug #52309: group & user validation

Package: ucs-school-lib
Version: 12.2.10A~
Branch: ucs_4.4-0
Scope: ucs-school-4.4

Package: ucs-test-ucsschool
Version: 6.0.186A~
Branch: ucs_4.4-0
Scope: ucs-school-4.4

and in feature/kelvin

[feature/kelvin] 34a868f48 Bug #52309: Merge branch 'twenzel/52309_validation' into feature/kelvin
[feature/kelvin] 011ecd04f Bug #52309: add documentation
[feature/kelvin] 2c77474f0 Bug #52309: sync ucr-v to container
[feature/kelvin] 76a24458f Bug #52309: group & user validation
Comment 11 Tobias Wenzel univentionstaff 2021-02-18 10:34:44 CET
As discussed

[4.4] 71aafc655 Bug #52309: update advisory
[4.4] 8de238409 Bug #52309: correct type hints and replace decorator

[feature/kelvin] 27fb9494b Bug #52309: correct type hints and replace decorator
[feature/kelvin] 9b85f3f26 Bug #52309: add newline in requirements.txt

the container was built on docker.knut

Thanks again for the QA!
Comment 12 Daniel Tröder univentionstaff 2021-02-19 10:15:03 CET
OK: code and tests in Kelvin branch
OK: Jenkins job Kelvin branch
OK: code and tests in 4.4 branch

Just waiting for the 4.4 Jenkins jobs to run and succeed and this bug is verified.
Comment 13 Daniel Tröder univentionstaff 2021-02-22 17:17:04 CET
OK: Jenkins job

Only thing missing for verify is the merge of the code to the 5.0 branch.
Comment 14 Tobias Wenzel univentionstaff 2021-02-23 09:09:11 CET
Nice! As discussed, here is the merge request https://git.knut.univention.de/univention/ucsschool/-/merge_requests/14
Comment 15 Daniel Tröder univentionstaff 2021-02-23 10:42:29 CET
Thank you.
Comment 16 Tobias Wenzel univentionstaff 2021-02-23 16:14:19 CET
Errata update for UCS@school 4.4 v8 have been released.


If this error occurs again, please clone this bug.