Univention Bugzilla – Bug 47681
Provide better error message if mandatory columns do not match in json-config and csv file
Last modified: 2019-02-12 12:13:31 CET
Scenario: I uploaded a CSV file to the HTTP-API UMC User Import module with a column name "Primarykey" that contained the record_uid. But in the user_import.json config file, the mapping was "ID": "record_uid". So, I had a mismatch between Primarykey (file) <-> ID (config). In this case, the error message / Traceback in /var/lib/ucs-school-import/jobs/*/*/ucs-school-import.log is not helpful at all: > 2018-08-28 09:45:10 ERROR cmdline.main:138 Outer Exception catcher: AttributeError("'NoneType' object has no attribute 'replace'",) > Traceback (most recent call last): > File "/usr/lib/pymodules/python2.7/ucsschool/importer/frontend/cmdline.py", line 118, in main > self.do_import() > File "/usr/lib/pymodules/python2.7/ucsschool/importer/frontend/cmdline.py", line 96, in do_import > importer.mass_import() > File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/mass_import.py", line 70, in mass_import > self.import_users() > File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/mass_import.py", line 100, in import_users > user_import.create_and_modify_users(imported_users) # 90% - 100% > File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 122, in create_and_modify_users > user = self.determine_add_modify_action(imported_user) > File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 197, in determine_add_modify_action > user = imported_user.get_by_import_id(self.connection, imported_user.source_uid, imported_user.record_uid) > File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 338, in get_by_import_id > (source_uid, record_uid) > File "/usr/lib/python2.7/dist-packages/ldap/filter.py", line 59, in filter_format > return filter_template % (tuple(map(escape_filter_chars,assertion_values))) > File "/usr/lib/python2.7/dist-packages/ldap/filter.py", line 43, in escape_filter_chars > s = assertion_value.replace('\\', r'\5c') > AttributeError: 'NoneType' object has no attribute 'replace' We should check and catch this. I think something like "No record_uid given." as error message should be fine.
Lowering priority, because workaround exists: use correct data/config. TODO: 1) raise custom Exception in get_by_import_id() if record_uid or source_uid is empty 2) add check for empty record_uid and source_uid to ImportUser.validate() (see Bug #45715). Then 1) shouldn't even happen.
I disagree that a workaround exists. A workaround would be an existing tool to check for config errors/problems that would have caught this. I would prefer the default csv reader throwing an error in case the the mapping doesn't match the csv header instead of some tracebacks deep in the code.
The CSV reader is not the correct place. The required attributes (record_uid and source_uid) can be created at other places (e.g. from scheme, hook). Additionally to the TODOs above, record_uid and source_uid should be hardcoded members of ImportUser.required_attributes (→ depend on Bug #47691).
The CSV reader can check however, that all configuration keys in csv:mapping exist as column names. The reverse ist not required (superfluous columns are ignored).
Working on it in branch dtroeder/47681_check_input_and_config: 1003251d5: Bug #47681: check that CSV columns configured in csv:mapping exist
With this "record_uid" and "source_uid" (and in the case of a legacy import also "name") are checked before use: 3a7188d5f: Bug #47681: create custom error message when username, source_uid or record_uid as missing "mandatory_attributes" now includes by default all properties that must not be empty for an import to work: * "record_uid" and "source_uid" (& "name") are required to find existing users in LDAP * "firstname" and "lastname" are required by UDM to create a user * "school" is required by the ucsschool.lib to store the user object in the correct OU "mandatory_attributes" is now checked to not miss any of those. bd5a7f433: Bug #47681: add record_uid and source_uid to mandatory_attributes, check configuration of mandatory_attributes From comment5 (1003251d5): When starting to read the CSV, it is checked that the CSV columns configured in csv:mapping exist in the CSV file.
The check of the csv mapping, missing recordUID, sourceUID or username works fine. If you build the package I can do the final tests and verify.
Addad a changelog entry, merged, built and created an advisory. ucs-school-import (16.0.2-49)
Advisory & Changelog: OK Manual tests with different configs and import files (missing columns, not needed columns, missing sourceUID/recordUID) worked. Some tests seem to fail because of the changes made here. Waiting for them to be fixed before verifying
90_ucsschool.228_import_user_input_data_in_pyhooks.test 90_ucsschool.234_modify_user_data_in_pyhooks.test
90_ucsschool.34_import-users-legacy.test 90_ucsschool.34_import-users_legacy_migration.test
90_ucsschool.203_import-users_empty_email_scheme.test 90_ucsschool.229_import-users_blank_email.test
The legacy import does not officially support a "password" column, but the old script did anyway. As the "new legacy" script is meant as a drop-in, it must too. So now a missing 'password' column is not considered an error anymore: [4.3] a42a97dbb Bug #47681: continue supporting optional, undocumented 'password' column in the legacy import On the other hand, a test that used the missing test about missing columns had to be adapted: [4.3] d28dc0016 Bug #47681: missing columns are an error now [4.3] 4f6d4a8c8 Bug #47681: advisory update
ucs-school-testuser-import fails if not provided with the option --create-email-addresses: ./ucs-school-testuser-import --student 10 --classes 2 AE ------ Importing users... ------ ------ Starting to read users from input data... ------ Error reading 1. user: Columns configured in csv:mapping missing: EMail. Traceback (most recent call last): File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 108, in read_input import_user = self.reader.next() File "/usr/lib/pymodules/python2.7/ucsschool/importer/reader/base_reader.py", line 87, in next input_dict = self.import_users.next() File "/usr/lib/pymodules/python2.7/ucsschool/importer/reader/csv_reader.py", line 169, in read ', '.join(missing_columns))) ConfigurationError: Columns configured in csv:mapping missing: EMail. ------ User import statistics ------ Read users from input data: 0 Errors: 1 Entry #: Error description 0: NoName: Columns configured in csv:mapping missing: EMail. ------ End of user import statistics ------ ------ Writing user import summary to /var/lib/ucs-school-import/summary/2018/10/user_import_summary_2018-10-01_07:57:07.csv... ------ Searching for hooks of type 'ResultPyHook' in: /usr/share/ucs-school-import/pyhooks... Found hook classes: Loaded hooks: {}. ------ Importing users done. ------ Starting univention-directory-notifier Starting univention-directory-notifier (via systemctl): univention-directory-notifier.service. univention-directory-notifier started More than 0 errors. Exiting. Errors: 0: Columns configured in csv:mapping missing: EMail. ------ ucs-school-user-import return with exit code 1 ------
The email address column is now always created, but if --create-email-addresses is not used, it will be empty for all users. The csv:schema:email configuration was removed. So when the CSV email column is empty, users will not have a email address. The help text for testing a CSV file meant for the HTTP API import was improved to be copyandpasteable. [4.3] a2a3cf059 Bug #47681: always create email column, create empty email addresses when --create-email-addresses is not used [4.3] d2cd22998 Bug #47681: improve help text for testing HTTP-API CSV import from the command line [4.3] f1eec5895 Bug #47681: changelog [4.3] 9f515adc4 Bug #47681: advisory update ucs-school-import (16.0.2-54) I added post-build a note about the help message to the internal docs: [4.3 05f65f0d3] Bug #47681: add note about help message to internal docs
A side effect with the new check for the mandatory attributes: If you provide mandatory attributes in the config yourself, you overwrite the default. Lets say you make email mandatory with ["email"]. Now the importer aborts with mandadory attribute missing tests for last/first/name, source/record_uid, school and name since they were removed from the list by the config. To have to add them just to add one custom mandatory attribute is very tedious. A solution would be to always add these, anyway necessary attributes, to the configured list after all config files were applied. That keeps the possibility to override the mandatory attribute list in other configs but keeps the anyway necessary attributes in the list. --as discussed with Daniel
It is now possible to set in a json config: { "mandatory_attributes": ["email"] } and the import will do: Appending ['firstname', 'lastname', 'name', 'record_uid', 'school', 'source_uid'] to "mandatory_attributes". [4.3] 7b7cb7167 Bug #47885: automatically add missing properties to mandatory_attributes [4.3] 65ae3256c Bug #47885: fix doc [4.3] cc816c09a Bug #47885: advisory update ucs-school-import (16.0.2-55) http://jenkins.knut.univention.de:8080/job/UCSschool-4.3/job/Handbook/lastBuild/artifact/webroot/ucsschool-import-handbuch-4.3.html#table:userimport_configuration
Advisory & Changelog: OK Package installs: OK Works: OK (like before) mandatory_attributes now extended: OK looks all fine for me now
UCS@school 4.3 v6 has been released. https://docs.software-univention.de/changelog-ucsschool-4.3v6-de.html If this error occurs again, please clone this bug.