Bug 47681 - Provide better error message if mandatory columns do not match in json-config and csv file
Provide better error message if mandatory columns do not match in json-config...
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Import scripts
UCS@school 4.3
Other other
: P5 normal (vote)
: UCS@school 4.3 v6
Assigned To: Daniel Tröder
Ole Schwiegert
:
Depends on: 47691
Blocks: 48655
  Show dependency treegraph
 
Reported: 2018-08-28 10:06 CEST by Michael Grandjean
Modified: 2019-02-12 12:13 CET (History)
5 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 2: Improvement: Would be a product improvement
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.069
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Grandjean univentionstaff 2018-08-28 10:06:09 CEST
Scenario: I uploaded a CSV file to the HTTP-API UMC User Import module with a column name "Primarykey" that contained the record_uid. But in the user_import.json config file, the mapping was "ID": "record_uid". So, I had a mismatch between Primarykey (file) <-> ID (config).
In this case, the error message / Traceback in /var/lib/ucs-school-import/jobs/*/*/ucs-school-import.log is not helpful at all:

> 2018-08-28 09:45:10 ERROR cmdline.main:138  Outer Exception catcher: AttributeError("'NoneType' object has no attribute 'replace'",)
> Traceback (most recent call last):
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/frontend/cmdline.py", line 118, in main
>     self.do_import()
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/frontend/cmdline.py", line 96, in do_import
>     importer.mass_import()
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/mass_import.py", line 70, in mass_import
>     self.import_users()
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/mass_import.py", line 100, in import_users
>     user_import.create_and_modify_users(imported_users)  # 90% - 100%
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 122, in create_and_modify_users
>     user = self.determine_add_modify_action(imported_user)
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 197, in determine_add_modify_action
>     user = imported_user.get_by_import_id(self.connection, imported_user.source_uid, imported_user.record_uid)
>   File "/usr/lib/pymodules/python2.7/ucsschool/importer/models/import_user.py", line 338, in get_by_import_id
>     (source_uid, record_uid)
>   File "/usr/lib/python2.7/dist-packages/ldap/filter.py", line 59, in filter_format
>     return filter_template % (tuple(map(escape_filter_chars,assertion_values)))
>   File "/usr/lib/python2.7/dist-packages/ldap/filter.py", line 43, in escape_filter_chars
>     s = assertion_value.replace('\\', r'\5c')
> AttributeError: 'NoneType' object has no attribute 'replace'

We should check and catch this. I think something like "No record_uid given." as error message should be fine.
Comment 1 Daniel Tröder univentionstaff 2018-08-28 10:27:39 CEST
Lowering priority, because workaround exists: use correct data/config.

TODO:
1) raise custom Exception in get_by_import_id() if record_uid or source_uid is empty
2) add check for empty record_uid and source_uid to ImportUser.validate() (see Bug #45715). Then 1) shouldn't even happen.
Comment 2 Jürn Brodersen univentionstaff 2018-09-03 21:54:09 CEST
I disagree that a workaround exists. A workaround would be an existing tool to check for config errors/problems that would have caught this.

I would prefer the default csv reader throwing an error in case the the mapping doesn't match the csv header instead of some tracebacks deep in the code.
Comment 3 Daniel Tröder univentionstaff 2018-09-04 09:17:14 CEST
The CSV reader is not the correct place. The required attributes (record_uid and source_uid) can be created at other places (e.g. from scheme, hook).

Additionally to the TODOs above, record_uid and source_uid should be hardcoded members of ImportUser.required_attributes (→ depend on Bug #47691).
Comment 4 Daniel Tröder univentionstaff 2018-09-17 10:54:20 CEST
The CSV reader can check however, that all configuration keys in csv:mapping exist as column names.
The reverse ist not required (superfluous columns are ignored).
Comment 5 Daniel Tröder univentionstaff 2018-09-17 11:21:45 CEST
Working on it in branch dtroeder/47681_check_input_and_config:

1003251d5: Bug #47681: check that CSV columns configured in csv:mapping exist
Comment 6 Daniel Tröder univentionstaff 2018-09-17 15:13:42 CEST
With this "record_uid" and "source_uid" (and in the case of a legacy import also "name") are checked before use:

3a7188d5f: Bug #47681: create custom error message when username, source_uid or record_uid as missing

"mandatory_attributes" now includes by default all properties that must not be empty for an import to work:
* "record_uid" and "source_uid" (& "name") are required to find existing users in LDAP
* "firstname" and "lastname" are required by UDM to create a user
* "school" is required by the ucsschool.lib to store the user object in the correct OU

"mandatory_attributes" is now checked to not miss any of those.

bd5a7f433: Bug #47681: add record_uid and source_uid to mandatory_attributes, check configuration of mandatory_attributes

From comment5 (1003251d5): When starting to read the CSV, it is checked that the CSV columns configured in csv:mapping exist in the CSV file.
Comment 7 Ole Schwiegert univentionstaff 2018-09-25 10:21:47 CEST
The check of the csv mapping, missing recordUID, sourceUID or username works fine. If you build the package I can do the final tests and verify.
Comment 8 Daniel Tröder univentionstaff 2018-09-25 11:59:30 CEST
Addad a changelog entry, merged, built and created an advisory.

ucs-school-import (16.0.2-49)
Comment 9 Ole Schwiegert univentionstaff 2018-09-26 09:25:46 CEST
Advisory & Changelog: OK
Manual tests with different configs and import files (missing columns, not needed columns, missing sourceUID/recordUID) worked.

Some tests seem to fail because of the changes made here. Waiting for them to be fixed before verifying
Comment 10 Ole Schwiegert univentionstaff 2018-09-28 08:05:03 CEST
90_ucsschool.228_import_user_input_data_in_pyhooks.test
90_ucsschool.234_modify_user_data_in_pyhooks.test
Comment 11 Ole Schwiegert univentionstaff 2018-09-28 08:20:48 CEST
90_ucsschool.34_import-users-legacy.test
90_ucsschool.34_import-users_legacy_migration.test
Comment 12 Ole Schwiegert univentionstaff 2018-09-28 08:23:38 CEST
90_ucsschool.203_import-users_empty_email_scheme.test
90_ucsschool.229_import-users_blank_email.test
Comment 13 Daniel Tröder univentionstaff 2018-09-28 14:26:57 CEST
The legacy import does not officially support a "password" column, but the old script did anyway. As the "new legacy" script is meant as a drop-in, it must too. So now a missing 'password' column is not considered an error anymore:

[4.3] a42a97dbb Bug #47681: continue supporting optional, undocumented 'password' column in the legacy import

On the other hand, a test that used the missing test about missing columns had to be adapted:

[4.3] d28dc0016 Bug #47681: missing columns are an error now

[4.3] 4f6d4a8c8 Bug #47681: advisory update
Comment 14 Ole Schwiegert univentionstaff 2018-10-01 07:58:14 CEST
ucs-school-testuser-import fails if not provided with the option --create-email-addresses:

./ucs-school-testuser-import --student 10 --classes 2 AE

------ Importing users... ------
------ Starting to read users from input data... ------
Error reading 1. user: Columns configured in csv:mapping missing: EMail.
Traceback (most recent call last):
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/mass_import/user_import.py", line 108, in read_input
    import_user = self.reader.next()
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/reader/base_reader.py", line 87, in next
    input_dict = self.import_users.next()
  File "/usr/lib/pymodules/python2.7/ucsschool/importer/reader/csv_reader.py", line 169, in read
    ', '.join(missing_columns)))
ConfigurationError: Columns configured in csv:mapping missing: EMail.
------ User import statistics ------
Read users from input data: 0
Errors: 1
Entry #: Error description
  0: NoName: Columns configured in csv:mapping missing: EMail.
------ End of user import statistics ------
------ Writing user import summary to /var/lib/ucs-school-import/summary/2018/10/user_import_summary_2018-10-01_07:57:07.csv... ------
Searching for hooks of type 'ResultPyHook' in: /usr/share/ucs-school-import/pyhooks...
Found hook classes: 
Loaded hooks: {}.
------ Importing users done. ------
Starting univention-directory-notifier
Starting univention-directory-notifier (via systemctl): univention-directory-notifier.service.

univention-directory-notifier started
More than 0 errors. Exiting. Errors:
0: Columns configured in csv:mapping missing: EMail.
------ ucs-school-user-import return with exit code 1 ------
Comment 15 Daniel Tröder univentionstaff 2018-10-01 11:28:35 CEST
The email address column is now always created, but if --create-email-addresses is not used, it will be empty for all users.

The csv:schema:email configuration was removed. So when the CSV email column is empty, users will not have a email address.

The help text for testing a CSV file meant for the HTTP API import was improved to be copyandpasteable.

[4.3] a2a3cf059 Bug #47681: always create email column, create empty email addresses when --create-email-addresses is not used
[4.3] d2cd22998 Bug #47681: improve help text for testing HTTP-API CSV import from the command line
[4.3] f1eec5895 Bug #47681: changelog
[4.3] 9f515adc4 Bug #47681: advisory update

ucs-school-import (16.0.2-54)

I added post-build a note about the  help message to the internal docs:

[4.3 05f65f0d3] Bug #47681: add note about help message to internal docs
Comment 16 Ole Schwiegert univentionstaff 2018-10-02 09:06:35 CEST
A side effect with the new check for the mandatory attributes: If you provide mandatory attributes in the config yourself, you overwrite the default. Lets say you make email mandatory with ["email"]. Now the importer aborts with mandadory attribute missing tests for last/first/name, source/record_uid, school and name since they were removed from the list by the config.

To have to add them just to add one custom mandatory attribute is very tedious. A solution would be to always add these, anyway necessary attributes, to the configured list after all config files were applied. That keeps the possibility to override the mandatory attribute list in other configs but keeps the anyway necessary attributes in the list.

--as discussed with Daniel
Comment 17 Daniel Tröder univentionstaff 2018-10-02 11:19:19 CEST
It is now possible to set in  a json config:

{
    "mandatory_attributes": ["email"]
}

and the import will do:

Appending ['firstname', 'lastname', 'name', 'record_uid', 'school', 'source_uid'] to "mandatory_attributes".


[4.3] 7b7cb7167 Bug #47885: automatically add missing properties to mandatory_attributes
[4.3] 65ae3256c Bug #47885: fix doc
[4.3] cc816c09a Bug #47885: advisory update

ucs-school-import (16.0.2-55)

http://jenkins.knut.univention.de:8080/job/UCSschool-4.3/job/Handbook/lastBuild/artifact/webroot/ucsschool-import-handbuch-4.3.html#table:userimport_configuration
Comment 18 Ole Schwiegert univentionstaff 2018-10-04 09:06:51 CEST
Advisory & Changelog: OK
Package installs: OK
Works: OK (like before)
mandatory_attributes now extended: OK

looks all fine for me now
Comment 19 Sönke Schwardt-Krummrich univentionstaff 2018-11-16 11:48:15 CET
UCS@school 4.3 v6 has been released.

https://docs.software-univention.de/changelog-ucsschool-4.3v6-de.html

If this error occurs again, please clone this bug.