50321 – Assist migration from legacy import to new import

Status:	CLOSED FIXED

Alias:	None

Product:	UCS@school
Classification:	Unclassified
Component:	Import scripts
Version:	UCS@school 4.4
Hardware:	Other Linux

Importance:	P5 normal
Target Milestone:	UCS@school 4.4 v3-errata
Assignee:	Sönke Schwardt-Krummrich
QA Contact:	Daniel Tröder

URL:
Keywords:

Depends on:
Blocks:

Reported:	2019-10-04 16:30 CEST by Sönke Schwardt-Krummrich
Modified:	2019-10-09 13:53 CEST
CC List:	0 users

See Also:
What kind of report is it?:	Development Internal
What type of bug is this?:	---
Who will be affected by this bug?:	---
How will those affected feel about the bug?:	---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Customer ID:
Max CVSS v3 score:

Description Sönke Schwardt-Krummrich

2019-10-04 16:30:27 CEST

Scenario:
Users were created in LDAP either with the legacy user import (import_user) or manually via the UMC module "Benutzer (Schulen)". For these users, the LDAP attributes ucsschoolRecordUID and ucsschoolSourceUID are either not set, or are set to ucsschoolRecordUID==$UID and ucsschoolSourceUID=="LegacyDB". This is usually impractical when switching to the new CLI import (ucs-school-import-user) or the UMC module "Benutzer-Import", because both new import paths assume that ucsschoolRecordUID and ucsschoolSourceUID hold meaningful values.

A script is required which supports the migration of user objects from the old to the new import.

Solution approach:
A script, still to be implemented, supports the domain administrator in migrating the user objects. It is given a CSV file with two columns, user name and RecordUID, and sets the given RecordUID on each of the listed user objects. Optionally, a SourceUID can be specified on the command line; it is then set uniformly for all users listed in the file.
This step has to be done only once for the existing users. Afterwards, all user objects can be maintained via e.g. ucs-school-import-user.
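
The approach can be sketched roughly as follows. This is only a minimal illustration and not the code of migrate_ucsschool_import_user: the names read_mapping and plan_changes are hypothetical, the current attribute values are passed in as a plain dictionary instead of being read from LDAP, and nothing is written back.

```python
import csv
import io


def read_mapping(csv_text, delimiter=";"):
    """Return (username, record_uid) pairs from a two-column CSV, skipping the header row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(reader, None)  # the first row is assumed to be a header
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def plan_changes(mapping, current, source_uid=None):
    """Describe the attribute changes per user without touching any directory.

    `current` maps username -> (record_uid, source_uid) and stands in for LDAP
    (an assumption made for this sketch).
    """
    lines = []
    for username, record_uid in mapping:
        old_record, old_source = current.get(username, (None, None))
        # Without a SourceUID on the command line, the old value is kept.
        new_source = source_uid if source_uid is not None else old_source
        lines.append(
            "User: %r   record_uid: %r ==> %r   source_uid: %r ==> %r"
            % (username, old_record, record_uid, old_source, new_source)
        )
    return lines


MAPPING_CSV = '"username";"record_uid"\n"anton123";"12345678"\n"tom.teacher";"L887766"\n'
CURRENT = {"tom.teacher": (None, "LegacyDB")}

mapping = read_mapping(MAPPING_CSV)
changes = plan_changes(mapping, CURRENT, source_uid="gsmitte")
for line in changes:
    print(line)
```

Separating the planning step from the actual modification makes it possible to review the intended changes before anything is written.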

Example:
# cd /usr/share/ucs-school-import/scripts/
# cat mapping.csv
"username";"record_uid"
"anton123";"12345678"
"tom.teacher";"L887766"
"jeff.staff";"E-2345234-234-B"
# ./migrate_ucsschool_import_user --input-file=mapping.csv --modify-record-uid --source-uid="gsmitte"
User: 'anton123'   record_uid: [] ==> ['12345678']   source_uid: [] ==> ['gsmitte']
User: 'tom.teacher'   record_uid: [] ==> ['L887766']   source_uid: ['LegacyDB'] ==> ['gsmitte']
User: 'jeff.staff'   record_uid: [] ==> ['E-2345234-234-B']   source_uid: [] ==> ['gsmitte']
# cat /var/log/univention/ucs-school-migration-import-user.log
2019-09-13 15:49:20 INFO  migrate_ucsschool_import_user.__init__:142  Given arguments: ['/usr/share/ucs-school-import/scripts/migrate_ucsschool_import_user', '--modify-record-uid', '--input-file=mapping.csv', '--source-uid=gsmitte']
2019-09-13 15:49:20 INFO  migrate_ucsschool_import_user.modify_record_uid:294  User: 'anton123'   record_uid: [] ==> ['12345678']   source_uid: [] ==> ['gsmitte']
2019-09-13 15:49:20 INFO  migrate_ucsschool_import_user.modify_record_uid:294  User: 'tom.teacher'   record_uid: [] ==> ['L887766']   source_uid: ['LegacyDB'] ==> ['gsmitte']
2019-09-13 15:49:20 INFO  migrate_ucsschool_import_user.modify_record_uid:294  User: 'jeff.staff'   record_uid: [] ==> ['E-2345234-234-B']   source_uid: [] ==> ['gsmitte']
#

To simplify the creation of such a two-column CSV file, the CSV file intended for later use with ucs-school-import-user can be reused, provided it contains the information "First Name", "Last Name" and "RecordUID" for each user. In a preceding step, this file is passed to the migrate_ucsschool_import_user script with the --guess-usernames parameter. In addition to this parameter, the numbers of the columns that hold first name, last name and RecordUID must be specified.

The script then attempts to identify the matching user names by first and last name and generates the file "mapping.csv" required in the above example.
Since first and last names do not have to be unique, the script may find no user at all, or, for frequently occurring names, several user objects. These problem cases are listed at the very beginning of the output file and must be assigned manually. All entries that could be assigned uniquely are listed after them. In any case, the entire file should be checked again to avoid incorrect assignments.
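
The matching step can be sketched as follows. This is a hedged illustration only: build_name_index and guess_usernames are hypothetical names, the user list stands in for an LDAP search result, and the case-insensitive comparison of names is an assumption of this sketch, not confirmed behaviour of the script.

```python
from collections import defaultdict


def build_name_index(users):
    """Index usernames by (firstname, lastname); `users` stands in for LDAP."""
    index = defaultdict(list)
    for username, firstname, lastname in users:
        index[(firstname.lower(), lastname.lower())].append(username)
    return index


def guess_usernames(rows, index):
    """Split import rows into problem cases and unique matches.

    Each row is (record_uid, firstname, lastname). Problem cases get an empty
    username plus a comment and are returned first, so they are easy to find.
    """
    problems, unique = [], []
    for record_uid, firstname, lastname in rows:
        matches = index.get((firstname.lower(), lastname.lower()), [])
        if len(matches) == 1:
            unique.append((matches[0], record_uid, ""))
        elif matches:
            comment = "multiple users found: " + ",".join(matches)
            problems.append(("", record_uid, comment))
        else:
            problems.append(("", record_uid, "no user found in LDAP"))
    return problems + unique


USERS = [
    ("anton8", "Anton", "Meyer"),
    ("anton64", "Anton", "Meyer"),
    ("anton123", "Anton", "Meyer"),
    ("tom.teacher", "Tom", "Teacher"),
]
ROWS = [
    ("12345678", "Anton", "Meyer"),
    ("L887766", "Tom", "Teacher"),
    ("E-2345234-234-B", "Jeff", "Staff"),
]

result = guess_usernames(ROWS, build_name_index(USERS))
for entry in result:
    print(entry)
```

Listing the problem cases first means the administrator only has to work through the top of the file to fill in the missing user names.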

Example:
# cd /usr/share/ucs-school-import/scripts/
# cat import.csv
"record_uid";"school";"lastname";"firstname";"classes";"comment"
"12345678";"gsmitte";"Meyer";"Anton";"gsmitte-1A";""
"L887766";"gsmitte";"Teacher";"Tom";"";"just some comments"
"E-2345234-234-B";"gsmitte";"Staff";"Jeff";"";"in-house technician"
# cat mapping.csv
cat: mapping.csv: No such file or directory
# ./migrate_ucsschool_import_user --input-file=import.csv --guess-usernames --column-firstname=4 --column-lastname=3 --column-record-uid=1 --output-file=mapping.csv
# cat mapping.csv
"This CSV file consists of 2 sections:"
"The first section lists all problematic users for whom either *no* or *several* user names were found"
"in LDAP by first and last name. These entries must be checked and corrected manually. The entries of the"
"first section can be recognized by a note in the comment column and the column for the user name being empty."
"The second section contains all entries for which exactly one user was found in LDAP with the specified"
"first and last name. These should nevertheless be checked again for correctness."
"This file contains 2 ambiguous entries and 1 unambiguous entries."
"After completion of the corrections, the lines with this text paragraph must be removed!"
""
"username","record_uid","comment","input_data"
"","12345678","multiple users found: anton8,anton64,anton123","12345678 § gsmitte § Meyer § Anton § gsmitte-1A § "
"","E-2345234-234-B","no user found in LDAP","E-2345234-234-B § gsmitte § Staff § Jeff §  § in-house technician"
"tom.teacher","L887766","",""
#
<CORRECT THE FILE MANUALLY HERE ==> FILL COLUMN FOR USERNAME>
# cat mapping.csv
"username","record_uid","comment","input_data"
"anton123","12345678","multiple users found: anton8,anton64,anton123","12345678 § gsmitte § Meyer § Anton § gsmitte-1A § "
"jeff.staff","E-2345234-234-B","no user found in LDAP","E-2345234-234-B § gsmitte § Staff § Jeff §  § in-house technician"
"tom.teacher","L887766","",""
# ./migrate_ucsschool_import_user --input-file=mapping.csv --modify-record-uid --source-uid="gsmitte"
User: 'anton123'   record_uid: [] ==> ['12345678']   source_uid: [] ==> ['gsmitte']
User: 'jeff.staff'   record_uid: [] ==> ['E-2345234-234-B']   source_uid: [] ==> ['gsmitte']
User: 'tom.teacher'   record_uid: [] ==> ['L887766']   source_uid: ['LegacyDB'] ==> ['gsmitte']
#

Comment 1 Sönke Schwardt-Krummrich

2019-10-04 16:57:17 CEST

Hint: The script always assumes that there is a header row and therefore skips the first row of the CSV file.

Hint 2: When guessing usernames, a short status line is printed every 50 import lines:
Processing line 50 ...
Processing line 100 ...
Processing line 150 ...
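
Both hints can be illustrated with a small sketch. The helper name iter_with_progress is hypothetical, and counting from the first data row (rather than from the header) is an assumption of this sketch.

```python
def iter_with_progress(rows, interval=50, report=print):
    """Yield data rows, skipping the header and reporting every `interval` lines."""
    iterator = iter(rows)
    next(iterator, None)  # the first row is always treated as a header
    for line_number, row in enumerate(iterator, start=1):
        if line_number % interval == 0:
            report("Processing line %d ..." % line_number)
        yield row


# Usage: one header row followed by 120 data rows.
messages = []
sample_rows = [["record_uid"]] + [[str(number)] for number in range(120)]
data_rows = list(iter_with_progress(sample_rows, report=messages.append))
print(len(data_rows), messages)
```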


Implemented in branch: sschwardt/50321/migrate-ucsschool-import-user

@QA: 
For a first test, simply copy the script migrate_ucsschool_import_user to
/usr/share/ucs-school-import/scripts and the ucs-test script
248_migrate_ucsschool_import_user to your test machine.

Please reopen for merge to 4.4 main branch.

Comment 2 Sönke Schwardt-Krummrich

2019-10-08 15:47:25 CEST

Merged to branch "4.4":

[4.4] fa48c3995 Bug #50321: add advisory
[4.4] c13c7537a Bug #50321: Merge branch 'sschwardt/50321/migrate-ucsschool-import-user' into 4.4
[4.4] e8a4c71f5 Bug #50321: add test for migrate_ucsschool_import_user
[4.4] 79dab1ae7 Bug #50321: added script migrate_ucsschool_import_user

Package: ucs-test-ucsschool
Version: 6.0.64A~4.4.0.201910081540
Branch: ucs_4.4-0
Scope: ucs-school-4.4

Package: ucs-school-import
Version: 17.0.14A~4.4.0.201910081540
Branch: ucs_4.4-0
Scope: ucs-school-4.4

Please note that the output of the script has changed:
User: 'bxg59r9i2i'   record_uid: None ==> '537381'   source_uid: None ==> 'hvwpy5hcjx'
User: 'gu617c7rkm'   record_uid: None ==> '537382'   source_uid: None ==> 'hvwpy5hcjx'
User: 'ajbmjgusrj'   record_uid: None ==> '537383'   source_uid: None ==> 'hvwpy5hcjx'
User: 'yw2lvrhbmb'   record_uid: None ==> '537384'   source_uid: None ==> 'hvwpy5hcjx'

Comment 3 Sönke Schwardt-Krummrich

2019-10-08 16:12:22 CEST

[4.4] bd52ef640 Bug #50321: update advisory
[4.4] 752451590 Bug #50321: reuse encoding and dialect of input file in migrate_ucsschool_import_user

Package: ucs-school-import
Version: 17.0.15A~4.4.0.201910081610
Branch: ucs_4.4-0
Scope: ucs-school-4.4
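
One way to reuse the dialect of an input file is Python's csv.Sniffer, sketched below. Whether the script uses this mechanism is an assumption; copy_with_same_dialect is a hypothetical helper, and the encoding part of the change is not covered here.

```python
import csv
import io


def copy_with_same_dialect(input_text, transform):
    """Write transformed rows using the dialect detected in the input text."""
    dialect = csv.Sniffer().sniff(input_text, delimiters=";,\t")
    reader = csv.reader(io.StringIO(input_text), dialect)
    output = io.StringIO()
    writer = csv.writer(output, dialect)
    for row in reader:
        writer.writerow(transform(row))
    return dialect, output.getvalue()


IMPORT_CSV = (
    '"record_uid";"school";"lastname";"firstname"\n'
    '"12345678";"gsmitte";"Meyer";"Anton"\n'
    '"L887766";"gsmitte";"Teacher";"Tom"\n'
)

# Keep only the first name and the RecordUID, in that order.
detected, output_text = copy_with_same_dialect(
    IMPORT_CSV, lambda row: [row[3], row[0]]
)
print(repr(detected.delimiter))
print(output_text)
```

The sniffed dialect carries the delimiter and quote character over to the output. It does not reproduce the quoting style of the input: fields are only quoted where necessary.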

Comment 4 Daniel Tröder

2019-10-09 12:53:02 CEST

OK: automatic test
OK: manual test
OK: advisory

Comment 5 Sönke Schwardt-Krummrich

2019-10-09 13:53:48 CEST

UCS@school 4.4 v3 has been released.

https://docs.software-univention.de/changelog-ucsschool-4.4v3-de.html

If this error occurs again, please clone this bug.