Bug 47221 - Add pyhook directly after reading CSV line but before processing line
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Import scripts
Version: UCS@school 4.3
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS@school 4.3 v5
Assigned To: Sönke Schwardt-Krummrich
QA Contact: Daniel Tröder
Depends on:
Blocks: 47740
Reported: 2018-06-20 16:49 CEST by Sönke Schwardt-Krummrich
Modified: 2018-09-11 11:34 CEST (History)

See Also:
What kind of report is it?: Feature Request
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Description Sönke Schwardt-Krummrich univentionstaff 2018-06-20 16:49:27 CEST
The importer needs a post-read hook that can alter an incoming data entry before the importer processes that entry for the first time.
Comment 1 Sönke Schwardt-Krummrich univentionstaff 2018-07-12 15:01:12 CEST
1) The LDAP connection self.lo has been moved from CsvReader to its superclass BaseReader.
2) The BaseReader now scans /usr/share/ucs-school-import/pyhooks/ for pyhooks that are derived from the class PostReadPyHook. The hook method "entry_read" is called directly after reading one entry from the data file.
Three arguments are passed to entry_read():
- int entry_count: index of the data entry (e.g. line of the CSV file)
- list[str] input_data: input data as raw as possible (e.g. raw CSV columns). 
  The input_data may be changed.
- dict[str, str] input_dict: input data mapped to column names. The input_dict
  may be changed.
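
A minimal sketch of such a hook, under stated assumptions: the PostReadPyHook class below is only a stand-in so that the example runs on its own (a real hook would derive from the importer's class and be placed in /usr/share/ucs-school-import/pyhooks/), and the hook name and column names are invented. It also assumes that changes must be made in place to reach the importer.

```python
class PostReadPyHook(object):
    """Stand-in for the importer's base class (assumption, not the real one)."""

    def entry_read(self, entry_count, input_data, input_dict):
        pass


class StripWhitespaceHook(PostReadPyHook):
    """Hypothetical hook: trims whitespace from every value of an entry."""

    def entry_read(self, entry_count, input_data, input_dict):
        # Change both containers in place so that the caller sees the result.
        input_data[:] = [value.strip() for value in input_data]
        for key in list(input_dict):
            input_dict[key] = input_dict[key].strip()


# Invented sample entry: raw columns plus the same data mapped to column names.
row = [" Anna ", "Schmidt "]
mapped = {"firstname": " Anna ", "lastname": "Schmidt "}
StripWhitespaceHook().entry_read(1, row, mapped)
```

After the call, both row and mapped hold the trimmed values.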

ucs-school-import (16.0.2-24)
163cb3f30245 | Bug #47221: add new PostReadPyHook
36c6fd1bfb71 | Bug #47221: move lo from CsvReader to BaseReader
Comment 2 Daniel Tröder univentionstaff 2018-07-29 10:07:50 CEST
* OK: moving LDAP connection object from CsvReader to BaseReader and making it a member variable
* OK: code, fixed a docstring (no rebuild):

[4.3 fc47c380e] Bug #47221: sphinx doesn't support space in combined type-parameter-string
[4.3 ea9bf72c2] Bug #47221: sphinx doesn't support space in combined type-parameter-string

* added a ucs-test to verify functionality:
[4.3 106d5d4f8] Bug #47221: add test for PostReadPyHook

ucs-test-ucsschool (5.0.2-73)
Comment 3 Sönke Schwardt-Krummrich univentionstaff 2018-08-30 11:06:18 CEST
I had to reopen the bug because three extensions were needed:

1) It had to be possible to skip individual data records (lines of the import file) if the PostReadPyHook decides so. The base_reader has been adapted accordingly. Any PostReadPyHook can raise the exception UcsSchoolImportSkipImportRecord in its entry_read() method, and the base_reader then skips that record.

2) A hook method was needed that is executed *after* all data records have been read. For this purpose PostReadPyHook was extended by the method all_entries_read(). It receives all ImportUser objects and any errors as a list, so that they can be inspected again. If possible, no changes should be made at this point (possible side effects have not been tested!).

3) To pass collected data, for example from entry_read() to all_entries_read(), the PyHooksLoader has been modified to provide a global cache (in a class variable) for the classes of the imported PyHook files. Data that a PostReadPyHook *instance* writes to a class variable is therefore no longer lost when the class is reloaded, and can be passed between the methods of a hook class across different instances of that class.
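
The three extensions can be sketched together as follows. This is an illustration under assumptions, not the importer's code: the exception class and read_all() are stand-ins for UcsSchoolImportSkipImportRecord and the base_reader loop, the hook class and the "lastname" column are invented, and the parameter names of all_entries_read() are guesses based on the description above.

```python
class UcsSchoolImportSkipImportRecord(Exception):
    """Stand-in for the importer's exception of the same name."""


class CountingHook(object):
    """Hypothetical hook; in the importer it would derive from PostReadPyHook."""

    skipped = []  # class variable, shared by all instances of this class

    def entry_read(self, entry_count, input_data, input_dict):
        if not input_dict.get("lastname"):
            # Mutate the shared list in place and tell the reader to skip.
            self.skipped.append(entry_count)
            raise UcsSchoolImportSkipImportRecord("no lastname")

    def all_entries_read(self, imported_users, errors):
        # Read-only summary; the passed objects are not modified here.
        return "read {}, skipped {}".format(len(imported_users), self.skipped)


def read_all(rows):
    """Toy reader loop (assumption): a fresh hook instance for every call."""
    users = []
    for count, row in enumerate(rows, start=1):
        try:
            CountingHook().entry_read(count, list(row.values()), row)
        except UcsSchoolImportSkipImportRecord:
            continue  # record is dropped, reading goes on
        users.append(row)
    return users, CountingHook().all_entries_read(users, [])


rows = [{"lastname": "Schmidt"}, {"lastname": ""}, {"lastname": "Meier"}]
users, summary = read_all(rows)
```

In this toy run the second record is skipped, and the instance that handles all_entries_read() still sees the entry number collected by an earlier instance, because the list lives on the class.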

Additionally, ucs-school-import/doc/ now contains a sequence diagram that shows when each PyHook is called.

The following packages were adapted and rebuilt for this purpose:

Package: ucs-school-lib
Version: 11.0.1-20A~4.3.0.201808301052
Branch: ucs_4.3-0
Scope: ucs-school-4.3

Package: ucs-school-import
Version: 16.0.2-40A~4.3.0.201808301052
Branch: ucs_4.3-0
Scope: ucs-school-4.3

a4ca06ea7 Bug #47221: update advisories
51eb3db2c Bug #47221: added technical documentation for PyHooks
7743bf725 Bug #47221: add changelog entry
8581adc56 Bug #47221: test new method all_entries_read() in PostReadPyHook
0ad81400c Bug #47221: add changelog entry
b670e95c2 Bug #47221: call new all_entries_read() PostReadPyHook method after the input file has been processed
bb5a011c8 Bug #47221: add new method all_entries_read() to PostReadPyHook
adec235e0 Bug #47221: allow PostReadHook to skip single lines of input file
d4aba3429 Bug #47221: add class cache in PyHookLoader
802c249aa Bug #47221: update advisory
Comment 5 Daniel Tröder univentionstaff 2018-09-03 11:18:04 CEST
(In reply to Sönke Schwardt-Krummrich from comment #4)
> currently fails

SwitchGivenNameCnHook.all_entries_read() from test237_post_read_pyhookpy is not used:

[2018-09-03 02:51:29.983152] 2018-09-03 02:51:29 INFO  pyhooks_loader.get_hook_objects:146  Loaded hooks: {'entry_read': ['SwitchGivenNameCnHook.entry_read']}.
Comment 6 Sönke Schwardt-Krummrich univentionstaff 2018-09-03 11:29:05 CEST
f5700f37f Bug #47221: update test due to API extensions in PostReadPyHook

Package: ucs-test-ucsschool
Version: 5.0.2-89A~4.3.0.201809031127
Branch: ucs_4.3-0
Scope: ucs-school-4.3
Comment 7 Daniel Tröder univentionstaff 2018-09-03 13:15:10 CEST
Either use a mutable type as class variable or access the attribute in the class' scope.
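
The underlying Python behaviour can be sketched with three invented classes (none of them is taken from the test): an augmented assignment through self rebinds the name on the instance, while assignment in the class' scope or in-place mutation of a mutable object changes the shared value.

```python
class Rebinding(object):
    counter = 0  # immutable class variable

    def bump(self):
        # Creates an instance attribute; the class variable stays 0 and
        # other instances never see the change.
        self.counter += 1


class ViaClass(object):
    counter = 0

    def bump(self):
        # Assigning in the class' scope updates the shared value.
        type(self).counter += 1


class Mutable(object):
    seen = []  # mutable class variable

    def bump(self):
        # In-place mutation; the name itself is never rebound.
        self.seen.append(1)


# Two separate instances per class, as when a hook is instantiated repeatedly.
for cls in (Rebinding, ViaClass, Mutable):
    cls().bump()
    cls().bump()
```

Afterwards Rebinding.counter is still 0, whereas ViaClass.counter is 2 and Mutable.seen holds two items.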

[4.3 f29d2f673] Bug #47221: fix test

ucs-test-ucsschool (5.0.2-90)
Comment 8 Daniel Tröder univentionstaff 2018-09-03 14:28:22 CEST
[4.3 0a577891b] Bug #47221: improve docstring

No code change, no rebuild.

(In reply to Sönke Schwardt-Krummrich from comment #3)
> I had to reopen the bug because three extensions were needed:
OK: class cache in PyHookLoader
OK: new method all_entries_read() in PostReadPyHook

> Additionally in ucs-school-import/doc/ there is now a chart which shows a
> sequence diagram when which PyHook is called.
Excellent!

OK: ucs-test is successful
Comment 9 Daniel Tröder univentionstaff 2018-09-04 13:10:43 CEST
(In reply to Daniel Tröder from comment #8)
> (In reply to Sönke Schwardt-Krummrich from comment #3)
> > Additionally in ucs-school-import/doc/ there is now a chart which shows a
> > sequence diagram when which PyHook is called.
> Excellent!

[4.3] 6f59227ea Bug #47447: build hooks graph in buildsystem, add section about hooks to internal documentation

The graph can now be found in the internal import documentation at: https://billy.knut.univention.de/~dtroeder/http-api-doc/hooks.html
Comment 10 Sönke Schwardt-Krummrich univentionstaff 2018-09-11 11:34:39 CEST
UCS@school 4.3 v5 has been released.

https://docs.software-univention.de/changelog-ucsschool-4.3v5-de.html

If this error occurs again, please clone this bug.