Bug 50039 - Improve Logging of Importer
Improve Logging of Importer
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Import scripts
UCS@school 4.4
Other Linux
: P5 normal (vote)
: UCS@school 4.4 v4-errata
Assigned To: Ole Schwiegert
Daniel Tröder
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2019-08-22 09:44 CEST by Christian Völker
Modified: 2020-03-25 16:09 CET (History)
3 users (show)

See Also:
What kind of report is it?: Feature Request
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2019082221000281, 2020012821000101
Bug group (optional): Design
Max CVSS v3 score:


Attachments
Proposed Shell Script (473 bytes, application/x-shellscript)
2019-11-27 10:43 CET, Christian Völker
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Christian Völker univentionstaff 2019-08-22 09:44:53 CEST
Request from customer (and definetly from support, too!) to improve the logging of the current importer.
===============GERMAN===================
P.S.: "Früher" wurde in /var/lib/ucs-school-import/jobs/<jahr>/<jobnr>
eine Logdatei geführt. Dort gibt es in diesem Fall zwar die .log,
allerdings ist sie noch 0 Bytes lang. Ich kann die Fehlermeldung also
nur am Bildschirm sehen. Es wäre schön, wenn das Log schon während der
Abarbeitung des DryRun geschrieben würde.
Auch /var/log/univention/ucs-school-import.log und .info sind nicht
zielführend (haben als letzten Eintrag etwas von der vergangenen Nacht).
Nur in der Datei
/var/log/univention/ucs-school-import/workers-dryrun.log
habe ich die Infos auch gefunden. Die ist aber mit im Moment 238M etwas
unhandlich und könnte eventuell mal in's logrotate einbezogen werden...
===============

For supporting it is VERY difficult to request the needed file from customers as it appears to depend on several different switches (through UMC or shell, dry run or not,...)

Having ONE logfile where the most important information is logged would be great!
Or at least all log files in a single directory. 

For troubleshooting it is very difficult to tell the customer:
"Check the job-id of the failing import. Then look into /var/lib/ucs-school-import/jobs/<year>/<job-id> and send the files. But it you started the import through UMC, we need /var/log/univention/ucs-school-import.log. Just in case the error appeared in a dry run, then please send /var/log/univention/ucs-school-import/workers-dryrun.log. If this is too large, split it somehow...."
Comment 1 Daniel Tröder univentionstaff 2019-08-22 15:28:54 CEST
Please create a separate bug for logrotate.

The dry-run is a separate job, so its logfile is written to /var/lib/ucs-school-import/jobs/<jahr>/<jobnr> with jobnr being 1 lower that the actual import job.
Only in the case of a configuration error so severe, that the import won't even start, nothing is written here.

The UMC module has separate logfiles in /var/log/univention/ucs-school-import/* This is where always something can be found - even if the configuration prevents the import from starting.

It may be possible to unite the logfiles in /var/log/univention/ucs-school-import/* into /var/log/univention/ucs-school-import.log. Then there would be only two files to consider:
* /var/log/univention/ucs-school-import.log
* /var/lib/ucs-school-import/jobs/<jahr>/<jobnr>/ucs-school-import.log
Comment 2 Christian Völker univentionstaff 2019-08-22 16:04:35 CEST
What about working with soft-links? Name the job-Log according to <year>_<job-id>.
Thus, we will have all logfiles in a single directory where we can easily collect them!

I did some test imports on my machines and do NOT have any job directories (no dry run, performed on shell) even though the import was successful.

If figured out having an additional directory /var/lib/ucs-school-import/summary/2019/08 where import summaries are logged. ANOTHER directory.

Sorry but honestly, for troubleshooting this is a nightmare! In the end I do not have a change to do offline-troubleshooting as I can not even say which files the customer should forward to support as long as I haven't been on the system itself.

Or create a script which collects ALL relevant files and logs so the customer can easily execute and send the resulting file to us! Or even better, tell us which files to include in univention-support-info script.
Comment 3 Daniel Tröder univentionstaff 2019-08-23 08:28:41 CEST
Summary and password files are _not_ log files. They are CSV files meant to be used to match the input data with the actions taken by the import system.

(In reply to Christian Völker from comment #2)
> What about working with soft-links? Name the job-Log according to
> <year>_<job-id>.
> Thus, we will have all logfiles in a single directory where we can easily
> collect them!
1. That will quickly be to much in cases where an import is done each day.
2. The directory in /v/l/u-s-i/jobs/YEAR/ID contains much more than just the log file. It was specially designed for the support to get all relevant information. That includes having all configuration files that were used to create the jobs configuration as well as the installed hooks, the summary and the password files.

Ofc there will only be something in the log file if the import was actually started. If it didn't start, then there is a configuration error. Then you must look in /var/log/univention/ucs-school-import.

You must distinguish between command line import and UMC import module. They work very differently and produce different log files in different places.

So first thing to do, is to ask the customer _how_ he started the import: with the UMC or on the command line.

Regarding the command line import:
Please read the configuration output at the start of an import run. It tells you where each file will be.
As you can edit the configuration, it is up to _you_ to set the path for log/summay/password files.
The documentation can be found as usual in /usr/share/doc/ucs-school-import/*_readme.txt.

Regarding the UMC import:
The import job is _not_ executed by the UMC module! The execution flow is like this:
1. The UMC module assembles all necessary information and creates /v/l/u-s-i/jobs/YEAR/ID and its content (configs, hooks). 
2. Then it sends a message through a messaging system (RabbitMQ) to a job queuing server (Celery).
3. Celery knows two queues: "dry_run" (4 jobs can run in parallel) and "import" (one job at a time). It accepts the job in the requested one.
4. Celery executes the job in one of its worker processes. The worker logs to /v/l/u/u-s-i/worker_*.log
5. If the import configuration is OK, the import code starts logging to /v/l/u-s-i/jobs/YEAR/ID/u-s-i.log
Comment 4 Christian Völker univentionstaff 2019-11-26 12:09:33 CET
I need to raise this.

We desperately need some sort of troubleshooting tool. 

Customer tells import has failed. He has usually not idea about the jobID.

It is not helpful if we need to ask the customer about details just to get troubleshooting information. We need a way to tell the customer to start ONE command and then we should have all needed troubleshooting information.

Collect latest five jobs or whatever...
Comment 5 Daniel Tröder univentionstaff 2019-11-26 16:19:05 CET
What about letting the customer execute this line:

tar cjpf imports.tar.bz2 /usr/share/ucs-school-import /var/lib/ucs-school-import /var/log/univention/ucs-school-import/ /var/log/univention/ucs-school-import.*

It will include everything ucs-school-import related, including logs, configs and hooks.
Comment 6 Christian Völker univentionstaff 2019-11-26 17:38:10 CET
Of course this would collect ALL data.

The resulting file would be huge as customers run importer daily/weekly and never delete old entries.

It would be better if it just would collect the last couple of runs as these are the interesting ones when troubleshooting.

But, yes, this is somehow helpful.
Comment 7 Daniel Tröder univentionstaff 2019-11-27 10:01:27 CET
tar cjpf imports.tar.bz2 /usr/share/ucs-school-import /var/log/univention/ucs-school-import/ /var/log/univention/ucs-school-import.* $(\ls -td /var/lib/ucs-school-import/jobs/*/* | head)
Comment 8 Daniel Tröder univentionstaff 2019-11-27 10:04:02 CET
tar cjpf imports.tar.bz2 \
    /usr/share/ucs-school-import \
    $(\ls -td /var/log/univention/ucs-school-import/* | head) \
    $(\ls -td /var/log/univention/ucs-school-import.* | head) \
    $(\ls -td /var/lib/ucs-school-import/jobs/*/* | head)

You could create a script similar to this and upload it here for inclusion in the product.
Comment 9 Christian Völker univentionstaff 2019-11-27 10:43:07 CET
Created attachment 10233 [details]
Proposed Shell Script
Comment 10 Daniel Tröder univentionstaff 2020-03-10 19:46:38 CET
Commits made in branch twenzel/50039_logging_of_importer:

1c247510b Bug #50039: Symlink to last log-file
fd70b2bdb Bug #50039: Add error log to ucs-school-import
8f18d100b Bug #50039: Add origin of import to logs
d433e69d4 Bug #50039: Symlink to last fail-log
7e2a7a870 Bug #50039: Log first line of csv.
204c52fe3 Bug #50039: Create symlink to each failed log
2b8fe884f Bug #50039: Shift logging of source such that it is also printed to log-file.
cdbb300f8 Bug #50039: keep argument names and order when wrapping a function
9aa1c88f4 Bug #50039: simplify and make better readable from where the import was started
5b3303b73 Bug #50039: always create an error log
7bdc63124 Bug #50039: write import statistics to error log
4f3529530 Bug #50039: reraise exception after catching it
7b4e308d2 Bug #50039: fix duplicate log entries
2a739495e Bug #50039: cleanup
636c78312 Bug #50039: create symlink to lastest error log
70ec7a25d Bug #50039: create consistent filenames for symlinks
Comment 11 Daniel Tröder univentionstaff 2020-03-10 20:11:03 CET
ERROR LOG
~~~~~~~~~
A log file "ucs-school-import-error.log" with entries of log level ERROR and above is now created besides the regular log file "ucs-school-import.log" (with entries of log level DEBUG and above). The only entries of level INFO that are written to "ucs-school-import-error.log" are the configuration dump and the import summary.

/var/log/univention/ucs-school-import.log
/var/log/univention/ucs-school-import-error.log

/var/lib/ucs-school-import/jobs/2020/17/ucs-school-import.log
/var/lib/ucs-school-import/jobs/2020/17/ucs-school-import-error.log



IMPORT SOURCE
~~~~~~~~~~~~~

The source of the import job is now printed to the log file (directly before the configuration dump):

----- UCS@school import tool starting ------
Import started by HTTP API (class 'HttpApiImportFrontend').

------ UCS@school import tool starting ------
Import started by commandline (class 'UserImportCommandLine').



CSV HEADER
~~~~~~~~~~

The first line of the input file (CSV) is now printed to the log file (directly before the configuration dump):

----- UCS@school import tool starting ------
Import started by HTTP API (class 'HttpApiImportFrontend').
First line of u'/var/lib/ucs-school-import/jobs/2020/17/1583864943-00test_users_small.csv':
'"Schulen","Benutzertyp","Vorname","Nachname","Klassen","Beschreibung","Telefon","EMail"\n'
------ UCS@school import tool configured ------



SYMLINKS TO LATEST LOGFILES
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Multiple symlinks are now created in the directory /var/log/univention/ucs-school-import:

* LAST-LOG-DEBUG: latest log file with DEBUG entries
* LAST-LOG-ERROR: latest log file with ERROR entries
* LAST-FAIL: import log file with DEBUG entries of the last import that failed
* FAIL_2020-03-10_19:28:48_cmdline: log file with DEBUG entries started from the cmdline
* FAIL_2020-03-10_20:08:51_job_0019: log file with DEBUG entries started from UMC
Comment 12 Daniel Tröder univentionstaff 2020-03-10 20:44:11 CET
Merged code into 4.4, built package and created advisory.

[4.4] c45c771e0 Bug #50039: Merge branch 'twenzel/50039_logging_of_importer' into 4.4
[4.4] c8a9fa8da Bug #50039: changelog, advisory
[4.4] 16b985faf Bug #50039: advisory update

ucs-school-import (17.0.26)
Comment 13 Daniel Tröder univentionstaff 2020-03-10 20:46:21 CET
OK: ucs-school-import-error.log
OK: Import started by ...
OK: First line of ...
OK: symlinks
Comment 14 Sönke Schwardt-Krummrich univentionstaff 2020-03-25 16:09:34 CET
UCS@school 4.4 v5 has been released.

https://docs.software-univention.de/changelog-ucsschool-4.4v5-de.html

If this error occurs again, please clone this bug.