Bug 42366 - Capture information about joinscript failure in Feedback
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: System setup
Version: UCS 4.3
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.3-1-errata
Assigned To: Arvid Requate
QA Contact: Erik Damrose
Depends on:
Blocks:
Reported: 2016-09-13 11:50 CEST by Florian Best
Modified: 2019-03-12 22:31 CET (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 6: Setup Problem: Issue for the setup process
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.103
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2018041221000504, 2018041321000851, 2018042421001099, 2018030121000447
Bug group (optional): Error handling, External feedback
Max CVSS v3 score:


Description Florian Best univentionstaff 2016-09-13 11:50:08 CEST
Version: 4.1-3 errata239 (Vahr)

Domain setup (this may take some time): FAILED: 35univention-management-console-module-diagnostic.inst

We got no more information. We should add more information to the traceback reporting.
Comment 1 Alexander Kläser univentionstaff 2016-09-15 12:33:07 CEST
We should improve the quality of the errors which we report to the user. The error here is exactly what the user will see.
Comment 2 Arvid Requate univentionstaff 2017-05-10 16:27:01 CEST
Ok, since we have no clue what's going on here, I think we should take this as an initiative to improve the collection of joinscript output. I guess that would require adjustment to umc-system-setup and/or supporting modules/libraries?
Comment 3 Arvid Requate univentionstaff 2017-08-01 13:13:49 CEST
How do we handle this in the UMC join module?

In the diagnostic module it's hard to identify the passage of the join.log that would be interesting for a specific joinscript execution failure. So maybe we should start dumping diagnostic output from the die() function into separate files for each failing joinscript? Evaluating and presenting that would be trivial.

We could do this in two steps: if die() is called without arguments, it could save the join.log lines since "^RUNNING 90failedjoinscript". This first step would give us an immediate ROI for Support, and would also help the user when presented in UMC-diagnostic. In a second step, the die() method could accept an argument (and options) through which a joinscript can pass a different, more specific error output instead.
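A minimal sketch of the first step, assuming each joinscript run starts in join.log with a line of the form "RUNNING <script file name>". The function name extract_joinscript_output and its arguments are hypothetical; this is not the code that was eventually committed, only an illustration of cutting the relevant passage out of the log:

```shell
# Hypothetical helper: print the join.log passage that belongs to the
# most recent run of one joinscript.
# Assumption: every run begins with a line "RUNNING <script file name>".
#   $1: path to the log file
#   $2: joinscript file name, e.g. 90failedjoinscript.inst
extract_joinscript_output() {
    awk -v name="$2" '
        # A RUNNING line starts a new passage. Capture it only if it
        # names the requested script, and drop any older passage.
        /^RUNNING / {
            capture = ($2 == name)
            if (capture) buf = ""
        }
        capture { buf = buf $0 "\n" }
        END { printf "%s", buf }
    ' "$1"
}
```

Called as extract_joinscript_output /var/log/univention/join.log 90failedjoinscript.inst, this prints everything from the matching RUNNING line up to (but not including) the next RUNNING line, which die() could then write to a per-script file.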
Comment 4 Johannes Keiser univentionstaff 2018-05-03 17:49:54 CEST
Failing joinscripts are reported every now and then.

More information for the feedback-dialog would be really helpful.

Version: 4.3-0 errata0 (Neustadt) FAILED: 26univention-samba.inst
Version: 4.3-0 errata12 (Neustadt) FAILED: 96univention-samba4.inst
Version: 4.3-0 errata0 (Neustadt) FAILED: 91univention-saml.inst
Version: 4.3-0 errata21 (Neustadt) FAILED: 96univention-samba4.inst
Version: 4.2-3 errata310 (Lesum) FAILED: 35univention-management-console-module-updater.inst
Comment 5 Jannik Ahlers univentionstaff 2018-06-22 11:55:02 CEST
There are 3 different points where failing joinscripts get reported to the customer:

-System Setup
During system setup, all joinscripts get executed for the first time. If one fails the user gets presented with an error page with the possibility to report the error to us. This is where we get the traceback reports from that Florian and Johannes mentioned.

-UMC Join Module
When a join script fails in the join module, the user gets prompted with an error message that leads to the join log, which is conveniently available in the module itself.

-Diagnostic Module
The diagnostics module just runs univention-check-join-status and returns the output and links to an sdb article and the join module.

As the bug originally was opened for improving the first case, I will focus on that for now. I don't think the join module needs any improvement for this as it already provides access to the log. The diagnostic module could get enhanced, but that probably is another bug.
Comment 6 Jannik Ahlers univentionstaff 2018-06-29 16:30:29 CEST
Successful build
Package: univention-system-setup
Version: 11.0.5-4A~4.3.0.201806291616
Branch: ucs_4.3-0
Scope: errata4.3-1

Successful build
Package: univention-join
Version: 10.0.0-15A~4.3.0.201806291620
Branch: ucs_4.3-0
Scope: errata4.3-1

The relevant parts of the join log now get added to the error message that appears in the system setup module after the join has failed, and they are sent to us if the user clicks the 'send feedback' button.
I also made the logging a bit more consistent.

Please test this on different system roles, as the join process differs vastly between them.
Also, the Python file sometimes isn't loaded correctly if you copy it onto the machine before the join.

I tested it by writing 'false || die "teststring"' at the end (but before the line 'joinscript_save_current_version') of the joinscript '30univention-appcenter.inst'.
Comment 7 Quality Assurance univentionstaff 2018-07-04 16:04:59 CEST
--- mirror/ftp/4.3/unmaintained/4.3-1/source/univention-system-setup_11.0.5-1A~4.3.0.201806081223.dsc
+++ apt/ucs_4.3-0-errata4.3-1/source/univention-system-setup_11.0.5-4A~4.3.0.201806291616.dsc
@@ -1,6 +1,18 @@
-11.0.5-1A~4.3.0.201806081223 [Fri, 08 Jun 2018 12:23:17 +0200] Univention builddaemon <buildd@univention.de>:
+11.0.5-4A~4.3.0.201806291616 [Fri, 29 Jun 2018 16:16:29 +0200] Univention builddaemon <buildd@univention.de>:
 
   * UCS auto build. No patches were applied to the original source package
+
+11.0.5-4 [Fri, 29 Jun 2018 16:06:54 +0200] Jannik Ahlers <ahlers@univention.de>:
+
+  * Bug #42366: Improve error message and feedback when join fails
+
+11.0.5-3 [Fri, 29 Jun 2018 15:10:42 +0200] Richard Ulmer <ulmer@univention.de>:
+
+  * Bug #45931: Improve display of information about installed updates
+
+11.0.5-2 [Thu, 14 Jun 2018 13:17:43 +0200] Richard Ulmer <ulmer@univention.de>:
+
+  * Bug #45931: Give more detailed information about installed updates
 
 11.0.5-1 [Fri, 08 Jun 2018 12:01:09 +0200] Dirk Wiesenthal <wiesenthal@univention.de>:
 

<http://10.200.17.11/4.3-1/#248385186724779815>
Comment 8 Quality Assurance univentionstaff 2018-07-04 16:05:02 CEST
--- mirror/ftp/4.3/unmaintained/component/4.3-1-errata/source/univention-join_10.0.0-14A~4.3.0.201806212130.dsc
+++ apt/ucs_4.3-0-errata4.3-1/source/univention-join_10.0.0-15A~4.3.0.201806291620.dsc
@@ -1,6 +1,11 @@
-10.0.0-14A~4.3.0.201806212130 [Thu, 21 Jun 2018 21:30:25 +0200] Univention builddaemon <buildd@univention.de>:
+10.0.0-15A~4.3.0.201806291620 [Fri, 29 Jun 2018 16:20:48 +0200] Univention builddaemon <buildd@univention.de>:
 
   * UCS auto build. No patches were applied to the original source package
+
+10.0.0-15 [Fri, 29 Jun 2018 16:10:01 +0200] Jannik Ahlers <ahlers@univention.de>:
+
+  * Bug #42366: Repair logging of failed join scripts and adapt to changes in
+    system setup
 
 10.0.0-14 [Thu, 21 Jun 2018 21:28:49 +0200] Arvid Requate <requate@univention.de>:
 

<http://10.200.17.11/4.3-1/#248385186724779815>
Comment 9 Arvid Requate univentionstaff 2018-07-09 17:49:57 CEST
Ok, this looks better.

I created Bug 47327 to generally improve logging. For example, if univention-join detects an error before running the join scripts, it outputs a single line prefixed with "* Message". For Bug 42124 I had to squeeze everything into this single line, which is hardly readable in the System Setup output.
Comment 10 Felix Botner univentionstaff 2018-07-17 11:04:18 CEST
This breaks the installation tests!

The setup process hangs in the "setup-scripts" status. It seems that the execution of the join scripts in setup-join hangs forever on a read from stdin.

Next in setup-join is 

  echo "Running postjoin scripts"

but we can't see this in the setup.log, so the problem is in the

if [ $? -ne 1 ]; then
...
(
...
) |& (
..
)
...
fi

construct.

Please make sure the installation tests are fine after fixing this.
Comment 11 Erik Damrose univentionstaff 2018-07-17 11:49:11 CEST
Additional note to comment#10: we saw several tee processes lingering around while system setup was not progressing; the progress bar kept showing the last executing joinscript. Killing the tee processes let the setup process continue.
Comment 12 Felix Botner univentionstaff 2018-07-17 12:08:37 CEST
After killing 4 tee processes, the setup continued. I applied the following patch

-           $i > >(tee -a "$JOIN_LOG") 2> >(tee -a "$JOIN_LOG" >&2)
+           $i 2>&1 | tee -a "$JOIN_LOG"

and started the tests
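
One thing to keep in mind with the pipe form: the exit status of a pipeline is that of its last command, here tee, so the joinscript's own status has to be carried out separately if the caller needs it. A minimal POSIX sh sketch of one way to do that; the function name run_and_log and the temporary status file are assumptions for illustration, not part of the patch above:

```shell
# Hypothetical wrapper: run a command, append its combined stdout and
# stderr to a log through a plain pipe, and still return the command's
# own exit status instead of tee's.
#   $1: log file; remaining arguments: the command to run
run_and_log() {
    _log=$1
    shift
    _status_file=$(mktemp) || return 1
    {
        _s=0
        # "|| _s=$?" keeps this safe when the caller uses "set -e".
        "$@" 2>&1 || _s=$?
        echo "$_s" > "$_status_file"
    } | tee -a "$_log"
    _rc=$(cat "$_status_file")
    rm -f "$_status_file"
    return "$_rc"
}
```

For example, run_and_log "$JOIN_LOG" "$i" would log the script's output as in the patched line and return the script's exit code.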
Comment 13 Arvid Requate univentionstaff 2018-07-24 16:44:37 CEST
17ab82392d | Improve readability of code
75a47d9bb2 | Advisory
Comment 14 Erik Damrose univentionstaff 2018-08-06 13:59:47 CEST
Works very well. The joinscript output is parsed and shown to the user. Feedback mails will be more helpful now.

OK: univention-system-setup
OK: univention-join
OK: parsing of logfiles, correct detection of complete joinscript output.
OK: Feedbackmails with error
OK: yaml
Verified
Comment 15 Erik Damrose univentionstaff 2018-08-06 14:10:33 CEST
Whoops, clicked on verify too early.

Commit b9dae7bd introduced a change to the die() function in management/univention-join/joinscripthelper.lib by explicitly setting "set +e". This will overwrite any modifications to that flag done by scripts calling that function. We should not change the flag in the function, or reset it to its previous state when exiting.
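
For the second option (resetting the flag to its previous state), a function can read the caller's errexit setting from "$-" and restore it before returning. A minimal sketch in POSIX sh; run_without_errexit is a hypothetical helper used only to show the pattern, it is not the die() function from joinscripthelper.lib:

```shell
# Hypothetical helper, not the actual die(): run a command with errexit
# temporarily disabled, then restore whatever errexit state the caller
# had. Not re-entrant, because POSIX sh has no local variables.
run_without_errexit() {
    # "$-" lists the currently active shell options; "e" means errexit.
    case $- in
        *e*) _had_errexit=1 ;;
        *)   _had_errexit=0 ;;
    esac
    set +e
    "$@"
    _rc=$?
    # Only switch errexit back on if the caller had it enabled.
    if [ "$_had_errexit" -eq 1 ]; then
        set -e
    fi
    return "$_rc"
}
```

A script that runs with "set -e" keeps that setting after the call, and a script that runs without it is not silently switched to errexit either.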
Comment 16 Arvid Requate univentionstaff 2018-08-07 20:33:55 CEST
Ok, the "set +e" seems to have been a temporary setting during development of the patch. I've removed it and rebuilt the package:

7562556fb6 | Remove debug leftover
c54386e7d2 | Advisory
Comment 17 Erik Damrose univentionstaff 2018-08-08 09:27:45 CEST
OK: code review, set +e removed
OK: yaml
Verified