Bug 37314 - Improve error reporting: collect useful crash SIGSEGV core file information
Status: REOPENED
Product: UCS
Classification: Unclassified
Component: UMC - System diagnostic
Version: UCS 4.4
Hardware: Other Linux
Importance: P5 enhancement with 4 votes
Assigned To: UMC maintainers
Depends on:
Blocks:
Reported: 2014-12-12 10:43 CET by Philipp Hahn
Modified: 2022-02-18 08:42 CET
CC List: 5 users

See Also:
What kind of report is it?: Development Internal
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2016061421000591
Bug group (optional):
Max CVSS v3 score:


Attachments
crash-report v2 (879 bytes, text/plain), 2016-06-28 09:16 CEST, Philipp Hahn
crash-report v3 (1.10 KB, text/plain), 2016-07-01 12:18 CEST, Philipp Hahn
crash-report v4 (1.39 KB, application/x-shellscript), 2018-06-28 13:56 CEST, Philipp Hahn
crash-report v5 (1.40 KB, application/x-shellscript), 2018-07-02 15:54 CEST, Philipp Hahn

Description Philipp Hahn univentionstaff 2014-12-12 10:43:18 CET
We should provide a tool like "reportbug" or "apport" to help create useful error reports. When a process dies by SIGSEGV or by Python traceback it is important to collect information:
1. which package provides the binary
2. which version of that package is installed
3. which other packages that package depends on, and which versions of them are installed
4. include any generated core file or traceback.
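
A minimal sketch of items 1 to 3, assuming a dpkg-based system. The helper name describe_owner is hypothetical and does not come from any attachment on this bug. Note that `dpkg -S` only knows the path a package registered, so a binary reached through a symlinked directory may not be found.

```shell
#!/bin/sh
# Hypothetical helper: describe the Debian package that owns a binary.
describe_owner() {
    binary="$(readlink -f "$1")" || return 1
    # 1. which package provides the binary; dpkg -S prints "pkg[:arch]: /path"
    pkg="$(dpkg -S "$binary" 2>/dev/null | head -n 1 | sed 's/: .*//')"
    [ -n "$pkg" ] || { echo "no package owns $binary" >&2; return 1; }
    # 2. the installed version and 3. the declared dependencies of that package
    dpkg-query -W -f='Package: ${Package}\nVersion: ${Version}\nDepends: ${Depends}\n' "$pkg"
}
```

For example, `describe_owner /usr/sbin/named` would print the owning package, its version and its Depends line, or fail if no installed package claims the file.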

<https://wiki.ubuntu.com/Apport> automatically does that and files the information as a new private bug. We should do something like that with our bug tracking system, too.
Comment 1 Philipp Hahn univentionstaff 2015-05-21 08:11:32 CEST
We should extend the System Health module to monitor the system for crashed processes. I would suggest the following:

1. Install a core dump script which dumps the core and extra information into some /var/spool/univention-crash-report/ directory.
  - the core file
  - `lsof -n -p $pid`
  - `ps www -p $pid`
  - `uname -a`
  - `dpkg-query -W` recursively on the package containing the crashed binary and on all libraries it depends on
The script should be enabled/disabled by UCRV.
The script should implement some rate limiting so that it does not fill the hard disk, and so that a binary faulting too often can be detected.
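
One possible shape for the rate limiting, as a sketch only: count the dumps already spooled within a time window and refuse further ones above a limit. The function name allow_dump and its parameters are hypothetical. It assumes spool files are named `<user>_<group>-<signal>-<time>.core` with `<time>` in epoch seconds, as in the draft script in this comment. It limits the overall dump rate only and does not identify which binary is faulting.

```shell
#!/bin/sh
# allow_dump SPOOL_DIR MAX_DUMPS WINDOW_SECONDS [NOW]
# Succeeds if fewer than MAX_DUMPS cores were spooled in the last WINDOW_SECONDS.
allow_dump() {
    spool="$1" max="$2" window="$3" now="${4:-$(date +%s)}"
    recent=0
    for core in "$spool"/*.core
    do
        [ -e "$core" ] || continue      # the glob matched nothing
        stamp="${core##*-}"             # keep "<time>.core"
        stamp="${stamp%.core}"          # keep "<time>"
        case "$stamp" in ''|*[!0-9]*) continue ;; esac   # skip foreign file names
        [ $((now - stamp)) -lt "$window" ] && recent=$((recent + 1))
    done
    [ "$recent" -lt "$max" ]
}
```

A handler could call `allow_dump "$BASE" 10 3600 || exit 0` before writing anything, which would discard the core once ten dumps exist from the last hour.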

2. An extension in the UMC System Health module to check the directory for pending issues.
  - ask the user to send the data to Univention
  - option to include the core file, as it might contain sensitive data
  - option to delete all previous issues
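
The directory check could start from something as simple as the following sketch, which lists each pending report and the parts that are present, so that a front end can offer to leave out the core file. The function name list_pending is hypothetical. The file suffixes follow the draft script in this comment.

```shell
#!/bin/sh
# list_pending [SPOOL_DIR]: print "<report>: <parts>" for every spooled report.
list_pending() {
    spool="${1:-/var/spool/univention-crash-report}"
    for ps_file in "$spool"/*.ps
    do
        [ -e "$ps_file" ] || continue   # the glob matched nothing
        base="${ps_file%.ps}"
        parts="ps"
        [ -e "${base}.lsof" ] && parts="$parts lsof"
        [ -e "${base}.core" ] && parts="$parts core"
        echo "${base##*/}: $parts"
    done
}
```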

#!/bin/sh
#
# Dump core file and collect more information about crashed process
# <http://man7.org/linux/man-pages/man5/core.5.html>
#
BASE='/var/spool/univention-crash-report'

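# Without arguments: create the spool directory and register this script
# as the kernel's core dump handler, then exit.
if [ -z "${1:-}" ]
then
        install -o 0 -g 0 -m 0700 -d "$BASE"
        echo "|$(readlink -f "$0") %u %g %p %s %e" >/proc/sys/kernel/core_pattern
        exit 0
fi

# Called by the kernel with: %u=UID, %g=GID, %p=PID, %s=signal number, %e=executable name
user="$1" group="$2" pid="$3" signal="$4" exe="$5"

time="$(date +%s)"
base="${BASE}/${user}_${group}-${signal}-${time}"

# Collect process information first, then read the core image from standard input.
ps www -p "$pid" >"${base}.ps"
lsof -n -p "$pid" >"${base}.lsof"
cat >"${base}.core"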
Comment 2 Alexander Kläser univentionstaff 2015-06-02 10:39:59 CEST
UMC server crashes could be reported there as well (cf. Bug 33427).
Comment 3 Janis Meybohm univentionstaff 2015-06-09 13:14:04 CEST
(In reply to Philipp Hahn from comment #0)
> <https://wiki.ubuntu.com/Apport> automatically does that and files the
> information as a new private bug. We should do something like that with our
> bug tracking system, too.

Even without "auto-bug-filing" (which development might not want :-)), collecting this data for statistical evaluation would be immensely helpful!
We (support) could then make stronger statements about how often problems occur and whether they also occur in other customers' environments.
Comment 4 Janis Meybohm univentionstaff 2015-08-26 08:34:12 CEST
More details on the apport error tracking infrastructure can be found at:
https://wiki.ubuntu.com/ErrorTracker
https://wiki.ubuntu.com/ErrorTracker/ServerArchitecture
Comment 5 Philipp Hahn univentionstaff 2015-08-26 08:42:54 CEST
From DebConf15:
- <https://wiki.debian.org/AutomaticDebugPackages> (DDEB) builds separate packages with debug symbols for all packages containing compiled binaries. They are hosted separately from the main Debian mirror network because those packages are normally large and seldom needed.
- For Debian-Apport there is a GSoC project this year (2015):
<http://bugs.debian.org/796464>
<https://wiki.debian.org/SummerOfCode2015/Projects#SummerOfCode2015.2FProjects.2FApportForDebian.Apport_for_Debian>
<http://blog.yurushao.info/2015/07/Debian-Apport-GSoC/>
<http://www.researchut.com/blog/gsoc-apport-for-debian>
Comment 6 Philipp Hahn univentionstaff 2016-03-16 08:21:49 CET
For Linux-Kernel-Crash-Dump see Bug #25918 and attachment 7535
Comment 7 Philipp Hahn univentionstaff 2016-06-28 09:16:59 CEST
Created attachment 7769
crash-report v2
Comment 8 Philipp Hahn univentionstaff 2016-06-28 09:20:15 CEST
Ticket#2016061421000591: Problems with BIND
Comment 9 Philipp Hahn univentionstaff 2016-07-01 12:18:41 CEST
Created attachment 7782
crash-report v3

Collects package dependencies recursively
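
How the attachment walks the dependencies is not shown here. One way to do it, sketched under the assumption that `apt-cache` and `dpkg-query` are available, is to let `apt-cache depends --recurse` compute the closure and then ask dpkg for the installed versions. The names closure_names and installed_closure are hypothetical.

```shell
#!/bin/sh
# closure_names: reduce `apt-cache depends --recurse` output to package names.
# Package names start in column 0; dependency lines are indented and
# virtual packages are written as <name>, so both are dropped.
closure_names() {
    grep '^[a-z0-9]' | sort -u
}

# installed_closure PKG: print "name version" for PKG and all it depends on.
# Names that are not installed are skipped (dpkg-query errors are discarded).
installed_closure() {
    apt-cache depends --recurse --no-recommends --no-suggests \
        --no-conflicts --no-breaks --no-replaces --no-enhances "$1" |
        closure_names |
        xargs -r dpkg-query -W -f='${Package} ${Version}\n' 2>/dev/null
}
```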
Comment 10 Philipp Hahn univentionstaff 2016-08-16 08:53:58 CEST
systemd has a special service for handling core dumps: <https://www.freedesktop.org/software/systemd/man/systemd-coredump.html>
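
On a system where systemd-coredump is already the registered handler, the stored dumps can be queried with `coredumpctl` instead of a custom core_pattern script. The following is a sketch only. The function name export_core and the `.info` suffix are hypothetical, and the function returns 2 if `coredumpctl` is not installed.

```shell
#!/bin/sh
# export_core PID OUTFILE: write the journal metadata and the core image
# that systemd-coredump recorded for PID.
export_core() {
    pid="$1" out="$2"
    command -v coredumpctl >/dev/null 2>&1 || return 2
    coredumpctl --no-pager info "$pid" >"${out}.info" &&
        coredumpctl dump "$pid" --output="$out" >/dev/null
}
```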
Comment 11 Florian Best univentionstaff 2017-06-28 14:52:52 CEST
There is a Customer ID set so I set the flag "Enterprise Customer affected".
Comment 12 Arvid Requate univentionstaff 2017-08-01 12:50:18 CEST
This is pretty independent of the specific UCS version, so we should provide it via updates.univention.de/download like what has been done for Bug 40461:

 http://updates.software-univention.de/download/univention-system-check/
Comment 13 Philipp Hahn univentionstaff 2018-06-28 13:56:46 CEST
Created attachment 9578
crash-report v4

Drop --no-all-versions
Save /proc/$pid/{cmdline,environ,maps,map_files/*}
Allow usage outside kernel crash reporting
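
A sketch of what saving the /proc entries could look like. It is not taken from the attachment. The function name save_proc and the optional third argument (an alternative proc root, handy for testing) are hypothetical, and for map_files it only records the directory listing with the link targets, not the mapped files themselves.

```shell
#!/bin/sh
# save_proc PID DEST_PREFIX [PROC_ROOT]
save_proc() {
    pid="$1" prefix="$2" proc="${3:-/proc}"
    for name in cmdline environ maps
    do
        [ -r "${proc}/${pid}/${name}" ] || continue
        # cmdline and environ are NUL-separated; store one entry per line.
        # maps contains no NUL bytes and passes through unchanged.
        tr '\0' '\n' <"${proc}/${pid}/${name}" >"${prefix}.${name}"
    done
    # map_files/ holds one symlink per mapping; reading it may need root.
    ls -l "${proc}/${pid}/map_files/" >"${prefix}.map_files" 2>/dev/null || :
    # Report success only if the memory map was captured.
    [ -s "${prefix}.maps" ]
}
```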
Comment 14 Philipp Hahn univentionstaff 2018-07-02 15:54:07 CEST
Created attachment 9583
crash-report v5

Improve speed by batching dpkg -S
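
The idea behind batching, sketched under the assumption that the paths come from a saved /proc/PID/maps file: `dpkg -S` accepts many paths in one call, so the dpkg database is read once instead of once per file. The names owners and mapped_packages are hypothetical and the attachment may do this differently.

```shell
#!/bin/sh
# owners: reduce `dpkg -S` output ("pkg[:arch]: /path") to unique package names.
# Diversion notices and multi-package lines are ignored.
owners() {
    sed -n 's/^\([^ ,]*\): \/.*/\1/p' | sort -u
}

# mapped_packages MAPS_FILE: packages owning the files mapped by a process.
mapped_packages() {
    # Column 6 of a maps line is the backing file; pass all paths to one dpkg -S.
    awk '$6 ~ /^\// { print $6 }' "$1" | sort -u |
        xargs -r dpkg -S 2>/dev/null | owners
}
```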
Comment 15 Stefan Gohmann univentionstaff 2019-01-03 07:17:34 CET
This issue was filed against UCS 4.0. Maintenance with bug and security fixes for UCS 4.0 ended on 31 May 2016.

Customers still on UCS 4.0 are encouraged to update to UCS 4.3. Please contact
your partner or Univention for any questions.

If this issue still occurs in newer UCS versions, please use "Clone this bug" or simply reopen the issue. In this case please provide detailed information on how this issue is affecting you.