Bug 42614 - NMI watchdog: BUG: soft lockup - CPU#X stuck for 22s! [smbd:XXX]
Status: CLOSED DUPLICATE of bug 41054
Product: UCS
Classification: Unclassified
Component: Kernel
Version: UCS 4.1
Hardware: Other Linux
Importance: P5 critical
Target Milestone: UCS 4.1-x-errata
Assigned To: Kernel maintainers
Depends on: 40558 42927
Blocks:
Reported: 2016-10-10 12:04 CEST by Nico Stöckigt
Modified: 2017-09-15 14:06 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 7: Crash: Bug causes crash or data loss
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.360
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2016091621002544
Bug group (optional):
Max CVSS v3 score:


Attachments
remotelog_kernelcrash (9.30 MB, text/x-log)
2016-10-26 12:09 CEST, Jens Thorp-Hansen
Description Nico Stöckigt univentionstaff 2016-10-10 12:04:45 CEST
+++ This bug was initially created as a clone of Bug #40558 +++

/var/log/messages
Sep 15 06:26:45 ucs-sla-01 rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2416" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Sep 15 11:41:57 ucs-sla-01 kernel: imklog 5.8.11, log source = /proc/kmsg started.
Sep 15 11:41:57 ucs-sla-01 rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2527" x-info="http://www.rsyslog.com"] start
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] PAT configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC  
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] Initializing cgroup subsys cpuset
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] Initializing cgroup subsys cpu
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] Initializing cgroup subsys cpuacct

/var/log/daemon.log
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4>0]:45650->[ipv4>1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4>0]:45650->[ipv4>1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4>0]:45650->[ipv4>1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4>0]:55703->[ipv4>1]
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
@Sep 15 11:41:57 ucs-sla-01 acpid: cannot open input layer
Sep 15 11:41:57 ucs-sla-01 acpid: starting up with netlink and the input layer
Sep 15 11:41:57 ucs-sla-01 acpid: 44 rules loaded
Sep 15 11:41:57 ucs-sla-01 acpid: waiting for events: event logging is off
Sep 15 11:41:58 ucs-sla-01 avahi-daemon[2616]: Found user 'avahi' (UID 112) and group 'avahi' (GID 120).
Sep 15 11:41:58 ucs-sla-01 avahi-daemon[2616]: Successfully dropped root privileges.


/var/log/syslog
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:56271->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:56271->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:56271->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:45650->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:45650->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:45650->[ipv4:1]
Sep 15 11:29:02 ucs-sla-01 snmpd[3267]: Connection from UDP: [ipv4:0]:55703->[ipv4:1]
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@
Sep 15 11:41:57 ucs-sla-01 kernel: imklog 5.8.11, log source = /proc/kmsg started.
Sep 15 11:41:57 ucs-sla-01 rsyslogd: [origin software="rsyslogd" swVersion="5.8.11" x-pid="2527" x-info="http://www.rsyslog.com"] start
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] PAT configuration [0-7]: WB  WT  UC- UC  WC  WP  UC  UC  
Sep 15 11:41:57 ucs-sla-01 kernel: [    0.000000] Initializing cgroup subsys cpuset

system info:
 version/erratalevel: 211
 version/patchlevel: 2
 version/releasename: Vahr
 version/version: 4.1

 Intel(R) Xeon(R) CPU E5-2630 v2 @ 2.60GHz [16 emulated cores]
 HP DL360
 XEN Enterprise
Comment 1 Stefan Gohmann univentionstaff 2016-10-10 12:08:36 CEST
Which kernel was used while it happened?
Comment 2 Nico Stöckigt univentionstaff 2016-10-10 12:32:59 CEST
(In reply to Stefan Gohmann from comment #1)
> Which kernel was used while it happened?


initial info was:
 Linux ucs-sla-01 4.1.0-ucs190-amd64 #1 SMP Debian 4.1.6-1.190.201604142226 (2016-04-14) x86_64 GNU/Linux

Last occurrence of the incident was 15.09.2016.

from the logs:
 Sep 15 11:41:57 kernel: Linux version 4.1.0-ucs190-amd64 (debian-kernel@lists.debian.org) (gcc version 4.7.2 (Debian 4.7.2-5.9.201403121731) ) #1 SMP Debian 4.1.6-1.190.201604142226 (2016-04-14)

Jul 11 22:31:53 kernel: Linux version 4.1.0-ucs190-amd64 (debian-kernel@lists.debian.org) (gcc version 4.7.2 (Debian 4.7.2-5.9.201403121731) ) #1 SMP Debian 4.1.6-1.190.201604142226 (2016-04-14)

 Jul 11 21:30:51 kernel: Linux version 3.16.0-ucs195-amd64 (debian-kernel@lists.debian.org) (gcc version 4.7.2 (Debian 4.7.2-5.9.201403121731) ) #1 SMP Debian 3.16.7-ckt25-2~bpo70+1.195.201605301151 (2016-05-3

 Jul 11 17:16:55 kernel: Linux version 3.10.0-ucs175-amd64 (debian-kernel@lists.debian.org) (gcc version 4.4.5 (Debian 4.4.5-8.3.201104271833) ) #1 SMP Debian 3.10.11-1.175.201602151247 (2016-02-15)
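
The boot banners above could be pulled out of a syslog file with a few lines of Python. This is only a minimal sketch: the function name `kernel_versions` and the shortened sample lines are made up for illustration, and it assumes the classic syslog layout in which the first 15 characters of a line are the timestamp.

```python
import re

# Matches the "Linux version <release>" banner the kernel logs once per boot.
# The lazy ".*?" tolerates an optional "[    0.000000]" uptime prefix.
BANNER = re.compile(r"kernel:.*?Linux version (\S+)")

def kernel_versions(lines):
    """Return (timestamp, kernel release) for every boot banner found."""
    found = []
    for line in lines:
        match = BANNER.search(line)
        if match:
            # Assumption: classic syslog format, timestamp = first 15 characters.
            found.append((line.strip()[:15], match.group(1)))
    return found

# Shortened sample lines for illustration only (not a complete log).
sample = [
    " Sep 15 11:41:57 kernel: Linux version 4.1.0-ucs190-amd64 (debian-kernel@lists.debian.org)",
    " Jul 11 21:30:51 kernel: Linux version 3.16.0-ucs195-amd64 (debian-kernel@lists.debian.org)",
    "Sep 15 11:41:57 ucs-sla-01 acpid: 44 rules loaded",
]

for stamp, release in kernel_versions(sample):
    print(stamp, release)
```

Listing the banners in file order shows which kernel was booted before each incident, which is the question asked in comment 1.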
Comment 3 Jens Thorp-Hansen univentionstaff 2016-10-26 12:09:18 CEST
Created attachment 8156
remotelog_kernelcrash
Comment 4 Jens Thorp-Hansen univentionstaff 2016-10-26 12:09:59 CEST
Kernel version:
Oct 26 11:07:48 ucs-sla-01 [1433468.128015] CPU: 2 PID: 30298 Comm: smbd Tainted: G             L  4.1.0-ucs174-amd64 #1 Debian 4.1.6-1.174.201602110938
Comment 5 Stefan Gohmann univentionstaff 2016-11-28 08:13:00 CET
Duplicate of Bug #41054?
Comment 6 Philipp Hahn univentionstaff 2016-11-28 09:09:35 CET
(In reply to Stefan Gohmann from comment #5)
> Duplicate of Bug #41054?

From searching attachment 8156 for /mb_cache_entry/, I would say: yes.
As the information about the original crash is lost forever, I am closing this bug as a DUPLICATE.
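
For anyone who wants to repeat that search on a large log, a rough Python equivalent might look like the sketch below. The helper name `find_matches` and the demo file contents are hypothetical. The file is opened with `errors="replace"` on the assumption that crash logs can contain undecodable bytes or NUL runs like the ones in the excerpts above.

```python
import os
import re
import tempfile

def find_matches(path, pattern=r"mb_cache_entry"):
    """Return (line number, line) for every line in the file matching pattern."""
    regex = re.compile(pattern)
    hits = []
    # errors="replace" keeps the scan going across undecodable bytes.
    with open(path, "r", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if regex.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits

# Demo on a throwaway file; the contents are made up, not taken from the attachment.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as demo:
    demo.write("unrelated line\n")
    demo.write("placeholder text mentioning mb_cache_entry here\n")
    demo.write("\x00\x00\x00\n")
    demo_path = demo.name

for number, line in find_matches(demo_path):
    print(number, line)

os.unlink(demo_path)
```

Any hit only suggests that the same code path is involved; whether the bug really is a duplicate still needs a look at the surrounding call trace.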

*** This bug has been marked as a duplicate of bug 41054 ***
Comment 7 Stefan Gohmann univentionstaff 2017-09-15 14:06:28 CEST
Set status of old resolved issues to closed.