Bug 47319

Summary: UNIVENTION_RAID returns "UNKNOWN: dm:[No devices to check];......"
Product: UCS
Reporter: Sönke Schwardt-Krummrich <schwardt>
Component: Monitoring (Prometheus or Nagios)
Assignee: Felix Botner <botner>
Status: CLOSED FIXED
QA Contact: Erik Damrose <damrose>
Severity: normal
Priority: P5
CC: botner, markus.daehlmann, scheinig
Version: UCS 4.3
Target Milestone: UCS 4.3-2-errata
Hardware: Other
OS: Linux
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.137
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support: Yes
Flags outvoted (downgraded) after PO Review:
Ticket number: 2018070621000654
Bug group (optional):
Max CVSS v3 score:

Description Sönke Schwardt-Krummrich univentionstaff 2018-07-09 12:40:41 CEST
The handling of device mapper RAID is broken in check_raid as of UCS 4.3.
The check always returns the "UNKNOWN" state if no dm RAID is configured (even if an mdraid is configured):
UNKNOWN: dm:[No devices to check]; mdstat:[md3(2.73 TiB raid5):UUUU, md2(1023.44 MiB raid1):UUUU]

So the check is currently nearly useless.
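
To illustrate why this matters for monitoring: under the standard Nagios plugin convention the state prefix corresponds to the plugin's exit code (OK=0, WARNING=1, CRITICAL=2, UNKNOWN=3), so the UNKNOWN from the dm plugin hides the healthy mdstat result. A minimal sketch of that mapping follows; the function name state_to_exit_code is hypothetical and not part of check_raid.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_raid): map the state prefix of a
# plugin status line to the conventional Nagios exit code.
state_to_exit_code() {
    case "$1" in
        OK*)       echo 0 ;;
        WARNING*)  echo 1 ;;
        CRITICAL*) echo 2 ;;
        *)         echo 3 ;;  # UNKNOWN or anything unrecognized
    esac
}

# Status line from this report: mdstat is healthy, but dm reports no devices.
line='UNKNOWN: dm:[No devices to check]; mdstat:[md3(2.73 TiB raid5):UUUU, md2(1023.44 MiB raid1):UUUU]'
state_to_exit_code "$line"
```

For the line above this prints 3, although every array listed by mdstat shows all members up.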

See also:
https://github.com/glensc/nagios-plugin-check_raid/pull/172
https://github.com/glensc/nagios-plugin-check_raid/issues/142

Temporary workaround (one line!):
sed -i -re 's,/usr/lib/nagios/plugins/check_raid$,/usr/lib/nagios/plugins/check_raid -p mdstat,' /etc/nagios/nrpe.univention.d/UNIVENTION_RAID.cfg

Then restart the NRPE daemon: "service nagios-nrpe-server restart"
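
The sed expression can be tried on a scratch copy before touching the real file. The sketch below assumes the generated file contains a line of the form "command[UNIVENTION_RAID]=/usr/lib/nagios/plugins/check_raid" (an assumption, please compare with your actual file). It also shows that running the workaround twice is harmless, because the "$" anchor no longer matches once "-p mdstat" has been appended.

```shell
#!/bin/sh
# Scratch file standing in for UNIVENTION_RAID.cfg; the line format is an assumption.
tmp=$(mktemp)
echo 'command[UNIVENTION_RAID]=/usr/lib/nagios/plugins/check_raid' > "$tmp"

# Same expression as the workaround, applied twice (GNU sed, as in the workaround).
expr='s,/usr/lib/nagios/plugins/check_raid$,/usr/lib/nagios/plugins/check_raid -p mdstat,'
sed -i -re "$expr" "$tmp"
sed -i -re "$expr" "$tmp"   # no-op: the line no longer ends in "check_raid"

result=$(cat "$tmp")
rm -f "$tmp"
echo "$result"
```

The scratch file should end up with "-p mdstat" appended exactly once.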

We should consider shipping univention-nagios-raid with "-p mdstat" as the default argument for check_raid.
Comment 1 Christina Scheinig univentionstaff 2018-09-17 10:19:16 CEST
The customer would like the fix within the next 2 weeks. Is that realistic?
Comment 2 Felix Botner univentionstaff 2018-09-19 11:36:47 CEST
univention-nagios-raid: e369348b2bcb33b7fe00b3880a8c51115fff30b7
yaml: 4bf5f2ec9d5aa5c179636954095be27ccba7ce28

We now always use "check_raid -p mdstat" and no longer the dm plugin. If that turns out to be a problem, we will have to make it configurable in the future.

For updates, we simply fix the existing /etc/nagios/nrpe.univention.d/UNIVENTION_RAID.cfg (which is generated by a listener handler) and restart the NRPE server.

New installations should use the modified command in "etc/nagios-plugins/config/univention-raid.cfg" directly.

See nagios/univention-nagios-raid/test.txt for information about how to set up a RAID.
Comment 3 Erik Damrose univentionstaff 2018-09-24 16:45:40 CEST
OK: adapted check works for working, broken, and rebuilding RAID
OK: Config is fixed during update
OK: yaml
Verified
Comment 4 Erik Damrose univentionstaff 2018-09-26 13:24:41 CEST
<http://errata.software-univention.de/ucs/4.3/243.html>