Bug 42812

Summary: Consider nscd processes in docker containers
Product: UCS
Reporter: Stefan Gohmann <gohmann>
Component: Monitoring (Prometheus or Nagios)
Assignee: Jürn Brodersen <brodersen>
Status: CLOSED FIXED
QA Contact: Felix Botner <botner>
Severity: normal
Priority: P5
CC: damrose, grandjean, lutz.willek, robert.evert, stephan.hendl, steuwer, stoeckigt, troeder
Version: UCS 4.1
Target Milestone: UCS 4.2-1-errata
Hardware: Other
OS: Linux
See Also: https://forge.univention.org/bugzilla/show_bug.cgi?id=34787
https://forge.univention.org/bugzilla/show_bug.cgi?id=45414
https://forge.univention.org/bugzilla/show_bug.cgi?id=49967
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.137
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2017022721000385
Bug group (optional):
Max CVSS v3 score:
Bug Depends on:
Bug Blocks: 45186, 45187

Description Stefan Gohmann univentionstaff 2016-11-01 16:51:44 CET
After installing a docker appbox app, the Nagios check for nscd shows a warning:

[1447873023] SERVICE NOTIFICATION: root@localhost;master411.deadlock41.intranet;UNIVENTION_NSCD;WARNING;notify-service-by-email;PROCS WARNING: 5 processes with command name nscd

root@master411:~# ps aufwx 
[...]
root     11342  0.0  1.5 186176 15916 ?        Sl   15:04   0:06 /usr/bin/docker -d -p /var/run/docker.pid --storage-driver=overlay --bip=172.17.42.1/16
root     11474  0.0  0.1  10676  1408 ?        Ss   15:04   0:00  \_ init [2]  
[...]
root     16614  0.0  0.1  99312  1832 ?        Ssl  15:05   0:00  |   \_ /usr/sbin/nscd
root     16628  0.0  0.1  10460  1488 ?        Ss   15:05   0:00  |   \_ /usr/sbin/inetd
daemon   16698  0.0  0.0  16684   144 ?        Ss   15:05   0:00  |   \_ /usr/sbin/atd
root     16720  0.0  0.1  10484  1640 ?        Ss   15:05   0:00  |   \_ /usr/sbin/cron
[...]
Comment 1 Nico Stöckigt univentionstaff 2017-02-27 15:18:35 CET
This also occurs in a customer's environment after switching the Doodle app to its Docker version.
Comment 2 Ingo Steuwer univentionstaff 2017-04-10 08:07:50 CEST
Happened to me after installing the etherpad docker app.

Is this really "Minor Usability"? Doesn't it happen with the default Nagios settings for every Docker app?
Comment 3 robert.evert 2017-06-06 08:34:39 CEST
Is it safe to disable the NSCD daemon in the docker app?
Comment 4 Erik Damrose univentionstaff 2017-06-15 16:21:12 CEST
Another mention of this behavior in https://help.univention.com/t/failed-upgrade-from-owncloud-9-to-to-9-1/5953
Comment 5 Michael Grandjean univentionstaff 2017-06-15 16:55:42 CEST
Just to state the obvious:
As a workaround, one can manually adjust the WARNING and CRITICAL thresholds for this Nagios check. There's no need to disable nscd inside the Docker container just to make the warning go away.
I added an example at https://help.univention.com/t/failed-upgrade-from-owncloud-9-to-to-9-1/5953/11
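
To illustrate why the extra nscd processes from containers trip the check, here is a minimal sketch of check_procs-style range evaluation. The function name procs_state and the thresholds used in the usage note are assumptions for illustration only, not the values shipped with UCS; the linked forum post has the actual workaround.

```shell
# Hypothetical helper, not part of univention-nagios.
# Maps a process count to a Nagios state using inclusive ranges:
# a state is raised when the count falls OUTSIDE the accepted range.
# Arguments: count warn_min warn_max crit_min
procs_state() {
    count=$1
    warn_min=$2
    warn_max=$3
    crit_min=$4
    # Fewer processes than the critical minimum: the daemon is missing.
    if [ "$count" -lt "$crit_min" ]; then
        echo CRITICAL
        return 2
    fi
    # Outside the warning range: too few or too many processes.
    if [ "$count" -lt "$warn_min" ] || [ "$count" -gt "$warn_max" ]; then
        echo WARNING
        return 1
    fi
    echo OK
    return 0
}
```

With an assumed warning range of exactly one process, "procs_state 5 1 1 1" prints WARNING; widening the range, for example "procs_state 5 1 10 1", prints OK.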
Comment 6 Jürn Brodersen univentionstaff 2017-07-06 11:25:44 CEST
r80894: Check the nscd socket instead of the process.
Package: univention-nagios
Version: 10.0.1-3A~4.2.0.201707061034
Branch: ucs_4.2-0
Scope: errata4.2-1
r80895: YAML

I tried to detect whether an nscd process is running inside Docker by inspecting its cgroup, but that cgroup can be changed or not set at all, so this is not reliable.
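
A minimal sketch of the cgroup approach and of why it is fragile. The helper name in_docker_cgroup and the assumption that container cgroup paths contain the string "docker" are illustrative only; this is not the code that was tried for the package.

```shell
# Hypothetical helper, for illustration only.
# Succeeds if the cgroup file of a process mentions "docker".
# Arguments: pid [cgroup_file]  (defaults to /proc/<pid>/cgroup)
in_docker_cgroup() {
    pid=$1
    cgroup_file=${2:-/proc/$pid/cgroup}
    # A missing or unreadable file is treated as "not in Docker".
    [ -r "$cgroup_file" ] || return 1
    # Assumption: Docker-managed cgroup paths contain "docker".
    grep -q 'docker' "$cgroup_file"
}
```

A containerized process whose cgroup was changed, or that has none set, would be classified as a host process by such a test.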

The check now runs "lsof" on /var/run/nscd/socket, which should also work in cases where /var/run/nscd/ is mounted in the container.
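
The idea can be sketched roughly as follows. This is not the shipped check_univention_nscd plugin; the function name check_nscd_socket and the output messages are assumptions, and the sketch relies on lsof exiting with status 0 only when at least one process has the given path open.

```shell
# Hypothetical sketch of a socket-based nscd check.
# Returns Nagios-style states: 0 = OK, 2 = CRITICAL.
# Argument: socket path (defaults to /var/run/nscd/socket)
check_nscd_socket() {
    socket=${1:-/var/run/nscd/socket}
    # The path must exist and be a UNIX domain socket.
    if [ ! -S "$socket" ]; then
        echo "NSCD CRITICAL: $socket is not a socket"
        return 2
    fi
    # lsof succeeds only if some process has the socket open.
    if lsof "$socket" >/dev/null 2>&1; then
        echo "NSCD OK: $socket is in use"
        return 0
    fi
    echo "NSCD CRITICAL: no process has $socket open"
    return 2
}
```

Because the check looks at the host's socket rather than at process names, additional nscd processes inside containers no longer change the result. A plugin would call the function and exit with its return value.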

Note:
lsof is part of univention-base-packages. Is that enough, or should I add it to this package as well?

PS:
I wondered why a non-running nscd is considered critical. Bug 34787 is the reason.
Comment 7 Felix Botner univentionstaff 2017-08-07 15:36:04 CEST
UCS 4.2-1 master + released univention-nagios-server
update univention-nagios-server

-> /usr/lib/nagios/plugins/check_nrpe -H 10.200.7.50 -c UNIVENTION_NSCD
NRPE: Unable to read output

-> grep -r UNIVENTION_NSCD
nagios3/conf.univention.d/services/UNIVENTION_NSCD,master.four.two.cfg:    service_description     UNIVENTION_NSCD
nagios3/conf.univention.d/services/UNIVENTION_NSCD,master.four.two.cfg:    check_command           check_nrpe_1arg!UNIVENTION_NSCD
nagios/nrpe.univention.d/UNIVENTION_NSCD.cfg:command[UNIVENTION_NSCD]=PluginNameNotFoundError
Comment 8 Jürn Brodersen univentionstaff 2017-08-07 18:34:33 CEST
The udm nagios/service update for the nscd check was in the wrong join script.

I moved the code into 30univention-nagios-client.inst.

The LDAP modification triggers a listener script that writes the Nagios config. With the update code in 26univention-nagios-common.inst, the listener was called before the new nscd check was installed, which resulted in a broken Nagios config file.

r81865: Fix update path
Package: univention-nagios
Version: 10.0.1-4A~4.2.0.201708071805
Branch: ucs_4.2-0-errata4.2-1
Scope: errata4.2-1
Comment 9 Daniel Tröder univentionstaff 2017-08-08 05:19:56 CEST
RUNNING 30univention-nagios-client.inst
2017-08-08 05:15:44.940615494+02:00 (in joinscript_init)
File: /etc/nagios/nrpe.cfg

[..]

WARNING: cannot append cn=m90s4,cn=dc,cn=computers,dc=uni,dc=dtr to assignedHosts, value exists
No modification: cn=UNIVENTION_JOINSTATUS,cn=nagios,dc=uni,dc=dtr
LDAP Error: Invalid DN syntax: invalid DN: cn=UNIVENTION_NSCD,
EXITCODE=3


$NAGIOSCONTAINER definition from 26univention-nagios-common.inst is missing:

 81865   jbroders 	univention-directory-manager nagios/service modify "$@" --dn "cn=UNIVENTION_NSCD,$NAGIOSCONTAINER" --set checkCommand="check_univention_nscd" --set checkArgs='' || die
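
The failure mode can be sketched as follows: with $NAGIOSCONTAINER unset, the DN expands to "cn=UNIVENTION_NSCD," and LDAP rejects it. The helper name nscd_service_dn is hypothetical, and this is not the fix that was committed; it only shows one way a join script could guard against an empty container DN.

```shell
# Hypothetical helper, for illustration only.
# Builds the DN of the nscd service object and refuses to continue
# when the container DN is empty (for example, from an unset variable).
nscd_service_dn() {
    container=$1
    if [ -z "$container" ]; then
        echo "nagios container DN is empty" >&2
        return 1
    fi
    echo "cn=UNIVENTION_NSCD,$container"
}
```

In a join script this could be used as dn=$(nscd_service_dn "$NAGIOSCONTAINER") || die, so that the script stops with a clear message before calling univention-directory-manager.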
Comment 10 Jürn Brodersen univentionstaff 2017-08-08 10:53:21 CEST
(In reply to Daniel Tröder from comment #9)

Thanks!

r81879: Fix undefined variable
Package: univention-nagios
Version: 10.0.1-5A~4.2.0.201708080949
Branch: ucs_4.2-0-errata4.2-1
Scope: errata4.2-1
Comment 11 Felix Botner univentionstaff 2017-08-08 17:16:43 CEST
OK
Comment 12 Arvid Requate univentionstaff 2017-08-09 16:57:17 CEST
<http://errata.software-univention.de/ucs/4.2/129.html>