Bug 54919 - Fix metrics of SSL and SWAP check
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Monitoring (Prometheus or Nagios)
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 5.0-2-errata
Assigned To: Florian Best
QA Contact: Siavash Sefid Rodi
URL: https://git.knut.univention.de/univen...
Reported: 2022-06-30 17:54 CEST by Florian Best
Modified: 2022-07-06 17:03 CEST (History)
Description Florian Best univentionstaff 2022-06-30 17:54:26 CEST
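The Prometheus alert expressions for checking SSL certificate validity and swap usage are wrong: the comparisons are inverted, so the alerts match the healthy state instead of the failure state.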

-       --set query="univention_ssl_certificate_expiry_seconds >= 1728000" || die "UNIVENTION_SSL" # 20 days
+       --set query="univention_ssl_certificate_expiry_seconds < 1728000" || die "UNIVENTION_SSL" # 20 days

-       --set query="(node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) * 100 / node_memory_SwapTotal_bytes <= 20" || die "UNIVENTION_SWAP"
+       --set query="(node_memory_SwapTotal_bytes - node_memory_SwapFree_bytes) * 100 / node_memory_SwapTotal_bytes > 80" || die "UNIVENTION_SWAP"
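
To illustrate what the corrected expressions select, here is a minimal Python sketch of the two comparisons. It assumes that univention_ssl_certificate_expiry_seconds holds the remaining validity in seconds (the report does not state this explicitly); the function names are hypothetical and not part of univention-monitoring-client.

```python
# Threshold taken from the query: 20 days expressed in seconds.
TWENTY_DAYS_SECONDS = 20 * 24 * 60 * 60  # 1728000


def ssl_alert_fires(expiry_seconds: float) -> bool:
    """Mirror of the corrected query: alert when fewer than 20 days remain.

    Assumption: the metric is the remaining validity in seconds.
    """
    return expiry_seconds < TWENTY_DAYS_SECONDS


def swap_used_percent(swap_total_bytes: float, swap_free_bytes: float) -> float:
    """Same arithmetic as the query: used swap as a percentage of total swap."""
    return (swap_total_bytes - swap_free_bytes) * 100 / swap_total_bytes


def swap_alert_fires(swap_total_bytes: float, swap_free_bytes: float) -> bool:
    """Mirror of the corrected query: alert when more than 80% of swap is used."""
    if swap_total_bytes == 0:
        # Without swap the PromQL division yields NaN, which never satisfies
        # "> 80", so no alert is produced. Python would raise instead.
        return False
    return swap_used_percent(swap_total_bytes, swap_free_bytes) > 80
```

With the old expressions the results were reversed: a certificate valid for 30 more days or a host using 10% of its swap matched the alert, while an almost expired certificate or an almost full swap did not.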
Comment 1 Florian Best univentionstaff 2022-07-05 13:24:06 CEST
https://help.univention.com/t/univention-monitoring-client-throwing-exceptions-after-upgrade-to-5-0-2/20131
The following tracebacks were also reported there:

Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_s4_connector", line 75, in <module>
    S4Connector.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_s4_connector", line 71, in write_metrics
    self.debug('Found %d reject(s)! Please check output of univention-s4connector-list-rejected.' % (rejects,))
AttributeError: 'S4Connector' object has no attribute 'debug'
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_s4_connector exited with return code 1

Traceback (most recent call last):
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 86, in <module>
    CheckSambaDrsRepl.main()
  File "/usr/lib/python3/dist-packages/univention/monitoring/__init__.py", line 74, in main
    self.write_metrics()
  File "/usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures", line 65, in write_metrics
    (info_type, info) = drsuapi.DsReplicaGetInfo(self.drsuapi_handle, 1, req1)
TypeError: cannot unpack non-iterable drsuapi.DsReplicaGetInfo object
run-parts: /usr/share/univention-monitoring-client/scripts//check_univention_samba_drs_failures exited with return code 1
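
The two exception types above can be reproduced in isolation. The following is only a sketch with hypothetical stand-in classes (CheckWithoutDebug, NonIterableResult); it does not use the real univention.monitoring or samba drsuapi APIs and does not show the actual fix, which is not included in this report.

```python
class CheckWithoutDebug:
    """Hypothetical check class that, like S4Connector in the traceback,
    calls self.debug() although no such method is defined."""

    def write_metrics(self):
        rejects = 3
        self.debug('Found %d reject(s)!' % (rejects,))


class NonIterableResult:
    """Hypothetical object that does not support iteration, so it cannot
    be unpacked into (info_type, info)."""


def attribute_error_message() -> str:
    # Calling a method that does not exist raises AttributeError.
    try:
        CheckWithoutDebug().write_metrics()
    except AttributeError as exc:
        return str(exc)
    return ''


def unpack_error_message() -> str:
    # Tuple unpacking of a non-iterable object raises TypeError.
    try:
        (info_type, info) = NonIterableResult()
    except TypeError as exc:
        return str(exc)
    return ''
```

In both cases the exception is raised inside write_metrics(), so the script exits with return code 1 and run-parts reports the failure, as seen in the output above.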
Comment 2 Florian Best univentionstaff 2022-07-06 14:23:05 CEST
Some more adjustments have been made. Summary of everything:
* metrics for SSL certificate validity have been fixed
* metrics for swap usage have been fixed
* exception in check_univention_s4_connector when trying to log that there are rejects has been fixed
* TypeError in check_univention_samba_drs_failures has been fixed
* the join status check has been split so that unexecuted joinscripts raise a warning instead of a critical alert
* the "hostname" label has been renamed to "instance", as this is more common in the Prometheus world
* the nmblookup call in the nagios plugin and in this script has been fixed: "-R" was changed at some point to "--recursion"
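
The split of the join status check could be pictured roughly as follows. This is a sketch under assumptions: the report only says that unexecuted joinscripts now raise a warning, so treating a system that is not joined at all as the critical case is a guess, and Severity and join_status_severity are hypothetical names.

```python
from enum import Enum


class Severity(Enum):
    OK = "ok"
    WARNING = "warning"
    CRITICAL = "critical"


def join_status_severity(system_joined: bool, pending_joinscripts: int) -> Severity:
    """Classify the join status into separate warning and critical cases.

    Assumption: "not joined" stays critical; only pending joinscripts are
    downgraded to a warning, as described in the summary above.
    """
    if not system_joined:
        return Severity.CRITICAL
    if pending_joinscripts > 0:
        return Severity.WARNING
    return Severity.OK
```

Keeping the two conditions apart allows each one to be wired to its own alert, which fits the UNIVENTION_JOINSTATUS_WARNING alert named in the commit list.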
Comment 3 Florian Best univentionstaff 2022-07-06 14:24:07 CEST
univention-nagios.yaml
a10aa95ce88c | Bug #54919: fix nmblookup call

univention-nagios (13.0.3-2)
a10aa95ce88c | Bug #54919: fix nmblookup call

univention-monitoring-client.yaml
94b96470ddf5 | YAML Bug #54919
3b38c3741214 | Bug #54919: fixed expressions of ssl and swap checks

univention-monitoring-client (1.0.0-3)
079f1f540f86 | Bug #54919: assign new warning script on upgrade
e151865678c3 | Bug #54919: rename "hostname" label into "instance" as this is more common in prometheus
a10aa95ce88c | Bug #54919: fix nmblookup call
db257396d0b6 | fixup! Bug #54919: added creation/modification of UNIVENTION_SWAP_WARNING/UNIVENTION_JOINSTATUS_WARNING to postinst
aee9221b0a62 | Bug #54919: ucslint
4350149c1d51 | Bug #54919: fix check_univention_samba_drs_failures
5704845a13ca | Bug #54919: fix logging of rejects in check_univention_s4_connector
b7330b5dc386 | Bug #54919: corrected type in cups monitoring package
68132ad01096 | Bug #54919: added creation/modification of UNIVENTION_SWAP_WARNING/UNIVENTION_JOINSTATUS_WARNING to postinst
383ba566fb5e | Bug #54919: splittet join alert into warning and critical
3b38c3741214 | Bug #54919: fixed expressions of ssl and swap checks
Comment 4 Siavash Sefid Rodi univentionstaff 2022-07-06 14:24:44 CEST
OK: * metrics for SSL certificate validity have been fixed
OK: * metrics for swap usage have been fixed
OK: * exception in check_univention_s4_connector when trying to log that there are rejects has been fixed
OK: * TypeError in check_univention_samba_drs_failures has been fixed
OK: * the join status check has been split so that unexecuted joinscripts raise a warning instead of a critical alert
OK: * the "hostname" label has been renamed to "instance", as this is more common in the Prometheus world
OK: * the nmblookup call in the nagios plugin and in this script has been fixed: "-R" was changed at some point to "--recursion"