Bug 55447 - wrong ownership of data-directories causing prometheus to fill up disk
Status: NEW
Product: UCS
Classification: Unclassified
Component: UCS Dashboard
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: UCS maintainers
QA Contact: UCS maintainers
Depends on:
Blocks:
Reported: 2022-11-22 12:31 CET by Dirk Ahrnke
Modified: 2023-03-13 16:33 CET (History)
CC List: 1 user

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.091
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Description Dirk Ahrnke univentionstaff 2022-11-22 12:31:58 CET
Environment info:
UCS: 5.0-2 errata471
Installed: admin-dashboard=2.1 prometheus-node-exporter=2.0.1 ucsschool=5.0 v3 4.4/prometheus=2.35.0-5
role: domaincontroller_backup

Note: Prometheus was originally installed under UCS 4.4.

After an attempt to run the Prometheus joinscript, which had not yet succeeded during the UCS/app upgrade, the customer noticed that the /var partition was 100% full. A restart using "univention-app restart prometheus" immediately reduced the usage, but shortly afterwards it grew again at high speed.

"univention-app logs prometheus" showed:
ts=2022-11-22T09:49:50.923Z caller=db.go:829 level=error component=tsdb msg="compaction failed" err="reloadBlocks blocks: delete 271 blocks: delete obsolete block 01GJDAADBC3Z9MZEC2KER7QN13: unlinkat data/01GJDAADBC3Z9MZEC2KER7QN13.tmp-for-deletion/meta.json: permission denied"

The directories named in these errors carried the timestamp of the unsuccessful joinscript run and were owned by "root:root", so Prometheus could not delete them.
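
To see which paths are affected, the "permission denied" entries can be filtered out of the log. The following is only a sketch: extract_denied_paths is a hypothetical helper name, and it assumes the log lines have the same shape as the one quoted above.

```shell
#!/bin/sh
# Print every path that Prometheus failed to unlink with "permission denied".
# Reads log lines on stdin and prints each affected path once.
# (extract_denied_paths is a hypothetical helper name, not part of any tool.)
extract_denied_paths() {
    # Keep only the path between "unlinkat " and ": permission denied".
    sed -n 's/.*unlinkat \([^:]*\): permission denied.*/\1/p' | sort -u
}
```

It could be used as: univention-app logs prometheus 2>&1 | extract_denied_paths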

Deletion, at least, appears to work again after running "chown -R nobody:nogroup" on the affected directories.
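
Before changing ownership it may help to list what is actually owned by someone else. A minimal sketch, assuming the expected owner is nobody:nogroup as in the chown above; list_foreign_owned is a hypothetical helper and DATA_DIR is a placeholder for the Prometheus data directory, whose real location is not given in this report.

```shell
#!/bin/sh
# list_foreign_owned DIR USER GROUP
# Print every entry below DIR (including DIR itself) whose owner is not USER
# or whose group is not GROUP. Hypothetical helper for illustration only.
list_foreign_owned() {
    find "$1" \( ! -user "$2" -o ! -group "$3" \) -print
}

# Example (DATA_DIR is a placeholder, set it to the real data directory):
# list_foreign_owned "$DATA_DIR" nobody nogroup
```

An empty result would mean that everything below the directory already belongs to the expected user and group.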
Comment 1 Dirk Ahrnke univentionstaff 2023-03-13 16:33:45 CET
There is a chance that the root ownership is caused by processes started with "univention-app shell ..." without specifying a user.

I noticed several root-owned folders after an unsuccessful attempt at a database migration.

Running the command with an explicit user, e.g.

univention-app shell -u nobody prometheus promtool tsdb create-blocks-from rules ....

at least did not leave such folders behind.