Bug 54452 - Race condition in App Center cache invalidation+rebuild can cause App Center listener converter services to fail
Race condition in App Center cache invalidation+rebuild can cause App Center ...
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: App Center
UCS 5.0
Other Linux
: P5 normal (vote)
: UCS 5.0-1-errata
Assigned To: Philipp Hahn
Dirk Wiesenthal
https://git.knut.univention.de/univen...
:
Depends on: 51986
Blocks:
  Show dependency treegraph
 
Reported: 2022-02-14 13:14 CET by Philipp Hahn
Modified: 2022-04-28 10:59 CEST (History)
12 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 7: Crash: Bug causes crash or data loss
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.400
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2022020321000599
Bug group (optional): Regression
Max CVSS v3 score:


Attachments
proposed pytest (3.77 KB, text/x-python)
2022-02-16 17:15 CET, Nikola Radovanovic
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Philipp Hahn univentionstaff 2022-02-14 13:14:53 CET
This still applies to 5.0-1 and has become much worse.

+++ This bug was initially created as a clone of Bug #51986 +++

Due to a race condition during the rebuild of the App Center cache an app listener converter service may stop running, affected apps will then no longer process user or group changes.

When univention-app update is running, the app center cache is invalidated, new metafiles are downloaded and extracted. The first usage of appcenter python library functions finds no cache and rebuilds it, e.g.

from univention.appcenter.app_cache import Apps
Apps().find(<appid>)

The race condition can happen when the univention-app update process is still unpacking the downloaded metafiles while another process trigger the cache rebuild. The rebuild is done by taking the app ini files in certain directories and building the cache from them. As the cache is then present, there is no mechanism to detect that not all files have been unpacked and the cache has to be rebuild.

That can lead to the case that an app is missing in the cache. The App Center listener converter tries to get the App object from the cache upon starting, if that fails, no processing will happen. The check is done in 

management/univention-appcenter/scripts/univention-appcenter-listener-converter:182
...
    app = Apps().find(args.app)
    if not app:
        logger.critical('App not found')
        sys.exit(1)

Symptoms:
/var/log/univention/listener_modules/<appid>.log shows
2020-09-10 10:01:48 CRITICAL App not found

# service univention-appcenter-listener-converter@<appid> status will show
Sep 09 16:40:21 ucs systemd[1]: univention-appcenter-listener-converter@<appid>.service: Main process exited, code=exited, status=1/FAILURE
Sep 09 16:40:21 ucs systemd[1]: univention-appcenter-listener-converter@<appid>.service: Unit entered failed state.
Sep 09 16:40:21 ucs-ma systemd[1]: univention-appcenter-listener-converter@ucs-to-school-transformer.service: Failed with result 'exit-code'.

Another univention-app update will usually not invalidate the cache immediately, because the cache is only invalidated when the metadata files changed on the app center server.

The chance for the problem to occur rises the more apps are installed that use a listener converter, as each service gets triggered every 5 seconds. The affected customer system has 3 apps with converters installed.

Workaround: remove cache files to force a rebuild with the next access to the cache. (Check if the test app center is used and adapt the path below accordingly)
rm /var/cache/univention-appcenter/appcenter.software-univention.de/4.4/.apps.*json
Comment 1 Philipp Hahn univentionstaff 2022-02-14 17:32:30 CET
- [x] Code https://git.knut.univention.de/univention/ucs/-/merge_requests/275
- [ ] Test
- [ ] Review
- [ ] Import and build
Comment 2 Erik Damrose univentionstaff 2022-02-14 19:16:49 CET
Minor fixes were done in the feature branch. After review the code was merged and build for errata5.0-1:

2405d13e merge
a64dfe85 changelog
6dd7423d yaml

univention-appcenter 9.0.2-72A~5.0.0.202202141913
Comment 3 Dirk Wiesenthal univentionstaff 2022-02-15 04:22:17 CET
We will have to wait for test results. But my manual tests were successful. The all.tar.gz is now downloaded as .tmp.tar and the cache files are extracted before the temporary file is renamed to its "official" name .all.tar. This should greatly reduce any race condition.

This works as expected and I did not find an error in the code nor in many, many, cache updates.
Comment 4 Julia Bremer univentionstaff 2022-02-15 10:19:05 CET
The test 80_docker.59_app_center_signature
currently fails when doing an `app update`, because the file downloaded in 
"def _download_apps" is now called .tmp.tar while the corresponding gpg file is still saved as all.tar.gpg. 
The signature verification therefore fails.

http://jenkins.knut.univention.de:8080/job/UCS-5.0/job/UCS-5.0-1/job/AutotestJoin/lastCompletedBuild/SambaVersion=no-samba,Systemrolle=master-part-II/testReport/80_docker/59_app_center_signature/master090/
Comment 5 Philipp Hahn univentionstaff 2022-02-15 12:59:07 CET
[5.0-1] 645e74a0df style[AppC]: Re-indent
 .../univention-appcenter/python/appcenter/actions/update.py     | 68 ++++++++++++++---------------
 1 file changed, 34 insertions(+), 34 deletions(-)

[5.0-1] ea361de870 fix[AppC]: Verify .tmp.tar with .all.tar.gpg
 .../univention-appcenter/python/appcenter/actions/update.py     | 36 ++++++++++++++++-------------
 1 file changed, 20 insertions(+), 16 deletions(-)

[5.0-1] 82e93d845c feat[AppC]: Specify signature file name
 management/univention-appcenter/python/appcenter/utils.py | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

[5.0-1] 4aa8ea379d refactor[AppC]: inverted download logic
 management/univention-appcenter/python/appcenter/actions/update.py | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

[5.0-1] 4427054233 doc[AppC]: Fix PEP-484 type annotations
 management/univention-appcenter/python/appcenter/actions/update.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Package: univention-appcenter
Version: 9.0.2-72A~5.0.0.202202151205
Branch: ucs_5.0-0
Scope: errata5.0-1

[5.0-1] b4b0eb8a7a Bug #53523: univention-appcenter 9.0.2-72A~5.0.0.202202151205
 doc/errata/staging/univention-appcenter.yaml | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

OK: manual tests
OK: ucs-test -E dangerous -c -F raw -s docker -s appcenter
Comment 6 Philipp Hahn univentionstaff 2022-02-15 16:40:05 CET
[5.0-1] 79ad90dd6d fix[AppC]: Do not delete cache files
 management/univention-appcenter/python/appcenter/actions/update.py | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Package: univention-appcenter
Version: 9.0.2-72A~5.0.0.202202151637
Branch: ucs_5.0-0
Scope: errata5.0-1

[5.0-1] 5e7d2fb6f0 Bug #53523: univention-appcenter 9.0.2-72A~5.0.0.202202151637
 doc/errata/staging/univention-appcenter.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 7 Philipp Hahn univentionstaff 2022-02-15 17:09:01 CET
[5.0-1] 5ac57aa763 fix[AppC]: Write app.json atomically
 management/univention-appcenter/python/appcenter/app_cache.py | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Package: univention-appcenter
Version: 9.0.2-72A~5.0.0.202202151705
Branch: ucs_5.0-0
Scope: errata5.0-1

[5.0-1] 5e7d2fb6f0 Bug #53523: univention-appcenter 9.0.2-72A~5.0.0.202202151637
 doc/errata/staging/univention-appcenter.yaml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Comment 8 Dirk Wiesenthal univentionstaff 2022-02-16 10:50:53 CET
YAML: OK
Manual Tests: OK
App Center and Test App Center: OK
UMC: OK
Nightly Tests: OK
Comment 10 Nikola Radovanovic univentionstaff 2022-02-16 17:15:04 CET
Created attachment 10917 [details]
proposed pytest

Here is the proposed pytest module. If it is conceptually OK - I would add one more, similar to this one but in which half of the reader threads will start first, then writer and then remaining half of the readers