Bug 51126 - Shared lazy-loaded read-only ConfigRegistry singleton
Shared lazy-loaded read-only ConfigRegistry singleton
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UCR
UCS 4.4
Other Linux
: P5 normal (vote)
: UCS 5.0
Assigned To: Philipp Hahn
Daniel Tröder
:
: 52438 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2020-04-21 09:02 CEST by Philipp Hahn
Modified: 2021-05-25 15:59 CEST (History)
2 users (show)

See Also:
What kind of report is it?: Development Internal
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): API change, Cleanup
Max CVSS v3 score:
hahn: Patch_Available+


Attachments
Proof of concept for lazy UCR (2.02 KB, text/x-python)
2020-04-21 11:21 CEST, Philipp Hahn
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Philipp Hahn univentionstaff 2020-04-21 09:02:40 CEST
Our code is full of this pattern
  from univention.config_registry import ConfigRegistry
  ucr = ConfigRegistry()
  ucr.load()

$ git grep -Fi -e 'ucr.load()' -e 'configregistry.load()' -e 'bc.load()' -e 'config_registry.load()' -e 'baseConfig.load()' -e 'cr.load()' -e 'ucrReg.load()' -e 'confreg.load()' -e 'reg.load()' -e 'cfgRegistry.load()' -e 'ucr().load()' | wc -l
563

A simple "udm modules" instantiates and loads 15 instances.
Many times that instance is used to look-up a single value before the instance is thrown away again.
Even worse that code is most often already executed during module import time, which complicates things like unit testing, Python API generation, ...

Several parts of UCS thus provide a singleton instance for sharing:
  univention.admin.configRegistry
  listener.configRegistry
  ...
or provide their own implementation to create a singleton:
  services/univention-apache/univention-add-vhost
  ...

UCR should provide a lazy-loaded thread-save singleton itself.

class ConfigRegistry(object)
    _instance = None
    _lock = threading.Lock()
    def __new__(cls):
        if cls._instance is None:
            with cls._lock:
                if cls._instance is None:
                    cls._instance = super(ConfigRegistry, cls).__new__(cls)
                    cls._instance.load()
        return cls._instance

TODO: lazy-loading, e.g. <https://wrapt.readthedocs.io/en/latest/index.html> or combined with something ugly as sys.modules["univention.config_registry.ucr"] = ...

Care should be taken that this instance is not mutated by accident: If a previous user assumed a private copy and used that to store other changes there, that would be visible by all other users using that shared instance now. If that matters <https://docs.python.org/3/library/types.html#types.MappingProxyType> can be used to provide a read-only version by default.

Another issue is users doing repetitive ucs.load()s to get the current value but other users expecting their instance to not change. Having UCR reload itself after any update (even from a different process) would be a win IMHO. That might require a new backend for UCR (Bug #15720).

(Live updates also would have the benefit of being able to change the debug level of running processes on the fly if done right.)
Comment 1 Daniel Tröder univentionstaff 2020-04-21 10:54:22 CEST
In UCS@school we already use lazy-object-proxy (https://github.com/ionelmc/python-lazy-object-proxy): https://git.knut.univention.de/univention/ucsschool/-/blob/4.4/ucs-school-lib/python/models/utils.py#L86

That code is in use in production.
It provides a shared, lazily loaded instance for all UCS@school clients.
We've had no undesired side effects so far.
Comment 2 Daniel Tröder univentionstaff 2020-04-21 11:11:09 CEST
A nice API would be something like

-------------------------------
from univention.config_registry import ro_instance as ucr
-------------------------------
from univention.config_registry import auto_load_ro_instance as ucr
-------------------------------
from univention.config_registry import private_instance
ucr = private_instance()
-------------------------------

* 'ro_instance' would be a shared, lazy UCR instance that does not allow __setitem__() etc (good idea btw: MappingProxyType).
* 'auto_load_ro_instance' would be the same as 'ro_instance', but would update itself when the database changes (how "expensive" is inotify for 4 files?).
* 'private_instance' would be a function that returns a fresh, non-shared instance without restrictions (labmda: ConfigRegistry().load() [if load() would return self]).


For 'auto_load_ro_instance' to be useful there would have to be a callback mechanism for clients to get notified of changes and run code (changelog level or  proxy etc).
Comment 3 Philipp Hahn univentionstaff 2020-04-21 11:21:13 CEST
Created attachment 10338 [details]
Proof of concept for lazy UCR

(In reply to Daniel Tröder from comment #1)
Good idea with python-lazy-proxy - I forgot about that.

Attached is a self-implemented version - I learned a lot about Python dirty internal details.

(In reply to Daniel Tröder from comment #2)
> A nice API would be something like

nice +1

> * 'auto_load_ro_instance' would be the same as 'ro_instance', but would
> update itself when the database changes (how "expensive" is inotify for 4
> files?).

the problem are the 4 file descriptors for it - one for the directory /etc/univention/ would be enough.
Comment 4 Daniel Tröder univentionstaff 2020-11-27 16:13:41 CET
*** Bug 52438 has been marked as a duplicate of this bug. ***
Comment 6 Daniel Tröder univentionstaff 2021-02-04 09:43:34 CET
> class _ConfigRegistry(dict):
>     def load(self, autoload=False)

Why is the default of 'autoload' False?
I cannot see a reason to reload the unchanged data into the UCR instance unless the client code has purposely written into it. But in that case the client knows that and can force a reload. To keep the old behavior for the sake of backwards compatibility to weird use cases means loosing a performance gain.

I suggest to change the default and make the argument express the intention:

    def load(self, force=False)
    """
    ...
    :param force: Load file even if unmodified since last read.
    """
Comment 7 Daniel Tröder univentionstaff 2021-03-02 18:31:49 CET
I have review and commented the merge request (link in commment5) and would approve it.
This new Python API helps removing a lot of boilerplate code from UCS and UCS@school while at the same time making our Python code much better manageable (testability, documentation and conformance) as it will remove undesired I/O access at import time from almost all of UCS code!

I am in strong support of merging this into UCS 5.
Comment 8 Philipp Hahn univentionstaff 2021-03-16 18:46:13 CET
[5.0-0] 54dc8a8f1f feat[ucr]: Singleton API

Package: univention-config-registry
Version: 15.0.7-7A~5.0.0.202103161838
Comment 9 Daniel Tröder univentionstaff 2021-03-17 09:06:06 CET
Previous review still applies:
  * code review
  * documentation review
  * unit tests
  * manual tests

OK: tests moved from ucs-test to in-package unit tests (893 passed, 17 skipped, 2 xfailed, 1 xpassed in 23.21 seconds)
OK: merge and build
OK: UCS 5.0-0 changelog entry
OK: 03_ucr section green in Jenkins tests (only "MultiEnv: AMI=5.0, errata, join, Autotest" seems to be working atm)
Comment 10 Florian Best univentionstaff 2021-05-25 15:59:57 CEST
UCS 5.0 has been released:
 https://docs.software-univention.de/release-notes-5.0-0-en.html
 https://docs.software-univention.de/release-notes-5.0-0-de.html

If this error occurs again, please use "Clone This Bug".