Univention Bugzilla – Bug 50256
listener mdb database handling incorrect
Last modified: 2022-10-20 10:04:54 CEST
from Bug #50114, comment 5

Reading <https://www.openldap.org/lists/openldap-technical/201306/msg00098.html> and <https://www.openldap.org/lists/openldap-technical/201306/msg00116.html>, we should call `mdb_stat -ef /var/lib/univention-directory-listener/cache/` to also get the freelist information.

Listening further to <https://www.infoq.com/presentations/lmdb-lighting-memory-mapped-database/>: LMDB does CoW with MVCC and keeps at least the last 2 transactions open, but maybe more if readers are still active. So extra space is needed, especially when doing large transactions. This might lead to MDB_MAP_FULL even if it looks like there is enough free space left.

Also, according to <https://lmdb.readthedocs.io/en/release/#transaction-management>, the DB may grow without limits while a reader is still active. Data is never modified in place, as CoW is done. A long-running reader may keep older versions alive, so there might even exist more than the last 2 trees.

We should check if UDL (or some other process!) starts a long-running reader which keeps LMDB from reclaiming its free pages.

During a short look at udl/src/cache.c I already found several cases where LMDB transactions are not close()ed correctly in error cases:
mdb_txn_begin() is not followed by mdb_txn_abort() or mdb_txn_commit().
We should consider converting that (and all the other UDL memory allocations) to __attribute__((cleanup(...))).

...

> root@master:~ # ls -alh /var/lib/univention-directory-listener/cache/
> -rw------- 1 listener nogroup 1,6G Sep 16 10:25 data.mdb
> -rw------- 1 listener nogroup 8,0K Sep 16 10:25 lock.mdb
>
> Output of du:
> 1,6G /var/lib/univention-directory-listener/cache/

Hint: you can use `ls -s` (`ls -AgGhs /var/lib/univention-directory-listener/cache/`) to get both the file size and block usage from `ls`.

Nit: See attachment 10079 for a request to enhance the documentation.
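
To illustrate the __attribute__((cleanup(...))) idea, here is a minimal sketch of the pattern. It is not taken from udl/src/cache.c: `fake_txn`, `fake_txn_begin()`, `fake_txn_abort()`, `fake_txn_commit()`, `txn_cleanup()`, `AUTO_TXN` and `lookup()` are hypothetical stand-ins so the example compiles without liblmdb. In real code the stand-ins would presumably map to MDB_txn, mdb_txn_begin(), mdb_txn_abort() and mdb_txn_commit(). The attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for MDB_txn, so the sketch needs no liblmdb. */
typedef struct fake_txn { int unused; } fake_txn;

static int open_txns = 0; /* number of transactions currently open */

static int fake_txn_begin(fake_txn **out) {
    fake_txn *t = malloc(sizeof *t);
    if (t == NULL)
        return -1;
    open_txns++;
    *out = t;
    return 0;
}

static void fake_txn_abort(fake_txn *t) {
    if (t != NULL) {
        free(t);
        open_txns--;
    }
}

static int fake_txn_commit(fake_txn *t) {
    assert(t != NULL);
    free(t);
    open_txns--;
    return 0;
}

/* Cleanup handler: the compiler passes a pointer to the annotated
 * variable when that variable goes out of scope. */
static void txn_cleanup(fake_txn **tp) {
    if (*tp != NULL) {
        fake_txn_abort(*tp);
        *tp = NULL;
    }
}
#define AUTO_TXN __attribute__((cleanup(txn_cleanup)))

/* Hypothetical cache operation; `fail` simulates an error after begin. */
static int lookup(int fail) {
    AUTO_TXN fake_txn *txn = NULL;

    if (fake_txn_begin(&txn) != 0)
        return -1;
    if (fail)
        return -1; /* early return: txn_cleanup() aborts the transaction */

    int rv = fake_txn_commit(txn);
    txn = NULL; /* already committed, so the handler must not abort it */
    return rv;
}

/* Helper for callers that want to check for leaked transactions. */
int open_txn_count(void) { return open_txns; }
```

With this pattern every error path after a successful begin ends in an abort without an explicit call, and setting the pointer to NULL after a commit keeps the handler from touching a finished transaction.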
Created attachment 10870
bug50256.patch

> During a short look at udl/src/cache.c I already found several cases where LMDB transactions are not close()ed correctly in error cases:
> mdb_txn_begin() is not followed by mdb_txn_abort() or mdb_txn_commit().

The attached patch fixes two locations.
Please note that the transaction opened in cache_first_entry gets closed later by calling cache_free_cursor from change_init_module.
As this is going to be critical in big environments, I raised the importance of this.