Bug 56401 - High KDC load due to Primary running Samba with lmdb backend mixed with Backup that is still on tdb
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 5.0-5-errata
Assigned To: Felix Botner
QA Contact: Arvid Requate
Depends on:
Blocks:
Reported: 2023-08-03 17:44 CEST by Arvid Requate
Modified: 2023-09-20 17:56 CEST
CC List: 5 users

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 6: Setup Problem: Issue for the setup process
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 5: Blocking further progress on the daily work
User Pain: 0.171
Enterprise Customer affected?:
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2022070721000411
Bug group (optional):
Max CVSS v3 score:


Description Arvid Requate univentionstaff 2023-08-03 17:44:20 CEST
In a customer environment we saw this error in log.samba on a UCS Backup Node:
====
[2023/08/03 17:07:32.436702,  0, pid=73513] ../../lib/ldb-samba/ldb_wrap.c:79(ldb_wrap_debug)
  ldb: ltdb: tdb(/var/lib/samba/private/sam.ldb.d/DC%3DFOO%2CDC%3DDOMAIN%2CDC%3DCOM.ldb): tdb_expand overflow detected current map_size[4294967295] size[714700]!
  
[2023/08/03 17:07:32.478129,  0, pid=73513] ../../source4/dsdb/repl/replicated_objects.c:988(dsdb_replicated_objects_commit)
  dsdb_replicated_objects_commit:  Failed to prepare commit of transaction: ldb_wait from ../../source4/dsdb/samdb/ldb_modules/repl_meta_data.c:398 with LDB_WAIT_ALL: Operations error (1) (Operations error)
[2023/08/03 17:07:32.478484,  0, pid=73513] ../../source4/dsdb/repl/drepl_out_helpers.c:1184(dreplsrv_op_pull_source_apply_changes_trigger)
  Failed to commit objects: WERR_GEN_FAILURE/NT_STATUS_INVALID_NETWORK_RESPONSE
[2023/08/03 17:08:12.397599,  0, pid=73513] ../../lib/ldb-samba/ldb_wrap.c:79(ldb_wrap_debug)
====

In this case the Samba database on the UCS Primary Node had been converted from tdb to lmdb (probably using the script from Bug #53221). At some point replication failed because the tdb-backed sam.ldb database on the UCS Backup Node was full: the log above shows its map_size at 4294967295 bytes, the 4 GiB limit of a tdb file.
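
To spot this condition on other DCs, log.samba can be scanned for the overflow message. The following is only a rough sketch: the regular expression is derived from the single message quoted above, the function name is made up for illustration, and it assumes the message means that map_size plus the requested size no longer fits into a 32-bit offset.

```python
import re

# Largest offset a 32-bit tdb file can address (equals map_size in the log above).
TDB_MAX_OFFSET = 2**32 - 1

# Pattern derived from the one message quoted in this report (assumption:
# other Samba versions print the message in the same format).
OVERFLOW_RE = re.compile(
    r"tdb\((?P<path>[^)]+)\): tdb_expand overflow detected "
    r"current map_size\[(?P<map_size>\d+)\] size\[(?P<size>\d+)\]"
)


def find_tdb_overflows(log_text):
    """Return one dict per tdb_expand overflow message found in log_text."""
    hits = []
    for match in OVERFLOW_RE.finditer(log_text):
        map_size = int(match.group("map_size"))
        size = int(match.group("size"))
        hits.append({
            "path": match.group("path"),
            "map_size": map_size,
            "requested": size,
            # True when the requested growth cannot fit below the 32-bit limit.
            "at_limit": map_size + size > TDB_MAX_OFFSET,
        })
    return hits


if __name__ == "__main__":
    sample = (
        "ldb: ltdb: tdb(/var/lib/samba/private/sam.ldb.d/"
        "DC%3DFOO%2CDC%3DDOMAIN%2CDC%3DCOM.ldb): tdb_expand overflow detected "
        "current map_size[4294967295] size[714700]!"
    )
    for hit in find_tdb_overflows(sample):
        print(hit["path"], hit["map_size"], hit["at_limit"])
```

A hit with at_limit set means the affected partition file cannot grow any further on tdb, so replication into it will keep failing until the DC is moved to lmdb.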

As a result, the KDC on the Primary got hammered, apparently by Samba DRS replication from the Backup, which seems to have been running wild. Other systems also seem to have hammered the Primary KDC, to the point that it was impossible to get a ticket via kinit locally on the Primary itself.


Maybe we should create a flag in OpenLDAP, in cn=samba or similar, that tells joining DCs to provision Samba with lmdb as the key-value store instead of tdb.
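
How a join script might consume such a flag could look roughly like this. It is a sketch only: the flag and its values ("tdb", "lmdb") are hypothetical and not defined anywhere in this report, and the helper name is made up. The only existing interface it assumes is the --backend-store option of samba-tool domain join, where lmdb is selected with the value mdb.

```python
# Hypothetical values of the proposed directory flag, mapped to the values
# that samba-tool's --backend-store option expects. Not an existing UCS setting.
KNOWN_BACKENDS = {"tdb": "tdb", "lmdb": "mdb"}


def backend_store_args(flag_value):
    """Map the proposed flag value to extra samba-tool join arguments.

    Returns an empty list when the flag is unset, so the join keeps
    Samba's default backend.
    """
    if not flag_value:
        return []
    key = flag_value.strip().lower()
    if key not in KNOWN_BACKENDS:
        raise ValueError("unknown backend flag value: %r" % flag_value)
    return ["--backend-store=%s" % KNOWN_BACKENDS[key]]
```

Returning nothing for an unset flag keeps existing domains on their current default, so only domains where the Primary has been converted would change behaviour on join.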
Comment 2 Felix Botner univentionstaff 2023-09-19 09:51:38 CEST
Successful build
Package: univention-samba4
Version: 9.0.14-1
Branch: ucs_5.0-0
Scope: errata5.0-5
Comment 3 Arvid Requate univentionstaff 2023-09-20 12:51:39 CEST
Verified:
* Functional test
* Package update
* Advisory