Univention Bugzilla – Bug 37358
Broken DRS replication while the connection is broken and the master password is changed twice
Last modified: 2022-04-21 15:49:39 CEST
At least under the following constructed circumstances, the DRS replication between master and slave breaks permanently:
1. The DRS connection between master and slave is unreachable (S4 stopped, for example)
2. The server password is changed twice on the master
We hold two versions of the kvno. Let's say the master started with 1.
After two password changes it is raised to 3. The master now holds 3 and 2 in its keytab.
Once Samba/DRS is responsive again, the slave still asks for 1...
The kvno is held in msDS-KeyVersionNumber on CN=MASTER,OU=Domain Controllers,DC=domain,DC=test. It is only searchable if directly addressed.
The attribute itself is not editable because it is constructed:
ldbedit -H /var/lib/samba/private/sam.ldb.d/DC=DOMAIN,DC=TIM.ldb -b "cn=master,OU=Domain Controllers,DC=domain,DC=tim" msDS-KeyVersionNumber
failed to modify CN=MASTER,OU=Domain Controllers,DC=domain,DC=tim - objectclass_attrs: attribute 'msDS-KeyVersionNumber' on entry 'CN=MASTER,OU=Domain Controllers,DC=domain,DC=tim' is constructed!
ldbedit -H ldapi:///var/lib/samba/private/ldap_priv/ldapi sAMAccountName='Administrator' supplementalCredentials msds-keyversionnumber --controls=local_oid:184.108.40.206.4.1.7220.127.116.11:0
failed to modify CN=MASTER,OU=Domain Controllers,DC=domain,DC=tim - LDAP error 19 LDAP_CONSTRAINT_VIOLATION - <0000202F: objectclass_attrs: attribute 'msDS-KeyVersionNumber' on entry 'CN=MASTER,OU=Domain Controllers,DC=domain,DC=tim' is constructed!> <>
ldbsearch -H ldapi:///var/lib/samba/private/ldap_priv/ldapi sAMAccountName=master\$ replPropertyMetaData > master.replPropertyMetaData.ldif
ldbedit -H ldapi:///var/lib/samba/private/ldap_priv/ldapi samaccountname=master\$ replPropertyMetaData --controls=local_oid:18.104.22.168.4.1.722.214.171.124:0
This rebuilds the msDS-KeyVersionNumber,
but it only fixes the kvno mismatch (Failed to find MASTER$@DOMAIN.TIM(kvno xy) in keytab FILE:/etc/krb5.keytab).
The DRS replication remains broken (after an S4 restart).
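To see whether the keytab and the directory agree, the kvno values from both sides can be compared. The following is only a minimal sketch. It assumes that the keytab listing (for example from Heimdal's `ktutil list`) has the kvno in the first column and the principal in the third, and that the `ldbsearch` result contains a line of the form `msDS-KeyVersionNumber: N`. All helper names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the column layout of the keytab listing is an assumption.

# keytab_kvnos PRINCIPAL: read a keytab listing on stdin and print the
# kvnos found for PRINCIPAL, one per line, in ascending order.
keytab_kvnos() {
    awk -v p="$1" '$3 == p { print $1 }' | sort -un
}

# directory_kvno: read ldbsearch LDIF output on stdin and print the value
# of msDS-KeyVersionNumber.
directory_kvno() {
    awk -F': ' '$1 == "msDS-KeyVersionNumber" { print $2 }'
}

# kvno_in_keytab KVNO PRINCIPAL: succeed if KVNO is listed for PRINCIPAL
# in the keytab listing on stdin.
kvno_in_keytab() {
    keytab_kvnos "$2" | grep -qx "$1"
}

# Possible usage on a DC (not run here):
#   ktutil -k /etc/krb5.keytab list | keytab_kvnos 'MASTER$@DOMAIN.TIM'
#   ldbsearch -H ldapi:///var/lib/samba/private/ldap_priv/ldapi -s base \
#       -b 'CN=MASTER,OU=Domain Controllers,DC=domain,DC=tim' \
#       msDS-KeyVersionNumber | directory_kvno
```

If the kvno reported by the directory is not among the kvnos in the keytab, that matches the "Failed to find ... in keytab" error quoted above.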
(In reply to Tim Petersen from comment #1)
At the master:
> ldbsearch -H ldapi:///var/lib/samba/private/ldap_priv/ldapi
> sAMAccountName=master\$ replPropertyMetaData >
store and write to slave:
> ldbedit -H ldapi:///var/lib/samba/private/ldap_priv/ldapi
> samaccountname=master\$ replPropertyMetaData
> rebuilds the msDS-KeyVersionNumber
ucr set kerberos/kdc=ip_master
invoke-rc.d samba-ad-dc restart
The issue was that the slave uses 127.0.0.1 as kerberos/kdc, so it will always get the old keys, which are rejected by the master. This was introduced via Bug 29291.
Either we revert that change or, better, we configure a special krb5.conf to be used by the Samba processes. This could be done by turning /var/lib/samba/private/krb5.conf (which is used by samba_dnsupdate) into a UCR template in which 127.0.0.1 is not set as kdc, and by setting KRB5_CONFIG to this file in /etc/init.d/samba-ad-dc. Note that this proposal contradicts Bug 34908.
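A rough sketch of what such a separate krb5.conf could look like, generated from shell. This is not the actual UCR template. The function name and the choice of options are assumptions; the only point taken from the proposal is that no fixed kdc (such as 127.0.0.1) is set and that KRB5_CONFIG points to the file.

```shell
#!/bin/sh
# render_samba_krb5_conf REALM: print a minimal krb5.conf that names no
# fixed kdc and relies on DNS lookup instead. Hypothetical helper; the
# option set is an assumption, not the real template.
render_samba_krb5_conf() {
    cat <<EOF
[libdefaults]
    default_realm = $1
    dns_lookup_realm = false
    dns_lookup_kdc = true
EOF
}

# Possible usage (not run here), e.g. from the init script:
#   render_samba_krb5_conf DOMAIN.TIM > /var/lib/samba/private/krb5.conf
#   export KRB5_CONFIG=/var/lib/samba/private/krb5.conf
```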
When a DC (call it "A" here) replaces its keys (server password change), the "other" DCs still hold on to their tickets, which still refer to the previous Kerberos key version number (kvno). To cope with this situation, Samba4 keeps the keys with the previous kvno in /etc/krb5.keytab on DC "A". This way it still accepts the "previous" Kerberos service tickets presented by the "other" DCs, and replication continues to work. In particular, the updated Kerberos keys and unicodePwd (plus attribute version == kvno) are distributed to all "other" DCs.
What could possibly go wrong?
When DC "B" is offline for longer than two password changes of DC "A", service tickets derived from its own local Kerberos/Samba4 database are no longer accepted by DC "A", because DC "A" does not find the ticket's kvno in its local /etc/krb5.keytab. DC "B" can then no longer fetch any changes from DC "A". A special variation of this is documented in Bug 35560. The key issue here is that DC "B" asks its own local KDC for tickets, which pulls the keys from the local Samba4 backend (Bug 29291).
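The failure condition can be illustrated with a toy model. It is a sketch only and assumes that the keytab on DC "A" holds exactly the current kvno and the one before it, as described above. The function names are hypothetical.

```shell
#!/bin/sh
# Toy model only: assumes the keytab holds exactly the current kvno and
# the one before it.

# accepted_kvnos CURRENT: print the kvnos DC "A" would still accept.
accepted_kvnos() {
    if [ "$1" -gt 1 ]; then
        echo "$(( $1 - 1 )) $1"
    else
        echo "$1"
    fi
}

# ticket_accepted TICKET_KVNO CURRENT: succeed if a ticket issued for
# TICKET_KVNO is still accepted while DC "A" is at kvno CURRENT.
ticket_accepted() {
    for k in $(accepted_kvnos "$2"); do
        [ "$k" -eq "$1" ] && return 0
    done
    return 1
}
```

In this model `ticket_accepted 1 2` succeeds (one password change), while `ticket_accepted 1 3` fails (two password changes), which corresponds to the case in the original report.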
Reported again at 2015060321000363
As discussed, we should make samba use the DNS SRV records. That way we get the "self healing" effect from the round robin mechanism.
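To check which KDCs the SRV records currently offer, the records can be queried and listed by priority. A small sketch, assuming the usual `dig +short` answer format for SRV records (priority, weight, port, target). The helper name and the record name `_kerberos._tcp.domain.tim` are examples only.

```shell
#!/bin/sh
# srv_targets: read `dig +short` SRV answers on stdin (fields: priority
# weight port target) and print "target port" per record, ordered by
# priority, with the trailing dot of the target removed.
srv_targets() {
    sort -n | awk 'NF == 4 { sub(/\.$/, "", $4); print $4, $3 }'
}

# Possible usage (not run here):
#   dig +short -t SRV _kerberos._tcp.domain.tim | srv_targets
```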
Simply reverting the change of Bug 29291 is certainly one option, but possibly not the best:
(1) If the SRV records contain a Samba4 DC which doesn't exist, this might slow down Kerberos authentication (or result in temporary failures?) for local clients (users and processes) on other DCs.
(2) Likewise, if the SRV records contain a Samba4 DC which has a large clock skew, local clients (users and processes) would experience occasional authentication errors.
The ideal solution would be to configure the Samba "drepl" process to use the SRV records while all other parts use 127.0.0.1, but I don't see any standard way to achieve this in Samba. I think the second-best option is the proposal of Comment 5.
*** Bug 40260 has been marked as a duplicate of this bug. ***
*** Bug 35560 has been marked as a duplicate of this bug. ***
For Ticket #2020113021000561 I had this idea, which worked quite nicely:
On the Master the password had been rotated and it had KVNO 126,
but the Backup still had KVNO 122 (for the master account).
Running the following command on the DC Backup solved the issue:
samba-tool drs replicate --local "dummy" "$IP_of_the_Master" "$(ucr get samba4/ldap/base)"
It seemed to be important to use the IP and not the FQDN, to avoid Kerberos.
This gave me the idea that we could automate this by means of a listener module.
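As a starting point for such automation, the command could be wrapped in a small script that a listener module might call. This is only a sketch. The function names and the DRY_RUN switch are hypothetical, and the address check is deliberately loose. It only enforces the observation above that an IP address rather than an FQDN should be passed.

```shell
#!/bin/sh
# Hypothetical wrapper around the command above. Set DRY_RUN=1 to print
# the command instead of running it.

# is_ipv4_like ADDR: loose check that ADDR has the shape of a dotted IPv4
# address (octet ranges are not validated).
is_ipv4_like() {
    case $1 in
        *[!0-9.]* | "") return 1 ;;
        *.*.*.*.*) return 1 ;;
        [0-9]*.[0-9]*.[0-9]*.[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# replicate_from_master MASTER_IP LDAP_BASE: pull the naming context
# LDAP_BASE from MASTER_IP into the local Samba database.
replicate_from_master() {
    if ! is_ipv4_like "$1"; then
        echo "refusing '$1': expected an IP address, not an FQDN" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo samba-tool drs replicate --local dummy "$1" "$2"
    else
        samba-tool drs replicate --local dummy "$1" "$2"
    fi
}

# Possible usage on the DC Backup (not run here):
#   replicate_from_master "$IP_of_the_Master" "$(ucr get samba4/ldap/base)"
```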
(In reply to Arvid Requate from comment #10)
> For Ticket #2020113021000561 I had this idea, which worked quite nicely:
> On the Master the password haad been rotated and it had KVNO 126,
> but the Backup still had KVNO 122 (for the master account).
> Running the following command on the DC Backup solved the issue:
> samba-tool drs replicate --local "dummy" "$IP_of_the_Master" "$(ucr get
> It seemed to be important to use the IP and not the FQDN, to avoid Kerberos.
> This gave me the idea that we could automate this by means of a listener
Cool, so if the listener module detects a password change for the master (or any other DC?), we could simply call samba-tool drs replicate ... ?