Univention Bugzilla – Bug 38234
Takeover does not work when msDS-ReplicationEpoch is set
Last modified: 2021-06-15 19:20:16 CEST
Ticket: 2015041021000201

Takeover of a 2003 AD Domain failed with the following traceback:

2015-04-10 16:04:03,404 Pre-loading the Samba 4 and AD schema
2015-04-10 16:04:03,541 A Kerberos configuration suitable for Samba 4 has been generated at /var/lib/samba/private/krb5.conf
2015-04-10 16:04:03,792 ERROR(runtime): uncaught exception - (8593, 'WERR_DS_DIFFERENT_REPL_EPOCHS')
2015-04-10 16:04:03,792   File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
2015-04-10 16:04:03,792     return self.run(*args, **kwargs)
2015-04-10 16:04:03,792   File "/usr/lib/python2.7/dist-packages/samba/netcmd/domain.py", line 620, in run
2015-04-10 16:04:03,793     keep_existing=keep_existing)
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 1190, in join_DC
2015-04-10 16:04:03,793     ctx.do_join()
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 1095, in do_join
2015-04-10 16:04:03,793     ctx.join_replicate()
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 818, in join_replicate
2015-04-10 16:04:03,793     replica_flags=ctx.replica_flags)
2015-04-10 16:04:03,794   File "/usr/lib/python2.7/dist-packages/samba/drs_utils.py", line 252, in replicate
2015-04-10 16:04:03,810     (level, ctr) = self.drs.DsGetNCChanges(self.drs_handle, req_level, req)
2015-04-10 16:04:03,814 checking sAMAccountName

The NTDS-Settings object of one of the AD DCs had msDS-ReplicationEpoch set to 1. I assume this was set due to a domain name change in the past. The affected DC was demoted to a member server (dcpromo), but that only led to msDS-ReplicationEpoch=1 being set on another DC's NTDS-Settings object. We then decided to remove the attribute from AD (although this is rated "catastrophic") and the takeover traceback was gone. I don't have details about the overall state of the domain now, so I won't recommend this as a workaround for now.
Please see the following links for details:
https://technet.microsoft.com/de-de/library/aa996670%28v=exchg.80%29.aspx?f=255&MSPPError=-2147217396
https://bugzilla.samba.org/show_bug.cgi?id=9500
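To see which DCs carry the attribute before a takeover is attempted, the NTDS-Settings objects can be dumped as LDIF and scanned. The following is only a minimal sketch: it assumes unwrapped LDIF text with plain (not base64-encoded) DNs, it assumes that a missing msDS-ReplicationEpoch is equivalent to 0, and the function name and the sample DNs are hypothetical.

```python
def replication_epochs(ldif_text):
    """Map each entry DN in the LDIF text to its msDS-ReplicationEpoch.

    Assumption: an entry without the attribute is reported as epoch 0.
    Folded (wrapped) lines and base64-encoded DNs are not handled.
    """
    epochs = {}
    current_dn = None
    for raw in ldif_text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line ends the current LDIF entry.
            current_dn = None
            continue
        lowered = line.lower()
        if lowered.startswith("dn:"):
            current_dn = line[3:].strip()
            epochs[current_dn] = 0
        elif current_dn and lowered.startswith("msds-replicationepoch:"):
            epochs[current_dn] = int(line.split(":", 1)[1].strip())
    return epochs


# Hypothetical sample data, not taken from the customer environment.
SAMPLE_LDIF = """\
dn: CN=NTDS Settings,CN=DC1,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=example,DC=com
objectClass: nTDSDSA

dn: CN=NTDS Settings,CN=DC2,CN=Servers,CN=Default-First-Site-Name,CN=Sites,CN=Configuration,DC=example,DC=com
objectClass: nTDSDSA
msDS-ReplicationEpoch: 1
"""

for dn, epoch in replication_epochs(SAMPLE_LDIF).items():
    print(epoch, dn)
```

Any DC reported with a non-zero epoch would be a candidate for the failure shown in the traceback.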
According to the Microsoft document "How Domain Rename Works", a non-zero value indicates that the domain has been renamed at some point. In that case the incremented replication epoch serves to avoid replication with not-yet-renamed DCs. Quoting:

=========================================================================
[...] If two DCs have different msDS-ReplicationEpoch values, no directory replication RPC interaction is allowed between them. In addition to replication, nested group membership evaluation and global catalog lookups are also discontinued. [...] The goal of the msDS-ReplicationEpoch attribute is to minimize potentially complex interactions, including replication, between DCs that have completed the domain rename and those DCs that have not yet completed the domain rename.
=========================================================================

So I guess your workaround is fine.
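The quoted rule can be sketched as a simple equality check on the epoch values. This is only an illustration of the rule as documented, not of how AD or Samba implement it. The DC names are hypothetical, and treating a missing attribute as epoch 0 is an assumption.

```python
from itertools import combinations


def effective_epoch(value):
    """Return the epoch as an int; a missing attribute (None) counts as 0 (assumption)."""
    return 0 if value is None else int(value)


def can_replicate(epoch_a, epoch_b):
    """Per the quoted rule, replication is only allowed between equal epochs."""
    return effective_epoch(epoch_a) == effective_epoch(epoch_b)


def blocked_pairs(epochs):
    """List all pairs of DCs whose epochs differ."""
    return [
        (a, b)
        for a, b in combinations(sorted(epochs), 2)
        if not can_replicate(epochs[a], epochs[b])
    ]


# Hypothetical domain: one DC with epoch 1, plus a joining system without the attribute.
epochs = {"dc1": None, "dc2": 1, "ucs-join": None}
print(blocked_pairs(epochs))
# prints [('dc1', 'dc2'), ('dc2', 'ucs-join')]
```

Under this model, a joining system that presents epoch 0 to a DC with epoch 1 would be refused, which would be consistent with the WERR_DS_DIFFERENT_REPL_EPOCHS error raised during DsGetNCChanges.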
(In reply to Arvid Requate from comment #1) That does not exactly match what I have seen in the customer's environment. In fact, he had 4 AD DCs, of which one had msDS-ReplicationEpoch=1, but the rename happened years ago and replication between all DCs seemed normal.
I found this behavior in an environment with the following specifications:
- Windows 2k8 AD-Master
- UCS 4.1-4 Membermode (syncmode: read) with installed Samba4 (role: DC) [probably installed afterwards]

Replication from MS-AD > OpenLDAP > Samba-AD works just fine, but:
- univention-connector-list-rejected shows >500 'AD rejected' (in sync read?!)
- connector.log shows, for nearly each of these rejects, 'RuntimeError: (8593, 'WERR_DS_DIFFERENT_REPL_EPOCHS')

For more info see ticket#...274
Comment 3 has been split off as Bug 43093, because that's about Member-Mode.
This issue has been filed against UCS 4.0. The maintenance with bug and security fixes for UCS 4.0 has ended on 31st of May 2016. Customers still on UCS 4.0 are encouraged to update to UCS 4.3. Please contact your partner or Univention for any questions. If this issue still occurs in newer UCS versions, please use "Clone this bug" or simply reopen the issue. In this case please provide detailed information on how this issue is affecting you.
I reopened the bug as discussed in the dev consultation meeting, as this problem is a showstopper for our partner; see the attached ticket. I also removed the bug group "Workaround is available", as the existing workaround is only possible if the Windows domain is going to be taken over. In this case, however, the domain is to remain permanently connected to the UCS domain via the connector.
Set the status back to the old value and cloned the bug, as the topic and the component did not match here.