Bug 53450 - AD Connector rejects Userobject if Connector msDS-ReplicationEpoch is set
Status: NEW
Product: UCS
Classification: Unclassified
Component: AD Connector
Version: UCS 5.0
Hardware: Other Linux
Importance: P5 normal
Target Milestone: ---
Assigned To: Samba maintainers
Depends on: 43093
Blocks:
Reported: 2021-06-15 17:10 CEST by Dirk Schnick
Modified: 2021-07-28 12:38 CEST

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 4: A User would return the product
User Pain: 0.091
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2021060821000388
Bug group (optional):
Max CVSS v3 score:
requate: Patch_Available+


Attachments
diff (4.63 KB, text/plain)
2021-06-22 13:49 CEST, Dirk Schnick

Description Dirk Schnick univentionstaff 2021-06-15 17:10:04 CEST
The root cause seems to be 
https://bugzilla.samba.org/show_bug.cgi?id=9500

I cloned Bug 38234 because its topic and component did not match. In an environment where a renamed Windows domain is to be connected to a UCS domain via the AD Connector, there is no workaround available. All user objects are rejected because the value of the attribute cannot be handled.

In the given ticket this problem is a showstopper.

I opened the bug with component Samba, as the main cause is located in the Samba code.



+++ This bug was initially created as a clone of Bug #38234 +++

Ticket: 2015041021000201

Takeover of a 2003 AD Domain failed with the following traceback:


2015-04-10 16:04:03,404 Pre-loading the Samba 4 and AD schema
2015-04-10 16:04:03,541 A Kerberos configuration suitable for Samba 4 has been generated at /var/lib/samba/private/krb5.conf
2015-04-10 16:04:03,792 ERROR(runtime): uncaught exception - (8593, 'WERR_DS_DIFFERENT_REPL_EPOCHS')
2015-04-10 16:04:03,792   File "/usr/lib/python2.7/dist-packages/samba/netcmd/__init__.py", line 175, in _run
2015-04-10 16:04:03,792     return self.run(*args, **kwargs)
2015-04-10 16:04:03,792   File "/usr/lib/python2.7/dist-packages/samba/netcmd/domain.py", line 620, in run
2015-04-10 16:04:03,793     keep_existing=keep_existing)
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 1190, in join_DC
2015-04-10 16:04:03,793     ctx.do_join()
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 1095, in do_join
2015-04-10 16:04:03,793     ctx.join_replicate()
2015-04-10 16:04:03,793   File "/usr/lib/python2.7/dist-packages/samba/join.py", line 818, in join_replicate
2015-04-10 16:04:03,793     replica_flags=ctx.replica_flags)
2015-04-10 16:04:03,794   File "/usr/lib/python2.7/dist-packages/samba/drs_utils.py", line 252, in replicate
2015-04-10 16:04:03,810     (level, ctr) = self.drs.DsGetNCChanges(self.drs_handle, req_level, req)
2015-04-10 16:04:03,814 checking sAMAccountName


The NTDS-Settings object of one of the AD DCs had msDS-ReplicationEpoch set to 1. I assume this was set due to a domain name change in the past.
The affected DC was demoted to a member server (dcpromo), but that only led to msDS-ReplicationEpoch=1 being set on another DC's NTDS-Settings object.

We then decided to remove the attribute from AD (although this is rated "catastrophic") and the takeover traceback was gone. I don't have details about the overall state of the domain now, so I won't recommend this as a workaround for now.

Please see the following links for details:

https://technet.microsoft.com/de-de/library/aa996670%28v=exchg.80%29.aspx?f=255&MSPPError=-2147217396
https://bugzilla.samba.org/show_bug.cgi?id=9500
Comment 2 Arvid Requate univentionstaff 2021-06-15 19:16:58 CEST
Small findings from looking at the code: drs_DsBind() (in /usr/lib/python3/dist-packages/samba/drs_utils.py) gets called via drsuapi_connect() from open_drs_connection() (in modules/univention/connector/ad/__init__.py).

As the code currently stands, only drs_DsBind() would have access to info.repl_epoch (which is also what the more general Samba bug states).
But I'm not sure how the repl_epoch is communicated to the AD server during the DsGetNCChanges call that finally causes the exception.
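
A toy sketch of one possible explanation, not Samba or connector code: it assumes the DC remembers the replication epoch the client sent at bind time and compares it against its own msDS-ReplicationEpoch when DsGetNCChanges is called on that handle. The names BindInfo, ToyDC and replicate are hypothetical and exist only for this illustration. Only the error name and number come from the traceback above.

```python
from dataclasses import dataclass

# Error number and name as they appear in the traceback of this report.
WERR_DS_DIFFERENT_REPL_EPOCHS = 8593


@dataclass
class BindInfo:
    """Hypothetical stand-in for the bind info a client sends in DsBind."""
    repl_epoch: int = 0  # assumption: an unset epoch is sent as 0


class ToyDC:
    """Toy model of a DC; assumes the epoch check happens per bind handle."""

    def __init__(self, repl_epoch):
        self.repl_epoch = repl_epoch  # models msDS-ReplicationEpoch on NTDS-Settings
        self._handles = {}            # handle -> epoch the client announced at bind

    def ds_bind(self, bind_info):
        # Remember what the client claimed so later calls can be checked.
        handle = len(self._handles) + 1
        self._handles[handle] = bind_info.repl_epoch
        return handle

    def ds_get_nc_changes(self, handle):
        # Reject replication if the client's epoch differs from the DC's epoch.
        if self._handles[handle] != self.repl_epoch:
            raise RuntimeError(WERR_DS_DIFFERENT_REPL_EPOCHS,
                               "WERR_DS_DIFFERENT_REPL_EPOCHS")
        return "changes"


def replicate(dc, client_epoch):
    """Bind with the given epoch, then request changes on that handle."""
    handle = dc.ds_bind(BindInfo(repl_epoch=client_epoch))
    return dc.ds_get_nc_changes(handle)
```

Under this assumption, a client that binds with the default epoch against a DC whose epoch is 1 fails only at the later DsGetNCChanges call, which would match the symptom. Passing the DC's epoch in the bind info makes the same call succeed.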
Comment 3 Arvid Requate univentionstaff 2021-06-15 19:25:59 CEST
Please ignore the cloned details of the Bug #38234 quoted in the Bug description above. This bug is not about AD-Takeover and not about joining into an AD-domain.
Comment 4 Dirk Schnick univentionstaff 2021-06-22 13:49:02 CEST
Created attachment 10756 [details]
diff
Comment 5 Dirk Schnick univentionstaff 2021-06-22 13:50:14 CEST
The customer said he had once found a bug report containing the attached changes (diff), which stopped these rejects in the past. Unfortunately, the customer can no longer find that bug, and I have not found any bug that is connected to this problem.
However, this change ensured that the rejects no longer occurred. Maybe this is helpful for a solution?
Comment 6 Arvid Requate univentionstaff 2021-06-22 17:43:19 CEST
Yes, that's basically the patch I was talking about in Comment 2. Nice that it works, so we should do that.
Comment 7 Arvid Requate univentionstaff 2021-06-22 17:45:38 CEST
We should also backport the fix to UCS 4.4
Comment 9 Arvid Requate univentionstaff 2021-06-24 16:18:39 CEST
I guess the patch was derived from Bug #43093 Comment 2
Comment 10 Dirk Schnick univentionstaff 2021-07-28 12:38:57 CEST
Patch worked in customer environment on UCS 4.4