Bug 32257 - Joining of N+1st Samba4 DC may fail due to S4C/DRS replication race
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 3.1
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 3.1-1-errata
Assigned To: Stefan Gohmann
QA Contact: Arvid Requate
Depends on:
Blocks: 31865
Reported: 2013-08-19 13:30 CEST by Arvid Requate
Modified: 2013-08-22 12:06 CEST (History)
Description Arvid Requate univentionstaff 2013-08-19 13:30:37 CEST
Depending on the timing between UDM/S4-Connector and DRS replication, the join of the N+1st Samba4 DC can fail. As a symptom, the 98univention-samba4-dns.inst join script fails with:
==============================================
Configure 98univention-samba4-dns.inst Mon Aug 19 09:25:31 CEST 2013
Waiting for RID Pool replication: ...................................................................................................................................................................................
Error no rIDSetReferences replicated for slave223
==============================================

After the failure, the Samba directory service shows a naming collision for the new DC account object:
==============================================
# record 1
dn: CN=SLAVE223\0ACNF:79dc4fbf-8c89-4621-a399-1833f13cd948,OU=Domain Controllers,DC=deadlock22,DC=local
whenCreated: 20130819072448.0Z

# record 2
dn: CN=slave223,OU=Domain Controllers,DC=deadlock22,DC=local
whenCreated: 20130819072531.0Z
==============================================

The first of these objects is the valid account object (with proper credentials and a RID Set child object), which was created on the DC Backup of the domain. The second object was created by the S4-Connector on the DC Master, as confirmed by connector-s4.log, and does not hold proper credentials etc.
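The collision shown above can be spotted mechanically, because the conflicting object carries a `\0ACNF:<GUID>` marker in its RDN. The following is only a minimal sketch: the function name `find_conflicts` and the idea of feeding it a plain list of DN strings are assumptions for illustration, not part of the join scripts.

```python
import re

# Conflict objects have "\0ACNF:<GUID>" appended to their RDN value,
# as seen in "record 1" of the search output above.
CONFLICT_RE = re.compile(r"\\0ACNF:([0-9a-fA-F-]{36})")


def find_conflicts(dns):
    """Return (dn, guid) pairs for DNs that look like replication conflicts.

    Hypothetical helper: it only inspects DN strings, it does not talk to Samba.
    """
    conflicts = []
    for dn in dns:
        match = CONFLICT_RE.search(dn)
        if match:
            conflicts.append((dn, match.group(1)))
    return conflicts


# The two DNs from the report, written as raw strings so the backslash is kept.
dns = [
    r"CN=SLAVE223\0ACNF:79dc4fbf-8c89-4621-a399-1833f13cd948,OU=Domain Controllers,DC=deadlock22,DC=local",
    "CN=slave223,OU=Domain Controllers,DC=deadlock22,DC=local",
]
conflicts = find_conflicts(dns)
```

Note that in this report the conflict-renamed object is the valid one, so such a check can only flag the collision; it cannot decide which object to keep.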
Comment 1 Arvid Requate univentionstaff 2013-08-19 13:30:59 CEST
The simplest solution might be to join against the S4 Connector host, but this is not a sufficient criterion in UCS@school environments (especially if we want to allow for the proposal of Bug 32187). The next best choice might be to use the DNS SRV record _ldap._tcp.pdc._msdcs. Another option would be to avoid the initial broadcast-join attempt altogether and instead pick one Samba 4 hosting DC at a time and check that the new DC account has been replicated into its local Samba directory service before initiating the samba-tool domain join.
Comment 2 Stefan Gohmann univentionstaff 2013-08-20 08:07:15 CEST
The univention-samba4 join script now checks whether the host account has been replicated to the Samba 4 directory of the connector host. It waits for at most five minutes.
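The wait described here follows a common poll-with-timeout pattern. The sketch below is not the code from the join script; it only illustrates the idea under stated assumptions: `wait_for_replication` and `is_replicated` are hypothetical names, the 300-second default mirrors the five-minute limit, and the 5-second polling interval is an arbitrary choice.

```python
import time


def wait_for_replication(is_replicated, timeout=300.0, interval=5.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll is_replicated() until it returns True or timeout seconds pass.

    is_replicated is a caller-supplied callable (hypothetical), for example
    a search for the host account in the connector host's Samba 4 directory.
    Returns True if the account showed up in time, False otherwise.
    """
    deadline = clock() + timeout
    while True:
        if is_replicated():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in check that succeeds on the third attempt, so the example runs
# without a Samba installation and without real waiting.
attempts = []


def fake_check():
    attempts.append(1)
    return len(attempts) >= 3


ok = wait_for_replication(fake_check, timeout=300, interval=0,
                          sleep=lambda seconds: None)
```

A join script using this pattern would abort, or at least warn, when the function returns False instead of going on to the Samba join.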

I've created a temporary DVD for amd64 to make the test easier since the error occurred during the installation: ucs_3.1-1-latest-amd64.iso

You can use it in this way:
ucs-kt-instance-create -O ucs -V 3.1 -A amd64 -N UCS-3.1-Test-System -i /mnt/omar/vmwares/kvm/iso/iso-tests/ucs_3.1-1-latest-amd64.iso

3.1-1 Code: r43308
3.1-1 YAML: r43311
3.2 Code: r43313
3.2 Changelog: r43312
Comment 3 Arvid Requate univentionstaff 2013-08-21 13:30:12 CEST
Verified fixed, Advisory ok.

===============================================================
Configure 96univention-samba4.inst Wed Aug 21 11:14:48 CEST 2013
[...]
Found DC backup141.arbug32257.qa
[...]
Starting Samba 4 daemon: samba.
Waiting for DRS replication:  done
Object exists: cn=services,cn=univention,dc=arbug32257,dc=qa
===============================================================

Code also merged to UCS 3.2-0
Comment 4 Janek Walkenhorst univentionstaff 2013-08-22 12:06:47 CEST
http://errata.univention.de/ucs/3.1/170.html