Bug 33399 - 'Waiting for DRS replication' failed on a school slave
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 4.0
Hardware: Other Linux
Importance: P2 normal
Target Milestone: UCS 4.1-0-errata
Assigned To: Arvid Requate
QA Contact: Felix Botner
Depends on:
Blocks: 40387
Reported: 2013-11-15 07:10 CET by Stefan Gohmann
Modified: 2016-02-04 13:58 CET

Bug group (optional): External feedback, Troubleshooting
requate: Patch_Available+


Attachments
consider_only_S4_connector_hosts_with_DRS_connection.patch (1.23 KB, patch)
2014-04-24 13:37 CEST, Arvid Requate

Description Stefan Gohmann univentionstaff 2013-11-15 07:10:01 CET
From the join.log of a school slave with samba4:

Waiting for DRS replication: .................................................................................
........................................................................................................................................................................................................................... failed


This happens with UCS 3.1 and UCS 3.2. I think the new password is not synced to the S4 of the master:

root@slave2032:~# ldbsearch -H ldap://master203 -U slave2032\$%$(</etc/machine.secret) 
SPNEGO(gssapi_krb5) creating NEG_TOKEN_INIT failed: NT_STATUS_INVALID_PARAMETER
Failed to bind - LDAP error 49 LDAP_INVALID_CREDENTIALS -  <SASL:[GSS-SPNEGO]: NT_STATUS_LOGON_FAILURE> <>
Failed to connect to 'ldap://master203' with backend 'ldap': (null)
Failed to connect to ldap://master203 - (null)
root@slave2032:~#
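
The manual check above can be wrapped in a small helper for repeated use. This is only a sketch: the function name, its arguments and the LDBSEARCH override are hypothetical and not part of any UCS package, and the base-scope search is an assumption made to keep the output small. Like the command above, it passes the password on the command line, where it is visible in the process list.

```sh
# Hypothetical helper: returns 0 if the machine account can bind to the
# Samba 4 LDAP server on the given DC, non-zero otherwise.
# Usage: machine_account_can_bind DC_HOST MACHINE_NAME SECRET_FILE
machine_account_can_bind() {
    dc_host="$1"
    machine_name="$2"
    secret_file="$3"
    # Read the machine password; give up early if the file is unreadable.
    secret=$(cat "$secret_file") || return 2
    # LDBSEARCH may be overridden for testing; it defaults to the real tool.
    # The account name is the machine name followed by a literal '$'.
    "${LDBSEARCH:-ldbsearch}" -H "ldap://${dc_host}" \
        -U "${machine_name}\$%${secret}" \
        -s base '(objectClass=*)' dn >/dev/null 2>&1
}

# Example call, using the host names from the output above:
#   machine_account_can_bind master203 slave2032 /etc/machine.secret \
#       || echo "machine account cannot bind to master203"
```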
Comment 1 Arvid Requate univentionstaff 2013-11-18 17:14:30 CET
Ok, as discussed offline, some more info about this:

This check was introduced for Bug 32257, but on a UCS@school DC Slave it doesn't work, isn't required, and only adds a needless timeout delay, so it should be avoided in this case. If Bug 33388 gets fixed, this check may even be removed completely.
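
To illustrate where the delay comes from, here is a minimal sketch of a bounded polling loop that prints one dot per failed attempt, as seen in the join.log excerpt. It is not the actual joinscript code; wait_for and its arguments are hypothetical names. When the check can never succeed, the full max_tries * delay seconds are spent before "failed" is printed.

```sh
# Hypothetical sketch of a bounded wait loop.
# Usage: wait_for CHECK_CMD MAX_TRIES [DELAY_SECONDS]
wait_for() {
    check_cmd="$1"
    max_tries="$2"
    delay="${3:-1}"
    tries=0
    printf 'Waiting for DRS replication: '
    while [ "$tries" -lt "$max_tries" ]; do
        # Stop as soon as the caller-supplied check succeeds.
        if "$check_cmd"; then
            echo 'done'
            return 0
        fi
        printf '.'
        tries=$((tries + 1))
        sleep "$delay"
    done
    # All attempts used up without success.
    echo ' failed'
    return 1
}
```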
Comment 2 Arvid Requate univentionstaff 2014-04-24 13:37:41 CEST
Created attachment 5883
consider_only_S4_connector_hosts_with_DRS_connection.patch

This patch improves detection of the S4 Connector host in the univention-samba4 joinscript. It adds a test for whether the DC advertising univentionService="S4 Connector" is available for DRS replication.

This should fix three unnecessary timeout situations of 'Waiting for DRS replication':

1) The problem of this bug, where a UCS@school slave PDC waits for DRS replication of its own machine account.

2) In UCS@school domains with more than one S4 Connector, the current joinscript finds all of them and tries to connect to the combined string, e.g. "master slave1 slave2", which is not a valid hostname.

3) In UCS@school domains that already have one Samba4 school DC Slave, installing a second one makes the current joinscript wait for the first school DC Slave to replicate the machine account of the second school's DC.
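
The idea behind the fix for 2) can be sketched as follows: treat the lookup result as a list of candidates and keep only the first host that passes a DRS availability test. This is not the code from the attached patch; pick_connector_host and demo_has_drs_connection are hypothetical names, and the demo check is a stand-in that simply accepts "master".

```sh
# Hypothetical helper: print the first candidate host accepted by CHECK_CMD.
# Usage: pick_connector_host CHECK_CMD HOST...
pick_connector_host() {
    check_cmd="$1"
    shift
    for host in "$@"; do
        if "$check_cmd" "$host"; then
            printf '%s\n' "$host"
            return 0
        fi
    done
    # No candidate passed the check.
    return 1
}

# Stand-in check for demonstration only; a real check would test whether
# a DRS connection to "$1" exists.
demo_has_drs_connection() {
    [ "$1" = "master" ]
}

# What a lookup returning several S4 Connector hosts might yield.
candidates="master slave1 slave2"
# Unquoted on purpose, so the list is split into separate host names.
s4_connector_host=$(pick_connector_host demo_has_drs_connection $candidates) \
    || s4_connector_host=""
echo "Selected: ${s4_connector_host:-none}"
```

If no candidate passes the check, the variable stays empty and the caller can skip the wait instead of running into the timeout.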
Comment 3 Michael Grandjean univentionstaff 2015-07-29 11:06:14 CEST
Customer experienced this with a non-edu school slave (4.0-2 errata 263)

The attached patch worked (at least the joinscripts finished with EXITCODE=0 and superficial tests were okay).
Comment 4 Sönke Schwardt-Krummrich univentionstaff 2015-11-11 10:05:02 CET
Another customer experienced this problem (2015110521000117)
Comment 5 Michael Grandjean univentionstaff 2015-12-15 17:07:22 CET
Once more during UCS@school Workshop ... 

UCS 4.1-0 errata 29, UCS@school Edu Slave
Comment 6 Stefan Gohmann univentionstaff 2016-01-07 14:15:22 CET
Since it has to be fixed in the Samba 4 package, I'll move it to the Samba 4 component.
Comment 7 Michael Grandjean univentionstaff 2016-01-17 20:28:49 CET
Hit me again on one of my test systems during re-join:

UCS 4.0-4 Errata 377
ucsschool_20151201
Comment 8 Michael Grandjean univentionstaff 2016-01-22 21:03:05 CET
Noticed by a customer, mentioned on summit.
I don't know why the join script sometimes fails at that point and sometimes doesn't. Nevertheless, even when it doesn't fail, the "Waiting for DRS replication" step slows down the join process and should be removed.
Comment 9 Arvid Requate univentionstaff 2016-01-27 19:34:01 CET
Adjusted:

* The joinscript
* check_essential_samba4_dns_records

Advisory: univention-samba4.yaml
Comment 10 Felix Botner univentionstaff 2016-01-28 19:20:38 CET
ucs@school

OK - no DRS REPL IN JOIN on first school slave
OK - no DRS REPL IN JOIN on second school slave
OK - no DRS REPL IN JOIN on first school slave during rejoin
OK - no DRS REPL IN JOIN on second school slave during rejoin

ucs

OK - DRS REPL IN JOIN second s4 server
OK - DRS REPL IN JOIN third s4 server
OK - DRS REPL IN JOIN second s4 server during rejoin
OK - DRS REPL IN JOIN third s4 server during rejoin


OK - continue if server object can not be found in 
     scripts/check_essential_samba4_dns_records.sh
Comment 11 Janek Walkenhorst univentionstaff 2016-02-04 13:58:47 CET
<http://errata.software-univention.de/ucs/4.1/95.html>