41168 – UCS@school Samba/AD Slaves left over in Master DNS


Status:	CLOSED FIXED

Alias:	None

Product:	UCS
Classification:	Unclassified
Component:	Samba4
Version:	UCS 4.1
Hardware:	Other Linux

Importance:	P5 normal
Target Milestone:	UCS 4.1-3-errata
Assignee:	Stefan Gohmann
QA Contact:	Felix Botner

Duplicates (1):	41167

Reported:	2016-04-27 19:01 CEST by Arvid Requate
Modified:	2019-12-03 09:51 CET
CC List:	3 users

See Also:	41167 50280
Bug group (optional):	Troubleshooting

Attachments
remove_ucsschool_samba4_slaves_from_dns.sh (1.85 KB, text/plain) 2016-04-27 19:01 CEST, Arvid Requate

Description Arvid Requate

2016-04-27 19:01:13 CEST

Created attachment 7629
remove_ucsschool_samba4_slaves_from_dns.sh

In the situation described in Bug #41167, the UCS@school DC Slaves were also still present in the DNS of the UCS@school Samba/AD Master.


The attached script may be useful for support to clean this up.


Generally this should be taken care of by the S4-Connector on the Master via the connector/s4/mapping/dns/* UCS variables.



+++ This bug was initially created as a clone of Bug #41167 +++

Ticket#2016042721000409 reported samba (dreplsrv) consuming 100% CPU due to Slave accounts left over in the Samba/AD account database on the UCS@school Master.

Comment 1 Sönke Schwardt-Krummrich

2016-07-07 23:27:01 CEST

@Arvid: What impact does it have if the school slaves are present in the DNS of the UCS@school Samba/AD Master?

Comment 2 Arvid Requate

2016-07-11 13:51:17 CEST

Well, if they are advertised in the SRV records, e.g. _kerberos._tcp, then they will be contacted by clients. Since the UCS@school Slave (branch site) DCs don't have the full user database, this may result in, for example, intermittent authentication errors.
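
To illustrate the check implied here, the following is a minimal sketch that flags SRV record targets pointing at school slaves. It assumes the SRV records have already been collected as (record name, target host) pairs by some other means; the function name stale_srv_targets and the example FQDNs are hypothetical and not taken from the attached script.

```python
from typing import Iterable, List, Set, Tuple


def stale_srv_targets(
    srv_records: Iterable[Tuple[str, str]],
    slave_hosts: Set[str],
) -> List[Tuple[str, str]]:
    """Return the (srv_name, target) pairs whose target is a school slave.

    Host names are compared case-insensitively and without the
    trailing dot that DNS tools often print.
    """
    normalized = {host.rstrip(".").lower() for host in slave_hosts}
    return [
        (name, target)
        for name, target in srv_records
        if target.rstrip(".").lower() in normalized
    ]


if __name__ == "__main__":
    # Hypothetical example data; the FQDNs are assumptions for illustration.
    records = [
        ("_kerberos._tcp", "master300.autotest300.local."),
        ("_kerberos._tcp", "slave300-s1.autotest300.local."),
    ]
    for name, target in stale_srv_targets(
        records, {"slave300-s1.autotest300.local"}
    ):
        print(f"{name} still points at school slave {target}")
```

Any pair returned by this sketch would be a candidate for cleanup on the Master, since clients resolving that SRV record could be directed to a branch site DC.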

Comment 3 Stefan Gohmann

2016-07-11 16:47:30 CEST

See also Bug #41167.

Comment 4 Arvid Requate

2016-07-12 15:47:59 CEST

The case of Ticket#2016071121000755 again showed the intricate replication issues this causes:

After performing the steps to get out of the Bug #41167 situation in that case, the two DCs in the central school department had replication issues, apparently because they tried and failed to get proper information from the School Slave. We had to:

1. run a modified version of the cleanup-script above

2. stop and start samba on the DC Master (service samba stop, then service samba start);
   a samba restart was not enough in this case, apparently because it left
   a stuck kccserv process running, causing this showrepl error:

ERROR(runtime): DsReplicaGetInfo of type 0 failed - (-1073610699, 'The operation cannot be performed.')

   After step 1 or 2, DRS replication traffic ramped up to 100% on the DC Master;
   possibly some changes had not yet been replicated to "the other DC"

3. stop and start samba on "the other DC" to get rid of "WERR_BAD_NETPATH" in the
   showrepl output

4. Wait for DRS-replication to stabilize

Altogether, neither a pleasant nor a straightforward experience.

Comment 5 Stefan Gohmann

2016-08-29 09:50:31 CEST

It looks like I'm able to reproduce it. For Bug #41167 I've implemented a "_remove_slavepdc_account_from_master_s4" in the Slave PDC join script: 96univention-samba4slavepdc.inst.

After a school slave has been removed, I see on the DC Backup:

DC=autotest300,DC=local
        Default-First-Site-Name\MASTER300 via RPC
                DSA object GUID: 8380008f-135a-4868-bb07-95a32cd687ec
                Last attempt @ Mon Aug 29 03:45:07 2016 EDT failed, result 58 (WERR_BAD_NET_RESP)
                17 consecutive failure(s).
                Last success @ Mon Aug 29 03:17:09 2016 EDT
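
Output of this shape can be scanned for failing partners automatically. The following is a minimal sketch, assuming the text has the layout shown above (a "SITE\DC via RPC" line followed by "Last attempt" and "consecutive failure(s)" lines); the function name failing_partners is hypothetical, and the parser is not part of any Samba or UCS tool.

```python
import re
from typing import Dict, List

# Patterns derived from the output layout quoted in this comment.
PARTNER_RE = re.compile(r"^\s*[^\\\s]+\\(?P<dc>\S+) via RPC\s*$")
FAILED_RE = re.compile(r"failed, result \d+ \((?P<name>\w+)\)")
COUNT_RE = re.compile(r"^\s*(?P<n>\d+) consecutive failure\(s\)\.")


def failing_partners(showrepl_text: str) -> List[Dict[str, object]]:
    """Return one dict per replication partner with at least one failure."""
    partners: List[Dict[str, object]] = []
    current = None
    for line in showrepl_text.splitlines():
        match = PARTNER_RE.match(line)
        if match:
            # A new partner block starts here.
            current = {"dc": match.group("dc"), "error": None, "failures": 0}
            partners.append(current)
            continue
        if current is None:
            continue
        match = FAILED_RE.search(line)
        if match:
            current["error"] = match.group("name")
        match = COUNT_RE.match(line)
        if match:
            current["failures"] = int(match.group("n"))
    return [p for p in partners if p["failures"] > 0]


if __name__ == "__main__":
    import sys

    # Usage: pipe the replication status text into this script on stdin.
    for partner in failing_partners(sys.stdin.read()):
        print(partner)
```

Partners whose failure count is zero are dropped, so an empty result would indicate that no inbound or outbound neighbour in the parsed text is currently failing.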


And from the log.samba file:

[2016/08/29 03:45:07.706391,  0, pid=3676] ../source4/dsdb/repl/replicated_objects.c:783(dsdb_replicated_objects_commit)
  Failed to apply records: ../ldb_tdb/ldb_index.c:1216: Failed to re-index objectGUID in CN=slave300-s1\0ACNF:65eb34c9-4121-4583-8785-b0ad90b555aa,CN=dc,CN=server,CN=computers,OU=School1,DC=autotest300,DC=local - ../ldb_tdb/ldb_index.c:1148: unique index violation on objectGUID in CN=slave300-s1\0ACNF:65eb34c9-4121-4583-8785-b0ad90b555aa,CN=dc,CN=server,CN=computers,OU=School1,DC=autotest300,DC=local: Entry already exists
[2016/08/29 03:45:07.706556,  0, pid=3676] ../source4/dsdb/repl/drepl_out_helpers.c:773(dreplsrv_op_pull_source_apply_changes_trigger)
  Failed to commit objects: WERR_GENERAL_FAILURE/NT_STATUS_INVALID_NETWORK_RESPONSE

This Jenkins job can be used to reproduce it:
http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Large%20Environment

Comment 6 Stefan Gohmann

2016-09-01 06:23:33 CEST

*** Bug 41167 has been marked as a duplicate of this bug. ***

Comment 7 Stefan Gohmann

2016-09-01 06:27:43 CEST

The Slave account is now "demoted" manually, see 96univention-samba4slavepdc.inst.

The tests were successful:
http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Large%20Environment/10/testReport/

I execute 'samba-tool drs kcc' on only one central Samba server. The others will execute it within five minutes.

Comment 8 Felix Botner

2016-09-13 10:24:50 CEST

OK - update, slave demoted on master, back (userAccountControl: 4096)
OK - installation, userAccountControl: 4096 for slave in master's samba
OK - yaml
OK - merged to 4.2-0
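
For readers unfamiliar with the value checked above: 4096 (0x1000) is the WORKSTATION_TRUST_ACCOUNT bit of userAccountControl, whereas a domain controller account carries SERVER_TRUST_ACCOUNT (0x2000). The following minimal sketch decodes the value; the helper names decode_uac and looks_demoted are hypothetical, and only a subset of the flags is listed.

```python
# Subset of the Active Directory userAccountControl flag bits.
UAC_FLAGS = {
    0x0002: "ACCOUNTDISABLE",
    0x0200: "NORMAL_ACCOUNT",
    0x1000: "WORKSTATION_TRUST_ACCOUNT",
    0x2000: "SERVER_TRUST_ACCOUNT",
    0x80000: "TRUSTED_FOR_DELEGATION",
}


def decode_uac(value: int) -> list:
    """Return the names of the known flags set in value, lowest bit first."""
    return [name for bit, name in sorted(UAC_FLAGS.items()) if value & bit]


def looks_demoted(value: int) -> bool:
    """True if the account is a plain member account, not a DC account."""
    return bool(value & 0x1000) and not (value & 0x2000)


if __name__ == "__main__":
    print(decode_uac(4096))       # ['WORKSTATION_TRUST_ACCOUNT']
    print(decode_uac(532480))     # ['SERVER_TRUST_ACCOUNT', 'TRUSTED_FOR_DELEGATION']
    print(looks_demoted(4096))    # True
```

The value 532480 used in the example is the usual userAccountControl of a domain controller account and is given here only for contrast; it does not appear in this bug.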

Comment 9 Janek Walkenhorst

2016-09-14 15:38:57 CEST

<http://errata.software-univention.de/ucs/4.1/264.html>