Bug 32685 - Move & selective replication
Move & selective replication
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Listener (univention-directory-listener)
UCS 3.1
Other Linux
: P5 normal (vote)
: UCS 4.1-3-errata
Assigned To: Philipp Hahn
Stefan Gohmann
:
: 42547 (view as bug list)
Depends on: 34355
Blocks: 42616
  Show dependency treegraph
 
Reported: 2013-09-25 07:55 CEST by Stefan Gohmann
Modified: 2017-02-24 17:34 CET (History)
5 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 4: Will affect most installed domains
How will those affected feel about the bug?: 2: A Pain – users won’t like this once they notice it
User Pain: 0.229
Enterprise Customer affected?: Yes
School Customer affected?:
ISV affected?:
Waiting Support:
Ticket number: 2016080221000449
Bug group (optional):
Max CVSS v3 score:


Attachments
Test case: failed LDIF (963 bytes, text/plain)
2014-04-16 19:32 CEST, Philipp Hahn
Details
Fix creating backup (3.02 KB, patch)
2014-04-16 19:37 CEST, Philipp Hahn
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Gohmann univentionstaff 2013-09-25 07:55:50 CEST
The replication handler has the same problem as the s4 connector handler like discovered in Bug #32542.

Steps to reproduce:
 - setup a UCS@school Multi Env with two school slaves
 - create a windows client in cn=computers,$ldap_base
 - move the windows client to cn=computers,ou=school1,$ldap_base

The windows client should be removed from the slave of school2 but it is not.
Comment 1 Tim Petersen univentionstaff 2013-09-25 08:04:07 CEST
Unfortunately I forgot to mention this at Bug #32542.

As the object will be successfully moved on systems which can read ou=school1,$ldap_base (at least the school slave and the master), so, where it is needed, I think this is no major problem.
Comment 2 Philipp Hahn univentionstaff 2014-04-11 08:32:53 CEST
Probably most listener modules will get confused by doing something like
	udm user/users move uid=one,cn=visible,... cn=restricted,...
	udm user/users move uid=other,cn=restricted,... cn=visible,...
as the handler will see this as
	uid=one,cn=visible,... r
	uid=other,cn=visible,... a
and give, for example, the other user the old mailboxes of one.

This could also lead to such strange cases where a object suddenly changes its type, for example a "user" becoming a "group".

The problem could probably be solved by the listener:
1. instead of immediately processing the "r", the listener only stashes the "dn".
2. As normally the "a" is the immediate next command, the listener can then decide, if the previous 'r'
2.1 rewrite the previous 'r' into a 'd' if the target is restricted by ACL. The 'a' can be dropped.
2.2 drop the previous 'r' and only pass the 'a', if the source is restricted by ACL.
2.3 otherwise process the 'r' now to catch-up and then process the 'a' as normal.
Comment 3 Philipp Hahn univentionstaff 2014-04-15 14:36:31 CEST
The replication module also gets it wrong: If multiple objects are moved from a public location to a restricted position, the replication module only creates files in /var/lib/univention-directory-replication/ for each object to store the old DN.
"current_modrdn" points to the latest moved object.

Those objects are not not deleted in the local LDAP:
# udm settings/packages move "$@" --dn cn=test1,$B --position cn=restricted,$B
Object modified: cn=test1,dc=phahn,dc=dev
# udm settings/packages move "$@" --dn cn=test3,$B --position cn=restricted,$B
Object modified: cn=test3,dc=phahn,dc=dev

# ls -gG /var/lib/univention-directory-replication/
-rw------- 1  24 15. Apr 14:20 070aea16-58e4-1033-9f40-4b0e7babd9b7
-rw------- 1  24 15. Apr 14:20 3ac32ca8-58d3-1033-9f3f-4b0e7babd9b7
lrwxrwxrwx 1  78 15. Apr 14:20 current_modrdn -> /var/lib/univention-directory-replication/3ac32ca8-58d3-1033-9f3f-4b0e7babd9b7

# univention-ldapsearch -LLLs base -b cn=test1,$B dn entryUUID
dn: cn=test1,dc=phahn,dc=dev
entryUUID: 070aea16-58e4-1033-9f40-4b0e7babd9b7
# univention-ldapsearch -LLLs base -b cn=test3,$B dn entryUUID
dn: cn=test3,dc=phahn,dc=dev
entryUUID: 3ac32ca8-58d3-1033-9f3f-4b0e7babd9b7
# ldapsearch -LLL -h xen12.phahn.dev -p 7389 -x -D uid=Administrator,cn=users,$B -w univention '(entryUUID=3ac32ca8-58d3-1033-9f3f-4b0e7babd9b7)' dn
dn: cn=test3,cn=restricted,dc=phahn,dc=dev

# grep -A2 restricted /etc/univention/templates/files/etc/ldap/slapd.conf.d/60univention-ldap-server_acl-master access to dn.sub="cn=restricted,dc=phahn,dc=dev"
        by dn.exact="cn=nagi21,cn=dc,cn=computers,dc=phahn,dc=dev" none stop
        by * none break

As 'r' is normally followed by 'a' (expect when selective replication is in use), the replication module should always (and not only for 'm' and 'a') check "current_modrdn" to refer to the current command and if not, remove the previous object referred to by "current_modrdn".


1. dn='cn=test1,dc=phahn,dc=dev' old=True new=False command=r
just store the old DN.
2. dn='cn=test3,dc=phahn,dc=dev' old=True new=False command=r
first check, that the old stored DN does not correspond to this operation, so we can conclude, that the corresponding "a" was missed due to selective replication. We have to remove the old object instead.
Only then process the new operation and again store only the DN. Only with the next transaction can we decide, if we need to delete or move the object.
Comment 4 Philipp Hahn univentionstaff 2014-04-16 19:32:49 CEST
Created attachment 5877 [details]
Test case: failed LDIF

Running the attached test case twice switches the Listener into failed-LDIF-mode.
This happens because when the second run creates the object again, it is still in the LDAP from the previous run but not in the cache. This confuses the replication.py module deadly:

notifier returned = id: 24796   dn: cn=test,cn=visible,dc=phahn,dc=dev  cmd: a
updating cn=test,cn=visible,dc=phahn,dc=dev
checking parent_dn of cn=test,cn=visible,dc=phahn,dc=dev in local LDAP
running handlers for cn=test,cn=visible,dc=phahn,dc=dev
replication: Running handler for: cn=test,cn=visible,dc=phahn,dc=dev
LDAP keys=['hasSubordinates', 'entryCSN', 'cn', 'objectClass', 'univentionObjectType', 'entryUUID', 'modifyTimestamp', 'modifiersName', 'createTimestamp', 'entryDN', 'subschemaSubentry', 'structuralObjectClass', 'creatorsName']; listener keys=[]
replication: old entries from LDAP server and Listener do not match
replication: the current modrdn points to a different entryUUID: /var/lib/univention-directory-replication/60d4e3be-59d2-1033-8cc9-6d1c11cdc722
replication: the DN cn=test,cn=visible,dc=phahn,dc=dev from the current_modrdn_link has to be backuped and removed
replication: dump cn=test,cn=visible,dc=phahn,dc=dev to /var/univention-backup/replication/1397666818.65
dn=cn=test,cn=visible,dc=phahn,dc=dev: No such object
        machted dn: cn=visible,dc=phahn,dc=dev
replication: Running handler for: cn=test,cn=visible,dc=phahn,dc=dev
replication: the current modrdn points to a different entryUUID: /var/lib/univention-directory-replication/60d4e3be-59d2-1033-8cc9-6d1c11cdc722
replication: the DN cn=test,cn=visible,dc=phahn,dc=dev from the current_modrdn_link has to be backuped and removed
replication: dump cn=test,cn=visible,dc=phahn,dc=dev to /var/univention-backup/replication/1397666818.65
handler: replication (successful)
handler: faillog (successful)
handler: printusers (successful)
Could not write to transaction file /var/lib/univention-ldap/listener/listener. Check for /var/lib/univention-directory-replication/failed.ldif
Comment 5 Philipp Hahn univentionstaff 2014-04-16 19:37:42 CEST
Created attachment 5878 [details]
Fix creating backup

This fixed the following 2 bugs:

1. _backup_dn_recursive() tries to backup a non-existing entry, which raises a ldap.NO_SUCH_OBJECT exception, which is not caught. This makes the replication.py module switch to LDIF mode.
In that case an empty backup file was generated.
Shuffle the code to first try the search and only do the backup if that succeeds.

2. For the test case "old_dn" = "dn", so when "old_dn" was backuped and deleted, "old" still contained the values of the now deleted object. replication.py then took the wrong path and tried to use "_modify_object_from_old_and_new()" on a now deleted object.
Comment 6 Philipp Hahn univentionstaff 2014-05-05 08:13:10 CEST
(In reply to Philipp Hahn from comment #2)
> The problem could probably be solved by the listener:
> 1. instead of immediately processing the "r", the listener only stashes the
> "dn".

This was implemented in the listener for Bug 34355 with r49563 and r49564.
This can now be fixed easily in two lines already marked as FIXME:

Always call modules with command='d' when a object is 'removed', be that by a LDAP_DELETE or by selective replication.
Benefit: The modules no longer have to implement their own logic to track consecutive move-from / move-to events to detect moves to/from selectively replicated areas.

diff --git a/branches/ucs-3.2/ucs-3.2-1/management/univention-directory-listener/src/change.c b/branches/ucs-3.2/ucs-3.2-1/management/univention-directory-listener/src/change.c
index 303a7da..628fe52 100644
--- a/branches/ucs-3.2/ucs-3.2-1/management/univention-directory-listener/src/change.c
+++ b/branches/ucs-3.2/ucs-3.2-1/management/univention-directory-listener/src/change.c
@@ -681,15 +681,13 @@ int change_update_dn(struct transaction *trans) {
 
 	rv = ldap_search_ext_s(trans->lp->ld, base, scope, filter, attrs, attrsonly, serverctrls, clientctrls, &timeout, sizelimit, &res);
 	if (rv == LDAP_NO_SUCH_OBJECT) {
-		// FIXME: trans->cur.notify.command = 'd' // to overwrite 'r' without 'a'
-		rv = change_delete_dn(trans->cur.notify.id, trans->cur.notify.dn, trans->cur.notify.command);
+		rv = change_delete_dn(trans->cur.notify.id, trans->cur.notify.dn, 'd');
 	} else if (rv == LDAP_SUCCESS) {
 		if ((trans->ldap = ldap_first_entry(trans->lp->ld, res)) == NULL) {
 			/* entry exists (since we didn't get NO_SUCH_OBJECT),
 			* but was probably excluded through ACLs which makes it
 			* non-existent for us */
-			// FIXME: trans->cur.notify.command = 'd' // to overwrite 'r' without 'a'
-			rv = change_delete_dn(trans->cur.notify.id, trans->cur.notify.dn, trans->cur.notify.command);
+			rv = change_delete_dn(trans->cur.notify.id, trans->cur.notify.dn, 'd');
 		} else {
 			rv = change_update_cache(trans);
 		}
Comment 7 Philipp Hahn univentionstaff 2016-09-29 15:12:44 CEST
Package: univention-directory-listener
Version: 10.0.0-16.330.201609291501
Branch: ucs_4.1-0
Scope: errata4.1-3

r72898 | Bug #32685 UDL: Fix move with selective replication YAMLgit
 univention-directory-listener.yaml

r72899 | Bug #32685 UDL: Fix move with selective replication
Comment 8 Stefan Gohmann univentionstaff 2016-10-04 11:12:14 CEST
*** Bug 42547 has been marked as a duplicate of this bug. ***
Comment 9 Stefan Gohmann univentionstaff 2016-10-10 16:11:49 CEST
Code review: OK

UCS 4.2 merge: OK

ucs-test: OK

YAML: OK

Tests: OK, looks good. I've modified the s4 connector listener module. Every time I move a user into another school, I get only the delete command. Afterwards, the user is removed from Samba 4.
Comment 10 Janek Walkenhorst univentionstaff 2016-10-12 13:06:40 CEST
<http://errata.software-univention.de/ucs/4.1/291.html>