Bug 33621 - Concurrent changes in S4 and OpenLDAP
Concurrent changes in S4 and OpenLDAP
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: S4 Connector
UCS 3.2
Other Linux
: P5 normal (vote)
: UCS 3.2-2-errata
Assigned To: Stefan Gohmann
Arvid Requate
:
: 34205 (view as bug list)
Depends on:
Blocks: 34450
  Show dependency treegraph
 
Reported: 2013-11-30 20:34 CET by Stefan Gohmann
Modified: 2020-05-19 15:14 CEST (History)
3 users (show)

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments
bug_33621-32263.patch (42.33 KB, patch)
2014-05-06 14:36 CEST, Stefan Gohmann
Details | Diff
bug_33621-32263_conffiles.patch (9.80 KB, patch)
2014-05-06 15:01 CEST, Stefan Gohmann
Details | Diff
s4cache_EntryDiff.patch (501 bytes, patch)
2014-07-21 19:04 CEST, Arvid Requate
Details | Diff
objectGUID.patch (6.00 KB, patch)
2014-07-21 19:28 CEST, Arvid Requate
Details | Diff
sync_to_ucs_call.patch (753 bytes, patch)
2014-07-21 19:54 CEST, Arvid Requate
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Stefan Gohmann univentionstaff 2013-11-30 20:34:40 CET
We have some reports about concurrent changes, see Bug #18501 for the AD Connector issue.

(In reply to Ingo Steuwer from Bug #18501 comment #12)
> reported by: 2013081521023638
> 
> S4-connector deletes a group that was deleted and recreated by UDM

A similar issue is Bug #32263.

I think we are unable to solve all issues, for example if the connector is stopped and on both sides the description of a user is changed to different values. Only one side can win.

But we should avoid the re-creation or re-deletion of objects.
Comment 1 Stefan Gohmann univentionstaff 2013-12-06 21:50:17 CET
If we will use a time stamp based sync we have to check the group member sync separately because we are using for the group members a diff based sync. And we can not change to a time stamp sync for members because this would break basic use cases.
Comment 2 Stefan Gohmann univentionstaff 2014-01-08 07:23:17 CET
I've added two test cases to demonstrate this issue:
 - 52_s4connector/020_concurrent_changes_in_ucs
 - 52_s4connector/021_concurrent_changes_in_ad
Comment 3 Stefan Gohmann univentionstaff 2014-03-07 15:10:31 CET
*** Bug 34205 has been marked as a duplicate of this bug. ***
Comment 4 Philipp Hahn univentionstaff 2014-03-20 14:12:40 CET
During QA for Bug #34267 it happened multiple times, that after a ucs-test run the test-group still existed, as it got re-added by the S4 connector after it had already been cleaned-up at the end of the test.
Same happened for modifying the user, when an added email-address was removed again by the S4 connector.
Comment 5 Stefan Gohmann univentionstaff 2014-04-11 11:48:55 CEST
(In reply to Philipp Hahn from comment #4)
> During QA for Bug #34267 it happened multiple times, that after a ucs-test
> run the test-group still existed, as it got re-added by the S4 connector

I've created two test cases for this issue:
 * 52_s4connector/022_concurrent_deletion_in_ucs
 * 52_s4connector/023_concurrent_deletion_in_ad
Comment 6 Stefan Gohmann univentionstaff 2014-04-11 11:55:44 CEST
(In reply to Stefan Gohmann from comment #5)
> (In reply to Philipp Hahn from comment #4)
> > During QA for Bug #34267 it happened multiple times, that after a ucs-test
> > run the test-group still existed, as it got re-added by the S4 connector
> 
> I've created two test cases for this issue:
>  * 52_s4connector/022_concurrent_deletion_in_ucs
>  * 52_s4connector/023_concurrent_deletion_in_ad

This is Bug #32263.
Comment 7 Stefan Gohmann univentionstaff 2014-04-22 08:45:08 CEST
The defined attributes are now synced in diff mode.

The password and the group membership is currently synced via separate sync functions. I've added a check in the password sync functions, the password are now only synchronized if one password attribute was changed.

The membership function is currently untouched.

Test cases:
 52_s4connector/021_concurrent_changes_in_ad
 52_s4connector/020_concurrent_changes_in_ucs
 52_s4connector/120sync_create_and_modify_ucs_user
 52_s4connector/124sync_modify_con_other_attributes

Code: r48654, r48656, r49219, r49225, r49466, r49467, r49468

YAML: r49473
Comment 8 Stefan Gohmann univentionstaff 2014-05-06 14:35:23 CEST
This bug will be moved to 3.2-2-errata.
Comment 9 Stefan Gohmann univentionstaff 2014-05-06 14:36:48 CEST
Created attachment 5893 [details]
bug_33621-32263.patch

This patch was reverted in order to get the package ready for 3.2-1-errata.
Comment 10 Stefan Gohmann univentionstaff 2014-05-06 15:01:46 CEST
Created attachment 5894 [details]
bug_33621-32263_conffiles.patch

Another patch
Comment 11 Arvid Requate univentionstaff 2014-05-28 17:50:00 CEST
Package version 8.0.33-47.473.201405151521 seems to break objectSid replication from Samba4 to OpenLDAP. After successfully joining a Windows client (not pre-created) the situation is as follows. Manual modification of the description on the Samba4 object also doesn't cause the sid to be synchronized. Machine snapshot is available.


# WIN7PRO231, computers, ar320i1.qa
dn: cn=WIN7PRO231,cn=computers,dc=ar320i1,dc=qa
sambaSID: S-1-4-2010
description: foo2

root@master40:~# univention-s4search cn=win7pro231 objectsid description
# record 1
dn: CN=WIN7PRO231,CN=Computers,DC=ar320i1,DC=qa
objectSid: S-1-5-21-2504708665-2173701359-1147132429-1602
description: foo2
Comment 12 Stefan Gohmann univentionstaff 2014-05-30 07:16:19 CEST
(In reply to Arvid Requate from comment #11)
> Package version 8.0.33-47.473.201405151521 seems to break objectSid
> replication from Samba4 to OpenLDAP. After successfully joining a Windows
> client (not pre-created) the situation is as follows. Manual modification of
> the description on the Samba4 object also doesn't cause the sid to be
> synchronized. Machine snapshot is available.

Yes, the change of the description does no longer tigger the synchronization of the other attributes.

I'm unable to reproduce the SID difference. Could you append the connector-s4.log with debug level 4?
Comment 14 Stefan Gohmann univentionstaff 2014-07-07 07:20:45 CEST
The following things need to be done:

* AD Takeover tests
In a short test the user krbtgt was rejected.  

* Re-sync one object
Previously it was possible to sync the whole object with the change of the description. Now it should be added a script which removes the object from the cache.
Comment 15 Stefan Gohmann univentionstaff 2014-07-07 07:50:08 CEST
Another issue:

* We don't check the values of the attributes. If I change the same attribute in a loop, the value is overwritten. The test cases 020_concurrent_changes_in_ucs_same_attribute and 021_concurrent_changes_in_ad_same_attribute demonstrate this case.
Comment 16 Stefan Gohmann univentionstaff 2014-07-11 06:24:47 CEST
(In reply to Stefan Gohmann from comment #15)
> Another issue:
> 
> * We don't check the values of the attributes. If I change the same
> attribute in a loop, the value is overwritten. The test cases
> 020_concurrent_changes_in_ucs_same_attribute and
> 021_concurrent_changes_in_ad_same_attribute demonstrate this case.

This issue was separated to Bug #35336.
Comment 17 Stefan Gohmann univentionstaff 2014-07-11 20:25:31 CEST
I've added some new helper tools for debugging this changes:

   - DN2base64Guid.py
     Converts the objectGUID of an S4 object to base64. This is helpful
     for debugging the /etc/univention/connector/s4cache.sqlite values.
   - base64Guid2DN.py
     Decodes the base64 based objectGUID and returns the S4 DN. This is
     helpful for debugging the /etc/univention/connector/s4cache.sqlite
     values.
   - resync_object_from_s4.py
     Removes all cache entries of an S4 object and resyncs this object
     from S4 to OpenLDAP.
   - resync_object_from_ucs.py
     Removes all cache entries of an UCS object and resyncs this object
     from OpenLDAP to UCS.
Comment 18 Stefan Gohmann univentionstaff 2014-07-11 21:01:50 CEST
(In reply to Stefan Gohmann from comment #14)
> The following things need to be done:
> 
> * AD Takeover tests
> In a short test the user krbtgt was rejected.  

I wasn't able to reproduce this. The rejected object was synced in the next turn. The traceback is already known: Bug #33263

> * Re-sync one object
> Previously it was possible to sync the whole object with the change of the
> description. Now it should be added a script which removes the object from
> the cache.

Done
Comment 19 Stefan Gohmann univentionstaff 2014-07-16 15:27:10 CEST
As discussed we need one more thing. After the creation of an object in S4, the object should be searched in S4 and dumped to the cache.
Comment 20 Stefan Gohmann univentionstaff 2014-07-16 16:49:25 CEST
(In reply to Stefan Gohmann from comment #19)
> As discussed we need one more thing. After the creation of an object in S4,
> the object should be searched in S4 and dumped to the cache.

Fixed with r51885
Comment 21 Stefan Gohmann univentionstaff 2014-07-17 08:52:30 CEST
(In reply to Stefan Gohmann from comment #20)
> (In reply to Stefan Gohmann from comment #19)
> > As discussed we need one more thing. After the creation of an object in S4,
> > the object should be searched in S4 and dumped to the cache.
> 
> Fixed with r51885

I've reverted the commit because the jenkins tests failed. I've created a new bug for this: Bug #35391
Comment 22 Arvid Requate univentionstaff 2014-07-21 19:04:46 CEST
Created attachment 6011 [details]
s4cache_EntryDiff.patch

The attribute list needs to be sorted for comparison.
Test case: run s4cache.py twice.
Comment 23 Arvid Requate univentionstaff 2014-07-21 19:28:21 CEST
Created attachment 6012 [details]
objectGUID.patch

objectGUID is not unicode but a binary NDR format which should be converted to a string via ndr_unpack.

I have no clue why the S4/AD Connector wraps all attributes to unicode (__object_from_element) when reading from S4, seems like legacy code, but we should not change that.

I see that the "S4 GUID" table is written in base64, but using this patch we can avoid that for the s4cache database: improves readability and simplifies the s4cache code.
Comment 24 Arvid Requate univentionstaff 2014-07-21 19:54:27 CEST
Created attachment 6013 [details]
sync_to_ucs_call.patch

Looks like one of the sync_to_ucs calls still passes only three arguments.
Comment 25 Stefan Gohmann univentionstaff 2014-07-22 07:01:11 CEST
(In reply to Arvid Requate from comment #24)
> Created attachment 6013 [details]
> sync_to_ucs_call.patch
> 
> Looks like one of the sync_to_ucs calls still passes only three arguments.

Yes, applied: r52020

(In reply to Arvid Requate from comment #22)
> Created attachment 6011 [details]
> s4cache_EntryDiff.patch
> 
> The attribute list needs to be sorted for comparison.
> Test case: run s4cache.py twice.

I use now a set() which should be faster.

(In reply to Arvid Requate from comment #23)
> Created attachment 6012 [details]
> objectGUID.patch
> 
> objectGUID is not unicode but a binary NDR format which should be converted
> to a string via ndr_unpack.
> 
> I have no clue why the S4/AD Connector wraps all attributes to unicode
> (__object_from_element) when reading from S4, seems like legacy code, but we
> should not change that.
> 
> I see that the "S4 GUID" table is written in base64, but using this patch we
> can avoid that for the s4cache database: improves readability and simplifies
> the s4cache code.

OK, applied with some changes: r52022
Comment 26 Arvid Requate univentionstaff 2014-07-23 17:20:56 CEST
Ok, verified:
* Code review
* ucs-test
* Jenkins seems to be ok too.
* Advisory: Ok

We still need to decide upon the release date. If it's released before the ad-member enhancements for shell-univention-lib then we need to revert the univention-s4-connector change made for version 8.0.33-63 and possibly later.
Comment 27 Arvid Requate univentionstaff 2014-07-24 13:43:35 CEST
* change made for version 8.0.33-63 has been reverted now for release.
* Built version updated in advisory.



Just for future reference: There is a new sqlite database and a new table now:

root@master40:~# sqlite3 /etc/univention/connector/s4cache.sqlite "select * from GUIDS;"
1|d1038bd5-b248-4bb5-b75d-8158ca731ac9

root@master40:~# sqlite3 /etc/univention/connector/s4internal.sqlite "select * from 'UCS Deleted';"
3cbb6838-0be2-1033-9a35-0d1639c0bdaf|e003cdd0-8023-4709-95d4-36be33390736
Comment 28 Janek Walkenhorst univentionstaff 2014-07-24 15:18:02 CEST
http://errata.univention.de/ucs/3.2/153.html