Bug 44278 - w2k3 de - not all GPO links have been taken over
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: AD Takeover
Version: UCS 4.2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 4.4-1-errata
Assigned To: Felix Botner
QA Contact: Arvid Requate
Depends on: 50022
Blocks:
Reported: 2017-04-04 10:34 CEST by Felix Botner
Modified: 2019-09-04 15:48 CEST (History)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 1: Will affect a very few installed domains
How will those affected feel about the bug?: 4: A User would return the product
User Pain: 0.114
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments
ad-takeover.log (49.09 KB, text/x-log), 2017-04-04 10:35 CEST, Felix Botner
connector-s4.log.gPLink.FAIL.txt (7.75 MB, text/plain), 2019-07-05 14:28 CEST, Felix Botner
connector-s4.log.gPLink.OK.txt (6.34 MB, text/plain), 2019-07-05 14:30 CEST, Felix Botner
Patch setting syncmode to read when starting the S4 connector (2.30 KB, patch), 2019-08-07 14:14 CEST, Fathan Vidjaja

Description Felix Botner univentionstaff 2017-04-04 10:34:25 CEST
UCS 4.2 AD Takeover of a German W2K3 domain

# w2k3 GPO Links

gPLink: [LDAP://cn={DD5FC622-A1A2-4263-80FA-42D9C32EFC84},cn=policies,cn=syste
 m,DC=w2k3,DC=test;0][LDAP://cn={5CF6DFCA-E740-439A-80B6-B671719BA89F},cn=poli
 cies,cn=system,DC=w2k3,DC=test;0][LDAP://CN={31B2F340-016D-11D2-945F-00C04FB9
 84F9},CN=Policies,CN=System,DC=w2k3,DC=test;0]

# UCS GPO Links

gPLink: [LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=Syste
 m,DC=w2k3,DC=test;0]

# but the GPOs exist in UCS
-> ldbsearch -H /var/lib/samba/private/sam.ldb cn='{DD5FC622-A1A2-4263-80FA-42D9C32EFC84}' dn
dn: CN={DD5FC622-A1A2-4263-80FA-42D9C32EFC84},CN=Policies,CN=System,DC=w2k3,DC=test

-> ldbsearch -H /var/lib/samba/private/sam.ldb cn='{5CF6DFCA-E740-439A-80B6-B671719BA89F}' dn
dn: CN={5CF6DFCA-E740-439A-80B6-B671719BA89F},CN=Policies,CN=System,DC=w2k3,DC=test
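
The difference between the two gPLink values above can also be checked mechanically. The following is a minimal Python sketch, not part of the takeover code: it unfolds the LDIF line wrapping, splits a gPLink value into its [LDAP://<DN>;<options>] entries and lists the GPO DNs that are linked on the source side but missing on the target side. The function names (unfold, parse_gplink, missing_links) are made up for illustration, and comparing DNs case-insensitively is an assumption based on the mixed cn=/CN= spelling in the output above.

```python
import re

# One gPLink entry has the form [LDAP://<GPO DN>;<link options>]
GPLINK_ENTRY = re.compile(r"\[LDAP://([^;\]]+);(\d+)\]", re.IGNORECASE)


def unfold(ldif_value):
    """Join LDIF continuation lines (a newline followed by one space)."""
    return ldif_value.replace("\n ", "")


def parse_gplink(value):
    """Return {lower-cased GPO DN: link options} for one gPLink value."""
    return {dn.lower(): int(opts) for dn, opts in GPLINK_ENTRY.findall(unfold(value))}


def missing_links(source_value, target_value):
    """Return the sorted GPO DNs linked in the source but not in the target."""
    source = parse_gplink(source_value)
    target = parse_gplink(target_value)
    return sorted(set(source) - set(target))
```

Called with the w2k3 value as source and the UCS value as target, missing_links should report the two GPO DNs that were not taken over.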
Comment 1 Felix Botner univentionstaff 2017-04-04 10:35:06 CEST
Created attachment 8752
ad-takeover.log
Comment 2 Arvid Requate univentionstaff 2017-04-04 13:25:24 CEST
We should test again with W2K3 R2.
Comment 3 Arvid Requate univentionstaff 2019-07-04 18:04:16 CEST
We just saw this again in the UCS 4.4 AD-Takeover Jenkins tests:

http://jenkins.knut.univention.de:8080/job/UCS-4.4/job/UCS-4.4-0/view/Product%20Tests/job/product-test-samba-ad-takeover-all-tests/
Comment 4 Arvid Requate univentionstaff 2019-07-04 18:07:17 CEST
Looking at connector-s4.log, this appears to be the first thing the S4-Connector does:

04.07.2019 15:56:30.031 MAIN        (------ ): DEBUG_INIT
04.07.2019 15:56:30.833 LDAP        (PROCESS): Building internal group membership cache
04.07.2019 15:56:30.993 LDAP        (PROCESS): Internal group membership cache was created
04.07.2019 15:56:31.726 LDAP        (PROCESS): sync from ucs: [  container_dc] [       add] DC=adtakeover,DC=local

and I guess that this is what overwrites the gPLink attribute. No clue how the test could ever have passed.
Comment 5 Arvid Requate univentionstaff 2019-07-04 22:36:56 CEST
Ok, the takeover code runs "/usr/share/univention-s4-connector/msgpo.py --write2ucs" in the start_s4_connector method before actually starting the connector. That script should sync the gPLink attributes from Samba/AD to OpenLDAP.
Comment 6 Felix Botner univentionstaff 2019-07-05 14:28:40 CEST
Created attachment 10108
connector-s4.log.gPLink.FAIL.txt

Broken gPLink after takeover:

$ univention-s4search dc=adtakeover gPLink
dn: DC=adtakeover,DC=local
gPLink: [LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=Syste
 m,DC=adtakeover,DC=local;0]

It should be (and was, after the join into the AD domain):

$ univention-s4search dc=adtakeover
dn: DC=adtakeover,DC=local
gPLink: [LDAP://CN={31B2F340-016D-11D2-945F-00C04FB984F9},CN=Policies,CN=Syste
 m,DC=adtakeover,DC=local;0][LDAP://cn={D1960C4B-C8B2-44B4-87CF-8D272157DD23},
 cn=policies,cn=system,DC=adtakeover,DC=local;2][LDAP://cn={94CC7522-4978-42A3
 -A1B4-F04FD9F93445},cn=policies,cn=system,DC=adtakeover,DC=local;2]
Comment 7 Felix Botner univentionstaff 2019-07-05 14:30:06 CEST
Created attachment 10109
connector-s4.log.gPLink.OK.txt

Correct gPLink after takeover:

$ univention-s4search dc=w2k12 gPLink
dn: DC=w2k12,DC=test
gPLink: [LDAP://cn={F0F4F747-F986-4A32-8B06-13C9C9362B49},cn=policies,cn=syste
 m,DC=w2k12,DC=test;0][LDAP://cn={3B7240ED-D485-44A7-A99A-B946AF045D4C},cn=pol
 icies,cn=system,DC=w2k12,DC=test;0][LDAP://CN={31B2F340-016D-11D2-945F-00C04F
 B984F9},CN=Policies,CN=System,DC=w2k12,DC=test;0]
Comment 8 Fathan Vidjaja univentionstaff 2019-07-25 11:04:20 CEST
I think this bug is a duplicate of Bug 46443, which I could not reproduce after the release of UCS 4.3.
Comment 9 Arvid Requate univentionstaff 2019-07-30 16:42:00 CEST
See Comment 3, please try to reproduce by running those tests.
Comment 10 Fathan Vidjaja univentionstaff 2019-08-07 14:14:55 CEST
Created attachment 10153
Patch setting syncmode to read when starting the S4 connector

I was able to reproduce this bug with the automated AD Takeover product test. What I observed is that the S4 connector overwrites the GPO links taken over from AD. This removes the additional GPO links on the UCS server (only the link to the default GPO remains, as in Comment 6), which is why the GPO checks in the test fail after the AD Takeover.
I started another run locally with the attached patch, and the checks passed successfully. The patch switches the sync mode of the S4 connector to read-only when the AD Takeover starts the S4 connector module, and switches it back once the module is loaded.
I will test the patch in Jenkins and, if it fixes the problem, patch the AD Takeover module.
Comment 11 Arvid Requate univentionstaff 2019-08-12 20:30:07 CEST
Ok, but see Comment 5: we explicitly call /usr/share/univention-s4-connector/msgpo.py --write2ucs. Why doesn't that work?

Also, as discussed, if you change the takeover code to initialize the connector in "read" mode and then later switch to "sync" mode, then I guess the S4-Connector doesn't automatically look at all objects in OpenLDAP and sync them to Samba/AD. So you may have to re-initialize the S4-Connector after switching to "sync" mode, or at least reset the lastUSN value in the S4-Connector (sqlite3 /etc/univention/connector/s4internal.sqlite "select value from s4 where key='lastUSN'").
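
For reference, the lastUSN lookup and reset mentioned above could be scripted roughly as follows. This is a hedged sketch, not connector code: the table name s4 and its key/value columns are taken from the query above, while the function names, the default reset value of 0 and the assumption that value is stored as text are illustrative only. The S4-Connector should presumably be stopped before the database is modified.

```python
import sqlite3


def read_last_usn(db_path):
    """Return the stored lastUSN as an int, or None if there is no such row."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT value FROM s4 WHERE key = ?", ("lastUSN",)
        ).fetchone()
    finally:
        conn.close()
    return None if row is None else int(row[0])


def reset_last_usn(db_path, value=0):
    """Overwrite lastUSN (assumption: the value column holds text)."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE s4 SET value = ? WHERE key = ?", (str(value), "lastUSN")
            )
    finally:
        conn.close()
```

Usage would be read_last_usn("/etc/univention/connector/s4internal.sqlite") to inspect the value before deciding whether a reset is needed.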
Comment 12 Fathan Vidjaja univentionstaff 2019-08-20 14:15:34 CEST
The loss/overwriting of GPO links is caused by the listener not keeping up with the notifier. Specifically, "well-known-sid-name-mapping.py" (see Bug 50022) takes too much time. The AD Takeover waits up to 10 minutes for the listener ID to catch up with the notifier ID, but then continues even if the two IDs still differ. Increasing the timeout after syncing the GPO links will solve this. I will test this fix with ad-takeover-all-tests in Jenkins.
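
The waiting behaviour described above can be illustrated with a small polling loop. This is only a sketch under stated assumptions, not the actual takeover code: get_notifier_id and get_listener_id are hypothetical callables standing in for however the two transaction IDs are read, and the parameter names are invented. Unlike the behaviour described above, the sketch returns False on timeout so that the caller can decide whether continuing is safe.

```python
import time


def wait_for_listener(get_notifier_id, get_listener_id, timeout=600.0,
                      interval=5.0, clock=time.monotonic, sleep=time.sleep):
    """Poll until the listener ID has reached the notifier ID.

    Returns True once the listener has caught up, False if the timeout
    (in seconds) expires first. clock and sleep are injectable for testing.
    """
    deadline = clock() + timeout
    while True:
        if get_listener_id() >= get_notifier_id():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A caller could abort or log a warning when the function returns False, instead of silently starting the S4-Connector while the listener is still behind.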
Comment 13 Fathan Vidjaja univentionstaff 2019-08-21 17:20:56 CEST
ad-takeover-all-tests in Jenkins now runs without error:
http://jenkins.knut.univention.de:8080/job/UCS-4.4/job/UCS-4.4-1/view/Product%20Tests/job/product-test-samba-ad-takeover-all-tests/

commits: 
dda29ec9 Increasing threshold for timeout while waiting for listener
611cb782 Version Bump
748184c0 YAML
Comment 14 Felix Botner univentionstaff 2019-08-28 11:30:41 CEST
Yes, the test works now and the patch looks good. I modified the YAML slightly.
Comment 15 Arvid Requate univentionstaff 2019-09-02 19:09:22 CEST
Looks Ok.
Comment 16 Arvid Requate univentionstaff 2019-09-04 15:48:08 CEST
<http://errata.software-univention.de/ucs/4.4/250.html>