Bug 41571 - windows computer not in exam group security group
windows computer not in exam group security group
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: Samba 4
UCS@school 4.1 R2
Other Linux
: P5 normal (vote)
: UCS@school 4.1 R2 v10
Assigned To: Arvid Requate
Stefan Gohmann
:
Depends on:
Blocks: 43628 44975 45087 45088
  Show dependency treegraph
 
Reported: 2016-06-14 17:44 CEST by Felix Botner
Modified: 2017-07-27 19:45 CEST (History)
4 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.257
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments
reject_if_s4_changed.diff (6.54 KB, patch)
2016-12-05 20:54 CET, Arvid Requate
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Felix Botner univentionstaff 2016-06-14 17:44:42 CEST
After starting an exam the windows clients have to be restarted.

But sometimes, despite the restart, the windows client is NOT in the exam group (e.g. OUschool1-Klassenarbeit) in the windows security group (see output of gpresult /R on the windows client after the restart). 
Therefore the computer settings of the exam GPO (see http://docs.software-univention.de/ucsschool-handbuch-4.1.html#school:exam) are not applied.

After an additional restart, gpresult lists the exam group in the security groups in gpresult and the GPO is applied.
Comment 1 Stefan Gohmann univentionstaff 2016-08-25 08:38:36 CEST
The UCS 4.1-3 GPO tests now fail:

http://jenkins.knut.univention.de:8080/job/UCS-4.1/job/UCS-4.1-3/job/AutotestJoin/29/SambaVersion=s4,Systemrolle=slave/testReport/51_samba4/56evaluate_windows_gpo/test/

---------------------------------------------------------------------------
*** BEGIN *** ['/usr/bin/python', '56evaluate_windows_gpo'] ***
*** 51_samba4/56evaluate_windows_gpo *** Test if GPOs created on a native Windows Server work with S4 ***
*** START TIME: 2016-08-24 21:20:09 ***

Looking for 'IP-0AD27336' host ip address:
Host IP is: 10.210.115.54
Trying to check Windows host '10.210.115.54' domain

Creating a 'ucs_test_gpo_user_c2zs' user for the test.
User 'ucs_test_gpo_user_c2zs' created successfully

Creating GPO for the test with a name: test_user_gpo_lb4a
USING partition_dn: DC=autotest095,DC=local
Replicate from master095 to slave095 was successful.
USING partition_dn: DC=autotest095,DC=local
Replicate from slave095 to master095 was successful.

Checking that GPO 'test_user_gpo_lb4a' exists.

Set-GPPermissions on 'test_user_gpo_lb4a' for 'ucs_test_gpo_user_c2zs' 'User' to 'GpoApply'

Removing GPOs created for the test: test_user_gpo_lb4a

Removing GPOs created for the test: test_machine_gpo_phao
An Error occured while removing GPO remotely: WinExeFailed('command \'winexe --interactive=0 -U autotest095.local\\Administrator%Univention@99 --runas autotest095.local\\Administrator%Univention@99 //10.210.115.54 PowerShell.exe -inputformat none -Noninteractive -ExecutionPolicy Bypass -File c:\\powershell-remove-gpo.ps1 autotest095.local test_machine_gpo_phao slave095 \' failed with: [08/25/2016 01:20:33] Das Gruppenrichtlinienobjekt "test_machine_gpo_phao" wurde in der Dom\x84ne "autotest095.local" nicht gefunden.\r\nParametername: gpoDisplayName\n None',)

Removing 'ucs_test_gpo_user_c2zs' user:
Traceback (most recent call last):
  File "56evaluate_windows_gpo", line 495, in <module>
    apply_gpo(test_user_gpo, test_username, 'User')
  File "56evaluate_windows_gpo", line 155, in apply_gpo
    ret_code, stdout, stderr = Win.Set_GPPermissions(gpo_name,
AttributeError: WinExe instance has no attribute 'Set_GPPermissions'
Deleted user ucs_test_gpo_user_c2zs
*** END TIME: 2016-08-24 21:20:34 ***
*** TEST DURATION (H:MM:SS.ms): 0:00:24.495216 ***
*** END *** 1 ***
---------------------------------------------------------------------------
Comment 2 Arvid Requate univentionstaff 2016-08-25 11:11:01 CEST
Yes, I adjusted ucs-windows-tools but didn't rebuild it.
Comment 3 Arvid Requate univentionstaff 2016-08-30 19:44:26 CEST
Now there is a test case 90_ucsschool/98_samba4_evaluate_windows_gpo which adds a joined Windows-Client and a test user into a test class, creates both a user GPO and a machine GPO, links both to the school OU (or the ldap base), starts an exam, reboots the Windows client and check the output of gpresult.
Let's see what Jenkins reports.
Comment 4 Arvid Requate univentionstaff 2016-08-31 22:36:39 CEST
The test case actually sometimes fails to configure the GPO security filter for "OUschool1-Klassenarbeit" (via Powershell cmdlet Set-GPPermissions). In that case the test now outputs the GPOresult XML as debug info, which shows the SDDL. When I freeze the script at that point in case of failure, I can see, that the sysvol ACLs are ok (OUschool1-Klassenarbeit present), but the ntsecuritydescriptor in Samba/AD doesn't show the correct permissions.

What's even more weird: The test creates two GPOs, one user GPO and one machine GPO and I'm pretty sure that I saw one run where it worked for the first and failed for the second GPO.

Note that this error happens *before* even starting the exam (and putting any user or computer into that group). It's just: create a GPO, attempt to set a security filter via powershell and check if this has been saved to Samba/AD.

We have seen a similar behaviour in one customer environment with several DCs where the the replication was distorted (and the account that was used for security filtering was not present on all DCs). Currently I cannot imagine a reason how this should apply to UCS@school. After all, the windows client "should" only talk to the UCS@school Slave PDC.

Also, I've made the test to run Set-GPPermissions once more in case of failure, and: In all my manual tests it worked on the second attempt! Weird. If the second attempt works the test simply says so but aborts anyway to indicate the error. To make debugging this more efficient I'd suggest to replace the exam.start() to exam.finish() part simply by "pass", because currently that's not the primary problem.

I guess the next step is to obtain a network trace from the perspective of the Windows client and run the test until it fails.
Comment 5 Stefan Gohmann univentionstaff 2016-09-01 06:02:39 CEST
Is it possible that the S4 Connector is the reason? In UCS@school the ntsecuritydescriptor of the GPO is synchronized by the S4 Connector.
Comment 6 Arvid Requate univentionstaff 2016-09-01 15:48:53 CEST
Good point, looks like I can confirm that.

I adjusted the standard 51_samba4/56evaluate_windows_gpo ucs-test case to also work on an UCS@school Samba/AD PDC Slave. That test works without setting up any exam and we can use it to compare the behaviour of UCS@school vs plain UCS.

When I set

ucr set connector/s4/mapping/gpo/ntsd=no; service univention-s4-connector restart

the 51_samba4/56evaluate_windows_gpo test case runs in a loop without any issues:

i=0; while ./56evaluate_windows_gpo -f; do echo "$i"; i=$((i+1)); done


When I re-enable ntsd-sync, it becomes flaky.
Comment 7 Stefan Gohmann univentionstaff 2016-09-01 16:23:41 CEST
Is the same attribute changed in a short time? Bug #35336.
Comment 8 Arvid Requate univentionstaff 2016-09-01 23:01:58 CEST
Yes I think so, I've implemented a wait in the test case, similar to 
52_s4connector/100sync_gpo_ntsecurity_descriptor , this seems to stabilize the test.

I've also implemented the check if the machine GPO has actually been applied to the Windows client after reboot (during exam), which checks the security filtering.

The same test for the user GPO part doesn't work yet, because I couldn't get winexe to authenticate with a non-Admin domain user. Anyway, the crucial part of the UCS@school Samba/AD Slave PDC Windows client GPO test for School exams is now implemented.
Comment 9 Arvid Requate univentionstaff 2016-09-01 23:04:03 CEST
I'm hesitating to set this to "worksforme" because this problem of Bug 35336 is not a purely theoretical issue in this case. If a Domain Admin uses the Microsoft Group Policy Management Console to adjust the security filtering for a GPO, she will typically run into this problem, because the GUI applies each change separately:

1. remove the default application permission for "Authenticated Users" (click)
2. add application permission some other group (click)

This can actually trigger this timing issue in real life.
Comment 10 Stefan Gohmann univentionstaff 2016-09-02 06:19:27 CEST
(In reply to Arvid Requate from comment #9)
> I'm hesitating to set this to "worksforme" because this problem of Bug 35336
> is not a purely theoretical issue in this case. If a Domain Admin uses the
> Microsoft Group Policy Management Console to adjust the security filtering
> for a GPO, she will typically run into this problem, because the GUI applies
> each change separately:
> 
> 1. remove the default application permission for "Authenticated Users"
> (click)
> 2. add application permission some other group (click)
> 
> This can actually trigger this timing issue in real life.

Then, I think we should fix it in the Connector. Maybe it is a good idea to start exactly with this attribute.
Comment 11 Florian Best univentionstaff 2016-10-26 11:15:02 CEST
http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Multiserver/282/SambaVersion=s4/testReport/90_ucsschool/98_samba4_evaluate_windows_gpo/test/

Exception during Set-GPPermissions: WinExeFailed('command \'winexe --interactive=0 -U autotest203.local\\Administrator%univention --runas autotest203.local\\Administrator%univention //10.210.192.15 PowerShell.exe -inputformat none -Noninteractive -ExecutionPolicy Bypass -File c:\\univention-Set-GPPermissions.ps1 autotest203.local "test_user_gpo_mlbx" GpoRead ""Authenticated Users"" Group True \' failed with: [10/26/2016 02:21:03] Ausnahme beim Aufrufen von "ToBoolean" mit 1 Argument(en):  "Die Zeichenfolge wurde nicht als g\x81ltiger Boolean erkannt."\n None',)
Comment 12 Arvid Requate univentionstaff 2016-12-05 20:54:47 CET
Created attachment 8288 [details]
reject_if_s4_changed.diff

This approach would make sync_from_ucs check first if the uSNChanged attribute is untouched in Samba/AD since it has been cached in the s4cache during the last sync_to_ucs run for that object. If the uSNChanged has a new value, the sync_from_ucs temporarily rejects the change.

But there are several issues with this approach:

1. This must not happen during initial sync. The patch above takes care of that.

2. possible deadlock if the sync_to_ucs has an unrelated reject. To avoid this one could restrict application of the recipe above to "S4 reject"-free situations. That would add another special case.

3. The temporary reject may be "overtaken" (in the sense of cars) by a different USN change, usually the rebound of the second sync_to_ucs (the second modification of the nTSecurityDescriptor).

From this assessment I'm currently leaning towards a more specific solution that only applies to changes of the nTSecurityDescriptor, and not to the general type of problem. My impression is that we need a more fundamental approach for the general problem.
Comment 13 Arvid Requate univentionstaff 2016-12-05 21:52:04 CET
Now I've simply adjusted ntsd_to_s4(), which is used as post_con_*_function, to not overwrite the nTSecurityDescriptor in case the uSNChanged in Samba/AD differs from the value in the s4cache. By default this change is limited to UCS@school, because connector/s4/mapping/gpo/ntsd is not active in standard UCS.

Let's wait for the results of the CI tests.

Advisory: univention-s4-connector.yaml
Comment 14 Florian Best univentionstaff 2016-12-06 09:33:14 CET
(In reply to Arvid Requate from comment #12)
> Created attachment 8288 [details]
> reject_if_s4_changed.diff
+		try:
+			udm_function_level = int(self.baseConfig.get('%s/debug/udm/function' % self.CONFIGBASENAME, 0))
+		except ValueError:
+			function_level = 0

The variable name in the except block is wrong.
Comment 15 Arvid Requate univentionstaff 2016-12-06 17:33:13 CET
> reject_if_s4_changed.diff

As I said, this was the bad approach. Ignore it, that attachment was just the artefact of a failed experiment.
Comment 16 Arvid Requate univentionstaff 2016-12-08 21:33:17 CET
The Jenkins results don't show regressions AFAICS.

Regarding Comment 11: That was a transient effect because I messed up the test case.

I've created Bug #43142 because the 98_samba4_evaluate_windows_gpo is still flaky on UCS@school Slave PDCs, but that seems to be unrelated.


Advisory: univention-s4-connector.yaml
Comment 17 Stefan Gohmann univentionstaff 2016-12-20 20:12:45 CET
Excellent. 

Code review: OK

Tests: OK, My test script still fails but the manual tests look much better.

YAML: OK

UCS 4.2 merge: OK
Comment 18 Philipp Hahn univentionstaff 2016-12-21 15:32:54 CET
<http://errata.software-univention.de/ucs/4.1/365.html>