Bug 38217 - More robust sysvolreset
More robust sysvolreset
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
UCS 4.2
Other Linux
: P5 enhancement (vote)
: UCS 4.2-2-errata
Assigned To: Lukas Oyen
Arvid Requate
:
: 39123 (view as bug list)
Depends on:
Blocks: 39123 44876
  Show dependency treegraph
 
Reported: 2015-04-08 14:24 CEST by Janis Meybohm
Modified: 2017-09-20 15:03 CEST (History)
4 users (show)

See Also:
What kind of report is it?: Bug Report
What type of bug is this?: 5: Major Usability: Impairs usability in key scenarios
Who will be affected by this bug?: 3: Will affect average number of installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
User Pain: 0.257
Enterprise Customer affected?: Yes
School Customer affected?: Yes
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number: 2016121321000581, 2016012921000411, 2017072421000327
Bug group (optional):
Max CVSS v3 score:
requate: Patch_Available+


Attachments
try_except.patch (1.86 KB, patch)
2015-10-26 13:18 CET, Arvid Requate
Details | Diff
Patch for /usr/share/pyshared/samba/provision/__init__.py (1.18 KB, patch)
2016-12-07 15:19 CET, Julius Hinrichs
Details | Diff
38217-robust-sysvolreset-461.patch (17.06 KB, patch)
2017-09-14 16:17 CEST, Lukas Oyen
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Janis Meybohm univentionstaff 2015-04-08 14:24:06 CEST
samba-tool ntacl sysvolreset is very error prone.
It tries to set acls on symlinks (traceback), fails to set ACLs for non existent GPO files/directories (traceback), ...

In every case the error is very cryptic, it always some kind of "ERROR(runtime): uncaught exception - (-1073741823, 'Undetermined error')" or an file not found error.

We should make sure that at least the (expected) file is printed to stderr.
Maybe we should also "try: except:" the "smbd.set_nt_acl(...)" calls completely so the process is not always interrupted and things that can be fixed get fixed.
Comment 1 Arvid Requate univentionstaff 2015-04-08 17:22:30 CEST
Ok, we should check if we can improve this and submit an upstream patch.
Comment 2 Arvid Requate univentionstaff 2015-10-26 13:18:32 CET
Created attachment 7225 [details]
try_except.patch

First basic patch
Comment 3 Stefan Gohmann univentionstaff 2015-11-17 15:24:54 CET
It should be first fixed for UCS 4.1. Afterwards we should consider a UCS 4.0 backport.
Comment 4 Stefan Gohmann univentionstaff 2016-02-05 06:50:51 CET
I re-joined some DCs and the sysvolreset took a long time. strace showed that every file in the PolicyDefinitions was checked.

# find /var/lib/samba/sysvol -type f | wc -l
7157
# find /var/lib/samba/sysvol/<domain>/Policies/PolicyDefinitions/ -type f | wc -l
6518

Ticket #2016012921000411
Comment 5 Julius Hinrichs univentionstaff 2016-12-07 15:19:56 CET
Created attachment 8292 [details]
Patch for /usr/share/pyshared/samba/provision/__init__.py

In this patch, ACLs are only set once for the Policies directory.
Comment 6 Arvid Requate univentionstaff 2017-04-24 13:54:39 CEST
*** Bug 39123 has been marked as a duplicate of this bug. ***
Comment 7 Felix Botner univentionstaff 2017-05-04 18:00:18 CEST
recheck priority
Comment 8 Florian Best univentionstaff 2017-06-28 14:53:09 CEST
There is a Customer ID set so I set the flag "Enterprise Customer affected".
Comment 9 Lukas Oyen univentionstaff 2017-09-14 16:17:38 CEST
Created attachment 9206 [details]
38217-robust-sysvolreset-461.patch

The attached patch (against 4.6.1) implements the new switch `--resume-on-error` for `samba-tool ntacl sysvolreset`. With this certain NTSTATUSErrors are ignored and logged as warnings. The first is a non existent symlink, the second a deleted policy folder, both throw the same error internally:

root@ucs-master40:~# samba-tool ntacl sysvolreset --resume-on-error
Unable to set ACL O:LAG:BAD:P(A;OICI;0x001f01ff;;;BA)(A;OICI;0x001200a9;;;SO)(A;OICI;0x001f01ff;;;SY)(A;OICI;0x001200a9;;;AU) on /var/lib/samba/sysvol/blabbel
Unable to set ACL O:DAG:DAD:P(A;OICI;0x001f01ff;;;DA)(A;OICI;0x001f01ff;;;EA)(A;OICIIO;0x001f01ff;;;CO)(A;OICI;0x001f01ff;;;DA)(A;OICI;0x001f01ff;;;SY)(A;OICI;0x001200a9;;;AU)(A;OICI;0x001200a9;;;ED) on /var/lib/samba/sysvol/loyen.intranet/Policies/{31B2F340-016D-11D2-945F-00C04FB984F9}

This is a different approach to Arvids patch, as the error-handling is performed one level up. `setntacl()` is a library function and not suited to print/log errors.

This does not include Julius patch and does not handle the remark in comment 04. It is `sysvolreset`s job to (re)set the ACLs on every file in the SYSVOL.

The attached patch does not change the behavior on provisioning. Non-existent symlinks and other errors will abort the procedure. A discussion could be had if instead of an opt-in `--resume-on-error`, logging should be the default and a flag like `--fail-on-first-error` enables the current behavior.

With samba 4.7.X, a new package `samba.ntstatus` will be shipped. With this the constant `NT_STATUS_OBJECT_NAME_NOT_FOUND = 0xc0000034` could be replaced.
Comment 10 Arvid Requate univentionstaff 2017-09-14 20:33:49 CEST
Ok, I think "--resume-on-error" should be default (maybe even without any alternative), because I can think of no situation where aborting would help or continuing would be disastrous. I mean, the error messages are clearly visible, and the tool may return an exit status != 0 to indicate a problem, but it should at least to the job it was called for, as good as it can.
Comment 11 Lukas Oyen univentionstaff 2017-09-18 15:42:07 CEST
Ok, `--resume-on-error` removed as a command line flag, but internally the behaviour is changed as if it was passed.

Committed as a samba patch in r17673/4, YAML 8e0b713a.
Comment 12 Arvid Requate univentionstaff 2017-09-18 20:13:33 CEST
Ok, works much better now.
Comment 13 Arvid Requate univentionstaff 2017-09-18 20:25:07 CEST
Actually, now it's better than sysvolcheck itself:

==========================================================================
root@master10:~# mv /var/lib/samba/sysvol/ar41i1.qa/Policies/\{108A861F-3CB8-4DD1-A6D1-23642C0CF23F\}  \
  /var/tmp/

root@master10:~# samba-tool ntacl sysvolcheck 
(2, 'No such file or directory')
get_nt_acl_conn: get_nt_acl returned NT_STATUS_OBJECT_NAME_NOT_FOUND.
(-1073741772, 'The object name is not found.')

root@master10:~# samba-tool ntacl sysvolreset
set_nt_acl_conn: open: error=2 (No such file or directory)
Unable to set ACL O:DAG:DAD:P(A;OICI;0x001f01ff;;;DA)(A;OICI;0x001f01ff;;;EA)(A;OICIIO;0x001f01ff;;;CO)(A;OICI;0x001f01ff;;;DA)(A;OICI;0x001f01ff;;;SY)(A;OICI;0x001200a9;;;AU)(A;OICI;0x001200a9;;;ED) on /var/lib/samba/sysvol/ar41i1.qa/Policies/{108A861F-3CB8-4DD1-A6D1-23642C0CF23F}
==========================================================================
Comment 14 Erik Damrose univentionstaff 2017-09-20 15:03:41 CEST
<http://errata.software-univention.de/ucs/4.2/165.html>