Bug 38868 - sysvol-sync.sh need error handling
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 4.0
Hardware: Other Linux
Importance: P2 normal
Target Milestone: UCS 4.0-3-errata
Assigned To: Felix Botner
QA Contact: Arvid Requate
Duplicates: 33238
Depends on:
Blocks:
 
Reported: 2015-07-09 14:45 CEST by Janis Meybohm
Modified: 2015-09-23 17:12 CEST
CC List: 5 users

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional): Error handling
Max CVSS v3 score:


Attachments
sysvol-sync.sh (5.37 KB, application/x-shellscript)
2015-07-09 14:45 CEST, Janis Meybohm

Description Janis Meybohm univentionstaff 2015-07-09 14:45:27 CEST
Created attachment 7014: sysvol-sync.sh

Ticket#2015070221000354

While debugging sysvol-sync issues in a customer environment with more than 8 downstream DCs on slow and unsteady connections, I've seen lots of problems.
The rsync jobs fail for various reasons, for example:

rsync: change_dir "/var/lib/samba/sysvol" failed: Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1526) [Receiver=3.0.7]


2015-07-09 11:55:16: [dcxyz] rsync pull from downstream DC: dcxyz
rsync: send_files failed to open "/var/lib/samba/sysvol/xx-net.local/scripts/user/.jniklas.vbs.6jBdib": Permission denied (13)
rsync: send_files failed to open "/var/lib/samba/sysvol/xx-net.local/scripts/user/.smichelle.vbs.6EnvLj": Permission denied (13)
rsync: send_files failed to open "/var/lib/samba/sysvol/xx-net.local/scripts/user/.tjasmin.vbs.bu4XqG": Permission denied (13)
rsync: send_files failed to open "/var/lib/samba/sysvol/xx-net.local/scripts/user/.uregina.vbs.SjlU63": Permission denied (13)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1526) [generator=3.0.7]


Jul 09 03:02:49 ssh: connect to host dcxyz port 22: Connection timed out
Jul 09 03:02:49 rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
Jul 09 03:02:49 rsync error: unexplained error (code 255) at io.c(601) [Receiver=3.0.7]



Mostly for debugging purposes I've created an extended sysvol-sync.sh with the following changes/additions:

* Improve/add logging (with timestamps)
* Don't discard stderr/stdout from rsync
* Downstream DCs also sync into /var/cache/univention-samba4/sysvol-sync first
* Check error codes from rsync and don't write to SYSVOL when != 0
* Don't write to SYSVOL if a file from the remote DC does not have POSIX ACLs
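
A minimal sketch of the "sync to a cache directory first, then check the exit code" idea from the list above. The helper name pull_then_publish and the plain cp for the final step are assumptions made for illustration; the attached script may do this differently, and the ACL check is left out here. The pull command runs against the staging directory, and the live directory is only touched when that command exits with 0:

```shell
#!/bin/bash
# Illustrative only: pull_then_publish is a hypothetical helper,
# not a function from the attached sysvol-sync.sh.
#
# $1 = staging directory (cold target)
# $2 = live directory (hot target)
# remaining arguments = the pull command, e.g. an rsync call
pull_then_publish() {
    local staging="$1" live="$2" rc=0
    shift 2

    # Run the pull and keep its exit status instead of discarding it.
    "$@" || rc=$?

    if [ "$rc" -ne 0 ]; then
        echo "$(date +'%F %T') pull failed (exit code $rc), not writing to $live" >&2
        return "$rc"
    fi

    # Only a clean pull is copied from the staging to the live directory.
    cp -a "$staging"/. "$live"/
}
```

The first argument would be something like /var/cache/univention-samba4/sysvol-sync, the second /var/lib/samba/sysvol, followed by whatever rsync call performs the actual pull. A failed pull (for example code 23 or 255 as in the logs above) then leaves SYSVOL as it was.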


It seems to me that ACLs may get lost when rsync fails because of a connection timeout (maybe even because of permission denied), as permissions and ACLs are set _after_ the files have been transferred. With the attached sysvol-sync.sh, files with broken ACLs are no longer written to SYSVOL on $samba4_sysvol_sync_host and therefore don't get synced to downstream DCs.
Comment 1 Michael Grandjean univentionstaff 2015-08-07 12:39:18 CEST
This sysvol-sync.sh version would be a great improvement.
It would have made things much easier in Ticket#2015080621000363 / Issue#2809
Comment 2 Janis Meybohm univentionstaff 2015-09-18 11:20:00 CEST
I've added a "--delete" switch to the "pull from parent s4dc" univention-ssh-rsync call so that files removed from the upstream DC are removed from the downstream DC's "cold target/importdir" as well.

In the concrete case, files without ACLs had been synced to the downstream DC's cold target, which stopped the sync to the hot target (as expected). Deleting those files from the upstream DC did not re-enable the sync to the hot target on the downstream DC because the files still existed there.
The "don't delete files from downstream DC" behaviour should stay intact.

As discussed, one should probably also add a check that prevents the script from running in parallel, as that may confuse rsync.
Comment 3 Janis Meybohm univentionstaff 2015-09-18 11:23:55 CEST
To refresh your knowledge about sysvol-sync's wizardry skills see:
https://hutten.knut.univention.de/mediawiki/index.php/Samba4_Debugging#Sysvol
Comment 4 Felix Botner univentionstaff 2015-09-21 14:03:57 CEST
*** Bug 33238 has been marked as a duplicate of this bug. ***
Comment 5 Felix Botner univentionstaff 2015-09-21 14:56:37 CEST
* adopted patch (with rsync --delete for "pull from parent s4dc")
* error handling (stop/continue if rsync fails or files have no ACLs)
* more logging (log rsync/ssh errors)
* use flock to prevent script from running twice
* added samba4/sysvol/sync/debug switch to enable additional debug messages

* merge to 4.1-0

YAML: 2015-09-21-univention-samba4.yaml
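
For readers unfamiliar with the flock approach mentioned in the list above, here is a minimal sketch of a non-blocking lock guard. The lock file path and the wrapper name run_exclusively are assumptions for illustration and are not taken from the committed script:

```shell
#!/bin/bash
# Hypothetical lock file path; the real script may use a different one.
LOCKFILE="${LOCKFILE:-/tmp/sysvol-sync-example.lock}"

# Run a command only if no other holder has the lock.
run_exclusively() {
    (
        # fd 9 is opened on the lock file by the redirection below.
        # -n: fail immediately instead of waiting for the lock.
        flock -n 9 || {
            echo "another instance is already running, exiting" >&2
            exit 1
        }
        "$@"
    ) 9>"$LOCKFILE"
    # The lock is released when the subshell exits and fd 9 is closed.
}
```

A second invocation that starts while the first still holds the lock returns with status 1 instead of starting a concurrent rsync.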
Comment 6 Arvid Requate univentionstaff 2015-09-22 15:45:08 CEST
Looks good, but please fix the quoting in log(), maybe like this:

log() {
        local msg="${2//$'\r'/}"
        builtin echo $(date +"%F %T") "$1" "${msg//$'\n'/}"
}
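
To illustrate what the suggested log() does with multi-line input, here is a small usage sketch. The function body is copied from the suggestion above; the tag and the message are made-up sample values. Note that carriage returns and newlines are deleted, not replaced by spaces, so consecutive lines are joined directly:

```shell
#!/bin/bash
# log() as suggested above; needs bash for $'...' and ${var//pattern/}.
log() {
        local msg="${2//$'\r'/}"
        builtin echo $(date +"%F %T") "$1" "${msg//$'\n'/}"
}

# Sample call with a made-up two-line message containing CR/LF.
log "[dcxyz]" $'first line\r\nsecond line'
# Prints the current date and time, followed by:
#   [dcxyz] first linesecond line
```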
Comment 7 Felix Botner univentionstaff 2015-09-22 16:14:30 CEST
(In reply to Arvid Requate from comment #6)
> Looks good, but please fix the quoting in log(), maybe like this:
> 
> log() {
>         local msg="${2//$'\r'/}"
>         builtin echo $(date +"%F %T") "$1" "${msg//$'\n'/}"
> }

OK, updated errata4.0-1, 4.1-0

YAML: 2015-09-21-univention-samba4.yaml
Comment 8 Arvid Requate univentionstaff 2015-09-22 16:35:54 CEST
Ok, built and yaml updated.
Comment 9 Janek Walkenhorst univentionstaff 2015-09-23 17:12:46 CEST
<http://errata.software-univention.de/ucs/4.0/330.html>