Bug 40186 - Improve reliability of sysvol-sync
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 4.1
Hardware: Other Linux
Importance: P3 normal
Target Milestone: UCS 4.1-0-errata
Assigned To: Arvid Requate
QA Contact: Felix Botner
Depends on:
Blocks: 40346 42097
Reported: 2015-12-07 19:10 CET by Arvid Requate
Modified: 2016-08-23 14:47 CEST
Bug group (optional): Troubleshooting


Description Arvid Requate univentionstaff 2015-12-07 19:10:24 CET
There has been at least one report of broken fACLs in sysvol. The primary suspect is sysvol-sync: concurrent reads from and writes to /var/lib/samba/sysvol might be causing the problem. There are at least three ideas on how to improve this:

a) Lock the sysvol while operating on it (e.g. man flock)
b) First check with "rsync -au --dry-run" if copying is required at all
c) Generate a consistent sysvol copy for the reading rsync processes
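
Idea (b) could be sketched roughly as follows. This is only an illustration, not the code in sysvol-sync: the helper name sync_needed, the temporary directories and the file name are made up for the example, and --itemize-changes is added on top of the "-au --dry-run" mentioned above so that "nothing to do" shows up as empty output.

```shell
#!/bin/sh
# Sketch only: sync_needed is a hypothetical helper, not part of sysvol-sync.
if ! command -v rsync >/dev/null 2>&1; then
    echo "rsync is not installed, skipping the demo" >&2
    exit 0
fi

# Returns 0 if rsync would change something, 1 if not, 2 on rsync errors.
sync_needed() {
    src=$1
    dst=$2
    # --dry-run changes nothing; --itemize-changes prints one line per item
    # that would be updated, so empty output means the copy can be skipped.
    changes=$(rsync -au --dry-run --itemize-changes "$src/" "$dst/") || return 2
    [ -n "$changes" ]
}

# Demo with throwaway directories (assumption: these stand in for the
# remote and the local sysvol).
SRC=$(mktemp -d)
DST=$(mktemp -d)
echo "policy" > "$SRC/gpt.ini"

if sync_needed "$SRC" "$DST"; then FIRST=needed; else FIRST=clean; fi
rsync -a "$SRC/" "$DST/"          # the real copy
if sync_needed "$SRC" "$DST"; then SECOND=needed; else SECOND=clean; fi

echo "before copy: $FIRST, after copy: $SECOND"
rm -rf "$SRC" "$DST"
```

The point of the check is that the write lock and the real copy are only needed when the dry run reports changes, which should shrink the window for concurrent access.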
Comment 1 Arvid Requate univentionstaff 2015-12-07 19:16:13 CET
Regarding locking: We already create a local exclusive (write) lock in the sysvol-sync script. Using this lock file to coordinate locking remotely could possibly be done this way:


chgrp 'DC Slave Hosts' /var/lock/sysvol-sync
chmod g+w /var/lock/sysvol-sync

## create local write lock (This step is already done in the current script)
flock -n 9 || exit 0

## add a trap to release the shared (read) lock created in the next step below
trap "ssh -S '~/.ssh/control-%r@%h:%p' -O exit '$hostname\$@$remote_hostname'" EXIT

## try to create remote shared (read) lock, background multiplex ssh and wait
{
univention-ssh --no-split /etc/machine.secret \
    -M -S '~/.ssh/control-%r@%h:%p' \
    "$hostname\$@$remote_hostname" \
    "sh -c '(flock -s -n 8 || exit 1; echo GO; read WAIT;) 8>\"$LOCKFILE\"'" &
} | read GO

## rsync if multiplex master is established
if ssh -S '~/.ssh/control-%r@%h:%p' -O check "$hostname\$@$remote_hostname"; then
    univention-ssh-rsync /etc/machine.secret -aAX --delete \
      -e 'ssh -S "~/.ssh/control-%r@%h:%p"' \
      "$hostname\$@$remote_hostname:/var/lib/samba/sysvol" "$importdir"
fi

## release local write lock

I'm just unsure about concurrency behaviour with this kind of locking. Maybe when attempting to acquire the read lock we should block until we get it.
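
To illustrate the difference between the two behaviours, here is a small local sketch. It is not the remote setup from above: the two helper functions and the temporary lock file are made up for the example, and a local "flock -x ... sleep 2" stands in for a DC that is currently writing to its sysvol.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, modelled on the remote command
# "(flock -s -n 8 || exit 1; echo GO; ...) 8>$LOCKFILE" shown above.
LOCKFILE=$(mktemp)

# Non-blocking variant: give up immediately if a writer holds the lock.
read_lock_nonblocking() {
    ( flock -s -n 8 || exit 1; echo GO ) 8>"$LOCKFILE"
}

# Blocking variant: wait until the writer has released the lock.
read_lock_blocking() {
    ( flock -s 8 || exit 1; echo GO ) 8>"$LOCKFILE"
}

# Simulate a writer that holds the exclusive lock for two seconds.
flock -x "$LOCKFILE" sleep 2 >/dev/null 2>&1 &
WRITER=$!
sleep 1   # give the writer time to take the lock

NONBLOCKING=$(read_lock_nonblocking || echo BUSY)
BLOCKING=$(read_lock_blocking || echo BUSY)
wait "$WRITER"

echo "non-blocking: $NONBLOCKING, blocking: $BLOCKING"
rm -f "$LOCKFILE"
```

With -n the reader has to skip this run and retry on the next one; without -n it simply waits for the writer. The blocking variant has no timeout here, so a writer that never finishes would stall the reader as well.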
Comment 2 Arvid Requate univentionstaff 2015-12-16 19:30:35 CET
Unfortunately ssh multiplexing currently doesn't work with the univention-ssh wrapper, so the code above needed a bit of modification.

The sysvol-sync script has been adjusted to implement the first two ideas:

> a) Lock the sysvol while operating on it (e.g. man flock)
> b) First check with "rsync -au --dry-run" if copying is required at all

Advisory: univention-samba4.yaml
Comment 3 Felix Botner univentionstaff 2015-12-21 14:33:26 CET
OK - check if there are changes before the sync
OK - exclusive lock while writing into local sysvol
OK - remote read lock while reading remote sysvol
OK - remote lock gets removed on destination if source becomes unavailable
OK - sshd/config/ClientAliveInterval (60s, sshd reload)

OK - univention-samba4.yaml
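
A note on the lock removal checked above: flock locks belong to the open file descriptor, so they disappear as soon as the process holding them is gone. Presumably this is why ClientAliveInterval matters here, since sshd can only end the remote shell (and thereby release the lock) once it notices that the client is no longer responding. The following local sketch shows just the flock part; the holder process is a stand-in for the remote shell and "kill" stands in for sshd closing the session.

```shell
#!/bin/sh
# Sketch only: a local stand-in for the remote shell that holds the read lock.
LOCKFILE=$(mktemp)

# Holder: open the lock file on fd 8, take a shared lock, then idle.
# "exec sleep" keeps the same PID, so killing $HOLDER closes fd 8.
sh -c 'exec 8>"$0"; flock -s 8; exec sleep 5' "$LOCKFILE" >/dev/null 2>&1 &
HOLDER=$!
sleep 1   # give the holder time to take the lock

# While the holder is alive, a writer cannot get the exclusive lock.
if flock -x -n "$LOCKFILE" true; then WHILE_HELD=free; else WHILE_HELD=busy; fi

# Simulate the session being torn down.
kill "$HOLDER"
wait "$HOLDER" 2>/dev/null || true

# The lock went away together with the file descriptor.
if flock -x -n "$LOCKFILE" true; then AFTER_KILL=free; else AFTER_KILL=busy; fi

echo "while held: $WHILE_HELD, after kill: $AFTER_KILL"
rm -f "$LOCKFILE"
```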
Comment 4 Arvid Requate univentionstaff 2015-12-22 16:04:46 CET