Bug 40186 - Improve reliability of sysvol-sync
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 4.1
Hardware: Other Linux
Importance: P3 normal
Target Milestone: UCS 4.1-0-errata
Assigned To: Arvid Requate
QA Contact: Felix Botner
Depends on:
Blocks: 40346 42097
Reported: 2015-12-07 19:10 CET by Arvid Requate
Modified: 2016-08-23 14:47 CEST

Bug group (optional): Troubleshooting


Description Arvid Requate univentionstaff 2015-12-07 19:10:24 CET
There has been at least one report of broken fACLs in sysvol. The primary suspect is sysvol-sync: concurrent reads from and writes to /var/lib/samba/sysvol might be causing the damage. There are at least three ideas on how to improve this:

a) Lock the sysvol while operating on it (e.g. man flock)
b) First check with "rsync -au --dry-run" if copying is required at all
c) Generate a consistent sysvol copy for the reading rsync processes
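
Idea b) could look roughly like the following. This is only a minimal sketch, not the code that went into sysvol-sync: the function name needs_sync and its two arguments are hypothetical, and it assumes that any line printed by rsync's --itemize-changes during a dry run means a real sync is required.

```sh
#!/bin/sh
# Hypothetical helper (not part of sysvol-sync).
# $1 = source directory, $2 = destination directory.
# Returns 0 if rsync would change something, 1 if not, 2 if rsync itself failed.
needs_sync() {
    src="$1"
    dst="$2"
    # --dry-run: touch nothing; --itemize-changes: one output line per item
    # that would be created or updated.
    # -a: archive mode; -u: skip files that are newer on the destination.
    changes=$(rsync -au --dry-run --itemize-changes "$src/" "$dst/") || return 2
    [ -n "$changes" ]
}
```

A caller would then only start the real transfer, and only take the locks, when needs_sync reports a difference.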
Comment 1 Arvid Requate univentionstaff 2015-12-07 19:16:13 CET
Regarding locking: We already create a local exclusive (write) lock in the sysvol-sync script. Using this lock file to coordinate locking remotely could possibly be done this way:

================================================
LOCKFILE="/var/lock/sysvol-sync"
SYSVOL_SYNCDIR='/var/cache/univention-samba4/sysvol-sync'
importdir="${SYSVOL_SYNCDIR}/.$remote_hostname"

chgrp 'DC Slave Hosts' /var/lock/sysvol-sync
chmod g+w /var/lock/sysvol-sync

## create local write lock (This step is already done in the current script)
(
flock -n 9 || exit 0

## add a trap to release the shared (read) lock created in the next step below
trap "ssh -S '~/.ssh/control-%r@%h:%p' -O exit '$hostname\$@$remote_hostname'" EXIT


## try to create remote shared (read) lock, background multiplex ssh and wait
{
univention-ssh --no-split /etc/machine.secret \
    -M -S '~/.ssh/control-%r@%h:%p' \
    "$hostname\$@$remote_hostname" \
    "sh -c '(flock -s -n 8 || exit 1; echo GO; read WAIT;) 8>\"$LOCKFILE\"'" &
} | read GO

## rsync if multiplex master is established
if ssh -S '~/.ssh/control-%r@%h:%p' -O check "$hostname\$@$remote_hostname";
then
    ## no password file argument here: plain rsync reuses the authenticated multiplexed connection
    rsync -aAX --delete \
      -e 'ssh -S "~/.ssh/control-%r@%h:%p"' \
      "$hostname\$@$remote_hostname:/var/lib/samba/sysvol" "$importdir"
fi

## release local write lock
) 9>"$LOCKFILE"
================================================


I'm just unsure about the concurrency behaviour of this kind of locking. Maybe, when attempting to acquire the read lock, we should block until we get it instead of giving up immediately (flock -n).
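
A blocking variant with an upper bound could be sketched as follows. This is an assumption about how it might be done, not what the script does: the helper name with_read_lock, the demo lock path under /tmp and the exit code 75 are all made up for illustration. Only flock's -s (shared) and -w (wait at most N seconds) options are taken from flock(1).

```sh
#!/bin/sh
# Illustrative lock file; the script discussed above uses /var/lock/sysvol-sync.
LOCKFILE="${LOCKFILE:-/tmp/sysvol-sync-demo.lock}"

# Hypothetical helper: run "$@" while holding a shared (read) lock.
# $1 = maximum number of seconds to wait for the lock.
with_read_lock() {
    timeout="$1"
    shift
    (
        # -s: shared lock, so several readers may hold it at once.
        # -w: block for up to $timeout seconds instead of failing at once (-n).
        # 75 is an arbitrary exit code chosen here to signal "lock not obtained".
        flock -s -w "$timeout" 8 || exit 75
        "$@"
    ) 8>"$LOCKFILE"
}
```

With this, a reader waits while a writer holds the exclusive lock, but still gives up after the timeout rather than hanging forever if the writer never finishes.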
Comment 2 Arvid Requate univentionstaff 2015-12-16 19:30:35 CET
Unfortunately ssh multiplexing currently doesn't work with the univention-ssh wrapper, so the code above needed a bit of modification.


The sysvol-sync script has been adjusted to

> a) Lock the sysvol while operating on it (e.g. man flock)
> b) First check with "rsync -au --dry-run" if copying is required at all

Advisory: univention-samba4.yaml
Comment 3 Felix Botner univentionstaff 2015-12-21 14:33:26 CET
OK - check if there are changes before the sync
OK - exclusive lock while writing into local sysvol
OK - remote read lock while reading remote sysvol
OK - remote lock gets removed on destination if source becomes unavailable
OK - sshd/config/ClientAliveInterval (60s, sshd reload)

OK - univention-samba4.yaml
Comment 4 Arvid Requate univentionstaff 2015-12-22 16:04:46 CET
<http://errata.software-univention.de/ucs/4.1/40.html>