Bug 40260 - Potential Samba/AD DRS replication deadlock
Status: RESOLVED DUPLICATE of bug 37358
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 4.1
Hardware: Other Linux
Importance: P3 normal
Target Milestone: ---
Assigned To: Samba maintainers
Reported: 2015-12-15 19:31 CET by Arvid Requate
Modified: 2015-12-16 19:09 CET (History)

Bug group (optional): Troubleshooting
Description Arvid Requate univentionstaff 2015-12-15 19:31:26 CET
Since fixing bug 29291, Samba/AD DCs always use the local KDC. In some support sessions it was observed that this may cause Samba/AD DRS replication to fail permanently after one of the Samba/AD DCs rotated its password twice

a) during the lifetime of the service tickets used by other Samba/AD DCs

or

b) while at least one of the other Samba/AD DCs was not connected (powered off or otherwise unreachable)


This is how things are supposed to work: When DC A changes its password, its Kerberos keys change too. After that, DC B still continues to connect with the Kerberos service ticket it obtained before, which contains data (the session key) encrypted by the KDC with the old Kerberos keys of DC A. To make key transitions like this work seamlessly for Kerberos clients, Kerberos uses the key version number (kvno) and retains the last set of old Kerberos keys in the keytab of the service (in this case /etc/krb5.keytab). So DC A can still identify and use the previous Kerberos keys to decrypt the service ticket, authentication succeeds and replication continues to work. Everybody is happy. The important point here is that the local server only keeps the last set of old Kerberos keys (i.e. the previous one), not an indefinite history of outdated Kerberos keys (for security reasons, obviously).
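
The key retention rule described above can be illustrated with a small toy model. This is only a sketch for reasoning about the behaviour, not how Samba or Kerberos implement it: the class `ToyKeytab` and its methods are hypothetical names, and keys are plain strings instead of real Kerberos keys.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ToyKeytab:
    """Hypothetical model of a keytab that keeps the current and the previous key only."""
    kvno: int = 0
    keys: Dict[int, str] = field(default_factory=dict)  # kvno -> key material

    def rotate(self, new_key: str) -> None:
        # A password change creates a new key version ...
        self.kvno += 1
        self.keys[self.kvno] = new_key
        # ... and everything older than the previous version is dropped.
        self.keys = {v: k for v, k in self.keys.items() if v >= self.kvno - 1}

    def can_decrypt(self, ticket_kvno: int) -> bool:
        # A service ticket can only be opened if its key version is still present.
        return ticket_kvno in self.keys


keytab_a = ToyKeytab()
keytab_a.rotate("key-1")
ticket_kvno = keytab_a.kvno      # DC B obtains a service ticket for this key version
keytab_a.rotate("key-2")         # DC A changes its password once
still_valid = keytab_a.can_decrypt(ticket_kvno)
print(still_valid)
```

In this model the old ticket stays usable after one rotation, because the previous key version is still in the keytab.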

Now, in case DC A changes its password *twice* during the lifetime of a service ticket, DC B gets an authentication error, because DC A can no longer decrypt the relevant part of the service ticket. In that case, DC B cannot replicate from DC A any longer. Fixing bug 29291 made this worse: now DC B cannot replicate any longer and never asks any KDC other than its own, so it has no chance to learn the new Kerberos keys, cannot get a valid service ticket for DC A, and can never replicate again. No other DC receives changes from DC A any longer.
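
The deadlock can be sketched as a toy simulation. All names here (`ToyDC`, `kdc_view`, `pull`) are hypothetical, and the model makes simplifying assumptions: the local KDC of the pulling DC issues tickets based on the key version it last replicated, and a successful pull is the only way to update that view.

```python
class ToyDC:
    """Hypothetical model of a Samba/AD DC with its own local KDC."""

    def __init__(self, name):
        self.name = name
        self.kvno = 1          # current key version of this DC's own account
        self.kdc_view = {}     # local KDC database: peer name -> known key version

    def change_password(self):
        self.kvno += 1

    def accepted_kvnos(self):
        # Only the current and the previous key version are kept.
        return {self.kvno, self.kvno - 1}


def pull(puller, source):
    """puller tries to replicate from source using a ticket from its local KDC."""
    ticket_kvno = puller.kdc_view[source.name]
    if ticket_kvno not in source.accepted_kvnos():
        return False           # authentication error, nothing is replicated
    # Successful replication also delivers the source's current keys.
    puller.kdc_view[source.name] = source.kvno
    return True


dc_a, dc_b = ToyDC("A"), ToyDC("B")
dc_b.kdc_view["A"] = dc_a.kvno

dc_a.change_password()
after_one_change = pull(dc_b, dc_a)      # previous key still accepted

dc_a.change_password()
dc_a.change_password()                   # two changes without replication in between
retries = [pull(dc_b, dc_a) for _ in range(3)]
print(after_one_change, retries)
```

Under these assumptions every retry fails in the same way, because the only source of fresh keys is the replication that can no longer authenticate.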

Before fixing bug 29291, DC B at least had a statistical chance to contact a different KDC found in the DNS SRV record, possibly an up-to-date one, which would issue a fresh service ticket based on the current keys, so that replication from DC A could recover.
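
To put a rough number on that statistical chance, here is a back-of-the-envelope sketch. It assumes that each attempt picks one KDC uniformly at random from the DNS SRV record and that attempts are independent; the real selection logic may differ, and the function name `chance_of_recovery` is hypothetical.

```python
def chance_of_recovery(up_to_date: int, total: int, attempts: int) -> float:
    """Probability of reaching at least one up-to-date KDC within the given attempts."""
    if total <= 0 or not 0 <= up_to_date <= total:
        raise ValueError("need 0 <= up_to_date <= total and total > 0")
    miss = 1 - up_to_date / total        # chance that a single attempt hits a stale KDC
    return 1 - miss ** attempts


# Example: 3 KDCs in the SRV record, only 1 of them knows the current keys.
print(chance_of_recovery(1, 3, 1))
print(chance_of_recovery(1, 3, 5))
# With only the (stale) local KDC available there is no chance at all.
print(chance_of_recovery(0, 1, 5))
```

The last call corresponds to the situation after bug 29291: the only KDC ever asked is the stale local one, so the probability stays at zero regardless of the number of attempts.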

The implications of this situation are bad enough IMHO to open this bug. The motivation for fixing bug 29291 was "ok", but maybe taking out the statistical element was not a good idea. I guess we need a new idea here. If nothing better comes up we may have to revert that change.
Comment 1 Arvid Requate univentionstaff 2015-12-16 19:09:32 CET

*** This bug has been marked as a duplicate of bug 37358 ***