Univention Bugzilla – Bug 43786
Clients lose their trust-relationship to the domain
Last modified: 2019-01-03 07:22:32 CET
Problem: With UCS version 4.1-4 errata 360 (between 355 and 366), clients lose their trust to the domain every 60 days. The clients were rolled out via Acronis TrueImage. Acronis allocates the same SIDs for the machines.

--------------------------------------------------------------------------------
This update fixes the following issues:
* Overflow in Samba NDR parsing function ndr_pull_dnsp_name causes vulnerability to remote code execution (CVE-2016-2123).
* Unconditional privilege delegation to Kerberos servers in trusted realms (CVE-2016-2125).
* Flaws in Kerberos PAC validation can trigger privilege elevation (CVE-2016-2126).
* Samba has been updated to version 4.5.3. The Debian package version doesn't reflect this and stays at 2:4.5.1-1.849.
* Rejoining a DC Backup or DC Slave failed in UCS 4.1-4 because samba-tool domain join didn't support the option --keep-existing any longer.

Bug#43132 Bug#43144 Bug#43176
--------------------------------------------------------------------------------

With ACMP (Aagon) this does not happen.
Seems like we had the same issue today.
Workaround for now:
1. Deactivate "Domain member: Maximum machine account password age" GPO for now.
2. Rejoin affected systems.
(In reply to Michel Smidt from comment #1)
> Seems like we had the same issue today.
> Workaround for now:
> 1. Deactivate "Domain member: Maximum machine account password age" GPO for now.
> 2. Rejoin affected systems.

How was the Windows system installed and which UCS version is used?
(In reply to Stefan Gohmann from comment #2)
> (In reply to Michel Smidt from comment #1)
> > Seems like we had the same issue today.
> > Workaround for now:
> > 1. Deactivate "Domain member: Maximum machine account password age" GPO for now.
> > 2. Rejoin affected systems.
>
> How was the Windows system installed and which UCS version is used?

The systems were Windows 7 systems which were distributed via Acronis. Affected systems were "a few" in a computer room and "all" in a science class. The terminal server wasn't affected. Furthermore, a virtualized client which was installed by hand (ISO) was affected as well.

The problem was noticed after the Easter holiday. The password rotation before was set to 30 days. The resulting time frame (30 days + 2 weeks) fell into the rollout phase of the school.

UCS versions:
Master - 4.1-4 errata 408
School-Slave - 4.1-4 errata 408
I couldn't find any "hard facts" about our cases yet, such as
* Exact Windows client error messages and eventlog entries.
* univention-s4search objectsid=$SID_OF_MYCLIENT
* Complete machine account objects (OpenLDAP and Samba/AD), as obtainable via

    ldbsearch -H ldapi:///var/lib/samba/private/ldap_priv/ldapi \
        objectsid=$SID_OF_MYCLIENT \
        '*' supplementalcredentials unicodepwd replPropertyMetaData \
        ntsecuritydescriptor msds-keyversionnumber \
        --show-binary

In case anything like this happens again, we should definitely collect this data before working around the issue by setting DisablePasswordChange.

So we can only assess rather vague evidence currently:
* There are reports of failed "unattended" Windows installations in MS forums where the client got renamed during the process, which could break things. But this does not explain why it worked until the machine password was rotated. So I would not bet on this.
* On the other hand, the Samba advisory for CVE-2016-2126 mentioned above explicitly talks about a winbindd security issue when changing its own machine password. Note that they are not talking about client password changes here, but still it has a certain smell (e.g. https://bugzilla.redhat.com/show_bug.cgi?id=1403115 ).

Then Bug 43850 Comment 1 comes to my mind, where we found that a Kerberos ticket issued by Samba 4.6.1 was rejected by bind9 during a DDNS update. My gut feeling is that the security fix for CVE-2016-2126 for Samba 4.5.1 and that Kerberos issue in 4.6.1 are connected and are about the des-cbc-crc keytype: Bug 43850 said "Checksum type 1 not keyed" and the advisory for CVE-2016-2126 talks about unkeyed checksums.

If this very vague line of reasoning has a grain of truth, then I could imagine that Windows clients could also experience strange issues with their Kerberos keys. If we have another case of this bug, it may be worth trying the re-ordered key priorities from a UCS 4.2 krb5.conf to check if the client starts to work again.
And if not, we should experiment with "netdom resetpwd /server:DC_NAME /userd:USERNAME /passwordd:PASSWORD". The only question would be: Why would this be connected to a client password change?
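To make it easier to gather the machine account object before any workaround is applied, the ldbsearch call from this comment could be wrapped in a small helper. This is only a sketch: the function names, the SID sanity check and the output file argument are illustrative assumptions and not part of any UCS tool. The ldbsearch arguments are the ones quoted above.

```shell
# Return 0 if the argument looks like a domain account SID
# (S-1-5-21-<a>-<b>-<c>-<RID>). Hypothetical helper for illustration.
looks_like_sid() {
    printf '%s\n' "$1" | grep -Eq '^S-1-5-21(-[0-9]+){4}$'
}

# Dump the complete Samba/AD machine account object for one SID into a file.
# Must be run as root on the Samba DC; refuses to run on malformed input.
collect_machine_account() {
    local sid="$1" outfile="$2"
    looks_like_sid "$sid" || { echo "not a SID: $sid" >&2; return 1; }
    ldbsearch -H ldapi:///var/lib/samba/private/ldap_priv/ldapi \
        "objectsid=$sid" \
        '*' supplementalcredentials unicodepwd replPropertyMetaData \
        ntsecuritydescriptor msds-keyversionnumber \
        --show-binary > "$outfile"
}

# Example call (file name is a placeholder):
# collect_machine_account "$SID_OF_MYCLIENT" /tmp/machine_account.ldif
```

Note that the output contains password hashes, so the resulting file should be handled like any other credential dump.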
Another interesting thing here is the time interval of 60 days mentioned in the original bug report above: "clients lose their trust to the domain every 60 days."

According to https://blogs.technet.microsoft.com/askds/2009/02/15/machine-account-password-process-2/ MS clients change their password every 30 days. So the issue could be triggered by the *second* password change. Incidentally, Samba/AD stores the current and also the previous Kerberos hashes. After two rotations, the original set of Kerberos hashes, as obtained during the initial domain join, would be dropped.
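The arithmetic behind the 60 days can be spelled out with a tiny simulation. It is a sketch of the reasoning only, assuming exactly what the comment states (a rotation every 30 days, and the DC keeping the current and the previous key set); the function name and the K0, K1, ... labels are made up for illustration, with K0 standing for the keys from the initial domain join.

```shell
# Print which key generations the DC would still hold after each rotation.
# $1 = rotation interval in days, $2 = number of rotations to simulate.
simulate_rotations() {
    local interval="$1" rotations="$2"
    local current=0 previous="" n=1
    while [ "$n" -le "$rotations" ]; do
        previous="$current"      # the old current key set becomes "previous"
        current="$n"             # the client sets a new machine password
        echo "day $((n * interval)): current=K$current previous=K$previous"
        n=$((n + 1))
    done
}

# simulate_rotations 30 2
# day 30: current=K1 previous=K0
# day 60: current=K2 previous=K1
```

On day 60 the join-time key set K0 is no longer stored, which matches the interval from the original report.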
This seems to be the authoritative MS advice about this topic: https://social.technet.microsoft.com/wiki/contents/articles/9157.troubleshooting-ad-trust-relationship-between-workstation-and-primary-domain-failed.aspx

It names three common causes:
1. SID has been assigned to multiple computers.
2. "If there are problems with system time, DNS configuration or other settings, secure channel’s password between Workstation and DCs may not [work]."
3. No SPN or DNSHost Name mentioned in the computer account attributes.

I'll go through the tickets again to check if I can find any of the error messages documented as symptomatic for case 2 here: https://blogs.technet.microsoft.com/asiasupp/2007/01/17/typical-symptoms-when-secure-channel-is-broken/
Created attachment 8900 [details] collect_windowsclient_info.sh

Since the issue has not been reproducible so far (last attempt: Windows 7 clients rolled out with OPSI as in Ticket 2017030121000332, joined to a UCS@school Slave), we'll have to collect more information if this happens again.

The attached script should help collect server-side information about the affected Windows client:

    ./collect_windowsclient_info.sh

In the end it encrypts the collected log file with the GPG support key.
Created attachment 8901 [details] log_client_communication.sh

This second script should help capture network traffic between the Samba DC and the affected Windows client:

    ./log_client_communication.sh <short-client-hostname>

It needs to be run on the logon server of the client. In the end it encrypts the archive file with the GPG support key.
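The attachment itself is not reproduced in this thread. For readers without access to it, a capture limited to one client could look roughly like the following. This is not the attached script, just a minimal sketch assuming tcpdump is installed on the logon server; the function name and the output path are hypothetical, and the GPG encryption step is left out.

```shell
# Print (do not run) a tcpdump command line that records all traffic
# to and from one client into a pcap file.
# $1 = client hostname or IP, $2 = output file.
capture_command() {
    local client="$1" outfile="$2"
    # -i any: all interfaces, -s 0: full packets, -w: write raw pcap
    printf 'tcpdump -i any -s 0 -w %s host %s\n' "$outfile" "$client"
}

# Example (placeholders): review the command, then run it as root and stop
# it with Ctrl-C after the failed logon attempt.
# capture_command <short-client-hostname> /tmp/client.pcap
```

Printing the command first instead of executing it keeps the sketch harmless and lets the filter be checked before anything is captured.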
Again a customer reported issues related to this bug: UCS 4.1-5 errata 502, ca. 750 Win7 clients.

He also mentioned http://implbits.com/active-directory/2012/04/13/dont-rejoin-to-fix.html as a partly successful workaround.
Created attachment 9545 [details] Logfile of DataCollector for Client bv-pgf-02
Created attachment 9546 [details] Logfile of TCPDump for Client bv-pgf-02
I ran into the same problem while upgrading the UCS system; at the moment I'm at 4.1-3. We have only one UCS system, which acts as PDC with Samba 4. Currently around 5 clients a day end up in this state (the attached logfile is from one of them). The tcpdump is from a Win7 client in pre-login state: I tried to log in, ran into the trust error and then ended the capture.

As only a few people are reporting this, I believe the problem may resolve itself when I upgrade to 4.2/4.3. One thing that makes me unsure about this is that univention-s4search (as also seen in the tcpdump) is throwing an error. I can run telnet ucs 636 and telnet ucs 7636 and get a connection. I'm quite unsure whether this is related and what to do next (update to 4.2/4.3 vs. fixing).
This issue has been filed against UCS 4.1. The maintenance with bug and security fixes for UCS 4.1 ended on the 5th of April 2018. Customers still on UCS 4.1 are encouraged to update to UCS 4.3. Please contact your partner or Univention for any questions. If this issue still occurs in newer UCS versions, please use "Clone this bug" or simply reopen the issue. In this case please provide detailed information on how this issue is affecting you.