Univention Bugzilla – Bug 40259
Local LDAP bind stops working during UCS@school rejoin
Last modified: 2017-03-24 16:17:17 CET
In an UCS@school workshop, 2 out of 5 UCS@school Slave installations failed during the re-join of the UCS@school configuration wizard. The problem was that the authentication against the local slapd stopped working. The process was still running, but a simple univention-ldapsearch was not possible anymore (ldap_bind: Invalid credentials (49)). Workaround: After manually restarting the slapd service, authentication against the LDAP was possible again and the pending join scripts could be executed successfully. I will collect the join.log of at least one of the systems tomorrow.
I had the same problem twice. I opened Bug #40247, Bug #40248, Bug #40249.
Any hints on how to reproduce this issue? Minimum number of UCS@school slaves? Or does it also happen with the first school slave?
(In reply to Sönke Schwardt-Krummrich from comment #2) > Any hints on how to reproduce this issue? Minimum number of UCS@school > slaves? Or does it also happen with the first school slave? Yes, with the first slave.
The UCS@school Mulitserver tests are currently failing with this problem: http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Multiserver/171/testReport/ http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Multiserver/170/testReport/ http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Multiserver/169/testReport/ http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20(R2)%20Multiserver/168/testReport/
03univention-directory-listener.inst contains the following errors: 113 18.07.16 19:45:44.526 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=Guest,cn=users,dc=autotest201,dc=local": object class violation while adding 114 18.07.16 19:45:44.526 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 115 18.07.16 19:45:44.544 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=krbtgt,cn=users,dc=autotest201,dc=local": object class violation while adding 116 18.07.16 19:45:44.544 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 117 18.07.16 19:45:44.653 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=join-slave,cn=users,dc=autotest201,dc=local": object class violation while adding 118 18.07.16 19:45:44.653 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 119 18.07.16 19:45:44.670 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=join-backup,cn=users,dc=autotest201,dc=local": object class violation while adding 120 18.07.16 19:45:44.670 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 121 18.07.16 19:45:44.694 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=sys-idp-user,cn=users,dc=autotest201,dc=local": object class violation while adding 122 18.07.16 19:45:44.694 LISTENER ( ERROR ) : » additional info: object class 'simpleSecurityObject' requires attribute 'userPassword' 123 18.07.16 19:45:44.715 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=Administrator,cn=users,dc=autotest201,dc=local": object class violation while adding 124 18.07.16 19:45:44.715 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 125 18.07.16 19:45:44.719 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=dns-master201,cn=users,dc=autotest201,dc=local": object class violation while adding 126 18.07.16 19:45:44.719 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 127 18.07.16 19:45:44.922 LISTENER ( ERROR ) : replication: Object class violation; dn="cn=master201,cn=dc,cn=computers,dc=autotest201,dc=local": object class violation while adding 128 18.07.16 19:45:44.922 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 129 18.07.16 19:45:44.949 LISTENER ( ERROR ) : replication: Object class violation; dn="uid=http-proxy-master201,cn=users,dc=autotest201,dc=local": object class violation while adding 130 18.07.16 19:45:44.949 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 131 18.07.16 19:45:45.363 LISTENER ( ERROR ) : replication: Object class violation; dn="cn=slave202,cn=dc,cn=server,cn=computers,ou=School1,dc=autotest201,dc=local": object class violation while adding 132 18.07.16 19:45:45.364 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 133 18.07.16 19:45:45.445 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=default@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 134 18.07.16 19:45:45.445 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 135 18.07.16 19:45:45.514 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=kadmin/admin@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 136 18.07.16 19:45:45.514 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 137 18.07.16 19:45:45.516 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=kadmin/hprop@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 138 18.07.16 19:45:45.516 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 139 18.07.16 19:45:45.550 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=kadmin/changepw@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 140 18.07.16 19:45:45.550 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 141 18.07.16 19:45:45.558 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=changepw/kerberos@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 142 18.07.16 19:45:45.558 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 143 18.07.16 19:45:45.586 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=WELLKNOWN/ANONYMOUS@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 144 18.07.16 19:45:45.586 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 145 18.07.16 19:45:45.611 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=krbtgt/AUTOTEST201.LOCAL@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 146 18.07.16 19:45:45.611 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 147 18.07.16 19:45:45.632 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=WELLKNOWN/org.h5l.fast-cookie@WELLKNOWN:ORG.H5L,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 148 18.07.16 19:45:45.632 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber' 149 18.07.16 19:45:45.644 LISTENER ( ERROR ) : replication: Object class violation; dn="krb5PrincipalName=ldap/master201.autotest201.local@AUTOTEST201.LOCAL,cn=kerberos,dc=autotest201,dc=local": object class violation while adding 150 18.07.16 19:45:45.644 LISTENER ( ERROR ) : » additional info: object class 'krb5KDCEntry' requires attribute 'krb5KeyVersionNumber'
I had a further look into the logfiles of ucs-test this night and made the following observations: The join.log contains the following error: > Configure 01univention-ldap-server-init.inst Tue Jul 19 18:34:24 EDT 2016 > > Starting ldap server(s): slapd ...failed. > 578eaaf1 /etc/ldap/slapd.conf: line 110: unknown attr "@univentionVirtualMachine" in to clause 578eaaf1 … slapsshema: bad configuration file!. > invoke-rc.d: initscript slapd, action "start" failed. → This is because the schema is not known yet on the DC Slave. This should be okay. Bug #41782. The listener.log contains the following error: 19.07.16 18:43:50.897 LISTENER ( ERROR ) : ldap_extension: slapd.conf validation failed: 578ead26 /etc/ldap/slapd.conf: line 250: unknown attr "@msPrintConnectionPolicy" in to clause This might not be okay? There is nothing done at 18:43 on the DC Slave system. In the join.log we can see the 2 joins: Tue Jul 19 18:38:05 EDT 2016: finish /usr/share/univention-join/univention-join Tue Jul 19 18:53:48 EDT 2016: starting /usr/sbin/univention-join -dcaccount Administrator -dcpwd /tmp/tmpGpy01n → The first is the regular join, the second is the rejoin by the UCS@school installer. Between those 2 joins there is a reboot. The second join/school-installer failed immediately: 19.07.16 18:54:23.507 MODULE ( PROCESS ) : ************************************************************************** 19.07.16 18:54:23.507 MODULE ( PROCESS ) : * Join failed! * 19.07.16 18:54:23.507 MODULE ( PROCESS ) : * Contact your system administrator * 19.07.16 18:54:23.507 MODULE ( PROCESS ) : ************************************************************************** 19.07.16 18:54:23.507 MODULE ( PROCESS ) : * Message: FAILED: 00ucs-school-slave-check-ou.inst 19.07.16 18:54:23.507 MODULE ( WARN ) : Die Softwarepakete wurden erfolgreich installiert, jedoch konnte der Domänenbeitritt nicht abgeschlossen werden: FAILED: 00ucs-school-slave-check-ou.inst. Mehr Hinweise können in der Log-Datei /var/log/univention/join.log gefunden werden. Nach Beheben der entsprechenden Konflikte kann der Domänenbeitritt über das UMC-Modul "Domänenbeitritt" abgeschlossen werden. 19.07.16 18:54:23.507 MODULE ( PROCESS ) : ************************************************************************** At the same time of the error shown in the listener.log at 18:43 there runs the school-installer on the DC Master. Is that error the cause? Every schema/acl extension added by ucsRegisterLDAPSchema is checked for validity and reverted if adding it fails.
All current UCS@school jenkins tests have been fixed by svn r71119 which checks if the schema for specific attributes/objectclasses exists when writing the ACL's. Maybe this Bug can then be resolved as duplicate of Bug #41725 (not sure though).
Does this problem still happen in our jenkins environment?
(In reply to Sönke Schwardt-Krummrich from comment #8) > Does this problem still happen in our jenkins environment? No. From the logfile in Bug #40249: 14.12.15 13:53:28.350 LISTENER ( WARN ) : ucsschool-user-logonscripts: Invalid credentials: failed to connect to LDAP server 14.12.15 13:53:28.350 LISTENER ( ERROR ) : ucsschool-user-logonscripts: Invalid credentials: giving up creating a new LDAP connection A different reason is in Bug #42505. (In reply to Michael Grandjean from comment #0) > I will collect the join.log of at least one of the systems tomorrow. @Michael: when is tomorrow? :D
It still fails regularly on jenkins 4.1r2 multiserver. The slaves (10.210.201.10) 2nd join: ---------------------------- join.log ------------------------------------ Wed Sep 28 19:46:34 EDT 2016: starting /usr/sbin/univention-join -dcaccount Administrator -dcpwd /tmp/tmp7qoZYW [..] E: failed to modify DC Slave cn=slave2033,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local [LDAP Error: Can't contact LDAP server] Wed Sep 28 19:46:51 EDT 2016: finish /usr/sbin/univention-join ---------------------------------------------------------------------- At the same time on the master: ----------- management-console-web-server.log ------------------------------- 28.09.16 19:46:23.528 MAIN ( INFO ) : CPAuth/auth: got new auth request (10.210.201.10:41293 <=> ) ----------- management-console-server.log ------------------------------- 28.09.16 19:46:26.528 MODULE ( INFO ) : Executing ['schoolwizards/schools/move_dc'] ----------- auth.log ----------------------------- Sep 28 19:46:35 master203 sshd[18358]: Accepted keyboard-interactive/pam for Administrator from 10.210.201.10 port 50563 ssh2 Sep 28 19:46:35 master203 sshd[18358]: pam_unix(sshd:session): session opened for user Administrator by (uid=0) Sep 28 19:46:35 master203 sshd[18368]: Received disconnect from 10.210.201.10: 11: disconnected by user Sep 28 19:46:35 master203 sshd[18358]: pam_unix(sshd:session): session closed for user Administrator Sep 28 19:46:35 master203 sshd[18370]: Accepted keyboard-interactive/pam for Administrator from 10.210.201.10 port 50582 ssh2 Sep 28 19:46:35 master203 sshd[18370]: pam_unix(sshd:session): session opened for user Administrator by (uid=0) Sep 28 19:46:36 master203 sshd[18378]: Received disconnect from 10.210.201.10: 11: disconnected by user Sep 28 19:46:36 master203 sshd[18370]: pam_unix(sshd:session): session closed for user Administrator Sep 28 19:46:45 master203 sshd[18387]: Accepted keyboard-interactive/pam for Administrator from 10.210.201.10 port 50773 ssh2 Sep 28 19:46:45 master203 sshd[18387]: pam_unix(sshd:session): session opened for user Administrator by (uid=0) Sep 28 19:46:45 master203 sshd[18394]: Received disconnect from 10.210.201.10: 11: disconnected by user Sep 28 19:46:45 master203 sshd[18387]: pam_unix(sshd:session): session closed for user Administrator Sep 28 19:46:48 master203 sshd[18396]: Accepted keyboard-interactive/pam for Administrator from 10.210.201.10 port 50840 ssh2 Sep 28 19:46:48 master203 sshd[18396]: pam_unix(sshd:session): session opened for user Administrator by (uid=0) Sep 28 19:46:49 master203 sshd[18403]: Received disconnect from 10.210.201.10: 11: disconnected by user Sep 28 19:46:49 master203 sshd[18396]: pam_unix(sshd:session): session closed for user Administrator Sep 28 19:46:49 master203 sshd[18410]: Accepted keyboard-interactive/pam for Administrator from 10.210.201.10 port 50864 ssh2 Sep 28 19:46:49 master203 sshd[18410]: pam_unix(sshd:session): session opened for user Administrator by (uid=0) Sep 28 19:46:51 master203 slapd[18118]: DIGEST-MD5 common mech free Sep 28 19:46:51 master203 sshd[18421]: Received disconnect from 10.210.201.10: 11: disconnected by user Sep 28 19:46:51 master203 sshd[18410]: pam_unix(sshd:session): session closed for user Administrator Sep 28 19:46:51 master203 slapd[18491]: Loaded metadata from "/usr/share/univention-management-console/saml/idp/ucs-sso.autotest203.local.xml" ----------- syslog ----------------------------- Sep 28 19:46:51 master203 logger: /etc/init.d/slapd graceful-restart (pid: 18458, ppid:18352 univention-dire) Sep 28 19:46:51 master203 logger: /etc/init.d/slapd graceful-stop (pid: 18466, ppid:18458 slapd) Sep 28 19:46:51 master203 logger: /etc/init.d/slapd start (pid: 18479, ppid:18458 slapd) Sep 28 19:46:51 master203 slapd[18491]: @(#) $OpenLDAP: slapd (Nov 6 2015 08:03:15) $#012#011root@ladda:/var/build/temp/tmp.bfW1EHuZcn/pbuilder/openldap-2.4.42+dfsg/debian/build/servers/ ---------------------------------------------------------------------- So the slave fails joining, because the masters slapd does a restart. I couldn't figure out why it does that. Is it possible, that on the master installations/tests were running while the slaves were being setup?
*** Bug 42814 has been marked as a duplicate of this bug. ***
Created attachment 8233 [details] three retries of remote univention-server-join I was not able to reproduce this and I couldn't find recent cases in jenkins. The attached patch tries three times to run univention-server-join on the master, in case it fails. This might fix a previously observed case. With the lack of logfiles, I don't know what other things to make more robust in univention-join...
Problem cannot be reproduced. The patch is not applied. If the problem reoccurs, please reopen.
We've not seen this problem again during the last 12 months. Please reopen this bug with further information, if the problem happens again.
(In reply to Sönke Schwardt-Krummrich from comment #14) > Please reopen this bug with further information, if the problem happens > again. s/reopen/clone/