Univention Bugzilla – Bug 33050
Various samba tests fail on S3 slave
Last modified: 2014-03-17 14:51:32 CET
http://jenkins.knut.univention.de:8080/view/Autotest/job/UCS%203.2%20Autotest%20MultiEnv/186/SambaVersion=s3,Systemrolle=slave/testReport/

It seems to be a replication problem, for example:

-----------------------------------------------------------------------------
Failure

53_samba-common/46share_access_permissions_sambaValidUsers.Access a share as a user whitelisted by sambaValidUsers (from samba-common.Access a share as a user whitelisted by sambaValidUsers)

Failing for the past 5 builds (since Unstable #182)
Duration: 1 minute 19 seconds.

Error message

Test failed

Standard output (STDOUT)

## create user
Object created: uid=vunvelve,cn=users,dc=autotest094,dc=local
Object created: cn=s0v8ECF7,cn=slave094.autotest094.local,cn=shares,dc=autotest094,dc=local
Waiting for replication:
OK: replication complete (nid=2925 lid=2925)
Done: replication complete.
Waiting for postrun
## wait for samba share export
Object removed: uid=vunvelve,cn=users,dc=autotest094,dc=local
Object removed: cn=s0v8ECF7,cn=slave094.autotest094.local,cn=shares,dc=autotest094,dc=local
Waiting for replication:
OK: replication complete (nid=2928 lid=2928)
Done: replication complete.
Waiting for postrun

Standard error (STDERR)

info 2013-10-31 20:23:20 create user vunvelve
error 2013-10-31 20:24:20 TIMEOUT: Access to share still failed after 30 seconds:
error 2013-10-31 20:24:20 **************** Test failed above this line (1) ****************
info 2013-10-31 20:24:20 remove user vunvelve
debug 2013-10-31 20:24:20 user vunvelve removed
info 2013-10-31 20:24:20 checking whether the user vunvelve is really removed
debug 2013-10-31 20:24:21 user vunvelve does not exist
-----------------------------------------------------------------------------
This is a little bit strange. During the failedldif test (tests/10_ldap/60failedldif) the samba / winbind processes lost the connection to the LDAP server. During that time the smb.conf was not written completely:

-----------------------------------------------------------------------
# Warning: This file is auto-generated and might be overwritten by
# univention-config-registry.
# Please edit the following file(s) instead:
# Warnung: Diese Datei wurde automatisch generiert und kann durch
# univention-config-registry überschrieben werden.
# Bitte bearbeiten Sie an Stelle dessen die folgende(n) Datei(en):
#
# /etc/univention/templates/files/etc/samba/smb.conf.d/01univention-samba_main
# /etc/univention/templates/files/etc/samba/smb.conf.d/02univention-samba_netbios
# /etc/univention/templates/files/etc/samba/smb.conf.d/11univention-samba_ldap
# /etc/univention/templates/files/etc/samba/smb.conf.d/21univention-samba_winbind
# /etc/univention/templates/files/etc/samba/smb.conf.d/31univention-samba_password
# /etc/univention/templates/files/etc/samba/smb.conf.d/41univention-samba_printing
# /etc/univention/templates/files/etc/samba/smb.conf.d/51univention-samba_domain
# /etc/univention/templates/files/etc/samba/smb.conf.d/52univention-samba_domainname
# /etc/univention/templates/files/etc/samba/smb.conf.d/61univention-samba_misc
# /etc/univention/templates/files/etc/samba/smb.conf.d/71univention-samba_users
# /etc/univention/templates/files/etc/samba/smb.conf.d/81univention-quota_scripts
# /etc/univention/templates/files/etc/samba/smb.conf.d/81univention-samba_scripts
# /etc/univention/templates/files/etc/samba/smb.conf.d/90univention-samba_user_shares
# /etc/univention/templates/files/etc/samba/smb.conf.d/91univention-samba_shares
# /etc/univention/templates/files/etc/samba/smb.conf.d/92univention-samba_shares
# /etc/univention/templates/files/etc/samba/smb.conf.d/95univention-samba_local_config
# /etc/univention/templates/files/etc/samba/smb.conf.d/99univention-samba_local_shares
#
[global]
debug level = 0
syslog = 0
max log size = 0
max open files = 32808
server string = %h univention corporate server
machine password timeout = 0
netbios name = slave094

; ldap
passdb backend = ldapsam:"ldap://slave094.autotest094.local:7389"
auth methods = guest sam winbind
ldap suffix = dc=autotest094,dc=local
ldap admin dn = "cn=slave094,cn=dc,cn=computers,dc=autotest094,dc=local"
ldap ssl = start tls
passdb expand explicit = no

; idmap/winbind
ldap idmap suffix = cn=idmap,cn=univention
idmap config * : backend = ldap
idmap config * : range = 55000-64000
idmap config * : ldap_url = ldap://slave094.autotest094.local:7389
idmap config * : ldap_user_dn = cn=slave094,cn=dc,cn=computers,dc=autotest094,dc=local
idmap config AUTOTEST094 : backend = nss
idmap config AUTOTEST094 : range = 1000-54999
winbind max clients = 500
winbind nested groups = no
winbind enum users = yes
winbind enum groups = yes
winbind separator = +
; winbind use default domain = yes
; winbind enable local accounts = yes
template shell = /bin/bash
template homedir = /home/%D-%U

; password sync
pam password change = no
unix password sync = no
; ldap passwd sync = yes
passwd chat = *New*password* %n\n *Re-enter*new*password* %n\n *password*changed*
passwd chat timeout = 60
client use spnego = yes
obey pam restrictions = yes
encrypt passwords = yes

; printing
load printers = yes
printing = cups
printcap name = cups

; domain
security = user
domain logons = yes
domain master = no
preferred master = yes
local master = yes
os level = 65
wins support = no
wins server = master094.autotest094.local
workgroup = AUTOTEST094
oplocks = yes
kernel oplocks = yes
large readwrite = yes
deadtime = 15
read raw = yes
write raw = yes
max xmit = 65535
getwd cache = yes
wide links = no
store dos attributes = yes
max protocol = SMB2
logon home = \\slave094\%U
logon drive = I:
logon path = \\slave094\%U\windows-profiles\%a
preserve case = yes
short preserve case = yes
time server = yes
host msdfs = no
msdfs root = no
guest account = nobody
map to guest = Bad User
admin users = administrator join-backup
set quota command = /usr/sbin/univention-setquota
check password script = /usr/share/univention-samba/password
-----------------------------------------------------------------------

Here it ends. At the same time I see:

Jan 3 09:52:41 slave094 kernel: [566624.196769] univention-dire[8417]: segfault at 0 ip (null) sp 00007fff1b8473d8 error 14 in univention-directory-listener[400000+14000]

From the listener.log:

03.01.14 09:52:39.812 LISTENER ( ERROR ) : Can't contact LDAP server: going into LDIF mode
kadmin: ext host/slave094.autotest094.local@AUTOTEST094.LOCAL: Wrong database version
03.01.14 09:52:39.901 LISTENER ( ERROR ) : Could not write to transaction file /var/lib/univention-ldap/listener/listener. Check for /var/lib/univention-directory-replication/failed.ldif
03.01.14 09:52:39.901 LISTENER ( ERROR ) : failed to write to transaction file
Abort: Can't contact LDAP server.
Try to sync changes stored in /var/lib/univention-directory-replication/failed.ldif into local LDAP
03.01.14 09:52:41.495 LISTENER ( WARN ) : received signal 15
waiting for listener-shutdown
close failed in file object destructor:
Error in sys.excepthook:
Original exception was:
. . . . . shutdown done
replay stored changes ...
Restored modifies sucessfuly, the ldif-file will be moved to /tmp/replayed.ldif_2014-01-03-09:52:46
Starting univention-directory-listener daemon.
done.
03.01.14 09:52:47.222 DEBUG_INIT
Created attachment 5709 [details] listener.log.debug4.bug_33050.bz2
Created attachment 5710 [details] strace-listener_bug_33050.bz2
I've added a simple workaround to ucs-test/tests/10_ldap/60failedldif. At the end of this test case /etc/samba/smb.conf is rewritten and the samba processes are restarted. The real problem is that UCR does not write the files as an atomic operation, so an interrupted write can leave a truncated file behind. I've opened a new bug for it: Bug #33842.
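For illustration only, here is a minimal Python 3 sketch of what an atomic file write usually looks like: write to a temporary file in the same directory, flush it to disk, then rename it over the target. This is not UCR's code; the function name write_atomically is hypothetical, and the sketch assumes a POSIX file system where a rename within one directory is atomic.

```python
import os
import tempfile


def write_atomically(path, content):
    """Replace `path` so readers see either the old or the new file, never a partial one.

    Hypothetical helper, not part of UCR. Note that the temporary file is
    created with mode 0600, so the target's previous permissions are not kept.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same file system as the target,
    # otherwise the final rename would not be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.replace(tmp_path, path)  # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern a process that is killed in the middle of the write leaves the old smb.conf untouched instead of a file that stops halfway through the [global] section.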
Now it happened on an S4 DC Backup. It seems that the DRS replication does not work: http://jenkins.knut.univention.de:8080/view/Autotest/job/UCS%203.2%20Autotest%20MultiEnv/SambaVersion=s4,Systemrolle=backup/227/testReport/
I've added some more debug output. During the 61getent_crash test case one samba process failed with a segmentation fault:

[2014/01/11 06:50:25.743787, 0, pid=1305] ../lib/util/fault.c:73(fault_report)
INTERNAL ERROR: Signal 11 in pid 1305 (4.1.0-Debian)
Please read the Trouble-Shooting section of the Samba HOWTO

All other samba processes are still running, but the DRS replication fails:

[2014/01/11 06:50:30.584953, 0, pid=1296] ../source4/rpc_server/common/forward.c:55(dcesrv_irpc_forward_callback)
IRPC callback failed for DsReplicaSync - NT_STATUS_CONNECTION_REFUSED
[2014/01/11 06:50:35.589097, 0, pid=1296] ../source4/rpc_server/common/forward.c:55(dcesrv_irpc_forward_callback)
IRPC callback failed for DsReplicaSync - NT_STATUS_CONNECTION_REFUSED

root@backup093:~# samba-tool processes
Service:          PID
-----------------------------
dnsupdate         1310
rpc_server        1296
rpc_server        1296
rpc_server        1296
rpc_server        1296
rpc_server        1296
cldap_server      1303
winbind_server    1307
kdc_server        1304
samba             0
dreplsrv          1305
kccsrv            1309
ldap_server       1301
root@backup093:~#

I've created Bug #33904 for this issue and disabled the test case 10_ldap/61getent_crash.
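The listing above still names PID 1305 for dreplsrv, which is the PID from the segfault message. As a rough sketch of how such a case could be spotted automatically, the following Python 3 code parses output in that two-column shape and reports services whose PID no longer exists. The function names are hypothetical, the parser only assumes the "service, PID" layout shown in this comment, and treating PID 0 as "unknown" is an assumption too.

```python
import os


def parse_process_table(text):
    """Return (service, pid) pairs from `samba-tool processes` style output."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<service> <pid>"; header and separator rows are skipped.
        if len(parts) == 2 and parts[1].isdigit():
            entries.append((parts[0], int(parts[1])))
    return entries


def pid_is_alive(pid):
    """Check whether a process with this PID exists, using signal 0."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0 only tests for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stale_services(text, is_alive=pid_is_alive):
    """Names of services whose listed PID is gone (PID 0 is ignored as unknown)."""
    return sorted({
        service
        for service, pid in parse_process_table(text)
        if pid > 0 and not is_alive(pid)
    })
```

Run on the DC itself, stale_services() could be fed the captured output of samba-tool processes; an entry such as dreplsrv showing up in the result would hint that the service crashed while its registration was left behind.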
Indeed, I just tried to reproduce the problem and discovered the test was disabled :-)
Released as an errata update for unmaintained.