Univention Bugzilla – Full Text Bug Listing |
Summary: | UCS@school 4.1 Slave: FAILED 03univention-directory-listener.inst | ||
---|---|---|---|
Product: | UCS | Reporter: | Arvid Requate <requate> |
Component: | Listener (univention-directory-listener) | Assignee: | Philipp Hahn <hahn> |
Status: | CLOSED FIXED | QA Contact: | Arvid Requate <requate> |
Severity: | critical | ||
Priority: | P3 | CC: | best, gohmann, hahn, schwardt, walkenhorst |
Version: | UCS 4.1 | Flags: | hahn:
Patch_Available+
|
Target Milestone: | UCS 4.1-0-errata | ||
Hardware: | Other | ||
OS: | Linux | ||
What kind of report is it?: | --- | What type of bug is this?: | --- |
Who will be affected by this bug?: | --- | How will those affected feel about the bug?: | --- |
User Pain: | Enterprise Customer affected?: | ||
School Customer affected?: | ISV affected?: | ||
Waiting Support: | Flags outvoted (downgraded) after PO Review: | ||
Ticket number: | Bug group (optional): | ||
Max CVSS v3 score: | |||
Attachments: |
join.log
management-console-module-schoolinstaller.log Untested patch - compiles at least 1.c |
Description
Arvid Requate
2016-01-19 18:40:59 CET
Created attachment 7419 [details]
management-console-module-schoolinstaller.log
Looks like downgrading univention-directory-listner to 10.0.0-2.282.201509231400 (i.e. reverting to version/errata=0) before starting the UCS@school installer on the DC Slave fixes this issue. So it seems like the adjustment for Bug 39957 breaks something. I'm currently unable to reproduce it. Also the test Jenkins tests run without problems at least without this problem. Are you able to reproduce it? Happens in Jenkins: http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20Multiserver/ws/SambaVersion/s4-school-only/log-slave2042/join.log Happens again in Jenkins: http://jenkins.knut.univention.de:8080/job/UCSschool%204.1/job/UCSschool%204.1%20Multiserver/SambaVersion=s4/ws/join.log Configure 03univention-directory-listener.inst Thu Feb 4 19:15:53 EST 2016 2016-02-04 19:15:53.428791817-05:00 (in joinscript_init) Setting ldap/database/ldbm/dbsync Multifile: /etc/ldap/slapd.conf 04.02.16 19:15:54.740 DEBUG_INIT 04.02.16 19:15:54.740 LISTENER ( INFO ) : purging cache 04.02.16 19:15:54.740 LDAP ( INFO ) : connecting to ldap://master203.autotest203.local:7389/ 04.02.16 19:15:54.749 LDAP ( INFO ) : simple_bind as cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local 04.02.16 19:15:54.754 LISTENER ( INFO ) : connecting to notifier master203.autotest203.local:6669 04.02.16 19:15:54.755 LISTENER ( INFO ) : established connection to 10.210.151.160 port 6669 04.02.16 19:15:54.756 LISTENER ( INFO ) : connection okay to host master203.autotest203.local:7389 04.02.16 19:15:54.873 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/replication.py 04.02.16 19:15:55.228 LISTENER ( INFO ) : replication flatmode enabled by UCR: no 04.02.16 19:15:55.228 LISTENER ( INFO ) : replication flatmode activated: False 04.02.16 19:15:55.228 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/cups-printers.py 04.02.16 19:15:55.292 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/samba-shares.py 04.02.16 19:15:55.339 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/pkgdb-watch.py 04.02.16 19:15:55.502 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/nfs-shares.py 04.02.16 19:15:55.505 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/remove-old-homedirs.py 04.02.16 19:15:55.506 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/keytab.py 04.02.16 19:15:55.506 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/faillog.py 04.02.16 19:15:55.507 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/bind.py 04.02.16 19:15:55.509 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/umc-service-providers.py 04.02.16 19:15:55.509 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/ucsschool-s4-branch-site.py UNIVENTION_DEBUG_BEGIN : uldap.__open host=slave2032.autotest203.local port=7389 base=dc=autotest203,dc=local 04.02.16 19:15:57.032 LDAP ( INFO ) : establishing new connection with retry_max=11 04.02.16 19:15:57.047 LDAP ( INFO ) : bind binddn=cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local UNIVENTION_DEBUG_END : uldap.__open host=slave2032.autotest203.local port=7389 base=dc=autotest203,dc=local 04.02.16 19:15:57.048 LISTENER ( ERROR ) : import of filename=/usr/lib/univention-directory-listener/system/ucsschool-s4-branch-site.py failed Traceback (most recent call last): File "/usr/lib/univention-directory-listener/system/ucsschool-s4-branch-site.py", line 134, in <module> raise ImportError("Error accessing LDAP via machine account: %s" % (ex,)) ImportError: Error accessing LDAP via machine account: {'desc': 'Invalid credentials'} 04.02.16 19:15:57.048 LISTENER ( ERROR ) : import of filename=/usr/lib/univention-directory-listener/system/ucsschool-s4-branch-site.py failed in module_import() 04.02.16 19:15:57.048 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/quota.py 04.02.16 19:15:57.049 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/nagios-client.py 04.02.16 19:15:57.051 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/nscd.py 04.02.16 19:15:57.051 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/nss.py 04.02.16 19:15:57.051 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/ldap_server.py 04.02.16 19:15:57.065 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/samba4-idmap.py 04.02.16 19:15:57.080 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/cups-pdf.py 04.02.16 19:15:57.080 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/univention-saml-servers.py 04.02.16 19:15:57.080 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/license_uuid.py 04.02.16 19:15:57.081 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/ucs-school-user-logonscript.py 04.02.16 19:15:57.101 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/pupilgroups.py 04.02.16 19:15:57.109 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/keytab-member.py 04.02.16 19:15:57.110 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/s4-connector.py 04.02.16 19:15:57.111 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/gencertificate.py 04.02.16 19:15:57.112 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/ldap_extension.py 04.02.16 19:15:57.113 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/nfs-homes.py 04.02.16 19:15:57.114 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/remove-old-sharedirs.py 04.02.16 19:15:57.115 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/udm_extension.py 04.02.16 19:15:57.127 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/well-known-sid-name-mapping.py 04.02.16 19:15:57.132 LISTENER ( INFO ) : importing handler /usr/lib/univention-directory-listener/system/hosteddomains.py 04.02.16 19:15:57.133 LISTENER ( INFO ) : setting data for all handlers: key=basedn value=dc=autotest203,dc=local 04.02.16 19:15:57.133 LISTENER ( INFO ) : replication: listener passed key="basedn" value="dc=autotest203,dc=local" 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: listener passed key="basedn" value="dc=autotest203,dc=local" 04.02.16 19:15:57.133 LISTENER ( INFO ) : setting data for all handlers: key=binddn value=cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local 04.02.16 19:15:57.133 LISTENER ( INFO ) : replication: listener passed key="binddn" value="cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local" 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: listener passed key="binddn" value="cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local" 04.02.16 19:15:57.133 LISTENER ( INFO ) : setting data for all handlers: key=bindpw value=<HIDDEN> 04.02.16 19:15:57.133 LISTENER ( INFO ) : replication: listener passed key="bindpw" value="<HIDDEN>" 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: listener passed key="bindpw" value="<HIDDEN>" 04.02.16 19:15:57.133 LISTENER ( INFO ) : setting data for all handlers: key=ldapserver value=master203.autotest203.local 04.02.16 19:15:57.133 LISTENER ( INFO ) : replication: listener passed key="ldapserver" value="master203.autotest203.local" 04.02.16 19:15:57.133 LISTENER ( WARN ) : replication: ldap server changed to master203.autotest203.local 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: listener passed key="ldapserver" value="master203.autotest203.local" 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: ldap server changed to master203.autotest203.local 04.02.16 19:15:57.133 LISTENER ( INFO ) : pkgdb-watch: ldap reconnect triggered UNIVENTION_DEBUG_BEGIN : uldap.__open host=master203.autotest203.local port=7389 base=dc=autotest203,dc=local 04.02.16 19:15:57.165 LDAP ( INFO ) : establishing new connection with retry_max=11 04.02.16 19:15:57.179 LDAP ( INFO ) : bind binddn=cn=slave2032,cn=dc,cn=server,cn=computers,ou=School2,dc=autotest203,dc=local UNIVENTION_DEBUG_END : uldap.__open host=master203.autotest203.local port=7389 base=dc=autotest203,dc=local 04.02.16 19:15:57.192 LISTENER ( INFO ) : running handlers for cn=Subschema 04.02.16 19:15:57.192 LISTENER ( WARN ) : handler: replication (not ready) (ignore) 04.02.16 19:15:57.192 LISTENER ( INFO ) : replication: Running handler for: cn=Subschema Restarting ldap server(s). Stopping ldap server(s): slapd ...done. Starting ldap server(s): slapd ...done. 04.02.16 19:15:58.895 LISTENER ( INFO ) : handler: replication (successful) 04.02.16 19:15:58.896 LISTENER ( WARN ) : handler: faillog (not ready) (ignore) 04.02.16 19:15:58.896 LISTENER ( INFO ) : handler: faillog (successful) 04.02.16 19:15:58.896 LISTENER ( WARN ) : handler: s4-connector (not ready) (ignore) 04.02.16 19:15:58.905 LISTENER ( INFO ) : handler: s4-connector (successful) 04.02.16 19:15:58.906 LISTENER ( WARN ) : Set Schema ID to 7 04.02.16 19:15:58.906 LISTENER ( WARN ) : initializing module replication 04.02.16 19:15:58.906 LISTENER ( INFO ) : call handler_clean for module replication 04.02.16 19:15:58.906 LISTENER ( INFO ) : removing replica's cache File: /var/lib/univention-ldap/ldap/DB_CONFIG 04.02.16 19:16:00.042 LISTENER ( INFO ) : call handler_initialize for module replication 04.02.16 19:16:00.042 LISTENER ( INFO ) : REPLICATION: initialize 04.02.16 19:16:00.042 LISTENER ( INFO ) : removing replica's cache slapd: Kein Prozess gefunden File: /var/lib/univention-ldap/ldap/DB_CONFIG 04.02.16 19:16:01.205 LISTENER ( INFO ) : initializing replica's cache Starting ldap server(s): slapd ...done. 04.02.16 19:16:02.389 LISTENER ( INFO ) : remove old entries for module replication 04.02.16 19:16:02.391 LISTENER ( INFO ) : call cache_free_cursor for module replication 04.02.16 19:16:02.391 LISTENER ( INFO ) : initialize schema for module replication 04.02.16 19:16:02.392 LISTENER ( INFO ) : running handlers [replication] for cn=Subschema 04.02.16 19:16:02.393 LISTENER ( INFO ) : replication: Running handler for: cn=Subschema Restarting ldap server(s). Stopping ldap server(s): slapd ...retry #1....done. Starting ldap server(s): slapd ...done. 04.02.16 19:16:14.117 LISTENER ( INFO ) : handler: replication (successful) 04.02.16 19:16:14.118 LISTENER ( INFO ) : module replication for relating objects 04.02.16 19:16:14.118 LISTENER ( ERROR ) : could not get DNs when initializing replication: Can't contact LDAP server 04.02.16 19:16:14.118 LISTENER ( WARN ) : initializing module cups-printers 04.02.16 19:16:14.118 LISTENER ( INFO ) : call handler_clean for module cups-printers 04.02.16 19:16:14.118 LISTENER ( INFO ) : call handler_initialize for module cups-printers 04.02.16 19:16:14.119 LISTENER ( INFO ) : remove old entries for module cups-printers 04.02.16 19:16:14.120 LISTENER ( INFO ) : call cache_free_cursor for module cups-printers 04.02.16 19:16:14.120 LISTENER ( INFO ) : initialize schema for module cups-printers 04.02.16 19:16:14.121 LISTENER ( INFO ) : running handlers [cups-printers] for cn=Subschema 04.02.16 19:16:14.121 LISTENER ( INFO ) : module cups-printers for relating objects 04.02.16 19:16:14.122 LISTENER ( ERROR ) : could not get DNs when initializing cups-printers: Can't contact LDAP server While joining the school slave, the master restarts the slapd: Master (syslog): -------------------------------------------------------------------------------- Feb 4 19:15:35 slave2032 logger: /etc/init.d/slapd restart (pid: 12789, ppid:12464 univention-join) Feb 4 19:15:35 slave2032 logger: /etc/init.d/slapd stop (pid: 12797, ppid:12789 slapd) Feb 4 19:15:35 slave2032 logger: /etc/init.d/slapd start (pid: 12810, ppid:12789 slapd) Feb 4 19:15:35 slave2032 slapd[12821]: @(#) $OpenLDAP: slapd (Nov 6 2015 08:03:15) $#012#011root@ladda:/var/build/temp/tmp.bfW1EHuZcn/pbuilder/openldap-2.4.42+dfsg/debian/build/servers/slapd Feb 4 19:15:53 slave2032 logger: /etc/init.d/slapd start (pid: 13155, ppid:13138 invoke-rc.d) Feb 4 19:15:53 slave2032 slapd[13166]: @(#) $OpenLDAP: slapd (Nov 6 2015 08:03:15) $#012#011root@ladda:/var/build/temp/tmp.bfW1EHuZcn/pbuilder/openldap-2.4.42+dfsg/debian/build/servers/slapd Feb 4 19:15:57 slave2032 logger: /etc/init.d/slapd restart (pid: 13228, ppid:13223 univention-dire) Feb 4 19:15:57 slave2032 logger: /etc/init.d/slapd stop (pid: 13236, ppid:13228 slapd) Feb 4 19:15:57 slave2032 logger: /etc/init.d/slapd start (pid: 13249, ppid:13228 slapd) Feb 4 19:15:57 slave2032 slapd[13260]: @(#) $OpenLDAP: slapd (Nov 6 2015 08:03:15) $#012#011root@ladda:/var/build/temp/tmp.bfW1EHuZcn/pbuilder/openldap-2.4.42+dfsg/debian/build/servers/slapd Feb 4 19:16:01 slave2032 logger: /etc/init.d/slapd start (pid: 13270, ppid:13223 univention-dire) Feb 4 19:16:01 slave2032 slapd[13281]: @(#) $OpenLDAP: slapd (Nov 6 2015 08:03:15) $#012#011root@ladda:/var/build/temp/tmp.bfW1EHuZcn/pbuilder/openldap-2.4.42+dfsg/debian/build/servers/slapd Feb 4 19:16:01 slave2032 /usr/sbin/cron[4209]: (*system*univention-directory-policy) RELOAD (/etc/cron.d/univention-directory-policy) -------------------------------------------------------------------------------- Master (system-stats.log) -------------------------------------------------------------------------------- listener 13716 0.1 1.0 517992 38824 ? S 19:15 0:01 | \_ /usr/sbin/univention-directory-listener -F -b dc=autotest203,dc=local -m /usr/lib/univention-directory-listener/system -c /var/lib/univention-directory-listener -d 3 -x -ZZ -D cn=admin,dc=autotest203,dc=local -y /etc/ldap.secret -------------------------------------------------------------------------------- Master (listener.log) -------------------------------------------------------------------------------- 04.02.16 19:16:00.957 LISTENER ( INFO ) : postrun handler: umc-service-providers (prepared=-1) 13258 04.02.16 19:16:00.963 LISTENER ( INFO ) : umc-service-providers: Reloading LDAP server. Initiating graceful reload of ldap server(s). Sending HUP to ldap server(s): slapd ...done. Starting ldap server(s): slapd ...done. Checking Schema ID: ...done. -------------------------------------------------------------------------------- Slave (listener.log): -------------------------------------------------------------------------------- 04.02.16 19:16:01.205 LISTENER ( INFO ) : initializing replica's cache Starting ldap server(s): slapd ...done. 04.02.16 19:16:02.389 LISTENER ( INFO ) : remove old entries for module replication 04.02.16 19:16:02.391 LISTENER ( INFO ) : call cache_free_cursor for module replication 04.02.16 19:16:02.391 LISTENER ( INFO ) : initialize schema for module replication 04.02.16 19:16:02.392 LISTENER ( INFO ) : running handlers [replication] for cn=Subschema 04.02.16 19:16:02.393 LISTENER ( INFO ) : replication: Running handler for: cn=Subschema Restarting ldap server(s). Stopping ldap server(s): slapd ...retry #1....done. Starting ldap server(s): slapd ...done. 04.02.16 19:16:14.117 LISTENER ( INFO ) : handler: replication (successful) 04.02.16 19:16:14.118 LISTENER ( INFO ) : module replication for relating objects 04.02.16 19:16:14.118 LISTENER ( ERROR ) : could not get DNs when initializing replication: Can't contact LDAP server -------------------------------------------------------------------------------- We should simply reconnect if the LDAP connection has been closed. I'll move it to UCS. Created attachment 7452 [details]
Untested patch - compiles at least
Initialization might take some time. If the connection to the remote (or
local) LDAP servers is lost during that time, the initialization of the
module is aborted.
Add macro LDAP_RETRY to re-try failed commands if LDAP_SERVER_DOWN is
returned.
Wrap all LDAP search commands with said macro.
Add UCR 'listener/ldap/retries' to make number of retires configurable.
> While joining the school slave, the master restarts the slapd
As far as I read this, the umc-service-providers listener on the master restarts the slapd. This must be triggered by some joinscript action from the slave. So the slave should probably wait at that point until the slapd is ok again? Otherwise there might be other timing dependent effects in other setups?
(In reply to Philipp Hahn from comment #7) > Created attachment 7452 [details] > Untested patch - compiles at least but had some serious flaws. r67292 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN fixed patch applied r67291 | Bug #40460 UDL: Document listener signals document SIGUSR1/USR2/HUP/TERM, which is useful while debugging Package: univention-directory-listener Version: 10.0.0-6.298.201602100812 Branch: ucs_4.1-0 Scope: errata4.1-0 r67293 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN YAML univention-directory-listener.yaml QA: Testes 32k environemtn with slapd restart during initial join okay r67317 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN Fix spelling in UCRV. Limit max sleep to 31s Package: univention-directory-listener Version: 10.0.0-7.299.201602101726 Branch: ucs_4.1-0 Scope: errata4.1-0 r67318 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN YAML univention-directory-listener.yaml Created attachment 7469 [details]
1.c
Maybe the sleep((1 << _retry++) & 0x1f) from commit r67317 is not what you intended? See the attached example for an alternative:
Current:
1 2 4 8 16 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 4 8 16 0 0 0
Proposal:
1 2 4 8 16 32 1 2 4 8 16 32 1 2 4 8 16 32 1 2 4 8 16 32 1 2 4 8 16 32 1 2 4 8 16 32 1 2 4 8
Ok, apart from the question in the last comment the patch works. Unfortunately the amd64 version is still not built (openjdk-7 built is on its way..). Good: I stopped the Master slapd during initialization of replication.py and I see the binary exponential backoff in the join.log: ============================================================================== 23.11.15 20:44:39.432 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 20:44:40.433 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 20:44:42.434 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 20:44:46.435 LDAP ( ERROR ) : start_tls: Can't contact LDAP server ============================================================================== Then I continue the slapd and the module finishes initialization as intended. This will be enough for the situation here. Bad: In case the slapd doesn't come back during the waiting time (current default: about 31 seconds), then the join will continue as before without proper listener module initialization and the "joined" system will still be broken. Maybe we should rather abort the join in that case? I know, addressing this is too much to ask for this bug, but maybe we should fork another bug to fix that point too? (In reply to Arvid Requate from comment #11) > Created attachment 7469 [details] > 1.c > > Maybe the sleep((1 << _retry++) & 0x1f) from commit r67317 is not what you > intended? See the attached example for an alternative: Thanks, fixed: r67329 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN r67330 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN Package: univention-directory-listener Version: 10.0.0-9.301.201602102049 Branch: ucs_4.1-0 Scope: errata4.1-0 r67332 | Bug #40460 UDL: Retry LDAP queries on LDAP_SERVER_DOWN YAML univention-directory-listener.yaml (In reply to Arvid Requate from comment #12) > Unfortunately the amd64 version is still not built (openjdk-7 built is on > its way..). Forced build for amd64 > Bad: In case the slapd doesn't come back during the waiting time (current > default: about 31 seconds), then the join will continue as before without > proper listener module initialization and the "joined" system will still be > broken. Maybe we should rather abort the join in that case? I know, > addressing this is too much to ask for this bug, but maybe we should fork > another bug to fix that point too? 03univention-directory-listener.inst has this code, which breaks the join to complete: >if test -e "/var/lib/univention-directory-listener/handlers/replication"; then > res=`cat /var/lib/univention-directory-listener/handlers/replication` > if [ $res != "1" -a "$res" != "3" ]; then > exit 1 Ok, looks good, with listener/ldap/retries=8 I got for each module: ============================================================================= 23.11.15 22:00:23.337 LISTENER ( WARN ) : initializing module nfs-homes 23.11.15 22:00:23.348 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:00:24.349 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:00:26.351 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:00:30.353 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:00:38.354 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:00:54.355 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:01:26.356 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:01:27.357 LDAP ( ERROR ) : start_tls: Can't contact LDAP server 23.11.15 22:01:29.357 LISTENER ( ERROR ) : could not get DNs when initializing nfs-homes: Can't contact LDAP server ============================================================================= To test the extreme case of total failure I set retries to 0 and tried again and you are right, 03univention-directory-listener.inst fails in that case. Advisory: Ok (In reply to Arvid Requate from comment #12) > Bad: In case the slapd doesn't come back during the waiting time (current > default: about 31 seconds), then the join will continue as before without > proper listener module initialization and the "joined" system will still be > broken. Maybe we should rather abort the join in that case? I know, > addressing this is too much to ask for this bug, but maybe we should fork > another bug to fix that point too? No, please not. In my last support situation, the first four modules succeeded including the replication module. After that, I could simply restart the listener and the missing modules finished like expected. That saved me a few hours. > No, please not.
I was concerned about the inverse case: replication.py fails but then - due to this brilliant new retry feature - the Master LDAP server has enough time to recover to make the initialization of the following modules succeed. But I tested it and Philipp is right, in that case the join script aborts later, because other things fail.
|