Univention Bugzilla – Bug 54715
Joining a Backup Node into UCS@school "singleserver" setup runs in several timeouts and finally fails
Last modified: 2023-03-31 09:53:08 CEST
Running univention-join on an unjoined Backup Node to join it into a UCS@school environment (in my case a single school) takes **very** long and leaves the administrator puzzled as to what is going on during the pre-joinscripts phase:

```
root@ucsBackup:~# univention-join
univention-join: joins a computer to an ucs domain
copyright (c) 2001-2022 Univention GmbH, Germany

Enter Primary Directory Node Account : Administrator
Enter Primary Directory Node Password:

Search Primary Directory Node: done
Check Primary Directory Node: done
Stop LDAP Server: done
Search ldap/base done
Start LDAP Server: done
Search LDAP binddn done
Sync time: done
Running pre-join hook(s): done
Join Computer Account: done
Stopping univention-directory-notifier daemon: done
Stopping univention-directory-listener daemon: done
Sync ldap.secret: done
Sync ldap-backup.secret: done
Sync SSL directory: done
Check TLS connection: done
Download host certificate: done
Sync SSL settings: done
Purging translog database: done
Restart LDAP Server: done
Sync Kerberos settings: done
Create kerberos/adminserver File: /etc/krb5.conf
Running pre-joinscripts hook(s):
```

Looking into the join.log shows that it runs

```
2022-05-05 14:34:11,071 ucsschool-join-hook: [INFO] Calling ('univention-install', '--force-yes', '--yes', 'ucs-school-singleserver')
```

Looking at the actualise.log shows that

1. Joinscripts are run (e.g. univention-samba4) and run into local ldapsearch timeouts because the LDAP replication has not even been configured at this stage of the join.
2. Dynamic registrations of LDAP Schema and ACL extensions repeatedly take time to fail for the same reason.

We could adjust the `call_joinscript` library function (which is used e.g. in univention-samba4.postinst) to not run if the machine is not joined yet. But what does that even mean at this point? For sure univention-check-join-status fails, because there is a joinscript that has not been run.
But machine.secret is already there (and works) to fetch the join hooks and the file /var/univention-join/joined is also already there, albeit still empty at that stage. As a hack we could maybe remove /var/univention-join/joined in the prejoin-hook and adjust call_joinscript to abort if that file is not yet present. Make up your mind and file a bug against UCS to implement the expected behavior for call_joinscript.
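The marker-file hack described above could look roughly like the following. This is only a sketch under assumptions: `guarded_call_joinscript`, `join_marker_present` and the `JOINED_MARKER` variable are hypothetical names invented for illustration, and this is not how `call_joinscript` is currently implemented in UCS. Only the path /var/univention-join/joined is taken from this report.

```shell
#!/bin/sh
# Hypothetical guard, not the actual UCS implementation.
# JOINED_MARKER defaults to the marker path mentioned in this bug report.
JOINED_MARKER="${JOINED_MARKER:-/var/univention-join/joined}"

# Returns 0 if the marker file exists, 1 otherwise.
join_marker_present() {
    [ -e "$JOINED_MARKER" ]
}

# Hypothetical wrapper: run the given command only when the marker exists,
# otherwise log to stderr and return success without doing anything.
guarded_call_joinscript() {
    if ! join_marker_present; then
        echo "join marker $JOINED_MARKER missing, skipping: $*" >&2
        return 0
    fi
    "$@"
}
```

With this shape, a pre-join hook would remove the marker file and a postinst would call something like `guarded_call_joinscript some-joinscript.inst`. Whether an empty marker file should count as "joined" is exactly the open question raised above.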
Just look at the time gap here in join.log, that's 40 minutes:

```
2022-05-05 14:34:11,071 ucsschool-join-hook: [INFO] Calling ('univention-install', '--force-yes', '--yes', 'ucs-school-singleserver')
...
05.05.22 14:43:52.901 DEBUG_INIT
05.05.22 14:52:29.735 DEBUG_INIT
2022-05-05 15:15:13,893 ucsschool-join-hook: [INFO] Not installing 'UCS@school Veyon Proxy' app on this system role.
2022-05-05 15:15:13,897 ucsschool-join-hook: [INFO] ucsschool-join-hook.py is done
Configure 00ucs-school-app-version-check.inst Thu May 5 15:15:13 CEST 2022
2022-05-05 15:15:13.991516791+02:00 (in joinscript_init)
Version of app "ucsschool" on this host: "5.0 v1"
Version of app "ucsschool" on Primary Directory Node: "5.0 v1"
OK: local version of app "ucsschool" lower than or equal to version on Primary Directory Node.
Version check passed.
2022-05-05 15:15:15.039867519+02:00 (in joinscript_save_current_version)
Configure 01univention-ldap-server-init.inst Thu May 5 15:15:15 CEST 2022
2022-05-05 15:15:15.071292350+02:00 (in joinscript_init)
File: /var/lib/univention-ldap/translog/DB_CONFIG
6273cde3 /etc/ldap/slapd.conf: line 199: unknown attr "@univentionApp" in to clause
6273cde3 <access clause> ::= access to <what> [ by <who> [ <access> ] [ <control> ] ]+
<what> ::= * | dn[.<dnstyle>=<DN>] [filter=<filter>] [attrs=<attrspec>]
<attrspec> ::= <attrname> [val[/<matchingRule>][.<attrstyle>]=<value>] | <attrlist>
<attrlist> ::= <attr> [ , <attrlist> ]
<attr> ::= <attrname> | @<objectClass> | !<objectClass> | entry | children
<who> ::= [ * | anonymous | users | self | dn[.<dnstyle>]=<DN> ]
    [ realanonymous | realusers | realself | realdn[.<dnstyle>]=<DN> ]
    [dnattr=<attrname>]
    [realdnattr=<attrname>]
    [group[/<objectclass>[/<attrname>]][.<style>]=<group>]
    [peername[.<peernamestyle>]=<peer>]
    [sockname[.<style>]=<name>]
    [domain[.<domainstyle>]=<domain>]
    [sockurl[.<style>]=<url>]
    [dynacl/<name>[/<options>][.<dynstyle>][=<pattern>]]
    [ssf=<n>]
    [transport_ssf=<n>]
    [tls_ssf=<n>]
    [sasl_ssf=<n>]
```
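For anyone who wants to measure such gaps in their own join.log, here is a small sketch. It assumes GNU `date` (for `-d`), and `log_gap_seconds` is a hypothetical helper name, not part of UCS. It takes two timestamps in the `YYYY-MM-DD HH:MM:SS,mmm` form used by ucsschool-join-hook above.

```shell
#!/bin/sh
# Sketch: seconds between two join.log timestamps ("YYYY-MM-DD HH:MM:SS,mmm").
# Assumes GNU date; log_gap_seconds is a hypothetical helper name.
log_gap_seconds() {
    # Drop the ",mmm" millisecond suffix; whole seconds are enough here.
    start="${1%,*}"
    end="${2%,*}"
    # Interpret both timestamps in the same fixed zone so the difference
    # does not depend on the local timezone setting.
    start_s="$(TZ=UTC date -d "$start" +%s)" || return 1
    end_s="$(TZ=UTC date -d "$end" +%s)" || return 1
    echo "$((end_s - start_s))"
}
```

Called with the first and last ucsschool-join-hook timestamps from the excerpt, `log_gap_seconds '2022-05-05 14:34:11,071' '2022-05-05 15:15:13,893'` prints 2462, i.e. about 41 minutes.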
And then the error messages continue:

* 01univention-ldap-server-init.inst apparently leaves slapd in a defunct state (yet it reports success for the joinscript)
* Next, 03univention-directory-listener.inst fails in likewise spectacular ways, because there is no local slapd

So the current state is:

```
Running pre-joinscripts hook(s): done
Configure 00ucs-school-app-version-check.inst done
Configure 01univention-ldap-server-init.inst done
Configure 02univention-directory-notifier.inst done
Configure 03univention-directory-listener.inst ## hangs
```
Created attachment 10944 [details] join.log
Created attachment 10945 [details] actualise.log
The end of the story:

```
Running pre-joinscripts hook(s): done
Configure 00ucs-school-app-version-check.inst done
Configure 01univention-ldap-server-init.inst done
Configure 02univention-directory-notifier.inst done
Configure 03univention-directory-listener.inst done

**************************************************************************
* Join failed!                                                           *
* Contact your system administrator                                      *
**************************************************************************
* Message: Please visit https://help.univention.com/t/8842 for common problems during the join and how to fix them -- FAILED: failed.ldif exists.
**************************************************************************
```
I tried running univention-join again, and after some initial haggling with systemd to finally get slapd started normally, the join works "much better", as the pre-joinscript stuff has already been done. Yet it finally fails again with

```
Configure 62ucs-school-singleserver.inst failed

**************************************************************************
* Join failed!                                                           *
* Contact your system administrator                                      *
**************************************************************************
* Message: Please visit https://help.univention.com/t/8842 for common problems during the join and how to fix them -- FAILED: 62ucs-school-singleserver.inst
**************************************************************************
```

and join.log shows

```
Object modified: cn=ucsBackup,cn=dc,cn=computers,dc=jtorres,dc=org
The object type of this object differs from the specified object type: The object cn=ucsBackup,cn=dc,cn=computers,dc=jtorres,dc=org is not a computers/domaincontroller_master.
62ucs-school-singleserver.inst:
**************************************************************************
* Join failed!                                                           *
* Contact your system administrator                                      *
**************************************************************************
* Message: Please visit https://help.univention.com/t/8842 for common problems during the join and how to fix them -- FAILED: 62ucs-school-singleserver.inst
**************************************************************************
```

This is true, the object type is computers/domaincontroller_backup, but that's the whole point of this exercise.
The customer has a singlemaster and a backupserver. Since UCS 5 the backupserver is not able to join.
A similar bug was fixed in 62ucs-school-multiserver.inst:

    if [[ "$server_role" = domaincontroller_master ]]; then
        ucsschoolRole=dc_master
    else
        ucsschoolRole=dc_backup
    fi
    univention-directory-manager "computers/$server_role" modify "$@" \
    ...

But in a singleserver environment, we need to determine which parts of the 62ucs-school-singleserver.inst script should be executed on a DC Backup.
Yes, we basically adjusted the 62ucs-school-singleserver.inst script to exit early on the joining DC Backup, before it starts to do things specific to the domaincontroller_master. But basically it just shows that there is an unresolved clash of concepts between the "ucs-school-singleserver" and the concept of joining a Backup Directory Node.
I changed the summary to make more explicit that this happens only in singleserver environments. Seems like a workaround was possible in the support ticket. Can we document the steps here in case they are helpful to fix the problem?
(In reply to Ingo Steuwer from comment #9)
> I changed the summary to make more explicit that this happens only in
> singleserver environments.
>
> Seems like a workaround was possible in the support ticket. Can we document
> the steps here in case they are helpful to fix the problem?

The fix was to edit the joinscript directly. We edited these lines to prevent "things" from happening on the backup:

----------
if [ "$server_role" != domaincontroller_master ]; then
    joinscript_save_current_version
    exit 0
fi
----------------
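To try the guard outside a real join, it can be wrapped in a function with a stubbed helper. This is only a sketch: `skip_on_non_primary` is a hypothetical name, and the `joinscript_save_current_version` below is a stand-in for illustration, not the real UCS helper.

```shell
#!/bin/sh
# Sketch of the early-exit guard, wrapped in a function for testing.
# skip_on_non_primary is a hypothetical name.

# Stub for illustration only; a real joinscript would use the helper
# provided by UCS instead of this stand-in.
joinscript_save_current_version() {
    echo "saved joinscript version"
}

# Returns 0 (after recording the joinscript version) when the role is
# anything other than domaincontroller_master, and 1 when the
# master-only steps should run.
skip_on_non_primary() {
    role="$1"
    if [ "$role" != domaincontroller_master ]; then
        joinscript_save_current_version
        return 0
    fi
    return 1
}
```

In a joinscript this would be used as `skip_on_non_primary "$server_role" && exit 0`, which has the same effect as the edited lines above: the script is marked as executed on the Backup Directory Node, and none of the master-specific steps run there.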