Bug 40262 - Join of previously deleted UCS@school Slave fails
Status: CLOSED FIXED
Product: UCS@school
Classification: Unclassified
Component: General
Version: UCS@school 4.1
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS@school 4.1 R2 vXXX
Assigned To: Daniel Tröder
QA Contact: Florian Best
Keywords: interim-2
Depends on: 41753
Blocks:
Reported: 2015-12-15 20:01 CET by Michael Grandjean
Modified: 2016-10-21 14:50 CEST (History)
What kind of report is it?: ---
What type of bug is this?: 4: Minor Usability: Impairs usability in secondary scenarios
Who will be affected by this bug?: 2: Will only affect a few installed domains
How will those affected feel about the bug?: 3: A User would likely not purchase the product
School Customer affected?: Yes


Description Michael Grandjean univentionstaff 2015-12-15 20:01:25 CET
This was recognized during a UCS@school Workshop:

It is not possible to re-join a UCS@school Slave whose machine account was deleted.

Error message from 00ucs-school-slave-check-ou.inst:

> ERROR: The computer object for this host isn't inside a school OU and therefore at the wrong place.

How to reproduce:
UCS@school Multi-Server environment
-> Delete an already joined UCS@school Slave computer object via UMC
-> run 'univention-join' on the UCS@school Slave

Expected behaviour:
The join process should recognize the UCS@school setup and move the slave to the correct OU/container automatically.
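
The position check behind the error message above can be approximated with a plain string match on the host DN. This is a minimal sketch under assumptions, not the code of 00ucs-school-slave-check-ou.inst: the function name dn_is_in_school_ou is hypothetical, the container layout cn=dc,cn=server,cn=computers,ou=<OU> is assumed to be the expected position, and dc=example,dc=com is a placeholder LDAP base.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the DN lies in the DC container of a school OU.
# Assumed layout: cn=<host>,cn=dc,cn=server,cn=computers,ou=<OU>,<ldap_base>
# Note: this is a case-sensitive string match, while LDAP DNs are case-insensitive.
dn_is_in_school_ou() {
    case "$1" in
        cn=*,cn=dc,cn=server,cn=computers,ou=*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example DNs with a placeholder host name and LDAP base.
for dn in \
    "cn=slave1,cn=dc,cn=server,cn=computers,ou=School,dc=example,dc=com" \
    "cn=slave1,cn=dc,cn=computers,dc=example,dc=com"
do
    if dn_is_in_school_ou "$dn"; then
        echo "inside a school OU:  $dn"
    else
        echo "outside a school OU: $dn"
    fi
done
```

The first DN is reported as inside a school OU, the second as outside.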


BTW: It's not sufficient to simply move the computer object to cn=dc,cn=server,cn=computers,ou=School,... 
I tried that, but then I get:

> ERROR: This host has no permission for the OU he is stored in.
Comment 1 Michael Grandjean univentionstaff 2016-01-17 20:11:31 CET
Seems like the group membership for OU${OU}-DC-Edukativnetz or OU${OU}-DC-Verwaltungsnetz is missing in this case.
Comment 2 Sönke Schwardt-Krummrich univentionstaff 2016-07-07 22:22:10 CEST
(In reply to Michael Grandjean from comment #0)
> Expected behaviour:
> The join process should recognize the UCS@school setup and move the slave to
> the correct OU/container automatically.

It's not that easy. The correct position and access rights are determined via the LDAP groups "OU${OU}-DC-Edukativnetz" or "OU${OU}-DC-Verwaltungsnetz". So if the group membership is missing, it's hard to determine the correct OU.
Additionally, the correct file server has to be configured on the OU object.
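
To illustrate why the group membership matters: the OU name is encoded in the group name, so it can only be recovered while the membership still exists. A minimal sketch, assuming the naming scheme given above; the function name ou_from_dc_group and the example OU "School" are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: derive the school OU from a DC group name of the form
# OU<ou>-DC-Edukativnetz or OU<ou>-DC-Verwaltungsnetz.
# Prints the OU name; returns 1 if the name does not follow that scheme.
ou_from_dc_group() {
    case "$1" in
        OU?*-DC-Edukativnetz)    ou=${1%-DC-Edukativnetz} ;;
        OU?*-DC-Verwaltungsnetz) ou=${1%-DC-Verwaltungsnetz} ;;
        *) return 1 ;;
    esac
    # Strip the leading "OU" prefix that precedes the OU name.
    printf '%s\n' "${ou#OU}"
}

ou_from_dc_group "OUSchool-DC-Edukativnetz"   # prints: School
```

A global group such as DC-Edukativnetz (without the OU prefix) is rejected, because it carries no OU information.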

Suggestion:
Print detailed info on how to fix this issue. If I recall correctly, the following commands on DC master would fix it (has to be tested):
# cd /usr/share/ucs-school-import/scripts/
# ./create_ou $school $slave_name
# ./move_domaincontroller_to_ou --dcname=$slave_name --ou=$school
#
Comment 3 Daniel Tröder univentionstaff 2016-08-29 16:41:16 CEST
It is not clear to me what result is expected of this bug.
I have successfully rejoined a school slave repeatedly in the following way:

Master: sch-m-71
Slave:  sch-s-74

--- master ---
root@sch-m-71:~# OU=schulezwei           <-- [1]
root@sch-m-71:~# SLAVENAME=sch-s-74      <-- [2]
root@sch-m-71:~# eval $(ucr shell)
root@sch-m-71:~# udm computers/domaincontroller_slave remove --dn=cn=$SLAVENAME,cn=dc,cn=server,cn=computers,ou=$OU,$ldap_base

--- slave ---
root@sch-s-74:~# reboot

--- master ---
root@sch-m-71:~# udm computers/domaincontroller_slave create --set name=$SLAVENAME --position cn=dc,cn=server,cn=computers,ou=$OU,$ldap_base --append groups=cn=OU$OU-DC-Edukativnetz,cn=ucsschool,cn=groups,$ldap_base
root@sch-m-71:~# udm container/ou modify --dn=ou=$OU,$ldap_base --set ucsschoolHomeShareFileServer=cn=$SLAVENAME,cn=dc,cn=server,cn=computers,ou=$OU,$ldap_base --set ucsschoolClassShareFileServer=cn=$SLAVENAME,cn=dc,cn=server,cn=computers,ou=$OU,$ldap_base

--- slave ---
# replace IP with that of the master
root@sch-s-74:~# ucr set nameserver1=10.200.3.71 nameserver2= nameserver3=
root@sch-s-74:~# univention-join -dcname $(ucr get ldap/master) -dcaccount Administrator

------------------------------------------------------------

Once, univention-join failed on the slave _before_ 00ucs-school-slave-check-ou.inst, at stage "Join Computer Account:", with:
---
E: failed to create DC Slave (1) [LDAP Error: Type or value exists: modify/add: memberUid: value #0 already exists]
---
Removing the DNS/DHCP users on the master helped:
root@sch-m-71:~# udm users/user remove --dn=uid=dns-$SLAVENAME,cn=users,$ldap_base
root@sch-m-71:~# udm users/user remove --dn=uid=http-proxy-$SLAVENAME,cn=users,$ldap_base

------------------------------------------------------------

[1] The user can obtain this from UMC or with
# udm container/ou list | egrep '^DN|name|displayName'
[2] The user can obtain this by logging into the slave and running
# hostname -s
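
The DNs in the udm calls above follow a fixed pattern, so they can be built once and reused. A minimal sketch of two helpers; slave_dn and edu_group_dn are hypothetical names, and dc=example,dc=com stands in for the value of $ldap_base:

```shell
#!/bin/sh
# Hypothetical helper: DN of a school slave, following the layout used above.
# $1 = slave host name, $2 = school OU, $3 = LDAP base
slave_dn() {
    printf 'cn=%s,cn=dc,cn=server,cn=computers,ou=%s,%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: DN of the OU's Edukativnetz DC group.
# $1 = school OU, $2 = LDAP base
edu_group_dn() {
    printf 'cn=OU%s-DC-Edukativnetz,cn=ucsschool,cn=groups,%s\n' "$1" "$2"
}

# Example values from this comment; the LDAP base is a placeholder.
DN=$(slave_dn sch-s-74 schulezwei "dc=example,dc=com")
GROUP=$(edu_group_dn schulezwei "dc=example,dc=com")
echo "$DN"
echo "$GROUP"
```

The resulting values can then be passed, in double quotes, to --dn, --position and --append groups= in the commands above.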
Comment 4 Sönke Schwardt-Krummrich univentionstaff 2016-09-06 14:23:02 CEST
As discussed, please test the suggestion:
# cd /usr/share/ucs-school-import/scripts/
# ./create_ou $school $slave_name
# ./move_domaincontroller_to_ou --dcname=$slave_name --ou=$school
#

If this works, please add an appropriate error message to the join script with hints on how to fix the error condition.
Comment 5 Daniel Tröder univentionstaff 2016-09-07 11:41:06 CEST
72349: added a help message, if rejoin of school slave fails
72350: advisory
Comment 6 Florian Best univentionstaff 2016-09-21 19:47:45 CEST
REOPEN: Please quote the argument, otherwise the message says:

1) Remove the slaves computer account:
   # udm computers/domaincontroller_slave remove --dn=cn=mein host,cn=dc,cn=computers,dc=mei ne,dc=bas is
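
One way to print such a hint so that it survives copy and paste is to wrap the DN in single quotes. This is a sketch only and not necessarily how the join script does it; sh_quote and print_remove_hint are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: wrap a value in single quotes for safe reuse in a shell.
# Embedded single quotes are rewritten as '\'' so the result stays one word.
sh_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Hypothetical helper: print the removal hint with the DN quoted.
print_remove_hint() {
    printf '   # udm computers/domaincontroller_slave remove --dn=%s\n' "$(sh_quote "$1")"
}

print_remove_hint "cn=mein host,cn=dc,cn=computers,dc=mei ne,dc=bas is"
# prints:
#    # udm computers/domaincontroller_slave remove --dn='cn=mein host,cn=dc,cn=computers,dc=mei ne,dc=bas is'
```

Without the quotes, a shell would split the DN at every space and pass only the first fragment to --dn.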

REOPEN: the instructions don't work. The join fails after 10 minutes.
The log is full of error messages with:

Could not chdir to home directory /dev/null: Not a directory
scp: /etc/univention/ssl/xen8.school.local: Permission denied

(xen7 is the DC Master, xen8 is the DC Slave):
The problems are probably the SSL permissions:
root@xen7:~# ls -l /etc/univention/ssl/xen8.school.local
insgesamt 20
-rw-r----- 1 xen8$ DC Backup Hosts 5396 Sep 21 19:37 cert.pem
-rw-r----- 1 xen8$ DC Backup Hosts 2749 Sep 21 19:37 openssl.cnf
-rw-r----- 1 xen8$ DC Backup Hosts 1675 Sep 21 19:37 private.key
-rw-r----- 1 xen8$ DC Backup Hosts 1273 Sep 21 19:37 req.pem
root@xen7:~# id xen8$
uid=2008(xen8$) gid=5006(DC Slave Hosts) Gruppen=5006(DC Slave Hosts),5007(Computers),5012(DC-Edukativnetz),5014(OUoldschool-DC-Edukativnetz),5025(Authenticated Users)
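
The listing above shows the files, but not the permissions of the directory itself, which is the path scp complains about. A small diagnostic sketch that prints numeric owner, group and mode for the directory and its entries; it assumes GNU stat (coreutils), and show_numeric_perms is a hypothetical name. Numeric IDs avoid any name lookup, which may help to separate ownership problems from lookup problems.

```shell
#!/bin/sh
# Hypothetical helper: print "<uid>:<gid> <octal mode> <path>" for a directory
# and for each entry directly inside it. Requires GNU stat for -c.
show_numeric_perms() {
    # %u/%g = numeric owner/group, %a = octal mode, %n = file name
    stat -c '%u:%g %a %n' "$1" || return 1
    for f in "$1"/*; do
        # Skip the unexpanded pattern if the directory is empty.
        if [ -e "$f" ]; then
            stat -c '%u:%g %a %n' "$f"
        fi
    done
    return 0
}

# Example call on the master (path taken from the listing above):
# show_numeric_perms /etc/univention/ssl/xen8.school.local
```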

~OK: The message is not translated, but I guess univention-join is English everywhere.
YAML: OK
Comment 7 Florian Best univentionstaff 2016-09-21 19:51:25 CEST
(In reply to Florian Best from comment #6)
> REOPEN: the instructions don't work. The join fails after 10 minutes.
> The log is full of error messages with:
The 10 minutes were a lie. univention-join never stops!
→ Bug #30005
Comment 8 Daniel Tröder univentionstaff 2016-09-22 09:21:39 CEST
(In reply to Florian Best from comment #6)
> REOPEN: Please quote the argument, otherwise the message says:
>
> 1) Remove the slaves computer account:
>    # udm computers/domaincontroller_slave remove --dn=cn=mein
> host,cn=dc,cn=computers,dc=mei ne,dc=bas is
Done: r72736
 
> REOPEN: the instructions don't work. The join fails after 10 minutes.
> The log is full of error messages with:
> 
> Could not chdir to home directory /dev/null: Not a directory
> scp: /etc/univention/ssl/xen8.school.local: Permission denied
> 
> (xen7 is the DC Master, xen8 is the DC Slave):
> The problems are probably the SSL permissions:
> root@xen7:~# ls -l /etc/univention/ssl/xen8.school.local
> insgesamt 20
> -rw-r----- 1 xen8$ DC Backup Hosts 5396 Sep 21 19:37 cert.pem
> -rw-r----- 1 xen8$ DC Backup Hosts 2749 Sep 21 19:37 openssl.cnf
> -rw-r----- 1 xen8$ DC Backup Hosts 1675 Sep 21 19:37 private.key
> -rw-r----- 1 xen8$ DC Backup Hosts 1273 Sep 21 19:37 req.pem
> root@xen7:~# id xen8$
> uid=2008(xen8$) gid=5006(DC Slave Hosts) Gruppen=5006(DC Slave
> Hosts),5007(Computers),5012(DC-Edukativnetz),5014(OUoldschool-DC-
> Edukativnetz),5025(Authenticated Users)

I don't think the problem is with /etc/univention/ssl/<slave>, but with "home directory /dev/null".

Please retry: reboot the slave before rejoining, log in as root and use the master's "Administrator" account to rejoin.
Comment 9 Sönke Schwardt-Krummrich univentionstaff 2016-09-22 11:00:26 CEST
(In reply to Daniel Tröder from comment #8)
> (In reply to Florian Best from comment #6)
> > REOPEN: Please quote the argument, otherwise the message says:
> >
> > 1) Remove the slaves computer account:
> >    # udm computers/domaincontroller_slave remove --dn=cn=mein
> > host,cn=dc,cn=computers,dc=mei ne,dc=bas is
> Done: r72736
>  
> > REOPEN: the instructions don't work. The join fails after 10 minutes.
> > The log is full of error messages with:
> > 
> > Could not chdir to home directory /dev/null: Not a directory
> > scp: /etc/univention/ssl/xen8.school.local: Permission denied
> > 
> > (xen7 is the DC Master, xen8 is the DC Slave):
> > The problems are probably the SSL permissions:
> > root@xen7:~# ls -l /etc/univention/ssl/xen8.school.local
> > insgesamt 20
> > -rw-r----- 1 xen8$ DC Backup Hosts 5396 Sep 21 19:37 cert.pem
> > -rw-r----- 1 xen8$ DC Backup Hosts 2749 Sep 21 19:37 openssl.cnf
> > -rw-r----- 1 xen8$ DC Backup Hosts 1675 Sep 21 19:37 private.key
> > -rw-r----- 1 xen8$ DC Backup Hosts 1273 Sep 21 19:37 req.pem
> > root@xen7:~# id xen8$
> > uid=2008(xen8$) gid=5006(DC Slave Hosts) Gruppen=5006(DC Slave
> > Hosts),5007(Computers),5012(DC-Edukativnetz),5014(OUoldschool-DC-
> > Edukativnetz),5025(Authenticated Users)
> 
> I don't think the problem is with /etc/univention/ssl/<slave>, but with
> "home directory /dev/null".

The remark about /dev/null is only a warning. The "Permission denied" is the problem. Since scp copies the certificate from the master, the cause has to be looked for on the DC master. Maybe a timing issue?
 
> Please retry: reboot the slave before rejoining, login as root and use the
> masters "Administrator" account to rejoin.

I don't think that a reboot of the slave will fix anything (but it takes longer ==> timing issue?).
Comment 10 Florian Best univentionstaff 2016-09-22 12:00:24 CEST
It worked now.
For that specific univention-join problem we have Bug #30005.
Comment 11 Sönke Schwardt-Krummrich univentionstaff 2016-10-04 13:24:47 CEST
UCS@school 4.1 R2 v5 has been released.

http://docs.software-univention.de/changelog-ucsschool-4.1R2v5-de.html

If this error occurs again, please clone this bug.
Comment 12 Florian Best univentionstaff 2016-10-21 14:50:16 CEST
(In reply to Florian Best from comment #7)
> (In reply to Florian Best from comment #6)
> > REOPEN: the instructions don't work. The join fails after 10 minutes.
> > The log is full of error messages with:
> The 10 minutes were a lie. univention-join never stops!
> → Bug #30005

The reason for this is that the NSCD uid cache is not removed! Bug #31926
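
As a possible manual workaround, the NSCD caches for users and groups can be invalidated with nscd -i. This is a sketch under assumptions: this report does not state on which host the stale cache sits, and whether this alone resolves Bug #30005 is not confirmed here. The function name and the DRY_RUN switch are hypothetical; by default the commands are only printed.

```shell
#!/bin/sh
# Hypothetical helper: invalidate the NSCD caches for the passwd and group tables.
# With DRY_RUN=1 (the default in this sketch) the commands are printed, not run.
invalidate_nscd_caches() {
    for table in passwd group; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "nscd -i $table"
        else
            nscd -i "$table"
        fi
    done
}

invalidate_nscd_caches
# prints:
# nscd -i passwd
# nscd -i group
```

Running it as root with DRY_RUN=0 executes the two commands instead of printing them.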