Bug 34422 - Fix misleading error message
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: Samba4
Version: UCS 3.2
Hardware: Other Linux
Importance: P5 normal
Target Milestone: UCS 3.2-2-errata
Assigned To: Felix Botner
QA Contact: Arvid Requate
Depends on: 33942
Blocks:
Reported: 2014-03-31 08:12 CEST by Janis Meybohm
Modified: 2014-07-10 13:35 CEST (History)
Description Janis Meybohm univentionstaff 2014-03-31 08:12:11 CEST
+++ This bug was initially created as a clone of Bug #33942 +++

There are several situations in which samba-tool domain join fails with a "Your filesystem or build does not support posix ACLs, which s3fs requires." traceback. In most of them, missing ACL support is not the cause of the error, so the message points customers in the wrong direction.

We should improve the try/except block or change the message.
---
Searching for dsServiceName in rootDSE failed: operations error at ../source4/dsdb/samdb/ldb_modules/rootdse.c:501
Failed to find our own NTDS Settings DN in the ldb!
ldb: module schema_load initialization failed : No such object
ldb: module rootdse initialization failed : No such object
ldb: module samba_dsdb initialization failed : No such object
ldb: Unable to load modules for /var/lib/samba/private/sam.ldb: (null)
samdb_connect failed
VFS connect failed!
ERROR(<class 'samba.provision.ProvisioningError'>): uncaught exception - ProvisioningError: Your filesystem or build does not support posix ACLs, which s3fs requires.  Try the mounting the filesystem with the 'acl' option.
  File "/usr/lib/python2.6/dist-packages/samba/netcmd/__init__.py", line 175, in _run
    return self.run(*args, **kwargs)
  File "/usr/lib/python2.6/dist-packages/samba/netcmd/domain.py", line 560, in run
    machinepass=machinepass, use_ntvfs=use_ntvfs, dns_backend=dns_backend)
  File "/usr/lib/python2.6/dist-packages/samba/join.py", line 1220, in join_DC
    ctx.do_join()
  File "/usr/lib/python2.6/dist-packages/samba/join.py", line 1101, in do_join
    ctx.join_provision()
  File "/usr/lib/python2.6/dist-packages/samba/join.py", line 752, in join_provision
    use_ntvfs=ctx.use_ntvfs, dns_backend=ctx.dns_backend)
  File "/usr/lib/python2.6/dist-packages/samba/provision/__init__.py", line 2052, in provision
    raise ProvisioningError("Your filesystem or build does not support posix ACLs, which s3fs requires.  Try the mounting the filesystem with the 'acl' option.")
Join failed - cleaning up
---
Comment 1 Janis Meybohm univentionstaff 2014-03-31 08:24:21 CEST
Ticket#: 2014011421000077
Ticket#: 2014031921008486
Ticket#: 2014032721004296
Comment 2 Felix Botner univentionstaff 2014-06-17 15:58:17 CEST
The join script first tries to join the domain. If that fails, it tries the Samba DCs one after another until a join succeeds.

But if the replication during the first join (to the domain) fails with "Failed to apply records" (unique index violation on objectGUID; I created two objects with the same objectSid -> ldbedit -H /var/lib/samba/private/sam.ldb.d/DC\=W2K12\,DC\=TEST.ldb), the second join (to the DC) always fails with "Your filesystem or build does not support posix ACLs, which".

After the first join has failed, Samba is in a horrible state, and trying to join another DC does not help.

(1)

We should check whether the domain/DC is reachable before the join, then join, and NOT continue if the join fails.

Unfortunately, the return value of "samba-tool domain join" is always 255 in case of an error:

-> samba-tool domain join ...; echo $?
 raise ProvisioningError("Your filesystem
 255

-> samba-tool domain join ...; echo $?
 raise Exception("Failed to find a writeable DC for
 255

But maybe we can check samba-tool domain info before the samba-tool domain join. If that fails, we check the (next) DC. If samba-tool domain join fails, we abort the join script.

-> samba-tool domain info w2k12.test; echo $?
...
Client site      : Default-First-Site-Name
0

-> samba-tool domain info master.w2k12.test; echo $?
...
Client site      : Default-First-Site-Name
0

-> samba-tool domain info master.w2k12.test; echo $?
ERROR: Invalid IP address 'master.w2k12.test'!
255

-> samba-tool domain info w2k12.test; echo $?
ERROR: Invalid IP address 'w2k12.test'!
255
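
The idea in (1) can be sketched roughly as follows. This is a minimal illustration, not the code from the actual join script: the names SAMBA_TOOL, pick_reachable_target and join_once are hypothetical, credential options are omitted, and passing the reachable host via --server is an assumption about how the join would be directed at it.

```shell
#!/bin/sh
# Sketch only: SAMBA_TOOL, pick_reachable_target and join_once are
# hypothetical names, not taken from the actual UCS join script.
: "${SAMBA_TOOL:=samba-tool}"

# Print the first candidate for which "samba-tool domain info" succeeds.
pick_reachable_target() {
    for candidate in "$@"; do
        if "$SAMBA_TOOL" domain info "$candidate" >/dev/null 2>&1; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: join_once DNSDOMAIN CANDIDATE...
# Probes the candidates first, joins exactly once, and never retries
# against another DC after a failed join.
join_once() {
    dnsdomain=$1
    shift
    if ! server=$(pick_reachable_target "$@"); then
        echo "no reachable domain or DC found, not joining" >&2
        return 1
    fi
    if ! "$SAMBA_TOOL" domain join "$dnsdomain" DC --server="$server"; then
        echo "join against $server failed, aborting" >&2
        return 1
    fi
}
```

Because the exit code of a failed join is always 255, the sketch does not try to tell the failure causes apart. It only makes sure that the probing happens before the join and that a failed join ends the run.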

(2)

If we are in this state (raise ProvisioningError("Your filesystem or build does...), a complete rejoin also fails. There is no way to get Samba going. I assume we have to clean up /var/lib/samba somehow. The join script does this, but only if
 ldbsearch -H /var/lib/samba/private/sam.ldb \
  'samAccountName=slave$' msDS-KeyVersionNumber
returns something. 

In my case I get

-> ldbsearch -H /var/lib/samba/private/sam.ldb 'samAccountName=slave$' msDS-KeyVersionNumber
Searching for dsServiceName in rootDSE failed: operations error at ../source4/dsdb/samdb/ldb_modules/rootdse.c:518
Failed to find our own NTDS Settings DN in the ldb!
schema_load_init: no schema head present: (skip schema loading)
module schema_load initialization failed : No such object
module rootdse initialization failed : No such object
module samba_dsdb initialization failed : No such object
Unable to load modules for /var/lib/samba/private/sam.ldb: (null)
Failed to connect to /var/lib/samba/private/sam.ldb - (null)

So there is no cleanup in this case, and the join fails. If I remove /var/lib/samba/private/* before running the join script, the join succeeds.

-> /etc/init.d/samba stop
-> rm -rf /var/lib/samba/private/*
-> /etc/init.d/samba start
Comment 3 Felix Botner univentionstaff 2014-06-25 16:54:00 CEST
* added test "samba-tool domain info" before join
* abort if join fails
* always clean up /var/lib/samba (cleanup_var_lib_samba)

YAML: 2014-06-17-univention-samba4.yaml
Comment 4 Arvid Requate univentionstaff 2014-07-02 18:54:05 CEST
Verified:
 * Code review Ok
 * Rejoin works, machine SID stays the same, "RID Set" stays the same, rIDNextRID is preserved and the re-created dns-account works.
 * Advisory Ok
Comment 5 Janek Walkenhorst univentionstaff 2014-07-10 13:35:22 CEST
http://errata.univention.de/ucs/3.2/142.html