Bug 31853 - listener plugin nfs-shares.py failes if server is configured with dhcp
listener plugin nfs-shares.py failes if server is configured with dhcp
Status: CLOSED FIXED
Product: UCS
Classification: Unclassified
Component: UCR
UCS 3.1
Other Linux
: P2 normal (vote)
: UCS 3.1-1-errata
Assigned To: Stefan Gohmann
Philipp Hahn
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-06-28 15:15 CEST by Ingo Steuwer
Modified: 2013-07-10 14:16 CEST (History)
2 users (show)

See Also:
What kind of report is it?: ---
What type of bug is this?: ---
Who will be affected by this bug?: ---
How will those affected feel about the bug?: ---
User Pain:
Enterprise Customer affected?:
School Customer affected?:
ISV affected?:
Waiting Support:
Flags outvoted (downgraded) after PO Review:
Ticket number:
Bug group (optional):
Max CVSS v3 score:


Attachments
DC Slave join.log (4.80 KB, text/x-log)
2013-07-05 15:46 CEST, Ingo Steuwer
Details
DC Slave ucr dump (17.91 KB, text/plain)
2013-07-05 15:47 CEST, Ingo Steuwer
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ingo Steuwer univentionstaff 2013-06-28 15:15:23 CEST
UCS 3.1-1 Errata130, seen on DC Backup and DC Slave
On a server configured with DHCP on eth0, the domain join fails during initializing the listener as the module "nfs-shares" throws a traceback. As the module is mandatory, there is no chance to join the server. 

Seen at different customers:
2013042321001617
2013012921000971

Workaround: move /usr/lib/univention-directory-listener/system/nfs-shares.py out of /usr/lib/univention-directory-listener/system/

i.e.:
mkdir /usr/lib/univention-directory-listener/system_deactivated
mv /usr/lib/univention-directory-listener/system/nfs-shares.py \
 /usr/lib/univention-directory-listener/system_deactivated
/etc/init.d/univention-directory-listener restart
Comment 1 Ingo Steuwer univentionstaff 2013-06-28 15:15:36 CEST
Seen on Memberserver also.
Comment 2 Kevin Dominik Korte univentionstaff 2013-06-28 15:21:12 CEST
Traceback:
27.06.13 13:16:32.398  DEBUG_INIT
27.06.13 13:16:32.408  CONFIG      ( ERROR   ) : Error on opening
"/etc/univention/base.conf"
27.06.13 13:16:32.411  CONFIG      ( ERROR   ) : Error on opening
"/etc/univention/base.conf"
27.06.13 13:16:33.072  LISTENER    ( ERROR   ) : import of
filename=/usr/lib/univention-directory-listener/system/nfs-shares.py failed
Traceback (most recent call last):
  File "/usr/lib/univention-directory-listener/system/nfs-shares.py", line 45, in <module>
    ip = interfaces.get_default_ip_address().ip
AttributeError: 'NoneType' object has no attribute 'ip'
27.06.13 13:16:33.073  LISTENER    ( ERROR   ) : import of
filename=/usr/lib/univention-directory-listener/system/nfs-shares.py failed in
module_import()
Comment 3 Stefan Gohmann univentionstaff 2013-06-28 19:55:24 CEST
Could you provide an ucr dump? Was the system installed with UCS 3.1-1?
Comment 4 Ingo Steuwer univentionstaff 2013-07-01 08:10:50 CEST
(In reply to Stefan Gohmann from comment #3)
> Could you provide an ucr dump? 
mhm, would take a few days; we've got no remote access.

> Was the system installed with UCS 3.1-1?
yes, 3.1-1 with auto-update to Errata 130 during installation.
Comment 5 Stefan Gohmann univentionstaff 2013-07-01 08:51:21 CEST
(In reply to Ingo Steuwer from comment #4)
> (In reply to Stefan Gohmann from comment #3)
> > Could you provide an ucr dump? 
> mhm, would take a few days; we've got no remote access.

The "problem" is, I'm not able to reproduce this issue.

nfs-handlers is configured correctly:

root@backup612:~# cat /var/lib/univention-directory-listener/handlers/nfs-shares ; echo
3
root@backup612:~# 

root@backup612:~# ucr search --brief ^interfaces/[a-z] | grep -v fallback
interfaces/eth0/address: 10.201.61.2
interfaces/eth0/broadcast: 10.201.255.255
interfaces/eth0/ipv6/acceptRA: false
interfaces/eth0/netmask: 255.255.0.0
interfaces/eth0/network: 10.201.0.0
interfaces/eth0/type: dhcp
interfaces/handler: ifplugd
interfaces/primary: <empty>
interfaces/restart/auto: true
root@backup612:~# 


Do you have more than one network interface?
Comment 6 Ingo Steuwer univentionstaff 2013-07-05 15:46:50 CEST
Created attachment 5305 [details]
DC Slave join.log
Comment 7 Ingo Steuwer univentionstaff 2013-07-05 15:47:07 CEST
Created attachment 5306 [details]
DC Slave ucr dump
Comment 8 Ingo Steuwer univentionstaff 2013-07-05 15:49:28 CEST
I reproduced the scenario of 2013042321001617, steps:

Release: UCS 3.1-1 Errata 139

On UCS DC Master with DHCP:
- register a new DC Slave (including IP/MAC/DHCP)
- define a proper gateway (for getting UCS Updates)

On a new machine:
- Installation from UCS 3.1-1 amd64 DVD
- Language: German
- Role: DC Slave
- Network: "F5" for a Lease, aktivated DHCP
-> Proxy??
- don't join at the end of installation
- deselect all packages
- actualise system at end of installation
-> reboot
- try to join the domain using the UMC module (login as "root" at the UMC of the DC Slave)

Attached you'll find both "ucr dump" and join.log of the DC Slave.

Attention: there are a few more errors/traceback in the join log:

------------------
Configure 01univention-ldap-server-init.inst Sat Jun 15 06:28:38 CEST 2013
File: /var/lib/univention-ldap/ldap/DB_CONFIG
51bbed77 /etc/ldap/slapd.conf: line 45: <suffix> invalid DN 21 (Invalid syntax)
slapadd: bad configuration file!
E: your request could not be fulfilled
try `univention-config-registry --help` for more information
N
------------------

------------------
15.06.13 06:28:42.109  LISTENER    ( ERROR   ) : import of filename=/usr/lib/univention-directory-listener/system/keytab.py failed
Traceback (most recent call last):
  File "/usr/lib/univention-directory-listener/system/keytab.py", line 45, in <module>
    filter='(&(objectClass=krb5Principal)(objectClass=krb5KDCEntry)(krb5KeyVersionNumber=*)(|(krb5PrincipalName=host/%s@%s)(krb5PrincipalName=ldap/%s@%s)(krb5PrincipalName=host/%s.%s@%s)(krb5PrincipalName=ldap/%s.%s@%s)(krb5PrincipalName=host/%s.%s@%s)(krb5PrincipalName=ldap/%s.%s@%s)))' % (hostname, realm, hostname, realm, hostname, domainname, realm, hostname, domainname, realm, hostname, listener.baseConfig['ldap/base'].replace('dc=','').replace(',','.'), realm, hostname, listener.baseConfig['ldap/base'].replace('dc=','').replace(',','.'), realm)
AttributeError: 'NoneType' object has no attribute 'replace'
1
------------------

-> the listener plugin failures might be caused by an earlier problem
Comment 9 Stefan Gohmann univentionstaff 2013-07-05 16:38:02 CEST
I'm able to reproduce this if I try to join via the UMC module:

05.07.13 10:23:53.996  DEBUG_INIT
05.07.13 10:23:54.020  CONFIG      ( ERROR   ) : Error on opening "/etc/univention/base.conf"
05.07.13 10:23:54.024  CONFIG      ( ERROR   ) : Error on opening "/etc/univention/base.conf"
05.07.13 10:23:54.743  LISTENER    ( ERROR   ) : import of filename=/usr/lib/univention-directory-listener/system/keytab.py failed
Traceback (most recent call last):
  File "/usr/lib/univention-directory-listener/system/keytab.py", line 45, in <module>
    filter='(&(objectClass=krb5Principal)(objectClass=krb5KDCEntry)(krb5KeyVersionNumber=*)(|(krb5PrincipalName=host/%s@%s)(krb5PrincipalName=ldap/%s@%s)(krb5PrincipalName=host/%s.%s@%s)(krb5PrincipalName=ldap/%s.%s@%s)(krb5PrincipalName=host/%s.%s@%s)(krb5PrincipalName=ldap/%s.%s@%s)))' % (hostname, realm, hostname, realm, hostname, domainname, realm, hostname, domainname, realm, hostname, listener.baseConfig['ldap/base'].replace('dc=','').replace(',','.'), realm, hostname, listener.baseConfig['ldap/base'].replace('dc=','').replace(',','.'), realm)
AttributeError: 'NoneType' object has no attribute 'replace'

The permission of base.conf is set to 600 instead of 744. UMC is started with a different umask:
root@slave113:~# ls -la /etc/univention/base.conf
-rw------- 1 root root 18407 Jul  5 10:23 /etc/univention/base.conf
root@slave113:~# 

Due to Bug #31725 the base.conf file is recreated during an ucr set. UCR should copy the permissions.
Comment 10 Stefan Gohmann univentionstaff 2013-07-05 17:19:05 CEST
Fixed, the file permissions are now copied from the original file.

3.1-1-errata fix: r42093
3.2 fix: r42094
YAML: 42095
Changelog: 42096
Comment 11 Philipp Hahn univentionstaff 2013-07-08 15:04:51 CEST
FAIL: svn4209[34]:

> Modified: branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py (42093 => 42094)
...
> +                       except:

Please only catch OSError (or the super-class "EnvironmentError" if you can't remember if it is "OSError" or "IOError"), as using that catch-all-style hid hard-to-debug errors in the past.

> +                               mode = 00744

Why is it executable for the owner? → 0644

>                        # open temporary file for writing
>                        reg_file = open(temp_filename, 'w')

FYI: There are still more issues:
1.1 Depending in the umask the file might be word-writable for short time. This could be solved by using the low-level function os.open() directly:
  reg_file = os.fdopen(os.open(temp_filename, os.O_WRONLY | os.O_CREAT, mode), 'w')
Note that the later chmod() is still required, because the umask still gets applied and might create a file with less permissions.

1.2 The fstat() call also adds another race condition: with umask=0000 the file would get perm=0777, which gets copied by a concurrent UCR process, which would than finally leave the file with that permission world-writable.

2. Two concurrent processes would write to the same file; if the process finishing last (p2) writes less data than the process finishing first (p1), garbage would remain at the end of the file.
To prevent this the file must be truncated after writing:
  reg_file.truncate()
Since p1 can't detect that situation, it would be best if p2 would write it's consistent state to the master file and rename the possible broken file for investigation.
Generally I think it would be best to just create a exclusive file for each UCR writer and rename that file to the master file. Then the file is guaranteed to be consistent with the drawback, that any concurrent write would be lost, which could be recovered from /var/log/univention/config_registry.replog if required.


Otherwise looks sane.


(OK): 2013-07-05-univention-config-registry.yaml
  Needs to be updated
  s/errata 130/erratum 130/

OK: changelog-3.2.tex


UNTESTED and just to show what I would prefer:
diff --git a/branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py b/branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py
index df8253d..f3f91e2 100644
--- a/branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py
+++ b/branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py
@@ -317,19 +317,23 @@ class _ConfigRegistry(dict):
 
 	def __save_file(self, filename):
 		"""Save sub registry to file."""
-		temp_filename = '%s.temp' % filename
+		temp_filename = '%s_%s_%s' % (filename, time.time(), os.getpid())
 		try:
 			try:
 				file_stat = os.stat(filename)
 				mode = file_stat.st_mode
 				user = file_stat.st_uid
 				group = file_stat.st_gid
-			except:
-				mode = 00744
+			except OSError:
+				mode = 00644
 				user = 0
 				group = 0
 			# open temporary file for writing
-			reg_file = open(temp_filename, 'w')
+			old_umask = os.umask(0)
+			try:
+				reg_file = os.fdopen(os.open(temp_filename, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode), 'w')
+			finally:
+				os.umask(old_umask)
 			# write data to file
 			reg_file.write('# univention_ base.conf\n\n')
 			reg_file.write(self.__str__())
@@ -338,21 +342,11 @@ class _ConfigRegistry(dict):
 			os.fsync(reg_file.fileno())
 			# close fd
 			reg_file.close()
-			try:
-				os.chmod(temp_filename, mode)
-				os.chown(temp_filename, user, group)
-				os.rename(temp_filename, filename)
-			except OSError:
-				# In this case the temp file created above in this
-				# function was already moved by a concurrent UCR
-				# operation. Dump the current state to a backup file
-				temp_filename = '%s.concurrent_%s' % (filename, time.time())
-				reg_file = open(temp_filename, 'w')
-				reg_file.write('# univention_ base.conf\n\n')
-				reg_file.write(self.__str__())
-				reg_file.close()
-				pass
-				
+			# fix permission
+			os.chmod(temp_filename, mode)
+			os.chown(temp_filename, user, group)
+			# atomically update file
+			os.rename(temp_filename, filename)
 		except EnvironmentError, ex:
 			# suppress certain errors
 			if ex.errno != errno.EACCES:
Comment 12 Stefan Gohmann univentionstaff 2013-07-08 16:37:15 CEST
(In reply to Philipp Hahn from comment #11)
> FAIL: svn4209[34]:
> 
> > Modified: branches/ucs-3.2/ucs-3.2-0/base/univention-config-registry/python/univention/config_registry/backend.py (42093 => 42094)
> ...
> > +                       except:
> 
> Please only catch OSError (or the super-class "EnvironmentError" if you
> can't remember if it is "OSError" or "IOError"), as using that
> catch-all-style hid hard-to-debug errors in the past.

Yes I know, but it this case I want to be sure that an unknown exception never occurs. I want to use the defaults in any exception case.

> > +                               mode = 00744
> 
> Why is it executable for the owner? → 0644

Oh, that was a typo.
 
> >                        # open temporary file for writing
> >                        reg_file = open(temp_filename, 'w')
> 
> FYI: There are still more issues:
> 1.1 Depending in the umask the file might be word-writable for short time.
> This could be solved by using the low-level function os.open() directly:
>   reg_file = os.fdopen(os.open(temp_filename, os.O_WRONLY | os.O_CREAT,
> mode), 'w')
> Note that the later chmod() is still required, because the umask still gets
> applied and might create a file with less permissions.
>
> 1.2 The fstat() call also adds another race condition: with umask=0000 the
> file would get perm=0777, which gets copied by a concurrent UCR process,
> which would than finally leave the file with that permission world-writable.

I think if you use such a umask as root you have other problems ...

> 2. Two concurrent processes would write to the same file; if the process
> finishing last (p2) writes less data than the process finishing first (p1),
> garbage would remain at the end of the file.
> To prevent this the file must be truncated after writing:
>   reg_file.truncate()
> Since p1 can't detect that situation, it would be best if p2 would write
> it's consistent state to the master file and rename the possible broken file
> for investigation.
> Generally I think it would be best to just create a exclusive file for each
> UCR writer and rename that file to the master file. Then the file is
> guaranteed to be consistent with the drawback, that any concurrent write
> would be lost, which could be recovered from
> /var/log/univention/config_registry.replog if required.

OK, but that is not the scope of this bug.

> 
> Otherwise looks sane.
> 
> 
> (OK): 2013-07-05-univention-config-registry.yaml
>   Needs to be updated
>   s/errata 130/erratum 130/

Fixed.

> OK: changelog-3.2.tex
Comment 13 Philipp Hahn univentionstaff 2013-07-09 11:10:51 CEST
OK: fails with 8.0.7-5.430.201306171642
OK: fixed by 8.0.7-7.433.201307081613
OK: fixed by 9.0.2-1.434.201307081613
OK: update 3.1-1 → errata → 3.2
OK: update 3.1-1 → 3.2
  aptitude -u install '?installed?source-package(univention-config-registry)'
OK: ChangeLog-3.2.tex
OK: 3.1-1-errata 2013-07-05-univention-config-registry.yaml

FYI: users need to fix the permissions themselves:
  chmod 644 /etc/univention/base*
Comment 14 Moritz Muehlenhoff univentionstaff 2013-07-10 14:16:32 CEST
http://errata.univention.de/ucs/3.1/140.html