Bug 22479 – Translation test does not work with multi-line strings


Summary:	Translation test does not work with multi-line strings

Status:	CLOSED FIXED

Product:	UCS
Classification:	Unclassified
Component:	ucslint
Version:	UCS 3.0
Hardware:	All Linux

Importance:	P5 normal
Target Milestone:	UCS 3.1
Assigned To:	Philipp Hahn
QA Contact:	Lukas Walter

URL:
Keywords:	interim-2

Depends on:
Blocks:

Reported:	2011-05-11 08:57 CEST by Philipp Hahn
Modified:	2012-12-12 21:09 CET
CC List:	2 users

Flags:	hahn: Patch_Available+

Description Philipp Hahn

2011-05-11 08:57:45 CEST

--- ucslint/0008-Translations.py	(Revision 24046)
+++ ucslint/0008-Translations.py	(Revision 24044)
@@ -4,7 +4,8 @@
 	import univention.ucslint.base as uub
 except:
 	import ucslint.base as uub
-import re, os
+import re
+import os
 
 # 1) check if translation strings are correct; detect something like  _('foo %s bar' % var)  ==> _('foo %s bar') % var
 # 2) check if all translation strings are translated in de.po file
@@ -36,14 +37,25 @@
 			print "ERROR: directory %s does not exist!" % path
 			return
 
-		regEx1 = re.compile("[\\(\\[\\{\s,:]_\\(\s*'[^']+'\s*%", re.DOTALL)
-		regEx2 = re.compile('[\\(\\[\\{\s,:]_\\(\s*"[^"]+"\s*%', re.DOTALL)
+		regEx1 = re.compile("""^(?:[^'"#\n]| # non string, non comment prefix
+				(?: # matched strings
+					('''|""\").*?(?<!\\\\)\\1
+					|'(?:[^'\n]|\\\\')*?'
+					|"(?:[^"\n]|\\\\")*?"
+				))*
+				[([{\s,:](_\\(\s* # translation
+				(?: # matched strings
+					('''|""\").*?(?<!\\\\)\\3
+					|'(?:[^'\n]|\\\\')*?'
+					|"(?:[^"\n]|\\\\")*?"
+				)\s*%\s*	# substitution
+				(?:[^)]+\\))?)""", re.DOTALL | re.MULTILINE | re.VERBOSE)
 
 		py_files = []
 		po_files = []
 		for dirpath, dirnames, filenames in os.walk( path ):
-			if '/.svn/' in dirpath and dirpath.endswith('/.svn'):   # ignore svn files
-				continue
+			try: dirnames.remove('.svn') # prune svn directory
+			except ValueError: pass
 			for fn in filenames:
 				if fn.endswith('.py'):
 					py_files.append( os.path.join( dirpath, fn ) )
@@ -52,12 +64,13 @@
 
 		for fn in py_files:
 			try:
-				content = open(fn, 'r').read()
-			except:
-				self.addmsg( '0008-2', 'failed to open and read file %s' % fn )
+				with open(fn, 'r') as f:
+					content = f.read()
+			except IOError:
+				self.addmsg( '0008-2', 'failed to open and read file', fn )
 				continue
 			self.debug('testing %s' % fn)
-			for regex in (regEx1, regEx2):
+			for regex in (regEx1,):
 				flen = len(content)
 				pos = 0
 				while pos < flen:
@@ -67,7 +80,7 @@
 					else:
 						line = content.count('\n', 0, match.start()) + 1
 						pos = match.end()
-						self.addmsg( '0008-1', '%s contains construct like _("foo %%s bar" %% var) in line %d' % (fn, line) )
+						self.addmsg( '0008-1', 'substitutes before translation: %s' % match.group(2), fn, line)
 
 		regEx1 = re.compile('\n#.*?fuzzy')
 		regEx2 = re.compile('msgstr ""\n\n', re.DOTALL)
@@ -77,18 +90,18 @@
 			try:
 				content = open(fn, 'r').read()
 			except:
-				self.addmsg( '0008-2', 'failed to open and read file %s' % fn )
+				self.addmsg( '0008-2', 'failed to open and read file', fn )
 				continue
 
 			match = regExCharset.search( content )
 			if not match:
-				self.addmsg( '0008-5', 'cannot find charset definition in %s' % fn )
+				self.addmsg( '0008-5', 'cannot find charset definition', fn )
 			elif not match.group(1).lower() in ('utf-8'):
-				self.addmsg( '0008-6', 'invalid charset (%s) defined in %s' % (match.group(1), fn) )
+				self.addmsg( '0008-6', 'invalid charset (%s) defined' % match.group(1), fn )
 
 			self.debug('testing %s' % fn)
-			for regex, errid, errtxt in [ (regEx1, '0008-3', '%s contains "fuzzy" in line %d'),
-										  (regEx2, '0008-4', '%s contains empty msgstr in line %d') ]:
+			for regex, errid, errtxt in [ (regEx1, '0008-3', 'contains "fuzzy"'),
+										  (regEx2, '0008-4', 'contains empty msgstr') ]:
 				flen = len(content)
 				pos = 0
 				while pos < flen:
@@ -99,5 +112,5 @@
 						# match.start() + 1 ==> avoid wrong line numbers because regEx1 starts with \n
 						line = content.count('\n', 0, match.start() + 1 ) + 1
 						pos = match.end()
-						self.addmsg( errid, errtxt % (fn, line) )
+						self.addmsg( errid, errtxt, fn, line )
 
--- testframework/0008-1-3/src/irgendwas.py	(Revision 24046)
+++ testframework/0008-1-3/src/irgendwas.py	(Revision 24044)
@@ -1,11 +1,19 @@
-import foo
-import bar
-
+#!/usr/bin/python
+_ = lambda s: s
 def main():
 	print 'Boing'
 	print _('Dieser Test ist ok')
 	print _('Hier lieg auch %d Problem vor') % 0
 	x = 'hier'
 	print _('Aber %s knallts' % x)
+	# _('Im Kommentar %s aber nicht' % x)
+	print _('foo %s bar' % x)
+	print _('foo %s \'bar' % x)
+	print _("foo %s bar" % x)
+	print _("foo %s \"bar" % x)
+	print _('''foo %s bar''' % x)
+	print _("""foo %s bar""" % x)
+	print _('''foo %s \'''bar''' % x)
+	print _("""foo %s \"""bar""" % x)
 
 main()
--- testframework/0008-1-3.correct	(Revision 24046)
+++ testframework/0008-1-3.correct	(Revision 24044)
@@ -1,3 +1,11 @@
-E:0008-1: testframework/0008-1-3/src/irgendwas.py contains construct like _("foo %s bar" % var) in line 9
-E:0008-3: testframework/0008-1-3/src/de.po contains "fuzzy" in line 59
-E:0008-3: testframework/0008-1-3/src/de.po contains "fuzzy" in line 80
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:10: substitutes before translation: _('foo %s bar' % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:11: substitutes before translation: _('foo %s \'bar' % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:12: substitutes before translation: _("foo %s bar" % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:13: substitutes before translation: _("foo %s \"bar" % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:14: substitutes before translation: _('''foo %s bar''' % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:15: substitutes before translation: _("""foo %s bar""" % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:16: substitutes before translation: _('''foo %s \'''bar''' % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:17: substitutes before translation: _("""foo %s \"""bar""" % x)
+E:0008-1: testframework/0008-1-3/src/irgendwas.py:8: substitutes before translation: _('Aber %s knallts' % x)
+E:0008-3: testframework/0008-1-3/src/de.po:59: contains "fuzzy"
+E:0008-3: testframework/0008-1-3/src/de.po:80: contains "fuzzy"

Comment 1 Philipp Hahn

2012-07-20 20:16:55 CEST

Applied with modifications.
svn34327, ucslint_3.0.0-1

Comment 2 Lukas Walter

2012-10-16 13:51:49 CEST

In my tests I managed to trigger a false positive:
=======================
E:0008-1: univention-bind/test.py:12: substitutes before translation: _('''foo bar''')


test.py:
============
def main():
        print 'Boing'
        print _('Dieser Test ist ok')
        print _('Hier lieg auch %d Problem vor') % 0
        x = 'hier'
        print _('Aber %s knallts' % x)
        # _('Im Kommentar %s aber nicht' % x)
        print _('foo %s bar' % x)
        print _('foo %s \'bar' % x)
        print _("foo %s bar" % x)
        print _("foo %s \"bar" % x)
        print _('''foo bar''')
        print _("""foo %s bar""" % x)
        print _('''foo %s \'''bar''' % x)
        print _("""foo %s \"""bar""" % x)

Comment 3 Philipp Hahn

2012-10-17 10:35:03 CEST

(In reply to comment #2)
> In my tests I managed to trigger a false positive:
> =======================
> E:0008-1: univention-bind/test.py:12: substitutes before translation: _('''foo bar''')

The regular expression has been reworked again and now handles this case as well. In addition, Unicode and raw strings are now recognized correctly.

svn36379, ucslint_3.0.4-3.55.201210171028
ChangeLog: ±0
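
As an illustration of what recognizing Unicode and raw strings can involve, the sketch below allows an optional `u`/`r` prefix before the opening quote. `PREFIXED` and `flagged_lines` are hypothetical names, and the expression is a simplified assumption, not the one committed in svn36379; it leaves out the comment and preceding-string handling of the real check (Python 3 syntax):

```python
import re

# Assumption: simplified illustration, not the expression used by ucslint.
PREFIXED = re.compile(
    r"""_\(\s*                  # call of the translation function
        [uUrR]{0,2}             # optional unicode/raw prefix, e.g. u, r, ur
        ('{3}|"{3}|'|")         # opening quote, triple quotes tried first
        (?:\\.|(?!\1)[^\\])*    # body: escaped pairs, or non-closing characters
        \1                      # matching closing quote
        \s*%                    # substitution inside the call
    """,
    re.DOTALL | re.VERBOSE,
)


def flagged_lines(pattern, source):
    """Return the 1-based line number where each match starts."""
    return [source.count("\n", 0, m.start()) + 1 for m in pattern.finditer(source)]


SAMPLE = (
    "_(u'foo %s bar' % x)\n"          # line 1: flagged
    "_(r\"foo %s bar\" % x)\n"        # line 2: flagged
    "_(u'foo %s bar') % x\n"          # line 3: correct usage, not flagged
    "_(ur'''foo %s\nbar''' % x)\n"    # lines 4-5: flagged at line 4
)

print(flagged_lines(PREFIXED, SAMPLE))  # [1, 2, 4]
```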

Comment 4 Lukas Walter

2012-10-17 12:05:29 CEST

(In reply to comment #3)
> (In reply to comment #2)
> > In my tests I managed to trigger a false positive:
> > =======================
> > E:0008-1: univention-bind/test.py:12: substitutes before translation: _('''foo bar''')
> 
> The regular expression has been reworked again and now handles this case as
> well. In addition, Unicode and raw strings are now recognized correctly.
> 
> svn36379, ucslint_3.0.4-3.55.201210171028
> ChangeLog: ±0

The error no longer occurs.

The changelog is fine.
Verified.

Comment 5 Stefan Gohmann

2012-12-12 21:09:58 CET

UCS 3.1-0 has been released: 
 http://forum.univention.de/viewtopic.php?f=54&t=2125

If this error occurs again, please use "Clone This Bug".