From: SourceForge.net <no...@so...> - 2007-02-28 23:49:02
Bugs item #1671134, was opened at 2007-02-28 20:57
Message generated for change (Comment added) made by laukpe
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=112867&aid=1671134&group_id=12867

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Core
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Charles Groves (cgroves)
Assigned to: Nobody/Anonymous (nobody)
Summary: '%s' % u'x' returns str object

Initial Comment:
This is broken out of a comment Pekka made on http://jython.org/bugs/1659819

The same problem also appears if you create a string using a pattern like
'Something %s' and the substituted string is unicode. The examples below
demonstrate.

Jython 2.2b1 on java1.5.0_10 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>> x = 'Good %s' % u'Hyv\u00E4'
>>> type(x)
<type 'str'>
>>> x
'Good Hyv\xE4'
>>> unicode(x)
Traceback (innermost last):
  File "<console>", line 1, in ?
UnicodeError: ascii decoding error: ordinal not in range(128)
>>>

Python 2.4.3 (#1, May 18 2006, 07:40:45)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> x = 'Good %s' % u'Hyv\u00E4'
>>> type(x)
<type 'unicode'>
>>> x
u'Good Hyv\xe4'
>>> unicode(x)
u'Good Hyv\xe4'
>>>

----------------------------------------------------------------------

Comment By: Pekka Laukkanen (laukpe)
Date: 2007-03-01 01:48

Message:
Logged In: YES
user_id=1379331
Originator: NO

Darn, I noticed one more scenario that needs to be taken care of. The patch
mentioned in the previous comment doesn't fix this.

>>> '%(x)s' % { 'x' : u'xxx' }
'xxx'

This got me thinking that this issue probably ought to be fixed in
StringFormatter.format instead of PyString.str___mod__ (as in the current
patch).

SF.format could keep track of the given format items and return PyString or
PyUnicode as needed. That would of course require SF.format's return type
to be changed from String to PyString. At the same time it would probably
make sense to change the SF.formatXXX methods (formatLong, formatInteger, ...)
from public to private to make it clearer that they are only helper
methods of SF.format (at least they seem to be) and can thus still return
String.

These changes seem pretty safe because I could not find StringFormatter used
anywhere other than PyString.str___mod__, and the only method it uses is
SF.format.

Comments about the implementation ideas above are welcome. If the approach
seems reasonable I can make a new patch.

----------------------------------------------------------------------

Comment By: Pekka Laukkanen (laukpe)
Date: 2007-03-01 00:58

Message:
Logged In: YES
user_id=1379331
Originator: NO

After figuring out that the affected method in this case is __mod__, fixing
the problem was pretty easy. The patch is available at
http://jython.org/patches/1671304 and it includes the following test, added
to Lib/test/test_str2unicode.py.

    def test_string_formatting(self):
        self.assertEquals(unicode, type('%s' % u'x'))
        self.assertEquals(unicode, type('%s %s' % (u'x', 'y')))

(I also replaced a few tabs with spaces in Lib/test/test_str2unicode.py. Hope
that's ok. If not, I won't touch them in the future.)

----------------------------------------------------------------------

Comment By: Charles Groves (cgroves)
Date: 2007-02-28 21:11

Message:
Logged In: YES
user_id=1174327
Originator: YES

In fixing the original bug I added Lib/test/test_str2unicode.py which would
be a good place to add tests for this formatting stuff....

----------------------------------------------------------------------
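
For readers of this thread, the idea in the 2007-03-01 01:48 comment (let the
formatter itself notice whether any substituted item is unicode and pick the
return type accordingly) can be sketched roughly as below. This is only an
illustrative model in modern Python, not Jython's StringFormatter: the names
ByteString, UnicodeString and format_mod are hypothetical stand-ins for
PyString, PyUnicode and SF.format, and only %s, %(key)s and %% are handled.

```python
import re


class ByteString(str):
    """Hypothetical stand-in for Jython's PyString (not the real class)."""


class UnicodeString(str):
    """Hypothetical stand-in for Jython's PyUnicode (not the real class)."""


# Matches %s, %(key)s and %% only; real '%' formatting supports far more.
_SPEC = re.compile(r"%(?:\((?P<key>[^)]*)\))?(?P<conv>[s%])")


def format_mod(template, args):
    """Model of 'template % args' that tracks whether unicode was seen.

    The result is a UnicodeString if the template or any substituted
    value is a UnicodeString, and a ByteString otherwise.
    """
    mapping = args if isinstance(args, dict) else None
    positional = iter(args if isinstance(args, tuple) else (args,))
    saw_unicode = isinstance(template, UnicodeString)

    def substitute(match):
        nonlocal saw_unicode
        if match.group("conv") == "%":
            return "%"
        key = match.group("key")
        if key is not None:
            if mapping is None:
                raise TypeError("format requires a mapping")
            value = mapping[key]
        else:
            value = next(positional)
        # This is the bookkeeping step: remember that a unicode item was used.
        if isinstance(value, UnicodeString):
            saw_unicode = True
        return str(value)

    result = _SPEC.sub(substitute, template)
    return (UnicodeString if saw_unicode else ByteString)(result)
```

Because the check sits inside the formatter, the tuple case and the mapping
case ('%(x)s' % { 'x' : u'xxx' }) go through the same code path, which is the
reason given above for preferring a fix in the formatter over one in __mod__.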