From: Henri S. <hsi...@ik...> - 2006-08-11 12:54:03
On Aug 11, 2006, at 14:02, Diez B. Roggisch wrote:

> Depending on what you mean by "meaning of your programs change",
> this is plain wrong.

Nope. Here is a demo:

---------
import string

# a (0061, 61), combining diaeresis (0308, CC 88), a with diaeresis
# (00E4, C3 A4), euro (20AC, E2 82 AC)
# 23763 (D84D + DF63, F0 A3 9D A3)
testStr = "\141\314\210\303\244\342\202\254\360\243\235\243"
testUString = unicode(testStr, 'utf-8')
for c in testUString:
    print "%X" % ord(c)
---------

That's a self-contained program that produces different output
depending on how the interpreter was compiled.

> The internal representation of unicode objects in CPython only
> matters for compiled extensions.

It matters for any program that looks inside the strings.

> But if meaning means to you the behavior of encoding-related
> matters at runtime - that is not true.

It is. See above.

-- 
Henri Sivonen
hsi...@ik...
http://hsivonen.iki.fi/
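
For readers without both kinds of interpreter at hand, the two outputs of the demo can be reproduced side by side. What follows is only a minimal sketch, not part of the original post: it assumes Python 3.3 or later (where iterating a str always yields whole code points), and the names TEST_BYTES, wide_build_output and narrow_build_output are hypothetical helpers introduced for illustration. The "narrow" case is emulated by splitting the string into UTF-16 code units, which is what a UCS-2 build of the interpreter exposes when a program looks inside the string.

```python
# Sketch only: assumes Python 3.3+, where str iterates by code point.
# The names below are illustrative and do not come from the original demo.

# Same bytes as testStr in the demo: a, combining diaeresis,
# a with diaeresis, euro sign, U+23763.
TEST_BYTES = b"\141\314\210\303\244\342\202\254\360\243\235\243"


def wide_build_output(data):
    """Hex code points, as a UCS-4 ("wide") build would print them."""
    text = data.decode("utf-8")
    return ["%X" % ord(ch) for ch in text]


def narrow_build_output(data):
    """Hex UTF-16 code units, as a UCS-2 ("narrow") build would print them."""
    utf16 = data.decode("utf-8").encode("utf-16-be")
    # Each UTF-16 code unit occupies two bytes, big-endian.
    return ["%X" % int.from_bytes(utf16[i:i + 2], "big")
            for i in range(0, len(utf16), 2)]


if __name__ == "__main__":
    print("wide:  ", " ".join(wide_build_output(TEST_BYTES)))
    print("narrow:", " ".join(narrow_build_output(TEST_BYTES)))
```

Under these assumptions the wide case yields five values ending in 23763, while the narrow case yields six, with the last character split into the surrogate pair D84D and DF63. The loop body is the same in both cases; only the unit the string is divided into differs.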