I agree, it makes more sense to map java.lang.String to PyUnicode. PyUnicode is of course just a wrapper around String, sharing the same UTF-16 encoding.
Is there any code that depends on this? The only artificial usage of PyString to represent 8-bit strings should be in Jython itself, such as in PyFile or cStringIO. And there's the rub, our own internals return String from a PyFile#readline, etc., depending on Jython to make the conversion to PyString. I suspect the damage is limited to this, however, simply because the distinction is generally preserved for other code via how TYPE interacts with constructors or factories.
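To make the distinction concrete, here is a rough sketch of the text vs. 8-bit difference. It is written against plain Python 3, not Jython 2.5, so `str` only stands in for PyUnicode/java.lang.String and `bytes` for an 8-bit PyString. The helper `utf16_units` and the sample line are made up for illustration.

```python
# Assumption: Python 3 stand-ins, not Jython internals.
#   str   ~ PyUnicode / java.lang.String (text)
#   bytes ~ PyString used as an 8-bit string (what a file actually holds)

def utf16_units(text):
    """Count UTF-16 code units, the unit java.lang.String.length() counts in."""
    return len(text.encode("utf-16-le")) // 2

# Hypothetical line of text: one accented letter, one character outside the BMP.
line = "caf\u00e9 \U0001d11e\n"

# What a file would store for that line if it were written as UTF-8.
raw = line.encode("utf-8")

# Going from 8-bit data to text needs an explicit codec; the right one round-trips.
decoded = raw.decode("utf-8")

# Treating each byte as one character (Latin-1 style) yields different text.
widened = raw.decode("latin-1")

print(len(line), utf16_units(line), len(raw), len(widened))
print(decoded == line, widened == line)
```

The point is only that the text and its 8-bit form are different values with different lengths, so code that hands back a String from a readline and relies on an implicit conversion is picking an interpretation on the caller's behalf.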
We may also see user-level Jython code depending on this behavior, but I suspect the change would actually make it work better. I just looked at this old doc, http://jython.sourceforge.net/docs/differences.html, which says: "The u"" string modifier is optional and completely ignored if specified." Not anymore, and the same is true of much of the rest of the doc. BTW, 2.5 final must include a purge of all such obsolete docs, or at least a note qualifying them as applying to a prior release.
So if we were to refactor PyFile and similar modules to return PyString instead, it's quite likely
On Thu, Jul 31, 2008 at 6:46 PM, Frank Wierzbicki <email@example.com> wrote:
Ah, now I understand what you were getting at in IRC. I think you are
On Thu, Jul 31, 2008 at 6:46 PM, Leo Soto M. <firstname.lastname@example.org> wrote:
> Hi folks,
> Short History:
> I don't think that java Strings should be mapped to PyString. They
> should be mapped to PyUnicode. Any obvious reason why I may be wrong?
right. The only reason for the current behavior is that PyString is
much older than PyUnicode, and so the mapping is just outdated. Not
sure if there are backwards compatibility issues that will cause big
problems -- it may be a pretty big change... but I am all for giving
it a try. It certainly makes sense.