Jim Baker wrote:
We periodically backport CPython's SRE implementation into ours, so I'm hypothesizing this is when it was broken. I wasn't aware of it working before, since I was just looking at this from a unit test perspective for the 2.5/trunk work. It's quite possible that better support had been added before by Jython developers, then inadvertently got wiped out by this backporting.

For my UTF-16 branch, which still needs to be merged in (!), supporting wide character classes was the last major outstanding issue, along with merging in other fixes documented by the relevant unit tests and working on its performance. So hopefully I will have some time soon to complete this work.


Thanks for the explanation and for your work! I will keep an eye on the evolution of the svn repository ;)


- Jim

On Fri, Apr 4, 2008 at 3:50 AM, Sébastien Boisgérault <Sebastien.Boisgerault@ensmp.fr> wrote:

Hi all,

The code:

    pattern = re.compile(u'[&<>"\u0080-\uffff]+')

used to raise a ValueError (see http://bugs.jython.org/issue1544953)
but now seems to compile fine in Jython 2.2.1 and in the current svn trunk.
However, AFAICT, using the compiled pattern may raise an error with the
current Jython trunk.

Jython 2.3a0 on java1.6.0
Type "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = re.compile(u'[&<>"\u0080-\uffff]+')
>>> pattern.sub(u"e", u"\xc3\xa9")
Traceback (innermost last):
  File "<console>", line 1, in ?
        at org.python.modules.sre.SRE_STATE.SRE_CHARSET(SRE_STATE.java:402)
        at org.python.modules.sre.SRE_STATE.SRE_COUNT(SRE_STATE.java:493)
        at org.python.modules.sre.SRE_STATE.SRE_MATCH(SRE_STATE.java:778)
        at org.python.modules.sre.SRE_STATE.SRE_SEARCH(SRE_STATE.java:1171)
        at org.python.modules.sre.PatternObject.subx(PatternObject.java:129)
        at org.python.modules.sre.PatternObject.sub(PatternObject.java:80)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)

java.lang.ArrayIndexOutOfBoundsException: java.lang.ArrayIndexOutOfBoundsException: 163

On the other hand, the 2.2.1 release works:

Jython 2.2.1 on java1.6.0
Type "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = re.compile(u'[&<>"\u0080-\uffff]')
>>> pattern.sub(u"e", u"\xc3\xa9",)

Can anyone confirm this? Is this issue already known?
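
For anyone who wants to check this on their own build, here is a minimal reproduction sketch. It is not taken from the Jython test suite; the helper name `check` and the constant `CHARSET` are made up for illustration. Note that the two sessions above do not use exactly the same pattern (the 2.3a0 one ends with "+", the 2.2.1 one does not), so the sketch tries both. On CPython I would expect u'e' for the first and u'ee' for the second, since both characters of the input fall in the \u0080-\uffff range.

```python
import re

# Same character class as in the report: a few ASCII characters plus
# the range U+0080..U+FFFF.
CHARSET = u'[&<>"\u0080-\uffff]'


def check(pattern_source, text=u"\xc3\xa9"):
    """Replace every match of pattern_source in text with u"e"."""
    return re.compile(pattern_source).sub(u"e", text)


# With "+", the two non-ASCII characters are matched as a single run.
with_plus = check(CHARSET + u"+")

# Without "+", each character is matched and replaced separately.
without_plus = check(CHARSET)

print(repr(with_plus))
print(repr(without_plus))
```

If either call raises instead of returning a string, that would match the behaviour described for the trunk build.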



Jython-dev mailing list

Jim Baker