Some of the multi-character escapes of XML Schema
regular expressions are not supported correctly,
namely \i, \I, \c, \C, and \W.
As an example, I am enclosing the following test driver:
import jregex.*;

public class X implements REFlags {
    private static void test(String aPattern, String aPositiveExample, String aNegativeExample) {
        Pattern p = new Pattern(aPattern, XML_SCHEMA);
        if (!p.matches(aPositiveExample))
            System.out.println("error - regex: " + aPattern + "; positive example: " + aPositiveExample);
        if (p.matches(aNegativeExample))
            System.out.println("error - regex: " + aPattern + "; negative example: " + aNegativeExample);
    }

    public static void main(String[] args) {
        // digits
        test("\\d*", "0123456789", "abc");
        // non-digits
        test("\\D*", "abc", "123");
        // white space
        test("\\s*", " \t\r\n", "0");
        // non-white space
        test("\\S*", "abc", "\t");
        // any initial character of an XML name
        test("\\i*", "abc_:", "1");
        // any character that is not an initial character of an XML name
        test("\\I*", "123", "a");
        // any character that may be part of an XML name (not at the beginning)
        test("\\c*", "abc345_:-.", "<");
        // any character that cannot be part of an XML name (not at the beginning)
        test("\\C*", "<>!@", "a");
        // all characters except punctuation
        test("\\w*", "abc", "\"");
        // all punctuation characters
        test("\\W*", "abc", "\"");
    }
}
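For reference, the behaviour the driver expects from \i, \I, \c and \C can be approximated with plain java.util.regex character classes. This is only a sketch under stated assumptions: it restricts XML name characters to ASCII, it does not use jregex, and the class, constant and method names are made up for illustration.

```java
import java.util.regex.Pattern;

public class XmlNameEscapes {
    // ASCII-only approximations (assumption): the XML Schema classes also
    // cover many non-ASCII letters, digits and combining characters.
    static final String INITIAL = "[A-Za-z_:]";               // stands in for \i
    static final String NOT_INITIAL = "[^A-Za-z_:]";          // stands in for \I
    static final String NAME_CHAR = "[A-Za-z0-9_:.\\-]";      // stands in for \c
    static final String NOT_NAME_CHAR = "[^A-Za-z0-9_:.\\-]"; // stands in for \C

    // True if the whole input consists of zero or more characters of the class.
    static boolean allIn(String charClass, String input) {
        return Pattern.matches(charClass + "*", input);
    }

    public static void main(String[] args) {
        // Same positive/negative examples as the test driver above.
        System.out.println(allIn(INITIAL, "abc_:") + " " + allIn(INITIAL, "1"));
        System.out.println(allIn(NOT_INITIAL, "123") + " " + allIn(NOT_INITIAL, "a"));
        System.out.println(allIn(NAME_CHAR, "abc345_:-.") + " " + allIn(NAME_CHAR, "<"));
        System.out.println(allIn(NOT_NAME_CHAR, "<>!@") + " " + allIn(NOT_NAME_CHAR, "a"));
    }
}
```

Each line should print "true false" if the approximated class accepts the positive example and rejects the negative one.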
Logged In: YES
user_id=908396
I'm not sure what the XML_SCHEMA flag is supposed to do.
Could you enlighten me?
Apart from that: firstly, \i, \I, \s and \S are not listed as
supported character classes in the docs; secondly, \c and \C
have a special meaning as control characters; thirdly, \d, \D
and \w work as they should; fourthly, \W works when you
swap arguments #2 and #3 in its test call (which I
guess you got mixed up).
So, unless I'm missing something obvious that has to do with
the XML_SCHEMA flag (and I suspect that's the case), it
works fine, no?
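
The points about \c and \W can be illustrated with java.util.regex, used here only as a stand-in because its handling of these escapes is well documented. Whether jregex behaves the same way with XML_SCHEMA set is not confirmed in this thread, and the class and method names below are hypothetical.

```java
import java.util.regex.Pattern;

public class EscapeCheck {
    // True if the regex matches the entire input.
    static boolean whole(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // In the Perl-style syntax, \cA denotes the control character U+0001,
        // not an XML name character class.
        System.out.println(whole("\\cA", "\u0001")); // true

        // \W* matches a run of non-word characters, so the double quote is the
        // positive example and "abc" the negative one.
        System.out.println(whole("\\W*", "\""));  // true
        System.out.println(whole("\\W*", "abc")); // false
    }
}
```

With the second and third arguments swapped, the \W line of the driver would read test("\\W*", "\"", "abc").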