http://nagoya.apache.org/bugzilla/show_bug.cgi?id=7752

RegularExpression bug using '\s' (whitespace) operator

           Summary: RegularExpression bug using '\s' (whitespace) operator
           Product: Xerces2-J
           Version: 2.0.1
          Platform: PC
        OS/Version: Windows NT/2K
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Other
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


Hi,

I've found a bug in the RegularExpression package in Xerces 2.0.1 that shows up 
when the '\s' escape sequence (whitespace) is used in a regular expression.

Examples:

1) If I create a RegularExpression like this:

RegularExpression regExpChecker = new RegularExpression("d\\s*", "uw"); or
RegularExpression regExpChecker = new RegularExpression("d\\s+", "uw");

I get the following exception:

Exception occurred during event dispatching:
java.lang.NullPointerException
        at org.apache.xerces.impl.xpath.regex.RegularExpression.compile(RegularExpression.java:610)
        at org.apache.xerces.impl.xpath.regex.RegularExpression.compile(RegularExpression.java:565)
        at org.apache.xerces.impl.xpath.regex.RegularExpression.compile(RegularExpression.java:531)
        at org.apache.xerces.impl.xpath.regex.RegularExpression.prepare(RegularExpression.java:2832)
        at org.apache.xerces.impl.xpath.regex.RegularExpression.matches(RegularExpression.java:1444)

However if I use the expanded character class instead:

RegularExpression regExpChecker = new RegularExpression("d[ \\f\\n\\r\\t]*", "uw"); or
RegularExpression regExpChecker = new RegularExpression("d[ \\f\\n\\r\\t]", "uw");

it works fine, so it's probably an easy error to fix if you know where to look.
The exception occurs whenever '\s' is followed by either '*' or '+'.
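Until this is fixed, the substitution above could be automated. The following is only a rough sketch: the class WhitespaceEscapeWorkaround and the method expandWhitespace are hypothetical names and not part of Xerces, the whitespace set is simply the one used in the working patterns above, and nested character classes are not handled.

```java
// Hypothetical helper, not part of Xerces: rewrites each \s escape in a
// pattern into the explicit whitespace set used in the report above.
final class WhitespaceEscapeWorkaround {

    // The characters substituted for \s, written as regex escapes (assumed set).
    private static final String WS_MEMBERS = " \\f\\n\\r\\t";

    static String expandWhitespace(String pattern) {
        StringBuilder out = new StringBuilder();
        boolean inClass = false; // true while scanning inside [...]
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length()) {
                char next = pattern.charAt(++i);
                if (next == 's') {
                    // Inside a class add only the members; outside, wrap them in brackets.
                    out.append(inClass ? WS_MEMBERS : "[" + WS_MEMBERS + "]");
                } else {
                    out.append(c).append(next); // copy any other escape untouched
                }
            } else {
                if (c == '[') {
                    inClass = true;
                } else if (c == ']') {
                    inClass = false;
                }
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expandWhitespace("d\\s*"));
        System.out.println(expandWhitespace("d\\s+"));
    }
}
```

The rewritten string can then be passed to the RegularExpression constructor in place of the original pattern, with the same "uw" options.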

2) Even without the '*' or '+', the '\s' escape does not work properly. For 
example:

RegularExpression regExpChecker = new RegularExpression("d\\s", "uw");
Match match = new Match();
String test = "hgdjh";
if (regExpChecker.matches(test, 0, test.length(), match)) {
    System.out.println("Error: This shouldn't be a match!!!");
}

This prints the error text, which shouldn't happen. The regular expression 
should match the character 'd' followed by a whitespace character, yet it 
matches the 'd' in the string "hgdjh", where 'd' is followed by 'j'.
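
For comparison, here is a small sketch of the behaviour I would expect, written against java.util.regex rather than the Xerces class. The class name ExpectedWhitespaceSemantics and the method found are made up for illustration, and I'm assuming that a substring search (find) is the closest equivalent to what RegularExpression.matches does.

```java
import java.util.regex.Pattern;

// Comparison sketch using java.util.regex, not the Xerces RegularExpression class.
final class ExpectedWhitespaceSemantics {

    // True if the pattern matches anywhere in the input (a search, not a full match).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("d\\s", "hgdjh"));  // 'd' is followed by 'j'
        System.out.println(found("d\\s", "hgd jh")); // 'd' is followed by a space
        System.out.println(found("d\\s*", "hgdjh")); // zero whitespace characters allowed
    }
}
```

With java.util.regex this prints false, true, true. The first line is the case where Xerces 2.0.1 reports a match.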

Does anyone have a fix for this yet?

Cheers,
/Eddie
