Regular expressions with unions do not work properly with replacing and 
tokenizing.
-----------------------------------------------------------------------------------

         Key: XERCESC-1390
         URL: http://issues.apache.org/jira/browse/XERCESC-1390
     Project: Xerces-C++
        Type: Bug
  Components: Utilities  
    Versions: 2.6.0    
    Reporter: David Bertoni
    Priority: Critical
 Attachments: patch.txt

Consider the following regular expression:

"(ab) | (a)"

with the following input string:

"abracadabra"

If you use an instance of the RegularExpression class to replace any matching 
substrings with the empty string, the result should be the following string:

"rcdr"

Instead, just the last "a" in the string is replaced:

"abracadabr"

If you use the same RegularExpression instance to tokenize the input string, the 
result should be the following set of strings:

""
"r"
"c"
"d"
"r"
""

Instead, the result is:

"abracadabr"
""
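
For comparison, the expected results can be reproduced with the C++ standard 
library. This is only a sketch of the intended behavior: it uses std::regex, 
not the Xerces-C++ RegularExpression class, and it assumes the pattern is 
meant without the spaces around "|". The helper names removeMatches, 
splitOnMatches, and checkReportedExample are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Remove every match of `pattern` from `input`.
std::string removeMatches(const std::string& input, const std::regex& pattern) {
    return std::regex_replace(input, pattern, "");
}

// Split `input` around every match of `pattern`, keeping empty tokens,
// including the ones before the first match and after the last match.
std::vector<std::string> splitOnMatches(const std::string& input,
                                        const std::regex& pattern) {
    std::vector<std::string> tokens;
    std::size_t last = 0;  // offset just past the previous match
    for (std::sregex_iterator it(input.begin(), input.end(), pattern), end;
         it != end; ++it) {
        const std::size_t start = static_cast<std::size_t>(it->position());
        tokens.push_back(input.substr(last, start - last));
        last = start + static_cast<std::size_t>(it->length());
    }
    tokens.push_back(input.substr(last));  // text after the last match
    return tokens;
}

// Checks the expected results stated in this report.
void checkReportedExample() {
    const std::regex pattern("(ab)|(a)");  // assumption: no literal spaces
    const std::string input = "abracadabra";

    assert(removeMatches(input, pattern) == "rcdr");

    const std::vector<std::string> expected = {"", "r", "c", "d", "r", ""};
    assert(splitOnMatches(input, pattern) == expected);
}
```

Calling checkReportedExample() passes with std::regex, which matches the 
expected output listed above for both replacing and tokenizing.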

I will attach a proposed patch, but I don't know this code well, so it would be 
great if someone could review it.



