On Fri, Aug 03, 2007 at 08:22:37PM +0530, Ashwin wrote:
>
> Hi,
>
> While testing the xmlFARegExec function if I give the following input
>
> (0|1|2|3|4|5|6|7|8|9) (0,10) (The rule to be used for matching the
> input expression)
>
> Expression to be matched 1234567891 (length 10)
>
> This is returning a failure, if I reduce the expression by 1 it works
> fine. Is this because of incorrect usage?
> Hum, no, looks like a bug, strange ...
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '1234567891'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
> 1234567891: Fail
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '123456789'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
> 123456789: Ok
> Could be worth bugzilla'ing; that's something I should be able to fix!

Hi,

With regard to the above problem, I made some modifications to the code
that generates the epsilon transitions, which seem to have solved the
problem when a range with a minimum of 0 is specified. However, if the
input is now longer than the specified range, xmlRegexpExec returns success
instead of failure. The code change is as follows:
In file xmlregexp.c (around line 1522):

    if (atom->min == 0) {
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, atom->stop);
        newstate = xmlRegNewState(ctxt);
        xmlRegStatePush(ctxt, newstate);
        ctxt->state = newstate;
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, newstate);
        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1;
        /* The three lines above were not part of the earlier if condition */
        counter = -1;
    } else {
        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1;
    }

I have to admit that this is nothing but a shabby workaround, if one can
even call it that. Passing the value of counter as -1 to the functions
xmlFAGenerateCountedTransition and xmlFAGenerateCountedEpsilonTransition
immediately following this bit of code seems to solve the problem. The basis
for the change is a hunch that the problem lies in counted transitions being
generated where they are not required. Consider the following cases (I used
xmlRegexpPrint to print the output below on the console).

The left-hand side (LHS) is the case where the range minimum was 0 and a
string of the maximum length was returning failure (after the code change
this now returns success).

LHS:

The input string is 012
Testing (0|1|2){0,3}:
regexp: '(0|1|2){0,3}'
4 atoms:
 00 atom: charval once char 0
 01 atom: charval once char 1
 02 atom: charval once char 2
 03 atom: subexpr once start -572662307 end 2
5 states:
 state: START 0, 8 transitions:
  trans: removed
  trans: removed
  trans: removed
  trans: removed
  trans: count based 0, epsilon to 4
  trans: counted 0, char 0 atom 0, to 2
  trans: counted 0, char 1 atom 1, to 2
  trans: counted 0, char 2 atom 2, to 2
 state: NULL
 state: 2, 5 transitions:
  trans: count based 0, epsilon to 4
  trans: removed
  trans: counted 0, char 0 atom 0, to 2
  trans: counted 0, char 1 atom 1, to 2
  trans: counted 0, char 2 atom 2, to 2
 state: NULL
 state: FINAL 4, 0 transitions:
1 counters:
 0: min -1 max 2
122:Fail

RHS:

The input String is 122
Testing (0|1|2){1,3}:
regexp: '(0|1|2){1,3}'
4 atoms:
 00 atom: charval once char 0
 01 atom: charval once char 1
 02 atom: charval once char 2
 03 atom: subexpr once start 4 end 2
4 states:
 state: START 0, 4 transitions:
  trans: removed
  trans: char 0 atom 0, to 2
  trans: char 1 atom 1, to 2
  trans: char 2 atom 2, to 2
 state: NULL
 state: 2, 5 transitions:
  trans: count based 0, epsilon to 3
  trans: removed
  trans: counted 0, char 0 atom 0, to 2
  trans: counted 0, char 1 atom 1, to 2
  trans: counted 0, char 2 atom 2, to 2
 state: FINAL 3, 0 transitions:
1 counters:
 0: min 0 max 2
122: Ok

I made the change on the assumption that the part highlighted on the LHS
should be similar to the RHS (I might of course be completely wrong).
However, for cases like (0|1|2|3){0,3}, if I now give the input 1231 it
returns success.

Regards
Ashwin