On Fri, Aug 03, 2007 at 08:22:37PM +0530, Ashwin wrote:
>
>    Hi,
>
>    While testing the xmlFARegExec function if I give the following input
>
>    (0|1|2|3|4|5|6|7|8|9)   (0,10)   (The rule to be used for matching the
>    input expression)
>
>    Expression to be matched 1234567891 (length 10)
>
>    This  is returning a failure, if I reduce the expression by 1 it works
>    fine. Is this because of incorrect usage?

>  Hum, no, looks like a bug, strange ...
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '1234567891'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
> 1234567891: Fail
> paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '123456789'
> Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
> 123456789: Ok

  Could be worth bugzilla'ing that's something I should be able to fix !

Hi,

   With regard to the above problem, I made some modifications to the code
that generates the epsilon transitions, which seem to have solved the problem
when a range with a minimum of 0 is specified. However, if the input is now
longer than the specified range allows, the regexp exec function returns
success instead of failure. The code change is as follows:

In file xmlregexp.c

 

    if (atom->min == 0) {
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, atom->stop);
        newstate = xmlRegNewState(ctxt);
        xmlRegStatePush(ctxt, newstate);
        ctxt->state = newstate;
        xmlFAGenerateEpsilonTransition(ctxt, atom->start, newstate);

        /* These three lines were not part of the earlier if condition */
        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1;
        counter = -1;
    } else {
        counter = xmlRegGetCounter(ctxt);
        ctxt->counters[counter].min = atom->min - 1;
        ctxt->counters[counter].max = atom->max - 1;
    }
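The counter bounds in the snippet above are computed as atom->min - 1 and atom->max - 1, so a range whose minimum is 0 ends up with a counter minimum of -1. The following is only a minimal sketch of that arithmetic; counter_bounds and counter_from_range are hypothetical names used for illustration and are not part of libxml2.

```c
/* Hypothetical stand-in for one entry of ctxt->counters[]. */
typedef struct {
    int min;
    int max;
} counter_bounds;

/*
 * Mirrors the two assignments in the snippet: each counter bound is the
 * corresponding atom bound minus one.
 */
static counter_bounds counter_from_range(int atom_min, int atom_max)
{
    counter_bounds c;

    c.min = atom_min - 1;
    c.max = atom_max - 1;
    return c;
}
```

With these definitions, counter_from_range(0, 3) gives min -1 and max 2, while counter_from_range(1, 3) gives min 0 and max 2, which are the values that show up in the "counters" lines of the printed automata.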

 

I have to admit that this is nothing but a shabby workaround, if one can even
call it that. Passing -1 as the counter to xmlFAGenerateCountedTransition and
xmlFAGenerateCountedEpsilonTransition, which are called immediately after this
bit of code, seems to solve the problem. The change is based on a hunch that
the problem lies in counted transitions being generated where they are not
required. Consider the following cases (I used xmlRegexpPrint to print the
output below on the console):

On the left-hand side is the case where the minimum of the range was 0 and a
string of the maximum allowed length was returning failure (after the code
change this now returns success).

 

The input string is 012                           The input string is 122
Testing (0|1|2){0,3}:                             Testing (0|1|2){1,3}:
 regexp: '(0|1|2){0,3}'                            regexp: '(0|1|2){1,3}'
4 atoms:                                          4 atoms:
 00  atom: charval once char 0                     00  atom: charval once char 0
 01  atom: charval once char 1                     01  atom: charval once char 1
 02  atom: charval once char 2                     02  atom: charval once char 2
 03  atom: subexpr once start -572662307 end 2     03  atom: subexpr once start 4 end 2
5 states:                                         4 states:
 state: START 0, 8 transitions:                    state: START 0, 4 transitions:
  trans: removed                                    trans: removed
  trans: removed                                    trans: char 0 atom 0, to 2
  trans: removed                                    trans: char 1 atom 1, to 2
  trans: removed                                    trans: char 2 atom 2, to 2
  trans: count based 0, epsilon to 4               state: NULL
  trans: counted 0, char 0 atom 0, to 2            state: 2, 5 transitions:
  trans: counted 0, char 1 atom 1, to 2             trans: count based 0, epsilon to 3
  trans: counted 0, char 2 atom 2, to 2             trans: removed
 state: NULL                                        trans: counted 0, char 0 atom 0, to 2
 state: 2, 5 transitions:                           trans: counted 0, char 1 atom 1, to 2
  trans: count based 0, epsilon to 4                trans: counted 0, char 2 atom 2, to 2
  trans: removed                                   state: FINAL 3, 0 transitions:
  trans: counted 0, char 0 atom 0, to 2           1 counters:
  trans: counted 0, char 1 atom 1, to 2            0: min 0 max 2
  trans: counted 0, char 2 atom 2, to 2           122: Ok
 state: NULL
 state: FINAL 4, 0 transitions:
1 counters:
 0: min -1 max 2
122: Fail

 

I made the change on the assumption that the part highlighted on the LHS
should be similar to the RHS (I might of course be completely wrong). However,
now for cases like (0|1|2|3){0,3}, if I give the input 1231, it returns
success.
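To make the expected outcomes explicit, here is a small reference check for this particular pattern shape, an alternation of single characters repeated {min,max} times. It is only a sketch and not libxml2 code: matches_bounded_alternation is a hypothetical helper, and it assumes min and max are non-negative with min <= max.

```c
#include <stddef.h>
#include <string.h>

/*
 * Reference semantics for (c1|c2|...){min,max}, where alphabet lists the
 * single characters c1, c2, ...: the input matches when its length lies in
 * [min, max] and every character belongs to alphabet.
 * Returns 1 on match, 0 otherwise. Assumes 0 <= min <= max.
 */
static int matches_bounded_alternation(const char *alphabet, int min, int max,
                                       const char *input)
{
    size_t len = strlen(input);

    if (len < (size_t)min || len > (size_t)max)
        return 0;
    /* strspn returns the length of the leading run made only of alphabet chars */
    return strspn(input, alphabet) == len;
}
```

Under these semantics, "1234567891" against the digits with {0,10} should match, and "1231" against "0123" with {0,3} should not, so any fix needs to satisfy both cases at once.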

 

Regards

Ashwin


_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml