On Wed, Aug 22, 2007 at 09:59:28PM +0530, Ashwin wrote:
> 
>    On Fri, Aug 03, 2007 at 08:22:37PM +0530, Ashwin wrote:
>    > Hi,
>    >
>    > While testing the xmlFARegExec function if I give the following
>    > input
>    >
>    >   (0|1|2|3|4|5|6|7|8|9) (0,10)   (The rule to be used for matching
>    >                                   the input expression)
>    >
>    >   Expression to be matched 1234567891 (length 10)
>    >
>    > This is returning a failure, if I reduce the expression by 1 it
>    > works fine. Is this because of incorrect usage?
>
>    > Hum, no, looks like a bug, strange ...
>    >
>    > paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '1234567891'
>    > Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
>    > 1234567891: Fail
>    > paphio:~/XML -> ./testRegexp '(0|1|2|3|4|5|6|7|8|9){0,10}' '123456789'
>    > Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
>    > 123456789: Ok
>
>    Could be worth bugzilla'ing that's something I should be able to fix!
> 
> 
>    Hi,
> 
>    With regard to the above problem I made some modifications to the
>    generate epsilon function, which seems to have solved the problem when
>    a range with 0 is specified, however now it so happens that if the
>    input is more than the specified range regxpexec function returns
>    success instead of failure. The code change is as follows:-
>
>    In file xmlregxp.c
> 
> 
>    if (atom->min == 0) {
>        xmlFAGenerateEpsilonTransition(ctxt, atom->start, atom->stop);
>        newstate = xmlRegNewState(ctxt);
>        xmlRegStatePush(ctxt, newstate);
>        ctxt->state = newstate;
>        xmlFAGenerateEpsilonTransition(ctxt, atom->start, newstate);
>
>        counter = xmlRegGetCounter(ctxt);
>        ctxt->counters[counter].min = atom->min - 1;
>        ctxt->counters[counter].max = atom->max - 1;
>        /* These three lines were not part of the earlier if condition */
>        counter = -1;
>    } else {
>        counter = xmlRegGetCounter(ctxt);
>        ctxt->counters[counter].min = atom->min - 1;
>        ctxt->counters[counter].max = atom->max - 1;
>    }
> 
> 
>    I have to admit that this is nothing but a shabby workaround, if at
>    all one can call it even that. Passing the value of counter as -1 in
>    the functions xmlFAGenerateCountedTransition &
>    xmlFAGenerateCountedEpsilonTransition immediately following this bit
>    of code seems to solve the problem. The basis for the above change is
>    the fact I have a hunch the problem lies in counted transitions
>    getting generated where they are not required. Consider the following
>    cases (I used xmlRegxpPrint to print the below on the console):-
>
>    On the left hand side is the case where range was 0, and maximum range
>    string was returning failure (after the code change this is now
>    returning success)
> 
> 
>    Left hand side:
>
>    The input string is 012
>    Testing (0|1|2){0,3}:
>     regexp: '(0|1|2){0,3}'
>    4 atoms:
>     00 atom: charval once char 0
>     01 atom: charval once char 1
>     02 atom: charval once char 2
>     03 atom: subexpr once start -572662307 end 2
>    5 states:
>     state: START 0, 8 transitions:
>      trans: removed
>      trans: removed
>      trans: removed
>      trans: removed
>      trans: count based 0, epsilon to 4
>      trans: counted 0, char 0 atom 0, to 2
>      trans: counted 0, char 1 atom 1, to 2
>      trans: counted 0, char 2 atom 2, to 2
>     state: NULL
>     state: 2, 5 transitions:
>      trans: count based 0, epsilon to 4
>      trans: removed
>      trans: counted 0, char 0 atom 0, to 2
>      trans: counted 0, char 1 atom 1, to 2
>      trans: counted 0, char 2 atom 2, to 2
>     state: NULL
>     state: FINAL 4, 0 transitions:
>    1 counters:
>     0: min -1 max 2
>     122:Fail
>
>    Right hand side:
>
>    The input String is 122
>    Testing (0|1|2){1,3}:
>     regexp: '(0|1|2){1,3}'
>    4 atoms:
>     00 atom: charval once char 0
>     01 atom: charval once char 1
>     02 atom: charval once char 2
>     03 atom: subexpr once start 4 end 2
>    4 states:
>     state: START 0, 4 transitions:
>      trans: removed
>      trans: char 0 atom 0, to 2
>      trans: char 1 atom 1, to 2
>      trans: char 2 atom 2, to 2
>     state: NULL
>     state: 2, 5 transitions:
>      trans: count based 0, epsilon to 3
>      trans: removed
>      trans: counted 0, char 0 atom 0, to 2
>      trans: counted 0, char 1 atom 1, to 2
>      trans: counted 0, char 2 atom 2, to 2
>     state: FINAL 3, 0 transitions:
>    1 counters:
>     0: min 0 max 2
>     122: Ok
> 
> 
>    I made the change on the assumption that the part highlighted on LHS
>    should be similar to RHS (I might of course be completely wrong.).
>    However now for cases like (0|1|2|3){0,3} if I give the input as 1231
>    it returns success.
> 
> 
>    Regards

  I'm sorry, but once it went through the various mail agents and processing,
your mail is nearly unreadable :-\
  If you have a change to submit, please add a patch (preferably generated
with diff -p and sent as an attachment). This makes sure that at least the
part related to functional changes is not scrambled, and it has the advantage
of being precise enough that a machine can then apply it. So please,
    1/ drop HTML mail formatting
    2/ attach the patch based on diff

  BTW using the version in SVN after my fix I see:

paphio:~/XML -> ./testRegexp   '(0|1|2|3|4|5|6|7|8|9){0,10}' '123456789'
Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
123456789: Ok
paphio:~/XML -> ./testRegexp   '(0|1|2|3|4|5|6|7|8|9){0,10}' '1234567890'
Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
1234567890: Ok
paphio:~/XML -> ./testRegexp   '(0|1|2|3|4|5|6|7|8|9){0,10}' '12345678901'
Testing (0|1|2|3|4|5|6|7|8|9){0,10}:
12345678901: Fail
paphio:~/XML -> 

  So this might be a duplicate of the previous issue.
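
  When checking results like these, a small independent oracle can help to
decide what the regexp engine should answer at the boundaries. The following
is only a sketch and not libxml2 code: match_counted_alternation is a
hypothetical helper, and it assumes every alternative is a single character,
so (c1|c2|...){min,max} can be decided by simply counting characters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical reference check (not part of libxml2): does `input` match
 * (c1|c2|...){min,max}, where the alternatives are the single characters
 * listed in `alternatives`?
 */
bool match_counted_alternation(const char *alternatives,
                               size_t min, size_t max,
                               const char *input)
{
    size_t count = 0;

    assert(min <= max);
    for (const char *p = input; *p != '\0'; p++) {
        /* Every input character must be one of the alternatives. */
        if (strchr(alternatives, *p) == NULL)
            return false;
        count++;
        /* More repetitions than the upper bound can never match. */
        if (count > max)
            return false;
    }
    /* The number of repetitions must also reach the lower bound. */
    return count >= min;
}
```

  Calling it with "0123456789", 0, 10 and inputs of 9, 10 and 11 digits
should accept the first two and reject the last, and with "0123", 0, 3 it
should reject 1231, which is the over-long case reported with the workaround.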

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
