On Thu, Aug 23, 2012 at 3:03 PM, Daniel Veillard <veill...@redhat.com> wrote: > On Wed, Aug 22, 2012 at 10:40:20PM +0200, Johan Corveleyn wrote: >> On Fri, Aug 17, 2012 at 10:32 PM, Johan Corveleyn <jcor...@gmail.com> wrote: >> > On Fri, Feb 24, 2012 at 2:14 PM, Johan Corveleyn <jcor...@gmail.com> wrote: >> > ... >> >> I'm running into a "non-determinist" error with a schema even though I >> >> don't think it's non-determinist (and both XSV and Xerces agree with >> >> me; they too have no problem with the schema). I have reduced my >> >> test-case to a simple example, see below. >> > >> > On Tue, Feb 28, 2012 at 6:00 AM, Daniel Veillard <veill...@redhat.com> >> > wrote: >> > ... >> >> Well in that case it seems the underlying code is generating a >> >> wrong automata so it should be a matter of running that minimal >> >> test case under gdb with the breakpoint appropriately set to find out >> >> what wrong transition got added ... >> > >> > I'd like to take a stab at this issue. I've been able to get the >> > latest 2.9.0 snapshot (from >> > ftp://xmlsoft.org/libxml2/libxml2-git-snapshot.tar.gz) to build in >> > Visual Studio 2010. I can still reproduce the issue. >> > >> > But I need some guidance. What part of the code should I be looking >> > at? Where do I have to set the "breakpoint appropriately to find out >> > the wrong transition that got added"? >> > >> > Alternatively, if you or anyone else on this list can fix the issue >> > already, that would be even better :-). >> > >> > To summarize, the issue is this: >> >> Given the following schema: >> >> >> >> <?xml version="1.0" encoding="UTF-8"?> >> >> <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"> >> >> <xs:element name="rules"> >> >> <xs:complexType> >> >> <xs:sequence> >> >> <xs:element name="rule" >> >> minOccurs="0" maxOccurs="unbounded"/> >> >> <xs:sequence minOccurs="0" >> >> maxOccurs="1"> >> >> <xs:element >> >> name="specialRule" minOccurs="1" maxOccurs="1"/> >> >> <xs:element name="rule" >> >> minOccurs="0" maxOccurs="unbounded"/> >> >> </xs:sequence> >> >> </xs:sequence> >> >> </xs:complexType> >> >> </xs:element> >> >> </xs:schema> >> >> >> >> xmllint complains with "Schemas parser error : local >> >> complex type: The content model is not determinist." >> >> But it isn't. >> > >> > See also https://bugzilla.gnome.org/show_bug.cgi?id=670865. >> >> Hi, >> >> I could really use some guidance here. I've been able to set up >> everything in Visual Studio 2010, and reproduced the issue from within >> the debugger. So I can set a breakpoint etc. But I mostly lack some >> basic knowledge about how this all works. >> >> Is there some (preferable concise) reading about basic concepts like >> how xml schemas automata work, how these are built, etc? What should >> the automata in this case look like (in terms of the data structures >> inside the code)? How can I see that the automata is non-determinist, >> and what would a determinist automata look like, ... ? > > Johan, > > sorry for the slow answer, > > I will look at this, promise, but all that code is rooted in automata > theory, there is quite some litterature about it, and well unless you > go though the process of learning that code and possibly the theory > behind it, it's not something that can be hacked trivially. Basically > I run this in debug mode, look at the automata generated, there is pro > bably an error there and then this need fixing and a lot of checking. > To be honnest that one of the hardest part of the library to change > with the core of the parser. > Theory is usually learnt though the 'Dragon Book', by Aho and Al. > and the code is actually more complex since using extended automata > operations not found in basic automata ... > > If you have time looking at this may be interesting, but > it's really not trivial at least more complex than other parts of the > code :-)
Thanks for your answer, and for pointing out that it's not so simple :-). It seems that this is a bit over my head then (I have little clue about automata theory, so it would probably take me weeks or months to understand what's going on ...). I hope you can find some time to dig in further in the not-too-distant future :-). I'm always willing to help out, test, ... if that would be useful (though my usefulness might be limited because of my non-understanding of things ...). Thanks for all your efforts, -- Johan _______________________________________________ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml