Hi, No worries. I'm pretty sure this won't be forgotten in the long run. Let me know about your ideas on how to improve the code.
Cheers, Christian On Mon, Sep 14, 2015 at 3:38 AM, Daniel Veillard <veill...@redhat.com> wrote: > On Mon, Sep 07, 2015 at 09:47:40AM +0200, Christian Ceelen wrote: > > Hi, > > > > Did anyone find the time to look into this or test this? > > Sorry Christian, > > not forgotten but not done yet, will try this week ! > > thanks, > > Daniel > > > Cheers, > > Christian > > > > On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen < > > christian.cee...@gmail.com> wrote: > > > > > Hi, > > > > > > i think i no know what i did wrong while filling the hash table. And > yes. > > > My bad ... : > > > > > > xslt.c: 5448 > > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > > template); > > > > > > This adds a pointer to the xmlNode in the parsed stylesheet to the hash > > > table. For just detecting collisions that seems to be ok. > > > But it is utterly useless for using it in xsltFindTemplate in > 'import.c' > > > as that expects a pointer to the compiled template struct > 'xsltTemplate'. > > > So my initial patch for reusing it doing a bad pointer cast here: > > > --- a/libxslt/imports.c > > > +++ b/libxslt/imports.c > > > @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt, > const > > > xmlChar *name, > > > return(NULL); > > > style = ctxt->style; > > > while (style != NULL) { > > > - cur = style->templates; > > > - while (cur != NULL) { > > > - if (xmlStrEqual(name, cur->name)) { > > > - if (((nameURI == NULL) && (cur->nameURI == NULL)) || > > > - ((nameURI != NULL) && (cur->nameURI != NULL) && > > > - (xmlStrEqual(nameURI, cur->nameURI)))) { > > > - return(cur); > > > - } > > > - } > > > - cur = cur->next; > > > - } > > > - > > > - style = xsltNextImport(style); > > > + if(style->namedTemplatesHash != NULL){ > > > + cur = (xsltTemplatePtr) > > > xmlHashLookup2(style->namedTemplatesHash, name, nameURI); > > > + if(cur != NULL) > > > + return (cur); > > > + } > > > + style = xsltNextImport(style); > > > } > > > return(NULL); > > > } > > > > > > This explains why it was not working as i had hoped. The hash table > > > returns the xmlNodePtr instead of the xsltTemplatePtr expected by > > > xsltFindTemplate. The patch Daniel does the same, it expects > > > xsltTemplatePtr. > > > > > > So changing in xslt.c: 5448 > > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > > template); > > > to > > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > > ret); > > > Makes it possible to reuse the hash table for the named templates also > in > > > xsltFindTemplate. > > > > > > I pushed the updated changes to github and attached a clean patch to > the > > > email. It is created against 1.1.28. > > > > > > The patch adds the hash table initialization and population to > > > xsltAddTemplate (pattern.c), sets the initial size to a more > reasonable 8 > > > entries, reuses it in xsltFindTemplate (imports.c), and adds a test > case to > > > check that an error is emitted during compilation for template names > that > > > are not unique. > > > > > > So far i haven't replaced the helper functions as Nico suggested, as i > was > > > thinking of moving the look up code to imports.c and share part of a > lookup > > > local to a single stylesheet between xsltFindTemplate and > > > xsltParseStylesheetTemplate. I am not sure if you rather go for a small > > > code duplication or for having small helper classes. In terms of > > > performance i doubt that there is a noticeable difference. > > > > > > Best regards, > > > Christian > > > > > > > > > On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veill...@redhat.com> > > > wrote: > > > > > >> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote: > > >> > Hi, > > >> > > > >> > Ok. I think i found the answer by trial and error myself. I was not > > >> able to > > >> > look up the named templates in the hash table for the (pre-) > compiled > > >> > templates. > > >> > > > >> > So i followed Nick's suggest and came up with this first draft: > > >> > https://github.com/GNOME/libxslt/pull/1 > > >> > > >> So I applied the patch, this looks reasonable, we need to add to the > > >> end of the structure, to minimize risk of ABI breakage. Should change > > >> the 1024 value that Nick pointed out. > > >> > > >> > I tested locally and didn't get any surprises. I also created a > simple > > >> test > > >> > with 2 named templates that have the same name to test for collision > > >> > detection. The new code behaves in that case as the previous one. > But so > > >> > far I have no idea yet where and how to add it to the tests in the > test > > >> > suite as this is a test for failure detection. > > >> > > > >> > Looking at a second glance at the imports.c and xsltFindTemplate i > > >> think it > > >> > would be worthwhile to add the new function i created for lookup in > the > > >> > hash table in the file imports.c and try to encapsulate the behavior > > >> there. > > >> > Do you think i should refactor the code in that direction? > > >> > > >> That's where things started to go wrong, I modified > xsltFindTemplate() > > >> to look in the namedTemplatesHash instead of digging the templates > list, > > >> patch is attached, and that raised a number of errors in the > regression > > >> tests. > > >> So it seems to me that namedTemplatesHash isn't initialized with all > the > > >> named templates from the stylesheet and hence the collision detection > > >> is also likely wrong even if it passed an initial simple test. > > >> > > >> So I think we need to give it more scrutiny, i.e. each time a > template > > >> is > > >> added to style->templates we need to make sure we also add it to > > >> style->namedTemplatesHash. > > >> > > >> > > >> > Best regards, > > >> > > > >> > Christian > > >> > > >> > > >> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen < > > >> > christian.cee...@gmail.com> wrote: > > >> > > > >> > > Hi Nick, > > >> > > > > >> > > Thank you very much for the answer. That sounds like a very good > > >> idea. Yet > > >> > > would the validation still be correct, if the hashtable is local > to > > >> the > > >> > > stylesheet in which the template was defined? I was wondering what > > >> would > > >> > > happen if the same template name is defined in two files, where > one > > >> of the > > >> > > files then imports or includes the other. Concerning the imports i > > >> think > > >> > > the standard is clear, the defined template has lower preference, > so > > >> having > > >> > > two with the same name sounds legit to me. But what happens when > the > > >> file > > >> > > is included? > > >> > > My current understanding is that real validation requires to walk > the > > >> > > include tree for each template name. Would that be correctly > > >> replicating > > >> > > the existing behavior? > > >> > > > > >> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet: > > >> > > void *templatesHash; > > >> > > Is this table already containing what we want? > > >> > > > > >> > > Best regards, > > >> > > > > >> > > Christian > > >> > > > > >> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer < > wellnho...@aevum.de > > >> > > > >> > > wrote: > > >> > > > > >> > >> On 19/08/2015 10:47, Christian Ceelen wrote: > > >> > >> > > >> > >>> At the moment i do not have a deep understanding of the libXSLT > > >> code. My > > >> > >>> current guess is, that this point at which the template name is > > >> > >>> validated to > > >> > >>> be unique in the stylesheet. The current algorithm seems to be > > >> walking > > >> > >>> linearly through the list of all already created templates and > is > > >> > >>> comparing > > >> > >>> the name with each. Given that the compilation process is > walking > > >> > >>> through all > > >> > >>> templates this loop means that we have an O(n^2) algorithm > (with n > > >> being > > >> > >>> the > > >> > >>> number of template instances in the stylesheet to compile). > > >> > >>> The huge number of templates in my XSLTs are just so far over > the > > >> edge, > > >> > >>> that > > >> > >>> the compilation takes 35s. I ran a test in which i skipped the > > >> loop. This > > >> > >>> reduced the compilation time to below 2.5 s. > > >> > >>> > > >> > >>> Would anyone let me know if i have understood the code? What > can i > > >> do to > > >> > >>> improve the code that would get easily accepted and released? I > am > > >> open > > >> > >>> to any > > >> > >>> kind of suggestions on what to do to speed this validation step > up > > >> with > > >> > >>> a data > > >> > >>> structure or mechanism already existing ? > > >> > >>> > > >> > >> > > >> > >> Yes, template names are verified by searching a linked list > which is > > >> > >> O(n^2). Template lookup by name uses the same list: > > >> > >> > > >> > >> > > >> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375 > > >> > >> > > >> > >> It shouldn't be too hard to change the code to use an > xmlHashTable: > > >> > >> > > >> > >> > > >> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h > > >> > >> > > >> > >> Simply add a the hash table to struct _xsltStylesheet > > >> > >> > > >> > >> > > >> > >> > > >> > https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509 > > >> > >> > > >> > >> add some initialization and cleanup, then use xmlHashAddEntry2 > and > > >> > >> xmlHashLookup2 with the template's name and namespace URI. This > > >> should make > > >> > >> name verification O(n) overall and lookup of a single template by > > >> name O(1). > > >> > >> > > >> > >> What is now the developer or working repository for libxslt? > > >> > >>> I found the git repository on git.gnome.org < > http://git.gnome.org>, > > >> the > > >> > >>> GNOME/libxslt on github, plus several forks which seem to be > working > > >> > >>> repos. > > >> > >>> Can you point me please to the repo against which i should use > for > > >> > >>> further > > >> > >>> investigation or patches? > > >> > >>> > > >> > >> > > >> > >> The official repo is the one at git.gnome.org: > > >> > >> > > >> > >> https://git.gnome.org/browse/libxslt/ > > >> > >> > > >> > >> Nick > > >> > >> > > >> > >> _______________________________________________ > > >> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/ > > >> > >> xslt@gnome.org > > >> > >> https://mail.gnome.org/mailman/listinfo/xslt > > >> > >> > > >> > > > > >> > > > > >> > > >> > _______________________________________________ > > >> > xslt mailing list, project page http://xmlsoft.org/XSLT/ > > >> > xslt@gnome.org > > >> > https://mail.gnome.org/mailman/listinfo/xslt > > >> > > >> > > >> -- > > >> Daniel Veillard | Open Source and Standards, Red Hat > > >> veill...@redhat.com | libxml Gnome XML XSLT toolkit > http://xmlsoft.org/ > > >> http://veillard.com/ | virtualization library http://libvirt.org/ > > >> > > > > > > > > > _______________________________________________ > > xslt mailing list, project page http://xmlsoft.org/XSLT/ > > xslt@gnome.org > > https://mail.gnome.org/mailman/listinfo/xslt > > > -- > Daniel Veillard | Open Source and Standards, Red Hat > veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ > http://veillard.com/ | virtualization library http://libvirt.org/ > _______________________________________________ > xslt mailing list, project page http://xmlsoft.org/XSLT/ > xslt@gnome.org > https://mail.gnome.org/mailman/listinfo/xslt >
_______________________________________________ xslt mailing list, project page http://xmlsoft.org/XSLT/ xslt@gnome.org https://mail.gnome.org/mailman/listinfo/xslt