On Mon, Sep 07, 2015 at 09:47:40AM +0200, Christian Ceelen wrote: > Hi, > > Did anyone find the time to look into this or test this?
Sorry Christian, not forgotten but not done yet, will try this week ! thanks, Daniel > Cheers, > Christian > > On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen < > christian.cee...@gmail.com> wrote: > > > Hi, > > > > i think i no know what i did wrong while filling the hash table. And yes. > > My bad ... : > > > > xslt.c: 5448 > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > template); > > > > This adds a pointer to the xmlNode in the parsed stylesheet to the hash > > table. For just detecting collisions that seems to be ok. > > But it is utterly useless for using it in xsltFindTemplate in 'import.c' > > as that expects a pointer to the compiled template struct 'xsltTemplate'. > > So my initial patch for reusing it doing a bad pointer cast here: > > --- a/libxslt/imports.c > > +++ b/libxslt/imports.c > > @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt, const > > xmlChar *name, > > return(NULL); > > style = ctxt->style; > > while (style != NULL) { > > - cur = style->templates; > > - while (cur != NULL) { > > - if (xmlStrEqual(name, cur->name)) { > > - if (((nameURI == NULL) && (cur->nameURI == NULL)) || > > - ((nameURI != NULL) && (cur->nameURI != NULL) && > > - (xmlStrEqual(nameURI, cur->nameURI)))) { > > - return(cur); > > - } > > - } > > - cur = cur->next; > > - } > > - > > - style = xsltNextImport(style); > > + if(style->namedTemplatesHash != NULL){ > > + cur = (xsltTemplatePtr) > > xmlHashLookup2(style->namedTemplatesHash, name, nameURI); > > + if(cur != NULL) > > + return (cur); > > + } > > + style = xsltNextImport(style); > > } > > return(NULL); > > } > > > > This explains why it was not working as i had hoped. The hash table > > returns the xmlNodePtr instead of the xsltTemplatePtr expected by > > xsltFindTemplate. The patch Daniel does the same, it expects > > xsltTemplatePtr. > > > > So changing in xslt.c: 5448 > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > template); > > to > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > > ret); > > Makes it possible to reuse the hash table for the named templates also in > > xsltFindTemplate. > > > > I pushed the updated changes to github and attached a clean patch to the > > email. It is created against 1.1.28. > > > > The patch adds the hash table initialization and population to > > xsltAddTemplate (pattern.c), sets the initial size to a more reasonable 8 > > entries, reuses it in xsltFindTemplate (imports.c), and adds a test case to > > check that an error is emitted during compilation for template names that > > are not unique. > > > > So far i haven't replaced the helper functions as Nico suggested, as i was > > thinking of moving the look up code to imports.c and share part of a lookup > > local to a single stylesheet between xsltFindTemplate and > > xsltParseStylesheetTemplate. I am not sure if you rather go for a small > > code duplication or for having small helper classes. In terms of > > performance i doubt that there is a noticeable difference. > > > > Best regards, > > Christian > > > > > > On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veill...@redhat.com> > > wrote: > > > >> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote: > >> > Hi, > >> > > >> > Ok. I think i found the answer by trial and error myself. I was not > >> able to > >> > look up the named templates in the hash table for the (pre-) compiled > >> > templates. > >> > > >> > So i followed Nick's suggest and came up with this first draft: > >> > https://github.com/GNOME/libxslt/pull/1 > >> > >> So I applied the patch, this looks reasonable, we need to add to the > >> end of the structure, to minimize risk of ABI breakage. Should change > >> the 1024 value that Nick pointed out. > >> > >> > I tested locally and didn't get any surprises. I also created a simple > >> test > >> > with 2 named templates that have the same name to test for collision > >> > detection. The new code behaves in that case as the previous one. But so > >> > far I have no idea yet where and how to add it to the tests in the test > >> > suite as this is a test for failure detection. > >> > > >> > Looking at a second glance at the imports.c and xsltFindTemplate i > >> think it > >> > would be worthwhile to add the new function i created for lookup in the > >> > hash table in the file imports.c and try to encapsulate the behavior > >> there. > >> > Do you think i should refactor the code in that direction? > >> > >> That's where things started to go wrong, I modified xsltFindTemplate() > >> to look in the namedTemplatesHash instead of digging the templates list, > >> patch is attached, and that raised a number of errors in the regression > >> tests. > >> So it seems to me that namedTemplatesHash isn't initialized with all the > >> named templates from the stylesheet and hence the collision detection > >> is also likely wrong even if it passed an initial simple test. > >> > >> So I think we need to give it more scrutiny, i.e. each time a template > >> is > >> added to style->templates we need to make sure we also add it to > >> style->namedTemplatesHash. > >> > >> > >> > Best regards, > >> > > >> > Christian > >> > >> > >> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen < > >> > christian.cee...@gmail.com> wrote: > >> > > >> > > Hi Nick, > >> > > > >> > > Thank you very much for the answer. That sounds like a very good > >> idea. Yet > >> > > would the validation still be correct, if the hashtable is local to > >> the > >> > > stylesheet in which the template was defined? I was wondering what > >> would > >> > > happen if the same template name is defined in two files, where one > >> of the > >> > > files then imports or includes the other. Concerning the imports i > >> think > >> > > the standard is clear, the defined template has lower preference, so > >> having > >> > > two with the same name sounds legit to me. But what happens when the > >> file > >> > > is included? > >> > > My current understanding is that real validation requires to walk the > >> > > include tree for each template name. Would that be correctly > >> replicating > >> > > the existing behavior? > >> > > > >> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet: > >> > > void *templatesHash; > >> > > Is this table already containing what we want? > >> > > > >> > > Best regards, > >> > > > >> > > Christian > >> > > > >> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer <wellnho...@aevum.de > >> > > >> > > wrote: > >> > > > >> > >> On 19/08/2015 10:47, Christian Ceelen wrote: > >> > >> > >> > >>> At the moment i do not have a deep understanding of the libXSLT > >> code. My > >> > >>> current guess is, that this point at which the template name is > >> > >>> validated to > >> > >>> be unique in the stylesheet. The current algorithm seems to be > >> walking > >> > >>> linearly through the list of all already created templates and is > >> > >>> comparing > >> > >>> the name with each. Given that the compilation process is walking > >> > >>> through all > >> > >>> templates this loop means that we have an O(n^2) algorithm (with n > >> being > >> > >>> the > >> > >>> number of template instances in the stylesheet to compile). > >> > >>> The huge number of templates in my XSLTs are just so far over the > >> edge, > >> > >>> that > >> > >>> the compilation takes 35s. I ran a test in which i skipped the > >> loop. This > >> > >>> reduced the compilation time to below 2.5 s. > >> > >>> > >> > >>> Would anyone let me know if i have understood the code? What can i > >> do to > >> > >>> improve the code that would get easily accepted and released? I am > >> open > >> > >>> to any > >> > >>> kind of suggestions on what to do to speed this validation step up > >> with > >> > >>> a data > >> > >>> structure or mechanism already existing ? > >> > >>> > >> > >> > >> > >> Yes, template names are verified by searching a linked list which is > >> > >> O(n^2). Template lookup by name uses the same list: > >> > >> > >> > >> > >> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375 > >> > >> > >> > >> It shouldn't be too hard to change the code to use an xmlHashTable: > >> > >> > >> > >> > >> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h > >> > >> > >> > >> Simply add a the hash table to struct _xsltStylesheet > >> > >> > >> > >> > >> > >> > >> https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509 > >> > >> > >> > >> add some initialization and cleanup, then use xmlHashAddEntry2 and > >> > >> xmlHashLookup2 with the template's name and namespace URI. This > >> should make > >> > >> name verification O(n) overall and lookup of a single template by > >> name O(1). > >> > >> > >> > >> What is now the developer or working repository for libxslt? > >> > >>> I found the git repository on git.gnome.org <http://git.gnome.org>, > >> the > >> > >>> GNOME/libxslt on github, plus several forks which seem to be working > >> > >>> repos. > >> > >>> Can you point me please to the repo against which i should use for > >> > >>> further > >> > >>> investigation or patches? > >> > >>> > >> > >> > >> > >> The official repo is the one at git.gnome.org: > >> > >> > >> > >> https://git.gnome.org/browse/libxslt/ > >> > >> > >> > >> Nick > >> > >> > >> > >> _______________________________________________ > >> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/ > >> > >> xslt@gnome.org > >> > >> https://mail.gnome.org/mailman/listinfo/xslt > >> > >> > >> > > > >> > > > >> > >> > _______________________________________________ > >> > xslt mailing list, project page http://xmlsoft.org/XSLT/ > >> > xslt@gnome.org > >> > https://mail.gnome.org/mailman/listinfo/xslt > >> > >> > >> -- > >> Daniel Veillard | Open Source and Standards, Red Hat > >> veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ > >> http://veillard.com/ | virtualization library http://libvirt.org/ > >> > > > > > _______________________________________________ > xslt mailing list, project page http://xmlsoft.org/XSLT/ > xslt@gnome.org > https://mail.gnome.org/mailman/listinfo/xslt -- Daniel Veillard | Open Source and Standards, Red Hat veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/ _______________________________________________ xslt mailing list, project page http://xmlsoft.org/XSLT/ xslt@gnome.org https://mail.gnome.org/mailman/listinfo/xslt