Hi, Did anyone find the time to look into this or test this?
Cheers, Christian On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen < christian.cee...@gmail.com> wrote: > Hi, > > i think i no know what i did wrong while filling the hash table. And yes. > My bad ... : > > xslt.c: 5448 > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > template); > > This adds a pointer to the xmlNode in the parsed stylesheet to the hash > table. For just detecting collisions that seems to be ok. > But it is utterly useless for using it in xsltFindTemplate in 'import.c' > as that expects a pointer to the compiled template struct 'xsltTemplate'. > So my initial patch for reusing it doing a bad pointer cast here: > --- a/libxslt/imports.c > +++ b/libxslt/imports.c > @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt, const > xmlChar *name, > return(NULL); > style = ctxt->style; > while (style != NULL) { > - cur = style->templates; > - while (cur != NULL) { > - if (xmlStrEqual(name, cur->name)) { > - if (((nameURI == NULL) && (cur->nameURI == NULL)) || > - ((nameURI != NULL) && (cur->nameURI != NULL) && > - (xmlStrEqual(nameURI, cur->nameURI)))) { > - return(cur); > - } > - } > - cur = cur->next; > - } > - > - style = xsltNextImport(style); > + if(style->namedTemplatesHash != NULL){ > + cur = (xsltTemplatePtr) > xmlHashLookup2(style->namedTemplatesHash, name, nameURI); > + if(cur != NULL) > + return (cur); > + } > + style = xsltNextImport(style); > } > return(NULL); > } > > This explains why it was not working as i had hoped. The hash table > returns the xmlNodePtr instead of the xsltTemplatePtr expected by > xsltFindTemplate. The patch Daniel does the same, it expects > xsltTemplatePtr. > > So changing in xslt.c: 5448 > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > template); > to > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI, > ret); > Makes it possible to reuse the hash table for the named templates also in > xsltFindTemplate. > > I pushed the updated changes to github and attached a clean patch to the > email. It is created against 1.1.28. > > The patch adds the hash table initialization and population to > xsltAddTemplate (pattern.c), sets the initial size to a more reasonable 8 > entries, reuses it in xsltFindTemplate (imports.c), and adds a test case to > check that an error is emitted during compilation for template names that > are not unique. > > So far i haven't replaced the helper functions as Nico suggested, as i was > thinking of moving the look up code to imports.c and share part of a lookup > local to a single stylesheet between xsltFindTemplate and > xsltParseStylesheetTemplate. I am not sure if you rather go for a small > code duplication or for having small helper classes. In terms of > performance i doubt that there is a noticeable difference. > > Best regards, > Christian > > > On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veill...@redhat.com> > wrote: > >> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote: >> > Hi, >> > >> > Ok. I think i found the answer by trial and error myself. I was not >> able to >> > look up the named templates in the hash table for the (pre-) compiled >> > templates. >> > >> > So i followed Nick's suggest and came up with this first draft: >> > https://github.com/GNOME/libxslt/pull/1 >> >> So I applied the patch, this looks reasonable, we need to add to the >> end of the structure, to minimize risk of ABI breakage. Should change >> the 1024 value that Nick pointed out. >> >> > I tested locally and didn't get any surprises. I also created a simple >> test >> > with 2 named templates that have the same name to test for collision >> > detection. The new code behaves in that case as the previous one. But so >> > far I have no idea yet where and how to add it to the tests in the test >> > suite as this is a test for failure detection. >> > >> > Looking at a second glance at the imports.c and xsltFindTemplate i >> think it >> > would be worthwhile to add the new function i created for lookup in the >> > hash table in the file imports.c and try to encapsulate the behavior >> there. >> > Do you think i should refactor the code in that direction? >> >> That's where things started to go wrong, I modified xsltFindTemplate() >> to look in the namedTemplatesHash instead of digging the templates list, >> patch is attached, and that raised a number of errors in the regression >> tests. >> So it seems to me that namedTemplatesHash isn't initialized with all the >> named templates from the stylesheet and hence the collision detection >> is also likely wrong even if it passed an initial simple test. >> >> So I think we need to give it more scrutiny, i.e. each time a template >> is >> added to style->templates we need to make sure we also add it to >> style->namedTemplatesHash. >> >> >> > Best regards, >> > >> > Christian >> >> >> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen < >> > christian.cee...@gmail.com> wrote: >> > >> > > Hi Nick, >> > > >> > > Thank you very much for the answer. That sounds like a very good >> idea. Yet >> > > would the validation still be correct, if the hashtable is local to >> the >> > > stylesheet in which the template was defined? I was wondering what >> would >> > > happen if the same template name is defined in two files, where one >> of the >> > > files then imports or includes the other. Concerning the imports i >> think >> > > the standard is clear, the defined template has lower preference, so >> having >> > > two with the same name sounds legit to me. But what happens when the >> file >> > > is included? >> > > My current understanding is that real validation requires to walk the >> > > include tree for each template name. Would that be correctly >> replicating >> > > the existing behavior? >> > > >> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet: >> > > void *templatesHash; >> > > Is this table already containing what we want? >> > > >> > > Best regards, >> > > >> > > Christian >> > > >> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer <wellnho...@aevum.de >> > >> > > wrote: >> > > >> > >> On 19/08/2015 10:47, Christian Ceelen wrote: >> > >> >> > >>> At the moment i do not have a deep understanding of the libXSLT >> code. My >> > >>> current guess is, that this point at which the template name is >> > >>> validated to >> > >>> be unique in the stylesheet. The current algorithm seems to be >> walking >> > >>> linearly through the list of all already created templates and is >> > >>> comparing >> > >>> the name with each. Given that the compilation process is walking >> > >>> through all >> > >>> templates this loop means that we have an O(n^2) algorithm (with n >> being >> > >>> the >> > >>> number of template instances in the stylesheet to compile). >> > >>> The huge number of templates in my XSLTs are just so far over the >> edge, >> > >>> that >> > >>> the compilation takes 35s. I ran a test in which i skipped the >> loop. This >> > >>> reduced the compilation time to below 2.5 s. >> > >>> >> > >>> Would anyone let me know if i have understood the code? What can i >> do to >> > >>> improve the code that would get easily accepted and released? I am >> open >> > >>> to any >> > >>> kind of suggestions on what to do to speed this validation step up >> with >> > >>> a data >> > >>> structure or mechanism already existing ? >> > >>> >> > >> >> > >> Yes, template names are verified by searching a linked list which is >> > >> O(n^2). Template lookup by name uses the same list: >> > >> >> > >> >> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375 >> > >> >> > >> It shouldn't be too hard to change the code to use an xmlHashTable: >> > >> >> > >> >> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h >> > >> >> > >> Simply add a the hash table to struct _xsltStylesheet >> > >> >> > >> >> > >> >> https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509 >> > >> >> > >> add some initialization and cleanup, then use xmlHashAddEntry2 and >> > >> xmlHashLookup2 with the template's name and namespace URI. This >> should make >> > >> name verification O(n) overall and lookup of a single template by >> name O(1). >> > >> >> > >> What is now the developer or working repository for libxslt? >> > >>> I found the git repository on git.gnome.org <http://git.gnome.org>, >> the >> > >>> GNOME/libxslt on github, plus several forks which seem to be working >> > >>> repos. >> > >>> Can you point me please to the repo against which i should use for >> > >>> further >> > >>> investigation or patches? >> > >>> >> > >> >> > >> The official repo is the one at git.gnome.org: >> > >> >> > >> https://git.gnome.org/browse/libxslt/ >> > >> >> > >> Nick >> > >> >> > >> _______________________________________________ >> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/ >> > >> xslt@gnome.org >> > >> https://mail.gnome.org/mailman/listinfo/xslt >> > >> >> > > >> > > >> >> > _______________________________________________ >> > xslt mailing list, project page http://xmlsoft.org/XSLT/ >> > xslt@gnome.org >> > https://mail.gnome.org/mailman/listinfo/xslt >> >> >> -- >> Daniel Veillard | Open Source and Standards, Red Hat >> veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ >> http://veillard.com/ | virtualization library http://libvirt.org/ >> > >
_______________________________________________ xslt mailing list, project page http://xmlsoft.org/XSLT/ xslt@gnome.org https://mail.gnome.org/mailman/listinfo/xslt