Hi,

Did anyone find the time to look into this or test this?

Cheers,
Christian

On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen <
christian.cee...@gmail.com> wrote:

> Hi,
>
> i think i no know what i did wrong while filling the hash table. And yes.
> My bad ... :
>
> xslt.c: 5448
> + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> template);
>
> This adds a pointer to the xmlNode in the parsed stylesheet to the hash
> table. For just detecting collisions that seems to be ok.
> But it is utterly useless for using it in xsltFindTemplate in 'import.c'
> as that expects a pointer to the compiled template struct 'xsltTemplate'.
> So my initial patch for reusing it doing a bad pointer cast here:
> --- a/libxslt/imports.c
> +++ b/libxslt/imports.c
> @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt, const
> xmlChar *name,
>         return(NULL);
>      style = ctxt->style;
>      while (style != NULL) {
> -       cur = style->templates;
> -       while (cur != NULL) {
> -           if (xmlStrEqual(name, cur->name)) {
> -               if (((nameURI == NULL) && (cur->nameURI == NULL)) ||
> -                   ((nameURI != NULL) && (cur->nameURI != NULL) &&
> -                    (xmlStrEqual(nameURI, cur->nameURI)))) {
> -                   return(cur);
> -               }
> -           }
> -           cur = cur->next;
> -       }
> -
> -       style = xsltNextImport(style);
> +        if(style->namedTemplatesHash != NULL){
> +            cur = (xsltTemplatePtr)
> xmlHashLookup2(style->namedTemplatesHash, name, nameURI);
> +            if(cur != NULL)
> +                return (cur);
> +        }
> +        style = xsltNextImport(style);
>      }
>      return(NULL);
>  }
>
> This explains why it was not working as i had hoped. The hash table
> returns the xmlNodePtr instead of the xsltTemplatePtr expected by
> xsltFindTemplate. The patch Daniel does the same, it expects
> xsltTemplatePtr.
>
> So changing in xslt.c: 5448
> + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> template);
> to
> + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> ret);
> Makes it possible to reuse the hash table for the named templates also in
> xsltFindTemplate.
>
> I pushed the updated changes to github and attached a clean patch to the
> email. It is created against 1.1.28.
>
> The patch adds the hash table initialization and population to
> xsltAddTemplate (pattern.c), sets the initial size to a more reasonable 8
> entries, reuses it in xsltFindTemplate (imports.c), and adds a test case to
> check that an error is emitted during compilation for template names that
> are not unique.
>
> So far i haven't replaced the helper functions as Nico suggested, as i was
> thinking of moving the look up code to imports.c and share part of a lookup
> local to a single stylesheet between xsltFindTemplate and
> xsltParseStylesheetTemplate. I am not sure if you rather go for a small
> code duplication or for having small helper classes. In terms of
> performance i doubt that there is a noticeable difference.
>
> Best regards,
> Christian
>
>
> On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veill...@redhat.com>
> wrote:
>
>> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote:
>> > Hi,
>> >
>> > Ok. I think i found the answer by trial and error myself. I was not
>> able to
>> > look up the named templates in the hash table for the (pre-) compiled
>> > templates.
>> >
>> > So i followed Nick's suggest and came up with this first draft:
>> >    https://github.com/GNOME/libxslt/pull/1
>>
>>   So I applied the patch, this looks reasonable, we need to add to the
>> end of the structure, to minimize risk of ABI breakage. Should change
>> the 1024 value that Nick pointed out.
>>
>> > I tested locally and didn't get any surprises. I also created a simple
>> test
>> > with 2 named templates that have the same name to test for collision
>> > detection. The new code behaves in that case as the previous one. But so
>> > far I have no idea yet where and how to add it to the tests in the test
>> > suite as this is a test for failure detection.
>> >
>> > Looking at a second glance at the imports.c and xsltFindTemplate i
>> think it
>> > would be worthwhile to add the new function i created for lookup in the
>> > hash table in the file imports.c and try to encapsulate the behavior
>> there.
>> > Do you think i should refactor the code in that direction?
>>
>>   That's where things started to go wrong, I modified xsltFindTemplate()
>> to look in the namedTemplatesHash instead of digging the templates list,
>> patch is attached, and that raised a number of errors in the regression
>> tests.
>> So it seems to me that namedTemplatesHash isn't initialized with all the
>> named templates from the stylesheet and hence the collision detection
>> is also likely wrong even if it passed an initial simple test.
>>
>>   So I think we need to give it more scrutiny, i.e. each time a template
>> is
>> added to style->templates we need to make sure we also add it to
>> style->namedTemplatesHash.
>>
>>
>> > Best regards,
>> >
>> > Christian
>>
>>
>> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen <
>> > christian.cee...@gmail.com> wrote:
>> >
>> > > Hi Nick,
>> > >
>> > > Thank you very much for the answer. That sounds like a very good
>> idea. Yet
>> > > would the validation still be correct, if the hashtable is local to
>> the
>> > > stylesheet in which the template was defined? I was wondering what
>> would
>> > > happen if the same template name is defined in two files, where one
>> of the
>> > > files then imports or includes the other. Concerning the imports i
>> think
>> > > the standard is clear, the defined template has lower preference, so
>> having
>> > > two with the same name sounds legit to me. But what happens when the
>> file
>> > > is included?
>> > > My current understanding is that real validation requires to walk the
>> > > include tree for each template name. Would that be correctly
>> replicating
>> > > the existing behavior?
>> > >
>> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet:
>> > > void *templatesHash;
>> > > Is this table already containing what we want?
>> > >
>> > > Best regards,
>> > >
>> > > Christian
>> > >
>> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer <wellnho...@aevum.de
>> >
>> > > wrote:
>> > >
>> > >> On 19/08/2015 10:47, Christian Ceelen wrote:
>> > >>
>> > >>> At the moment i do not have a deep understanding of the libXSLT
>> code. My
>> > >>> current guess is, that this point at which the template name is
>> > >>> validated to
>> > >>> be unique in the stylesheet. The current algorithm seems to be
>> walking
>> > >>> linearly through the list of all already created templates and is
>> > >>> comparing
>> > >>> the name with each. Given that the compilation process is walking
>> > >>> through all
>> > >>> templates this loop means that we have an O(n^2) algorithm (with n
>> being
>> > >>> the
>> > >>> number of template instances in the stylesheet to compile).
>> > >>> The huge number of templates in my XSLTs are just so far over the
>> edge,
>> > >>> that
>> > >>> the compilation takes 35s. I ran a test in which i skipped the
>> loop. This
>> > >>> reduced the compilation time to below 2.5 s.
>> > >>>
>> > >>> Would anyone let me know if i have understood the code? What can i
>> do to
>> > >>> improve the code that would get easily accepted and released? I am
>> open
>> > >>> to any
>> > >>> kind of suggestions on what to do to speed this validation step up
>> with
>> > >>> a data
>> > >>> structure or mechanism already existing ?
>> > >>>
>> > >>
>> > >> Yes, template names are verified by searching a linked list which is
>> > >> O(n^2). Template lookup by name uses the same list:
>> > >>
>> > >>
>> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375
>> > >>
>> > >> It shouldn't be too hard to change the code to use an xmlHashTable:
>> > >>
>> > >>
>> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h
>> > >>
>> > >> Simply add a the hash table to struct _xsltStylesheet
>> > >>
>> > >>
>> > >>
>> https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509
>> > >>
>> > >> add some initialization and cleanup, then use xmlHashAddEntry2 and
>> > >> xmlHashLookup2 with the template's name and namespace URI. This
>> should make
>> > >> name verification O(n) overall and lookup of a single template by
>> name O(1).
>> > >>
>> > >> What is now the developer or working repository for libxslt?
>> > >>> I found the git repository on git.gnome.org <http://git.gnome.org>,
>> the
>> > >>> GNOME/libxslt on github, plus several forks which seem to be working
>> > >>> repos.
>> > >>> Can you point me please to the repo against which i should use for
>> > >>> further
>> > >>> investigation or patches?
>> > >>>
>> > >>
>> > >> The official repo is the one at git.gnome.org:
>> > >>
>> > >>     https://git.gnome.org/browse/libxslt/
>> > >>
>> > >> Nick
>> > >>
>> > >> _______________________________________________
>> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/
>> > >> xslt@gnome.org
>> > >> https://mail.gnome.org/mailman/listinfo/xslt
>> > >>
>> > >
>> > >
>>
>> > _______________________________________________
>> > xslt mailing list, project page http://xmlsoft.org/XSLT/
>> > xslt@gnome.org
>> > https://mail.gnome.org/mailman/listinfo/xslt
>>
>>
>> --
>> Daniel Veillard      | Open Source and Standards, Red Hat
>> veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
>> http://veillard.com/ | virtualization library  http://libvirt.org/
>>
>
>
_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt@gnome.org
https://mail.gnome.org/mailman/listinfo/xslt

Reply via email to