Hi,

No worries. I'm pretty sure this won't be forgotten in the long run.
Let me know about your ideas on how to improve the code.

Cheers,
Christian

On Mon, Sep 14, 2015 at 3:38 AM, Daniel Veillard <veill...@redhat.com>
wrote:

> On Mon, Sep 07, 2015 at 09:47:40AM +0200, Christian Ceelen wrote:
> > Hi,
> >
> > Did anyone find the time to look into this or test this?
>
>   Sorry Christian,
>
> not forgotten but not done yet, will try this week !
>
>   thanks,
>
> Daniel
>
> > Cheers,
> > Christian
> >
> > On Mon, Aug 31, 2015 at 11:47 AM, Christian Ceelen <
> > christian.cee...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > i think i no know what i did wrong while filling the hash table. And
> yes.
> > > My bad ... :
> > >
> > > xslt.c: 5448
> > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > > template);
> > >
> > > This adds a pointer to the xmlNode in the parsed stylesheet to the hash
> > > table. For just detecting collisions that seems to be ok.
> > > But it is utterly useless for using it in xsltFindTemplate in
> 'import.c'
> > > as that expects a pointer to the compiled template struct
> 'xsltTemplate'.
> > > So my initial patch for reusing it doing a bad pointer cast here:
> > > --- a/libxslt/imports.c
> > > +++ b/libxslt/imports.c
> > > @@ -400,19 +400,12 @@ xsltFindTemplate(xsltTransformContextPtr ctxt,
> const
> > > xmlChar *name,
> > >         return(NULL);
> > >      style = ctxt->style;
> > >      while (style != NULL) {
> > > -       cur = style->templates;
> > > -       while (cur != NULL) {
> > > -           if (xmlStrEqual(name, cur->name)) {
> > > -               if (((nameURI == NULL) && (cur->nameURI == NULL)) ||
> > > -                   ((nameURI != NULL) && (cur->nameURI != NULL) &&
> > > -                    (xmlStrEqual(nameURI, cur->nameURI)))) {
> > > -                   return(cur);
> > > -               }
> > > -           }
> > > -           cur = cur->next;
> > > -       }
> > > -
> > > -       style = xsltNextImport(style);
> > > +        if(style->namedTemplatesHash != NULL){
> > > +            cur = (xsltTemplatePtr)
> > > xmlHashLookup2(style->namedTemplatesHash, name, nameURI);
> > > +            if(cur != NULL)
> > > +                return (cur);
> > > +        }
> > > +        style = xsltNextImport(style);
> > >      }
> > >      return(NULL);
> > >  }
> > >
> > > This explains why it was not working as i had hoped. The hash table
> > > returns the xmlNodePtr instead of the xsltTemplatePtr expected by
> > > xsltFindTemplate. The patch Daniel does the same, it expects
> > > xsltTemplatePtr.
> > >
> > > So changing in xslt.c: 5448
> > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > > template);
> > > to
> > > + xmlHashAddEntry2(style->namedTemplatesHash, ret->name, ret->nameURI,
> > > ret);
> > > Makes it possible to reuse the hash table for the named templates also
> in
> > > xsltFindTemplate.
> > >
> > > I pushed the updated changes to github and attached a clean patch to
> the
> > > email. It is created against 1.1.28.
> > >
> > > The patch adds the hash table initialization and population to
> > > xsltAddTemplate (pattern.c), sets the initial size to a more
> reasonable 8
> > > entries, reuses it in xsltFindTemplate (imports.c), and adds a test
> case to
> > > check that an error is emitted during compilation for template names
> that
> > > are not unique.
> > >
> > > So far i haven't replaced the helper functions as Nico suggested, as i
> was
> > > thinking of moving the look up code to imports.c and share part of a
> lookup
> > > local to a single stylesheet between xsltFindTemplate and
> > > xsltParseStylesheetTemplate. I am not sure if you rather go for a small
> > > code duplication or for having small helper classes. In terms of
> > > performance i doubt that there is a noticeable difference.
> > >
> > > Best regards,
> > > Christian
> > >
> > >
> > > On Mon, Aug 31, 2015 at 8:27 AM, Daniel Veillard <veill...@redhat.com>
> > > wrote:
> > >
> > >> On Thu, Aug 20, 2015 at 07:21:50PM +0200, Christian Ceelen wrote:
> > >> > Hi,
> > >> >
> > >> > Ok. I think i found the answer by trial and error myself. I was not
> > >> able to
> > >> > look up the named templates in the hash table for the (pre-)
> compiled
> > >> > templates.
> > >> >
> > >> > So i followed Nick's suggest and came up with this first draft:
> > >> >    https://github.com/GNOME/libxslt/pull/1
> > >>
> > >>   So I applied the patch, this looks reasonable, we need to add to the
> > >> end of the structure, to minimize risk of ABI breakage. Should change
> > >> the 1024 value that Nick pointed out.
> > >>
> > >> > I tested locally and didn't get any surprises. I also created a
> simple
> > >> test
> > >> > with 2 named templates that have the same name to test for collision
> > >> > detection. The new code behaves in that case as the previous one.
> But so
> > >> > far I have no idea yet where and how to add it to the tests in the
> test
> > >> > suite as this is a test for failure detection.
> > >> >
> > >> > Looking at a second glance at the imports.c and xsltFindTemplate i
> > >> think it
> > >> > would be worthwhile to add the new function i created for lookup in
> the
> > >> > hash table in the file imports.c and try to encapsulate the behavior
> > >> there.
> > >> > Do you think i should refactor the code in that direction?
> > >>
> > >>   That's where things started to go wrong, I modified
> xsltFindTemplate()
> > >> to look in the namedTemplatesHash instead of digging the templates
> list,
> > >> patch is attached, and that raised a number of errors in the
> regression
> > >> tests.
> > >> So it seems to me that namedTemplatesHash isn't initialized with all
> the
> > >> named templates from the stylesheet and hence the collision detection
> > >> is also likely wrong even if it passed an initial simple test.
> > >>
> > >>   So I think we need to give it more scrutiny, i.e. each time a
> template
> > >> is
> > >> added to style->templates we need to make sure we also add it to
> > >> style->namedTemplatesHash.
> > >>
> > >>
> > >> > Best regards,
> > >> >
> > >> > Christian
> > >>
> > >>
> > >> > On Thu, Aug 20, 2015 at 10:28 AM, Christian Ceelen <
> > >> > christian.cee...@gmail.com> wrote:
> > >> >
> > >> > > Hi Nick,
> > >> > >
> > >> > > Thank you very much for the answer. That sounds like a very good
> > >> idea. Yet
> > >> > > would the validation still be correct, if the hashtable is local
> to
> > >> the
> > >> > > stylesheet in which the template was defined? I was wondering what
> > >> would
> > >> > > happen if the same template name is defined in two files, where
> one
> > >> of the
> > >> > > files then imports or includes the other. Concerning the imports i
> > >> think
> > >> > > the standard is clear, the defined template has lower preference,
> so
> > >> having
> > >> > > two with the same name sounds legit to me. But what happens when
> the
> > >> file
> > >> > > is included?
> > >> > > My current understanding is that real validation requires to walk
> the
> > >> > > include tree for each template name. Would that be correctly
> > >> replicating
> > >> > > the existing behavior?
> > >> > >
> > >> > > Btw. In xsltInernals.h:L1514 i found in struct _xsltStylesheet:
> > >> > > void *templatesHash;
> > >> > > Is this table already containing what we want?
> > >> > >
> > >> > > Best regards,
> > >> > >
> > >> > > Christian
> > >> > >
> > >> > > On Wed, Aug 19, 2015 at 1:00 PM, Nick Wellnhofer <
> wellnho...@aevum.de
> > >> >
> > >> > > wrote:
> > >> > >
> > >> > >> On 19/08/2015 10:47, Christian Ceelen wrote:
> > >> > >>
> > >> > >>> At the moment i do not have a deep understanding of the libXSLT
> > >> code. My
> > >> > >>> current guess is, that this point at which the template name is
> > >> > >>> validated to
> > >> > >>> be unique in the stylesheet. The current algorithm seems to be
> > >> walking
> > >> > >>> linearly through the list of all already created templates and
> is
> > >> > >>> comparing
> > >> > >>> the name with each. Given that the compilation process is
> walking
> > >> > >>> through all
> > >> > >>> templates this loop means that we have an O(n^2) algorithm
> (with n
> > >> being
> > >> > >>> the
> > >> > >>> number of template instances in the stylesheet to compile).
> > >> > >>> The huge number of templates in my XSLTs are just so far over
> the
> > >> edge,
> > >> > >>> that
> > >> > >>> the compilation takes 35s. I ran a test in which i skipped the
> > >> loop. This
> > >> > >>> reduced the compilation time to below 2.5 s.
> > >> > >>>
> > >> > >>> Would anyone let me know if i have understood the code? What
> can i
> > >> do to
> > >> > >>> improve the code that would get easily accepted and released? I
> am
> > >> open
> > >> > >>> to any
> > >> > >>> kind of suggestions on what to do to speed this validation step
> up
> > >> with
> > >> > >>> a data
> > >> > >>> structure or mechanism already existing ?
> > >> > >>>
> > >> > >>
> > >> > >> Yes, template names are verified by searching a linked list
> which is
> > >> > >> O(n^2). Template lookup by name uses the same list:
> > >> > >>
> > >> > >>
> > >> https://github.com/GNOME/libxslt/blob/master/libxslt/imports.c#L375
> > >> > >>
> > >> > >> It shouldn't be too hard to change the code to use an
> xmlHashTable:
> > >> > >>
> > >> > >>
> > >> https://github.com/GNOME/libxml2/blob/master/include/libxml/hash.h
> > >> > >>
> > >> > >> Simply add a the hash table to struct _xsltStylesheet
> > >> > >>
> > >> > >>
> > >> > >>
> > >>
> https://github.com/GNOME/libxslt/blob/master/libxslt/xsltInternals.h#L1509
> > >> > >>
> > >> > >> add some initialization and cleanup, then use xmlHashAddEntry2
> and
> > >> > >> xmlHashLookup2 with the template's name and namespace URI. This
> > >> should make
> > >> > >> name verification O(n) overall and lookup of a single template by
> > >> name O(1).
> > >> > >>
> > >> > >> What is now the developer or working repository for libxslt?
> > >> > >>> I found the git repository on git.gnome.org <
> http://git.gnome.org>,
> > >> the
> > >> > >>> GNOME/libxslt on github, plus several forks which seem to be
> working
> > >> > >>> repos.
> > >> > >>> Can you point me please to the repo against which i should use
> for
> > >> > >>> further
> > >> > >>> investigation or patches?
> > >> > >>>
> > >> > >>
> > >> > >> The official repo is the one at git.gnome.org:
> > >> > >>
> > >> > >>     https://git.gnome.org/browse/libxslt/
> > >> > >>
> > >> > >> Nick
> > >> > >>
> > >> > >> _______________________________________________
> > >> > >> xslt mailing list, project page http://xmlsoft.org/XSLT/
> > >> > >> xslt@gnome.org
> > >> > >> https://mail.gnome.org/mailman/listinfo/xslt
> > >> > >>
> > >> > >
> > >> > >
> > >>
> > >> > _______________________________________________
> > >> > xslt mailing list, project page http://xmlsoft.org/XSLT/
> > >> > xslt@gnome.org
> > >> > https://mail.gnome.org/mailman/listinfo/xslt
> > >>
> > >>
> > >> --
> > >> Daniel Veillard      | Open Source and Standards, Red Hat
> > >> veill...@redhat.com  | libxml Gnome XML XSLT toolkit
> http://xmlsoft.org/
> > >> http://veillard.com/ | virtualization library  http://libvirt.org/
> > >>
> > >
> > >
>
> > _______________________________________________
> > xslt mailing list, project page http://xmlsoft.org/XSLT/
> > xslt@gnome.org
> > https://mail.gnome.org/mailman/listinfo/xslt
>
>
> --
> Daniel Veillard      | Open Source and Standards, Red Hat
> veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
> http://veillard.com/ | virtualization library  http://libvirt.org/
> _______________________________________________
> xslt mailing list, project page http://xmlsoft.org/XSLT/
> xslt@gnome.org
> https://mail.gnome.org/mailman/listinfo/xslt
>
_______________________________________________
xslt mailing list, project page http://xmlsoft.org/XSLT/
xslt@gnome.org
https://mail.gnome.org/mailman/listinfo/xslt

Reply via email to