Re: Xalan transform hand-built DOM

Peter Marron Mon, 05 Jun 2006 07:56:24 -0700

Hi Dave,

Thank you once again for your helpful reply.

>>
>> I have, as suggested, created a trivial program that illustrates the
>> problem.
>> It is based on the XalanTransform example with minimal changes.
>> It is attached.
>>
>> Hopefully it shows where I am going wrong.
>> Maybe the problem is with my approach rather than any bug in Xalan.
>
>OK, the bad news is this is really something we haven't supported in a
>very long time. In fact, I'm not sure if XalanTransformer ever
>supported using a pre-built tree in this way. We should probably remove
>the XalanNode constructor from XSLTInputSource.

Well, from my point of view that would be very helpful
as it would have saved me from some frustration.

>The good news is the way you can do this is fairly straightforward. You
>need to create your own derivative of XalanParsedSource, and wrap your
>document instance in that. You really need to do a bit more work to get
>your own XalanDocument implementation to work anyway.

OK. I had started down the path of using XalanParsedSource
and given up. However, given your suggestion I tried it again
and have finally succeeded in getting a trivial transformation to
work. A number of problems meant that it took a lot longer
than I would have liked, but none of them was too obscure.

I am not sure exactly what you meant about
getting my implementation of XalanDocument to work.
I know (now) about the bug in getFirstChild/getLastChild
and I plan to implement the getElementById function.
But was there something else that I have missed?

>To get your own custom implementation of XalanDOM to work you need to
>provide a derivative of the DOMSupport class that understands your
>implementation. You can probably just use DOMSupportDefault, but you
>should create your own if you can optimize node-ordering for your
>implementation. For example, XalanSourceTree (the default
>implementation) uses indexing to support fast node ordering. I would
>highly recommend you do something like that, if you want to see
>reasonable performance.

I am interested in this point.
I can't realistically provide (global) node indexing.
(My DOM is a virtualization of a database and I can't
order everything without reading the whole
database, which is never going to happen.) However, I plan to index
my nodes locally (that is, I can quickly find the index of a node with
respect to its parent) and therefore I can order nodes much faster than
the implementation I find in DOMServices.cpp (which has to do a linear
search of the children).

I assume that you mean that I should provide my own implementation of
"DOMSupport" and provide my own version of "isNodeAfter" which I can
make run a lot faster. This I understand - no problem.
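
A minimal sketch of the local-index ordering described above. The Node
struct and the function precedesInDocumentOrder are hypothetical and are
not Xalan classes; the real hook would be an override in a DOMSupport
derivative, whose exact signature and argument convention should be
checked against the Xalan headers. It assumes each node knows its parent
and its index within that parent.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type for illustration only; NOT a Xalan class.
struct Node {
    const Node* parent;          // null for the root
    std::size_t indexInParent;   // position among the parent's children
};

// Number of ancestors between n and the root.
static std::size_t depthOf(const Node* n) {
    std::size_t d = 0;
    for (; n->parent != nullptr; n = n->parent) ++d;
    return d;
}

// Returns true if a comes strictly before b in document order.
// Cost is proportional to tree depth, not to the number of siblings.
bool precedesInDocumentOrder(const Node* a, const Node* b) {
    assert(a != nullptr && b != nullptr);
    if (a == b) return false;

    const std::size_t depthA = depthOf(a);
    const std::size_t depthB = depthOf(b);

    // Lift the deeper node until both are at the same depth.
    const Node* x = a;
    const Node* y = b;
    for (std::size_t d = depthA; d > depthB; --d) x = x->parent;
    for (std::size_t d = depthB; d > depthA; --d) y = y->parent;

    // One node is an ancestor of the other: the ancestor comes first.
    if (x == y) return depthA < depthB;

    // Climb in lockstep until the two nodes are siblings.
    while (x->parent != y->parent) {
        x = x->parent;
        y = y->parent;
    }

    // Siblings: the local index decides, with no linear search.
    return x->indexInParent < y->indexInParent;
}
```

The sketch never touches sibling lists, so it does not need a global
index and does not force any rows to be read from the database.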

What's not immediately clear to me is why this would make Xalan run
a lot faster. Does Xalan spend a lot of time checking the ordering
of nodes? Is this obvious? (Is there some reference I could read to
understand why this should be the case?)

Is there anything else that I should try and do
that would particularly help performance?

>Feel free to post again if this isn't clear, or you have more questions.

OK. One more question.
As I mentioned my DOM is a virtualization of
a database. (Not a relational database BTW). So I (plan to) have an
element for a table containing many elements
(potentially 10s of millions of elements) for the rows.
I plan to support user queries expressed as XPaths.
Now if an XPath like ".../tableName/rowName[13]" appears, then
what Xalan appears to do is walk all of the children
using getFirstChild and getNextSibling and then apply the predicate.
OK. I can understand this, as it has to apply a node test
and it doesn't know that all the children are "rowName" without
access to something like the PSVI. But even if I do something
like ".../tableName/node()[13]" it looks like it will always
enumerate all of the children in XPath::findChildren.
I guess I would prefer some way to get it to use the
DOM's XalanNodeList.item function to find child 13.
(I know that things are slightly more complicated because
the axis might affect the ordering, but this doesn't
seem to make it impossible.)
I assume that the reason that Xalan works this way
is that the Xalan DOM classes themselves don't support "item"
and therefore the XPath implementation can't expect
to use this function. However, the question remains:
is there any way to get Xalan to evaluate
XPaths whilst avoiding enumerating all of the children?
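
To make the difference concrete, here is a small sketch of the two
strategies for a positional predicate such as "node()[13]" on the child
axis. RowTable, selectByPosition and selectByWalking are hypothetical
names, not Xalan code, and the walking version only models the behaviour
described above rather than Xalan's actual implementation. The counter
stands in for the number of rows that would have to be read from the
database.

```cpp
#include <cstddef>

// Hypothetical stand-in for a table element whose children are rows
// fetched lazily from a database. NOT a Xalan class.
struct RowTable {
    std::size_t rowCount;
    mutable std::size_t rowsMaterialized;  // counts simulated reads

    explicit RowTable(std::size_t n) : rowCount(n), rowsMaterialized(0) {}

    // Stand-in for a database read of the row at a 0-based index.
    long item(std::size_t index) const {
        ++rowsMaterialized;
        return static_cast<long>(index) * 10;
    }
};

// Direct lookup: XPath positions are 1-based, item() is 0-based.
// Only valid when every child is known to pass the node test.
bool selectByPosition(const RowTable& t, std::size_t xpathPosition,
                      long& out) {
    if (xpathPosition == 0 || xpathPosition > t.rowCount) return false;
    out = t.item(xpathPosition - 1);
    return true;
}

// Model of the sibling walk: every child is visited, then the
// positional predicate is applied to each one.
bool selectByWalking(const RowTable& t, std::size_t xpathPosition,
                     long& out) {
    bool found = false;
    for (std::size_t i = 0; i < t.rowCount; ++i) {
        const long row = t.item(i);
        if (i + 1 == xpathPosition) {
            out = row;
            found = true;
        }
    }
    return found;
}
```

With a name test such as "rowName[13]" the direct lookup is only safe if
the DOM can guarantee that all children carry that name, which is the
PSVI-style knowledge mentioned above.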

(Oh, I know about the "id" function and I plan
to try and use that in the short term.
But that relies on the client generating efficient queries.)

And finally... It so happens that my underlying database
holds string data in UTF-8. At the moment this means that
whenever I implement a node it has to copy the data
from UTF-8 into XalanDOMStrings. Is there any chance that
in future XalanDOMString will become an interface?
That would allow me to avoid the copies.

Sorry this e-mail is so long.

And again, thanks for your help.

Regards

Peter Marron
