On Fri, May 13, 2005 at 11:15:08PM +0200, Martijn Faassen wrote:
[snip]
Up to there I agree, but I don't understand "style's signature". If you meant the "style's dictionary", then yes.
Oops, sorry; I was tired when I wrote that and indeed meant 'dictionary', not 'signature'.
[snip]
The reason is that libxslt makes the guarantee that a compiled stylesheet is read-only when used to make a transformation. It also avoids problems of shared resources in multi-threaded XSLT engines.
Would the developers be open to me suggesting changes to the XSLT codebase to make this possible again? I suppose I should ask on the XSLT list, so let's move on to the real purpose of this mail.
Yes that should be doable. I'm not sure what would be the best API for this.
I'm not either, but I'll think about this. Would sharing a dictionary break the read-only guarantee though, and thus break multi-threading?
Exploring these issues made me conclude that it's time to at least look at the alternative to sharing a single global dictionary, redicting parts of trees. A redicting operation would take place whenever a node is moved into a new tree. All the strings in the subtree below this node will be traced to the originating document's dictionary, and the entries will be copied into the target document's dictionary. Additionally, all string references in the subtree will be made to point to the new document's dictionary.
Yes, it seems that at the DOM level this operation is called an import, based on some PHP/JavaScript examples I saw recently.
Yes, the W3C DOM indeed defines an importNode operation, and I guess I'm asking for the equivalent here. :)
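The redicting operation described above can be sketched as a small Python model. This is only an illustration under stated assumptions: `Entry`, `StringDict`, `Document`, `Node` and `redict` are hypothetical names invented here, not libxml2 structures or APIs, and only node names are interned (the real dictionary also holds other strings).

```python
class Entry:
    """One interned string; identity tells us which dictionary owns it."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value


class StringDict:
    """Toy stand-in for a per-document string dictionary."""

    def __init__(self):
        self._entries = {}

    def intern(self, value):
        # Return the canonical entry for this value, creating it if needed.
        entry = self._entries.get(value)
        if entry is None:
            entry = self._entries[value] = Entry(value)
        return entry

    def owns(self, entry):
        return self._entries.get(entry.value) is entry


class Document:
    def __init__(self):
        self.dict = StringDict()


class Node:
    def __init__(self, doc, name, children=()):
        self.doc = doc
        self.name = doc.dict.intern(name)
        self.children = list(children)


def redict(node, target):
    """Copy every string of the subtree into target's dictionary and
    re-point the nodes at the new entries (and at the new document)."""
    stack = [node]
    while stack:
        cur = stack.pop()
        cur.name = target.dict.intern(cur.name.value)
        cur.doc = target
        stack.extend(cur.children)


src, dst = Document(), Document()
leaf = Node(src, "item")
branch = Node(src, "list", [leaf])
redict(branch, dst)
```

After the call, no node in the moved subtree refers to an entry of the source dictionary, so in this model the source document could be freed without leaving dangling string references.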
I think that if we add this then we should try to match the existing semantics of those operations in PHP for example.
Does PHP implement this operation on top of libxml2? We might also want to consider the W3C importNode semantics, though I doubt it actually says much of use for us here...
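For reference, the W3C importNode behaviour mentioned here can be observed with Python's standard xml.dom.minidom, which implements it. This is only meant to show the DOM-level semantics (the import yields a copy owned by the target document and leaves the source node in place); it says nothing about how libxml2 or PHP do it internally.

```python
from xml.dom.minidom import parseString

src = parseString("<a><b x='1'>text</b></a>")
dst = parseString("<root/>")

b = src.documentElement.firstChild

# importNode(node, deep) returns a copy owned by dst; it is not yet
# attached anywhere, and the original node stays in src.
imported = dst.importNode(b, True)
dst.documentElement.appendChild(imported)
```

Note that the DOM operation copies rather than moves, so a "move into a new tree" as discussed in this thread would be an import followed by removing the original node.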
The things which need to be checked when preparing for such an import are:
- doc remapping
By this you mean telling all nodes about the new document node, right?
- dictionary remapping
- namespace references to the original document
As a document contains a list of all the namespace references, right? So if the original document were to be destroyed, namespace references to it from nodes now in new documents would point to freed memory.
- namespace remapping to the local document
What does this mean as compared to the previous, namespace references to the original document?
- entity references to the original document
I think those are the only pointers which are added to the pure tree-oriented parent/child/sibling ones.
Thanks for the list!
Looking at the import implementation of PHP5 might give us an idea of how to implement this. Note that there are incomplete APIs dealing either just with document pointers (xmlSetTreeDoc) or just namespaces (xmlNewReconciliedNs and xmlReconciliateNs).
Okay, I shall study the implementations of those. It would probably be more efficient to provide a function that did all the remapping in one operation as it traversed the tree, though.
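As a rough illustration of doing the remapping in one traversal, here is a toy Python model that fixes document pointers and namespace references in a single pass. All names (`Ns`, `Elem`, `import_subtree`) are hypothetical, the data layout is a simplification assumed for this sketch rather than libxml2's, and prefix clashes, entities and dictionary strings are deliberately left out.

```python
class Ns:
    def __init__(self, prefix, href):
        self.prefix, self.href = prefix, href


class Elem:
    def __init__(self, doc, name, ns=None, children=()):
        self.doc, self.name, self.ns = doc, name, ns
        self.ns_defs = []            # namespaces declared on this element
        self.children = list(children)


def import_subtree(root, target_doc):
    """Single pass: set the new document on every node and replace any
    namespace reference pointing outside the subtree by a copy on root."""
    local = set()    # ids of Ns objects declared inside the subtree
    copies = {}      # id(foreign Ns) -> replacement declared on root
    stack = [root]
    while stack:
        node = stack.pop()
        node.doc = target_doc
        # Ancestors are always popped before descendants, so their
        # declarations are already known when a descendant is visited.
        local.update(id(ns) for ns in node.ns_defs)
        if node.ns is not None and id(node.ns) not in local:
            new = copies.get(id(node.ns))
            if new is None:
                new = Ns(node.ns.prefix, node.ns.href)
                root.ns_defs.append(new)
                copies[id(node.ns)] = new
            node.ns = new
        stack.extend(node.children)
    return root


old_doc, new_doc = object(), object()
ns = Ns("x", "http://example.org/x")
outer = Elem(old_doc, "outer")
outer.ns_defs.append(ns)             # declaration lives outside the subtree
child = Elem(old_doc, "child", ns)
inner = Elem(old_doc, "inner", ns, [child])
import_subtree(inner, new_doc)
```

The memo table keeps the cost per node constant and ensures that nodes which shared a namespace before the import still share one afterwards.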
[snip]
In order to write a good redicting operation, I'd need a bit more information about which information in a tree exactly can end up in a dictionary. If someone would be able to give me a list of what ends up in the dictionary, that would be extremely helpful.
All markup names, all namespace strings (prefix and namespace names) and some text node content (so that all "formatting nodes" used to indent share as much as possible, or very short text nodes, for example "0" or "1").
Do short text nodes include attribute values?
Hope it helps, and thanks !
Thank you for the very helpful answer. I will also look at the PHP5 implementation.
Daniel
P.S.: I think I should be able to design a method to make importing strings from a given dictionary into Python strings considerably faster when repeatedly querying the same set of strings. The principle would be to add an API to the dictionary returning an index for the string (cost O(1)) and, at the Python binding level, to have an array keeping pointers to the strings already converted (Py_INCREF'ed of course).
That would be very nice to have! I played with this idea before myself, but didn't get anything working yet. I will think about this some more. Is there any userdata facility in dictionaries already?
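The idea in the P.S. can be modelled in pure Python as follows. This is a sketch under assumptions: `IndexedDict` and `StringCache` are hypothetical names, no such index API is claimed to exist in libxml2, and holding the converted string in a list plays the role that Py_INCREF would play at the C level.

```python
class IndexedDict:
    """Toy dictionary that hands out a stable integer index per string."""

    def __init__(self):
        self._index = {}     # raw bytes -> index
        self._strings = []   # index -> raw bytes

    def intern(self, raw):
        idx = self._index.get(raw)
        if idx is None:
            idx = len(self._strings)
            self._index[raw] = idx
            self._strings.append(raw)
        return idx

    def lookup(self, idx):
        return self._strings[idx]


class StringCache:
    """Binding-side array of already converted Python strings."""

    def __init__(self, dictionary):
        self._dict = dictionary
        self._converted = []   # slot i holds the str for index i, or None
        self.conversions = 0   # counts real conversions, for illustration

    def get(self, idx):
        if idx >= len(self._converted):
            missing = idx + 1 - len(self._converted)
            self._converted.extend([None] * missing)
        s = self._converted[idx]
        if s is None:
            # First request for this index: convert once and keep it.
            s = self._dict.lookup(idx).decode("utf-8")
            self._converted[idx] = s
            self.conversions += 1
        return s


d = IndexedDict()
i = d.intern(b"item")
cache = StringCache(d)
first = cache.get(i)
second = cache.get(i)   # served from the array, no second conversion
```

Repeated queries for the same dictionary string then cost one array lookup and return the very same Python object.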
Regards,
Martijn
