Howard,
I have a very similar set of requirements, so my experiences _may_ be of use. I have an application where I parse an XML file into a DOM document. I read parts of that document, add nodes, delete nodes, overwrite nodes and search for nodes using the various DOM functions. I also use XPath for finding node-sets and XSLT for transforming the same document. The results from XPath and XSLT transforms are then used to further modify the document.
In this particular case I have found that the best thing to do is work primarily with the Xerces DOM and then wrap the Xerces DOM in XalanDOM when necessary to do XPath or XSLT.
In the case of XPath, I map the resulting node-set back into a Xerces node set and then continue working in Xerces (because I need to use the nodes to determine where to further edit the tree).
In the case of XSLT, I transform directly into Xerces so that I can continue editing.
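To make the shape of that concrete, here is a rough sketch of the wrap / query / map-back / edit cycle. It deliberately uses small hypothetical stand-in types (MutableNode, ReadOnlyWrapper, selectByName, mapNode, addNoteToEach are all made up for illustration) rather than the real Xerces or Xalan classes, and the "query" is just a name match standing in for an XPath search. Treat it as a picture of the pattern, not as working Xalan code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a mutable DOM node (the role Xerces plays here).
struct MutableNode {
    std::string name;
    MutableNode* parent = nullptr;
    std::vector<std::unique_ptr<MutableNode>> children;

    explicit MutableNode(std::string n) : name(std::move(n)) {}

    // Append a new child element and return a pointer to it.
    MutableNode* appendChild(std::string childName) {
        children.push_back(std::make_unique<MutableNode>(std::move(childName)));
        children.back()->parent = this;
        return children.back().get();
    }
};

// Hypothetical read-only view of one node (the role of the wrapped DOM).
struct ViewNode {
    MutableNode* source;   // the mutable node this view was built from
    std::size_t docOrder;  // position in document order at wrap time
};

// Hypothetical wrapper: a snapshot of the tree, built once, used for queries.
class ReadOnlyWrapper {
public:
    explicit ReadOnlyWrapper(MutableNode& root) { index(root); }

    // Stand-in for an XPath search such as "//name"; results in document order.
    std::vector<const ViewNode*> selectByName(const std::string& name) const {
        std::vector<const ViewNode*> out;
        for (const ViewNode& v : views_) {
            if (v.source->name == name) out.push_back(&v);
        }
        return out;
    }

    // Map a query result back to the mutable node it wraps.
    MutableNode* mapNode(const ViewNode* v) const {
        assert(v != nullptr);
        return v->source;
    }

private:
    // Depth-first walk, so views_ ends up in document order.
    void index(MutableNode& n) {
        views_.push_back(ViewNode{&n, views_.size()});
        for (auto& c : n.children) index(*c);
    }

    std::vector<ViewNode> views_;
};

// Wrap, query, map every hit back, then edit only through the mutable tree.
std::size_t addNoteToEach(MutableNode& root, const std::string& target) {
    ReadOnlyWrapper wrapper(root);
    std::vector<MutableNode*> hits;
    for (const ViewNode* v : wrapper.selectByName(target)) {
        hits.push_back(wrapper.mapNode(v));
    }
    // The snapshot is stale once editing starts; it is not used past this point.
    for (MutableNode* n : hits) n->appendChild("note");
    return hits.size();
}
```

The point of collecting all the mapped nodes before touching the tree is that the wrapper is only a snapshot: once you start editing, you throw it away and build a fresh one the next time you need to search.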
As has been indicated in the lists, this is not the best for performance, but to be honest, with Xerces 2.1 and the new version of Xalan (coming up - you can get it from CVS now), I have found that it works well. I can do some _very_ computationally expensive XPath searches with a wrapped Xerces source and still get good performance.
I like Xalan a lot. I use the same library for Linux/Solaris/Netbsd/Freebsd/Windows all with good performance. But I have found that for the kind of thing I describe above (and I think it's what you are doing as well), the best approach is to work primarily with the Xerces DOM and call on Xalan as necessary.
On the subject of dox - the documentation challenge for Xalan is _huge_. There is an enormous amount of functionality hidden in Xalan, and making use of it can be fairly complex. At the same time, writing detailed doco for it is somewhat daunting. I'd agree that there is work that could be done here, but at the same time there are only limited resources available.
A bit rambling - but I hope of use.
Cheers, Berin
Howard Kapustein wrote:
OK, I think I'm finally starting to understand (partly) how Xalan works. And I'm getting very very scared.
Here's what I'm trying to do.
I have a large block of C++ code that performs various styles of XML DOM manipulation. In particular, these idioms:
Create a DOMDocument
  a1) Create empty DOM
  a2) Load DOM from file
  a3) Load DOM from string
Navigate the DOM
  b1) getChildNodes()
  b2) selectNodes(xpath)
  b3) selectSingleNode(xpath)
Modify the DOM
  c1) create...() -- Element, Attribute, DocumentFragment, etc.
  c2) append/insert/replace/...Child()
Transformation
  d1) String output = Transform(String input, String XSLfilename)
  d2) String output = Transform(Node input, String XSLfilename)
Xalan-C capable. Mostly. It's the 'Modify the DOM' part I'm worried about.
My large project works flawlessly on Windows using MSXML. Ported to Java, works flawlessly using Xalan-J. Ahhh...now we need C++ on Unix.
OK, Xerces is out because I need XPath + XSL support. No problem -- Xalan supports that plus basic DOM processing.
I'm now at the point where I've got everything compiled, linked and partially running. And I fault regularly. And we're still in the small, controlled path, and I'm having major problems.
There are times when I'm handed an XML string and effectively make a read-only DOM -- parse and access it, but never modify it.
*But* there are times -- the majority of times -- where I'm handed an XML string **to start** and I need to intermingle XPath + appendChild() / createElement() / ... logic.
From what I've seen in the Xalan-C archives, ***this is a problem***. Please tell me I'm missing something!
From what I've seen so far:
--"Use XalanDocumentBuilder" only works if I can build the DOM in document order. I can't.
--"Use Xerces bridge" has huge performance problems, and doesn't integrate very well with Xalan in my model (parse with Xerces, bridge to Xalan, change via createElement()/... + appendChild() at arbitrary node in the tree and re-bridge -- or serialize/deserialize, or the like)
--"Use new Xerces DOM" doesn't work with Xalan (and I'm not sure if that would solve my problems anyway)
***I care about DOM + XPath***
I'd *like* XSL to also perform, but that's not very common for me so I'm willing to pay a perf hit doing / prep'ing / moving results if I have to.
I'm honestly stumped.
And I don't mean to offend, but the Xalan-C documentation is just dreadful. I've been writing C++ for 12 years, and C before that; I know Java, Python, Fortran and many other languages, platforms and tools; I've written multithreaded systems and other non-trivial computing tasks. And every time I see someone on this list say "see the docs" or "see the source, it's all there" I want to reach for a large metallic object. The docs are woefully incomplete -- raw footprints, and occasionally a few blurbs about individual methods, and lots of info about XSL, but if you're trying to use Xalan's DOM you have to piece together for yourself things like what to initialize, why LocalFileInputSource/ParserLiaison vs. XalanDocumentBuilder is necessary, and several other critical 'basics' -- the "API Reference" may exist (though that's debatable), but the "API Overview" is either non-existent or extremely well hidden.
Yes, I'm rather frustrated.
Don't give me that "Use the source, Luke". I have. Xalan is *large*. And many parts are thinly documented. Reverse engineering Xalan and understanding all the pieces just to do relatively basic XML work is not a good plan of attack. The samples often show parts, but 'documenting' the *what* at best, not the *why*.
This sounds worse than I meant. It's free. You get what you pay for. Fine. Take 2 things from this:
1) I'm stumped. I've read the docs. I've read the list archives. I've pored thru the source. I'm still stumped. And I'm getting a very bad feeling...
2) Xalan-C's documentation is a serious detriment to Xalan-C's adoption.
I doubt many people would persevere to this point -- most people would have thrown up their hands and gone back to MSXML, or switched to Java, or gotten stone drunk and looked for another job...
At this point, I'm stumped.
I'm deathly afraid Xalan-C can't do what I need.
If so, I'm stunned -- I can do this with MSXML and Xalan-J, so why not Xalan-C?
I do not need W3C Document.createElement syntax.
[It'd be nice, but my code's structured such that I can live w/o it if need be]
I *do* need the equivalent.
I need the ability to parse a file/string to a DOM, then traverse the tree using getChildNodes() + XPath, then add new nodes to the tree and remove nodes from the tree, then do more random traversal and repeat.
I really am impressed with Xalan-C overall. I'm just having the devil's time trying to do what should be very simple things.
If Xalan-C can't intermingle load-document-from-stream/file + XPath + create/append-nodes, do you know of an alternative (for C++)? Must run on HP-UX + Solaris + AIX, must support most of the DOM, must support XPath, and must support a way to go DOM<->string. Xerces-C + Pathan or roll-something-myself are the only alternatives I know of -- the former seems questionable and I'd really rather not do the latter...
Please post replies here, or email me directly.
Thank you,
- Howard
--
Email: mailto:[EMAIL PROTECTED] WWW: http://www.kapustein.com/howard/
--
The only "intuitive" interface is the nipple. After that, it's all learned. --Bruce Ediger, in comp.os.linux.misc, on X interfaces
