Re: DOM_ IDOM_ Integration

Lenny Hoffman Tue, 22 Jan 2002 08:22:10 -0800

Hi Tinny,

I created a zip containing the changed source. I sent it to the mailing list, but it bounced because it was over 100K (it is 281K). I looked at Bugzilla, but it says that it forwards everything to the mailing list, so I was not sure that was the way to go. Please advise: should I break it up into 3 separate zips and send them separately?

>> I am very interested in your changes, and I think it's a good idea to wrap the DOM with IDOM,

Just to be clear, I wrapped IDOM with DOM, without preventing direct IDOM use.

Regards,

Lenny

----- Original Message -----

From: Tinny Ng

To: [EMAIL PROTECTED]

Sent: Tuesday, January 22, 2002 6:29 AM

Subject: Re: DOM_ IDOM_ Integration

Lenny,
I am very interested in your changes, and I think it's a good idea to wrap the DOM with IDOM, and eventually fade out the old DOM implementation. Actually I am also thinking of sending the IDOM to the W3C as one of the non-W3C referenced DOM C++ bindings. I am working on a proposal ......
Would you please post your zip source to the mailing list or Bugzilla? We can review the changes in more detail, and see if we can adopt them into the Xerces-C++ code.
Thanks!
Tinny
Lenny Hoffman wrote:
Hi All,

I went ahead and performed the refactoring. The results are:

IDOMCount and DOMCount now take the same amount of time. Not a surprise, as the IDOM is now behind DOM, which only adds a lightweight smart pointer overhead. If there is a better performance test for comparing, please let me know.

The size of the release build of the VC6 DLL has been reduced to 1,364KB from 1,560KB. This is indicative of my removing a large amount of code, mostly due to removing the DOM implementations.

The DOM interface can still be used as always, except there were a few methods on DOM_Document that don't exist on IDOM_Document, so given the choice of implementing them in the IDOM or removing them, I simply removed them. It looks like IDOM is truer to the DOM specification, and this seemed like the right way to go, even though I wanted to create an entirely backward compatible version of the DOM interfaces. The methods could be added back and implementations provided on the IDOM side if that is the wrong decision.

The methods removed from DOM_Document were:

    static DOM_Document   createDocument();
    DOM_XMLDecl createXMLDecl(const DOMString& version,
                              const DOMString& encoding,
                              const DOMString& standalone);

From inspecting the mail list archive, I learned that the developers of IDOM chose not to implement the equivalent of DOM_XMLDecl because it was non-standard and those parameters were eventually going to be attributes of the document. Because of this I took the liberty of removing "XML_DECL_NODE = 13" from IDOM_Node's and DOM_Node's NodeType enumerations.

I made all of my changes to an off-line copy of the source. I can make it available to anyone interested in seeing it, and it would benefit me, and I believe many others, if the Xerces group would adopt the changes.

As a review, here are the features:

* Users of the DOM interfaces now enjoy the same performance as those of the IDOM interfaces.
* Users can choose to work with either DOM or IDOM interfaces, and the choice is now more one of preference, though implementations that use reference counting may require the use of the DOM interface.
* Different DOM implementations can now be plugged in easily. New implementations can provide classes derived from IDOM_* abstract base classes (overriding all methods) or existing ID*impl classes (overriding desired methods only). For those new classes to be used all that is needed is a new DOM_Implementation/DOM_Document pair.
* Common components like the DOMParser can be used with any DOM implementation.
* Coupling to any specific DOM implementation has been removed, improving maintainability as well as providing flexibility.

Here is a breakdown of the changes I made:

I removed all "*impl*" files from the DOM subsystem.

I added the following methods to IDOM_Node to enable IDOM implementations to keep track of in-use nodes:

    virtual void              addRef() {}
    virtual void              removeRef() {}

DOM_Node calls these methods on its IDOM_Node* fImpl member as appropriate in its constructors, assignment operators, and destructor. The default implementation of IDOM simply does nothing in these calls, but other implementations may keep reference counts and remove from memory nodes that are no longer in use. I also added the same addRef()/removeRef() methods and support to NamedNodeMap, NodeList, Range, NodeIterator, and TreeWalker.

Because implementations throw exceptions, I kept the IDOM versions of DOMException and RangeException and got rid of the DOM versions. For backwards compatibility I kept the DOM_Exception and DOM_RangeException headers, though; they simply include the corresponding IDOM versions and add typedefs to provide the DOM names for them.

The DOM nodes all cloned the strings they returned, using DOMString, while IDOM returns const pointers to XMLCh. The latter works for IDOM because it maintains returned strings in memory until the entire document is deleted, after which no one should be using strings obtained from it. Not all back ends will have this policy, though, especially those that expand and collapse document sections as needed. Thus I added

    virtual bool doCloneStrings() const {return false;}

to IDOM_Node, which implementations can override should they need returned strings cloned. I then rewrote DOMString to contain an XMLCh pointer that could be an alias or a copy, depending on how it is constructed. It is constructed, of course, based on the result of doCloneStrings. I changed the implementation of DOMString to no longer use a string pool, but instead be a light wrapper for IDOM nodes that don't need their XMLCh's cloned, and to simply copy those that do. I did this because the IDOM interface returning XMLCh instead of DOMString gave me the impression that the string pool represented either unnecessary overhead or something specific to the old DOM implementation.
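The alias-or-copy behaviour of the rewritten DOMString can be illustrated with a minimal sketch. This is not the code from the zip: SketchString, its member names, and the use of char16_t as a stand-in for XMLCh are all assumptions made for illustration. The `clone` flag plays the role of the value a node's doCloneStrings() would return, and the sketch assumes the source pointer is never null.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Stand-in for Xerces' XMLCh (assumption: the real typedef is platform-dependent).
typedef char16_t XMLCh;

// Length of a null-terminated XMLCh string.
static std::size_t xlen(const XMLCh* s) {
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Hypothetical light wrapper: either aliases a buffer owned by the document
// or owns a private copy of it.
class SketchString {
public:
    // 'clone' stands in for the result of the node's doCloneStrings().
    SketchString(const XMLCh* src, bool clone)
        : fData(const_cast<XMLCh*>(src)), fOwned(false) {
        assert(src != 0);              // the sketch does not model null strings
        if (clone) makeOwnCopy(0);
    }
    // Copying an alias yields another alias; copying an owner yields a new owner.
    SketchString(const SketchString& other)
        : fData(other.fData), fOwned(false) {
        if (other.fOwned) makeOwnCopy(0);
    }
    // Copy-and-swap assignment (the parameter is taken by value).
    SketchString& operator=(SketchString other) {
        std::swap(fData, other.fData);
        std::swap(fOwned, other.fOwned);
        return *this;
    }
    ~SketchString() { if (fOwned) delete[] fData; }

    const XMLCh* rawBuffer() const { return fData; }
    bool ownsBuffer() const { return fOwned; }
    std::size_t length() const { return xlen(fData); }

    // Mutator: always moves to a private buffer before writing, so an
    // aliased document buffer is never modified.
    void appendData(XMLCh ch) {
        makeOwnCopy(1);
        std::size_t n = xlen(fData);
        fData[n] = ch;
        fData[n + 1] = 0;
    }

private:
    // Replace fData with an owned copy that has room for 'extra' more characters.
    void makeOwnCopy(std::size_t extra) {
        std::size_t n = xlen(fData);
        XMLCh* copy = new XMLCh[n + extra + 1];
        std::memcpy(copy, fData, (n + 1) * sizeof(XMLCh));
        if (fOwned) delete[] fData;
        fData = copy;
        fOwned = true;
    }

    XMLCh* fData;   // aliased or owned, depending on fOwned
    bool   fOwned;
};
```

In this sketch a read-only string obtained from a back end that keeps its buffers alive costs one pointer copy, while a back end that may discard buffers pays for a copy up front.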
All DOMString mutators automatically create a copy (which the DOMString then owns) when called.

I added IDOM_NodeFilter as a public base of DOM_NodeFilter so that it can be registered with IDOM_NodeIterator and IDOM_TreeWalker. It implements IDOM_NodeFilter's acceptNode method by wrapping the IDOM_Node in a DOM_Node and passing it to DOM_NodeFilter's pure virtual acceptNode method for DOM_NodeFilter derived classes to override. This keeps the DOM_ users unaware of the underlying IDOM_ implementations.

I did nothing with NameNodeFilter.hpp; from what I could tell, it is incomplete code: there is no corresponding .cpp, and its base class "NodeFilterImpl" does not exist.

I made a couple of changes so that more than one DOMImplementation can exist at one time. Each back end will need its own IDOM_DOMImplementation derived class to create instances of its own IDOM_Document derived class. These IDOM_DOMImplementation derived classes may be implemented as singletons, like IDDOMImplementation, or not. I removed "IDOM_DOMImplementation *IDOM_DOMImplementation::getImplementation() const" and added an IDOM_DOMImplementation pointer as a member of IDDocumentImpl. This way IDDocumentImpl instances can return the implementation that created them without assuming there is only one implementation in the system.

The IDOMParser had a few areas where it was tightly coupled to various *Impl classes. Since those classes together are only one possible implementation of the IDOM_* classes, that coupling causes problems when using the parser with other implementations. I made the following changes to eliminate the coupling:

1. Changed the parser's fDocument member's type from IDDocumentImpl* to IDOM_Document*.
2. Added setErrorChecking and getErrorChecking virtual methods to IDOM_Document, which IDDocumentImpl overrides with its current methods.
3. Added an addToNodeIDMap pure virtual method to IDOM_Document that takes an IDOM_Attr pointer. Implementations can then store the map in whatever fashion they see fit. I moved the code that creates and adds to IDDocumentImpl's node map to IDDocumentImpl.
4. Added an overload of createDocumentType to IDOM_Document that takes a qualified name, public ID, and a system ID. This matches up with IDDocumentImpl's overload.
5. Added a setDocumentType pure virtual method to IDOM_Document which takes an IDOM_DocumentType pointer. This matches up with that of IDDocumentImpl's.
6. Changed the parser's fDocumentType member's type from IDDocumentTypeImpl* to IDOM_DocumentType*.
7. Added protected pure virtual members to IDOM_DocumentType needed by IDOMParser (made IDOMParser a friend):
    virtual void              setInternalSubset(const XMLCh *value) = 0;
    virtual bool              isIntSubsetReading() const = 0;
    virtual void              setIsIntSubsetReading(bool value) = 0;
8. Changed all *Impl includes to include the IDOM_* base class, and changed all casts to *Impl to casts to IDOM_*.
9. Added setReadOnly and isReadOnly pure virtual methods to IDOM_EntityReference.
10. Added setIgnorableWhiteSpace protected pure virtual method to IDOM_Text and added IDOMParser as a friend.
11. Added isIdAttr, setIsIdAttr and setSpecified as pure virtual methods to IDOM_Attr.
12. Added protected pure virtual members to IDOM_Entity needed by IDOMParser (made IDOMParser a friend):
    virtual void      setEntityRef(IDOM_EntityReference *) = 0;
    virtual void            setPublicId(const XMLCh *arg) = 0;
    virtual void            setSystemId(const XMLCh *arg) = 0;
    virtual void            setNotationName(const XMLCh *arg) = 0;
13. Copied setPublicId and setSystemId from IDNotationImpl to IDOM_Notation.

I added calls to addRef and removeRef for the current node and current parent just in case the document being built uses reference counting.

I added an additional constructor to IDOMParser and DOMParser that takes an IDOM_DOMImplementation* so that users can define the type of document the parser builds.

I gutted DOMParser and have it delegate to IDOMParser for almost everything. The only difference between DOMParser and IDOMParser is that DOMParser returns a DOM_Document instead of an IDOM_Document.

I switched \schema\XUtil.cpp from using AttrImpl and ElementImpl to using IDOM_Attr and IDOM_Element.

Regards,

Lenny
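The addRef()/removeRef() hooks described in the message can be sketched as follows. This is only an illustration of the idea under stated assumptions, not the submitted code: NodeImplBase stands in for IDOM_Node, NodeHandle for DOM_Node, and CountedNodeImpl is a hypothetical back end that counts references. A real counting back end would presumably release the node when its count reaches zero; the sketch only tracks the count.

```cpp
#include <cassert>

// Plays the role of IDOM_Node: the hooks are no-ops by default, so a back
// end that keeps nodes alive for the whole document pays almost nothing.
class NodeImplBase {
public:
    virtual ~NodeImplBase() {}
    virtual void addRef() {}
    virtual void removeRef() {}
};

// Hypothetical back end that tracks how many handles refer to a node.
class CountedNodeImpl : public NodeImplBase {
public:
    CountedNodeImpl() : fRefs(0) {}
    virtual void addRef() { ++fRefs; }
    virtual void removeRef() {
        assert(fRefs > 0);   // more removeRef() than addRef() calls is a bug
        --fRefs;
    }
    int refCount() const { return fRefs; }
private:
    int fRefs;
};

// Plays the role of DOM_Node: a light handle that notifies the
// implementation in its constructors, assignment operator, and destructor.
class NodeHandle {
public:
    NodeHandle() : fImpl(0) {}
    explicit NodeHandle(NodeImplBase* impl) : fImpl(impl) {
        if (fImpl) fImpl->addRef();
    }
    NodeHandle(const NodeHandle& other) : fImpl(other.fImpl) {
        if (fImpl) fImpl->addRef();
    }
    NodeHandle& operator=(const NodeHandle& other) {
        // Add the new reference before dropping the old one, so that
        // assigning a handle to itself never lets the count touch zero.
        if (other.fImpl) other.fImpl->addRef();
        if (fImpl) fImpl->removeRef();
        fImpl = other.fImpl;
        return *this;
    }
    ~NodeHandle() {
        if (fImpl) fImpl->removeRef();
    }
    bool isNull() const { return fImpl == 0; }
private:
    NodeImplBase* fImpl;
};
```

With the default no-op hooks the handle behaves like a plain pointer wrapper, which is consistent with the observation that DOMCount and IDOMCount now take the same amount of time.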
