Lenny,
I am very interested in your changes, and I think it's a good idea to
wrap the DOM with IDOM, and eventually fade out the old DOM implementation.
Actually, I am also thinking of submitting IDOM to the W3C as one of the
non-W3C referenced DOM C++ bindings. I am working on a proposal ......
Would you please post your zip source to the mailing list or Bugzilla?
We can then review the changes in more detail and see whether we can adopt
them into the Xerces-C++ code.
Thanks!
Tinny
Lenny Hoffman wrote:
Hi All,

I went ahead and performed the refactoring. The results are:

IDOMCount and DOMCount now take the same amount of time. Not a surprise, as
the IDOM is now behind DOM, which only adds a lightweight smart pointer
overhead. If there is a better performance test for comparing, please
let me know.

The size of the release build of the VC6 DLL has been reduced to 1,364KB from
1,560KB. This is indicative of my removing a large amount of code, mostly due
to removing the DOM implementations.

The DOM interface can still be used as always, except there were a few methods
on DOM_Document that don't exist on IDOM_Document, so given the choice
of implementing them in the IDOM or removing them, I simply removed them.
It looks like IDOM is truer to the DOM specification, and this seemed like
the right way to go, even though I wanted to create an entirely backward
compatible version of the DOM interfaces. The methods could be added
back and implementations provided on the IDOM side if that is the wrong
decision. The methods removed from DOM_Document were:

    static DOM_Document createDocument();
    DOM_XMLDecl createXMLDecl(const DOMString& version,
                              const DOMString& encoding,
                              const DOMString& standalone);

From inspecting the mailing list archive, I learned that the developers of IDOM
chose not to implement the equivalent of DOM_XMLDecl because it was non-standard
and those parameters were eventually going to be attributes of the document.
Because of this I took the liberty of removing "XML_DECL_NODE = 13"
from IDOM_Node's and DOM_Node's NodeType enumerations.

I made all of my changes to an off-line copy of the source. I can make
it available to anyone interested in seeing it, and it would benefit me,
and I believe many others, if the Xerces group would adopt the changes.

As a review, here are the features.

* Users of the DOM interfaces now enjoy the same performance as those of
  the IDOM interfaces.
* Users can choose to work with either DOM or IDOM interfaces, and the choice
  is now more one of preference. Though, those implementations that utilize
  reference counting may require the use of the DOM interface.
* Different DOM implementations can now be plugged in easily. New
  implementations can provide classes derived from IDOM_* abstract base
  classes (overriding all methods) or existing ID*impl classes (overriding
  desired methods only). For those new classes to be used all that is needed
  is a new DOM_Implementation/DOM_Document pair.
* Common components like the DOMParser can be used with any DOM implementation.
* Coupling has been removed from any specific DOM implementation, improving
  maintainability as well as providing flexibility.

Here
is a breakdown of the changes I made:

I removed all "*impl*" files from the DOM subsystem.

I added the following methods to IDOM_Node to enable IDOM implementations
to keep track of in-use nodes:

    virtual void addRef() {}
    virtual void removeRef() {}

DOM_Node calls these methods on its IDOM_Node* fImpl member as appropriate in
its constructors, assignment operators, and destructor. The default
implementation of IDOM simply does nothing in these calls, but other
implementations may keep reference counts and remove from memory nodes that
are no longer in use. I also added the same addRef()/removeRef() methods and
support to NamedNodeMap, NodeList, Range, NodeIterator, and TreeWalker.

Because
implementations throw exceptions, I kept the IDOM versions of DOMException
and RangeException and got rid of the DOM versions. For backwards
compatibility I kept the DOM_Exception and DOM_RangeException headers,
though; they simply include the corresponding IDOM versions and add typedefs
to provide the DOM names for them.

The DOM nodes all cloned the strings they returned using DOMString, while IDOM
returns const pointers to XMLCh. The latter works for IDOM because
it maintains returned strings in memory until the entire document is deleted,
after which no one should be using strings obtained from it. Not
all back ends will have this policy, though, especially those that expand
and collapse document sections as needed. Thus I added

    virtual bool doCloneStrings() const {return false;}

to IDOM_Node, which implementations can override should they need returned
strings cloned. I then rewrote DOMString to contain an XMLCh pointer that
could be an alias or a copy, depending on how it is constructed.
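
A minimal sketch of the alias-or-copy idea, for illustration only. SketchString,
SketchNode, PinnedNode, PagedNode, nodeName() and demoStrings() are hypothetical
stand-ins, not the actual DOMString or IDOM_Node code, and XMLCh is assumed here
to be a 16-bit character type:

```cpp
#include <cassert>
#include <cstddef>

typedef char16_t XMLCh;  // assumption: a 16-bit stand-in for the real XMLCh

// Length of a null-terminated XMLCh string (a null pointer counts as empty).
static std::size_t xmlLen(const XMLCh* s) {
    std::size_t n = 0;
    while (s && s[n]) ++n;
    return n;
}

// Heap-allocated concatenation of a and b; with b == nullptr it is a plain copy.
static XMLCh* xmlConcat(const XMLCh* a, const XMLCh* b) {
    const std::size_t la = xmlLen(a), lb = xmlLen(b);
    XMLCh* out = new XMLCh[la + lb + 1];
    for (std::size_t i = 0; i < la; ++i) out[i] = a[i];
    for (std::size_t i = 0; i < lb; ++i) out[la + i] = b[i];
    out[la + lb] = 0;
    return out;
}

// Hypothetical stand-in for the rewritten DOMString.
class SketchString {
public:
    // clone == false: alias the caller's buffer; clone == true: private copy.
    SketchString(const XMLCh* s, bool clone)
        : fData(clone ? xmlConcat(s, nullptr) : s), fOwned(clone) {}
    SketchString(const SketchString& o)
        : fData(o.fOwned ? xmlConcat(o.fData, nullptr) : o.fData), fOwned(o.fOwned) {}
    SketchString& operator=(const SketchString&) = delete;  // omitted for brevity
    ~SketchString() { release(); }

    // Mutator: builds a new private buffer, so an alias becomes an owned copy.
    void appendData(const XMLCh* tail) {
        const XMLCh* fresh = xmlConcat(fData, tail);
        release();
        fData = fresh;
        fOwned = true;
    }

    const XMLCh* rawBuffer() const { return fData; }
    bool ownsBuffer() const { return fOwned; }
    std::size_t length() const { return xmlLen(fData); }

private:
    void release() {
        if (fOwned) delete[] fData;
        fData = nullptr;
        fOwned = false;
    }
    const XMLCh* fData;
    bool fOwned;
};

// Hypothetical stand-in for IDOM_Node, reduced to what the sketch needs.
class SketchNode {
public:
    virtual ~SketchNode() {}
    virtual const XMLCh* getNodeName() const = 0;
    virtual bool doCloneStrings() const { return false; }
};

// Back end that keeps strings alive as long as the document: aliasing is safe.
class PinnedNode : public SketchNode {
public:
    const XMLCh* getNodeName() const { static const XMLCh name[] = u"elem"; return name; }
};

// Back end that may discard parts of the document: callers need copies.
class PagedNode : public SketchNode {
public:
    const XMLCh* getNodeName() const { static const XMLCh name[] = u"elem"; return name; }
    bool doCloneStrings() const { return true; }
};

// What a wrapper-side accessor could do with the node's answer.
SketchString nodeName(const SketchNode& node) {
    return SketchString(node.getNodeName(), node.doCloneStrings());
}

void demoStrings() {
    PinnedNode pinned;
    PagedNode paged;
    assert(!nodeName(pinned).ownsBuffer());  // alias of the node's buffer
    assert(nodeName(paged).ownsBuffer());    // private copy
}
```

In this sketch the decision is made once, at construction, and the destructor
only frees buffers the string owns.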
The DOMString is constructed, of course, based on the result of doCloneStrings.

I changed the implementation of DOMString to no longer use a string pool,
but instead to be a light wrapper for IDOM nodes that don't need their XMLCh's
cloned, and to simply copy those that do. I did this because the
IDOM interface returning XMLCh instead of DOMString gave me the impression
that DOMString's current implementation represented either unnecessary
overhead or something specific to the old DOM implementation. All DOMString
mutators automatically create a copy (which the DOMString then owns) when
called.

I added IDOM_NodeFilter as a public base of DOM_NodeFilter so that it can be
registered with IDOM_NodeIterator and IDOM_TreeWalker. DOM_NodeFilter
implements IDOM_NodeFilter's acceptNode method
by wrapping the IDOM_Node in a DOM_Node and passing it to DOM_NodeFilter's
pure virtual acceptNode method for DOM_NodeFilter derived classes to override.
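
The adapter can be sketched roughly as follows. INode, NodeHandle, INodeFilter,
HandleFilter, ElementsOnly, CountingNode and demoFilter() are hypothetical
stand-ins for IDOM_Node, DOM_Node, IDOM_NodeFilter, DOM_NodeFilter and test
scaffolding. The signatures are simplified assumptions rather than the real
Xerces declarations, and the handle also shows the addRef()/removeRef() calls
described earlier:

```cpp
#include <cassert>

// Stand-in for IDOM_Node, with the no-op reference hooks.
class INode {
public:
    virtual ~INode() {}
    virtual int getNodeType() const = 0;
    virtual void addRef() {}
    virtual void removeRef() {}
};

// Stand-in for DOM_Node: a value-type handle that reports its lifetime.
class NodeHandle {
public:
    explicit NodeHandle(INode* impl) : fImpl(impl) { if (fImpl) fImpl->addRef(); }
    NodeHandle(const NodeHandle& o) : fImpl(o.fImpl) { if (fImpl) fImpl->addRef(); }
    NodeHandle& operator=(const NodeHandle& o) {
        if (o.fImpl) o.fImpl->addRef();  // add first so self-assignment is safe
        if (fImpl) fImpl->removeRef();
        fImpl = o.fImpl;
        return *this;
    }
    ~NodeHandle() { if (fImpl) fImpl->removeRef(); }
    int getNodeType() const { return fImpl->getNodeType(); }
private:
    INode* fImpl;
};

// Stand-in for IDOM_NodeFilter: what iterators and tree walkers call.
class INodeFilter {
public:
    enum { FILTER_ACCEPT = 1, FILTER_REJECT = 2, FILTER_SKIP = 3 };
    virtual ~INodeFilter() {}
    virtual short acceptNode(INode* node) const = 0;
};

// Stand-in for DOM_NodeFilter: users override the handle-based method only.
class HandleFilter : public INodeFilter {
public:
    virtual short acceptNode(const NodeHandle& node) const = 0;
    // Adapter: wrap the raw node and forward to the handle-based overload.
    virtual short acceptNode(INode* node) const { return acceptNode(NodeHandle(node)); }
};

// A user-written filter that never sees the pointer-based interface.
class ElementsOnly : public HandleFilter {
public:
    using HandleFilter::acceptNode;
    virtual short acceptNode(const NodeHandle& node) const {
        return node.getNodeType() == 1 ? FILTER_ACCEPT : FILTER_SKIP;
    }
};

// Reference-counting node used only to observe the handle's calls.
class CountingNode : public INode {
public:
    explicit CountingNode(int type) : refs(0), fType(type) {}
    int getNodeType() const { return fType; }
    void addRef() { ++refs; }
    void removeRef() { --refs; }
    int refs;
private:
    int fType;
};

void demoFilter() {
    CountingNode element(1), text(3);  // 1 = ELEMENT_NODE, 3 = TEXT_NODE
    ElementsOnly filter;
    const INodeFilter& seenByIterator = filter;
    assert(seenByIterator.acceptNode(&element) == INodeFilter::FILTER_ACCEPT);
    assert(seenByIterator.acceptNode(&text) == INodeFilter::FILTER_SKIP);
    assert(element.refs == 0 && text.refs == 0);  // temporary handles balanced
}
```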
This arrangement keeps DOM_ users unaware of the underlying IDOM_ implementations.

I did nothing with NameNodeFilter.hpp; from what I could tell, it is incomplete
code: there is no corresponding .cpp, and its base class "NodeFilterImpl"
does not exist.

I made a couple of changes so that more than one DOMImplementation can exist at
one time. Each back end will need its own IDOM_DOMImplementation derived
class to create instances of its own IDOM_Document derived class.
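
As a rough illustration of two back ends coexisting, here is a sketch in which
every name (SketchDocument, SketchImplementation, the Memory* and Store*
classes, backEnd(), demoImplementations()) is hypothetical and only mirrors the
roles of IDOM_Document and IDOM_DOMImplementation:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for IDOM_Document.
class SketchDocument {
public:
    virtual ~SketchDocument() {}
    virtual std::string backEnd() const = 0;
};

// Hypothetical stand-in for IDOM_DOMImplementation.
class SketchImplementation {
public:
    virtual ~SketchImplementation() {}
    virtual SketchDocument* createDocument() = 0;  // caller owns the result
};

// Back end A: in-memory documents, implementation exposed as a singleton.
class MemoryDocument : public SketchDocument {
public:
    std::string backEnd() const { return "memory"; }
};

class MemoryImplementation : public SketchImplementation {
public:
    static MemoryImplementation* getImplementation() {
        static MemoryImplementation instance;
        return &instance;
    }
    SketchDocument* createDocument() { return new MemoryDocument; }
private:
    MemoryImplementation() {}
};

// Back end B: documents tied to a named store, one implementation per store.
class StoreDocument : public SketchDocument {
public:
    explicit StoreDocument(const std::string& store) : fStore(store) {}
    std::string backEnd() const { return "store:" + fStore; }
private:
    std::string fStore;
};

class StoreImplementation : public SketchImplementation {
public:
    explicit StoreImplementation(const std::string& store) : fStore(store) {}
    SketchDocument* createDocument() { return new StoreDocument(fStore); }
private:
    std::string fStore;
};

void demoImplementations() {
    SketchImplementation* memory = MemoryImplementation::getImplementation();
    StoreImplementation archive("archive");
    SketchDocument* a = memory->createDocument();
    SketchDocument* b = archive.createDocument();
    assert(a->backEnd() == "memory");
    assert(b->backEnd() == "store:archive");
    delete a;
    delete b;
}
```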
These IDOM_DOMImplementation derived classes may be implemented as singletons,
like IDDOMImplementation, or not. I removed "IDOM_DOMImplementation
*IDOM_DOMImplementation::getImplementation() const" and added an
IDOM_DOMImplementation pointer as a member of IDDocumentImpl. This way
IDDocumentImpl instances can return the implementation that created them
without assuming there is only one implementation in the system.

The IDOMParser had a few areas where it was tightly coupled to various *Impl
classes. Since those classes are only one possible implementation of the IDOM_*
classes, this caused problems when using the parser with other implementations.
I made the following changes to eliminate the coupling:

1. Changed the parser's fDocument member's type from IDDocumentImpl* to IDOM_Document*.
2. Added setErrorChecking and
getErrorChecking virtual methods to IDOM_Document, which IDDocumentImpl
overrides with its current methods.
3. Added an addToNodeIDMap pure
virtual method to IDOM_Document that takes an IDOM_Attr pointer.
Implementations can then store the map in whatever fashion they see fit.
I moved the code that creates and adds to IDDocumentImpl's node map to
IDDocumentImpl.
4. Added an overload of createDocumentType
to IDOM_Document that takes a qualified name, public ID, and a system ID.
This matches up with IDDocumentImpl's overload.
5. Added a setDocumentType pure
virtual method to IDOM_Document which takes an IDOM_DocumentType pointer.
This matches IDDocumentImpl's existing method.
6. Changed the parser's fDocumentType
member's type from IDDocumentTypeImpl* to IDOM_DocumentType*.
7. Added protected pure virtual
members to IDOM_DocumentType needed by IDOMParser (made IDOMParser a friend):
    virtual void setInternalSubset(const XMLCh *value) = 0;
    virtual bool isIntSubsetReading() const = 0;
    virtual void setIsIntSubsetReading(bool value) = 0;
8. Changed all *Impl includes
to include the IDOM_* base class, and changed all casts to *Impl to casts
to IDOM_*.
9. Added setReadOnly and isReadOnly
pure virtual methods to IDOM_EntityReference.
10. Added setIgnorableWhiteSpace protected
pure virtual method to IDOM_Text and added IDOMParser as a friend.
11. Added isIdAttr, setIsIdAttr
and setSpecified as pure virtual methods to IDOM_Attr.
12. Added protected pure virtual
members to IDOM_Entity needed by IDOMParser (made IDOMParser a friend):
    virtual void setEntityRef(IDOM_EntityReference *) = 0;
    virtual void setPublicId(const XMLCh *arg) = 0;
    virtual void setSystemId(const XMLCh *arg) = 0;
    virtual void setNotationName(const XMLCh *arg) = 0;
13. Copied setPublicId and setSystemId
from IDNotationImpl to IDOM_Notation.

I added calls to addRef and removeRef for the current node and current parent
just in case the document being built uses reference counting.

I added an additional constructor to IDOMParser and DOMParser that takes
an IDOM_DOMImplementation* so that users can define the type of document
the parser builds.

I gutted DOMParser and have it delegate to IDOMParser for almost everything.
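
A minimal sketch of that delegation, together with the implementation-taking
constructor. PointerParser, HandleParser and DocumentHandle are hypothetical
stand-ins for IDOMParser, DOMParser and DOM_Document. The Demo* classes and
demoParsers() exist only to exercise the sketch, and the ownership rule (the
inner parser owns the document) is an assumption made for the example:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for IDOM_Document and IDOM_DOMImplementation.
class SketchDocument {
public:
    virtual ~SketchDocument() {}
    virtual std::string backEnd() const = 0;
};

class SketchImplementation {
public:
    virtual ~SketchImplementation() {}
    virtual SketchDocument* createDocument() = 0;
};

// Stand-in for IDOMParser: builds whatever document type the supplied
// implementation creates, and (by assumption) owns the result.
class PointerParser {
public:
    explicit PointerParser(SketchImplementation* impl) : fImpl(impl), fDocument(nullptr) {}
    ~PointerParser() { delete fDocument; }
    PointerParser(const PointerParser&) = delete;
    PointerParser& operator=(const PointerParser&) = delete;

    void parse(const std::string& /*source*/) {
        delete fDocument;
        fDocument = fImpl->createDocument();  // a real parser would now populate it
    }
    SketchDocument* getDocument() const { return fDocument; }
private:
    SketchImplementation* fImpl;
    SketchDocument* fDocument;
};

// Stand-in for DOM_Document: a thin, non-owning handle.
class DocumentHandle {
public:
    explicit DocumentHandle(SketchDocument* impl) : fImpl(impl) {}
    bool isNull() const { return fImpl == nullptr; }
    std::string backEnd() const { return fImpl->backEnd(); }
private:
    SketchDocument* fImpl;
};

// Stand-in for DOMParser: forwards everything to the inner parser.
class HandleParser {
public:
    explicit HandleParser(SketchImplementation* impl) : fInner(impl) {}
    void parse(const std::string& source) { fInner.parse(source); }
    DocumentHandle getDocument() const { return DocumentHandle(fInner.getDocument()); }
private:
    PointerParser fInner;
};

// Hypothetical back end used only to exercise the sketch.
class DemoDocument : public SketchDocument {
public:
    std::string backEnd() const { return "demo"; }
};

class DemoImplementation : public SketchImplementation {
public:
    SketchDocument* createDocument() { return new DemoDocument; }
};

void demoParsers() {
    DemoImplementation impl;
    HandleParser parser(&impl);
    assert(parser.getDocument().isNull());
    parser.parse("<doc/>");
    assert(parser.getDocument().backEnd() == "demo");
}
```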
The only difference between DOMParser and IDOMParser is that DOMParser returns
a DOM_Document instead of an IDOM_Document.

I switched \schema\XUtil.cpp from using AttrImpl and ElementImpl to using
IDOM_Attr and IDOM_Element.

Regards,
Lenny