Hi Tinny,
I created a zip containing the changed source and sent it to the mailing
list, but it bounced because it was over 100K (it is 281K). I looked at
Bugzilla, but it says that it forwards everything to the mailing list, so I
was not sure that was the way to go. Please advise: should I break it up
into three separate zips and send them separately?
>> I am very interested in your changes, and I think
it's a good idea to wrap the DOM with IDOM,
Just to be clear, I wrapped IDOM with DOM, without preventing direct IDOM
use.
Regards,
Lenny
----- Original Message -----
Sent: Tuesday, January 22, 2002 6:29 AM
Subject: Re: DOM_ IDOM_ Integration
Lenny,
I am very interested in your changes, and I think it's a good idea to wrap
the DOM with IDOM, and eventually phase out the old DOM implementation.
Actually, I am also thinking of sending the IDOM to W3C as one of the
non-W3C referenced DOM C++ bindings. I am working on a proposal ......
Would you please post your zip source to the mailing list or Bugzilla? We
can review the changes in more detail and see if we can adopt them into the
Xerces-C++ code.
Thanks!
Tinny
Lenny Hoffman wrote:
Hi All,

I went ahead and performed the refactoring.
The results are:

IDOMCount and DOMCount now take the same amount of time. This is not a
surprise, as the IDOM is now behind DOM, which only adds a lightweight
smart pointer overhead. If there is a better performance test for comparing
the two, please let me know.

The size of the release build of the VC6 DLL has been reduced from 1,560KB
to 1,364KB. This reflects my removing a large amount of code, mostly the
DOM implementations.

The DOM interface can still be used as always, except that a few methods on
DOM_Document don't exist on IDOM_Document. Given the choice of implementing
them in the IDOM or removing them, I simply removed them. It looks like
IDOM is truer to the DOM specification, so this seemed like the right way
to go, even though I wanted to create an entirely backward compatible
version of the DOM interfaces. The methods could be added back, with
implementations provided on the IDOM side, if that is the wrong decision.
The methods removed from DOM_Document were:

    static DOM_Document createDocument();

    DOM_XMLDecl createXMLDecl(const DOMString& version,
                              const DOMString& encoding,
                              const DOMString& standalone);

From inspecting the mailing list archive, I learned that the developers of
IDOM chose not to implement the equivalent of DOM_XMLDecl because it was
non-standard and those parameters were eventually going to be attributes of
the document. Because of this, I took the liberty of removing
"XML_DECL_NODE = 13" from IDOM_Node's and DOM_Node's NodeType enumerations.

I made all of my changes to an off-line copy of the source. I can make it
available to anyone interested in seeing it, and it would benefit me, and I
believe many others, if the Xerces group would adopt the changes.

As a review, here are the features:

* Users of the DOM interfaces now enjoy the same performance as those of
  the IDOM interfaces.
* Users can choose to work with either DOM or IDOM interfaces, and the
  choice is now more one of preference, though implementations that
  utilize reference counting may require the use of the DOM interface.
* Different DOM implementations can now be plugged in easily. New
  implementations can provide classes derived from the IDOM_* abstract
  base classes (overriding all methods) or from the existing ID*Impl
  classes (overriding desired methods only). For those new classes to be
  used, all that is needed is a new DOM_Implementation/DOM_Document pair.
* Common components like the DOMParser can be used with any DOM
  implementation.
* Coupling to any specific DOM implementation has been removed, improving
  maintainability as well as providing flexibility.

Here is a breakdown of the changes I made:

I removed all "*impl*" files from the DOM subsystem.

I added the following methods to IDOM_Node to enable IDOM implementations
to keep track of in-use nodes:

    virtual void addRef() {}
    virtual void removeRef() {}

DOM_Node calls these methods on its IDOM_Node* fImpl member as appropriate
in its constructors, assignment operators, and destructor. The default
implementation of IDOM simply does nothing in these calls, but other
implementations may keep reference counts and remove from memory nodes
that are no longer in use.

I also added the same addRef()/removeRef() methods and support to
NamedNodeMap, NodeList, Range, NodeIterator, and TreeWalker.

Because implementations throw exceptions, I kept the IDOM versions of
DOMException and RangeException and got rid of the DOM versions. For
backwards compatibility, though, I kept the DOM_Exception and
DOM_RangeException headers, which simply include the corresponding IDOM
versions and add typedefs to give them their DOM names.

The DOM nodes all cloned the strings they returned, using DOMString, while
IDOM returns const pointers to XMLCh. The latter works for IDOM because it
keeps returned strings in memory until the entire document is deleted,
after which no one should be using strings obtained from it.
Not all back ends will have this policy, though, especially those that
expand and collapse document sections as needed. Thus I added

    virtual bool doCloneStrings() const {return false;}

to IDOM_Node, which implementations can override should they need returned
strings cloned. I then rewrote DOMString to contain an XMLCh pointer that
can be an alias or a copy, depending on how it is constructed. It is
constructed, of course, based on the result of doCloneStrings.

I changed the implementation of DOMString to no longer use a string pool;
instead it is a light wrapper for strings from IDOM nodes that don't need
their XMLCh's cloned, and it simply copies those that do. I did this
because the IDOM interface returning XMLCh instead of DOMString gave me the
impression that DOMString's current implementation represented either
unnecessary overhead or something specific to the old DOM implementation.
All DOMString mutators automatically create a copy (which the DOMString
then owns) when called.

I added IDOM_NodeFilter as a public base of DOM_NodeFilter so that it can
be registered with IDOM_NodeIterator and IDOM_TreeWalker. DOM_NodeFilter
implements IDOM_NodeFilter's acceptNode method by wrapping the IDOM_Node in
a DOM_Node and passing it to DOM_NodeFilter's pure virtual acceptNode
method, which DOM_NodeFilter derived classes override. This keeps DOM_
users unaware of the underlying IDOM_ implementations.

I did nothing with NameNodeFilter.hpp; from what I could tell, it is
incomplete code: there is no corresponding .cpp, and its base class
"NodeFilterImpl" does not exist.

I made a couple of changes so that more than one DOMImplementation can
exist at one time. Each back end will need its own IDOM_DOMImplementation
derived class to create instances of its own IDOM_Document derived class.
These IDOM_DOMImplementation derived classes may be implemented as
singletons, like IDDOMImplementation, or not. I removed
"IDOM_DOMImplementation *IDOM_DOMImplementation::getImplementation() const"
and added an IDOM_DOMImplementation pointer as a member of IDDocumentImpl.
This way IDDocumentImpl instances can return the implementation that
created them without assuming there is only one implementation in the
system.

The IDOMParser had a few areas where it was tightly coupled to various
*Impl classes. Since those classes together are only one possible
implementation of the IDOM_* classes, this coupling causes problems when
using the parser with other implementations. I made the following changes
to eliminate the coupling:

1. Changed the parser's fDocument member's type from IDDocumentImpl* to
   IDOM_Document*.
2. Added setErrorChecking and getErrorChecking virtual methods to
   IDOM_Document, which IDDocumentImpl overrides with its current methods.
3. Added an addToNodeIDMap pure virtual method to IDOM_Document that takes
   an IDOM_Attr pointer. Implementations can then store the map in
   whatever fashion they see fit. I moved the code that creates and adds
   to IDDocumentImpl's node map into IDDocumentImpl.
4. Added an overload of createDocumentType to IDOM_Document that takes a
   qualified name, a public ID, and a system ID. This matches up with
   IDDocumentImpl's overload.
5. Added a setDocumentType pure virtual method to IDOM_Document that takes
   an IDOM_DocumentType pointer. This matches up with IDDocumentImpl's
   method.
6. Changed the parser's fDocumentType member's type from
   IDDocumentTypeImpl* to IDOM_DocumentType*.
7. Added protected pure virtual members to IDOM_DocumentType needed by
   IDOMParser (made IDOMParser a friend):

       virtual void setInternalSubset(const XMLCh *value) = 0;
       virtual bool isIntSubsetReading() const = 0;
       virtual void setIsIntSubsetReading(bool value) = 0;

8. Changed all *Impl includes to include the IDOM_* base class, and
   changed all casts to *Impl to casts to IDOM_*.
9. Added setReadOnly and isReadOnly pure virtual methods to
   IDOM_EntityReference.
10. Added a setIgnorableWhiteSpace protected pure virtual method to
    IDOM_Text and added IDOMParser as a friend.
11. Added isIdAttr, setIsIdAttr, and setSpecified as pure virtual methods
    to IDOM_Attr.
12. Added protected pure virtual members to IDOM_Entity needed by
    IDOMParser (made IDOMParser a friend):

        virtual void setEntityRef(IDOM_EntityReference *) = 0;
        virtual void setPublicId(const XMLCh *arg) = 0;
        virtual void setSystemId(const XMLCh *arg) = 0;
        virtual void setNotationName(const XMLCh *arg) = 0;

13. Copied setPublicId and setSystemId from IDNotationImpl to
    IDOM_Notation.

I added calls to addRef and removeRef for the current node and current
parent, just in case the document being built uses reference counting.

I added an additional constructor to IDOMParser and DOMParser that takes an
IDOM_DOMImplementation* so that users can define the type of document the
parser builds.

I gutted DOMParser and have it delegate to IDOMParser for almost
everything. The only difference between DOMParser and IDOMParser is that
DOMParser returns a DOM_Document instead of an IDOM_Document.

I switched \schema\XUtil.cpp from using AttrImpl and ElementImpl to using
IDOM_Attr and IDOM_Element.

Regards,
Lenny
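
For readers who want to picture the addRef()/removeRef() hooks described in
the message above, here is a minimal sketch. It is not the code from the
zip. Node, CountedNode, and NodeHandle are hypothetical stand-ins for
IDOM_Node, a reference-counting back end, and DOM_Node, and the two helper
functions exist only to show the counts.

```cpp
#include <cassert>

// Hypothetical stand-in for IDOM_Node: both hooks do nothing by default.
class Node {
public:
    virtual ~Node() {}
    virtual void addRef() {}
    virtual void removeRef() {}
};

// Hypothetical back end that really counts the handles pointing at it.
class CountedNode : public Node {
public:
    CountedNode() : fRefCount(0) {}
    virtual void addRef() { ++fRefCount; }
    virtual void removeRef() {
        assert(fRefCount > 0);
        --fRefCount;  // a real back end might release the node at zero
    }
    int refCount() const { return fRefCount; }
private:
    int fRefCount;
};

// Hypothetical stand-in for DOM_Node: a thin handle around a Node*.
class NodeHandle {
public:
    explicit NodeHandle(Node* impl = 0) : fImpl(impl) {
        if (fImpl) fImpl->addRef();
    }
    NodeHandle(const NodeHandle& other) : fImpl(other.fImpl) {
        if (fImpl) fImpl->addRef();
    }
    NodeHandle& operator=(const NodeHandle& other) {
        // Add the new reference first so self-assignment stays safe.
        if (other.fImpl) other.fImpl->addRef();
        if (fImpl) fImpl->removeRef();
        fImpl = other.fImpl;
        return *this;
    }
    ~NodeHandle() {
        if (fImpl) fImpl->removeRef();
    }
private:
    Node* fImpl;
};

// Count seen by the back end while two handles refer to the same node.
int countWhileTwoHandlesAlive() {
    CountedNode node;
    NodeHandle a(&node);
    NodeHandle b(a);
    return node.refCount();
}

// Count seen by the back end once every handle has gone out of scope.
int countAfterHandlesDestroyed() {
    CountedNode node;
    {
        NodeHandle a(&node);
        NodeHandle b;
        b = a;
    }
    return node.refCount();
}
```

With the do-nothing defaults in Node, the handle costs only the virtual
calls, which is consistent with the "lightweight smart pointer overhead"
mentioned in the message. A counting back end sees every copy and
destruction of a handle.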
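
The alias-or-copy DOMString driven by doCloneStrings(), as described in the
message above, might look roughly like the following. This is only a sketch
under stated assumptions: Str, Node, TransientNode, and nodeName are
hypothetical names, plain char stands in for XMLCh, and append() stands in
for "any mutator". The real DOMString interface is much larger.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

typedef char Ch;  // assumption: plain char stands in for XMLCh

// Hypothetical stand-in for the rewritten DOMString.
class Str {
public:
    // Alias the caller's buffer when clone is false, copy it when true.
    Str(const Ch* text, bool clone) : fText(text), fOwned(0) {
        if (clone) makeOwnCopy();
    }
    // A copy of an owning string owns its own buffer; an alias stays an alias.
    Str(const Str& other) : fText(other.fText), fOwned(0) {
        if (other.fOwned) makeOwnCopy();
    }
    ~Str() { delete[] fOwned; }

    // Example mutator: always ends up with a buffer this object owns.
    void append(const Ch* more) {
        std::size_t oldLen = std::strlen(fText);
        std::size_t addLen = std::strlen(more);
        Ch* grown = new Ch[oldLen + addLen + 1];
        std::memcpy(grown, fText, oldLen);
        std::memcpy(grown + oldLen, more, addLen + 1);  // includes the terminator
        delete[] fOwned;
        fOwned = grown;
        fText = grown;
    }

    const Ch* rawBuffer() const { return fText; }
    bool ownsBuffer() const { return fOwned != 0; }

private:
    Str& operator=(const Str&);  // assignment left out of this sketch

    void makeOwnCopy() {
        std::size_t len = std::strlen(fText);
        Ch* copy = new Ch[len + 1];
        std::memcpy(copy, fText, len + 1);
        fOwned = copy;
        fText = copy;
    }

    const Ch* fText;  // what readers see: an alias or the owned copy
    Ch* fOwned;       // non-null only when this object owns the buffer
};

// Hypothetical stand-in for IDOM_Node with the doCloneStrings() hook.
class Node {
public:
    explicit Node(const Ch* name) : fName(name) {}
    virtual ~Node() {}
    virtual bool doCloneStrings() const { return false; }
    const Ch* getNodeName() const { return fName; }
private:
    const Ch* fName;
};

// Hypothetical back end whose strings may go away before the document does.
class TransientNode : public Node {
public:
    explicit TransientNode(const Ch* name) : Node(name) {}
    virtual bool doCloneStrings() const { return true; }
};

// What a DOM-side wrapper might do when handing a name to the user.
Str nodeName(const Node& node) {
    return Str(node.getNodeName(), node.doCloneStrings());
}
```

In this sketch a node that keeps its strings alive costs no allocation on
read, while a node that answers true from doCloneStrings() pays for one
copy per returned string. Either way, the first mutation gives the string
a private buffer and leaves the node's own data untouched.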