Hi Dave,

> I'm curious as to how the smart-pointer implementation will work
> without reference counts, and how those reference counts will be
> maintained in a thread-safe manner without explicit synchronization.
> Does your implementation use reference-counting?
> If so, are you doing simple integer increments?

My implementation does use reference counts to know when nodes are available
for recycling.  I did not use explicit synchronization, but as you and Samar
have pointed out, will need to if read-only thread safety is desired.
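
For illustration only, here is a minimal sketch of what a synchronized count could look like. It assumes a modern compiler with std::atomic, which is not what the current patch uses, and Body and Handle are hypothetical names rather than Xerces classes. The body is flagged as recycled instead of being freed so the effect is easy to observe.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical body: stands in for an IDOM node that carries a reference count.
struct Body {
    std::atomic<int> refs{0};
    bool recycled = false;  // set instead of freeing, so the effect is observable
};

// Hypothetical handle: increments the count on copy, decrements on destruction.
class Handle {
public:
    explicit Handle(Body* b) : body_(b) { acquire(); }
    Handle(const Handle& other) : body_(other.body_) { acquire(); }
    Handle& operator=(const Handle& other) {
        if (body_ != other.body_) {
            release();
            body_ = other.body_;
            acquire();
        }
        return *this;
    }
    ~Handle() { release(); }

    int useCount() const { return body_ ? body_->refs.load() : 0; }

private:
    void acquire() {
        if (body_) body_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    void release() {
        // fetch_sub returns the previous value; 1 means this was the last handle.
        if (body_ && body_->refs.fetch_sub(1, std::memory_order_acq_rel) == 1)
            body_->recycled = true;  // node is now available for recycling
    }
    Body* body_;
};
```

The atomic increment and decrement replace the plain integer operations; whether that cost is acceptable for read-only access is exactly the trade-off under discussion.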

Lenny

-----Original Message-----
From: David N Bertoni/Cambridge/IBM [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, April 30, 2002 12:11 AM
To: [EMAIL PROTECTED]
Subject: RE: Call for Vote: which one to be the Xerces-C++ public
supported W3C DOM interface



Hi Lenny,

If possible, I would prefer to avoid implicit conversion operators, as they
can lead to unexpected conversions and hard-to-find errors.  If people
really feel the implicit conversion of a smart pointer to bool is
necessary, then operator bool() is a better choice than operator int().
operator!() is also a possibility and is even less likely to cause problems.
If DOMString stays around, I'd also prefer an explicit accessor returning
const XMLCh*, rather than an implicit conversion.  I don't find calling
rawBuffer() or something similar to be such a burden.
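
A rough sketch of the options mentioned above, assuming a compiler that supports C++11 explicit conversion operators (NodeHandle and Node are hypothetical stand-ins, not Xerces classes). The explicit form still works in an if condition but does not silently take part in arithmetic or comparisons the way operator int() would.

```cpp
#include <cassert>

// Hypothetical body type; only its address matters for this sketch.
struct Node { int id; };

// Hypothetical handle showing three ways to test for "is assigned".
class NodeHandle {
public:
    explicit NodeHandle(Node* n = nullptr) : node_(n) {}

    // Explicit conversion: allowed in `if (h)`, rejected in `int x = h;`.
    explicit operator bool() const { return node_ != nullptr; }

    // operator! allows `if (!h)` without any conversion to a numeric type.
    bool operator!() const { return node_ == nullptr; }

    // Named query, in the style of the current handle classes.
    bool isNull() const { return node_ == nullptr; }

private:
    Node* node_;
};
```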

I'm curious as to how the smart-pointer implementation will work without
reference counts, and how those reference counts will be maintained in a
thread-safe manner without explicit synchronization.  Does your
implementation use reference-counting?  If so, are you doing simple integer
increments?  If the implementation is not reference-counted, how does it
work?

Thanks!

Dave




From: "Lenny Hoffman" <lennyhoffman@earthlink.net>
Sent: 04/29/2002 05:37 PM
To: <[EMAIL PROTECTED]>
cc: (bcc: David N Bertoni/Cambridge/IBM)
Subject: RE: Call for Vote: which one to be the Xerces-C++ public
supported W3C DOM interface
Please respond to xerces-c-dev





Hi Markus,

Thank you very much for the insight.

Note that simply accessing the IDOM implementation via handles does not
affect its thread safety, so your application is safe.

> if (pm_Element)
>     pm_Element->getAttribute(...);
>
> How can I do this with references?

You do it with the current handles like this:

if (!pm_Element.isNull())
    pm_Element.getAttribute(...);

Adding an int operator to DOM_Node would allow an even friendlier syntax,
e.g.:

if (pm_Element)
    pm_Element.getAttribute(...);

This could be easily added.

In fact, an -> operator could be added to the DOM_Node classes to get
this:

if (pm_Element)
    pm_Element->getAttribute(...);

This is now exactly what you started out with, and so is completely backward
compatible with your current use of the IDOM.
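
To make the idea concrete, here is a small sketch of a handle that forwards -> to its body. ElementBody, ElementHandle, and readId are hypothetical names used only for illustration, and the sketch uses an explicit operator bool (C++11) in place of the int operator mentioned above.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an IDOM element body.
struct ElementBody {
    std::string getAttribute(const std::string& name) const {
        return name == "id" ? "42" : "";
    }
};

// Hypothetical handle: testable in a condition, and forwards -> to the body.
class ElementHandle {
public:
    explicit ElementHandle(ElementBody* b = nullptr) : body_(b) {}
    explicit operator bool() const { return body_ != nullptr; }
    ElementBody* operator->() const { return body_; }

private:
    ElementBody* body_;
};

// Application code keeps the same shape it had with raw IDOM pointers.
std::string readId(const ElementHandle& pm_Element) {
    if (pm_Element)
        return pm_Element->getAttribute("id");
    return "<unassigned>";
}
```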


> XMLCh* are easier to handle than DOMString objects in ATL:  CComBSTR cBstr
> = pm_Element->getAttribute(...);

Good point.  The current DOMString class does not have an XMLCh* operator;
if it did, that would solve your problem.  I pretty much gutted the
original DOMString class to make it a simple wrapper around an XMLCh*
returned from IDOM implementations, rather than suffer the costs of the
cross-document string management of the original DOM.  As far as I can tell,
the only reason the original DOMString did not have an XMLCh* operator was
that there was no guarantee that its internal XMLCh* was null
terminated.  That guarantee now exists, so the operator can be
added; I will do that.  So your example remains:

CComBSTR cBstr = pm_Element->getAttribute(...);

Note that string classes are a convenient way to perform various operations
on a string without using the static (read functional) methods provided by
XMLString.  I even implemented COW (copy on write) behavior in the new
DOMString class, so that you can feel free to modify a string returned from
a node without having to manually make a copy.
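
For readers unfamiliar with COW, a minimal single-threaded sketch follows. It is not the DOMString code from the patch: CowString is a hypothetical name, and it uses std::shared_ptr and std::u16string as stand-ins for the real buffer management and XMLCh. Copies share one buffer until one of them is modified.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical copy-on-write string; char16_t stands in for XMLCh.
class CowString {
public:
    explicit CowString(std::u16string s)
        : data_(std::make_shared<std::u16string>(std::move(s))) {}

    // Read access never copies; the buffer is always null terminated.
    const char16_t* rawBuffer() const { return data_->c_str(); }
    const std::u16string& str() const { return *data_; }

    // Write access copies first if the buffer is shared with another string.
    void append(const std::u16string& tail) {
        detach();
        data_->append(tail);
    }

    bool sharesBufferWith(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    void detach() {
        // Single-threaded assumption: use_count() is not a safe test under
        // concurrent copying.
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::u16string>(*data_);
    }
    std::shared_ptr<std::u16string> data_;
};
```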

If folks don't find the DOMString wrapper to be that important, that frees
me up to simplify the handle classes and address one of Tinny's concerns.
Tinny pointed out that while the new design hides dual interfaces (DOM and
IDOM) from users, it does not hide them from DOM developers;  as DOM 3
support is added, each interface change would have to be made to both DOM
and IDOM classes.  The only reason I went with complete interface
replication instead of simple smart pointers for the handle classes was to
be able to translate XMLCh pointers returned from IDOM nodes into
DOMStrings.  If I am allowed to get rid of DOMString altogether I can make
the handle classes simple smart pointers that do not replicate IDOM
interfaces, and thus the duplication of effort is eliminated.

Lenny

 -----Original Message-----
From: Markus Fellner [mailto:[EMAIL PROTECTED]]
Sent: Monday, April 29, 2002 6:17 PM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: AW: Call for Vote: which one to be the Xerces-C++ public supported
W3C DOM interface

      OK, the main reason for my IDOM flirtation is...
      I've chosen IDOM because of its thread safety. And now I have several
      thousand lines of code using the IDOM interface.

      Some other reasons are...
      I have many IDOM_Element* members (pm_Elem) in my classes. After
      parsing, they are assigned once and then checked many times to see
      whether they are really assigned, and used for reading and writing
      attributes.

      if (pm_Element)
          pm_Element->getAttribute(...);

      How can I do this with references?

      XMLCh* are easier to handle than DOMString objects in ATL:  CComBSTR
      cBstr = pm_Element->getAttribute(...);
      ...

      Sorry for my short answer. I go on holiday tomorrow and I have to
      pack!

      I'm back in 2 weeks and looking forward to seeing the results of this
      vote.
      It's a pity to leave during a hot discussion on this list.

      Markus

      Markus
            -----Original Message-----
            From: Lenny Hoffman [mailto:[EMAIL PROTECTED]]
            Sent: Monday, April 29, 2002 11:54 PM
            To: [EMAIL PROTECTED]; [EMAIL PROTECTED];
            [EMAIL PROTECTED]
            Subject: RE: Call for Vote: which one to be the Xerces-C++
            public supported W3C DOM interface

            Hi Markus,

            To be clear, the fix I created for the IDOM was to recycle
            memory once a node or string is no longer needed.  To know
            when a node is no longer needed, I used the original DOM
            interface, but had it wrap the IDOM as its
            implementation.  IDOM performance is maintained, but ease of
            use is greatly improved.  Without the DOM handles to track
            whether an IDOM node is in use, application code will be
            drawn into explicitly stating when a node is no longer needed
            and can be recycled, which is yet another thing to be
            documented, and for application developers to get wrong and
            suffer the consequences of.

            If you love and use the IDOM for its performance, and you want
            the memory problem really fixed, not papered over by a
            workaround that only works if your application does everything
            right, then you will love what I have done by combining DOM
            classes as handles and IDOM classes as bodies.

            If what you love is working with pointers instead of with
            objects, please let me know why.

            One thing I have found harder with objects vs. pointers is
            downcasting from a node to derived objects like element.  The
            syntax is a bit cleaner with pointers; e.g.:

                DOM_Node node = ...
                DOM_Element elem = (const DOM_Element&)node;

            vs:

                IDOM_Node* node = ...
                IDOM_Element* elem = (IDOM_Element*)node;

            It is easy to forget to add the const in the first case, and the
            cast is somewhat non-intuitive because slicing can happen, though
            it is not a problem in this case.

            To solve this problem I have thought of adding overloaded
            constructors and assignment operators that take a DOM_Node to
            DOM_Node derived classes like DOM_Element.  Thus the first
            example becomes:

                DOM_Node node = ...
                DOM_Element elem =  node;

            Not only is this code more succinct, but it is safer, as the
            overloaded constructor and assignment operator can check for
            node compatibility via the getNodeType call.
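
As a rough sketch of how such a checked conversion might look (DomNode, DomElement, and NodeBody are hypothetical stand-ins for DOM_Node, DOM_Element, and the IDOM body; returning a null handle on mismatch is an assumption, since throwing would be equally plausible):

```cpp
#include <cassert>

enum NodeType { ELEMENT_NODE = 1, TEXT_NODE = 3 };

// Hypothetical body: only the node type matters for this sketch.
struct NodeBody { NodeType type; };

// Hypothetical base handle, standing in for DOM_Node.
class DomNode {
public:
    explicit DomNode(NodeBody* b = nullptr) : body_(b) {}
    bool isNull() const { return body_ == nullptr; }
    NodeType getNodeType() const { return body_->type; }

protected:
    NodeBody* body_;
};

// Hypothetical derived handle, standing in for DOM_Element.
class DomElement : public DomNode {
public:
    DomElement() = default;

    // Converting constructor: checks the node type instead of blindly casting.
    DomElement(const DomNode& node) : DomNode(checked(node)) {}

    // Assignment from a generic node performs the same check.
    DomElement& operator=(const DomNode& node) {
        DomNode::operator=(checked(node));
        return *this;
    }

private:
    // Assumption: an incompatible node yields a null handle.
    static DomNode checked(const DomNode& n) {
        if (!n.isNull() && n.getNodeType() == ELEMENT_NODE) return n;
        return DomNode();
    }
};
```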

            Again, please let me know what other aspects of pointers make
            things easier for you.

            > Hope your fix has no effects on thread-safe-ness!

            No effect whatsoever.

            Lenny
                  -----Original Message-----
                  From: Markus Fellner [mailto:[EMAIL PROTECTED]]
                  Sent: Monday, April 29, 2002 4:15 PM
                  To: [EMAIL PROTECTED];
                  [EMAIL PROTECTED]
                  Subject: AW: Call for Vote: which one to be the
                  Xerces-C++ public supported W3C DOM interface

                  Hi Lenny,

                  I hope your fix of the IDOM memory problem goes into the
                  next official release. But I use and love the IDOM
                  interface.
                  It's really easier for an old C++ programmer like me! And
                  I use IDOM because of its thread-safety properties. Hope
                  your fix has no effect on thread safety!

                  Markus

                        -----Original Message-----
                        From: Lenny Hoffman
                        [mailto:[EMAIL PROTECTED]]
                        Sent: Monday, April 29, 2002 5:57 PM
                        To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
                        Subject: RE: Call for Vote: which one to be the
                        Xerces-C++ public supported W3C DOM interface

                        Hi Markus,

                        The memory management problem is solved by recycling
                        nodes and strings that are no longer used.  The only
                        clean way I know to tell when nodes and strings are
                        in use is the handle/body pattern, which is
                        what is used by the original DOM.  What I have done
                        is use the original DOM handles and the IDOM
                        implementation, but fixed the IDOM memory problem.

                        Lenny
                              -----Original Message-----
                              From: Markus Fellner
                              [mailto:[EMAIL PROTECTED]]
                              Sent: Monday, April 29, 2002 10:54 AM
                              To: [EMAIL PROTECTED]
                              Subject: AW: Call for Vote: which one to be
                              the Xerces-C++ public supported W3C DOM
                              interface

                              If the memory management problem is solved, I
                              prefer IDOM!!!
                                    -----Original Message-----
                                    From: Tinny Ng
                                    [mailto:[EMAIL PROTECTED]]
                                    Sent: Monday, April 29, 2002 5:08 PM
                                    To: [EMAIL PROTECTED]
                                    Subject: Call for Vote: which one to be
                                    the Xerces-C++ public supported W3C DOM
                                    interface

                                    Hi everyone,

                                    I've reviewed Andy's design objective
                                    of IDOM, Lenny's view of old DOM and
                                    his redesign proposal, and some
                                    users' feedback.   Here is a "quick"
                                    summary and I would like to call for a
                                    VOTE about the fate of these two
                                    interfaces.

                                    1.0 Objective
                                    ==========
                                    1.  Define the strategy of Xerces-C++
                                    public DOM interface.  Decide which one
                                    to keep, old DOM interface or new IDOM
                                    interface


                                    2.0 Motivation
                                    ===========
                                    1. As a long term strategy, Xerces-C++
                                    shouldn't define two W3C DOM interfaces
                                    which simply confuses users.
                                        => We've already got many users'
                                    questions about what the difference is,
                                    which one to use ... etc.
                                    2. With limited resource, we should
                                    focus our development on ONE stream, no
                                    more duplicate effort
                                        => New DOM Level 3 development
                                    should be done on one interface, not
                                    both.
                                        => No more dual maintenance: two
                                    sets of samples (e.g. DOMPrint vs
                                    IDOMPrint), two parsers (DOMParser vs
                                    IDOMParser)
                                    3. To better place Apache Xerces-C++ in
                                    the market, we should have our Apache
                                    Recommended DOM C++ Binding in
                                    http://www.w3.org/DOM/Bindings
                                        => To encourage more users to
                                    develop DOM application AND
                                    implementation based on this binding.
                                        => Such binding should just define
                                    a set of abstract base classes (similar
                                    to JAVA interface) where no
                                    implementation model is assumed


                                    3.0 History
                                    =========
                                    'DOM' was the initial "W3C DOM
                                    interface" developed by Xerces-C++.
                                    However the performance of its
                                    implementation is not quite
                                    satisfactory.

                                    Last year, Andy Heninger came up with a
                                    new design with faster performance, and
                                    such implementation came with a new set
                                    of interface => 'IDOM'.

                                    Currently both 'DOM' and 'IDOM' are
                                    shipped with Xerces-C++.  'IDOM' is
                                    claimed as experimental (like a
                                    prototype) and is subject to change.

                                    More information can be found in:

                                    http://xml.apache.org/xerces-c/program.html
                                    http://www.apache.org/~andyh/
                                    http://marc.theaimsgroup.com/?t=101650188300002&r=1&w=2
                                    http://marc.theaimsgroup.com/?w=2&r=1&s=Proposal%3A+C%2B%2B+Language+Binding+for+DOM+L&q=t



                                    4.0 IDOM
                                    =========
                                    4.1 Interface
                                    ==========

                                    4.1.1 Features of IDOM Interface
                                    --------------------------------
                                    e.g. virtual IDOM_Element*
                                    IDOM_Document::createElement(const
                                    XMLCh* tagName) = 0;

                                    1. Define as abstract base classes
                                    2. Use normal C++ pointers.
                                        => So that abstract base class is
                                    possible.
                                        => Make it more C++ like. Less Java
                                    like.


                                    4.1.2 Pros and Cons of IDOM Interface
                                    -------------------------------------
                                    Pros:
                                    1. Abstract base classes that
                                    correspond to the W3C DOM interfaces
                                        => Can be recommended as Apache DOM
                                    C++ Binding
                                        => More standard like, no
                                    implementation assumed as they are just
                                    abstract interfaces using pure virtual
                                    functions
                                    2. (Depends on users' preference)
                                        - some users prefer a C++-like style

                                    Cons:
                                    1. IDOM_XXX - weird prefix 'I'
                                        Solution:
                                            - Proposed to rename to DOMXXXX
                                    which also matches the DOM Level 3
                                    naming convention
                                    2. (Depends on users' preference)
                                        - some users do not like pointers
                                    and want a Java-like interface that is
                                    easy to use, easy to learn, and easy to
                                    port to (from Java).
                                    3. As the old DOM interface has been
                                    around for a long time, the majority of
                                    current Xerces-C++ users still use it,
                                    so the migration impact is significant
                                        Solution:
                                            - Announce the deprecation of
                                    old DOM interface for a couple of
                                    releases before removal

                                    4.2 Implementation
                                    ===============
                                    4.2.1 Features of IDOM Implementation
                                    -------------------------------------
                                    1. Use an independent storage allocator
                                    per document. The advantage here is
                                    that allocation would require no
                                    synchronization
                                        => Fast, good scalability, reduced
                                    memory footprint
                                    2. Use plain, null-terminated (XMLCh *)
                                    utf-16 strings.
                                        => No DOMString class overhead
                                    which is another performance
                                    contributor that makes IDOM faster


                                    4.2.2 Downside of IDOM Implementation
                                    -------------------------------------
                                    1. Manual memory management
                                        - If document comes from parser,
                                    then parser owns the document.  If
                                    document comes from DOMImplementation,
                                    then users are responsible to delete
                                    it.
                                        Solution:
                                            - Provide a means of
                                    disassociating a document from the
                                    parser
                                            - Add a function
                                    "Node::release()", similar to the idea
                                    of "Range::detach", which allows users
                                    to indicate the release of the Node.
                                                - From C++ Binding abstract
                                    interface perspective, it's up to
                                    implementation how to handle this
                                    "release()" function.
                                                - With Xerces-C++ IDOM
                                    implementation, the release() function
                                    will delete the 'this' pointer if it is
                                    a document, else no-op.
                                    2. Memory retained until the document
                                    is deleted.
                                        - If you change the value of an
                                    attribute or call removeNode many
                                    times,  the memory of the old value is
                                    not deallocated for reuse and the
                                    document grows and grows
                                        Solution:
                                            - This in fact is a tradeoff
                                    for the fast performance offered by
                                    independent storage allocator.
                                            - There is no immediate good
                                    solution in place
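
To illustrate the retention problem described in 4.2.2, here is a simplified sketch of a per-document bump allocator. DocumentArena and Attribute are hypothetical names, not the IDOM classes, and the sketch assumes each request is smaller than one block. Allocation needs no lock and no per-block bookkeeping, but an overwritten value stays in the arena until the whole document goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical per-document arena: bump allocation, nothing freed individually.
class DocumentArena {
public:
    // Assumption for this sketch: n never exceeds kBlockSize.
    char* allocate(std::size_t n) {
        if (blocks_.empty() || used_ + n > kBlockSize) {
            blocks_.emplace_back(new char[kBlockSize]);
            used_ = 0;
        }
        char* p = blocks_.back().get() + used_;
        used_ += n;
        total_ += n;
        return p;
    }

    // Bytes handed out so far; never decreases while the document lives.
    std::size_t bytesInUse() const { return total_; }

private:
    static constexpr std::size_t kBlockSize = 4096;
    std::vector<std::unique_ptr<char[]>> blocks_;
    std::size_t used_ = 0;
    std::size_t total_ = 0;
};

// Hypothetical attribute whose value lives in the document's arena.
struct Attribute {
    const char* value = nullptr;

    void setValue(DocumentArena& arena, const char* v) {
        std::size_t n = std::strlen(v) + 1;
        char* copy = arena.allocate(n);
        std::memcpy(copy, v, n);
        value = copy;  // the old value stays in the arena, unreachable
    }
};
```

Setting the same attribute repeatedly grows bytesInUse() each time, which is the "document grows and grows" behavior noted above.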


                                    5.0 old DOM
                                    ==========
                                    5.1 Interface
                                    ==========

                                    5.1.1 Features of old DOM Interface
                                    -----------------------------------
                                    e.g. DOM_Element
                                    DOM_Document::createElement(const
                                    DOMString tagName);

                                    1. Use smart pointers - Java-like


                                    5.1.2 Pros and Cons of old DOM Interface
                                    ----------------------------------------
                                    Pros:
                                    1. DOM_XXX - reasonable name
                                    2. (Depends on users' preference)
                                        - some users want a Java-like
                                    interface that is easy to use, easy to
                                    learn, and easy to port to (from Java).
                                    3. Not that many users have migrated to
                                    IDOM yet, so migration impact is
                                    minimal.

                                    Cons:
                                    1. Not abstract base class
                                        - Cannot be recommended as Apache
                                    DOM C++ Binding
                                        - Implementation (smart pointer
                                    indirection) is assumed
                                        Solution:
                                            - This in fact is a tradeoff
                                    for the ease of use of smart pointer
                                    design
                                            - No solution.
                                    2. (Depends on users' preference)
                                        - some users want a C++-like style,
                                    as this is a C++ interface


                                    5.2 Implementation
                                    ===============
                                    5.2.1 Features of old DOM Implementation
                                    ----------------------------------------
                                    1. Automatic memory management
                                        - Memory is released when there are
                                    no more handles pointing to it
                                        - Uses a reference count to keep
                                    track of handles
                                    2. Use thread-safe DOMString class


                                    5.2.2 Downside of old DOM Implementation
                                    ----------------------------------------
                                    1. Performance is slow
                                        - Memory management is the biggest
                                    time consumer, and the memory footprint
                                    is large.
                                        - A great many blocks are
                                    allocated when creating a document and
                                    then freed when finished with it. Each
                                    and every node requires at least one
                                    and sometimes several separately
                                    allocated blocks. A DOMString takes
                                    three. It adds up.
                                        Solution:
                                            - Lenny suggests using the IDOM
                                    interface internally in the DOM
                                    implementation (patch in Bugzilla 5967)
                                            - The performance benefits of
                                    IDOM are then gained, but the memory
                                    retention problem in the IDOM
                                    implementation still remains to be
                                    addressed.
                                            - Internally, we would have a
                                    dual-interface maintenance model, as the
                                    IDOM interface is then used by DOM
                                    internally.


                                    Vote Question:
                                    ============
                                    I would like to call for a vote:

                                        ==>  Which INTERFACE should be the
                                    Xerces-C++ public supported W3C DOM
                                    Interface, DOM or IDOM? <===

                                    Note:
                                    1. The question asks which "interface"
                                    is to be officially supported.
                                    Once the interface is chosen,
                                    we can discuss how to solve the
                                    downsides of its implementation as the
                                    next topic.
                                    2. The interface that wins the vote will
                                    become the ONLY Xerces-C++ supported
                                    public W3C DOM Interface, and is where
                                    DOM Level 3 will be implemented.
                                    3. The API of the other interface will
                                    be deprecated, and its samples and
                                    associated parser will eventually be
                                    removed from the distribution.






---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

