[Xmlunit-general] "New" DifferenceEngine

Stefan Bodewig Mon, 17 May 2010 04:24:28 -0700

Hi,

I'm writing this mail in part to share my thoughts and get feedback and
in part to help myself get a clearer picture of how I want the
difference engine of XMLUnit 2.0 to work.  It won't be too different
from the Java version of 1.x.


The difference engine compares (I)Source instances.  There are already
plenty of implementations of this interface and even a fluent builder
API to create such instances.

Many of XMLUnit Java's options will be resolved by specialized Source
implementations.  There will be a CommentLessSource decorator which
strips the comments from an existing Source as a way to emulate
XMLUnit.setIgnoreComments for example.  This should also apply to all
whitespace manipulations.
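
To make the decorator idea concrete, here is a minimal sketch using only
the JDK's JAXP classes.  It is not the planned CommentLessSource: the
class name CommentStripping and its methods are made up for
illustration, and it copies the input eagerly into a DOM tree instead of
filtering lazily.

```java
import java.io.StringReader;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Node;

// Hypothetical helper, not part of XMLUnit.
final class CommentStripping {

    /** Convenience factory for trying the sketch with literal XML. */
    static Source fromString(String xml) {
        return new StreamSource(new StringReader(xml));
    }

    /** Copies the source into a DOM tree and returns it without comments. */
    static DOMSource withoutComments(Source original) {
        try {
            DOMResult copy = new DOMResult();
            // A Transformer created without a stylesheet performs an identity copy.
            TransformerFactory.newInstance().newTransformer().transform(original, copy);
            stripComments(copy.getNode());
            return new DOMSource(copy.getNode());
        } catch (Exception e) {
            throw new IllegalStateException("could not copy source", e);
        }
    }

    /** Removes all comment nodes below the given node, in place. */
    static void stripComments(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before removing
            if (child.getNodeType() == Node.COMMENT_NODE) {
                node.removeChild(child);
            } else {
                stripComments(child);
            }
            child = next;
        }
    }

    /** Counts comment nodes below the given node. */
    static int countComments(Node node) {
        int count = 0;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            count += c.getNodeType() == Node.COMMENT_NODE ? 1 : countComments(c);
        }
        return count;
    }
}
```

The engine itself would never need to know about comments in this
setup; it would simply be handed the stripped Source.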

The difference engine is supposed to traverse the document tree from the
root in depth-first order.  Even though the initial implementation will
be DOM-based, I hope it will be possible to provide pull-parser (StAX or
XMLReader based) or event-driven (SAX) implementations that have the
potential to compare bigger documents.
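
For the DOM case the traversal order can be sketched in a few lines.
The class DepthFirstWalk and its methods are hypothetical names; the
sketch only records element names in visiting order and does not
compare anything yet.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
final class DepthFirstWalk {

    /** Parses literal XML and returns its root element. */
    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Returns the names of all elements in depth-first (pre-order) order. */
    static List<String> elementOrder(Node start) {
        List<String> out = new ArrayList<>();
        walk(start, out);
        return out;
    }

    private static void walk(Node node, List<String> out) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            out.add(node.getNodeName()); // visit the node before its children
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            walk(c, out);
        }
    }
}
```

For `<a><b><c/></b><d/></a>` this visits a, b, c, d, which is also the
order in which a SAX or StAX parser would report the start tags.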

As the engine traverses the tree it will compare certain aspects of the
nodes - I guess I'll need to document properly which aspects these are
and in which order the comparisons occur.  Basically these comparisons
will all be testing for exact equality.

If a comparison results in something that is not equal, the registered
DifferenceEvaluator will be consulted (passing in
ComparisonResult.DIFFERENT) and will determine whether the difference is
recoverable (in the same sense as the term was used in 1.x),
non-recoverable, or non-recoverable in a way that should stop the
comparison at this point.

The default implementation is more or less the same as that of XMLUnit
1.x; the main difference is that a Text node and a CDATA node with the
same textual content compare as recoverably different.
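
A rough sketch of that Text/CDATA rule on plain DOM nodes follows.  Only
ComparisonResult.DIFFERENT is a name from this mail; the other enum
constants (EQUAL, SIMILAR for "recoverable", CRITICAL for "stop"), the
class TextVsCdata and its methods are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
final class TextVsCdata {

    // Assumed set of outcomes; only DIFFERENT is named in the design notes.
    enum ComparisonResult { EQUAL, SIMILAR, DIFFERENT, CRITICAL }

    /** Parses literal XML and returns the first child of its root element. */
    static Node firstChildOfRoot(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setCoalescing(false); // keep CDATA sections as nodes of their own
            return f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement().getFirstChild();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    static boolean isTextual(Node n) {
        return n.getNodeType() == Node.TEXT_NODE
                || n.getNodeType() == Node.CDATA_SECTION_NODE;
    }

    /** Classifies the outcome of comparing the node types of two nodes. */
    static ComparisonResult evaluateNodeType(Node control, Node test) {
        if (control.getNodeType() == test.getNodeType()) {
            return ComparisonResult.EQUAL;
        }
        // Text vs. CDATA with identical content: different, but recoverable.
        if (isTextual(control) && isTextual(test)
                && control.getNodeValue().equals(test.getNodeValue())) {
            return ComparisonResult.SIMILAR;
        }
        return ComparisonResult.DIFFERENT;
    }
}
```

With this, `<a>x</a>` against `<a><![CDATA[x]]></a>` yields SIMILAR for
the child node, while a text child against an element child stays
DIFFERENT.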

ComparisonListeners can be attached to the DifferenceEngine and will -
at this point - be notified of the comparison and its outcome.
Listeners register interest in successful, failed or all comparisons and
will be notified based on the outcome.

If the DifferenceEvaluator instructed the DifferenceEngine to stop, the
comparison gets aborted; otherwise the tree traversal continues.
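
The interplay of evaluator, listeners and the stop signal might look
roughly like the sketch below.  It compares plain string pairs instead
of DOM nodes to stay short.  All method names, the listener signature
and the enum constants other than DIFFERENT are assumptions, not a
settled API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical engine skeleton, not the XMLUnit 2.0 implementation.
final class EngineSketch {

    enum ComparisonResult { EQUAL, SIMILAR, DIFFERENT, CRITICAL }

    interface DifferenceEvaluator {
        ComparisonResult evaluate(String control, String test, ComparisonResult outcome);
    }

    interface ComparisonListener {
        void comparisonPerformed(String control, String test, ComparisonResult outcome);
    }

    private final List<ComparisonListener> matchListeners = new ArrayList<>();
    private final List<ComparisonListener> differenceListeners = new ArrayList<>();
    private final DifferenceEvaluator evaluator;

    EngineSketch(DifferenceEvaluator evaluator) {
        this.evaluator = evaluator;
    }

    void addMatchListener(ComparisonListener l) { matchListeners.add(l); }

    void addDifferenceListener(ComparisonListener l) { differenceListeners.add(l); }

    /** Registers a listener for all comparisons, whatever their outcome. */
    void addComparisonListener(ComparisonListener l) {
        matchListeners.add(l);
        differenceListeners.add(l);
    }

    /** Compares the pairs in order and returns how many comparisons ran. */
    int compareAll(List<String[]> pairs) {
        int performed = 0;
        for (String[] pair : pairs) {
            // Exact equality first; the evaluator only sees non-equal results.
            ComparisonResult outcome = Objects.equals(pair[0], pair[1])
                    ? ComparisonResult.EQUAL
                    : evaluator.evaluate(pair[0], pair[1], ComparisonResult.DIFFERENT);
            performed++;
            List<ComparisonListener> targets =
                    outcome == ComparisonResult.EQUAL ? matchListeners : differenceListeners;
            for (ComparisonListener l : targets) {
                l.comparisonPerformed(pair[0], pair[1], outcome);
            }
            if (outcome == ComparisonResult.CRITICAL) {
                break; // the evaluator asked the engine to stop
            }
        }
        return performed;
    }
}
```

Note that listeners are notified before the stop check, so the
comparison that triggers the abort is still reported.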

The remaining interface/delegate is ElementSelector which replaces 1.x's
ElementQualifier.  I'm a bit torn whether a selector based on the
element's name should be the default (as it is in 1.x) or a pure order
based one.

The ElementSelector gets two DOM Elements of the control and test docs
and decides whether they can be compared.  This may require
materializing the DOM tree below each element for non-DOM
implementations - I hope to be able to do so in a lazy manner somehow
if/whenever I start implementing such engines.

The ElementSelector will be invoked whenever the child elements of an
Element are to be compared and there is more than one child.
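
To illustrate the two candidate defaults, here is a minimal sketch of a
selector interface and a greedy matching loop.  The method name
canBeCompared, the constants BY_NAME and BY_ORDER and the class
SelectorSketch are assumptions; BY_NAME also ignores namespaces for
brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XMLUnit.
final class SelectorSketch {

    interface ElementSelector {
        boolean canBeCompared(Element control, Element test);
    }

    /** Name-based selection, similar in spirit to the 1.x default. */
    static final ElementSelector BY_NAME =
            (c, t) -> c.getNodeName().equals(t.getNodeName());

    /** Pure order-based selection: any pair is acceptable. */
    static final ElementSelector BY_ORDER = (c, t) -> true;

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                out.add((Element) c);
            }
        }
        return out;
    }

    /**
     * For each control child, returns the index of the first unused test
     * child the selector accepts, or -1 if there is none.
     */
    static int[] match(Element control, Element test, ElementSelector selector) {
        List<Element> cs = childElements(control);
        List<Element> ts = childElements(test);
        boolean[] used = new boolean[ts.size()];
        int[] result = new int[cs.size()];
        for (int i = 0; i < cs.size(); i++) {
            result[i] = -1;
            for (int j = 0; j < ts.size(); j++) {
                if (!used[j] && selector.canBeCompared(cs.get(i), ts.get(j))) {
                    used[j] = true;
                    result[i] = j;
                    break;
                }
            }
        }
        return result;
    }
}
```

For the children of `<r><a/><b/></r>` and `<r><b/><a/></r>` the
name-based selector pairs a with a and b with b (indexes 1 and 0), while
the order-based one pairs them positionally (indexes 0 and 1).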

In a forum thread somebody raised the idea of deferring default element
selection to the DifferenceEngine itself if there is no explicit
ElementSelector, i.e. compare a combination of Elements and try a
different one if the comparison doesn't have a successful outcome.  In
order to avoid duplicate comparisons this would require queueing up
comparisons to notify the listeners once the Elements are selected.  At
least I want to toy with the idea.
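
One way to toy with it: run each trial comparison into a private queue
and only publish the queue once a pairing has been chosen.  Everything
in the sketch below is an assumption, including the class name, the two
aspects compared (element name and text content) and the fallback of
publishing the first failed attempt when no candidate matches.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical experiment, not part of XMLUnit.
final class DeferredSelection {

    static Element root(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    static List<Element> childElements(Element parent) {
        List<Element> out = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) out.add((Element) c);
        }
        return out;
    }

    /** Compares name and text content, queueing one event per aspect. */
    static boolean compare(Element c, Element t, List<String> queue) {
        boolean sameName = c.getNodeName().equals(t.getNodeName());
        boolean sameText = c.getTextContent().equals(t.getTextContent());
        queue.add("name " + c.getNodeName() + " vs " + t.getNodeName()
                + ": " + (sameName ? "EQUAL" : "DIFFERENT"));
        queue.add("text " + c.getTextContent() + " vs " + t.getTextContent()
                + ": " + (sameText ? "EQUAL" : "DIFFERENT"));
        return sameName && sameText;
    }

    /** Returns the events that would actually reach the listeners. */
    static List<String> compareChildren(Element control, Element test) {
        List<Element> ts = childElements(test);
        boolean[] used = new boolean[ts.size()];
        List<String> published = new ArrayList<>();
        for (Element c : childElements(control)) {
            List<String> fallback = null;
            int fallbackIndex = -1;
            boolean matched = false;
            for (int j = 0; j < ts.size() && !matched; j++) {
                if (used[j]) continue;
                List<String> queue = new ArrayList<>(); // private to this attempt
                if (compare(c, ts.get(j), queue)) {
                    used[j] = true;
                    published.addAll(queue); // pairing chosen, publish its events
                    matched = true;
                } else if (fallback == null) {
                    fallback = queue; // keep the first failed attempt
                    fallbackIndex = j;
                }
            }
            if (!matched && fallback != null) {
                used[fallbackIndex] = true;
                published.addAll(fallback);
            } else if (!matched) {
                published.add("no counterpart for " + c.getNodeName());
            }
        }
        return published;
    }
}
```

Comparing the children of `<r><a>1</a><a>2</a></r>` with those of
`<r><a>2</a><a>1</a></r>` this way publishes only events from the two
successful pairings; the failed first attempt is discarded silently.
The price is that some comparisons run more than once internally.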

Stefan

_______________________________________________
Xmlunit-general mailing list
Xmlunit-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xmlunit-general
