dleslie 00/04/21 07:17:15
Added: xdocs/sources/design design1_1_0.xml process.gif
xalan1_1x1.gif xmllogo.gif
Log:
Preliminary design documentation for Xalan-J 1.1.0
Revision Changes Path
1.1 xml-xalan/xdocs/sources/design/design1_1_0.xml
Index: design1_1_0.xml
===================================================================
<?xml version="1.0"?>
<!DOCTYPE s1 SYSTEM "sbk:/style/dtd/document.dtd">
<s1 title="Xalan-J 1.1.0 Design">
<p><link>Xalan-J 1.1.0 Design</link><img src="xmllogo.gif"
alt="xmllogo.gif"/></p>
<ul>
<li>Author: Scott Boag</li>
<li>State: In Progress</li>
</ul>
<s2 title="Introduction">
<p><link>Introduction</link></p>
<p>This document presents the basic design for Xalan-J 1.1.0, which is
a
<jump
href="http://www.awl.com/cseng/titles/0-201-89542-0/techniques/refactoring.htm">refactoring</jump>
and redesign of the Xalan-J 1.0.1 processor. The main goals of
this redesign are
to: </p>
<ol>
<li>Make the design and code more understandable to open-source
contributors.</li>
<li>Reduce code size and complexity.</li>
<li>By simplifying the code, make optimization easier.</li>
<li>Make modules generally more localized, and less tangled
with other
modules.</li>
<li>Begin the adoption of the TRaX (Transformations for XML)
interfaces.</li>
</ol>
<p>The techniques used toward these goals are to:</p>
<ol>
<li>In general, flatten the hierarchy of packages, in order to
make the
structure more apparent from the top-level view.</li>
<li>Break the construction and the validation of the XSLT
stylesheet from
the stylesheet objects themselves.</li>
<li>Drive the construction of the stylesheet through a table,
so that it
is less prone to error.</li>
<li>Break the transformation process into a separate package,
away from
the stylesheet objects.</li>
<li>Create this design document, as a start-point for people
wanting to
approach the code.</li>
</ol>
<p>The goals are not:</p>
<ol>
<li>To add more features in the course of this refactoring.
This is
design and code clean-up, to meet the above-named goals.
However, it is expected that it will be <em>much</em>
easier to add
features once this work is completed.</li>
<li>To optimize code for the sake of optimization. However, it
is
expected that the code will be faster once the work is
complete.</li>
</ol>
<p>How well we've achieved the goals will be measured by feedback from
the
Xalan-dev list, and by software metrics tools.</p>
<p>Please note that the diagrams in this design document are meant to
be
useful abstractions, and may not always be exact.</p>
</s2>
<s2 title="Overview of Architecture">
<p><link>Overview of Architecture</link></p>
<p>Xalan 1.1.0 is divided into four major modules, and various smaller
modules. The main modules are:</p>
<gloss>
<label><code><link
anchor="process">org.apache.xalan.process</link></code></label>
<item>The module that processes the stylesheet, and provides
the main
entry point into Xalan.</item>
</gloss>
<gloss>
<label><code><link
anchor="templates">org.apache.xalan.templates</link></code></label>
<item>The module that defines the stylesheet structures,
including the
Stylesheet object, template element instructions, and
Attribute Value
Templates. </item>
</gloss>
<gloss>
<label><code><link
anchor="transformer">org.apache.xalan.transformer</link></code></label>
<item>The module that applies the Templates to the source tree,
and
produces a result tree.</item>
</gloss>
<gloss>
<label><code><link
anchor="xpath">org.apache.xalan.xpath</link></code></label>
<item>The module that processes both XPath expressions, and
XSLT Match
patterns.</item>
</gloss>
<p>In addition to the above modules, Xalan implements the
<link anchor="xxx">TRaX</link> interfaces, and depends on the
<link anchor="xxx">SAX2</link> and <link anchor="xxx">DOM</link>
packages.
There is also a general utilities package that contains both XML
utility
classes such as QName, and generally useful classes such as
StringToIntTable.</p>
<p>In the diagram below, the dashed lines denote visibility. All
packages
access the SAX2 and DOM packages.</p>
<p><img src="xalan1_1x1.gif" alt="xalan1_1x1.gif"/></p>
<p>In addition to the above packages, there are the following
additional
packages:</p>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.client</link></code></label>
<item>This package has a client applet. I suspect this should
be moved
into the samples directory.</item>
</gloss>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.extensions</link></code></label>
<item>This holds classes belonging to the Xalan extensions
mechanism,
which allows Java code and script to be called from within a
stylesheet.</item>
</gloss>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.lib</link></code></label>
<item>This is the built-in Xalan extensions library, which holds
extensions such as Redirect (which allows a stylesheet to
produce multiple
output files).</item>
</gloss>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.res</link></code></label>
<item>This holds resource files needed by Xalan, such as error
message
resources.</item>
</gloss>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.trace</link></code></label>
<item>This package contains classes and interfaces that allow a
caller to
add trace listeners to the transformation, allowing an
interface to XSLT
debuggers and similar tools.</item>
</gloss>
<gloss>
<label><code><link
anchor="xxx">org.apache.xalan.xslt</link></code></label>
<item>This package is for backwards compatibility with
applications that
depend on Xalan 1.0.x interfaces.</item>
</gloss>
</s2><anchor name="process"/>
<s2 title="Process Module">
<p><link>Process Module</link></p>
<p>The <code>org.apache.xalan.process</code> module implements the
<code>org.apache.xalan.trax.Processor</code> interface, which
provides a
factory method for creating a concrete Processor instance, and
provides methods
for creating a <code>org.apache.xalan.trax.Templates</code>
instance, which, in
Xalan and XSLT terms, is the Stylesheet. Thus the task of the
process module is
to read the XSLT input in the form of a file, stream, SAX
events, or a DOM
tree, and produce a Templates/Stylesheet object.</p>
<p>The overall strategy is to define a schema that dictates the legal
structure for XSLT elements and attributes, and to associate
with those
elements construction-time processors that can fill in the
appropriate fields
in the top-level Stylesheet object, and also associate classes
in the templates
module that can be created in a generalized fashion. This makes
the validation and the
object-to-class associations centralized and declarative.</p>
<p>The schema's root class is
<code>org.apache.xalan.processor.XSLTSchema</code>, and it is
here that the
XSLT schema structure is defined. XSLTSchema uses
<code>org.apache.xalan.processor.XSLTElementDef</code> to
define elements, and
<code>org.apache.xalan.processor.XSLTAttributeDef</code> to
define attributes.
Both classes hold the allowed namespace, local name, and type
of element or
attribute. The XSLTElementDef also holds a reference to a
<code>org.apache.xalan.processor.XSLTElementProcessor</code>,
and sometimes a
<code>Class</code> object, with which it can create objects
that derive from
<code>org.apache.xalan.templates.ElemTemplateElement</code>. In
addition, the
XSLTElementDef instance holds a list of XSLTElementDef
instances that define
legal elements or character events that are allowed as children
of the given
element.</p>
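<p>The table-driven idea described above can be sketched roughly as
follows. This is a minimal illustration only, not the actual Xalan
classes: the names <code>ElementDefSketch</code> and
<code>SchemaSketch</code>, the three elements chosen, and the classes
registered for them are all assumptions made for the example.</p>

<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for an element definition in the schema
// table. Names and fields are assumptions, not the real Xalan classes.
final class ElementDefSketch {
    private final String m_namespace;
    private final String m_localName;
    private final Class<?> m_classObject; // may be null: not every element creates an object
    private final List<ElementDefSketch> m_children = new ArrayList<>();

    ElementDefSketch(String namespace, String localName, Class<?> classObject) {
        m_namespace = namespace;
        m_localName = localName;
        m_classObject = classObject;
    }

    /** Declares a legal child element and returns this definition for chaining. */
    ElementDefSketch allow(ElementDefSketch child) {
        m_children.add(child);
        return this;
    }

    /** Returns the definition of a legal child, or null if the child is not allowed. */
    ElementDefSketch findChild(String namespace, String localName) {
        for (ElementDefSketch def : m_children) {
            if (def.m_namespace.equals(namespace) && def.m_localName.equals(localName)) {
                return def;
            }
        }
        return null;
    }

    /** Creates the associated object in a generalized fashion, if a class is registered. */
    Object newInstance() {
        if (m_classObject == null) {
            return null;
        }
        try {
            return m_classObject.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

final class SchemaSketch {
    static final String XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

    /** Builds a tiny declarative table: stylesheet allows template, template allows value-of. */
    static ElementDefSketch build() {
        // The registered classes are placeholders for template element classes.
        ElementDefSketch valueOf = new ElementDefSketch(XSLT_NS, "value-of", StringBuilder.class);
        ElementDefSketch template =
            new ElementDefSketch(XSLT_NS, "template", ArrayList.class).allow(valueOf);
        return new ElementDefSketch(XSLT_NS, "stylesheet", null).allow(template);
    }
}
```
]]></source>

<p>In this sketch, validation is a lookup in the table (a null result
means the child is not legal at that point), and object creation needs
no per-element code.</p>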
<p>The implementation of the
<code>org.apache.xalan.trax.Processor</code>
interface is in
<code>org.apache.xalan.processor.StylesheetProcessor</code>,
which creates a
<code>org.apache.xalan.processor.StylesheetHandler</code>
instance. This instance acts as the ContentHandler for the
parse events, and is
handed to the <code>org.xml.sax.XMLReader</code>, which the
StylesheetProcessor
uses to parse the XSLT document. The StylesheetHandler then
receives the parse
events, maintains the state of the construction, and
passes each event on
to the appropriate XSLTElementProcessor, as
dictated by the
XSLTElementDef that is associated with the given event.</p>
<p><img src="process.gif" alt="process.gif"/></p>
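<p>The flow of parse events described above can be sketched with the
standard SAX2 and JAXP classes. This is only an illustration under
stated assumptions: <code>ElementProcessorSketch</code> and
<code>DispatchingHandlerSketch</code> are hypothetical names, the
lookup is a plain map rather than the schema table, and the processors
merely log what they see.</p>

<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical per-element processor; stands in for a construction-time processor. */
interface ElementProcessorSketch {
    void startElement(String localName, Attributes atts, List<String> log);
}

/** Hypothetical handler: receives parse events and dispatches them by element name. */
final class DispatchingHandlerSketch extends DefaultHandler {
    private final Map<String, ElementProcessorSketch> m_processors = new HashMap<>();
    private final List<String> m_log = new ArrayList<>();

    void register(String namespace, String localName, ElementProcessorSketch processor) {
        m_processors.put("{" + namespace + "}" + localName, processor);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        ElementProcessorSketch processor = m_processors.get("{" + uri + "}" + localName);
        if (processor == null) {
            m_log.add("unhandled:" + localName);
        } else {
            processor.startElement(localName, atts, m_log);
        }
    }

    List<String> getLog() {
        return m_log;
    }

    /** Parses the given stylesheet text and returns what the processors logged. */
    static List<String> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            DispatchingHandlerSketch handler = new DispatchingHandlerSketch();
            String xslt = "http://www.w3.org/1999/XSL/Transform";
            handler.register(xslt, "stylesheet",
                (name, atts, log) -> log.add("stylesheet version=" + atts.getValue("version")));
            handler.register(xslt, "template",
                (name, atts, log) -> log.add("template match=" + atts.getValue("match")));

            // The handler is handed to the XMLReader as its ContentHandler.
            reader.setContentHandler(handler);
            reader.parse(new InputSource(new StringReader(xml)));
            return handler.getLog();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```
]]></source>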
</s2><anchor name="templates"/>
<s2 title="Templates Module">
<p><link>Templates Module</link></p>
<p>The <code>org.apache.xalan.templates</code> module implements the
<code>org.apache.xalan.trax.Templates</code> interface, and
defines a set of
classes that represent a Stylesheet. The primary purpose of
this module is to
hold stylesheet data, not to perform procedural tasks
associated with the
construction of the data, nor tasks associated with the
transformation itself.
</p>
<p>A <code>StylesheetRoot</code>, which implements the
<code>Templates</code> interface, is a type of
<code>StylesheetComposed</code>,
which is a <code>Stylesheet</code> composed of itself and all
included
<code>Stylesheet</code> objects. A <code>StylesheetRoot</code>
has a global
imports list, which is a list of all imported
<code>StylesheetComposed</code>
instances. From each <code>StylesheetComposed</code> object,
one can iterate
through the list of directly or indirectly included
<code>Stylesheet</code>
objects, and one can also iterate through the list of all
<code>StylesheetComposed</code> objects of lesser import
precedence.
<code>StylesheetRoot</code> is a
<code>StylesheetComposed</code>, which is a
<code>Stylesheet</code>.</p>
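<p>The relationship between the three kinds of stylesheet object might
be pictured as in the sketch below. The class and method names
(<code>SheetSketch</code>, <code>ComposedSheetSketch</code>,
<code>RootSheetSketch</code> and their accessors) are hypothetical, and
the lists are collected in simple depth-first order, which is an
assumption and not necessarily XSLT import precedence order.</p>

<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plain stylesheet: knows only its own direct includes. */
class SheetSketch {
    private final String m_href;
    private final List<SheetSketch> m_includes = new ArrayList<>();

    SheetSketch(String href) {
        m_href = href;
    }

    String getHref() {
        return m_href;
    }

    void addInclude(SheetSketch sheet) {
        m_includes.add(sheet);
    }

    List<SheetSketch> getIncludes() {
        return m_includes;
    }
}

/** Hypothetical composed stylesheet: itself plus everything it includes. */
class ComposedSheetSketch extends SheetSketch {
    private final List<ComposedSheetSketch> m_imports = new ArrayList<>();

    ComposedSheetSketch(String href) {
        super(href);
    }

    void addImport(ComposedSheetSketch sheet) {
        m_imports.add(sheet);
    }

    List<ComposedSheetSketch> getImports() {
        return m_imports;
    }

    /** All directly or indirectly included sheets, depth first. */
    List<SheetSketch> getIncludesComposed() {
        List<SheetSketch> result = new ArrayList<>();
        collectIncludes(this, result);
        return result;
    }

    private static void collectIncludes(SheetSketch sheet, List<SheetSketch> out) {
        for (SheetSketch included : sheet.getIncludes()) {
            out.add(included);
            collectIncludes(included, out);
        }
    }
}

/** Hypothetical root: additionally exposes a global list of all imported sheets. */
class RootSheetSketch extends ComposedSheetSketch {
    RootSheetSketch(String href) {
        super(href);
    }

    /** Every imported composed sheet, direct or indirect, not the root itself. */
    List<ComposedSheetSketch> getGlobalImports() {
        List<ComposedSheetSketch> result = new ArrayList<>();
        collectImports(this, result);
        return result;
    }

    private static void collectImports(ComposedSheetSketch sheet, List<ComposedSheetSketch> out) {
        for (ComposedSheetSketch imported : sheet.getImports()) {
            out.add(imported);
            collectImports(imported, out);
        }
    }
}
```
]]></source>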
<p>Each stylesheet has a set of properties, which can be set by various
means, usually either via an attribute on xsl:stylesheet, or
via a top-level
xsl instruction (for instance, xsl:attribute-set). The get
methods for these
properties only access the declaration within the given
<code>Stylesheet</code>
object, and never take into account included or imported
stylesheets. The
<code>StylesheetComposed</code> derivative object, if it is a
root
<code>Stylesheet</code> or imported <code>Stylesheet</code>,
has "composed"
getter methods that do take into account imported and included
stylesheets, for
some of these properties. The table of Stylesheet properties,
with composed
methods, is as follows. Note that the property names follow
a formula for translating the XSL names to the Java get/set
method names.</p>
<table>
<tr>
<th>Property</th>
<th>Type</th>
<th>XSL Origin</th>
<th>Composed Methods</th>
<th>Note</th>
</tr>
<tr>
<td>XmlnsXsl</td>
<td>String</td>
<td>xmlns:xsl</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>ExtensionElementPrefixes</td>
<td>StringVector</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#extension-element">extension-element-prefixes</jump></code>
attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>ExcludeResultPrefixes</td>
<td>StringVector</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#literal-result-element">exclude-result-prefixes
or xsl:exclude-result-prefixes</jump></code>
attributes</td>
<td>(not sure about this... only from root?)</td>
<td>I think this should be a root method, and a single list
should be
made, like with xsl:output.</td>
</tr>
<tr>
<td>Id</td>
<td>String</td>
<td>The <code><jump
href="http://www.w3.org/TR/xslt#section-Embedding-Stylesheets">id</jump></code>
attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>Version</td>
<td>String</td>
<td>The <code><jump
href="http://www.w3.org/TR/xslt#forwards">version</jump></code> attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>XmlSpace</td>
<td>boolean</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#strip">xml:space</jump></code> attribute</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
<tr>
<td>Import</td>
<td>Vector (list of StylesheetComposed objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#import">xsl:import</jump></code> element</td>
<td>getImportComposed(int i) / getImportCountComposed()</td>
<td>Composed list contains all imported sheets, not the
importing sheet
itself.</td>
</tr>
<tr>
<td>Include</td>
<td>Vector (list of Stylesheet objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#include">xsl:include</jump></code> element</td>
<td>getIncludeComposed(int i) /
getIncludeCountComposed()</td>
<td>Composed list contains all directly or indirectly included
stylesheets.</td>
</tr>
<tr>
<td>DecimalFormat</td>
<td>Stack (list of DecimalFormatProperties objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#format-number">xsl:decimal-format</jump></code>
element</td>
<td>getDecimalFormatComposed(QName name)</td>
<td></td>
</tr>
<tr>
<td>StripSpaces</td>
<td>Stack (list of XPath match pattern objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#strip">xsl:strip-space</jump></code>
element</td>
<td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
sourceTree, Element targetElement)</td>
<td></td>
</tr>
<tr>
<td>PreserveSpaces</td>
<td>Stack (list of XPath match pattern objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#strip">xsl:preserve-space</jump></code>
element</td>
<td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
sourceTree, Element targetElement)</td>
<td></td>
</tr>
<tr>
<td>Output</td>
<td>OutputFormatExtended</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#output">xsl:output</jump></code> element</td>
<td>getOutputComposed() on StylesheetRoot only</td>
<td></td>
</tr>
<tr>
<td>Key</td>
<td>Vector (list of KeyDeclaration objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#key">xsl:key</jump></code> element</td>
<td>getKeysComposed()</td>
<td></td>
</tr>
<tr>
<td>AttributeSet</td>
<td>Vector (list of ElemAttributeSet objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#attribute-sets">xsl:attribute-set</jump></code>
element</td>
<td>On StylesheetRoot only?</td>
<td></td>
</tr>
<tr>
<td>Variable</td>
<td>Vector (list of ElemVariable objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#top-level-variables">xsl:variable</jump></code>
element</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Param</td>
<td>Vector (list of ElemParam objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#top-level-variables">xsl:param</jump></code>
element</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Template</td>
<td>Vector (list of ElemTemplate objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#section-Defining-Template-Rules">xsl:template</jump></code>
element</td>
<td>getTemplateComposed(TransformerImpl transformContext, Node
sourceTree, Node targetNode, QName mode) and
getTemplateComposed(QName
qname)</td>
<td></td>
</tr>
<tr>
<td>NamespaceAlias</td>
<td>Vector (list of ElemTemplate objects)</td>
<td><code><jump
href="http://www.w3.org/TR/xslt#literal-result-element">xsl:namespace-alias</jump></code>
element</td>
<td>On StylesheetRoot only?</td>
<td></td>
</tr>
<tr>
<td>NonXslTopLevel</td>
<td>Hashtable (table of opaque objects keyed by QName)</td>
<td>Any top-level non-xslt element.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>Href</td>
<td>URL</td>
<td>The location of the stylesheet, possibly set by
xsl:include or
xsl:import.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetRoot</td>
<td>StylesheetRoot</td>
<td>The root of the stylesheet tree, for quick access.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetParent</td>
<td>Stylesheet</td>
<td>The importing or including stylesheet.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>StylesheetComposed</td>
<td>StylesheetComposed</td>
<td>The closest importing stylesheet.</td>
<td>none.</td>
<td></td>
</tr>
<tr>
<td>NamespaceDecls</td>
<td>Linked list of NameSpace elements</td>
<td>xmlns:foo attribute map</td>
<td>(none, applies to stylesheet only)</td>
<td></td>
</tr>
</table>
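<p>The exact formula for translating XSL names to property names is
not spelled out in this document. The sketch below is one guess that
reproduces most of the names in the table: drop a leading
<code>xsl:</code> prefix, then capitalize the first letter and every
letter that follows a hyphen or colon. The class
<code>PropertyNameSketch</code> is hypothetical, and the rule does not
produce the plural forms used in the table, such as StripSpaces.</p>

<source><![CDATA[
```java
/** Hypothetical helper; the translation rule is an assumption inferred from the table. */
final class PropertyNameSketch {
    private PropertyNameSketch() {
    }

    /** Translates an XSL element or attribute name into a bean-style property name. */
    static String toPropertyName(String xslName) {
        // Element names lose their "xsl:" prefix; other prefixes (xml:, xmlns:) are kept.
        String name = xslName.startsWith("xsl:") ? xslName.substring(4) : xslName;
        StringBuilder sb = new StringBuilder();
        boolean upperNext = true;
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '-' || c == ':') {
                upperNext = true; // separator is dropped, next letter is capitalized
            } else if (upperNext) {
                sb.append(Character.toUpperCase(c));
                upperNext = false;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns the getter name for the given XSL name. */
    static String getterName(String xslName) {
        return "get" + toPropertyName(xslName);
    }
}
```
]]></source>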
</s2><anchor name="transformer"/>
<s2 title="Transformer Module">
<p><link>Transformer Module</link></p>
<p>The transformer needs to cover the following concepts:</p>
<ol>
<li>A transformation implementation object. This holds state
about the
transformation as a whole.</li>
<li>A source tree context object. This holds information about
the
runtime context of the source tree, such as the xsl:key
indexes.</li>
<li>A DOM helper. This handles details of a given DOM
implementation that the raw DOM API may not
cover.</li>
<li>A counters table (CountersTable). This holds state information
regarding running
source tree counts.</li>
<li>A variable stack. This holds stack frames that track the
current
variables and parameters.</li>
</ol>
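<p>As an illustration of the last concept, a variable stack with stack
frames might look like the sketch below. The class
<code>VariableStackSketch</code>, its method names, and the use of
plain strings instead of QNames are assumptions for the example. The
lookup rule (current frame first, then the global frame) is likewise
a simplification.</p>

<source><![CDATA[
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical variable stack: one frame per template invocation plus a global frame. */
final class VariableStackSketch {
    private final Deque<Map<String, Object>> m_frames = new ArrayDeque<>();

    VariableStackSketch() {
        m_frames.push(new HashMap<>()); // the bottom frame holds top-level variables
    }

    /** Opens a new frame, for instance when a template is entered. */
    void pushFrame() {
        m_frames.push(new HashMap<>());
    }

    /** Closes the current frame, discarding its variables. */
    void popFrame() {
        if (m_frames.size() == 1) {
            throw new IllegalStateException("cannot pop the global frame");
        }
        m_frames.pop();
    }

    /** Declares a variable or parameter in the current frame. */
    void declare(String name, Object value) {
        m_frames.peekFirst().put(name, value);
    }

    /** Looks in the current frame first, then falls back to the global frame. */
    Object lookup(String name) {
        Map<String, Object> current = m_frames.peekFirst();
        if (current.containsKey(name)) {
            return current.get(name);
        }
        Map<String, Object> globals = m_frames.peekLast();
        if (globals.containsKey(name)) {
            return globals.get(name);
        }
        throw new IllegalArgumentException("unknown variable: " + name);
    }
}
```
]]></source>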
</s2><anchor name="xpath"/>
<s2 title="XPath Module">
<p><link>XPath Module</link></p>
<p>(This module for the most part remains the same in 1.1.0)</p>
</s2>
<s2 title="Utils Package">
<p><link>Utils Package</link></p>
</s2>
<s2 title="Other Packages">
<p><link>Other Packages</link></p>
<s3 title="lib">
<p><link>lib</link></p>
</s3>
<s3 title="res">
<p><link>res</link></p>
</s3>
<s3 title="trace">
<p><link>trace</link></p>
</s3>
<s3 title="client">
<p><link>client</link></p>
</s3>
</s2>
<s2 title="Coding Conventions">
<p><link>Coding Conventions</link></p>
<p>This section documents the coding conventions used in the Xalan
source.</p>
<ol>
<li>Class files are arranged with constructors and possibly an
init()
function first, public API methods second, then package-specific,
protected, and
private methods (arranged by related
functionality), and finally member
variables with their getter/setter access methods.</li>
<li>Non-static member variables are prefixed with "m_".</li>
<li>static final member variables should always be upper case,
without
the "m_" prefix. They need not have accessors.</li>
<li>Private member variables that are not accessed outside the
class need
not have getter/setter methods declared.</li>
<li>Private member variables that are accessed outside the
class should
have either package specific or public getter/setter methods
declared. All
accessors should follow the bean design patterns.</li>
<li>Package-scoped member variables, public member variables,
and
protected member variables should not be declared.</li>
</ol>
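<p>A small class laid out according to these conventions might look
like the sketch below. The class <code>ConventionsSketch</code> and its
members are invented purely for illustration.</p>

<source><![CDATA[
```java
/** Hypothetical class, arranged in the order the conventions describe. */
final class ConventionsSketch {

    // 1. Constructors, and possibly an init() function, come first.
    ConventionsSketch(String name) {
        init(name);
    }

    private void init(String name) {
        m_name = name;
    }

    // 2. Public API methods come second.
    public String describe() {
        return PREFIX + normalize(m_name);
    }

    // 3. Package-specific, protected, and private methods follow.
    private static String normalize(String value) {
        return value.trim();
    }

    // 4. Member variables with their getter/setter methods come last.

    /** Static final: upper case, no "m_" prefix, no accessor needed. */
    private static final String PREFIX = "name=";

    /** Non-static member: "m_" prefix, private, reached through bean-style accessors. */
    private String m_name;

    public String getName() {
        return m_name;
    }

    public void setName(String name) {
        m_name = name;
    }
}
```
]]></source>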
</s2>
<s2 title="Open Issues">
<p><link>Open Issues</link></p>
<p>This section documents architectural and design issues that I still
consider to be open or unsolved. (This list is ongoing, and
will change over
time... it's simply a place for me to note problems that are
ongoing and need
to be solved.)</p>
<gloss>
<label>Space stripping</label>
<item>In Xalan 1.0.0, space stripping was clearly a
major
performance issue. This needs to be solved in Xalan 1.1.0 by
stripping the
space nodes as the document is being parsed. This is a major
problem for
DOM trees, though. It can perhaps be solved by preprocessing the
DOM tree and
creating a table of space-stripping parent elements, when the
nodes can't be
pre-stripped.</item>
</gloss>
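<p>One possible shape for the DOM preprocessing idea is sketched below,
using only the standard DOM and JAXP classes. It is an assumption-laden
illustration, not a proposed implementation: the class
<code>SpaceStripTableSketch</code> is hypothetical, elements are matched
by plain name rather than by XPath match pattern, and whitespace is
detected with <code>String.trim()</code>.</p>

<source><![CDATA[
```java
import java.io.StringReader;
import java.util.HashSet;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical preprocessing pass over a DOM tree that cannot be pre-stripped. */
final class SpaceStripTableSketch {
    private SpaceStripTableSketch() {
    }

    /**
     * Returns the elements whose name is in stripNames and that have at least
     * one whitespace-only text child. The transformer could consult this table
     * instead of re-testing every text node.
     */
    static Set<Element> buildTable(Node root, Set<String> stripNames) {
        Set<Element> table = new HashSet<>();
        collect(root, stripNames, table);
        return table;
    }

    private static void collect(Node node, Set<String> stripNames, Set<Element> table) {
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.TEXT_NODE
                    && node.getNodeType() == Node.ELEMENT_NODE
                    && stripNames.contains(node.getNodeName())
                    && child.getNodeValue().trim().isEmpty()) {
                table.add((Element) node);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                collect(child, stripNames, table);
            }
        }
    }

    /** Convenience parser for trying the sketch on a string. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```
]]></source>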
</s2>
</s1>
1.1 xml-xalan/xdocs/sources/design/process.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/xalan1_1x1.gif
<<Binary file>>
1.1 xml-xalan/xdocs/sources/design/xmllogo.gif
<<Binary file>>