sboag       00/07/27 19:08:27

  Added:       xdocs/sources/design conceptual.gif data.gif design2_0_0.xml
                        org_apache.gif trax.gif xpath.gif
  Removed:     xdocs/sources/design design1_1_0.xml
  Log:
  Changed name to design_2_0_0.xml, and did a bunch of work to add XPath, add 
some explanatory diagrams, etc.
  
  Revision  Changes    Path
  1.1                  xml-xalan/xdocs/sources/design/conceptual.gif
  
        <<Binary file>>
  
  
  1.1                  xml-xalan/xdocs/sources/design/data.gif
  
        <<Binary file>>
  
  
  1.1                  xml-xalan/xdocs/sources/design/design2_0_0.xml
  
  Index: design2_0_0.xml
  ===================================================================
  <?xml version="1.0"?>
  <!DOCTYPE s1 SYSTEM 
"file:///C:\x\xml-stylebook\styles\apachexml\dtd\document.dtd">
  <s1 title="Xalan-J 2.0 Design">
    <p><link>Xalan-J 2.0 Design</link><img src="xmllogo.gif" 
alt="xmllogo.gif"/></p>
    <ul> 
         <li>Author: Scott Boag</li> 
         <li>State: In Progress</li>
     <li><jump href="http://xml.apache.org/xalan-j/apidocs/index.html">Xalan-J 
2.0 Javadoc</jump></li>
    </ul>
    <s2 title="Introduction"> 
         <p><link idref="intro">Introduction</link></p> 
         <p>This document presents the basic design for Xalan-J 2.0, which is a
                <jump 
href="http://www.awl.com/cseng/titles/0-201-89542-0/techniques/refactoring.htm">refactoring</jump>
                and redesign of the Xalan-J 1.x processor. The main goals of this redesign are
                to: </p> 
         <ol> 
                <li>Make the design and code more understandable by Open Source
                  people.</li> 
                <li>Reduce code size and complexity.</li>
                <li>By simplifying the code, make optimization easier.</li> 
                <li>Make modules generally more localized, and less tangled 
with other
                  modules.</li> 
                <li>Begin the adoption of the TrAX (Transformations for XML)
                  interfaces.</li> 
         <li>Increase the streamability of transformations.</li></ol> 
         <p>The techniques used toward these goals are to:</p> 
         <ol> 
                <li>In general, flatten the hierarchy of packages, in order to 
make the
                  structure more apparent from the top-level view.</li> 
                <li>Break the construction and the validation of the XSLT 
stylesheet from
                  the stylesheet objects themselves.</li>
                <li>Drive the construction of the stylesheet through a table, 
so that it
                  is less prone to error.</li> 
                <li>Break the transformation process into a separate package, 
away from
                  the stylesheet objects.</li> 
                <li>Create this design document, as a start-point for people 
wanting to
                  approach the code.</li> 
         </ol> 
         <p>The goals are not:</p> 
         <ol> 
                <li>To add more features in the course of this refactoring. 
This is
                  design and code clean-up, to meet the above-named goals. 
However, it is expected that it will be <em>much</em> easier to add
                  features once this work is completed.</li> 
                <li>To optimize code for the sake of optimization. However, it 
is
                  expected that the code will be faster once the work is 
complete.</li> 
         </ol> 
         <p>How well we've achieved the goals will be measured by feedback from 
the
                <link 
anchor="http://xml-archive.webweaving.org/xml-archive-xalan">Xalan-dev</link> 
list, and by software metrics tools.</p> 
         <p>Please note that the diagrams in this design document are meant to 
be
                useful abstractions, and may not always be exact.</p> 
    </s2> 
    <s2 title="Overview of Architecture"> 
         <p><link idref="overview">Overview of Architecture</link></p> 
         <p>Xalan 2.0 is divided into four major modules, and various smaller
                modules. The main modules are:</p> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/processor/package-summary.html">org.apache.xalan.processor</link></code></label>
 
                <item>The module that processes the stylesheet, and provides 
the main
                  entry point into Xalan.</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/package-summary.html">org.apache.xalan.templates</link></code></label>
 
                <item>The module that defines the stylesheet structures, 
including the
                  Stylesheet object, template element instructions, and 
Attribute Value
                  Templates. </item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/package-summary.html">org.apache.xalan.transformer</link></code></label>
 
                <item>The module that applies the source tree to the Templates, 
and
                  produces a result tree.</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xpath/package-summary.html">org.apache.xpath</link></code></label>
 
                <item>The module that processes both XPath expressions, and 
XSLT Match
                  patterns.</item> 
         </gloss> 
         <p>In addition to the above modules, Xalan implements the
                <link anchor="http://trax.openxml.org/">TrAX</link> interfaces, 
and depends on the
         <link 
anchor="http://www.megginson.com/SAX/Java/index.html">SAX2</link> and <link 
anchor="http://www.w3.org/TR/DOM-Level-2/">DOM</link> packages.
  </p><p><img src="trax.gif" alt="trax.gif"/></p><p>There is also a general 
utilities package that contains both XML utility
         classes such as QName, and generally useful classes such as
         StringToIntTable.</p> 
         <p>In the diagram below, the dashed lines denote visibility. All 
packages
                access the SAX2 and DOM packages.</p> 
         <p><img src="xalan1_1x1.gif" alt="xalan1_1x1.gif"/></p> 
         <p>In addition to the above packages, there are the following 
additional
                packages:</p> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/client/package-summary.html">org.apache.xalan.client</link></code></label>
 
                <item>This package has a client applet. I suspect this should 
be moved
                  into the samples directory.</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/extensions/package-summary.html">org.apache.xalan.extensions</link></code></label>
 
                <item>This holds classes belonging to the Xalan extensions 
mechanism,
                  which allows Java code and script to be called from within a 
stylesheet.</item>
                
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/lib/package-summary.html">org.apache.xalan.lib</link></code></label>
 
                <item>This is the built-in Xalan extensions library, which holds
                  extensions such as Redirect (which allows a stylesheet to 
produce multiple
                  output files).</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/res/package-summary.html">org.apache.xalan.res</link></code></label>
 
                <item>This holds resource files needed by Xalan, such as error 
message
                  resources.</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/trace/package-summary.html">org.apache.xalan.trace</link></code></label>
 
                <item>This package contains classes and interfaces that allow a 
caller to
                  add trace listeners to the transformation, allowing an 
interface to XSLT
                  debuggers and similar tools.</item> 
         </gloss> 
         <gloss> 
                <label><code><link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/xslt/package-summary.html">org.apache.xalan.xslt</link></code></label>
 
                <item>This package is for backwards compatibility with 
applications that
                  depend on Xalan 1.x interfaces.</item> 
         </gloss> 
    <p>A more conceptual view of this architecture is as follows:</p><p><img 
src="conceptual.gif" alt="Picture of conceptual 
architecture."/></p></s2><anchor name="process"/> 
    <s2 title="Process Module"> 
         <p><link idref="process">Process Module</link></p> 
         <p>The <code>org.apache.xalan.processor</code> module implements the
                <code>org.apache.xalan.trax.Processor</code> interface, which 
provides a
                factory method for creating a concrete Processor instance, and 
provides methods
                for creating a <code>org.apache.xalan.trax.Templates</code> 
instance, which, in
                Xalan and XSLT terms, is the Stylesheet. Thus the task of the 
process module is
                to read the XSLT input in the form of a file, stream, SAX 
events, or a DOM
                tree, and produce a Templates/Stylesheet object.</p> 
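         <p>The stylesheet-to-Templates step can be sketched as follows. This is
         a minimal illustration only: it is written against the
         <code>javax.xml.transform</code> API, the later standardized form of
         TrAX, rather than the <code>org.apache.xalan.trax</code> interfaces
         named above, on the assumption that this is what a stock JDK provides.
         The class name <code>TemplatesSketch</code> and the sample stylesheet
         are hypothetical.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: compile a stylesheet once into a Templates object,
// then create a Transformer from it for each transformation.
final class TemplatesSketch {
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>Hello, <xsl:value-of select='/greeting/@to'/>!</xsl:template>"
        + "</xsl:stylesheet>";

    // Reads the XSLT input (here, from a character stream) and produces
    // the reusable compiled stylesheet.
    static Templates compile(String xsl) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // Applies a source document to the compiled stylesheet.
    static String transform(Templates templates, String xml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        Templates templates = compile(XSL);
        System.out.println(transform(templates, "<greeting to='world'/>"));
    }
}
```
]]></source>
         <p>Separating <code>compile</code> from <code>transform</code> mirrors
         the split described above: the Templates object is built once and can
         then be applied to many source documents.</p>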
         <p>The overall strategy is to define a schema that dictates the legal
                structure for XSLT elements and attributes, and to associate 
with those
                elements construction-time processors that can fill in the 
appropriate fields
                in the top-level Stylesheet object, and also associate classes 
in the templates
                module that can be created in a generalized fashion. This makes 
the validation and the
                object-to-class associations centralized and declarative.</p> 
         <p>The schema's root class is
                <code>org.apache.xalan.processor.XSLTSchema</code>, and it is 
here that the
                XSLT schema structure is defined. XSLTSchema uses
                <code>org.apache.xalan.processor.XSLTElementDef</code> to 
define elements, and
                <code>org.apache.xalan.processor.XSLTAttributeDef</code> to 
define attributes.
                Both classes hold the allowed namespace, local name, and type 
of element or
                attribute. The XSLTElementDef also holds a reference to a
                <code>org.apache.xalan.processor.XSLTElementProcessor</code>, 
and sometimes a
                <code>Class</code> object, with which it can create objects 
that derive from
                <code>org.apache.xalan.templates.ElemTemplateElement</code>. In 
addition, the
                XSLTElementDef instance holds a list of XSLTElementDef 
instances that define
                legal elements or character events that are allowed as children 
of the given
                element.</p> 
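         <p>The idea of a declarative element table can be sketched as below.
         This is not the actual <code>XSLTSchema</code> code: the
         <code>ElementDef</code> class, its methods, and the three-element
         schema are hypothetical stand-ins that only show how each definition
         can carry its own list of legal children.</p>
<source><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a table-driven schema; names are illustrative only.
final class ElementDefSketch {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    // Stand-in for an element definition: a qualified name plus the
    // definitions of the elements allowed as its children.
    static final class ElementDef {
        final String namespace;
        final String localName;
        private final Map<String, ElementDef> children = new LinkedHashMap<>();

        ElementDef(String namespace, String localName) {
            this.namespace = namespace;
            this.localName = localName;
        }

        // Declares that the given element is legal as a child of this one.
        ElementDef allow(ElementDef child) {
            children.put(key(child.namespace, child.localName), child);
            return this;
        }

        // Returns the definition for a child, or null if it is not allowed here.
        ElementDef childFor(String namespace, String localName) {
            return children.get(key(namespace, localName));
        }

        private static String key(String namespace, String localName) {
            return "{" + namespace + "}" + localName;
        }
    }

    // Builds a tiny schema: stylesheet -> template -> value-of.
    static ElementDef buildSchema() {
        ElementDef valueOf = new ElementDef(XSL_NS, "value-of");
        ElementDef template = new ElementDef(XSL_NS, "template").allow(valueOf);
        return new ElementDef(XSL_NS, "stylesheet").allow(template);
    }

    // Walks the table from the root, checking each nesting step.
    static boolean isLegalPath(ElementDef root, String... localNames) {
        ElementDef current = root;
        for (String name : localNames) {
            current = current.childFor(XSL_NS, name);
            if (current == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ElementDef schema = buildSchema();
        System.out.println(isLegalPath(schema, "template", "value-of"));
        System.out.println(isLegalPath(schema, "value-of"));
    }
}
```
]]></source>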
         <p>The implementation of the 
<code>org.apache.xalan.trax.Processor</code>
                interface is in 
<code>org.apache.xalan.processor.StylesheetProcessor</code>,
                which creates a 
<code>org.apache.xalan.processor.StylesheetHandler</code>
                instance. This instance acts as the ContentHandler for the 
parse events, and is
                handed to the <code>org.xml.sax.XMLReader</code>, which the 
StylesheetProcessor
                uses to parse the XSLT document. The StylesheetHandler then 
receives the parse
                events, maintains the state of the construction, and 
passes the events on
                to the appropriate XSLTElementProcessor for the given event, as 
dictated by the
                XSLTElementDef that is associated with the given event.</p> 
         <p><img src="process.gif" alt="process.gif"/></p> 
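         <p>The dispatch from a SAX content handler to per-element processors
         might look roughly like this. It is a sketch under assumptions, not
         the real <code>StylesheetHandler</code>: the
         <code>ElementProcessor</code> interface, the
         <code>DispatchingHandler</code> class, and the lookup table keyed by
         local name are all hypothetical, and only the standard SAX2 and JAXP
         parser classes are real.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: a ContentHandler that forwards each XSLT element
// event to the processor registered for that element.
final class HandlerDispatchSketch {
    static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

    // Stand-in for a construction-time element processor.
    interface ElementProcessor {
        void process(String localName, Attributes atts, List<String> out);
    }

    static final class DispatchingHandler extends DefaultHandler {
        private final Map<String, ElementProcessor> table = new HashMap<>();
        final List<String> log = new ArrayList<>();

        DispatchingHandler() {
            table.put("stylesheet", (name, atts, out) ->
                out.add("stylesheet version=" + atts.getValue("", "version")));
            table.put("template", (name, atts, out) ->
                out.add("template match=" + atts.getValue("", "match")));
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (!XSL_NS.equals(uri)) {
                return; // this sketch ignores non-XSLT elements
            }
            ElementProcessor processor = table.get(localName);
            if (processor == null) {
                log.add("unknown xsl element: " + localName);
            } else {
                processor.process(localName, atts, log);
            }
        }
    }

    // Hands the handler to an XMLReader and parses the stylesheet text.
    static List<String> parse(String xsl) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        DispatchingHandler handler = new DispatchingHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xsl)));
        return handler.log;
    }

    public static void main(String[] args) throws Exception {
        String xsl = "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
            + "<xsl:template match='/'/></xsl:stylesheet>";
        System.out.println(parse(xsl));
    }
}
```
]]></source>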
    </s2><anchor name="templates"/> 
    <s2 title="Templates Module"> 
         <p><link idref="templates">Templates Module</link></p> 
         <p>The <code>org.apache.xalan.templates</code> module implements the
                <code>org.apache.xalan.trax.Templates</code> interface, and 
defines a set of
                classes that represent a Stylesheet. The primary purpose of 
this module is to
                hold stylesheet data, not to perform procedural tasks 
associated with the
                construction of the data, nor tasks associated with the 
transformation itself.
                </p> 
         <p>A <code>StylesheetRoot</code>, which implements the
                <code>Templates</code> interface, is a type of 
<code>StylesheetComposed</code>,
                which is a <code>Stylesheet</code> composed of itself and all 
included
                <code>Stylesheet</code> objects. A <code>StylesheetRoot</code> 
has a global
                imports list, which is a list of all imported 
<code>StylesheetComposed</code>
                instances. From each <code>StylesheetComposed</code> object, 
one can iterate
                through the list of directly or indirectly included 
<code>Stylesheet</code>
                objects, and one can also iterate through the list of all
                <code>StylesheetComposed</code> objects of lesser import 
precedence.
                <code>StylesheetRoot</code> is a 
<code>StylesheetComposed</code>, which is a
                <code>Stylesheet</code>.</p> 
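         <p>The relationship between the three classes can be sketched as a
         small hierarchy. The field and method names below
         (<code>includes</code>, <code>imports</code>,
         <code>includesComposed</code>, <code>globalImports</code>) are
         hypothetical and chosen for illustration; the lists are returned in
         simple discovery order, which is an assumption and is not meant to
         reproduce XSLT import precedence.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Stylesheet / StylesheetComposed / StylesheetRoot
// hierarchy; member names are illustrative, not the Xalan API.
final class StylesheetHierarchySketch {
    static class Stylesheet {
        final String href;
        final List<Stylesheet> includes = new ArrayList<>();
        final List<StylesheetComposed> imports = new ArrayList<>();

        Stylesheet(String href) {
            this.href = href;
        }
    }

    // A stylesheet viewed together with everything it includes.
    static class StylesheetComposed extends Stylesheet {
        StylesheetComposed(String href) {
            super(href);
        }

        // All stylesheets included directly or indirectly, depth first.
        List<Stylesheet> includesComposed() {
            List<Stylesheet> out = new ArrayList<>();
            collectIncludes(this, out);
            return out;
        }

        private static void collectIncludes(Stylesheet sheet, List<Stylesheet> out) {
            for (Stylesheet included : sheet.includes) {
                out.add(included);
                collectIncludes(included, out);
            }
        }
    }

    // The root additionally exposes every imported StylesheetComposed.
    static final class StylesheetRoot extends StylesheetComposed {
        StylesheetRoot(String href) {
            super(href);
        }

        List<StylesheetComposed> globalImports() {
            List<StylesheetComposed> out = new ArrayList<>();
            collectImports(this, out);
            return out;
        }

        private static void collectImports(Stylesheet sheet, List<StylesheetComposed> out) {
            for (StylesheetComposed imported : sheet.imports) {
                out.add(imported);
                collectImports(imported, out);
            }
            for (Stylesheet included : sheet.includes) {
                collectImports(included, out);
            }
        }
    }

    // main.xsl includes a.xsl (which includes b.xsl) and imports c.xsl.
    static StylesheetRoot demo() {
        StylesheetRoot root = new StylesheetRoot("main.xsl");
        Stylesheet a = new Stylesheet("a.xsl");
        a.includes.add(new Stylesheet("b.xsl"));
        root.includes.add(a);
        root.imports.add(new StylesheetComposed("c.xsl"));
        return root;
    }

    static List<String> hrefs(List<? extends Stylesheet> sheets) {
        List<String> out = new ArrayList<>();
        for (Stylesheet sheet : sheets) {
            out.add(sheet.href);
        }
        return out;
    }

    public static void main(String[] args) {
        StylesheetRoot root = demo();
        System.out.println(hrefs(root.includesComposed()));
        System.out.println(hrefs(root.globalImports()));
    }
}
```
]]></source>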
         <p>Each stylesheet has a set of properties, which can be set by various
                means, usually either via an attribute on xsl:stylesheet, or 
via a top-level
                xsl instruction (for instance, xsl:attribute-set). The get 
methods for these
                properties only access the declaration within the given 
<code>Stylesheet</code>
                object, and never take into account included or imported 
stylesheets. The
                <code>StylesheetComposed</code> derivative object, if it is a 
root
                <code>Stylesheet</code> or imported <code>Stylesheet</code>, 
has "composed"
                getter methods that do take into account imported and included 
stylesheets, for
                some of these properties. The table of Stylesheet properties, 
with composed
                methods, is as follows. Note that the names of the attributes 
are according to
                a formula for translating the xsl names to the Java get/set 
method names.</p> 
         <table> 
                <tr> 
                  <th>Property</th> 
                  <th>Type</th> 
                  <th>XSL Origin</th> 
                  <th>Composed Methods</th> 
                  <th>Note</th> 
                </tr> 
                <tr> 
                  <td>XmlnsXsl</td> 
                  <td>String</td> 
                  <td>xmlns:xsl</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>ExtensionElementPrefixes</td> 
                  <td>StringVector</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#extension-element">extension-element-prefixes</jump></code>
                         attribute</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>ExcludeResultPrefixes</td> 
                  <td>StringVector</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#literal-result-element">exclude-result-prefixes
                         or xsl:exclude-result-prefixes</jump></code> 
attributes</td> 
                  <td>(not sure about this... only from root?)</td> 
                  <td>I think this should be a root method, and a single list 
should be
                         made, like with xsl:output.</td> 
                </tr> 
                <tr> 
                  <td>Id</td> 
                  <td>String</td> 
                  <td>The <code><jump 
href="http://www.w3.org/TR/xslt#section-Embedding-Stylesheets">id</jump></code>
                         attribute</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Version</td> 
                  <td>String</td> 
                  <td>The <code><jump 
href="http://www.w3.org/TR/xslt#forwards">version</jump></code> attribute</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>XmlSpace</td> 
                  <td>boolean</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#strip">xml:space</jump></code> attribute</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Import</td> 
                  <td>Vector (list of StylesheetComposed objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#import">xsl:import</jump></code> element</td> 
                  <td>getImportComposed(int i) / getImportCountComposed()</td> 
                  <td>Composed list contains all imported sheets, not the 
importing sheet
                         itself.</td> 
                </tr> 
                <tr> 
                  <td>Include</td> 
                  <td>Vector (list of Stylesheet objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#include">xsl:include</jump></code> element</td>
                  
                  <td>getIncludeComposed(int i) / 
getIncludeCountComposed()</td> 
                  <td>Composed list contains all directly or indirectly included
                         stylesheets.</td> 
                </tr> 
                <tr> 
                  <td>DecimalFormat</td> 
                  <td>Stack (list of DecimalFormatProperties objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#format-number">xsl:decimal-format</jump></code>
                         element</td> 
                  <td>getDecimalFormatComposed(QName name)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>StripSpaces</td> 
                  <td>Stack (list of XPath match pattern objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#strip">xsl:strip-space</jump></code>
                         element</td> 
                  <td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
                         sourceTree, Element targetElement)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>PreserveSpaces</td> 
                  <td>Stack (list of XPath match pattern objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#strip">xsl:preserve-space</jump></code>
                         element</td> 
                  <td>getWhiteSpaceInfo(TransformerImpl transformContext, Node
                         sourceTree, Element targetElement)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Output</td> 
                  <td>OutputFormatExtended</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#output">xsl:output</jump></code> element</td> 
                  <td>getOutputComposed() on StylesheetRoot only</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Key</td> 
                  <td>Vector (list of KeyDeclaration objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#key">xsl:key</jump></code> element</td> 
                  <td>getKeysComposed()</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>AttributeSet</td> 
                  <td>Vector (list of ElemAttributeSet objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#attribute-sets">xsl:attribute-set</jump></code>
                         element</td> 
                  <td>On StylesheetRoot only?</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Variable</td> 
                  <td>Vector (list of ElemVariable objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#top-level-variables">xsl:variable</jump></code>
                         element</td> 
                  <td></td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Param</td> 
                  <td>Vector (list of ElemParam objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#top-level-variables">xsl:param</jump></code>
                         element</td> 
                  <td></td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Template</td> 
                  <td>Vector (list of ElemTemplate objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#section-Defining-Template-Rules">xsl:template</jump></code>
                         element</td> 
                  <td>getTemplateComposed(TransformerImpl transformContext, Node
                         sourceTree, Node targetNode, QName mode) and 
getTemplateComposed(QName
                         qname)</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>NamespaceAlias</td> 
                  <td>Vector (list of ElemTemplate objects)</td> 
                  <td><code><jump 
href="http://www.w3.org/TR/xslt#literal-result-element">xsl:namespace-alias</jump></code>
                         element</td> 
                  <td>On StylesheetRoot only?</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>NonXslTopLevel</td> 
                  <td>Hashtable (table of opaque objects keyed by QName)</td> 
                  <td>Any top-level non-xslt element.</td> 
                  <td>none.</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>Href</td> 
                  <td>URL</td> 
                  <td>The location of the stylesheet, possibly set by 
xsl:include or
                         xsl:import.</td> 
                  <td>none.</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>StylesheetRoot</td> 
                  <td>StylesheetRoot</td> 
                  <td>The root of the stylesheet tree, for quick access.</td> 
                  <td>none.</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>StylesheetParent</td> 
                  <td>Stylesheet</td> 
                  <td>The importing or including stylesheet.</td> 
                  <td>none.</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>StylesheetComposed</td> 
                  <td>StylesheetComposed</td> 
                  <td>The closest importing stylesheet.</td> 
                  <td>none.</td> 
                  <td></td> 
                </tr> 
                <tr> 
                  <td>NamespaceDecls</td> 
                  <td>Linked list of NameSpace elements</td> 
                  <td>xmlns:foo attribute map</td> 
                  <td>(none, applies to stylesheet only)</td> 
                  <td></td> 
                </tr> 
         </table> 
    </s2><anchor name="transformer"/> 
    <s2 title="Transformer Module"> 
         <p><link idref="transformer">Transformer Module</link></p> 
         <p>The <link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/package-summary.html">Transformer</link>
 module is in charge of run-time transformations.  The <link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/TransformerImpl.html">TransformerImpl</link>
 object, which implements the TrAX <link 
anchor="http://trax.openxml.org/javadoc/trax/Transformer.html">Transformer</link>
 interface and has an association with a <link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/StylesheetRoot.html">StylesheetRoot</link>
 object, begins the processing of the source tree (or provides a <link 
anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html">ContentHandler</link>
 reference), and performs the transformation.  The Transformer package does as 
much of the transformation as it can, but element-level operations are 
generally performed in the <link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/templates/ElemTemplateElement.html#execute(org.apache.xalan.transformer.TransformerImpl,
 org.w3c.dom.Node, 
org.apache.xalan.utils.QName)">ElemTemplateElement.execute(...)</link> 
methods.</p><p>Result Tree events are fed into a <link 
anchor="http://xml.apache.org/xalan-j/apidocs/org/apache/xalan/transformer/ResultTreeHandler.html">ResultTreeHandler</link>
 object, which acts as a layer between the direct calls to the result 
  tree content handler (often a Serializer), and the Transformer.  For one 
thing, 
   we have to delay the call to
   startElement(name, atts) because of the
   xsl:attribute and xsl:copy calls.  In other words,
   the attributes have to be fully collected before you
   can call startElement.</p><p>Other important classes in this package 
are:</p><gloss><label>CountersTable and Counter</label><item>The Counter class 
does incremental counting for support of xsl:number.
   This class stores a cache of counted nodes (m_countNodes). 
    It tries to cache the counted nodes in document order... 
   the node count is based on its position in the cache list.  The 
CountersTable class is a table of counters, keyed by ElemNumber objects, each 
   of which has a list of Counter 
objects.</item></gloss><gloss><label>KeyIterator, KeyManager, and 
KeyTable</label><item>These classes handle mapping of keys declared with the 
xsl:key element.</item></gloss><gloss><label>TransformState</label><item>This 
interface is meant to be used by a consumer of SAX2 events produced by Xalan, 
and enables the consumer 
   to get information about the state of the transform.  It 
   is primarily intended as a tooling interface.</item></gloss><p>Even though 
the following modules are defined in the org.apache.xalan package, instead of 
the transformer package, they are defined in this section as they are mostly 
related to runtime transformation.</p> 
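         <p>The delayed <code>startElement</code> call described for the
         ResultTreeHandler earlier in this section can be sketched as follows.
         This is an assumption-laden illustration, not the ResultTreeHandler
         implementation: the <code>DeferringHandler</code> class and its
         methods are hypothetical, and namespaces are ignored for brevity. The
         start tag is held back until the next non-attribute event, so that
         attributes added after it are still included.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: buffer a start tag until its attributes are complete.
final class DeferredStartElementSketch {
    static final class DeferringHandler {
        private final ContentHandler target;
        private String pendingName;          // element whose start tag is held back
        private AttributesImpl pendingAtts;  // attributes collected so far

        DeferringHandler(ContentHandler target) {
            this.target = target;
        }

        void startElement(String name) throws SAXException {
            flushPending();
            pendingName = name;
            pendingAtts = new AttributesImpl();
        }

        // Only legal while a start tag is still pending.
        void addAttribute(String name, String value) {
            if (pendingName == null) {
                throw new IllegalStateException("no pending start tag for attribute " + name);
            }
            pendingAtts.addAttribute("", name, name, "CDATA", value);
        }

        void characters(String text) throws SAXException {
            flushPending();
            target.characters(text.toCharArray(), 0, text.length());
        }

        void endElement(String name) throws SAXException {
            flushPending();
            target.endElement("", name, name);
        }

        // Emits the held-back startElement, now with all of its attributes.
        private void flushPending() throws SAXException {
            if (pendingName != null) {
                target.startElement("", pendingName, pendingName, pendingAtts);
                pendingName = null;
                pendingAtts = null;
            }
        }
    }

    // Records the events that actually reach the downstream handler.
    static List<String> demo() throws SAXException {
        final List<String> events = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("start " + qName + " atts=" + atts.getLength());
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                events.add("text " + new String(ch, start, length));
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("end " + qName);
            }
        };
        DeferringHandler handler = new DeferringHandler(recorder);
        handler.startElement("a");
        handler.addAttribute("href", "x");
        handler.addAttribute("id", "y");
        handler.characters("hi");
        handler.endElement("a");
        return events;
    }

    public static void main(String[] args) throws SAXException {
        System.out.println(demo());
    }
}
```
]]></source>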
    <s3 title="Stree Module"><p><link idref="stree">Stree Module [And 
discussions about streaming]</link></p><p>The Stree module implements the 
default <link anchor="http://www.w3.org/TR/xpath#data-model">Source Tree 
</link> for Xalan, that is, the tree to be transformed.  It implements read-only <link 
anchor="http://www.w3.org/TR/DOM-Level-2/">DOM2</link> interfaces, and provides 
some information needed for fast transforms, such as document order indexes.  
It also attempts to allow a streaming transform by launching the transform on a 
secondary thread as soon as the SAX2 <link 
anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html#startDocument()">startDocument</link>
 event has occurred.  When the transform requests a node, and the node is not 
present, the getFirstChild and getNextSibling methods will wait until the child 
node has arrived, or an <link 
anchor="http://www.megginson.com/SAX/Java/javadoc/org/xml/sax/ContentHandler.html#endElement(java.lang.String,%20java.lang.String,%20java.lang.String)">endElement</link> event has occurred.</p><p>Note that the secondary thread 
is an issue.  It would be better to do the same thing as described above on a 
single thread, but using the parser in 'pull' mode, or simply with a parseNext 
method so the parse would occur in blocks.</p><p>This kind of streaming is not 
perfect because it still requires an entire source tree to be concretely built. 
 There have been a lot of good discussions on the xalan-dev list about how to 
do static analysis of a stylesheet, and be able to allocate only the nodes 
needed by the transform, while they are needed (or not allocate source objects 
at all).</p><p>Vincent-Olivier Arsenault has proposed 
the following design:</p><p>By looking at the stylesheet you know how 
streamable it is (of course this
  needs strict adherence to the XSLT recommendation). Since there's a root
  template and no &lt;xsl:apply-templates/&gt; you can build your context list
  containing only absolute XPaths (which means nodes get out of context
  faster).</p>
  
  <p>The paths of the relevant nodes, for this stylesheet, are (ok this is an
  example, so I may be missing some):</p>
  <ol>
  <li>path: "/address" context: "address" (at &lt;/address&gt;, you get rid of 
the
  whole "person/address" stuff);</li>
  
  <li>path: "/adn" context: "adn";</li>
  
  <li>path: "/medicalrecord" context: "/" (for possibly repetitive nodes, the
  context is always the parent node).</li>
  </ol>
  <p>And all the rest goes to trash!!!!</p>
  
  <p>Let me refine:</p>
  
  <p>You analyze the whole stylesheet like that (would be good if optimization
  and XPath list could be done simultaneously) and you end up with a list of
  expanded paths mapped to all the templates.</p>
  
  <p>An entry in the list (I would call this list the transformation stack) 
would
  consist of 4 things:</p>
  <ol>
  <li>the relevance context xpath (on which the input nodes will be tested for
  pertinence: do we keep it or not);</li>
  
  <li>the transformation rule to apply to the matching nodes (this can just be a
  forwarder to another template transformation stack);</li>
  
  <li>a result buffer (in which the nodes that can't be streamed are temporarily
  stored);</li>
  
  <li>the streaming context xpath (triggers streaming of the buffer to the
  output).</li>
  </ol>
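  <p>The four-part entry proposed above could be modelled roughly as follows.
  This is a speculative sketch, not part of Xalan: the <code>Entry</code>
  class, its field names, and the use of plain string prefix matching in place
  of real XPath evaluation are all assumptions made to keep the example small.</p>
<source><![CDATA[
```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of one "transformation stack" entry from the proposal.
final class TransformationStackSketch {
    static final class Entry {
        final String relevancePath;            // 1. decides whether an input node is kept
        final Function<String, String> rule;   // 2. transformation applied to matching nodes
        final List<String> buffer = new ArrayList<>(); // 3. results that cannot be streamed yet
        final String streamingPath;            // 4. closing this path flushes the buffer

        Entry(String relevancePath, Function<String, String> rule, String streamingPath) {
            this.relevancePath = relevancePath;
            this.rule = rule;
            this.streamingPath = streamingPath;
        }

        // Buffers the transformed node if it is relevant; returns whether it was kept.
        // Prefix matching is a crude stand-in for testing against an XPath.
        boolean offer(String nodePath, String nodeText) {
            if (!nodePath.startsWith(relevancePath)) {
                return false;
            }
            buffer.add(rule.apply(nodeText));
            return true;
        }

        // Called when a path goes out of context (for example at an end tag).
        // Returns the buffered results if this path is the streaming trigger.
        List<String> close(String closedPath) {
            List<String> out = new ArrayList<>();
            if (closedPath.equals(streamingPath)) {
                out.addAll(buffer);
                buffer.clear();
            }
            return out;
        }
    }

    public static void main(String[] args) {
        Entry entry = new Entry("/address", String::toUpperCase, "/address");
        System.out.println(entry.offer("/address/street", "main st"));
        System.out.println(entry.offer("/adn", "ignored"));
        System.out.println(entry.close("/address"));
    }
}
```
]]></source>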
  </s3><s3 title="Extensions Module"><p><link idref="extensions">Extensions 
Module</link></p><p>This package contains an implementation of the Xalan extension 
mechanism, which uses the <link 
anchor="http://oss.software.ibm.com/developerworks/opensource/bsf/">Bean 
Scripting Framework</link>.
  The Bean Scripting Framework (BSF) is an architecture for incorporating 
scripting into Java applications and applets.  Scripting languages such as 
Netscape Rhino (Javascript), VBScript, Perl, Tcl, Python, NetRexx and Rexx can 
be used to augment XSLT's functionality.  In addition, the Xalan extension 
mechanism allows use of Java classes.  See the <link 
anchor="http://xml.apache.org/xalan/extensions.html">XalanJ 1 extension 
documentation</link> for a description of using extensions in a stylesheet. 
Please note that the W3C XSL Working Group is working on a specification for 
standard extension bindings, and this module will change to follow that 
specification.  </p><p>[More needed... -sb]</p></s3></s2><anchor name="xpath"/> 
    <s2 title="XPath Module"> 
         <p><link idref="xpath">XPath Module</link></p> 
         <p>This module is pulled out of the Xalan package, and put in the 
org.apache package, to emphasize that the intention is that this package can be 
used independently of the XSLT engine, even though it has dependencies on the 
Xalan utils module.</p><p><img src="org_apache.gif" alt="xalan ---> 
xpath"/></p> 
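    <p>As an illustration of XPath being used on its own, without an XSLT
    stylesheet, the sketch below compiles an expression and then evaluates it
    against a document. It relies on the JDK's <code>javax.xml.xpath</code>
    API, which is an assumption made so that the example is self-contained; it
    does not use the <code>org.apache.xpath</code> classes of this module, and
    the class name <code>XPathStandaloneSketch</code> is hypothetical.</p>
<source><![CDATA[
```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch: compile an XPath string once, then execute it.
final class XPathStandaloneSketch {
    static String evaluate(String expression, String xml) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Compilation step: the expression string becomes a reusable object.
        XPathExpression compiled = xpath.compile(expression);
        // Execution step: evaluate against a source and return the string value.
        return compiled.evaluate(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        String xml = "<doc><item>a</item><item>b</item></doc>";
        System.out.println(evaluate("/doc/item[2]", xml));
    }
}
```
]]></source>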
    <p>The XPath module first compiles XPath strings into expression trees, 
and then executes these expressions via a call to the XPath execute(...) 
function.</p>
    <p>Major classes are:</p>
    <gloss><label>XPath</label><item>Represents a compiled XPath.  The major 
function is <code>XObject execute(XPathContext xctxt, Node contextNode, 
PrefixResolver namespaceContext)</code>.</item></gloss>
    <gloss><label>XPathAPI</label><item>The methods in this class are 
convenience methods into the low-level XPath API.</item></gloss>
    <gloss><label>XPathContext</label><item>Used as the runtime execution 
context for XPath.</item></gloss>
    <gloss><label>DOMHelper</label><item>Used as a helper for handling DOM 
issues.  May be subclassed to take advantage of specific DOM 
implementations.</item></gloss>
    <gloss><label>SourceTreeManager</label><item>Bottlenecks all management 
of source trees.  The methods in this class should allow easy garbage 
collection of source trees, and should centralize parsing for those source 
trees.</item></gloss>
    <gloss><label>Expression</label><item>The base class of all expression 
objects, allowing polymorphic behaviors.</item></gloss>
    <p>The general architecture of the XPath module is divided into the 
compiler and categories of expression objects.</p>
    <p><img src="xpath.gif" alt="xpath modules"/></p>
    <p>The most important module is the axes module.  This module implements 
the DOM2 <link 
anchor="http://www.w3.org/TR/DOM-Level-2/traversal.html#Iterator-overview">NodeIterator</link> 
interface, and is meant to allow XPath clients to either override the default 
behavior or replace it.</p>
    <p>The LocPathIterator and UnionPathIterator classes implement the <link 
anchor="http://www.w3.org/TR/DOM-Level-2/java-binding.html#org.w3c.dom.traversal.NodeIterator">NodeIterator</link> 
interface, and polymorphically use AxesWalker-derived objects to execute each 
step in the path.  The whole trick is to execute the LocationPath in 
depth-first document order so that nodes can be found without necessarily 
looking ahead or performing a breadth-first search.</p><s3 title="XPath Database 
Connection"><p><link idref="xpath-database">XPath Direct Database 
Connections</link></p><p>An important part of the XPath design, in both Xalan 1 
and Xalan 2, is to enable database connections to be used as drivers directly 
for the XPath <link 
anchor="http://www.w3.org/TR/xpath#location-paths">LocationPath</link> 
handling.  This allows databases to be connected directly to the transform and 
to take advantage of internal indexing and the like.  In Xalan 1 
this was done via the <link 
anchor="http://xml.apache.org/xalan/apidocs/org/apache/xalan/xpath/XLocator.html">XLocator</link> 
interface; in Xalan 2 that interface is no longer used and has been replaced 
by the DOM2 <link 
anchor="http://www.w3.org/TR/DOM-Level-2/traversal.html#Iterator-overview">NodeIterator</link> 
interface.  An application or extension should be able to install its own 
NodeIterator for a given document.</p><p><img src="data.gif" alt="data.gif"/></p><p>[More to 
do]</p></s3></s2> 
    <s2 title="Utils Package"> 
         <p><link idref="utils">Utils Package</link></p> 
    <p>This package contains general utilities for use by both the xalan and 
xpath packages.  The intention is that many of these utility classes (or 
their equivalents) will eventually be brought into the org.apache.xml package 
for general use.  The major utilities are as follows:</p>
    <gloss><label>AttList</label><item>Wraps a DOM attribute list in a 
SAX Attributes.</item></gloss>
    <gloss><label>BoolStack, IntStack, IntVector, etc.</label><item>Simple 
stacks and vectors for primitive values.</item></gloss>
    <gloss><label>DefaultErrorHandler</label><item>Implements a SAX error 
handler for default reporting.</item></gloss>
    <gloss><label>DOMBuilder</label><item>Takes SAX events (in addition to 
some extra events that SAX doesn't handle yet) and adds the result to a 
document or document fragment.</item></gloss>
    <gloss><label>Heap</label><item>Classic heap implementation.</item></gloss>
    <gloss><label>MutableAttrListImpl</label><item>Mutable version of 
AttributesImpl.</item></gloss>
    <gloss><label>NameSpace</label><item>A representation of a 
namespace.</item></gloss>
    <gloss><label>NodeVector</label><item>A very simple table that stores a 
list of Nodes.</item></gloss>
    <gloss><label>ObjectPool</label><item>Used for reuse of 
objects.</item></gloss>
    <gloss><label>PrefixResolver</label><item>The class that implements this 
interface can resolve prefixes to namespaces.</item></gloss>
    <gloss><label>PrefixResolverDefault</label><item>This class implements a 
generic PrefixResolver for a DOM, which can be used to perform 
prefix-to-namespace lookup for an XPath.</item></gloss>
    <gloss><label>QName</label><item>Class to represent a qualified XML 
name.</item></gloss>
    <gloss><label>StringToStringTable</label><item>A very simple lookup table 
that stores a list of strings for lookup.  Used when a hashtable is too much 
overhead.</item></gloss>
    <gloss><label>SystemIDResolver</label><item>Takes a SystemID string and 
tries to turn it into a good absolute URL.</item></gloss>
    <gloss><label>TreeWalker</label><item>Implements a Visitor design pattern, 
doing a pre-order walk of the DOM tree and calling a ContentHandler interface 
as it goes.  Used for DOM-to-SAX conversion.</item></gloss>
    <gloss><label>Trie</label><item>A digital search trie for 7-bit ASCII 
text.</item></gloss>
    <gloss><label>UnImplNode</label><item>To be subclassed by classes that 
wish to act as DOM nodes, without having to implement all the methods.  
Widely used.</item></gloss></s2> 
    <s2 title="Other Packages"> 
         <p><link idref="other">Other Packages</link></p> 
         <gloss><label>client</label><item>Implementation of Xalan Applet 
[should we keep this?].</item></gloss> 
                <gloss><label>dtm</label><item>Implementation of the Document 
Table Model (DTM) [Should we keep this?].</item></gloss> 
                <gloss><label>extensions</label><item>Implementation of Xalan 
Extension Mechanism, which uses the Bean Scripting Framework.</item></gloss> 
                <gloss><label>lib</label><item>Implementation of Xalan-specific 
extensions [I want to add lots more extensions to this 
package!].</item></gloss> 
                <gloss><label>res</label><item>Contains strings that 
require internationalization.</item></gloss></s2> 
    <s2 title="Coding Conventions"> 
         <p><link idref="coding-conventions">Coding Conventions</link></p> 
         <p>This section documents the coding conventions used in the Xalan
                source.</p> 
         <ol> 
                <li>Class files are arranged with constructors and possibly an 
                  init() function first, public API methods second, 
                  package-specific, protected, and private methods following 
                  (arranged based on related functionality), and member 
                  variables with their getter/setter access methods last.</li> 
                <li>Non-static member variables are prefixed with "m_".</li> 
                <li>Static final member variables should always be upper case, 
                  without the "m_" prefix. They need not have accessors.</li> 
                <li>Private member variables that are not accessed outside the 
                  class need not have getter/setter methods declared.</li> 
                <li>Private member variables that are accessed outside the 
                  class should have either package-specific or public 
                  getter/setter methods declared. All accessors should follow 
                  the bean design patterns.</li> 
                <li>Package-scoped member variables, public member variables, 
                  and protected member variables should not be declared.</li> 
         </ol> 
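         <p>The conventions above can be illustrated with a small class. This 
is a hypothetical example written for this section, not code from the Xalan 
source; the class name StepCounter and all of its members are invented. 
Keeping the constant private is an assumption made here so that it does not 
conflict with the rule against public member variables.</p>
<source><![CDATA[
```java
// Hypothetical class laid out according to the conventions listed above.
class StepCounter {

  // 1. Constructors, and an init() function, come first.
  public StepCounter() {
    init(DEFAULT_STEP);
  }

  public StepCounter(int step) {
    init(step);
  }

  private void init(int step) {
    m_step = step;
    m_count = 0;
  }

  // 2. Public API methods come second.
  public int advance() {
    m_count = nextValue();
    return m_count;
  }

  public static void main(String[] args) {
    StepCounter counter = new StepCounter();
    System.out.println(counter.advance()); // prints 1
    counter.setStep(5);
    System.out.println(counter.advance()); // prints 6
  }

  // 3. Package-specific, protected, and private methods follow.
  private int nextValue() {
    return m_count + m_step;
  }

  // 4. Member variables, with their accessors, come last.

  // Static final: upper case, no "m_" prefix, no accessor needed.
  private static final int DEFAULT_STEP = 1;

  // Accessed outside the class, so it gets bean-style accessors.
  private int m_step;

  public int getStep() {
    return m_step;
  }

  public void setStep(int step) {
    m_step = step;
  }

  // Not accessed outside the class, so no accessors are declared.
  private int m_count;
}
```
]]></source>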
    </s2> 
    <s2 title="Open Issues"> 
         <p><link idref="open-issues">Open Issues</link></p> 
         <p>This section documents architectural and design issues that I still
                consider to be open or unsolved. (This list is ongoing, and 
will change over
                time... it's simply a place for me to note problems that are 
ongoing and need
                to be solved.)</p> 
         <gloss> 
                <label>Space stripping</label> 
                <item>In Xalan 1.x, it is clear that space stripping was a major
                  performance issue. This needs to be solved in Xalan 2.0 by 
                  stripping the space nodes as the document is being parsed. 
                  This is a major problem for DOM trees, though. It can 
                  perhaps be solved by preprocessing the DOM tree and creating 
                  a table of space-stripping parent elements when the nodes 
                  can't be pre-stripped.</item> 
         </gloss> 
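         <p>One possible shape for the table idea described above, as a 
rough sketch only: a single pre-processing pass over the DOM records the 
elements whose whitespace-only text children should be ignored, and later 
lookups consult that table instead of modifying the tree. The class 
SpaceStripTableSketch and its methods are hypothetical, element matching is 
by plain node name, and xml:space and preserve-space rules are deliberately 
left out.</p>
<source><![CDATA[
```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch of a table of space-stripping parent elements.
class SpaceStripTableSketch {

  // Pre-processing pass: record every element whose name is in stripNames.
  static Set<Node> buildTable(Node root, Set<String> stripNames) {
    // Identity-based set, so membership is by node object, not by equals().
    Set<Node> table =
        Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
    collect(root, stripNames, table);
    return table;
  }

  // Pre-order walk over the subtree rooted at node.
  private static void collect(Node node, Set<String> stripNames, Set<Node> table) {
    if (node.getNodeType() == Node.ELEMENT_NODE
        && stripNames.contains(node.getNodeName())) {
      table.add(node);
    }
    for (Node kid = node.getFirstChild(); kid != null; kid = kid.getNextSibling()) {
      collect(kid, stripNames, table);
    }
  }

  // Lookup: whitespace-only text under a recorded parent counts as stripped.
  // trim() is used as an approximation of XML whitespace.
  static boolean isStripped(Node node, Set<Node> table) {
    return node.getNodeType() == Node.TEXT_NODE
        && node.getNodeValue().trim().isEmpty()
        && table.contains(node.getParentNode());
  }

  // Sample tree: list -> (whitespace text, item -> " ").
  static Document sampleDocument() {
    try {
      Document doc =
          DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
      Element list = doc.createElement("list");
      Element item = doc.createElement("item");
      doc.appendChild(list);
      list.appendChild(doc.createTextNode("\n  "));
      list.appendChild(item);
      item.appendChild(doc.createTextNode(" "));
      return doc;
    } catch (ParserConfigurationException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Document doc = sampleDocument();
    Set<Node> table = buildTable(doc, Collections.singleton("list"));
    Node list = doc.getDocumentElement();
    System.out.println(isStripped(list.getFirstChild(), table));                // prints true
    System.out.println(isStripped(list.getLastChild().getFirstChild(), table)); // prints false
  }
}
```
]]></source>
         <p>The DOM itself is left untouched in this sketch; only the side 
table changes, which matters when the tree is owned by the caller and cannot 
be pre-stripped.</p>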
    </s2>
  </s1>
  
  
  
  1.1                  xml-xalan/xdocs/sources/design/org_apache.gif
  
        <<Binary file>>
  
  
  1.1                  xml-xalan/xdocs/sources/design/trax.gif
  
        <<Binary file>>
  
  
  1.1                  xml-xalan/xdocs/sources/design/xpath.gif
  
        <<Binary file>>
  
  
