Summary of XalanJ work (long!)

ilene Thu, 06 Feb 2003 17:18:24 -0800

Hi,

As XSLT 1.0 work starts to wrap up and we begin the transition to XSLT 2.0,
I think it's a good time to take stock of the XalanJ work currently going on,
both independently and in the various branches.  Over the past few months
there have been several notes posted about new work and integration efforts,
and I would like to try to summarize them here.  I invite people
to add more information about design or status of any item and also to add
anything I may have missed.  I strongly encourage everyone to keep
discussions public so we can all understand the status of the project.


In both the XSLT 1.0 and XSLT 2.0 work, there are efforts to synchronize
XSLTC and Xalan, where possible, in order to get common behavior, improve
maintainability, reduce duplication of effort on future work, and in some
cases to improve performance.

Work that's wrapping up for XSLT 1.0:

Integration work:

 * On the XSLTC_DTM branch there is ongoing work to integrate the DTM into
XSLTC. I believe this work is nearing completion, with remaining effort
focussing on performance. (Henry Zongaro, Morris Kwan)

 * Independently (although there has been some discussion on xalan-dev)
there is ongoing work to integrate the serializers for Xalan-interpretive
and XSLTC.  (Brian Minchau)


Extension work:

 * SQL extension enhancements (Art (Welch?))

 * XSLT/XPath extension mechanism (Dimitry Voytenko).


On the xslt20 branch, work is in progress to implement the XSLT 2.0/XPath
2.0 specs, for both Xalan-interpretive and XSLTC.  Because there is so much
new work here, we have the choice of continuing to synchronize
Xalan-interpretive and XSLTC wherever possible, or working independently on
each.  Although not listed separately, performance is always a
consideration.  Here are the works in progress that I know of:


 * Synchronize on a single XPath AST/Parser (or, at the very least, on a
single API to the AST).  Work has started on this in the
java/xpath_rwapi/src directory of the xslt20 branch. (Lionel Villard,
Santiago Pericas-Geertsen, Shane Curcuru)

 * From Scott: We are drafting the design for a new Data Model API (yet
another... sorry) to replace the DTM API (though we will keep many aspects
of the DTM implementation for SAX/XNI events).  We call this XDM. The
problem with the DTM is that it still exposes handles, which require us, in
essence, to implement a mini database in memory... even for existing object
hierarchies such as the DOM.  Also, implementing DTM adapters can be
confusing and expensive, as some of you can attest.  The XDM would
never establish proper node identity, i.e. handles or objects, and would
act more as a cursor, similar to what you would have for iterating over
tables with JDBC.  You will request an iterator based on axes and nodetest,
and you will request information, such as nodename, nodetype, relative
iterator, etc., from the current state of that iterator.  We'll send more
design info about this, but we think it will allow for significant
performance improvements (including streaming subtrees) and data adapter
implementation ease.  So far this is very much at the experimental level.
(Scott Boag, Joseph Kesselman)

 * From Joe: We have prototypes of some portions of the XSLT 2.0 datatype
support retrofitted onto the existing DTM data model, with some of the
functions and operators needed to manipulate that information. That code is
still considered EXPERIMENTAL; among other things, it requires deep
awareness of Xerces post-schema-validation Infoset (PSVI) and is currently
bound fairly tightly to the XNI interfaces (though later changes to Xerces
may allow partly relaxing those ties).

This modified version currently exists on the "xslt20" branch of the
Apache CVS server; please be aware that it may not incorporate all the
latest fixes from the Xalan mainstream. We are still determining whether it
is worth reconciling in its current form, or whether we should wait until
the XDM experiment has progressed to the point where we know which
data-model API it should be based on. (Joseph Kesselman)
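
To make the cursor idea in Scott's XDM note above a little more concrete,
here is a minimal Java sketch.  It is not the XDM design, which has not
been posted yet.  Every name in it (XdmCursorSketch, Cursor, Axis, Elem,
open, relative, names, sampleDoc) is hypothetical and invented for
illustration, and the toy object tree only stands in for whatever data
source an adapter would wrap.  The point it tries to show is that the
caller asks for an iterator by axis and node test, then reads the node name
or a relative iterator from the cursor's current state, without ever
holding a node handle or node object.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: every name here is hypothetical, not part of Xalan or XDM.
final class XdmCursorSketch {

    /** The two axes this toy example understands. */
    public enum Axis { CHILD, DESCENDANT }

    /** A cursor exposes data about its current position, never a handle. */
    public interface Cursor {
        boolean next();                               // advance; false when exhausted
        String nodeName();                            // name at the current position
        Cursor relative(Axis axis, String nameTest);  // iterator relative to the current position
    }

    /** Stand-in backing store; callers only see it when building a document. */
    public static final class Elem {
        final String name;
        final List<Elem> children = new ArrayList<>();

        public Elem(String name, Elem... kids) {
            this.name = name;
            for (Elem kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Request an iterator from a context by axis and name test ("*" matches any name). */
    public static Cursor open(Elem context, Axis axis, String nameTest) {
        // Simplification: matches are collected eagerly. A real cursor could stream.
        List<Elem> matches = new ArrayList<>();
        collect(context, axis, nameTest, matches);
        return new ListCursor(matches);
    }

    // Walks children (and, for DESCENDANT, their subtrees) in document order.
    private static void collect(Elem context, Axis axis, String nameTest, List<Elem> out) {
        for (Elem child : context.children) {
            if (nameTest.equals("*") || nameTest.equals(child.name)) {
                out.add(child);
            }
            if (axis == Axis.DESCENDANT) {
                collect(child, axis, nameTest, out);
            }
        }
    }

    private static final class ListCursor implements Cursor {
        private final List<Elem> matches;
        private int pos = -1; // before the first match, like a fresh JDBC ResultSet

        ListCursor(List<Elem> matches) {
            this.matches = matches;
        }

        @Override public boolean next() {
            if (pos < matches.size()) {
                pos++;
            }
            return pos < matches.size();
        }

        @Override public String nodeName() {
            return current().name;
        }

        @Override public Cursor relative(Axis axis, String nameTest) {
            return open(current(), axis, nameTest);
        }

        private Elem current() {
            if (pos < 0 || pos >= matches.size()) {
                throw new IllegalStateException("cursor is not positioned on a node");
            }
            return matches.get(pos);
        }
    }

    /** Drains a cursor and returns the node names it visited, in order. */
    public static List<String> names(Cursor cursor) {
        List<String> result = new ArrayList<>();
        while (cursor.next()) {
            result.add(cursor.nodeName());
        }
        return result;
    }

    /** Small sample document: catalog(book(title), note, book(title, author)). */
    public static Elem sampleDoc() {
        return new Elem("catalog",
                new Elem("book", new Elem("title")),
                new Elem("note"),
                new Elem("book", new Elem("title"), new Elem("author")));
    }

    public static void main(String[] args) {
        Elem doc = sampleDoc();
        System.out.println(names(open(doc, Axis.CHILD, "book"))); // prints [book, book]

        Cursor books = open(doc, Axis.CHILD, "book");
        books.next();
        books.next(); // now positioned on the second book
        System.out.println(names(books.relative(Axis.CHILD, "*"))); // prints [title, author]
    }
}
```

The loop in names() has the same shape as stepping through a JDBC
ResultSet: advance, then read fields from the current row.  Nothing outside
the cursor ever needs a stable node identity, which is the property the note
above argues for.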


Here are other ideas that I'm aware of and that I believe are quite open:

* Synchronize on a single XSLT AST.  This probably means throwing away all
of the current Xalan "processor" package, and working more like XSLTC for
both modes.

* Synchronize on a single function library and function API.  The design
hasn't started on this yet.  I think we will end up with something closer
to XSLTC than the current Xalan interpretive library.

* We have to come up with a good way to interface with a schema processor,
particularly for validation and type discovery.  This is the thorniest
issue.  Essentially no serious design discussions have occurred about this
as of yet.

* Have XSLTC produce Java source code instead of byte code.


Ilene.
