Hi,

As XSLT 1.0 work starts to wrap up and we begin the transition to XSLT 2.0, I think it's a good time to take stock of the work currently going on for XalanJ, both independently and in the various branches. Over the past few months there have been several notes posted about new work and integration efforts, and I would like to try to summarize them here. I invite people to add more information about the design or status of any item, and also to add anything I may have missed. I strongly encourage everyone to keep discussions public so we can all understand the status of the project.

In both the XSLT 1.0 and XSLT 2.0 work, there are efforts to synchronize XSLTC and Xalan, where possible, in order to get common behavior, improve maintainability, reduce duplication of effort on future work, and in some cases improve performance.

Work that's wrapping up for XSLT 1.0:

Integration work:
* On the XSLTC_DTM branch there is ongoing work to integrate the DTM into XSLTC. I believe this work is nearing completion, with remaining effort focussing on performance. (Henry Zongaro, Morris Kwan)
* Independently (although there has been some discussion on xalan-dev) there is ongoing work to integrate the serializers for Xalan-interpretive and XSLTC. (Brian Minchau)

Extension work:
* SQL extension enhancements (Art (Welch?))
* XSLT/XPath extension mechanism (Dimitry Voytenko)

On the xslt20 branch, work is in progress to implement the XSLT 2.0/XPath 2.0 specs, for both Xalan-interpretive and XSLTC. Because there is so much new work here, we have the choice of continuing to synchronize Xalan-interpretive and XSLTC wherever possible, or working independently on each. Although not listed separately, performance is always a consideration.

Here are the works in progress that I know of:
* Synchronize on a single XPath AST/Parser (or, at the very least, on a single API to the AST). Work has started on this in the java/xpath_rwapi/src directory of the xslt20 branch. (Lionel Villard, Santiago Pericas-Geertsen, Shane Curcuru)
* From Scott: We are drafting the design for a new Data Model API (yet another... sorry) to replace the DTM API (though we will keep many aspects of the DTM implementation for SAX/XNI events). We call this XDM. The problem with the DTM is that it still exposes handles, which require us, in essence, to implement a mini database in memory... even for existing object hierarchies such as the DOM. Also, implementing DTM adapters can be confusing and expensive, as some of you can attest. The XDM would never establish proper node identity, i.e. handles or objects, and would act more as a cursor, similar to what you would have for iterating over tables with JDBC. You will request an iterator based on axes and nodetest, and you will request information, such as nodename, nodetype, relative iterator, etc., from the current state of that iterator. We'll send more design info about this, but we think it will allow for significant performance improvements (including streaming subtrees) and easier data adapter implementation. So far this is very much at the experimental level. (Scott Boag, Joseph Kesselman)
* From Joe: We have prototypes of some portions of the XSLT 2.0 datatype support retrofitted onto the existing DTM data model, with some of the functions and operators needed to manipulate that information. That code is still considered EXPERIMENTAL; among other things, it requires deep awareness of the Xerces post-schema-validation Infoset (PSVI) and is currently bound fairly tightly to the XNI interfaces (though later changes to Xerces may allow partly relaxing those ties). This modified version currently exists on the "xslt20" branch of the Apache CVS server; please be aware that it may not incorporate all the latest fixes from the Xalan mainstream. We are still determining whether it is worth reconciling in its current form, or whether we should wait until the XDM experiment has progressed to the point where we know which data-model API it should be based on. (Joseph Kesselman)

Here are other ideas I'm aware of that I believe are quite open:
* Synchronize on a single XSLT AST. This probably means throwing all of the current Xalan "processor" package away, and working more like XSLTC for both modes.
* Synchronize on a single function library and function API. The design hasn't started on this yet. I think we will end up with something closer to XSLTC than the current Xalan interpretive library.
* We have to come up with a good way to interface with a schema processor, particularly for validation and type discovery. This is the thorniest issue. Essentially no serious design discussions have occurred about this as yet.
* Have XSLTC produce Java source code instead of byte code.

Ilene.
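
As a rough illustration of the cursor-style access Scott describes for XDM above (request an iterator by axis and node test, then read the node name, node type, or a relative iterator from its current state), here is a minimal Java sketch. Every name in it (XdmCursorSketch, Cursor, Axis, NodeTest, DomChildCursor, open) is hypothetical and is not taken from Xalan or from any XDM design document. It is backed by the JDK's DOM API purely so that it runs, and it supports only the child axis.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch: none of these types exist in Xalan.
public class XdmCursorSketch {

    /** Only the child axis is sketched; a real design would cover every XPath axis. */
    enum Axis { CHILD }

    /** Decides whether a node belongs to the iteration. */
    interface NodeTest { boolean matches(Node n); }

    /** Node test that accepts element nodes only. */
    static final NodeTest ELEMENTS = n -> n.getNodeType() == Node.ELEMENT_NODE;

    /** Cursor over a node sequence; callers never receive a handle or node object. */
    interface Cursor {
        boolean next();                            // advance; false when exhausted
        String nodeName();                         // valid only after next() returned true
        short nodeType();                          // valid only after next() returned true
        Cursor iterate(Axis axis, NodeTest test);  // iterator relative to the current node
    }

    /** DOM-backed cursor that walks siblings lazily, standing in for a data adapter. */
    static final class DomChildCursor implements Cursor {
        private final NodeTest test;
        private Node pending;   // next sibling still to be examined
        private Node current;   // current node, never exposed to callers

        DomChildCursor(Node context, NodeTest test) {
            this.test = test;
            this.pending = context.getFirstChild();
        }

        @Override public boolean next() {
            while (pending != null) {
                Node candidate = pending;
                pending = candidate.getNextSibling();
                if (test.matches(candidate)) {
                    current = candidate;
                    return true;
                }
            }
            current = null;
            return false;
        }

        @Override public String nodeName() { return current.getNodeName(); }

        @Override public short nodeType() { return current.getNodeType(); }

        @Override public Cursor iterate(Axis axis, NodeTest t) {
            // Axis.CHILD is the only axis in this sketch.
            return new DomChildCursor(current, t);
        }
    }

    /** Parses the XML text and returns a cursor positioned before the root element. */
    static Cursor open(String xml) {
        try {
            Node doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return new DomChildCursor(doc, ELEMENTS);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Cursor root = open("<doc><a/>text<b><c/></b></doc>");
        root.next();                                   // move onto <doc>
        Cursor kids = root.iterate(Axis.CHILD, ELEMENTS);
        while (kids.next()) {
            System.out.println(kids.nodeName());       // prints a, then b
        }
    }
}
```

The point of the sketch is only that the caller works against the iterator's current state, much like a JDBC result set, instead of holding per-node handles. Whether the actual XDM API ends up looking anything like this is an open question.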
