Christine Li wrote:
> >which leads me to believe that Xalan is actually using the system id as a
> >URI to load a resource.  It seems kind of broken to require that any
> >stylesheet that references itself be reloadable and that it actually would
> >reload it and reparse it.
> 
> You are right about this. For document function, the processor will reload 
> the document using the system id as a URI

You might call it a bug. See XSLT 1.0 sec. 12.1, starting with "Two documents
are treated as the same document if they are identified by the same URI" and
also see RFC 3986 sec. 4.4: "When a same-document reference is dereferenced
for a retrieval action, the target of that reference is defined to be within
the same entity (representation, document, or message) as the reference;
therefore, a dereference should not result in a new retrieval action." (Or see
its predecessor, RFC 2396 sec. 4.2, which says essentially the same thing.)

My conclusion after dealing with this 4 years ago in another processor is that
in order to satisfy the document() + generate-id() requirements in XSLT 1.0
sec. 12.1, the processor must cache all documents it reads AND must reliably
detect all same-document references including but not limited to document('')
so that it can generate the same IDs for them, if not actually use the same 
node objects each time.

I made a cursory test at http://skew.org/xml/stylesheets/doc-id/ and found
widely varying results in different processors at the time. It may seem
academic, but I can envision a case where generate-id() inconsistency with
same-document references would matter: you might want to use the Muenchian
method on data nodes embedded in the stylesheet.

Anyway, you could write your own JAXP URIResolver to do same-document ref
detection and document caching, but it's still up to Xalan's generate-id()
implementation to also do its own same-document detection because it has to
take the Source that the URIResolver gives it and treat it as the same 
document that it has already been using...perhaps not an easy thing to do.
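
As a rough illustration of the resolver half of that idea, here is a minimal
sketch, not a tested fix for Xalan. The class name CachingResolver and its
key() helper are hypothetical; only the JAXP and java.net types are real. It
assumes base URIs are absolute, treats an empty or fragment-only href as a
same-document reference, and caches parsed DOM trees by normalized URI.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;

// Hypothetical resolver: one cached DOM tree per normalized document URI.
class CachingResolver implements URIResolver {
    private final Map<URI, Document> cache = new ConcurrentHashMap<>();

    // A fragment identifies a part of a document, not a different document.
    private static String stripFragment(String s) {
        int i = s.indexOf('#');
        return i < 0 ? s : s.substring(0, i);
    }

    // Cache key: the absolute, normalized, fragment-free URI of the target.
    static URI key(String href, String base) {
        String h = stripFragment(href == null ? "" : href);
        if (base == null) {
            return URI.create(h).normalize();
        }
        URI b = URI.create(stripFragment(base));
        // Empty href (e.g. document('') or '#id'): same-document reference,
        // so the target is the base document itself.
        return h.isEmpty() ? b.normalize() : b.resolve(h).normalize();
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        URI key = key(href, base);
        Document doc = cache.get(key);
        if (doc == null) {
            try {
                DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
                f.setNamespaceAware(true); // XSLT needs namespace-aware trees
                doc = f.newDocumentBuilder().parse(key.toString());
            } catch (Exception e) {
                throw new TransformerException("cannot load " + key, e);
            }
            cache.put(key, doc);
        }
        // Same Document object on every call for the same key.
        return new DOMSource(doc, key.toString());
    }
}
```

You would install it with TransformerFactory.setURIResolver() or
Transformer.setURIResolver(). Even so, this only guarantees that the resolver
hands back the same DOM object each time; whether the processor then produces
matching generate-id() values for it, or treats it as the stylesheet document
it already holds, is still up to the processor.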

Also note that the stylesheet docs might not have been read yet in the
current transformation session, such as when the transformer was
prepared with an already-created Templates object, the point of which is to
encapsulate the stylesheet tree -- a single tree, sans whitespace-only text
nodes and comments, and possibly further optimized -- not the original
docs/entities from which it was derived. So in that situation there's no way
around reloading stylesheet docs to satisfy document('').

Mike
