On Sep 4, 2008, at 14:43, Michael(tm) Smith wrote:
Julian Reschke <[EMAIL PROTECTED]>, 2008-09-03 10:10 +0200:
Henri Sivonen wrote:
Oops. I didn't realize there was content after the signature.
Is this commonly used? It's a rather unobvious use of a transform
package.
I know it's commonly used for serializing XML (actually, as far as I
recall, it's the recommended way to do it when you have to rely on what
the JDK includes). Once you know it's there and realize that it includes
HTML serialization as well, it's kind of obvious to use it for that as
well.
That being said, I don't recall whether it was recommended anywhere. And
no, I don't know how common it is.
Is there a better alternative that doesn't require including additional
packages?
That seems like a really good question. Henri, I'd think that
after as much exploration as you've done around XML processing in
Java, if there were some better way, you might know about it. Does
anything come to mind?
Or wait, I now note that qualification of "doesn't require
including additional packages"... which I guess gets back to what
Julian had mentioned earlier about developers not being at liberty
to install additional packages into Java environments on shared
hosts where they need to do their work.
I don't know of any better way to get a SAX to XML or SAX to HTML
serializer from the APIs provided by the JDK.
Although I had been aware of the JDK including the Xalan serializer
behind TrAX, I was unaware that it can be used without a transform
before Julian mentioned it. That is, I didn't know that you can use a
Transformer without loading a transform into it. (And still, before I
form an opinion on whether doing so makes sense, I want to step
through the process in a debugger to find out what exactly happens
between the SAX events going into the empty Transformer and the
OutputStream coming out.)
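For reference, here is a minimal sketch of the approach under discussion, using only the javax.xml.transform and org.xml.sax APIs that ship with the JDK. The class name IdentitySerializerSketch and the sample events are made up for illustration, and the sketch assumes that the default TransformerFactory supports SAX (it checks for that feature). Calling newTransformerHandler() with no stylesheet yields a handler that copies the SAX events it receives to the Result.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical class name; not taken from any code mentioned in this thread.
class IdentitySerializerSketch {

    // Feeds a few hand-written SAX events into an identity TransformerHandler
    // and returns what the JDK's built-in serializer writes for them.
    // The method argument is "xml" or "html".
    static String serialize(String method) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("TrAX implementation lacks SAX support");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // No stylesheet argument: the handler passes its input through unchanged.
        TransformerHandler handler = stf.newTransformerHandler();

        // Configure the serializer before attaching the Result.
        Transformer t = handler.getTransformer();
        t.setOutputProperty(OutputKeys.METHOD, method);
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        t.setOutputProperty(OutputKeys.INDENT, "no");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        // The handler is an ordinary ContentHandler, so any SAX source works.
        AttributesImpl noAtts = new AttributesImpl();
        char[] text = "1 < 2 & 3".toCharArray();
        handler.startDocument();
        handler.startElement("", "p", "p", noAtts);
        handler.characters(text, 0, text.length);
        handler.endElement("", "p", "p");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("xml"));
        System.out.println(serialize("html"));
    }
}
```

In real use the handler would be registered as the ContentHandler of an XMLReader, and the StreamResult would wrap an OutputStream rather than a StringWriter.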
So far, I have used three ways to serialize SAX to XML in Java.
First, I used the serializer from GNU JAXP. Using it became
increasingly difficult as GNU JAXP started to depend on GCJ stuff and
stopped being fully functional on a pure JRE.
Then I started using the Xalan serializer as shipped by the Apache
Software Foundation (i.e. not depending on the Sun-private copy inside
the JDK). I got increasingly annoyed by the way it handled Namespaces,
its failure to sanitize non-XML characters, the verbosity of
instantiating it and the slow response to
https://issues.apache.org/jira/browse/XALANJ-2419
Now I am using a SAX to XML serializer that I wrote myself. It has no
configurability, has no factories or providers, sanitizes non-XML
characters in content, obeys my sense of Namespace aesthetics and is
contained in one .java file.
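To illustrate what sanitizing non-XML characters in content can look like, here is a rough sketch. It is not the code of the serializer described above. The class and method names are hypothetical, and replacing each offending character with U+FFFD is an assumption made for this example; other policies, such as dropping the character or throwing, are equally possible. The range check follows the Char production of XML 1.0.

```java
// Hypothetical helper; names and replacement policy are assumptions.
class XmlTextSanitizerSketch {

    // True if the code point matches the XML 1.0 Char production.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Escapes markup characters in a SAX characters() buffer and replaces
    // anything that is not an XML character (including lone surrogates)
    // with U+FFFD.
    static String escapeText(char[] ch, int start, int length) {
        StringBuilder sb = new StringBuilder(length);
        int end = start + length;
        int i = start;
        while (i < end) {
            // Reads a surrogate pair as one code point; an unpaired
            // surrogate comes back as itself and fails isXmlChar below.
            int cp = Character.codePointAt(ch, i, end);
            i += Character.charCount(cp);
            if (!isXmlChar(cp)) {
                sb.append('\uFFFD');
            } else if (cp == '<') {
                sb.append("&lt;");
            } else if (cp == '>') {
                sb.append("&gt;");
            } else if (cp == '&') {
                sb.append("&amp;");
            } else {
                sb.appendCodePoint(cp);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] sample = "a\u0000<b&c".toCharArray();
        System.out.println(escapeText(sample, 0, sample.length));
    }
}
```

A ContentHandler-based serializer could call a method like escapeText from its characters() callback before writing to the output.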
For serializing SAX to HTML, for a long time, I used a serializer that
a friend and I pair programmed as part of a university project. Now
I'm using a serializer that I wrote from scratch by extrapolating from
the DOM to Unicode algorithm that the HTML5 spec gives.
--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/