On Sep 4, 2008, at 14:43, Michael(tm) Smith wrote:

Julian Reschke <[EMAIL PROTECTED]>, 2008-09-03 10:10 +0200:

Henri Sivonen wrote:
Oops. I didn't realize there was content after the signature.
Is this commonly used? It's a rather unobvious use of a transform package.

I know it's commonly used for serializing XML (actually, as far as I recall, it's the recommended way to do it when you have to rely on what the JDK
includes). Once you know it's there and realize that it includes HTML
serialization as well, it's kind of obvious to use it for that as well.

That being said, I don't recall whether it was recommended anywhere. And no,
I don't know how common it is.

Is there a better alternative that doesn't require including additional
packages?

That seems like a really good question. Henri, I'd think that
after as much exploration as you've done around XML processing in
Java, if there were some better way, you might know about it. Does
anything come to mind?

Or wait, I now note that qualification of "doesn't require
including additional packages"... which I guess gets back to what
Julian had mentioned earlier about developers not being at liberty
to install additional packages into Java environments on shared
hosts where they need to do their work.

I don't know of any better way to get a SAX to XML or SAX to HTML serializer from the APIs provided by the JDK.

Although I had been aware of the JDK including the Xalan serializer behind TrAX, I was unaware that it can be used without a transform before Julian mentioned it. That is, I didn't know that you can use a Transformer without loading a transform into it. (And still, before I form an opinion on whether doing so makes sense, I want to step through the process in a debugger to find out what exactly happens between the SAX events going into the empty Transformer and the OutputStream coming out.)
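
For readers who have not seen the approach Julian described, here is a minimal sketch of it using only JAXP classes that ship with the JDK. The class name IdentitySerializerSketch and the hard-coded p element are made up for illustration, and which serializer implementation actually runs behind TransformerFactory depends on the JDK in use, so treat this as an outline rather than a recommendation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical class name; only the javax.xml.transform and org.xml.sax calls are real API.
class IdentitySerializerSketch {

    // method is an XSLT output method name such as "xml" or "html".
    static String serialize(String method) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("Factory does not support SAX input");
        }
        SAXTransformerFactory stf = (SAXTransformerFactory) tf;

        // No stylesheet is passed in, so the handler performs an identity transform:
        // SAX events go in, serialized markup comes out.
        TransformerHandler handler = stf.newTransformerHandler();
        Transformer t = handler.getTransformer();
        t.setOutputProperty(OutputKeys.METHOD, method);
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        handler.setResult(new StreamResult(out));

        // Feed a tiny document as SAX events. A real caller would instead
        // register the handler as the ContentHandler of a parser.
        char[] text = "Hello & goodbye".toCharArray();
        handler.startDocument();
        handler.startElement("", "p", "p", new AttributesImpl());
        handler.characters(text, 0, text.length);
        handler.endElement("", "p", "p");
        handler.endDocument();

        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("xml"));
        System.out.println(serialize("html"));
    }
}
```

The only difference between the XML and HTML cases here is the value of OutputKeys.METHOD; everything else about the output is left to whatever serializer the factory provides.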

So far, I have used three ways to serialize SAX to XML in Java.

First, I used the serializer from GNU JAXP. Using it became increasingly difficult as GNU JAXP started to depend on GCJ stuff and stopped being fully functional on a pure JRE.

Then I started using the Xalan serializer as shipped by the Apache Software Foundation (i.e. not depending on the Sun-private copy inside the JDK). I got increasingly annoyed by the way it handled Namespaces, by its not sanitizing non-XML characters, by the verbosity of instantiating it and by the slowness of reaction to
https://issues.apache.org/jira/browse/XALANJ-2419

Now I am using a SAX to XML serializer that I wrote myself. It has no configurability, has no factories or providers, sanitizes non-XML characters in content, obeys my sense of Namespace aesthetics and is contained in one .java file.
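
To make "sanitizes non-XML characters in content" concrete, here is a rough one-file sketch of that kind of serializer. It is not the serializer described above: the class name TinyXmlSerializerSketch is made up, replacing disallowed characters with U+FFFD is one possible policy among several, and the sketch simply writes qNames as given, so it assumes the event source supplies them and it does nothing about Namespace declarations or attribute whitespace normalization.

```java
import java.io.IOException;
import java.io.Writer;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only: no configurability, no factories, one file.
class TinyXmlSerializerSketch extends DefaultHandler {
    private final Writer out;

    TinyXmlSerializerSketch(Writer out) {
        this.out = out;
    }

    // The Char production of XML 1.0.
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Escapes markup characters and replaces non-XML characters
    // (including lone surrogates) with U+FFFD.
    private void writeEscaped(char[] ch, int start, int length, boolean inAttribute)
            throws IOException {
        int end = start + length;
        for (int i = start; i < end; ) {
            int cp = Character.codePointAt(ch, i, end);
            i += Character.charCount(cp);
            if (!isXmlChar(cp)) {
                out.write('\uFFFD');
            } else if (cp == '&') {
                out.write("&amp;");
            } else if (cp == '<') {
                out.write("&lt;");
            } else if (cp == '>') {
                out.write("&gt;");
            } else if (inAttribute && cp == '"') {
                out.write("&quot;");
            } else {
                out.write(Character.toChars(cp));
            }
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        try {
            out.write('<');
            out.write(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                out.write(' ');
                out.write(atts.getQName(i));
                out.write("=\"");
                char[] value = atts.getValue(i).toCharArray();
                writeEscaped(value, 0, value.length, true);
                out.write('"');
            }
            out.write('>');
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        try {
            out.write("</");
            out.write(qName);
            out.write('>');
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            writeEscaped(ch, start, length, false);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void endDocument() throws SAXException {
        try {
            out.flush();
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }
}
```

An instance can be handed to any SAX event source as its ContentHandler, with a StringWriter or an OutputStreamWriter as the target.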

For serializing SAX to HTML, for a long time, I used a serializer that a friend and I pair programmed as part of a university project. Now I'm using a serializer that I wrote from scratch by extrapolating from the DOM to Unicode algorithm that the HTML5 spec gives.

--
Henri Sivonen
[EMAIL PROTECTED]
http://hsivonen.iki.fi/


