- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Thu, 4 Sep 2008 18:29:58 +0300
- To: Michael (tm) Smith <mike@w3.org>
- Cc: Julian Reschke <julian.reschke@gmx.de>, www-archive@w3.org
On Sep 4, 2008, at 14:43, Michael(tm) Smith wrote:

> Julian Reschke <julian.reschke@gmx.de>, 2008-09-03 10:10 +0200:
>
>> Henri Sivonen wrote:
>>> Oops. I didn't realize there was content after the signature.
>>> Is this commonly used? It's a rather unobvious use of a transform
>>> package.
>>
>> I know it's commonly used for serializing XML (actually, as far as
>> I recall, it's the recommended way to do it when you have to rely on
>> what the JDK includes). Once you know it's there and realize that it
>> includes HTML serialization as well, it's kind of obvious to use it
>> for that as well.
>>
>> That being said, I don't recall whether it was recommended
>> anywhere. And no, I don't know how common it is.
>>
>> Is there a better alternative that doesn't require including
>> additional packages?
>
> That seems like a really good question. Henri, I'd think that
> after as much exploration as you've done around XML processing in
> Java, if there were some better way, you might know about it. Does
> anything come to mind?
>
> Or wait, I now note that qualification of "doesn't require
> including additional packages"... which I guess gets back to what
> Julian had mentioned earlier about developers not being at liberty
> to install additional packages into Java environments on shared
> hosts where they need to do their work.

I don't know of any better way to get a SAX to XML or SAX to HTML serializer from the APIs provided by the JDK.

Although I had been aware of the JDK including the Xalan serializer behind TrAX, I was unaware that it can be used without a transform before Julian mentioned it. That is, I didn't know that you can use a Transformer without loading a transform into it. (And still, before I form an opinion on whether doing so makes sense, I want to step through the process in a debugger to find out what exactly happens between the SAX events going into the empty Transformer and the OutputStream coming out.)
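For reference, the transform-less use of TrAX discussed above can be sketched roughly as follows. This is only a minimal sketch, assuming the JDK's default TransformerFactory supports SAXTransformerFactory (it checks the feature flag before casting). The class name IdentitySerializerSketch, the serialize helper and the sample <p> element are hypothetical and chosen purely for illustration.

```java
import java.io.ByteArrayOutputStream;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration: SAX events go into a TransformerHandler that
// has no stylesheet loaded (an identity transform), bytes come out.
final class IdentitySerializerSketch {

    // method is an XSLT output method name such as "xml" or "html".
    static String serialize(String method) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        if (!tf.getFeature(SAXTransformerFactory.FEATURE)) {
            throw new IllegalStateException("SAX transformer support missing");
        }
        SAXTransformerFactory factory = (SAXTransformerFactory) tf;

        // No Source or Templates argument: the handler copies its input.
        TransformerHandler handler = factory.newTransformerHandler();
        Transformer transformer = handler.getTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, method);
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        handler.setResult(new StreamResult(out));

        // Feed a tiny document as SAX events; the handler is a ContentHandler.
        handler.startDocument();
        handler.startElement("", "p", "p", new AttributesImpl());
        char[] text = "a < b".toCharArray();
        handler.characters(text, 0, text.length);
        handler.endElement("", "p", "p");
        handler.endDocument();

        return out.toString("UTF-8");
    }
}
```

Calling serialize("xml") or serialize("html") selects the output method. In real use the handler would typically be registered as the ContentHandler of an XMLReader instead of being driven by hand.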
So far, I have used three ways to serialize SAX to XML in Java.

First, I used the serializer from GNU JAXP. Using it became increasingly difficult as GNU JAXP started to depend on GCJ stuff and stopped being fully functional on a pure JRE.

Then I started using the Xalan serializer as shipped by the Apache Software Foundation (i.e. not depending on the Sun-private copy inside the JDK). I got increasingly annoyed by the way it handled Namespaces, its failure to sanitize non-XML characters, the verbosity of instantiating it and the slowness of reaction to https://issues.apache.org/jira/browse/XALANJ-2419.

Now I am using a SAX to XML serializer that I wrote myself. It has no configurability, has no factories or providers, sanitizes non-XML characters in content, obeys my sense of Namespace aesthetics and is contained in one .java file.

For serializing SAX to HTML, for a long time, I used a serializer that a friend and I pair programmed as part of a university project. Now I'm using a serializer that I wrote from scratch by extrapolating from the DOM to Unicode algorithm that the HTML5 spec gives.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Thursday, 4 September 2008 15:30:42 UTC