RE: Clean XSL documents

Bjoern Hoehrmann wrote:

> >I'm not sure it could do that much in the HTMLTidy sense of fixing up
> >*bad* markup.
 
> I can think of some cases like
>   * missing end-tags
>   * un-quoted attribute values

James Clark's tool SX (now part of OpenSP too) will do those -- at least
to some extent:  I can't remember offhand if SX requires a "real" SGML
document -- i.e., one with a DTD.

>   * un-escaped <, ]> and &
>   * white-space in front of the XML declaration

Yes.
 
> but then I come to think who actually writes non-wellformed XML
> documents and isn't writing a fault tolerant XML processor harmful?

I've goofed while hand-editing XML documents before.  Firing it up in a
browser (IE or Moz) is a pretty quick way to find WF-ness errors.  

> >I was really thinking of just a "pretty print" mechanism.
> 
> White-space in XML documents is in general significant, so it 
> is hard to actually pretty print, unless you think a style like 
> documented in http://www.w3.org/2000/08/lb2/ is "pretty". 

No thank you!  :)

> If not, you'd still need a definition of what elements in the to 
> be tidied document are inline and block-level

Yes.  Assume all elements are by default inline (the same assumption CSS
makes, IIRC).  Then with a config file like

	indent: 2
	block-elements: url("block.txt")
	empty-elements: url("emtpy.txt")

or some such, you'd be set. If you didn't provide it a list of empty
elements, it would assume them from any '/>' ending the tag (just like
XML parsers do). (You might want a *few* other things, like being able
to set the input and output encodings).

> and if users are willing to provide this information? 

Indeed they would certainly have to if they want to use (the currently
mythical) XMLTidy.  I can think of at least several occasions an XMLTidy
would have helped me immensely, and I would have gladly (and rather
quickly) been able to compile a list of the block-level elements. 

> Who at all needs pretty XML?

Go to www.altova.com and view source.  That beauty was created using
XSLT.  Because it happens to be XHTML, it's fortunately Tidyable.  

Now imagine the same for a similarly created non-XHTML XML document!  :)


/Jelks



 

Received on Saturday, 2 November 2002 13:29:44 UTC