a handy XML/XHTML line-breaking algorithm

I use tidy and Amaya to manage XHTML documents. A major
annoyance is that they break lines differently, introducing
a lot of noise in my CVS/RCS history.

A while back, we got some feedback on the XML canonicalization
(c14n) spec that proposed an alternative line-breaking algorithm:

"In [...] significantly enhancing the utility of line-oriented text
processing tools in dealing with canonicalized documents, I believe
this alternative is worth considering."
	-- Comments on the WD - A proposed alternative
	Arjun Ray (Sun, Feb 20 2000) 
http://lists.w3.org/Archives/Public/www-xml-canonicalization-comments/2000Feb/0005.html

Basically, you break lines inside tags: between <tag-name
and >, and after each attribute. Line breaks in character
data get escaped (&#10;).
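
To make the rule concrete, here is a rough sketch in Python, not the
implementation from the write-up. It rests on several assumptions of
mine: the names serialize, _chars and _attr are hypothetical; there is
a break between the tag name and the first attribute; end tags are
left unbroken; and namespaces, comments and processing instructions
are ignored. It uses the standard xml.etree.ElementTree parser.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def _chars(s):
    # Escape markup characters, then literal newlines, in character data.
    return escape(s or "", {"\n": "&#10;"})


def _attr(name, value):
    # Quote an attribute value; newlines are escaped here as well.
    return name + '="' + escape(value, {'"': "&quot;", "\n": "&#10;"}) + '"'


def serialize(elem):
    # Assumption: one break after the tag name, one after each attribute,
    # so the closing ">" of every start tag begins a line.
    parts = ["<" + elem.tag + "\n"]
    for name, value in elem.attrib.items():
        parts.append(_attr(name, value) + "\n")
    parts.append(">")
    parts.append(_chars(elem.text))
    for child in elem:
        parts.append(serialize(child))
        parts.append(_chars(child.tail))
    # Assumption: end tags are written without a break.
    parts.append("</" + elem.tag + ">")
    return "".join(parts)


doc = ET.fromstring('<p class="x" id="y">one\ntwo<b>three</b>\n</p>')
print(serialize(doc))
```

Every newline in the output falls inside a start tag, so re-parsing
the result gives back the same character data, and a changed attribute
shows up as a one-line diff.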

I have written it up in some detail; please see:

	A Handy Line-breaking Algorithm for XML (esp XHTML)
	http://www.w3.org/2000/08/lb2/
	$Revision: 1.10 $ of $Date: 2000/08/08 17:25:41 $

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/

Received on Tuesday, 8 August 2000 13:48:33 UTC