a handy XML/XHTML line-breaking algorithm

From: Dan Connolly <connolly@w3.org>
Date: Tue, 8 Aug 2000 13:28:15 -0400 (EDT)
Message-ID: <3990430B.27E0D77B@w3.org>
To: www-amaya@w3.org, html-tidy@w3.org
CC: Arjun Ray <aray@q2.net>, swick@w3.org, ij@w3.org

I use tidy and Amaya to manage XHTML documents. A major
annoyance is that they break lines differently, introducing
a lot of noise in my CVS/RCS history.

A while back, we got some feedback on the XML c14n spec,
proposing an alternative line-breaking algorithm:

"In [...] significantly enhancing the utility of line-oriented text
processing tools in dealing with canonicalized documents, I believe
this alternative is worth considering."
	-- Comments on the WD - A proposed alternative
	Arjun Ray (Sun, Feb 20 2000) 
http://lists.w3.org/Archives/Public/www-xml-canonicalization-comments/2000Feb/0005.html

Basically, you break lines inside tags (between <tag-name
and >, and after each attribute), and any line breaks in
character data get escaped as &#10;.
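
To make the rule concrete, here is a rough Python sketch of a
serializer along those lines. It is only my reading of the summary
above, not the write-up's reference code: the names serialize and
escape_text are made up for illustration, namespaces are ignored,
and end tags are written without a break, which the summary does
not specify either way.

```python
from xml.etree import ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def escape_text(text):
    """Escape &, <, > and turn literal newlines into &#10;."""
    return escape(text).replace("\n", "&#10;")


def serialize(elem):
    """Serialize elem so that line breaks occur only inside start tags.

    Assumed layout (an interpretation, not a spec): a break after the
    tag name and after each attribute, then the closing '>'.
    """
    parts = ["<" + elem.tag + "\n"]
    for name, value in elem.attrib.items():
        # quoteattr adds the quotes and escapes markup characters.
        parts.append(name + "=" + quoteattr(value) + "\n")
    parts.append(">")
    if elem.text:
        parts.append(escape_text(elem.text))
    for child in elem:
        parts.append(serialize(child))
        if child.tail:
            parts.append(escape_text(child.tail))
    # End-tag handling is an assumption: no break is inserted here.
    parts.append("</" + elem.tag + ">")
    return "".join(parts)


doc = ET.fromstring('<p class="x">one\ntwo<a href="u">link</a> end</p>')
print(serialize(doc))
```

Since every physical line break in the output sits inside a tag,
a line-oriented tool such as diff sees one tag name or one
attribute per line, and re-wrapping a paragraph can no longer
shuffle the markup around.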

I have written it up in some detail; please see:

	A Handy Line-breaking Algorithm for XML (esp XHTML)
	http://www.w3.org/2000/08/lb2/
	$Revision: 1.10 $ of $Date: 2000/08/08 17:25:41 $

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Tuesday, 8 August 2000 13:48:33 GMT