Re: don't collapse two spaces at the end of a sentence

From: Richard A. O'Keefe <ok@atlas.otago.ac.nz>
Date: Tue, 18 Dec 2001 15:12:03 +1300 (NZDT)
Message-Id: <200112180212.PAA21297@atlas.otago.ac.nz>
To: Todd_Lewis@unc.edu, html-tidy@w3.org, lee@novonyx.com
	I understood the original problem to be that when Tidy rewraps raw
	blocks of text, it doesn't do the two-space two step.

The problem is not that it doesn't _add_ double-spacing,
but that it doesn't _preserve_ double-spacing that is already there.
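
To make the distinction concrete, here is a minimal sketch (in Python, not
Tidy's actual C code) of a greedy rewrap that never guesses where sentences
end.  It only looks at the gap the author already typed: a run of two or
more spaces is kept as two spaces, and any other whitespace becomes one.
The function name and the `width` parameter are hypothetical, chosen for
illustration only.

```python
import re

def rewrap_preserving_gaps(text, width=68):
    """Greedy rewrap that keeps double spaces already present in text.

    Illustrative sketch only; this is not how Tidy is implemented.
    """
    # Pair each word with the whitespace that preceded it in the source.
    tokens = re.findall(r"(\s*)(\S+)", text)
    lines = []
    current = ""
    for gap, word in tokens:
        # Keep a double space only where the source had two or more
        # plain spaces; newlines, tabs and single spaces become one space.
        sep = "  " if re.fullmatch(r" {2,}", gap) else " "
        if not current:
            current = word
        elif len(current) + len(sep) + len(word) <= width:
            current += sep + word
        else:
            # The gap falls at a line break, so it is simply dropped.
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return "\n".join(lines)
```

Because no sentence detection is involved, this also keeps double spaces
that have nothing to do with sentences (hand-aligned columns, say), and it
trims runs of three or more spaces down to two.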

	All the issues you brought up about how to determine the end of
	sentences (in various languages no less) have been worked out for years
	in TeX, and the code is free for the taking.

Since HTML 4 and XHTML are based on Unicode, it may be relevant to note
that the Unicode standard includes a method for determining sentence
boundaries.  It's not claimed to be perfect, but it works pretty well
for a wide range of languages and scripts.

	If it were important enough
	to some coder to preserve his two spaces (or "correct" it in HTML from
	other authors/sources), then he could take the appropriate part of TeX's
	code and incorporate it into Tidy, therefore doubling its size (or
	thereabouts -- I'm guessing).

A very wild guess indeed.  A better guess would be 0.5%.

Received on Monday, 17 December 2001 21:12:17 GMT