
Re: don't collapse two spaces at the end of a sentence

From: Todd M. Lewis <utoddl@email.unc.edu>
Date: Mon, 17 Dec 2001 16:25:46 -0500
Message-ID: <3C1E62DA.AE98C37@email.unc.edu>
To: lee@novonyx.com, html-tidy@w3.org

Lee,

I agree with everything you said. Here's the point: Tidy has a bunch of
options each of which controls one of two different things: (can you have
an off-topic parenthetical after the second colon in a sentence? Or can
you even have two colons in a sentence? And embedded questions, too? Oh
well...) presentation of the rendered HTML, and layout of the raw
source.

I understood the original problem to be that when Tidy rewraps raw
blocks of text, it doesn't do the two-space two step. I didn't think we
were talking about how browsers render spaces, as that's covered
elsewhere in the appropriate specs (and violated at will by the
browsers).
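
To illustrate the source-layout behaviour in question (this is not Tidy's
code), Python's standard `textwrap` module offers the same choice through
its `fix_sentence_endings` option. A minimal sketch follows; the helper name
`rewrap` and the sample text are made up for the example.

```python
import textwrap

def rewrap(text: str, width: int = 40, two_spaces: bool = False) -> str:
    """Rewrap a raw block of text, optionally doing the two-space step."""
    # Collapse every run of whitespace first, the way a rewrapper would.
    collapsed = " ".join(text.split())
    # With fix_sentence_endings=True, textwrap puts two spaces after a
    # lowercase letter followed by '.', '!' or '?' (and an optional quote).
    return textwrap.fill(collapsed, width=width,
                         fix_sentence_endings=two_spaces)

sample = "Tidy rewraps raw text.  The gap is collapsed. Or is it?"
print(rewrap(sample))
print(rewrap(sample, two_spaces=True))
```

The rule is deliberately crude: it doubles the space after "Mr." as well,
and it misses a sentence that ends in a capital letter.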

All the issues you brought up about how to determine the end of
sentences (in various languages no less) have been worked out for years
in TeX, and the code is free for the taking. If it were important enough
to some coder to preserve his two spaces (or "correct" it in HTML from
other authors/sources), then he could take the appropriate part of TeX's
code and incorporate it into Tidy, thereby doubling its size (or
thereabouts -- I'm guessing). I would like such a flag, but I don't
care enough about it to bloat Tidy that much. Like I said, I've gotten
used to one space being sufficient. But the hard work (dealing with the
issues you raised) has been done.
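
For a sense of what the TeX rule looks like, here is a much-simplified
sketch of its "space factor" bookkeeping, using the codes plain TeX assigns
to punctuation. It models only the spacing decision, not TeX's paragraph
builder, and the function names (`space_factor`, `respace`) are invented for
the illustration.

```python
# Space-factor codes as set by plain TeX; every other character is 1000,
# except uppercase letters (999) so that "U.S.A." style periods don't count.
SFCODE = {".": 3000, "?": 3000, "!": 3000, ":": 2000, ";": 1500,
          ",": 1250, ")": 0, "'": 0, "]": 0}

def sfcode(ch: str) -> int:
    if ch in SFCODE:
        return SFCODE[ch]
    return 999 if "A" <= ch <= "Z" else 1000

def space_factor(word: str) -> int:
    """Space factor in effect after the last character of `word`."""
    sf = 1000
    for ch in word:
        code = sfcode(ch)
        if code == 0:
            continue            # closing quotes/brackets are transparent
        # A jump from below 1000 to above 1000 is capped at 1000.
        sf = 1000 if sf < 1000 < code else code
    return sf

def respace(text: str) -> str:
    """Join words with two spaces where the space factor reaches 2000."""
    words = text.split()
    out = []
    for word in words[:-1]:
        out.append(word)
        out.append("  " if space_factor(word) >= 2000 else " ")
    out.extend(words[-1:])
    return "".join(out)

print(respace("See Spot. See Spot run. He works for the USA. Really."))
```

Under this rule "Mr." still counts as a sentence end, which is why TeX
authors mark such abbreviations by hand.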

If somebody wants to affect presentation, then ".&nbsp; " is probably
the way to go, unless there's some magic to be had in style sheets that
I'm not aware of. But not only do I not want to work that hard, I don't
want my browser to work that hard either. :-)
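
For anyone who does want to work that hard, the ".&nbsp; " substitution can
be sketched as a single regular expression. This is an assumption-laden
illustration: it treats its input as plain text (it knows nothing about
tags or attributes), it only handles runs of ordinary spaces, and the name
`add_nbsp` is hypothetical.

```python
import re

# Lowercase letter, sentence punctuation, optional closing quote, then one
# or more spaces that are followed by more text.
SENTENCE_GAP = re.compile(r"""([a-z][.!?]["']?) +(?=\S)""")

def add_nbsp(text: str) -> str:
    """Replace the gap after a sentence with a no-break space plus a space."""
    return SENTENCE_GAP.sub(r"\1&nbsp; ", text)

print(add_nbsp("It works.  Try it! Does it?"))
```

Text that already uses ".&nbsp; " is left alone, because the pattern needs a
plain space directly after the punctuation.
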
-- 
   +------------------------------------------------------------+
  / Todd_Lewis@unc.edu              http://www.unc.edu/~utoddl /
 /(919) 962-5273               Lord, give me patience... Now! /
+------------------------------------------------------------+

Lee Passey wrote:
> 
> My undergraduate degree is in a foreign language, so I find this type of
> discussion extremely fascinating, although it is really off-topic.
> 
> Earlier, Gerhard Scholz suggested that two spaces could be placed at the
> end of sentences by using "dot&nbsp;space".  This will work, and there
> should be nothing in tidy that will prevent or alter this usage.  (If
> there is, please send me an example, and I will work at fixing it.)  But
> should tidy have an option to fix this up where it is not present?
> 
> More to the point, can tidy fix this up where it is not present?
> 
> The problem is, what is a sentence?  From a lexical standpoint, my
> first reaction would be to define the end of a sentence as a
> non-whitespace, followed by a period, an exclamation mark, or a question
> mark, followed by whitespace.  But what about abbreviations like Mr.,
> Ms. or Dr., which should be followed by a single space?  And what about
> sentences "where the punctuation is encapsulated in quotation marks?"
> And what would be the effect of phrase elements and font style elements
> (e.g. <em> or <i>)?  And I seem to recall from my typing class that
> there should be two spaces after colons as well.  Should we include
> rules for that too?
> 
> I conclude that the problem is more appropriately discussed in forums on
> natural language processing or artificial intelligence.  I think that
> tidy should do nothing to prevent "two space or die" bigots from
> creating html which reflects their bias, but I don't think that it can
> or should "fix" text in which it is not already present.
> 
> "Todd M. Lewis" wrote:
> >
> > "Richard A. O'Keefe" wrote:
> > >
> > > In this mailing list, we're NOT talking about how the text ends up being
> > > presented.  We're talking about how the HTML source form is tidied, and
> > > arguments from "modern" typography (really based on mediaeval scribes'
> > > desire to cram as many words as they could onto their extremely expensive
> > > writing medium) are entirely beside the point.
> >
> > Doesn't TeX do something more involved with end-of-sentence spacing?
> > How much bloat would it add to Tidy to make it smarter about punctuation
> > at the end of sentences, like TeX?  For the record, I used to be a "two
> > spaces or die" bigot, but I got over it.  Still, it would be nice if
> > Tidy allowed as much stylistic choice as possible in the internal
> > layout...
> > --
> >    +------------------------------------------------------------+
> >   / Todd_Lewis@unc.edu              http://www.unc.edu/~utoddl /
> >  /(919) 962-5273               Lord, give me patience... Now! /
> > +------------------------------------------------------------+
Received on Monday, 17 December 2001 16:25:49 GMT
