Re: CSS needs improvements for handling sentence spacing

From: Thomas A. Fine <fine@head.cfa.harvard.edu>
Date: Tue, 08 Jan 2013 15:50:02 -0500
Message-ID: <50EC867A.9090605@head.cfa.harvard.edu>
To: www-style@w3.org

On 12/20/12 3:47 PM, Alan Stearns wrote:
> My recommendation would be to promote the use of the method you've
> devised, then show that people care enough about sentence spacing to mark
> up their sentences and use your workaround.

I'm wondering how exactly it would be determined that "people care 
enough about sentence spacing to mark up their sentences."  I've looked 
at a few things to see what people are doing right now.  In my inbox, of 
the messages that can be analyzed for sentence spacing, about a third of 
them use two spaces between each sentence.  When I apply the same 
analysis to a few months of the public-html mailing list, I get about 
10% of the messages using two spaces between sentences.  And for actual 
web pages, I checked about 30,000 Blogger postings and found that only 
about 3-4% use two spaces.
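
As a rough illustration of that kind of tally (not the actual analysis 
used for the numbers above), here is a minimal Python sketch.  The 
function names, the capital-letter heuristic, and the min_boundaries 
threshold are all assumptions made for the example.  Abbreviations such 
as "Dr. Smith" will be miscounted as one-space boundaries.

```python
import re

# Hypothetical heuristic: a boundary is terminal punctuation, then
# exactly one or exactly two spaces, then a capital letter.
ONE_SPACE = re.compile(r"[.!?] (?=[A-Z])")
TWO_SPACE = re.compile(r"[.!?] {2}(?=[A-Z])")


def spacing_style(message, min_boundaries=3):
    """Return "two", "one", or None if the message cannot be analyzed."""
    one = len(ONE_SPACE.findall(message))
    two = len(TWO_SPACE.findall(message))
    if one + two < min_boundaries:
        return None  # too few sentence boundaries to judge
    return "two" if two > one else "one"


def two_space_share(messages):
    """Fraction of analyzable messages that mostly use two spaces."""
    styles = [spacing_style(m) for m in messages]
    analyzable = [s for s in styles if s is not None]
    if not analyzable:
        return 0.0
    return analyzable.count("two") / len(analyzable)
```

Messages with too few detectable boundaries are left out of the 
denominator, which matches the "messages that can be analyzed" 
qualification above.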

Blogger is an interesting case, because the default editor preserves 
spaces (albeit in a somewhat broken way).  So those who are using two 
spaces on Blogger are in fact using wide spacing in their web pages 
(even though it works poorly).  Even at only 3%, to me this is a huge 
number of people who care enough to mark up their sentences, and who 
could benefit from a cleaner solution.

There's a reason I've focused simply on people who are using two spaces 
(besides Blogger preserving them).  After a lot of consideration, I 
think it would be a reasonable (partial) solution to use this two-space 
typing habit directly to detect sentence boundaries.  If CSS had a 
sentence-spacing feature, one of the methods it could use to identify 
sentences is to look for two spaces following terminal punctuation.  
Unlike many sentence-detection algorithms, this method is highly 
accurate and fully in the control of the content creator.
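
To make the rule concrete, here is a minimal Python sketch of the 
detection step, not a proposal for how a browser would implement it.  
The function name, the choice of . ! ? as terminal punctuation, and the 
allowance for closing quotes or brackets are assumptions for 
illustration.  It only looks at literal runs of spaces, so a sentence 
that ends at a line break is not treated as a boundary.

```python
import re

# Assumed rule: terminal punctuation, optional closing quotes or
# brackets, then two or more spaces marks a sentence boundary.
_BOUNDARY = re.compile(r"""[.!?]["')\]]*( {2,})""")


def split_sentences(text):
    """Split text at two-space sentence boundaries."""
    sentences = []
    start = 0
    for match in _BOUNDARY.finditer(text):
        # Keep the punctuation with the sentence; drop the wide space.
        sentences.append(text[start:match.start(1)])
        start = match.end(1)
    tail = text[start:]
    if tail.strip():
        sentences.append(tail)
    return sentences
```

Because a single space never counts as a boundary, an abbreviation such 
as "Dr. Smith" stays inside its sentence as long as the writer types one 
space after it, which is the sense in which the method is under the 
content creator's control.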

     tom
Received on Tuesday, 8 January 2013 20:50:31 GMT
