- From: Thomas A. Fine <fine@head.cfa.harvard.edu>
- Date: Mon, 12 Mar 2012 10:16:36 -0400 (EDT)
- To: public-html-comments@w3.org
I would like to propose a new tag to fill a glaring hole in HTML. There is no tag for marking sentence structure in HTML. While we have always had the <P> tag, possibly the first and most important tag, used to mark off paragraphs, there is no similar tag for sentences.

One might argue that sentences are already marked with punctuation, and are simply a part of the content, not the layout. However, this is clearly not true. A period is a piece of punctuation with many uses, and it may or may not mark the end of a sentence. And many people are interested in providing formatting specific to sentences. Throughout the 800+ years of printing, printers have mostly chosen to mark sentences off with additional space. While I'm not here to take sides on whether or not there should be additional space, it's clear that it's a layout feature missing from HTML.

One could argue that other options are already available that can accomplish this. However, by far the most commonly recommended option is &nbsp;, which is clearly incorrect, because it wrongly affects line wrapping, and because mixing a space and a NBSP to allow for line wrapping can lead to blanks at the beginning of the line.

There are other space entities that would be more appropriate to use. This is appropriate and correct for many situations; however, it is not the fully flexible layout control one should expect in our modern CSS world. Ideally, those who desire it should be able to mark off sentences and finely control their layout using CSS.

This approach is also not necessarily correct in terms of cut-and-paste behavior: someone who feels that their inter-sentence spacing should be large, and uses a wide space entity to get it, would probably hope that upon cut-and-paste this extra space would map to two (or more) space characters. However, it's clear that a single space entity must translate to only one single character.
A dedicated sentence tag would allow web designers to offer recommendations on the mapping between sentences and number of spaces, and allow users to override this setting to their own taste.

The approach of using space entities also provides no mechanism for dynamic control. Only CSS could allow web page viewers to adjust the inter-sentence spacing to their own taste.

Proper CSS control can be accomplished to a fair degree with the <SPAN> tag; however, this is still suboptimal for a couple of reasons. First, it isn't clear that this produces correct behavior. If you adjust the padding-right for all of your sentence spans, it isn't clear whether a browser's wrap margin will be shifted because of this when a sentence ends near the wrap margin. Second, without a standard, there is no "hook" for software developers to build sentence handling into their user interface. That is, if there were a sentence tag, then HTML generators could offer help to the user in detecting and tagging sentences, whereas without a standard this is unlikely to happen. Use of SPAN also does not properly address the cut-and-paste issue I discussed above.

I have a web page formatted using the SPAN tag, along with JavaScript that allows the end user to adjust sentence spacing:
http://hea-www.harvard.edu/~fine/Tech/html-sentences.html
(I discuss many of these issues on that web page.)

Perhaps the biggest reason for adding this tag is a political one. There is an ongoing debate about whether or not the spacing between sentences should be different than the spacing between words. It's not my intention here to take sides. More importantly, HTML should not take sides, but the lack of a tag for marking sentence structure does just that. Many people naively point at HTML's space-collapsing behavior as some kind of proof that it is wrong to add extra space between sentences. But this should be a decision of the web designers; HTML itself should be agnostic on the issue.
The only way to do this is to offer a functional mechanism for those who want to use it.

The record shows that in the early nineties, when HTML designers looked at the issue of space between sentences, they decided they should just use word spacing, not because it was "correct", but because attempting to detect sentences was just too much trouble. At the time, the goals of HTML were to be small and simple; just a structure for accessing other document types. However, the purpose of HTML has changed radically since then.

In summary, these are the arguments in favor of a sentence tag.

* HTML should not take sides on this layout issue.
* &nbsp; is clearly the wrong solution.
* Other space entities are usable in some cases, but incomplete.
* Manipulating SPAN does not offer a clearly correct solution.
* SPAN does not give software developers a standard that could be used in user interface design.
* There is no solution that addresses the cut-and-paste issue.

Thank you,
tom

Thomas A. Fine
Received on Tuesday, 13 March 2012 12:12:02 UTC