- From: Thomas A. Fine <fine@head.cfa.harvard.edu>
- Date: Wed, 05 Dec 2012 12:57:28 -0500
- To: public-html@w3.org
HTML needs a tag to indicate sentence structure.

So, how do I go about having this tag added? Is there a formal procedure? Should I submit a bug report? Is there a specific group or mailing list where I should start? What exactly is the process?

Here's a brief summary of why I think this is needed:

HTML5 has already added a number of other semantic tags which describe recognizable pieces of documents that are larger than sentences (e.g. SECTION). And this trend has continued with RDF and Microdata, showing that there is significant interest in indicating smaller semantic pieces down to the sub-sentence level. For this reason alone it should be obvious that it would be ludicrous for HTML to offer semantic tags for a vast array of different chunks of information, and yet ignore the absolutely most common semantic chunk, the sentence.

Like other semantic tags, a sentence tag can be useful in attempts to extract meaning from a document, or to convert text to speech with more reliable inflection, or to provide more reliable translations, and probably for many other reasons.

In addition to semantic reasons, my primary interest in this issue is in providing a mechanism for sentence spacing. As HTML could arguably be the most consumed document type for the printed word today or in the near future, it's shocking that it can't do the one common formatting option that typesetters often used for hundreds of years after the invention of movable type: wider sentence spacing.

It's not my intention to start or facilitate some kind of war about sentence spacing. Indeed, HTML should absolutely be agnostic on the issue. Unfortunately, its inability to handle what is historically the most basic text formatting operation cannot be considered an agnostic position. I've seen arguments over this issue where people hold up HTML as evidence that wider sentence spacing is no longer correct. In other words, there is now a belief that the HTML standard has already taken sides.

Here are a few reasons why people might want to adjust sentence formatting:

* Representation of the look of historical documents.
* As an aid to new readers, or people learning a new language.
* As an aid to people with learning or visual disabilities.
* As an additional means of adding emphasis to text.
* Simply because they prefer it for aesthetic reasons.

While there are suggested algorithms for detecting sentences, none of them works completely reliably. An accurate solution defies even the most advanced AI approach, and in fact even another human being would likely fail to accurately guess what the content creator had in mind in all cases.

If HTML has been given all the modern tools of convenience that we now have, shouldn't it also include one of the most basic tools that typesetters have been using for centuries?

tom

Received on Wednesday, 5 December 2012 17:58:01 UTC