- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Mon, 5 Mar 2007 08:48:25 +0200 (EET)
- To: www-html@w3.org
- cc: help-whatwg.org@lists.whatwg.org
On Mon, 5 Mar 2007, Lachlan Hunt wrote:

> Your example can be handled using:
>
> <title>H₂O</title>

Did you try that? On a popular browser? What almost certainly happens is that when the title element's content is rendered in a widget, a fairly limited character repertoire (perhaps just ISO-8859-1) is used, and all other characters appear as boxes or parts of boxes. Things might be different when using a sufficiently new browser _and_ when the system settings have been changed to use a font like Arial Unicode MS in different widgets.

My point is that browsers render title elements in simplistic manners, and they need to be changed anyway just to cover character repertoires the way they do in document content. It would take extra effort to interpret inline markup as well, but probably not disproportionately.

There's still the argument against allowing markup in title elements that the element specifies data that is essentially external to HTML, to be rendered in non-HTML contexts. I think most of the objections would be covered by allowing

<title>H2O</title>
<title type="extended">H<sub>2</sub>O</title>

since popular browsers would use the first element and ignore the second, and future browsers could use whichever they want, possibly different title elements for different purposes. (Actually the type="extended" attribute would not be needed. I added it just because it looked explanatory.)

P.S. A use case for allowing markup:

<title>The famous formula <i>E</i> = <i>m</i><i>c</i><sup>2</sup></title>

(One might ask whether <var> should be used here instead of <i>, but that's not relevant here.) You can replace <sup>2</sup> by &sup2; or the superscript two character itself, but you can't replace italicized E, m, and c in any similar manner. The use of italics for physical quantities is a well-established convention and affects the meanings of symbols: italicized "m" stands for mass, upright "m" stands for meter. If a document teaches physics, shouldn't it use correct notations in its title, too?

Another use case would be

<title>Government of Canada Site | <span lang="fr">Site du gouvernement du Canada</span></title>

There's hardly any other way to let speech synthesizers pronounce it correctly or to let automatic spelling checkers check it properly. Automatic language recognition from the text itself, on a heuristic basis, works reasonably well for longish fragments of text but hardly for short texts in mixed languages.

--
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Monday, 5 March 2007 06:48:33 UTC