Re: [html] Elements within "title"?

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 5 Mar 2007 08:48:25 +0200 (EET)
To: www-html@w3.org
cc: help-whatwg.org@lists.whatwg.org
Message-ID: <Pine.GSO.4.64.0703050814500.12315@mustatilhi.cs.tut.fi>

On Mon, 5 Mar 2007, Lachlan Hunt wrote:

> Your example can be handled using:
>  <title>H&#x2082;O</title>

Did you try that? On a popular browser? What happens almost certainly is
that when the title element's content is rendered in a widget, a fairly
limited character repertoire (perhaps just ISO-8859-1) is used and all 
other characters appear as boxes or parts of boxes. Things might be 
different when using a sufficiently new browser _and_ when the system 
settings have been changed to use a font like Arial Unicode MS in 
different widgets.
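
A minimal sketch of the check I have in mind, assuming for illustration that the widget is limited to ISO-8859-1 (actual browsers and system settings vary). The function name outside_latin1 is hypothetical:

```python
def outside_latin1(title: str) -> list[str]:
    """Return the characters of `title` that ISO-8859-1 cannot represent.

    ISO-8859-1 covers exactly the code points U+0000..U+00FF, so any
    character above U+00FF is at risk of showing up as a box in a widget
    restricted to that repertoire (an assumption, not a browser fact).
    """
    return [ch for ch in title if ord(ch) > 0xFF]
```

Under that assumption, the subscript two (U+2082) in the suggested title is reported as a problem character, whereas superscript two (U+00B2) is not, since it is within ISO-8859-1.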

My point is that browsers render title elements in a simplistic manner
and they need to be changed anyway just to cover character repertoires the 
way they do in document content. It would take extra effort to interpret 
inline markup as well, but probably not disproportionately.

There's still the argument against allowing markup in title elements that 
the element specifies data that is essentially external to HTML, to be 
rendered in non-HTML contexts.
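
To make the non-HTML context concrete, here is a minimal sketch, assuming a consumer that can show only plain text simply drops the tags and keeps the character data. This is not a description of how any particular browser works; TitleText and plain_title are hypothetical names:

```python
from html.parser import HTMLParser


class TitleText(HTMLParser):
    """Collect only the character data of a title, ignoring all tags."""

    def __init__(self):
        super().__init__()  # character references are decoded by default
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_title(markup: str) -> str:
    """Flatten title markup to plain text for a non-HTML context."""
    parser = TitleText()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)
```

With this fallback, plain_title("H<sub>2</sub>O") gives "H2O": the text survives, but the subscript is lost, which is the cost of treating the title as data external to HTML.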

I think most of the objections would be covered by allowing

<title>H2O</title>
<title type="extended">H<sub>2</sub>O</title>

since popular browsers would use the first element and ignore the second, 
and future browsers could use whichever they want, possibly different 
title elements for different purposes.

(Actually the type="extended" attribute would not be needed. I added it 
just because it looked explanatory.)

P.S. A use case for allowing markup:

<title>The famous formula <i>E</i> = <i>mc</i><sup>2</sup></title>

(One might ask whether <var> should be used here instead of <i>, but 
that's not relevant here.)

You can replace <sup>2</sup> by &sup2; or the superscript two character
itself, but you can't replace italicized E, m, and c in any similar 
manner. The use of italics for physical quantities is a well-established 
convention and affects the meanings of symbols: italicized "m" stands for 
mass, upright "m" stands for meter. If a document teaches physics, 
shouldn't it use correct notations in its title, too?

Another use case would be

<title>Government of Canada Site |
   <span lang="fr">Site du gouvernement du Canada</span></title>

There's hardly any other way to let speech synthesizers pronounce it 
correctly or to let automatic spelling checkers check it properly. 
Automatic language recognition from the text itself, on a heuristic 
basis, works reasonably well for longish fragments of texts but hardly for 
short texts in mixed languages.

Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Monday, 5 March 2007 06:48:33 UTC
