[whatwg] Re: Are the semantic inline elements really useful?

On 30 Aug, 2005, at 4:16 PM, Henri Sivonen wrote:
> ...
> Consider <cite> for example. What's it really good for? Why should an 
> author bother to use <cite> instead of <i>? Once you have learned to 
> press command-i (or ctrl-i), why should you have to learn to do 
> something else when all you really want to get done is to italicize 
> titles of works?

In theory, screen-readers could intone <cite> elements differently from 
<i> elements, to avoid confusion (when reading articles about movies, 
for example). In English, <cite> might be read at slightly lower 
pitch, at slightly slower speed, and with less tonal variation, while 
<em> would be read at slightly higher pitch and slightly slower speed 
with normal tonal variation, and <i> would be no change from normal.

In practice, current popular screen-readers piggyback on Internet 
Explorer for Windows, and therefore make no distinction between the 
various italic elements whatsoever.

> ...
> The scenario that perhaps in the future there will be a need to style 
> the titles of works in a different way (for example bold 
> strike-through fuchsia) seemed ludicrous.

I have seen some sites that style <cite> differently from <i>, usually 
as italics + some color. They could do that just as easily with <i 
class="title"> (even overriding the italics later if they wanted, while 
retaining them for CSS-ignoring UAs).

> Also, the point about pieces of software doing something cool with the 
> data did not seem like a truthful explanation, because <cite> has been 
> around for a long time and still there are no reports of a killer app 
> emerging around it. So I did not recommend <cite>.

Right. The other use of <cite> I can think of (as I mentioned in the 
article you cited) would be for a site like Technorati to extract the 
titles of books or movies people were talking about. But I haven't seen 
any site that does that yet.
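
To make that concrete, here is a rough sketch of what such an extractor 
might look like, using only Python's standard html.parser module. The 
names CiteExtractor and extract_titles are made up for illustration, 
and the sketch assumes that authors really do reserve <cite> for titles 
of works, which is exactly the assumption in question.

```python
from html.parser import HTMLParser


class CiteExtractor(HTMLParser):
    """Collect the text content of every <cite> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0      # how many <cite> elements are currently open
        self.buffer = []    # text fragments of the <cite> being read
        self.titles = []    # finished titles, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "cite":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "cite" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join the fragments and collapse runs of whitespace.
                self.titles.append(" ".join("".join(self.buffer).split()))
                self.buffer = []

    def handle_data(self, data):
        # Text inside <i>, <em> and so on is ignored unless a <cite> is open.
        if self.depth > 0:
            self.buffer.append(data)


def extract_titles(html):
    """Return the list of <cite> texts found in an HTML string."""
    parser = CiteExtractor()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    sample = ("<p>I reread <cite>Moby-Dick</cite>, which was <i>long</i>, "
              "then watched <cite>The <b>Third</b> Man</cite>.</p>")
    print(extract_titles(sample))
```

Note that the <i> in the sample is skipped: the extractor is only as 
good as the markup's reliability, so a title marked up with <i> or <em> 
is invisible to it.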

> Aside: Now that I looked at the source of the literature list, I 
> noticed that some titles of works were marked up as <em>. My 
> hypothesis is that after an upgrade Dreamweaver has started using <em> 
> when pressing command-i. Sigh. See 
> http://mpt.net.nz/archive/2004/05/02/b-and-i

It's ironic that semantic markup advocacy is gradually preventing <em> 
from ever being semantically useful (hence my article). Since there are 
currently no popular clients that usefully distinguish <em> from the 
other italic elements, there's no obvious way of demonstrating authors' 
wrongness when they use <em> but really mean <cite>/<var>/<dfn>/<i>, so 
no future client will be able to trust that an <em> in a page means 
what the HTML spec says it does. <em> will be the new <i>.

<cite> is probably safe for now, but only because <em> is the honeypot 
for those who are replacing all their occurrences of <i> with one other 
element.

And <i> itself was presentational from its inception, which is one of 
the reasons HTML 5's redefinition of <i> as semantic is misguided. No 
client will ever be able to treat <i> as "an instance of the use of a 
term" on the public Web, because pages that use it for some purpose 
other than "an instance of the use of a term" number at least in the 
hundreds of millions. Even if semantic clients used doctype-sniffing to 
special-case HTML 5, Web browsers would not, so authors would still use 
<i> for italics by mistake far too often.

The problem applies to block elements just as much as inline ones. For 
example, the current HTML 5 draft effectively redefines <dl>/<dt>/<dd> 
as presentational elements (though HTML 4 opened the stable door, both 
by suggesting them for "marking up dialogues", and by using them for 
non-definition purposes on the very first page of the spec itself). 
Something like Google Glossary could extract <dl>/<dt>/<dd> if they were 
unambiguously for definitions, but what can software do with HTML 5's 
"any ... groups of name-value data"? Nothing at all. It's semantically 
useless, and therefore presentational.

One useful line of retreat would be to specify that in the following 
code, "the state of being happy" is unambiguously a definition of 
"happiness" and not of any other subset of the <dt>.

     <dl>
       <dt><dfn>happiness</dfn> /'hæ pɪ nes/ <i><abbr>n.</abbr></i></dt>
       <dd>the state of being happy</dd>
     </dl>

This could be encouraged by "dt dfn {font-weight: bold; font-style: 
normal;}" in browsers' default style sheets, which would be quite 
backward-compatible because of the rarity of <dt><dfn> up to now.
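
As a rough illustration of what a glossary-style client could then do, 
here is a minimal sketch in Python (standard html.parser only). It 
pairs each <dfn> found inside a <dt> with the text of the <dd> elements 
that follow, and ignores any <dt> that has no <dfn>. The class and 
function names are hypothetical, and the pairing rule is my reading of 
the proposal above, not anything a spec currently says.

```python
from html.parser import HTMLParser


def _normalize(fragments):
    """Join text fragments and collapse runs of whitespace."""
    return " ".join("".join(fragments).split())


class DefinitionExtractor(HTMLParser):
    """Pair <dt><dfn>term</dfn></dt> with the following <dd> text (a sketch)."""

    def __init__(self):
        super().__init__()
        self.in_dt = False
        self.in_dfn = False
        self.in_dd = False
        self.after_dd = False   # True once a <dd> has closed the current group
        self.terms = []         # <dfn> terms collected for the current group
        self.text = []          # text of the <dfn> or <dd> being read
        self.definitions = []   # resulting (term, definition) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "dt":
            if self.after_dd:
                # A <dt> after a <dd> starts a new group of terms.
                self.terms = []
                self.after_dd = False
            self.in_dt = True
        elif tag == "dfn" and self.in_dt:
            self.in_dfn = True
            self.text = []
        elif tag == "dd":
            self.in_dd = True
            self.text = []

    def handle_endtag(self, tag):
        if tag == "dfn" and self.in_dfn:
            self.terms.append(_normalize(self.text))
            self.in_dfn = False
        elif tag == "dt":
            self.in_dt = False
        elif tag == "dd" and self.in_dd:
            definition = _normalize(self.text)
            # Only terms explicitly marked with <dfn> get a definition.
            for term in self.terms:
                self.definitions.append((term, definition))
            self.in_dd = False
            self.after_dd = True
        elif tag == "dl":
            self.terms = []

    def handle_data(self, data):
        # Pronunciation, part of speech, etc. in the <dt> are skipped.
        if self.in_dfn or self.in_dd:
            self.text.append(data)


def extract_definitions(html):
    """Return (term, definition) pairs from <dl> markup that uses <dt><dfn>."""
    parser = DefinitionExtractor()
    parser.feed(html)
    parser.close()
    return parser.definitions


if __name__ == "__main__":
    sample = ("<dl><dt><dfn>happiness</dfn> <i><abbr>n.</abbr></i></dt>"
              "<dd>the state of being happy</dd>"
              "<dt>Alice</dt><dd>Hello there.</dd></dl>")
    print(extract_definitions(sample))
```

The "Alice" entry in the sample stands in for HTML 4's dialogue usage: 
because its <dt> has no <dfn>, it is not mistaken for a definition.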

-- 
Matthew Thomas
http://mpt.net.nz/

Received on Tuesday, 30 August 2005 15:23:20 UTC