RE: abbr and acronym

On Thu, 29 Mar 2007, Kempen, E.J.F. van wrote:

>> Actually the page http://esw.w3.org/topic/HTML/AbbrAcronym01
>> (now) says:
>>
>> "The main problem with having only one tag concern the aural
>> UA : acronyms are pronounced like word, while most abbr are
>> spelled. Other abbr are a hybrid form, which are pronounced
>> partly like a word and partly spelled."
>
> I've added this sentence to point out that there are more ways to
> pronounce abbreviations and that the correct pronunciation isn't always
> very clear, as pointed out by several other people earlier.

Wiki content keeps changing, so references to it are rather vague, aren't 
they?

Anyway, the point is that the pronunciation is not clear _at all_. As a 
human being, you might infer something from the presence of <abbr> markup, 
if you know the abbreviation and the topic area; as a computer program, 
please don't. The choice of <acronym> vs. <abbr> does _not_ resolve the
issue of whether the content is to be read as a word or some other way.
So what does it do? Nothing useful.

Note that the common assumption (by people who present <acronym> and 
<abbr> as useful) is that the title attribute specifies the pronunciation. 
Well, it does not. It is an "advisory title", whatever that means. It 
would be quite arbitrary to treat it as a pronunciation suggestion. The 
title attribute is in fact a vaguely defined attribute that everyone and
his brother wants to use for different purposes, with no regard for
conflicts between those uses. Think about the (ab)use of title attributes
in "microformats", where they surely don't specify the expanded form of an
abbreviation but (for example) a time in ISO 8601 notation.
The title attribute should really be renamed to dwim (after "Do What I 
Mean").

As an attempt to find some new paths for this rather divergent and
repetitive discussion, I propose the inclusion of a read="..." attribute,
allowed for inline elements, with a CDATA value, and defined specifically
to indicate the intended spoken form of the element's content. It would
typically be used for <span> elements. Examples:
<span read="u s">US</span>
<span read="United States">US</span>
<span read="ohms">&Omega;</span>

This is far from optimal (and leaves open exactly how the attribute value
is to be spoken), but it should remove any need for <abbr> markup that is
supposed to address the same issue but doesn't.

-- 
Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/

Received on Thursday, 29 March 2007 10:56:51 UTC