[whatwg] Problems with the definition of <cite>

On Jan 6, 2007, at 12:18 PM, fantasai wrote:
> Anne van Kesteren wrote:
>> By the way, I didn't really get the arguments about implementing a 
>> construct like:
>>   <p><cite><a href="...">...</a></cite> ... <q>...</q></p>
>> At least not for visual user agents.
> I think the problem is what happens if I am, for example, writing
> a 5-paragraph essay comparing two books. I use lots of quotations
> from both books in the same paragraph in all five paragraphs, but
> the cite information is complete (author+title) only in the first
> instance, and the order of source and quotation is mixed up all
> over the place. You can machine-process the simple case of one
> quote, one cite, but there's no way to machine-process that without
> some help.
> ...

Right. The description of attaching adjacent <cite>s to <blockquote> 
and <q> is not only a heuristic, it's a poor heuristic, because it will 
fail often in those documents where <blockquote> and <cite> are used at 
all. For example, it will fail where one <cite> element is in a 
paragraph immediately between two <blockquote>s, when it may be the 
citation of only one or neither of them.
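
As a sketch of that failure case (the quotations and the title below 
are invented for illustration), nothing in this markup tells a user 
agent which <blockquote>, if either, the <cite> belongs to:

     <blockquote><p>First quotation.</p></blockquote>
     <p>Compare the argument in <cite>Some Book</cite>.</p>
     <blockquote><p>Second quotation.</p></blockquote>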

There are other problems in WA1's current definition of <cite>
<http://www.whatwg.org/specs/web-apps/current-work/#the-cite>. It says:

     This is the correct way to do it:

         <p><q>This is correct!</q>, said <cite>Ian</cite>.</p>

Despite this being consistent with the example given in the HTML 4 
specification, it is not compatible with the Web (except for the tiny 
part of it found on diveintomark.org and its imitators). All noticeable 
graphical browsers default to cite {font-style: italic}, and it is 
inappropriate to italicize someone's name just because you're quoting 
them. Therefore, that's not what Web authors -- or even HTML reference 
authors -- understand <cite> to be for.
(A counterexample is the Mozilla Developer Center's description of 
<cite>, which uses the same example as the HTML 4 spec, but helpfully 
shows how nonsensically Gecko would render it if you actually used it 
that way. <http://developer.mozilla.org/en/docs/HTML:Element:cite>)

WA1 continues:

     This is also wrong, because the title and the name are not
     references or citations:

         <p>My favourite book is <cite>The Reality
         Dysfunction</cite> by <cite>Peter F. Hamilton</cite>.</p>

     This is correct, because even though the source is not quoted,
     it is cited:

         <p>According to <cite>the Wikipedia article on
         HTML</cite>, HTML is defined in formal specifications
         that were developed and published throughout the

This is also incompatible with the Web, again because nobody would want 
"the Wikipedia article on HTML" italicized unless they were emphasizing 
it. On the other hand, if browser developers decided /en masse/ to 
deitalicize <cite> by default, it would have no presentation at all, so 
far fewer people would bother using it.

Further, it is a distinction most authors won't be able to understand. 
For example, which of these paragraphs would be conformant?

     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
     <cite>Peter F. Hamilton</cite>, because Hamilton describes
     wormholes as a way of travelling over long distances.</p>

     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
     <cite>Peter F. Hamilton</cite>, because of Hamilton's
     description of wormholes.</p>

     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
     <cite>Peter F. Hamilton</cite>, because of Hamilton's
     descriptions of various sci-fi ideas.</p>

     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
     <cite>Peter F. Hamilton</cite>, because of Hamilton's
     descriptiveness and imagination.</p>

     <p>I arrived in Boston having read about half of Peter F.
     Hamilton's latest book, <cite>Pandora's Star</cite>. This is a
     nearly 900 page book, part one of the <cite>Commonwealth
     Saga</cite>. I absolutely loved his first saga, the
     <cite>Night's Dawn Trilogy</cite>. So far this book is
     promising to be just as good.</p>

Even if you can carefully make the distinction between the conformant 
and non-conformant examples, most authors will not. It is not 
plausible, for example, that an author will realize "oh, I'm no longer 
actually mentioning any of Hamilton's ideas from that *particular* 
book, I'd better remove the invisible <cite> element around its title".

I think a more compatible and visually obvious (if less semantically 
obvious) definition of <cite> is marking up the name of a work: a book, 
film, exhibition, game, etc.
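
Under that definition (a sketch only, not proposed spec text), the 
earlier example would mark up the title and leave the author's name 
alone:

     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
     Peter F. Hamilton.</p>

The default italic rendering would then fall only on the title of the 
work, which is where readers already expect italics.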

To close on a minor point: en-GB-hixie notwithstanding, it's 
"preceded", not "preceeded". :-)

Matthew Paul Thomas

Received on Friday, 5 January 2007 22:56:57 UTC