Re: updated cite definition - please review

From: Smylers <Smylers@stripey.com>
Date: Fri, 23 Aug 2013 17:18:43 +0100
To: HTMLWG WG <public-html@w3.org>
Message-ID: <20130823161843.GF2173@stripey.com>

Steve Faulkner writes:

> I have made changes to the definition of the cite element in the HTML
> 5.1 editors draft
> new:
> http://www.w3.org/html/wg/drafts/html/master/text-level-semantics.html#the-cite-element
> old:
> http://www.w3.org/TR/html51/text-level-semantics.html#the-cite-element

The clarification of <cite> in HTML5 (versus HTML4) made sense to me;
this change looks to largely undo that, and also disallow some uses
which were previously allowed.

I do not understand the purpose of an element with such a vague
definition. There doesn't seem to be any semantic which can usefully be
conveyed by knowing that a piece of text may be any one of a number of
things relating to the source of quoted text.

Please can you give an example of why using <cite> in this way is
useful? For instance, in the example added to the spec, what is the
advantage in having Charles Bukowski's name marked up (as opposed to
leaving it with no additional markup, like the other words in the

And how would you expect Charles Bukowski's name to be rendered, to
indicate to the reader that it's information relating to the source of
the nearby quote? I'm mostly used to names such as this not being
denoted in any specific way.

Whereas titles of works often are denoted somehow — perhaps in italics,
or with single or double quote marks around them, or underlined.
Obviously such titles need some mark-up to indicate that they are
titles, so that this can be conveyed to users.

For instance The Boo Radleys have a track called ‘Charles Bukowski Is
Dead’, which somebody quoting from the lyrics and writing about the
album it's on may well abbreviate to ‘Charles Bukowski’ on subsequent
mentions. They may also refer to the person Charles Bukowski, explaining
who he is. It's possible for the reader to distinguish between the track
and the person because the track will be rendered in a way which conveys
it's the title of a work.
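
As a rough sketch of that scenario under the HTML5 definition (the
sentence wording here is made up for illustration; only the track title
and the person's name come from the example above):

```html
<!-- The track title is marked up as the title of a work. -->
<p>The Boo Radleys have a track called
<cite>Charles Bukowski Is Dead</cite>, shortened to
<cite>Charles Bukowski</cite> from here on.</p>

<!-- The person's name gets no additional markup. -->
<p>Charles Bukowski, the person, is who the track is named after.</p>
```

A user agent that italicizes <cite> would render the two uses of the
name differently, without any site-specific CSS being involved.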

So I completely understand the reason for an element with the HTML5
definition of <cite>.

What I don't understand is why it makes sense for that element to be the
same element which is also used to mark up things relating to sources of
quotations which aren't titles. Please can you explain why it makes
sense to have a single element covering all of these uses?

Unless you want everything relating to sources of quoted text to be
denoted like titles are, in practice as an author you're going to need
to distinguish titles of works from the other uses. This could of course
be done with classes and CSS which hooks off those, but that then makes
the semantic site-specific rather than part of the HTML standard.

In particular, this means that without CSS a user doesn't have the
intended semantic of the document conveyed to them. This is obviously
suboptimal compared with a situation where user agents know which bits
of text are titles (so they can be conveyed appropriately) and which

It seems like a layering violation for CSS to be required to glean the
semantics of a document.
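
To make the class-and-CSS approach concrete, here is a minimal sketch,
assuming the proposed broader definition of <cite>. The class names
work-title and source-name are hypothetical, picked only for this
illustration:

```html
<style>
  /* Hypothetical site-specific classes, not part of any spec. */
  cite { font-style: normal; }            /* undo the browser's default italics */
  cite.work-title { font-style: italic; } /* only titles of works get italics */
</style>

<p><cite class="source-name">Charles Bukowski</cite> is the person;
<cite class="work-title">Charles Bukowski Is Dead</cite> is the track.</p>
```

Without that stylesheet both elements would presumably get the same
default rendering, so the distinction between title and person lives
only in the site's own CSS.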

And in practice, existing browsers do display the contents of <cite>
differently from the surrounding text (italics in graphical browsers, in
a different colour in Lynx). So anybody using <cite> for something other
than the title of a work risks it being conveyed as a title of a work to
users anyway.

Whereas marking up information related to references to sources of
quoted text that isn't a title differently — possibly with a newly
minted element for the purpose, or no mark-up at all, or <span> with a
site-specific class — avoids this issue.

Bruce Lawson writes:

> This change is to restore the HTML 4.01 way.

I'm not sure that it does. It seems more restrictive than HTML4 was.

In particular, the new definition only covers information about “quoted
text”. It no longer allows using <cite> for marking up the title of a
work which is merely being mentioned, not quoted from.

For instance, my ‘Charles Bukowski Is Dead’ scenario above involved the
author quoting from the song's lyrics while writing about the track. If
lyrics from the track were not quoted, then the song wouldn't qualify as
“quoted text”, and using <cite> wouldn't be valid, per its new
definition.

Please can you explain why this makes sense? It seems to me that an
author writing about a work could either quote from it or not, and I
don't understand why the element used to mark up its title depends on

> By removing the ability to cite authors, lots of people have spent a
> good deal of time attempting to find other ways of marking that up,

Why do they need to mark up authors?

> leading to potential code bloat (wrapping blockquotes in <figure>, for
> example, so they can use a <figcaption> instead of <cite> inside /
> next to a <blockquote>).

Even if <cite> can be used for an author of a quotation, it isn't valid
for that to be inside <blockquote>, since <blockquote> is restricted to
the text being quoted:

So if an author wishes to have an element which encloses both a
quotation and its author, an additional element will be required
regardless of whether the author is marked up with <cite>, <span>, <p>,
<figcaption>, or <div>.
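
For illustration, a sketch of such a wrapper using the <figure> pattern
mentioned above. The quotation and the name are placeholders, and the
<figcaption> could be swapped for any of the other elements listed
without removing the need for the outer element:

```html
<figure>
  <blockquote>
    <p>Placeholder for the quoted text.</p>
  </blockquote>
  <!-- The attribution sits outside <blockquote>, so a wrapper is needed
       to group it with the quotation. -->
  <figcaption>Placeholder Name</figcaption>
</figure>
```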

Please can you give an example of using <cite> next to a blockquote
where using <span> instead of <cite> wouldn't suffice?


Received on Friday, 23 August 2013 16:19:12 UTC
