
[whatwg] the cite element

From: Ian Hickson <ian@hixie.ch>
Date: Mon, 3 Aug 2009 11:29:57 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0908031108150.6420@hixie.dreamhostps.com>
On Mon, 27 Jul 2009, Erik Vorhes wrote:
> On Sun, Jul 19, 2009 at 4:58 AM, Ian Hickson<ian at hixie.ch> wrote:
> >>
> >> If <cite> is exclusively for titles, it shouldn't be called <cite>.
> >
> > Sure, but we're about 15 years too late for that.
> 
> Well, no: as far as I have been able to determine, every HTML 
> specification (before HTML5) did not limit this element to titles.

I meant that we're too late to rename the element.


> > In practice, people haven't been confused between these two attributes 
> > as far as we can tell. People who use <cite> seem to use it for 
> > titles, and people who use cite="" seem to use it for URLs. (The 
> > latter is rare.)
> 
> See <http://www.four24.com/>; note near the top of the source: 
> <blockquote id="verse" cite="John 4:24">...

My statement stands, on the aggregate:

On Mon, 27 Jul 2009, Philip Taylor wrote:
> 
> See http://philip.html5.org/data/cite-attribute-values.txt for some 
> data. (Looks like non-URI values are quite rare.)

While we're at it, Philip had other data:

> Also maybe relevant: see http://philip.html5.org/data/cite.txt for some 
> older data about <cite>. (Looks like non-title uses are very common.)

This seems to support my point that <cite> is used for a whole variety of 
purposes, like <em>, <i>, <q>, HTML4's <cite>, and HTML5's <cite>. Very 
few, actually far fewer than I had remembered from my last look at the 
data, are names of people, citations or otherwise.


On Mon, 27 Jul 2009, Erik Vorhes wrote:
>
> > A new element wouldn't work in legacy UAs, so it wouldn't be as 
> > compelling a solution. Also, <cite> is already being used for this 
> > purpose.
> 
> My preference would be for <cite> to retain the flexibility it has in 
> pre-HTML5 specifications, which would include referencing titles.

The flexibility doesn't seem as useful as limiting it to titles. What is 
the problem solved by allowing names to be marked up in the same manner as 
titles? The problem solved by allowing titles specifically to be marked up 
is that titles are usually typographically offset from the surrounding 
text in a distinctive fashion. This doesn't apply to names. Reusing the 
same element for both encourages authors to use <cite> for both which 
makes it harder for them to get the right typographic effect, leading to a 
lower quality of typography overall. I think this is a bad thing.


> If backwards compatibility is that big a concern, why does HTML5 use 
> <legend> outside of <fieldset> elements?

Because inventing a new element in that particular case turns out to be 
non-trivial (pretty much every synonym for "caption" is already used by 
some HTML element), and we can afford to wait to get <figure> done.


> And if the definition of new elements is such a concern, why introduce
> *any* new elements? (Please forgive the snark.)

There were no existing elements that could be reused for many of the new 
semantics. When there were, we used them (e.g. <i>, <b>, <cite>, <menu>, 
<legend>, <h1>).


> > What is the pressing need for an element for citations, which would 
> > require that we overload <cite> with two uses?
> 
> A title can be a citation, but not all citations are titles. What's the 
> pressing need for limiting <cite> only to titles?

As described above, the need to have an element for titles is that there 
are typographic conventions that apply to titles. What is the pressing 
need for an element for citations, which would require that we overload 
<cite> with two uses?


> >> I understand HTML5's attempts to provide semantic value to such 
> >> elements as <i>, <b>, and <small>. To remove semantic value at the 
> >> same time is completely asinine.
> >
> > If <cite>'s original meaning has value, that is true; what is its 
> > value?
> 
> I would assume that this would be obvious. <cite> both denotes and 
> connotes "citation."

But why does that have value? How would you use this information?


> >> > Note that HTML5 now has a more detailed way of marking up 
> >> > citations, using the Bibtex vocabulary. I think this removes the 
> >> > need for using the <cite> element in the manner you describe.
> >>
> >> Since this is supposed to be the case, why shouldn't HTML5 just ditch 
> >> <cite> altogether? (Aside from "backward compatibility," which is 
> >> beside the point of the question.)
> >
> > Backwards compatibility (with legacy documents, which use it to mean 
> > "title of work") is the main reason.
> 
> I'd beg to differ, regarding "legacy documents." See, for example, the 
> automated citation generation at Wikipedia: 
> http://en.wikipedia.org/wiki/Wikipedia:Citation_templates

What specifically am I looking for here? This doesn't seem to have any 
relevance to HTML.


> In addition, the comments at zeldman.com use <cite> to reference authors 
> of comments. While that specific example is younger than HTML5, this is 
> merely an example of a relatively common use-case for <cite> that does 
> not use it to signify "title of work."

As I said, the most common use of <cite> is to mark up italics. I agree 
entirely that it's misused.


> >> There is no reason at all why it can't be defined as "citing whom".
> >
> > The main reason would be that there doesn't appear to be a useful 
> > purpose to doing that.
> 
> The above references suggest otherwise. There are plenty of instances 
> where one would want to cite people rather than just a "title of work"; 
> blog commenters are only the most obvious example.

Blog commenters don't need to be marked up any differently than the number 
of the comment -- that's a stylistic issue that varies from blog to blog. 
I don't see the need for an element specifically for people commenting on 
blogs. In most blogs that I've seen, the name isn't even highlighted in 
any particular fashion.


> Existing tools that treat <cite> exclusively as "title of work" do so 
> against every HTML specification out there (i.e., HTML 4.01 and 
> earlier).

Existing tools generally have had very few problems in finding ways to do 
things against every HTML specification out there. Over 90% of all content 
on the Web is syntactically invalid in some way, and I'm sure that more 
than 10% of content on the Web is generated by tools.


> >> While the HTML 4.01 specification is hardly perfect, I don't see the 
> >> value in limiting the semantic potential of the <cite> element in 
> >> HTML5.
> >
> > As far as I can tell, increasing it from citations to titles of works 
> > is actually increasing its semantic potential, not limiting it.
> 
> Well, no. It's making it more exclusive. Defining <cite> as "title of 
> work" increases its specificity, but limits its semantic potential. As I 
> noted before, all titles are citations, but not all citations are 
> titles. By defining <cite> as an element that identifies a "citation" 
> you allow for "title of work" while not excluding other justifiable uses 
> of this element, e.g., "cited person."

Not all titles are citations, actually. For example, I've heard of the 
/Pirates of Penzance/, but I'm not citing it, just mentioning it in 
passing.


> > Indeed, there is a lot of misuse of the element -- as alternatives for
> > <q>, <i>, <em>, and HTML5's meaning of <cite>, in particular.
> >
> > Expanding it to cover the meanings of <q>, <i>, and <em> doesn't seem as
> > useful as expanding it just to cover works.
> 
> I believe you mean "limiting it just to cover works" here.

I meant expanding it, since not all titles of works are citations.


> > I think it's clear that people want to use <cite> for things other 
> > than citations, and in fact do use it that way widely. If we're 
> > increasing it past just citations, then there seems to be clear value 
> > to using it to mark up titles; there doesn't seem to be much value in 
> > marking up titles and just any names (they're styled differently in 
> > practice), and marking up any title but only names of people who have 
> > been quoted is just weird.
> 
> Then why not allow <cite> to be defined as "citation" rather than a
> subset of citation ("title of work")?

I disagree that "title of work" is a subset of "citation".

The reasons to limit it to "title of work" instead of allowing it to also 
cover non-titles are discussed at the top of this e-mail.


> Your reference to names and titles, e.g., being "styled differently in 
> practice" is a red herring. Sure, names and titles are rarely styled the 
> same, but not all titles are styled the same way, either.

As a first approximation, titles are italics, and names are not. I think 
that's a far closer approximation of typographical conventions than 
lumping titles and names together into one default style.


I haven't changed the spec. I continue to hold the position that covering 
titles of works is more useful than covering titles of works and names of 
people, and more useful than covering only names of people or works that 
are explicitly cited.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Monday, 3 August 2009 04:29:57 UTC
