[whatwg] the cite element

Dear Ian,

Here are a few more thoughts regarding the definition of <cite> in HTML5.


On Thu, Aug 27, 2009 at 7:08 PM, Ian Hickson <ian at hixie.ch> wrote:
>
>> Earlier, when justifying why you changed the definition of <cite> from
>> HTML 4.01, you said:
>>
>> > I don't think it makes sense to use the <cite> element to refer to
>> > people, because typographically people aren't generally marked up
>> > anyway. I don't really see how you'd use it to refer to untitled
>> > works.
>>
>> This usage is an example of when people are typographically marked up.
>
> It's a minor case. The semantic here wouldn't be "name of person", it
> would be "name of person when immediately following a quote in a
> pullquote", which is far too specific to deserve a whole element.
>

I don't think anyone is arguing that there should be a new element
exclusively for the above use or that <cite> should be limited only to
that definition ("name of person when immediately following a quote in
a pullquote" or the more forgiving "person to whom the quote is
attributed"). Still, it would be nice to be able to use <cite> to mark
up people being cited (along with other citations that don't
explicitly involve a work's title).



> ... more importantly, the element's style is made
> non-italics, thus completely defeating the entire point of marking up the
> element in the first place.

I'm not sure this is a reasonable argument against the use of <cite>.
Following this line of reasoning, it is not worthwhile to mark up
titles of works if they are *not* to be italicized; moreover, it is
even pointless to mark up headings using <h1>-<h6> if you intend to
remove the bold styling.

The counter to this approach is that <h1>-<h6> provide semantic value
even when styled differently from the default. But the same can be
said for <cite>, whether it is defined as "title of work" or as a more
general "citation."



> When examining pages, you have to first pick a random sample, then study
> those, because otherwise you get sampling bias. With a trillion pages on
> the Web, it's easy to find thousands of examples of any particular use of
> HTML elements; the question is what is the most useful definition, not
> what is used at all.

Because you believe "title of work" to be the most useful definition,
does that mean that you would reject even a majority use of <cite> for
marking up citations that aren't exclusively titles?

There are plenty of examples of authors using <cite> to mark up the
following (among other things):

- titles of works
- full citations
- names and other sources of quote attribution (leaving aside
placement relative to the quote)
- names of blog post commenters and authors (in the context of their
comments, posts, etc.)
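
As a rough sketch, the four uses listed above might look something
like the following. The people, the quotation, and the "comment" class
name are invented for illustration and are not drawn from any real
page; the essay title is the one used as an example later in this
message.

```html
<!-- 1. Title of a work -->
<p>I recently reread <cite>The Freedom to Offend</cite>.</p>

<!-- 2. Full citation (hypothetical author, journal, and pages) -->
<p><cite>Jane Doe, "An Example Essay," Example Review 12 (2009):
34-56.</cite></p>

<!-- 3. Source of a quote attribution (hypothetical person) -->
<blockquote>
  <p>An invented quotation, for illustration only.</p>
</blockquote>
<p>Attributed to <cite>Jane Doe</cite></p>

<!-- 4. Blog comment author, in the context of the comment -->
<div class="comment">
  <p><cite>John Roe</cite> wrote:</p>
  <p>Nice post.</p>
</div>
```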

Even if titles are by far the most common use case, it doesn't make
sense to exclude other semantically justifiable uses of the <cite>
element that appear to be valid, at least according to the English
language usages associated with the word "cite."

Put another way, if you had no prior knowledge of the current HTML5
definition of <cite> (and perhaps any other specification's definition
of the element), what would seem to be logical and appropriate uses of
the element?


>> By changing the definition of <cite> in HTML5, you are saying that numerous
>> users of the HTML4 definition of <cite> are no longer conforming, and not
>> really giving any alternative that does the same job.
>
> <span> does the job fine, in the rare cases where someone really wants to
> mark up someone's name.

Unless there is some semantic value to the name being more than "just"
a name, yes.


>> In the absence of that, having <cite> mean simply a source being cited,
>> and allowing the author to determine whether they want to use it for
>> titles of works, authors, or entire citations, seems to be both
>> reasonable and compatible with existing content.
>
> I think having it mean "title of work" only is more useful. Having it mean
> all three will mislead authors into using it for all three, and then cause
> them undue pain as they work around the default styling.

I'm not sure I buy the "undue pain" argument, especially since there
are plenty of times authors may wish to deviate from the default
italic style of <cite> (using either "title of work" or "citation" as
the definition):

- A normally italicized title that is in a block of text that is also
italicized (in which case the usual practice would be to remove
italics from <cite>)
- A title of a work that according to a style guide should not be
italicized (in which case a class value would probably be added to the
<cite> element, such as "<cite class="essay">The Freedom to
Offend</cite>").
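
A minimal sketch of how little work these two overrides involve,
assuming the browser's default italic rendering of <cite>. The
selectors and the sample sentence are hypothetical; only the "essay"
class comes from the example above.

```html
<style>
  /* Case 1: inside already-italic text, set the title upright
     so that it still stands out from its surroundings. */
  em cite, i cite { font-style: normal; }

  /* Case 2: assumed house style in which essay titles
     are not italicized. */
  cite.essay { font-style: normal; }
</style>

<p><em>I reread <cite>An Example Title</cite> last week.</em></p>
<p>See <cite class="essay">The Freedom to Offend</cite>.</p>
```

In both cases the element keeps its meaning; only its presentation
changes.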

Moreover, what kinds of difficulties do you foresee? Nested <cite>
elements? I don't think this would be any more of a challenge than
nested lists, <strong> in bolded text, or <em> in italicized text, in
terms of dealing with default styles.


> People are actively overriding the styles <cite> because they think it's
> the right element, but it has the wrong effect. I don't know what more
> harm we could be causing here. The element is failing at its only purpose,
> because people think they're being Semantically Right.

I'm not sure I understand your reasoning here. People who are using
<cite> according to the HTML 4.01 specification are wrong for doing
so? Are you retroactively finding fault because you have redefined
<cite> in the HTML5 specification? Or is there some other line of
reasoning?

The definition of <cite> in HTML 4.01 is open to multiple
interpretations. All it says is "Contains a citation or a reference to
other sources."[1] The examples provided make clear, though, that all
the use cases I mention above (titles of works, full citations, names
and other sources of quote attribution, and names of blog post
commenters and authors) would be acceptable according to that
specification.

And as Jeremy Keith and others have pointed out, there's nothing wrong
with overriding default presentational styles. I'm not sure why it
should be such a cause for concern with <cite>.



> A random sample of the Web would need to show more uses of this than uses
> of other things.
>

So it looks like there won't be a specification change, since only the
single most common use case for an element is worth allowing?


> How few sites using <cite> for people's names would it take to convince
> you that it _wasn't_ a common case?

I'd like to think that I'm willing to be reasonable about this. But
from this humble author's perspective, the answer is "a lot fewer than
there are now," as I (and others) find <cite> to be a useful semantic
tool for more than just "title of work."

And since I'm in a slightly combative mood, let's turn the question around:

How many sites using <cite> for people's names (or other reasonable
uses that deviate from "title of work") would it take to convince you
that it _was_ a common case?


>
> On Mon, 17 Aug 2009, Brian Campbell wrote:
>> >
>> > What is the problem solved by marking up people's names?
>> >
>> > Why is this:
>> >
>> >   <p>I live with <name>Brett</name> and <name>Damian</name>.</p>
>> >
>> > ...better than this?:
>> >
>> >   <p>I live with Brett and Damian.</p>
>>
>> Has anyone claimed that the <cite> element should be used in such a case?
>
> Yes.

Who? If you mean me, we had this back-and-forth already, and I don't
believe those are reasonable (or valid) uses of marking up people's
names with <cite>. I've argued that <cite> should be used for marking
up citations; sometimes those are people's names, but that doesn't
mean that every name should be marked up with <cite>.

In your arguments against using <cite> to mark up non-title citations
(including people who are cited), you've shown an amazing level of
contempt for the intellectual capabilities of document authors. It
should be relatively obvious when a person's name should be wrapped in
a <cite> element and when it shouldn't be; and since <cite> doesn't
particularly affect accessibility or functional concerns, I don't see
why this isn't a case where authors should be given greater latitude
to use their discretion.

And if misuse or absurd over-complication remain a concern, I would be
more than happy to provide examples that clarify the situation.


>> The only usage I've seen offered is that the <cite> element may be used
>> to mark up a persons name when that person is the source of a quotation;
>> as in, when you are citing that person (hence, the term "cite").
>
> People have argued that merely mentioning something is a citation.

Again, who has argued this? If you mean me, we've had this
conversation before. I argued that <cite> should be used for
citations. That includes the following use cases (and possibly others
that I'm not remembering or aware of):

- titles of works
- full citations
- names and other sources of quote attribution (leaving aside
placement relative to the quote)
- names of blog post commenters and authors (in the context of their
comments, posts, etc.)


>> Should only the majority usage ever be allowed?
>
> That is a concern, yes. Another is what is most useful for authors.
>

Several authors, myself included, have expressed concern that defining
<cite> as "title of work" is not as useful as allowing it for the
broader use cases that I have detailed above.


>> Or if there is another usage, that is somewhat less common, but is still
>> logically consistent, usefully takes advantage of fallback styling in
>> the absence of CSS, and meets the English language definition of the
>> term, should that also be allowed?
>
> Whether it is useful, what problems it solves, and how it works in
> existing implementations are more important concerns than all the ones you
> listed, IMHO.

I want to be clear: I believe I understand why you have chosen to
define <cite> as it appears in the current draft of the HTML5
specification; I just happen to believe that the current definition is
not as useful as it could be and (more importantly) invalidates
current reasonable uses of the element.

Allowing <cite> to encompass the uses I have detailed above would be
useful, would allow <cite> to provide semantic value (and not just a
styling hook that could just as easily be provided by something like
<i class="title">), and would work perfectly well in all extant
browser implementations.


Sincerely,
Erik Vorhes



[1] http://www.w3.org/TR/html401/struct/text.html#h-9.2.1

Received on Tuesday, 15 September 2009 08:50:20 UTC