[whatwg] the cite element

On Tue, Sep 22, 2009 at 8:46 PM, Ian Hickson <ian at hixie.ch> wrote:
> On Wed, 16 Sep 2009, Erik Vorhes wrote:
>> On Wed, Sep 16, 2009 at 4:16 AM, Ian Hickson <ian at hixie.ch> wrote:
>> >> Unless there is some semantic value to the name being more than
>> >> "just" a name, yes.
>> > Is there?
>> Yes
> What is it?

<cite> points to a primary source of the statement, as opposed to
someone merely named by the statement.

>> and with the removal of the <dialog> element (of which I was unaware
>> when I sent my last message) makes a compelling case for the
>> re-expansion of <cite> for dialog.

> Why?

Dialogues, transcripts, credits, and theatrical scripts are all
arguably too fine-grained for a "citation", as opposed to a "label" or
"attribution", but they are certainly real use cases where the
attribution is important.

These three are even cases where print sources will typically shift
font in some way between the attribution (<b>Mephistopheles</b>) and
the actual statement, though not always in the same manner.  The
three that I found first were laid out like this:

  Indented lines, said
  or sung aloud.

<i>Name.</i>  Statement begins here.

Q.    Attorney's question.
A.    Witness answers.
Q.    Attorney's next question.
A.    Next response.

>> On October 31, 2006, Michael Fortin suggested the following pattern:
>> <p><cite>Me:</cite> <q>Can I say something?</q>

>> Which Jeremy Keith also recommends. [1]
>> Aside from the current definition of <cite>, I think this would be a
>> good use of the element, since it makes more sense than <b> or <span>
>> (what do those signify in this context?) and there's nothing wrong with
>> an italicized name in this context. Moreover, there are examples of
>> Fortin/Keith's usage in the wild.

> I don't understand why we need an element here at all, and I don't
> understand why we would want to reuse <cite>, of all elements, if we did
> in fact need one.

That "Me:" isn't pronounced; it is metadata so important that it gets
written (in an odd style) in printed form.  The punctuation (followed
by a new sentence, complete with initial capitals) is the closest a
typewriter can come to markup, and scripts will typically make the
difference more emphatic.

I'll agree that it seems odd to have that many <cite> elements in such
close proximity, but it is the closest match I can find in the spec,
and it doesn't seem to be actually wrong.  Searching for lines by a
particular character is a fairly common use case.
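
As a rough sketch of that use case (not a proposal for any particular
tool): assuming a transcript marked up with the
<p><cite>Name:</cite> <q>...</q></p> pattern quoted above, a few lines
of Python using only the standard library can pull out one speaker's
lines.  The names SpeakerLines, lines_by and SAMPLE are made up for
illustration.

```python
from html.parser import HTMLParser


class SpeakerLines(HTMLParser):
    """Collect (speaker, line) pairs from <cite>Name:</cite> <q>...</q> markup."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # finished (speaker, line) tuples
        self._current = None   # "cite" or "q" while inside one of those elements
        self._speaker = ""
        self._line = ""

    def handle_starttag(self, tag, attrs):
        if tag == "cite":
            self._current = "cite"
            self._speaker = ""
        elif tag == "q":
            self._current = "q"
            self._line = ""

    def handle_data(self, data):
        if self._current == "cite":
            self._speaker += data
        elif self._current == "q":
            self._line += data

    def handle_endtag(self, tag):
        if tag == "q" and self._current == "q":
            # Drop the trailing colon that the "Me:" label carries.
            speaker = self._speaker.strip().rstrip(":")
            self.pairs.append((speaker, self._line.strip()))
        if tag in ("cite", "q"):
            self._current = None


def lines_by(speaker, html):
    """Return every quoted line attributed to `speaker`, in document order."""
    parser = SpeakerLines()
    parser.feed(html)
    parser.close()
    return [line for who, line in parser.pairs if who == speaker]


# Hypothetical transcript; only the first line comes from the thread.
SAMPLE = """
<p><cite>Me:</cite> <q>Can I say something?</q></p>
<p><cite>You:</cite> <q>Go ahead.</q></p>
<p><cite>Me:</cite> <q>Thanks.</q></p>
"""

print(lines_by("Me", SAMPLE))  # ['Can I say something?', 'Thanks.']
```

The same search against <b> or <span> would need a class name or some
other convention to tell speaker labels apart from ordinary bold text.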

>> > ...  How do
>> > you define "citation"? What problem does it solve?

>>  <cite> should be allowed for markup in the following instances:

>> - titles of works
>> - full citations
>> - names and other sources of quote attribution (including identifying
>> speakers in dialog)
>> - names of blog post commenters and authors (in the context of their
>> comments, posts, etc.)

> That seems like a really strange and eclectic variety of uses.

All boil down to "says who?".  A title of a work indicates something
about when they said it, and how (formally enough to have a title),
but ... so does a hyperlink to the author.

> For example, it seems odd to say that in the following, the third <cite>
> is non-conforming, but the other two are fine:
>   <article>
>    <footer>Comment by <cite>John Adams</cite></footer>
>    <p>I think that the following comment by <cite>Fred Fox</cite> is
>    right:</p>
>    <blockquote>
>     <p>Tomatoes are juicy.</p>
>    </blockquote>
>    <p>However, I like to visit <cite>Ian</cite> and he does not like them
>    at all.</p>
>   </article>

Please do some hallway testing on this.  Ask half a dozen people what
they think of this markup.  If you have to prompt, ask about the use
of cite in particular.

I'm guessing that most won't even really notice the cites to John
Adams or Fred Fox, but almost all will wonder about the cite to Ian.

The difference is that John Adams and Fred Fox were the ones saying
something -- the cite was attributing something to them.  They were
"actors" as opposed to "objects" in the linguistic sense.  Ian was
simply an "object" (a direct object, in this case) that happens to be

> It seems like it would be better to not have any elements for the
> bottom three definitions you list, or to introduce a new element for those
> that have use cases. However, no compelling use cases have been mentioned
> as far as I am aware.

Are you seriously saying that there is no need to attribute to "names
and other sources of quote attribution (including identifying speakers
in dialog)", or to mark up "names of blog post commenters and authors
(in the context of their comments, posts, etc.)"?


I haven't yet seen a forum that didn't style usernames of the
commentators differently (generally either bold or as a link, rather
than italics, but still differently).

Nor have I yet seen a script (or published play) that didn't use some
styling variation to distinguish the character names from their words.
 (Usually -- but not quite always -- I see additional variations to
indicate character actions, and generic stage directions such as scene

> On Mon, 21 Sep 2009, Erik Vorhes wrote:
>> I feel here that you're stretching the definition of "title of work"
>> beyond its usefulness. If we can use aliases within <cite>, great, but
>> that seems to make more apparent the usefulness of having <cite> be
>> for more than just "title of work."

> There's two uses that I know of: making titles of works italics by
> default, and making it easier to change that style.

The original purpose of a citation was so that readers could, if they
wished, go back to the original.  That is much easier when the
original is only a click away, and so even more important.

> On Wed, 16 Sep 2009, Jim Jewett wrote:
>> ...  If you have to look it up, then only careful people will
>> use it properly.  (On the other hand, if there is any HTML element whose
>> users are likely to be extra careful, cite is a strong candidate.)

> I think you're right to be pessimistic, but ...
> A simple definition like "titles of works" helps.

The source of the information, such as the name of the person or the
title of the book being quoted.

>> My own interpretation of (a fraction of)
>> http://philip.html5.org/data/cite.txt did not support narrowing the
>> definition only to titles.  For example

>> (1)  Examples of citing a person, arguably the creator.

>> (1a)  http://www.hiddenmickeys.org/Movies/MaryPoppins.html

>> The cite element is used to give credit to the person who
>> found/verified each "Hidden Mickey":
>>     <CITE>REPORTED: <A HREF="mailto:...">Beverly O'Dell</A> 12 MAR 98</CITE>
>>     <CITE>UPDATE: Greg Bevier 29 JUL 98</CITE>

> I don't think that's a usage anyone is actually arguing for though, is it?

Yes, I do think so.  The person in the cite element is the source of
the information.  This is similar to using cite for the author of a
comment at a blog.

>> (1b)  http://www.webporter.com -- they give the author of the article.
>>  But it looks like they (at least sometimes) include the title as
>> well, which fits under full citation.

> Right, this is the "full citation" feature. Notice their stylings, though:
> they are overriding the default font styles, and instead treating the
> whole thing as a block-level element. They would be better off using <p>
> with a class, or having us introduce a block-level element like <credit>
> or <dc> (which we might add to <figure>).

I agree that they would be better off with a <credit> element.  I also
believe that <credit> would be better for some of the use cases that
seem to be contentious, like blog-comments-author.  (1a, 1c, and 1d
would also be better off with <credit>, in my opinion.)  An <attrib>
element might be better still, as that would also work sensibly in

But <cite> is clearly the best option unless/until the more
specialized <credit> (or attrib) is added.

> This would be what <figure> would let them do, especially if we added <dc>
> for credits:
>  <figure class="photo">
>   <dd><img src="images/ksquirt2_39-10_02.jpg" alt="Kelly goofing around.">
>   <dt>Paddler: Kelly McCauley
>   <dc>Photo: April McCauley, 2001
>  </figure>

dc?  diagram credits?

Can there be more than one dt/dd pair in a figure?  How do these
associate with dc?

Can there be more than one dc for the same dd?  (author and
illustrator?  two co-authors?)

<credit> is a fine element, whether block or inline.  <dc> seems like
another layer or two of workaround.

>> (2)  Several uses -- and several *non-uses* for titles from
>> http://www.growndodo.com/wordplay/oulipo/
>> The page begins with carefully attributed blockquotes.  These are
>> *not* done with cite, presumably because it didn't seem flexible
>> enough.  Instead, it was marked up as
>>     <p class="quote">...
>>     <p class="citation">
>>       <span class="citationauthor">Fran&ccedil;ois Le Lionnais</span>,
>>       <span class="citationsource">Lipo: First Manifesto</span></p>
>> Within the text, <cite> was used to point to source materials, but
>> there didn't seem to be anything quoted; in most cases the texts were
>> used as example objects of study; if they actually need a title
>> markup, then so does the specific Viking ship in Leif's example.
>> Sample usage:   <cite>S + 7</cite> (substrata (&quot;novelette&quot; +
>> 7) does appear to be a title.
>> At the end of the page, there is a further readings section.
>>     <dt>author<cite>title</cite>publisher</dt> is used for printed
>> reference books
>> but
>>     <p class="linklist"><a href ...> is used for equivalent references
>> on the web,
>> and cite is also used to name the professor of a course
>>     <cite>4-5 units, <a
>> href="http://www.centerforbookculture.org/dalkey/bio_gsorrentino.html">Sorrentino</a></cite>

> That page seems pretty close to what HTML5 specifies now, though it's not
> fully consistent, as you say.

That almost sounds as though the real specification were:

   "Book Title, even if you aren't quoting or
    paraphrasing anything -- this isn't really about
    citations; we just call it cite for historical reasons."

I'm trying to imagine keeping a straight face as I say that books get
special markup because their names often need to be italicized, but
this doesn't apply to ships, because, well, ships aren't written down.
 And whether to <cite>The Gettysburg Address</cite> sort of depends on
how you want it styled.


Received on Tuesday, 22 September 2009 19:28:38 UTC