[whatwg] <cite>, <q cite="">, and <blockquote cite="">

I studied the feedback (most of which is contained below) regarding the 
<cite> element and the cite="" attribute on <q> and <blockquote>.

Executive summary:

Though I would likely not have thought the <cite> element worthy of 
inclusion in HTML if it were being proposed today, evidence suggests that 
it has found a high number of fans, relative to similar elements [1]. The 
biggest difficulty with it has been questions regarding its use, for 
instance whether to mark entire citations with it or just the titles in 
citations, whether any title of a work can be marked with a <cite> 
element, and so forth. To this end I have made the specification of the 
element very explicit, and have allowed any reference to any work to be 
marked with <cite>, even fleeting references. I have also clarified that 
the <cite> element should just be used to mark up titles, not entire 
citations. Finally, I have clarified that people's names should not be 
marked using <cite>. While these decisions may not be popular with 
everyone, I believe they strike a balanced compromise between the various 
proposals, and get us the most practical benefit from the element.
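
As a rough illustration of these distinctions (the sentences are made up 
for this example, including the act and scene reference; only the way the 
element is applied reflects the rules described above):

```html
<!-- A title of a work: fine, even as a passing mention. -->
<p>My favourite book is <cite>Eric Meyer on CSS</cite>.</p>

<!-- Only the title is marked up, not the whole citation. -->
<p><cite>Romeo and Juliet</cite>, William Shakespeare, Act 2, Scene 2.</p>

<!-- A person's name is not a title, so it gets no <cite>. -->
<p>fantasai is getting really busy with the alternate style sheet
switcher.</p>
```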

The situation with the cite="" attributes is less clear cut. It's a rarely 
used attribute [2], but when it's used, anecdotal evidence suggests it 
tends to be used correctly. It's also been used by some people to hook 
scripts and styles to, and in those cases, seems to be done pretty much 
correctly. [3] [4] Not wanting to look a gift horse in the mouth here, 
I've basically left cite="" unchanged, though we'll have to add examples.
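
Pending real examples in the spec, a minimal sketch of the kind of usage 
meant here might look like this. The example.com URL and the quoted text 
are placeholders, not taken from any real page:

```html
<!-- Block quotation whose source is given as a URL. -->
<blockquote cite="http://www.example.com/works/iv">
  <p>rhubarb rhubarb rhubarb</p>
</blockquote>

<!-- Inline quotation whose source is a book identified by ISBN. -->
<p>The author adds that
<q cite="urn:isbn:0671892584">the foobar is doubled</q>.</p>
```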

[1] twice as many as <dfn> total on a per-element basis across a few 
billion pages studied last year, seen on four times more pages.
[2] e.g. under 0.085% of pages had <q cite> in a 2005 study of a billion 
pages; no uses of cite="" on <q> or <blockquote> at all in this study:
http://canvex.lazyilluminati.com/survey/2007-07-17/analyse.cgi/attr/cite
[3] http://home.tampabay.rr.com/bmerkey/examples/cite-quotes.html
[4] http://www.sitepoint.com/article/structural-markup-javascript


On Tue, 9 Nov 2004, fantasai wrote:
> Ian Hickson wrote:
> > On Sat, 28 Aug 2004, Anne van Kesteren wrote:
> > 
> > > Having an element NAME would be very useful.
> > 
> > Could you expand on this? For things like films, books, etc, I was 
> > thinking of "clarifying" the definition of <cite>, but I haven't 
> > really thought of that much yet.
> 
> I don't know what exactly Anne was referring to, but from going through 
> an old type-composition book, I found the need for
> 
>   - title of long work (usually put in italics)
>   - title of short work (usually put in quotes)
>
> There are a lot of sub-types of "title of work", but you can, at least 
> in English, sort them into the categories "long" and "short" and this is 
> sufficient for most typesetting practice. I haven't yet looked into 
> typesetting practices for other locales.

Both of these are now <cite>, possibly with classes to distinguish the 
exact types of titles.


>   - name of important/holy person (sometimes put in small-caps)

<span> with hCard, or possibly <b> if it's a key word (e.g. in a gossip 
column).


>   - instance of (technical) term

<i>.


> Someone suggested co-opting <abbr> to handle smileys.
> I also think we need an element for expressing things like *runs away*.

What's wrong with no markup at all for both of those?


On Wed, 17 Nov 2004, Matthew Thomas wrote:
> On 15 Nov, 2004, at 12:58 PM, Laurens Holst wrote:
> > Matthew Thomas wrote:
> > ...
> > > You mean posts by citation
> > > <http://diveintomark.org/archives/2002/12/27/pushing_the_envelope>. I hope
> > > "Hixie said I was using [<cite>] correctly"
> > > <http://diveintomark.org/archives/2003/01/19/influences> was an over-broad
> > > interpretation of Ian's words, because (a) Ian has mentioned "'clarifying'
> > > the definition of <cite>"
> > > <http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2004-November/
> > > 002329.html>, and (b) while Mark's uses of <cite> matched the example
> > > given in the HTML 4.01 spec
> > > <http://www.w3.org/TR/REC-html40/struct/text.html#edef-CITE>, they did not
> > > match the default presentation in all visual UAs, nor the resultant use by
> > > most Web authors.
> > > (Specifically, I think the most coherent and backward-compatible
> > > "clarification" would be to restrict <cite> to titles of works, because
> > > inviting authors to use it for names of people as suggested in the HTML
> > > 4.01 example would require authors to override <cite>'s italic-ness
> > > frequently, making them more likely to abandon the element completely.)

That's pretty much what HTML5 now says.


> > Actually, in the cases where I used cite for that purpose, italics was
> > exactly what I intended them to be rendered like.
> > 
> > Example:
> > "<p>On a side note, it seems that <cite>fantasai</cite> is getting
> > really busy with the alternate style sheet switcher (at least I'm
> > seeing a fair lot of activity on the bugs involved), so hopefully by
> > the time Firefox 1.0 gets released it will be back in. And perhaps we
> > will even see persistent style switching, though I wouldn't count on
> > it.</p>"
> > ...

This is now explicitly wrong.


> If you really want italics there, with all due respect, you're strange. 
> Occasionally gossip columns have the equivalent of .name {font-weight: 
> bold;}, but otherwise the vast majority of publishers don't give 
> people's names in-line any special styling at all. Even 
> <http://diveintomark.org/css/squares-summer.css> has "cite {font-style: 
> normal;}" to reverse UAs' default italic, but people won't be bothered 
> adding that to their site's style sheets if it's easier just to not use 
> <cite> in the first place.
>
> So if <cite> is "clarified" to include names of authors, we'll see the 
> first two phenomena I described a week ago 
> <http://listserver.dreamhost.com/pipermail/whatwg-whatwg.org/2004- 
> November/002344.html>: most people won't bother using it (because it 
> doesn't give them a presentation they want), and those people who do use 
> it will do so overzealously.
> 
> You just provided an excellent example of overzealous use: you wrote 
> "<cite>fantasai</cite>", but that is not conformant, because you used 
> fantasai neither as "a citation" nor as "a reference to another source".

I agree with the above, mostly, and have said names are out of scope.


On Sat, 16 Apr 2005, fantasai wrote:
> > > > Movie titles are similar. I'd like my UA to give me a tooltip 
> > > > containing information from IMDB for every movie title. With user 
> > > > JavaScript I can do this, if there's a way to recognise movie 
> > > > titles.
> > > 
> > > Then would you want different markup for book titles, movie titles, 
> > > play titles, song titles, etc?  Or would you just expect the script 
> > > to search IMDB for anything marked up with <cite>?
> > 
> > Again, I don't really know. I could see a use case for a "type" 
> > attribute (as was suggested earlier in this thread), but that seems 
> > like a slippery slope. Suggestions?
> 
> You will need a type attribute of some kind if you are to handle the 
> different typographic conventions for e.g. books vs. articles. Book 
> titles are italicized: article titles are put in quotes. Parallel 
> distinctions exist for other types of creative works.

The class attribute can be used for these distinctions.
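
A sketch of how that might look, assuming hypothetical class names "book" 
and "article" (nothing in the spec defines these; they are purely an 
author-side convention):

```html
<style>
  /* Long works keep the conventional italics. */
  cite.book { font-style: italic; }

  /* Short works are set upright and wrapped in quotation marks. */
  cite.article { font-style: normal; }
  cite.article::before { content: open-quote; }
  cite.article::after { content: close-quote; }
</style>

<p>Compare <cite class="article">B and I</cite> with
<cite class="book">Eric Meyer on CSS</cite>.</p>
```

Because the class values are private to the page, this only helps with 
styling; other software has no way of knowing what they mean.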


On Sat, 16 Apr 2005, fantasai wrote:
> Lachlan Hunt wrote:
> >   My favourite book is <a href="urn:isbn:0-735-71245-X">Eric Meyer on
> >   CSS</a>.
> 
> There are two major problems (as I see it) with using ISBN for 
> citations.
> 
>   1. You are limiting yourself to referencing a single edition of the work,
>      which may not be appropriate. Shakespeare's plays, for example, are
>      available in many, many different publications. If I want to look up
>      the context for your quote from Romeo and Juliet, in most cases there's
>      no need for me to find the exact same edition that you were using. You
>      can helpfully give me a link to an online version of the text, but
>      that would be extra info.
>
>   2. You cannot cite anything that has not been assigned an ISBN. There are
>      a lot of publications that don't have standardized IDs. Company memos,
>      ancient manuscripts, my 9th grade paper on medieval castles, a cereal
>      box, the bathroom wall, etc.

I haven't mentioned this in the spec.


On Sat, 16 Apr 2005, fantasai wrote:
> > On Sat, 16 Apr 2005, Rob Mientjes wrote:
> > 
> > > > Would it make sense to allow it for books? I don't know. Maybe the 
> > > > <cite> element needs a "type" attribute that takes values like 
> > > > "person", "ship", "publication"? What other names do people want 
> > > > to mark up?
> > 
> > I actually meant the <name> element should, although one option is 
> > indeed to co-opt <cite> for this (I don't really like that idea 
> > though).
> 
> "But there's no ship as can match <cite>The Interceptor</cite> for 
> speed."
> 
> ...

This is now disallowed.


> > The thing is we don't want to start making people do:
> > 
> >    <cite><name type="person">Ian</name></cite> said <q>Hello</q>.
> > 
> > ...when all they need to do is write:
> > 
> >    Ian said "Hello".
> > 
> > Is there any advantage to marking up people's names?
> 
> Depends on what you want to do with them, really. In most cases it's not 
> necessary, since in most cases you don't want to do anything special 
> with them. However, although the average person's name is usually not 
> treated specially, holy figures sometimes are. Ancient Egyptians put 
> pharaohs' names in a special cartouche; more modern works, iirc, put 
> some holy persons' names in small-caps.
> 
> > Maybe we should just let ship names be marked up by <i> (it is, after 
> > all, an instance of use of a term, as it were), and say that <cite> 
> > can be used for any reference to a publication, including those that 
> > aren't really citations ("my favourite book is <cite>...</cite>").
> 
> The distinction between a citation and a mention is oftentimes subtle, 
> and I am sure that many authors would confuse the two. So from a 
> practical perspective, this may be necessary. However, the main problem 
> we have right now is that there is no clear alternative to <cite>. So 
> perhaps if there was one -- a blatantly _obvious_ alternative -- it 
> would not be as much of a problem.

I've given alternative elements for the various cases that are unclear. If 
there are any cases that could be made more explicit, please let me know.


> Another thing to think about:
>   How does one mark up a bibliography? The whole entry is a <cite>, really,
>   although only the title part should be in italics.

I've given an example of a bibliography.


On Tue, 30 Aug 2005, Henri Sivonen wrote:
> On Aug 28, 2005, at 11:02, Lachlan Hunt wrote:
> 
> > Although some editors do also provide some semantic options, they're 
> > usually limited in their abilities.  Some have some semantic block 
> > level elements like headings, paragraphs, lists and maybe blockquote. 
> > However, few have semantic elements like abbr, cite, code, dfn, kbd, 
> > samp, var, q and strong/em (some, like contentEditable, mistakenly use 
> > bold and italic options for those).  I often have to jump through 
> > hoops just to get <code> in my markup while using dreamweaver, by 
> > using the buttons for <b> and/or <i> and then running search and 
> > replace to fix up the markup.
> 
> Could the user interface difficulties with these semantic inline elements 
> stem at least partly from problems with the semantic inline elements 
> themselves?
> 
> Consider <cite> for example. What's it really good for? Why should an 
> author bother to use <cite> instead of <i>? Once you have learned to 
> press command-i (or ctrl-i), why should you have to learn to do 
> something else when all you really want to get done is to italicize 
> titles of works?

Indeed, WYSIWYG editors haven't really solved this problem yet.

In non-WYSIWYG environments, though, you would want different elements so 
that you can restyle different things in different ways, if nothing else.


> I think making the case for <cite> fails the explaining to mother test. 
> Chances are that there is something wrong with <cite> if I don't know 
> how to explain to my mother why she should use it instead of the 
> semantically empty italics. I cannot come up with any tangible 
> advantages. And I have been able to make the case for paragraphs and 
> headings.
> 
> When mother was putting literature lists (eg. 
> http://www.helsinki.fi/~rkosken/kirjallisuus/pukuhistoria.html ) on the 
> Web, she asked something about the technicalities so I took a look. My 
> immediate thought was that there are titles of works and they should be 
> marked up using <cite>. However, when I thought how I should make the 
> point, I couldn't come up with any good explanation why the effort 
> should be expended. The scenario that perhaps in the future there will 
> be a need to style the titles of works in a different way (for example 
> bold strike-through fuchsia) seemed ludicrous. Also, the point about 
> pieces of software doing something cool with the data did not seem like 
> a truthful explanation, because <cite> has been around for a long time 
> and still there are no reports of a killer app emerging around it. So I 
> did not recommend <cite>.

I think that's fair enough.


> Aside: Now that I looked at the source of the literature list, I noticed 
> that some titles of works were marked up as <em>. My hypothesis is that 
> after an upgrade Dreamweaver has started using <em> when pressing 
> command-i. Sigh. See http://mpt.net.nz/archive/2004/05/02/b-and-i

I agree that <em>-as-italics is a problem; we have kept <i> for that 
reason (amongst others).


> P.S. Using <cite> and <code> is relatively easy with OOo Writer/Web but 
> not as easy as pressing command-i. I have used <cite> myself when 
> writing using OOo Writer/Web, but I admit I should consider the 
> motivation rather cargo cultish.

The main reason I would have is for restyling, e.g. as small caps.


On Wed, 31 Aug 2005, Jasper Bryant-Greene wrote:
> > 
> > Consider <cite> for example. What's it really good for? Why should an 
> > author bother to use <cite> instead of <i>? Once you have learned to 
> > press command-i (or ctrl-i), why should you have to learn to do 
> > something else when all you really want to get done is to italicize 
> > titles of works?
> 
> <cite> is good for citing another work.

That doesn't really answer Henri's question, though.


> I can come up with several. Although you may think it unlikely now, you 
> may like to change the styling of your cited works at some point in the 
> future.
> 
> You may also like to loop through <cite> elements in JavaScript and link 
> them to an external database of works, or generate a reference list. 
> These are just examples of the huge advantages of using meaningful 
> semantic elements instead of presentational elements.

I'm not convinced that there are "huge advantages" here.


> > Aside: Now that I looked at the source of the literature list, I 
> > noticed that some titles of works were marked up as <em>. My 
> > hypothesis is that after an upgrade Dreamweaver has started using <em> 
> > when pressing command-i. Sigh. See 
> > http://mpt.net.nz/archive/2004/05/02/b-and-i
> 
> http://jasper.bryant-greene.name/2005/08/31/b-and-i-are-bad-mkay/

This seems to have gone away now.


> > P.S. Using <cite> and <code> is relatively easy with OOo Writer/Web 
> > but not as easy as pressing command-i. I have used <cite> myself when 
> > writing using OOo Writer/Web, but I admit I should consider the 
> > motivation rather cargo cultish.
> 
> The fact that your preferred editor does not offer a predefined keyboard 
> shortcut for using proper semantic elements is not an excuse.

That's backwards. We should justify the elements to the WYSIWYG editors' 
implementors, not dismiss them because they don't find a reason to use 
these elements.


On Wed, 31 Aug 2005, Henri Sivonen wrote:
> > 
> > I can come up with several.
> 
> But you only mention two.

Indeed.


> > Although you may think it unlikely now, you may like to change the 
> > styling of your cited works at some point in the future.
> 
> It is a risk. If the risk actualizes, the bad thing that happens is that 
> you have to do some work marking up the titles of the cited works 
> differently. The probability that this particular risk actualizes is 
> pretty low. Certainly lower than 1.

Indeed. I'm not saying that everyone needs to use <cite>; however, I do 
think it should be available.


> If you mitigate the effects of the risk actualizing by marking up the 
> titles of the cited works differently right now, the probability that 
> the bad thing (that you have to do some work marking up the titles of 
> the cited works differently) happens is 1!

Though in his defence, the bad work is not as bad if you do it from the 
start.


> Therefore, mitigating the effects of the risk actualizing does not make 
> sense assuming that the cost of using <cite> upfront is greater than the 
> cost of italicizing (it typically is) and the cost does not rise a great 
> deal if the task is postponed until the risk actualizes.

I'm not convinced it's easy to retroactively fix it. But it's up to the 
author to make the decision. We're just offering the option.


> > > Aside: Now that I looked at the source of the literature list, I 
> > > noticed that some titles of works were marked up as <em>. My 
> > > hypothesis is that after an upgrade Dreamweaver has started using 
> > > <em> when pressing command-i. Sigh. See 
> > > http://mpt.net.nz/archive/2004/05/02/b-and-i
> > 
> > http://jasper.bryant-greene.name/2005/08/31/b-and-i-are-bad-mkay/
> 
> Quoting from there:
> > Surely <span class="taxonomical"> has more semantics than <i>?
> 
> Nope. That is a common fallacy. Semantic markup (in order to be useful 
> and not just a placebo) requires that the sender and recipient share the 
> understanding of the semantics of the markup vocabulary. If you pull out 
> English words out of your sleeve or stetson and use them in class 
> attributes (or as element names in your home grown XML vocabulary for 
> that matter), they are just meaningless opaque strings for a recipient 
> with whom you do not have an a priori agreement about the semantics.
> 
> <span class="taxonomical"> is no more semantic than <span 
> class="taksonominen"> or <span class="tshhdhhtntshshssnhnt">. OTOH, UAs 
> know a priori that <i> is supposed to be italicized.

Indeed.


On Wed, 31 Aug 2005, James Graham wrote:
> 
> Maybe that's the fundamental problem. <cite> (and others) are useless 
> because they don't _do_ anything. If <cite> was like the LaTeX \cite{} + 
> BibTeX [...] and could be used to automatically insert references from 
> an external list and create a reference list with a <bibliography /> tag 
> then it would be widely used, at least in the subset of documents where 
> that functionality is desirable. But instead there isn't a clear design 
> goal other than "citations should be recognisable as such [by UAs]" 
> which isn't a strong enough reason to use it and (apparently) hasn't 
> allowed for enough functionality that UA vendors have been able to hook 
> up unexpected functions that make using <cite> desirable.
> 
> This isn't a suggestion to make <cite> like LaTeX \cite{}, merely an 
> observation that underused or abused elements are those without an 
> obvious, /user visible/ functionality, probably one that was explicitly 
> designed into the element.

I agree with your conclusion.


On Tue, 30 Aug 2005, Matthew Thomas wrote:
> >
> > Consider <cite> for example. What's it really good for? Why should an 
> > author bother to use <cite> instead of <i>? Once you have learned to 
> > press command-i (or ctrl-i), why should you have to learn to do 
> > something else when all you really want to get done is to italicize 
> > titles of works?
> 
> In theory, screen-readers could intone <cite> elements differently from 
> <i> elements, to avoid confusion (when reading articles about movies, 
> for example). In English, for example, <cite> would be slightly lower 
> pitch, at slightly slower speed, and with less tonal variation, while 
> <em> would be slightly higher pitch at slightly slower speed with normal 
> tonal variation, and <i> would be no change from normal.

Theoretically.


> In practice, current popular screen-readers piggyback on Internet 
> Explorer for Windows, and therefore make no distinction between the 
> various italic elements whatsoever.

Sadly.


> > The scenario that perhaps in the future there will be a need to style 
> > the titles of works in a different way (for example bold 
> > strike-through fuchsia) seemed ludicrous.
> 
> I have seen some sites that style <cite> differently from <i>, usually 
> as italics + some color. They could do that just as easily with <i 
> class="title"> (even overriding the italics later if they wanted, while 
> retaining them for CSS-ignoring UAs).

True, but since we have <cite> already published and implemented, we might 
as well use it.


> > Also, the point about pieces of software doing something cool with the 
> > data did not seem like a truthful explanation, because <cite> has been 
> > around for a long time and still there are no reports of a killer app 
> > emerging around it. So I did not recommend <cite>.
> 
> Right. The other use of <cite> I can think of (as I mentioned in the 
> article you cited) would be for a site like Technorati to extract the 
> titles of books or movies people were talking about. But I haven't seen 
> any site that does that yet.

It's probably more reliable to use heuristics.


> > Aside: Now that I looked at the source of the literature list, I 
> > noticed that some titles of works were marked up as <em>. My 
> > hypothesis is that after an upgrade Dreamweaver has started using <em> 
> > when pressing command-i. Sigh. See 
> > http://mpt.net.nz/archive/2004/05/02/b-and-i
> 
> It's ironic that semantic markup advocacy is gradually preventing <em> 
> from ever being semantically useful (hence my article). Since there are 
> currently no popular clients that usefully distinguish <em> from the 
> other italic elements, there's no obvious way of demonstrating authors' 
> wrongness when they use <em> but really mean <cite>/<var>/<dfn>/<i>, so 
> no future client will be able to trust that an <em> in a page means what 
> the HTML spec says it does. <em> will be the new <i>.
> 
> <cite> is probably safe for now, but only because <em> is the honeypot 
> for those who are replacing all their occurrences of <i> with one other 
> element.
> 
> And <i> itself was presentational from its inception, which is one of 
> the reasons HTML 5's redefinition of <i> as semantic is misguided. No 
> client will ever be able to treat <i> as "an instance of the use of a 
> term" on the public Web, because pages that use it for some purpose 
> other than "an instance of the use of a term" number at least in the 
> hundreds of millions. Even if semantic clients used doctype-sniffing to 
> special-case HTML 5, Web browsers would not, so authors would still use 
> <i> for italics by mistake far too often.

By this point, the definition of <i> has gone through several iterations, 
and should now be back to something that addresses the above concerns.


> The problem applies to block elements just as much as inline ones. For 
> example, the current HTML 5 draft effectively redefines <dl>/<dt>/<dd> 
> as presentational elements (though HTML 4 opened the stable door, both 
> by suggesting them for "marking up dialogues", and by using them for 
> non-definition purposes on the very first page of the spec itself). 
> Something like Google Glossary could extract <dl>/<dt>/<dd> if it was 
> unambiguously for definitions, but what can software do with HTML 5's 
> "any ... groups of name-value data"? Nothing at all. It's semantically 
> useless, and therefore presentational.

It's structural, maybe, but not presentational (that is, not media- 
specific). It's no more presentational, for example, than a correctly used 
<table> element. It just describes the relationship between its parts, 
rather than the semantic of what those parts _are_.


> One useful line of retreat would be to specify that in the following 
> code, "the state of being happy" is unambiguously a definition of 
> "happiness" and not of any other subset of the <dt>.
> 
>     <dl>
>       <dt><dfn>happiness</dfn> /'h?? p?? nes/ <i><abbr>n.</abbr></i></dt>
>       <dd>the state of being happy</dd>
>     </dl>
> 
> This could be encouraged by "dt dfn {font-weight: bold; font-style: 
> normal;}" in browsers' default style sheets, which would be quite 
> backward-compatible because of the rarity of <dt><dfn> up to now.

I've noted this for when I work on <dfn>.


On Sun, 31 Dec 2006, Anne van Kesteren wrote:
>
> I saw a recent draft introduced semantics for <blockquote> being 
> followed or preceded by a <p> which contains <cite> and for <q> which is 
> inside a paragraph which also contains a <cite>. Perhaps this <cite> 
> element can contain a single element <a> which contains the source of 
> the quote?

No, sometimes the link will be to a page about the resource. For example, 
a quote from a book could have a <cite> element that links to that book on 
Amazon or Google Book search.

Can you complete the following paragraph?:

  <p>When the <code>cite</code> element is contained in, or itself
  contains, a <span>hyperlink</span> (e.g. using an <code>a</code>
  element), then that link must be to a resource _______________...


> So we can drop the unsupported cite="" attribute from both <blockquote> 
> and <q> or at least provide a way to have visual metadata. (I'm aware 
> cite="" is exposed in some way in some user agents, but that's not 
> really usable in any way...)

I don't think we need to drop it. Some people use it, and it's not 
especially misused or anything.


On Sun, 31 Dec 2006, Elliotte Harold wrote:
> 
> It's not just about user agents and their display. The cite attribute is 
> a useful way of referencing the original source for purposes of 
> attribution. I use it frequently.
> 
> Consider an academic book. You usually don't bother to read all the end 
> notes, but if there's one you're particularly interested in, you can 
> always look it up. By contrast inline citation such as [Steven Jones, 
> Facts and Theorems, p. 112, University press: New York] is distracting 
> and unnecessary. The web is the same.
> 
> I know the Web has a real problem with source citation, plagiarism, and 
> giving credit where credit is due. However removing one of the real 
> tools we have to support appropriate citation is not going in the right 
> direction.

Seems reasonable to me.


On Sun, 31 Dec 2006, Anne van Kesteren wrote:
> 
> You apparently didn't read the part of my proposal of moving the 
> information cite="" gives to a more visual place. (I think I also 
> mentioned allowing both if there was a real need for cite="".)

On Sun, 31 Dec 2006, Elliotte Harold wrote:
> 
> Indeed I missed that, but looking at the archives this doesn't really 
> change my opinion. The cite attribute is valuable precisely because it 
> is invisible. It does not get in the way of the normal flow of reading, 
> unlike the cite element.
> 
> It would be nice if user agents would make it a little more available 
> without view source (e.g. with a tooltip as they do for acronym titles), 
> but regardless more often than not a reader will not want to see the 
> text of the cite.

On Sun, 31 Dec 2006, Benjamin Hawkes-Lewis wrote:
> 
> Hmm, tooltips are a bit problematic, since there might also be a title 
> on the q or blockquote, a title on its container, or a title in its 
> contents. Also just displaying a cite attribute as a tooltip wouldn't be 
> pretty, as a cite is a URI and so not very human readable. One of the 
> features I'd like to add to my Hypertextuality extension is 
> automatically fetching metadata from cited pages. I'm not sure how best 
> to display it however.

I've added a note to myself to make it clear that cite="" should be made 
available, e.g. on hover.
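
Until user agents do this themselves, an author could approximate it with 
a style sheet. This is only a sketch of one possible approach, not what 
the rendering section will necessarily say, and it shows the raw URI, 
which as noted above is not very readable:

```html
<style>
  /* Append the source URI to a block quotation while it is hovered. */
  blockquote[cite]:hover::after {
    content: " (source: " attr(cite) ")";
    font-size: smaller;
  }

  /* For <q>, keep the closing quotation mark that browsers normally
     generate, then append the source URI. */
  q[cite]:hover::after {
    content: close-quote " (source: " attr(cite) ")";
  }
</style>
```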


On Mon, 1 Jan 2007, Karl Dubost wrote:
> 
> It is useful and usable
> http://www.blog-and-blues.org/weblog/2004/08/24/284-attribut-cite-pseudo-lien
> http://simonwillison.net/2002/Dec/20/blockquoteCitations/
> http://www.sitepoint.com/article/structural-markup-javascript
> http://www.456bereastreet.com/archive/200411/
> quotations_and_citations_quoting_text/
> 
> It doesn't need always to be visual as well. plus the fact that the cite 
> can be things like
> 
> 	cite="urn:isbn:0671892584"
> 
> And it is used on the Web, at least on my personal web site. The Web is 
> not only about browsers.

The Web in all the cases above pretty much is about browsers, and I think 
it is about browsers in the cite="" case in general, since that's how you 
come across the quotes in the first place. But it being about browsers 
isn't a bad thing, nor does it compel us to drop the attribute.


On Sun, 31 Dec 2006, Benjamin Hawkes-Lewis wrote:
> 
> How would a given <cite> be unambiguously associated with its quotation 
> element? We could add a for attribute to <cite>, but that would 
> complicate the creation of quotations in user-generated content since 
> content management systems would have to process id's they hadn't 
> created. That isn't an insurmountable obstacle, but it's worth thinking 
> about.

The spec covers this.


> 3) The original specifications for cite didn't require UAs to expose 
> cite to end-users. If a spec doesn't require something it's unlikely to 
> be implemented.
> 
> 4) The original specifications for cite provided no examples of possible 
> implementations. When faced with HTML4's specification for cite, browser 
> developers scratch their heads and wonder how they're supposed to 
> implement it.
> 
> 5) The original specification for cite did not suggest ways of resolving 
> the potential conflict between cite and href involved in markup like:

These three points will be fixed in the rendering section. I've added some 
notes to the markup to make sure this is covered.


On Mon, 1 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> At first sight, the blockquote spec sounds like it could work 
> without the cite attribute:
> 
> > If a blockquote element is preceded or followed by a p element that 
> > contains a single cite element and is itself not preceded or followed 
> > by another blockquote element and does not itself have a q element 
> > descendant, then, the citation given by that cite element gives the 
> > source of the quotation contained in the blockquote element.
> 
> The q spec won't work however:
> 
> > If a q element is contained (directly or indirectly) in a paragraph 
> > that contains a single cite element and has no other q element 
> > descendants, then, the citation given by that cite element gives the 
> > source of the quotation contained in the q element.
>
> What if, as is not at all uncommon, there are quotations from different 
> sources in the same paragraph?

Then you use cite="", indeed.
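
For instance, something along these lines, where each quotation carries 
its own source (the example.com URLs and the quoted text are placeholders 
invented for this sketch):

```html
<p>One reviewer called it
<q cite="http://www.example.com/review-a">rhubarb rhubarb</q>, while
another wrote
<q cite="http://www.example.com/review-b">the foobar is doubled</q>.</p>
```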


> One interesting problem associated with shifting citation URIs into ordinary 
> anchor hyperlinks is that when authors link to individual fragments or 
> pages of resources this could lead to enormous proliferation of links. 
> That might make the current screen reader/talking browser model of 
> listing links problematic. On the other hand, I suppose screen 
> readers/talking browsers could prune lists by comparing URIs though that 
> somewhat changes the nature of the enterprise.

I agree that we should keep cite="".


On Sun, 31 Dec 2006, Benjamin Hawkes-Lewis wrote:
> 
> I assumed Anne meant something like:
> 
> <q>rhubarb rhubarb rhubarb</q> [<cite><a href="www.example.com">Nemo, 
> Works, IV</a></cite>]
>
> Which would be backwards compatible, but wouldn't unambiguously connect 
> q and cite. Which is a major disadvantage in my view.

HTML5 explicitly links that <cite> and that <q>.


On Sun, 31 Dec 2006, Matthew Raymond wrote:
> Benjamin Hawkes-Lewis wrote:
> > <q>rhubarb rhubarb rhubarb</q> [<cite><a href="www.example.com">Nemo,
> > Works, IV</a></cite>]
> 
>    Idle thought:
> 
> | <blockquote>
> |   <p>
> |     <q>rhubarb rhubarb rhubarb</q>
> |     [<cite><a href="www.example.com">Nemo, Works, IV</a></cite>]
> |   </p>
> | </blockquote>
> 
>    The <blockquote> becomes a container that associates <q> elements 
> with the first child <cite> element.

That seems a bit more radical than even I'd like to consider! :-)


On Mon, 1 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> Rather than redefining blockquote and q to associate with a cite child, 
> it might actually make more sense to redefine cite to associate with 
> blockquote and q children, allowing something like:
> 
> <cite>
> 	<blockquote>
> 		<p>The learned scholar WhatsHisName wrote that: 
> 			<cite><q>The foobar is doubled.</q> 
> 			(Tractatus on the foobar,section466)</cite>.
> 			</p>
> 		</blockquote> 
> 	<a href="www.example.com">Nemo, Works, IV</a>
> 	</cite>
> 
> It may of course be that the way current UAs parse and render cite would 
> make such markup impossible.

I think this is pretty radical too.


On Tue, 2 Jan 2007, Matthew Raymond wrote:
> 
> Okay, how 'bout this:
> 
> | <excerpt>
> |   <p>
> |     <q>rhubarb rhubarb rhubarb</q>
> |     [<cite><a href="www.example.com">Nemo, Works, IV</a></cite>]
> |   </p>
> | </excerpt>

I think there's enough doubt that we need to cover this use case at all 
that we shouldn't be adding _more_ elements to cover it. :-)


On Wed, 3 Jan 2007, Benjamin Hawkes-Lewis wrote:
> >
> > Making quoting even more difficult is not better at all in my opinion.
> 
> Well, can you suggest an alternative way of associating different 
> instances of q, which may themselves contain citations from the quoted 
> material, with different instances of cite in the same paragraph?

Is this really something that happens often enough for us to care about 
to the point of optimising the markup for it?


> If you want to make it simpler, you could keep the spec's suggested 
> semantics for q and cite so long as there is only one cite in the 
> paragraph. This could complicate formatting however. How would one 
> differentiate in-text references using cite with cite elements that one 
> wished to display as footnotes?

I don't really follow.


> If you do want to keep things really simple on the hand-coding end, the 
> cite attribute, not the cite element, is definitely the way to go, since 
> bibliographic information can be encoded in the URI (have a look at 
> OpenURL) and metadata can be retrieved by requesting the page in the 
> case of web addresses. Web Applications 1.0 could specifically require 
> browsers be able to retrieve, understand, and expose information from 
> OpenURL ContextObjects, Dublin Core, standard HTML META metadata, and 
> hCite.

That seems like trying to force too many solutions down the Web's throat, 
without really many problems to go with them. What problem(s) are we 
trying to solve here?


> One solution to associating cite elements with quotations might be to 
> keep the cite attribute, but add a scheme (or something) by which the 
> cite attribute could refer to a URI for citation data rather than the 
> work itself. Then it could refer to a cite element via a fragment 
> identifier. (The reason to have q refer to cite rather than the other 
> way round is that you never have two cites to one q, but you often have 
> more than one q to a cite.)

We could do that, but that seems like too much indirection for most people 
(including me!).


On Wed, 3 Jan 2007, Henri Sivonen wrote:
> 
> You seem to assume that there is a need to
>  1) Mark up quotations so that that software can unambiguously see which 
> DOM range was quoted.
>  2) Mark up sources of quotations in an unambiguous 
> machine-dereferencable way.
>  3) Associate the two unambiguously.
> 
> I very much doubt the need to mark up quoted DOM ranges unambiguously 
> and to unambiguously give a machine-dereferencable source pointer. I 
> also think that the issue is being approached from the wrong direction.

I agree.


On Wed, 3 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> Blogs, comment threads, forums, academic writing, books, journalism, and 
> emails are not "niche use cases". In all of these cases, there is a 
> clear advantage to making it easy for authors to create accurate 
> quotations where the reader can easily get information about the source 
> and jump to the source. (Well, except for authors attempting to use 
> quotations to mislead people. They'd begin to encounter certain 
> difficulties.)

There's a clear advantage in theory, but I'm not convinced that the 
advantage is seen by the people you are targeting the solution at.


> > Or, authors could simply not mark up the sources of quotations 
> > unambiguously leaving it to readers to cope with the relationship of 
> > quotations and sources the same way readers of paper publications do.
> 
> What possible advantage would that provide?

It's the only option we're realistically going to be able to convince 
authors to do. :-)


On Thu, 4 Jan 2007, Karl Dubost wrote, in reply to Henri:
> > 
> > If HTML had unambiguous sourcing of quotations, what cool software 
> > would you write that would consume the markup?
> 
> Given into account that the notion of "cool" is very subjective and tied 
> to one's interests.
> 
> * http://web.archive.org/web/20030211001151/http://diveintomark.org/archives/quotations/
>   http://web.archive.org/web/20030207035922/diveintomark.org/archives/citations/
>   http://diveintomark.org/archives/2003/01/28/autocontent
> * technorati, bloglines like http://www.bookorati.com/
> * threading for commenting system on Weblogs
> * a database of well known quotations, authors.
> * a database of poetry
> * frequency analysis of quotes for texts.
> 
> I can also imagine a tool which displays possibility to have more 
> information on the quotes contained in the page by displaying a widget 
> with more exploration: spontaneous buy of the source which has been 
> cited (without to necessary use amazon), or get more information about 
> an author, redirecting to wikipedia ala PageMapper 
> http://labs.metacarta.com/PageMapper/ or OpenLayers 
> http://openlayers.org/

On Sat, 6 Jan 2007, Henri Sivonen wrote in reply:
> 
> That's an oft-cited example, but
> 1) It doesn't demonstrate a need for a Web-wide distributed system for 
> quotation or citation cataloging.
> 2) The flagship example of mining the semantics of quotations and 
> citations was dumping the data as lists! Is extracting lists of stuff 
> the best that can be done? No offense to Mark intended, but just making 
> lists isn't impressive enough to justify the trouble, in my opinion.
> 3) The originator of the example has discontinued the example.

Indeed, I spoke with Mark about this, and he didn't seem especially 
convinced that the example was convincing. :-)


> > * technorati, bloglines like http://www.bookorati.com/
> 
> The crucial difference is that Technorati and Bloglines work without a 
> per-post effort to support them.
> 
> The exception is Technorati Tags, but in that case, the blogger is 
> likely seeking to get attention for his/her own stuff instead of wanting 
> to make an effort to help Technorati's business.
> 
> > * threading for commenting system on Weblogs
> 
> Commenting systems are controlled by blog engines, so blog engine can 
> present threading in an internally consistent way without there being a 
> need for Web-wide comment threading markup.
> 
> Or did you mean distributed threading so that Technorati and Google 
> could construct a Usenet-like view of the blogosphere?

I agree with all the above.


> > a database of well known quotations, authors.
> 
> That's a "nice to have" thing that could be made if the data was there 
> for another reason. It is not a killer app that justifies the effort of 
> providing the data in the first place.

...and indeed these databases already exist, without the markup.


On Wed, 3 Jan 2007, Benjamin Hawkes-Lewis wrote:
> > 
> > How much data is provided depends on the type of writing. If someone 
> > quotes someone else's blog, quotation marks and a plain <a href link 
> > back are enough.
> 
> Well, perhaps.

Definitely, even. And I think that's what most authors will stick to.


> But an interface like I am describing would be less complicated than the 
> current process of copying text, putting it in quotation marks, copying 
> a link, putting it in an <a> element, writing some link text, and then 
> correcting the link so that ampersands are correctly encoded.

An interface like you're describing wouldn't be available to most people, 
in practice (look at the Web today -- many interfaces could be made 
available for existing things, but aren't).

...and note that most people don't correct the ampersands as it is. :-)


> > Punctuation and plain links go a long way for human readers.
> 
> There is no way for machines to conclusively differentiate quotation 
> punctuation from non-quotation punctuation.

Most people don't care.


> > And I am unconvinced that authors would be willing to spoon feed data 
> > mining tools, considering that the beneficiaries of such spoon feeding 
> > are not the authors themselves nor even their direct human audience.
> 
> So you want to quote a book. Do you choose to:
> 
> a) Spend a minute gathering the relevant information and arranging it 
> into a marked up and styled citation?
> 
> b) Spend three seconds typing an ISBN into a box and get the same 
> result?
> 
> I choose b).

In practice I generally do a), and I imagine most other people do too.


On Wed, 3 Jan 2007, James Graham wrote:
> 
> FWIW, I know, offhand, the ISBN of exactly zero books (whereas I could 
> probably quote from several). Therefore it would take considerable 
> effort for me to find the ISBN of a book I was quoting (I would have to 
> spend time looking it up on the book or online somewhere), then more 
> effort to carefully copy the human unfriendly string into whatever tool 
> was demanding this apparently superfluous information. I would imagine 
> that "three seconds" is an underestimate of about an order of magnitude.
> 
> This last bit is the killer; people hate doing mundane things even when 
> they have to (I've never met anyone who enjoys filling in BibTeX 
> citations, for example and that is of comparable difficulty to the 
> process you advocate), and certainly won't do it if they see no benefit 
> for their efforts (even if some minority group will).

Indeed.


On Thu, 4 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> Let's try a little experiment. I have here a stopwatch. I go over to my 
> bookcase, close my eyes, stick out my hand and take the first book I 
> touch from the shelf. I place it beside my keyboard. I start my 
> stopwatch...
> 
> 0-520-24073-1
> 
> Time taken: 7 seconds. How did I accomplish this astonishing feat?

You haven't accomplished anything yet. All you have here is an opaque 
number. What you _want_ is human-readable information.


> As another experiment, I'm just going to type the plain text citation:
> 
> Joan Roughgarden, Evolution's Rainbow: Diversity, Gender, and Sexuality 
> in Nature and People (Berkeley and Los Angeles, 2004).
> 
> Time taken: 36 seconds. No markup, no styling.

This is useful. Now I know what you're citing.


> And if typing 10-13 digit numbers still sounds like too much hard work, 
> the state of the art is to dangle a book in front of your webcam and 
> have your software grab its details off the web's bibliographic 
> databases.

This is not something that most people can do.


On Fri, 5 Jan 2007, Benjamin Hawkes-Lewis wrote:
>
> Martin Atkins asked:
> > So do you expect browsers to use an online service to look up 
> > information about a given referenced book? What online service should 
> > they use to do this, and what happens when that online service ceases 
> > to exist at some point in the future?
> 
> Actually, is it not trivial to build UAs that can pull down new URIs 
> with other updates and where users can add URIs for new online services 
> if necessary?

It is not trivial, but it's possible -- so long as the browser distributor 
stays in business, which is itself non-trivial. The bigger problem is that 
service providers would charge for this, and the browser vendors derive no 
benefit from paying for this service.


As a general reply to the whole idea of making browsers clever here about 
getting citations and so forth, I recommend making a browser vendor 
implement this before trying to make the spec require it.


(Here I have snipped a large number of e-mails that just rehash the points 
made above. If I missed a point that you think the spec should deal with, 
please do let me know.)


On Sun, 7 Jan 2007, Matthew Paul Thomas wrote:
> 
> First, it's hard for UAs to present cite= in a way that is both usable 
> and backward compatible. (Just changing a cursor isn't discoverable 
> enough. Putting any extra button etc in the page might mess up page 
> layouts, though it might work if it was placed in-line at the end of the 
> quote.)
> 
> Second, it's hard for authors to use it in a way that is 
> backward-compatible. That is, if the source information is important 
> enough that it needs to be accessible in those UAs that don't (yet) 
> support cite=, the author has to provide the information in some other 
> fashion too.
> 
> And third, it requires the existence of an IRI of some sort. Often you 
> won't have this, for example when the source information for your quote 
> is something as vague as "attributed to Mark Twain".
> 
> (None of this is new, just a summary of what I understand from the 
> discussion so far. I'm still thinking about alternative markup.:-)

This is a good summary. I'm not sure what more can be done beyond what the 
spec has now, though.


On Wed, 10 Jan 2007, Sander Tekelenburg wrote:
> 
> The fact that UI problems like this aren't solved yet does not mean they 
> cannot be solved. Just that they haven't been solved yet. I'm sure that 
> to a large extent this has to do with UA vendors having spent resources 
> on browser wars and ESP engines for the past 10 years, at the cost of 
> other development.

That's possible, but the spec is not the right place to be innovating 
these types of UIs, so it's not clear to me where to go from here, beyond 
what the spec says today.


On Sun, 21 Jan 2007, Matthew Paul Thomas wrote:
> 
> For example:
> 
>     <p><a id="q018" href="http://example.com/2007/01/21/c">Fred
>     Mondegreen concurs</a>: <q source="#q018">When you compare it
>     with books, the Web is still a newborn baby</q>.</p>
> 
>     <p>As <span id="q019">Albert Einstein said during an interview
>     in 1949</span>: <q source="#q019">I do not know how the Third
>     World War will be fought, but I can tell you what they will use
>     in the Fourth -- rocks!</q></p>
> 
> (Disclaimer: I don't expect people would actually use this, unless there 
> was some famous semantic application taking advantage of it. The same 
> applies to cite=.)

I think there's too much indirection for anyone to use this. Certainly I 
don't think it'd be used more than cite="".


> Google notwithstanding, the Web does not contain all quotable material 
> that exists. If the source is a pamphlet, magazine, user manual, or 
> interview, there may well be *no* relevant URL to cite.

Indeed.


On Sat, 6 Jan 2007, fantasai wrote:
> 
> I think the problem is what happens if I am, for example, writing a 
> 5-paragraph essay comparing two books. I use lots of quotations from 
> both books in the same paragraph in all five paragraphs, but the cite 
> information is complete (author+title) only in the first instance, and 
> the order of source and quotation is mixed up all over the place. You 
> can machine-process the simple case of one quote, one cite, but there's 
> no way to machine-process that without some help.

But who cares? I mean, who is going to machine process that? In such a 
situation, surely just two links (or two biblio entries somewhere) is 
plenty enough for everyone involved.


> Another problem is providing citations for a sequence of blockquotes 
> when none of them have URI sources to put in the 'cite' attribute. I 
> might have a favorite quotes page, for example. Does it really make 
> sense that an isolated blockquote in someone's blog gets defined 
> semantics for its <cite><blockquote> pair but the blockquotes on my 
> /quotes page/ don't?

Put each one in a <div>, or separate them with <hr>.
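
For instance, assuming the adjacency rule quoted earlier in this mail stays as drafted, something along these lines should keep each quote paired with its own source (the titles are invented for illustration):

```html
<div>
 <blockquote><p>First quotation...</p></blockquote>
 <p><cite>Title of the First Work</cite></p>
</div>
<div>
 <blockquote><p>Second quotation...</p></blockquote>
 <p><cite>Title of the Second Work</cite></p>
</div>
```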


> Another problem is, how do I present a list of quotes attributed to
> one person? E.g.
>   My Favorite Quotes from Mark Twain
>     * ...
>     * ...
>     * ...
> There's no way to mechanically associate the quotes with Mark Twain.

Again, I don't see this as a problem.


On Sat, 6 Jan 2007, Matthew Paul Thomas wrote:
> 
> Right. The description of attaching adjacent <cite>s to <blockquote> and 
> <q> is not only a heuristic, it's a poor heuristic, because it will fail 
> often in those documents where <blockquote> and <cite> are used at all. 
> For example, it will fail where one <cite> element is in a paragraph 
> immediately between two <blockquote>s, when it may be the citation of 
> only one or neither of them.

The spec, as written, will not associate either with that element.


> There are other problems in WA1's current definition of <cite>
> <http://www.whatwg.org/specs/web-apps/current-work/#the-cite>. It says:
> 
>     This is the correct way to do it:
> 
>         <p><q>This is correct!</q>, said <cite>Ian</cite>.</p>
> 
> Despite this being consistent with the example given in the HTML 4 
> specification, it is not compatible with the Web (except for the tiny 
> part of it found on diveintomark.org and its imitators). All noticeable 
> graphical browsers default to cite {font-style: italic}, and it is 
> inappropriate to italicize someone's name just because you're quoting 
> them. Therefore, that's not what Web authors -- or even HTML reference 
> authors -- understand <cite> to be for.

This is changed now.


> WA1 continues:
> 
>     This is also wrong, because the title and the name are not
>     references or citations:
> 
>         <p>My favourite book is <cite>The Reality
>         Dysfunction</cite> by <cite>Peter F. Hamilton</cite>.</p>
> 
>     This is correct, because even though the source is not quoted,
>     it is cited:
> 
>         <p>According to <cite>the Wikipedia article on
>         HTML</cite>, HTML is defined in formal specifications
>         that were developed and published throughout the
>         1990s.</p>
> 
> This is also incompatible with the Web, again because nobody would want "the
> Wikipedia article on HTML" italicized unless they were emphasizing it.

Both of these are fixed now too.


> Further, it is a distinction most authors won't be able to understand. For
> example, which of these paragraphs would be conformant?
> 
>     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
>     <cite>Peter F. Hamilton</cite>, because Hamilton describes
>     wormholes as a way of travelling over long distances.</p>
> 
>     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
>     <cite>Peter F. Hamilton</cite>, because of Hamilton's
>     description of wormholes.</p>
> 
>     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
>     <cite>Peter F. Hamilton</cite>, because of Hamilton's
>     descriptions of various sci-fi ideas.</p>
> 
>     <p>My favourite book is <cite>The Reality Dysfunction</cite> by
>     <cite>Peter F. Hamilton</cite>, because of Hamilton's
>     descriptiveness and imagination.</p>
> 
>     <p>I arrived in Boston having read about half of Peter F.
>     Hamilton's latest book, <cite>Pandora's Star</cite>. This is a
>     nearly 900 page book, part one of the <cite>Commonwealth
>     Saga</cite>. I absolutely loved his first saga, the
>     <cite>Night's Dawn Trilogy</cite>. So far this book is
>     promising to be just as good.</p>

Per the new rules, only the last one is conformant, because the first four 
put his name in a <cite> element.
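
To illustrate, here is how I would expect the first of those paragraphs to look once the name is taken out of <cite>. This is a sketch of my reading of the new rules, not text from the spec:

```html
<p>My favourite book is <cite>The Reality Dysfunction</cite> by
Peter F. Hamilton, because Hamilton describes wormholes as a way of
travelling over long distances.</p>
```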


> Even if you can carefully make the distinction between the conformant 
> and non-conformant examples, most authors will not. It is not plausible, 
> for example, that an author will realize "oh, I'm no longer actually 
> mentioning any of Hamilton's ideas from that *particular* book, I'd 
> better remove the invisible <cite> element around its title".
> 
> I think a more compatible and visually obvious (if less semantically 
> obvious) definition of <cite> is marking up the name of a work: a book, 
> film, exhibition, game, etc.

Agreed.


On Tue, 16 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> Says who? There are even situations where this would be appropriate in 
> modern English, which seems to be your frame of reference here. For 
> example, when cited as the source of a quotation from a transcript in 
> British legal writing: "Counsel's name should appear in upper-and 
> lower-case italics" (Oxford Guide to Style (ISBN 0-19-869175-0), 423).

The spec defines what elements should be used for that now.


> (1) Modern English typographic conventions are crystal clear that the 
> entire reference is the citation, /not/ just or even especially the 
> italicized part.

Indeed, but this isn't useful.


> (2) Modern English typographic conventions do not always use italics for 
> the name of a work. For example, by the Oxford Guide to Style (ISBN 
> 0-19-869175-0), the titles of articles, orations, unpublished works, 
> treaties, parliamentary statutes (and in British legal writing, even US 
> statutes), European secondary legislation, books of the Bible and 
> /suwar/ of the Koran, and rabbinical works that have become nicknames 
> (on this, see p. 541) are not italicized, and those of poems frequently 
> are not.

That's what the class attribute and CSS are for.
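
As a rough sketch (the class name "article" and both titles are made up for illustration; the spec doesn't define any such class), an author who wants article titles set upright could write:

```html
<style>
 /* Hypothetical class: article titles are upright rather than italic */
 cite.article { font-style: normal; }
</style>
<p>See <cite class="article">On the Doubling of the Foobar</cite> in
<cite>Journal of Foobar Studies</cite>.</p>
```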


> It is also parochial, since conventions in other languages (of course) 
> vary. For example, in Russian bibliographies, book and journal titles 
> are set in upright script not italics, and in Russian body text they are 
> placed in guillemets, along with picture titles (Oxford Guide, 335-6, 
> 339, 341). In German, roman not italic is used for article and book 
> titles, often without quotation marks (Oxford Guide, 290).

CSS.
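
Something like the following, for example, using the :lang() selector and the 'quotes' property. The exact rules are only my guess at what a Russian or German author would want, based on the conventions described above:

```html
<style>
 /* Russian: upright titles wrapped in guillemets */
 cite:lang(ru) { font-style: normal; quotes: "\00AB" "\00BB"; }
 cite:lang(ru)::before { content: open-quote; }
 cite:lang(ru)::after { content: close-quote; }
 /* German: roman rather than italic, no quotation marks */
 cite:lang(de) { font-style: normal; }
</style>
```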


> The /only/ way we will get browsers to display citations in the manner 
> expected by the user is with language-sensitive styling of markup that 
> differentiates the different components of citations (names, article 
> titles, journal titles, page numbers, etc) such as hCite promises to 
> provide. The <cite> element alone is far too coarse a tool for this job.

Indeed. But it's enough for most authors. hCite can be used for more 
detailed styling and semantics if desired.


On Tue, 16 Jan 2007, James Graham wrote:
> 
> So, to summarise, <cite> is insufficient for extracting useful semantics 
> and has a (essentially unchangeable) default style which means that it 
> will /at best/ be used correctly in English, some of the time, with 
> careful authoring.
> 
> You've presented quite a convincing argument to deprecate <cite>.

Yet people do use it. :-)


On Tue, 16 Jan 2007, Benjamin Hawkes-Lewis wrote:
> > 
> >  and has a (essentially unchangeable) default style
> 
> It's not unchangeable at all. Browsers and users can set a different 
> default style on it; HTML5 can even suggest a different default style.

We can't realistically change the default style.


On Sun, 21 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> In terms of web functionality, I think HTML needs to provide at least 
> the ability to:
> 
> 1) Jump directly to a discussed work/authority (or, at worst, directions 
> to a discussed work/authority) from a brief mention or detailed 
> description of said work/authority.

Provided by <a href="">.


> 2) Jump directly to the sources of a quotation or statement (or, at 
> worst, directions to/discussion of the sources of a quotation or 
> statement) from the quotation or statement, while still allowing the 
> quotation or statement to contain hyperlinks itself.

Provided by <a href=""> also, and by cite="" in theory.


> 3) List works discussed or used as references by a given web document. 
> (Academics need to be able to track who is citing whom.)

Provided by <ul> and <cite>, and <a>.
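
A minimal sketch of such a list, reusing two works cited earlier in this thread (the URL is a placeholder):

```html
<ul>
 <li><a href="http://example.com/evolutions-rainbow"><cite>Evolution's
  Rainbow</cite></a>, Joan Roughgarden (2004)</li>
 <li><cite>Oxford Guide to Style</cite>, ISBN 0-19-869175-0</li>
</ul>
```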


> Function 2 and therefore Function 3 clearly require something additional 
> to <a>.

Not really. You don't say they have to be distinguishable, and I'm not 
convinced they have to be, either.


> 2) We modify the idea somewhat and suggest that the genius of HTML when 
> used with CSS is that its element set is typical of those components for 
> which a typical page will need to use style hooks. But even this would 
> be problematic to sustain: where are the <banner>, <navigation>, 
> <product>, <note>, <comment>, and <advert> elements?

<header>, <nav>, <section>, <aside>, and <article> respectively; <advert> 
is pointless since nobody would use it.


On Sat, 6 Jan 2007, FROIDURE Nicolas wrote:
>
> Why not create something like what already exists for forms?
> Here are 3 quotations of <cite id="author">Martin Luther King</cite> :
> <blockquote cite="#author">   (...)     </blockquote>
> <blockquote cite="#author">   (...)     </blockquote>
> <p>And the sentence i prefer is : <q cite="#author">   (...)     </q></p>

We could, but it's not clear that this is really solving an important 
problem.


On Fri, 12 Jan 2007, Matthew Raymond wrote:
>
>    There's been some debate about the |cite| attribute versus the <cite> 
> element. The problem with the attribute is that it doesn't allow for 
> non-text content and isn't visible on legacy browsers. The problem with 
> the element is that there are no means of associating it with quotes or 
> blockquotes.
> 
>    Well, why not overload the |cite| attribute so that it's valid to use 
> the URL for a <cite> element? Example:
> 
> | <p>
> |   <q cite="#Hixie">How times have changed</q>, said
> |   <cite id="Hixie">
> |     <a href="http://ln.hixie.ch/?start=1163122250&count=1">
> |       Ian
> |     </a>
> |   </cite>.
> | </p>
> 
>    You could then have multiple |cite| attributes point to the same 
> <cite> element, and the <cite> element doesn't necessarily have to be in 
> close proximity, markupwise, to the referencing <q> or <blockquote>.

It's not clear to me why this is better than:

   <p>
    <q>How times have changed</q>,
    <a href="http://ln.hixie.ch/?start=1163122250&amp;count=1">said Ian</a>.
   </p>



On Tue, 16 Jan 2007, Benjamin Hawkes-Lewis wrote:
> 
> Associating the cite attribute with the <cite> element is certainly 
> better than having no association between quotations and the <cite> 
> element at all.

Why?


> However, I do think it would make more sense to have a different 
> attribute ("citeref"?) for linking to citations as opposed to linking to 
> sources. One reason for this is that user agents will struggle to 
> differentiate citation sources from referenced citations, which will 
> make actually exposing the information or processing it much harder. For 
> example, with a source you can verify a quotation, but not with a 
> citation alone (unless the citation itself /unambiguously/ links to a 
> source). And while you might wish to show a citation in a popup, you 
> probably don't want to expose a source that way.

I honestly think we're far above the heads of most people here already, 
just with <cite> and cite="", let alone if we add more markup for this.


On Wed, 17 Jan 2007, Matthew Raymond wrote:
> 
>    I agree with your argument, not just because of your arguments above, 
> but because I can see situations where you might have multiple <q> and 
> <blockquote> elements referring to the same <cite> element but having 
> different values for |cite|. For instance, you might provide a URL in 
> |cite| to a specific paragraph in a book, but |citeref| may point to a 
> <cite> element that only contains the title of the book.

I really don't think we're seeing enough evidence that people care at all 
about this to go to this level of complexity. cite="" is used rarely; if 
people wanted to use citations, they would use it, even if it's not 
perfect. As it is, the attribute barely hits the radar.


On Sun, 14 Jan 2007, Brad Fults wrote:
> 
> I definitely agree that something has to be done to make |cite| and 
> <cite> useful. With this method, user agents could provide additional 
> functionality like automated citation lists and controls (e.g. linking 
> to citations from quotes).

If people want to prove that this is plausible, I recommend getting 
browsers to support hCite. If that is supported, then it would lend a lot 
of weight towards getting something like this natively into HTML.


On Wed, 12 Dec 2007, Henri Sivonen wrote:
> 
>  * Considering that mere presentation-level implementation in visual UAs 
> is ubiquitous and needed for Support Existing Content, UAs will have to 
> continue to italicize <cite>.
>
>  * Considering that content authored to HTML 4 may be syndicated or 
> otherwise repurposed into an HTML5 site template, it doesn't seem 
> productive to require the removal of <cite> from such content. Hence, 
> <cite> should probably be kept as conforming part of the language.
>
>  * Considering the default presentation of <cite> since the dawn of 
> time, the example in the ancient IIIR draft and DanC's IRC statement[1] 
> about the original intent, I think the element should be defined [at 
> least primarily] as meaning title-of-work. See §7.133 on page 284 of 
> CMOS 14th ed.

Agreed.


>  * Considering the misguided over-general definition in HTML 4, the 
> definition in HTML5 should probably contain some weasel words to allow 
> those who read the HTML 4 definition to use <cite> for personal names 
> without getting into flame wars.

I don't really see the value in this; I've made it explicitly wrong.


>  * Considering that during the existence of <cite> in some form in HTML, 
> no compelling semantic mining use cases have emerged where the semantics 
> miner and the document author weren't in tight collaboration (or the 
> same person as in the famous diveintomark.org case) and considering that 
> the default presentation of <cite> is biased towards publishing styles 
> close to that documented in CMOS, I think the spec should be worded not 
> to require titles of works to be marked up as <cite>. Specifically, the 
> spec should say something that'd protect authors who don't mark titles 
> of works as <cite> (for whatever reason; tool support, i18n 
> considerations, whatever) from time-wasting flamewars. (I could not come 
> up with any good story explaining why my mother as a page author should 
> make an effort to use <cite> instead of whatever command-i produces in 
> Dreamweaver.)

Well, the spec doesn't require it, but then it doesn't require that 
anything really be marked up explicitly, just that the elements be used 
correctly.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Tuesday, 19 February 2008 22:36:40 UTC