[whatwg] <blockquote cite> and <q cite> from Benjamin Hawkes-Lewis on 2007-01-03 (public-whatwg-archive@w3.org from January 2007)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Wed, 03 Jan 2007 14:28:19 +0000
Message-ID: <1167834499.7682.35.camel@galahad>
Henri Sivonen wrote:

> You seem to assume that there is a need to
>   1) Mark up quotations so that software can unambiguously see  
> which DOM range was quoted.
>   2) Mark up sources of quotations in an unambiguous  
> machine-dereferenceable way.
>   3) Associate the two unambiguously.

I don't see the distinction between 1, 2, and 3.

> First, we should consider how people writing for traditional print  
> media would express quotations and sources. 

I agree we should consider this, but why should we consider
this /first/?

> (They'd use typographic conventions and words.)

It's print media. What else could they use?

> Then we should consider if this is enough for  
> the Web or whether there could *realistically* be cases where  
> consuming software could serve users notably better for non-niche use  
> cases if there was more data available (i.e. big wins--not just  
> chasing diminishing returns).

Blogs, comment threads, forums, academic writing, books, journalism, and
emails are not "niche use cases". In all of these cases, there is a
clear advantage to making it easy for authors to create accurate
quotations where the reader can easily get information about the source
and jump to the source. (Well, except for authors attempting to use
quotations to mislead people. They'd begin to encounter certain
difficulties.)

> If it turns out that having additional data would be a big win, we
> should consider the cost and incentives  
> of providing that additional data and whether authors can  
> realistically provide the additional data (i.e. do they even know  
> it). 

With print-style quotations, they need to know a lot of "additional
data" about the quoted work, and then they need to consult their manual
of style to work out how on earth to cite it. With the sort of
machine-processable cited quotations I am advocating, they need to know
far less about either the quoted work or style conventions. For example,
found some text you want to quote in a web page? Select it, click "Copy
as quotation". Go somewhere else, and click "Insert as quotation." Or,
for example, found some text you want to quote in a book? Go to the
insertion point, click "Insert quotation", fill in an ISBN (or author,
title, date, or select from a list of remembered works), fill in a start
and end page, fill in the quotation text, and you're done.
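
As a rough sketch of what an "Insert as quotation" command might emit for the
web page case: the function name and the example URL below are hypothetical,
and the only markup assumed is the existing cite attribute on <blockquote>.

```python
from html import escape


def insert_as_quotation(text: str, source_url: str) -> str:
    """Wrap copied text in a <blockquote> whose cite attribute names the source.

    Hypothetical helper: a real editor would also carry over inline markup
    from the copied selection instead of flattening it to plain text.
    """
    return '<blockquote cite="{}"><p>{}</p></blockquote>'.format(
        escape(source_url, quote=True),  # safe inside the attribute value
        escape(text),                    # safe as element content
    )


if __name__ == "__main__":
    print(insert_as_quotation("Quoted words & more", "http://example.org/post?id=1&p=2"))
```

The author never types the URL or the markup; both come from the clipboard.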

> If this analysis suggests that authors would be able and  
> incentivized to provide the additional data, only then should we  
> design markup for it.

If they're citing materials at all, then they are already
providing /more/ additional data than my vision of how this should work
would require.

> > Requiring ordinary end-users to do /any/ of the following
> > tasks by hand seems unrealistic:
> 
> Indeed.

Indeed, so why are you suggesting we require them to do task 4 (which is
one of the hardest)?

> Or, authors could simply not mark up the sources of quotations  
> unambiguously leaving it to readers to cope with the relationship of  
> quotations and sources the same way readers of paper publications do.

What possible advantage would that provide? Sourcing quotations
unambiguously adds very little indeed to the challenges of writing
quotations and sources independently.

> If the spec is too "out there", it gets ignored.

Out where? Parsing metadata from hCite and META elements isn't exactly a
challenging data-processing task. (Bandwidth /might/ be more of an
issue. But I suspect bandwidth problems could be circumvented entirely
using OpenURLs, if necessary, since with OpenURLs you can encode a lot
of information about the cited work into the URI.)
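
To illustrate, here is a minimal sketch of packing a book citation into an
OpenURL-style URI. The resolver address and the function name are
hypothetical, and the key names reflect my reading of the key/value book
format in Z39.88-2004, so check them against the standard before relying on
them.

```python
from urllib.parse import urlencode


def book_openurl(resolver, isbn, start_page, end_page, title=None):
    """Build an OpenURL-style URI describing a quoted passage in a book.

    Assumption: key names (rft.isbn, rft.spage, rft.epage, rft.btitle)
    follow the Z39.88-2004 key/value book format as I understand it.
    """
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",
        "rft.isbn": isbn,
        "rft.spage": str(start_page),
        "rft.epage": str(end_page),
    }
    if title:
        params["rft.btitle"] = title
    # urlencode percent-escapes the values so the result is a valid query string
    return resolver + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder resolver and ISBN, for illustration only
    print(book_openurl("http://resolver.example.org/openurl", "0000000000", 12, 14))
```

Such a URI could be used as the value of a cite attribute, so the consuming
software would not need a second request to learn which work is being quoted.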

> Most notably, links are used on the Web to achieve a clear behavioral  
> goal in real software.

The behavioral goal is every bit as clear here: to make it easy to quote
stuff from somewhere else in such a way that people can:

a) get information about the quotation's source

b) go to the quotation's source

This couldn't be further from semantics for the sake of semantics. It's
as fundamental as <input type="text">.

--
Benjamin Hawkes-Lewis
Received on Wednesday, 3 January 2007 06:28:19 UTC