Re: ensuring the existence & enhancing the power of Q from Benjamin Hawkes-Lewis on 2007-04-12 (public-html@w3.org from April 2007)

From: Benjamin Hawkes-Lewis <bhawkeslewis@googlemail.com>
Date: Fri, 13 Apr 2007 00:12:37 +0100
To: public-html@w3.org
Message-Id: <1176419557.15007.25.camel@galahad>

Thanks to Gregory J. Rosmaita for raising this issue and stressing the
difference between quotations and mere direct speech and scare quotes;
the fate of q is a minor obsession of mine:

http://www.benjaminhawkeslewis.com/www/accessibility/q-element/

I strongly agree that we need to maintain and improve a
machine-discoverable way of demarcating quotations from external sources
and following citations. This feature was pretty fundamental to the
vision of "hypertext" originally articulated by Ted Nelson in 1965, so
the desire expressed by some to remove its vestiges from HyperText
Markup Language is somewhat depressing. Separating content from
presentation facilitates restyling, but more importantly:

1) Identifying quotations is crucial for screen reader and voice browser
accessibility. Such user agents must either present quotations in a
different voice or announce them. Users often wish to minimize
non-essential punctuation. If it is hard for browser developers to
dictate language-sensitive quotation punctuation, how much harder must
it be for screen reader developers to cope with the entire variety of
punctuation in the absence of markup!

2) Excerpting sources is a key component of human discourse. The easier
we make it to review original sources, the sooner errors and
misrepresentations will be discovered and the more effective
communication will be. Solving the minor technical problems associated
with q and blockquote will therefore have tangible benefits to society.

Pace Anne van Kesteren, forcing authors to insert quotation punctuation
in raw text would /not/ free browser developers from the need to
implement complex CSS for styling quotations since raw text /cannot/ be
used to express the full variety of quotation punctuation actually in
use. See Problem C analyzed in my earlier message to www-style:

http://lists.w3.org/Archives/Public/www-style/2006Sep/0141.htm

Ian Hickson's suggestion that complex regular-expression-based CSS
replace author-specified quotation punctuation sounds feasible, but to
achieve the accessibility benefits discussed above the spec would have
to mandate a particular set of punctuation, or assistive technology
would once again be reduced to playing a guessing game.

Placing complex human-readable bibliographical information into an
attribute such as cite, as suggested by Gregory, or title, as suggested
by Olivier Gendrin, would be a mistake, because within an attribute:

1) You cannot identify changes in language within the text in a
machine-discoverable manner, which is an important accessibility
requirement (e.g. so screen readers can switch to the correct
pronunciation for that language).

2) You cannot include links (e.g. to alternate editions).

3) You cannot express other machine-readable semantics (e.g. hCite).

for/id attributes connecting quotation to citation elements would not
suffer from the same issues, although I think a "from" attribute would
be more logical than a "for". How would this system cope with repeated
references to different "fragments" (e.g. deadtree pages, film times,
fragment identifiers) from the same resource? Would there be a different
<cite> element for every page cited from a book? Would the full citation
information need to be restated for each mention? Or would there be a
way to chain <cite> elements together in a machine discoverable way?
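
To make the for/id idea concrete, here is a rough Python sketch of how a
user agent might pair quotations with their citations. The "from"
attribute, the sample markup, and the names QuoteLinker and
resolve_quotes are all assumptions made up for illustration; none of
them is part of any spec or existing implementation.

```python
from html.parser import HTMLParser


class QuoteLinker(HTMLParser):
    """Collect <cite id=...> text and <q from=...> references.

    The 'from' attribute on <q> is hypothetical (see the message above).
    """

    def __init__(self):
        super().__init__()
        self.citations = {}  # cite id -> citation text
        self.quotes = []     # (referenced cite id, quotation text)
        self._stack = []     # open q/cite elements: [tag, key, text parts]

    def handle_starttag(self, tag, attrs):
        if tag in ("q", "cite"):
            # <q> points at a citation via 'from'; <cite> is named via 'id'.
            key = dict(attrs).get("from" if tag == "q" else "id")
            self._stack.append([tag, key, []])

    def handle_data(self, data):
        # Text belongs to every q/cite element that is currently open.
        for entry in self._stack:
            entry[2].append(data)

    def handle_endtag(self, tag):
        if tag in ("q", "cite") and self._stack and self._stack[-1][0] == tag:
            _, key, parts = self._stack.pop()
            if key is None:
                return  # no id/from attribute, so nothing to link
            text = "".join(parts)
            if tag == "cite":
                self.citations[key] = text
            else:
                self.quotes.append((key, text))


def resolve_quotes(markup):
    """Return (quotation text, citation text or None) pairs."""
    linker = QuoteLinker()
    linker.feed(markup)
    linker.close()
    return [(text, linker.citations.get(key)) for key, text in linker.quotes]


sample = ('<p><q from="src1">An example quotation.</q></p>'
          '<p><cite id="src1">An Example Source, p. 42</cite></p>')
print(resolve_quotes(sample))
```

Note that in this sketch every distinct page reference still needs its
own <cite> element, which is exactly the repetition problem raised
above; it does nothing to chain citations together.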

An alternative which would seemingly avoid all these problems would be
to adopt a data-rich, machine-readable citation URI format (perhaps
patterned on OpenURL) which could be included in the quotation element's
cite attribute and parsed, displayed, and followed by user agents in
ways that suit users, rather than the transient, time-wasting, and
error-prone citation formatting guidelines to which authors are
enslaved. By "data-rich" I mean that such a URI format would contain
within itself essential metadata (e.g. author, title, date, ISBN, page
number for books) without requiring further lookup to discover these.
Thus in the event of catastrophic failure, the relationship between
texts, and even the original purport of lost texts, could still be
reconstructed, much as with ancient scholia:

http://en.wikipedia.org/wiki/Scholia
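
As a minimal sketch of what "data-rich" could mean in practice, the
Python below pulls metadata straight out of a citation URI without any
network lookup. The URI layout, the host cite.example, and the key names
(author, title, date, isbn, page) are hypothetical placeholders chosen
for illustration; they are not actual OpenURL keys, and the values are
dummy data.

```python
from urllib.parse import urlsplit, parse_qs


def parse_citation_uri(uri):
    """Extract embedded metadata from a hypothetical data-rich citation URI.

    Only the query string is read, so no further lookup is needed.
    If a key is repeated, the first value is kept.
    """
    fields = parse_qs(urlsplit(uri).query)
    return {key: values[0] for key, values in fields.items()}


# Hypothetical URI carrying its own metadata (all values are placeholders).
example = ("http://cite.example/book?author=A.+Author"
           "&title=An+Example+Title&date=1999&isbn=0000000000&page=42")

metadata = parse_citation_uri(example)
for key in ("author", "title", "date", "isbn", "page"):
    print(key, "=", metadata[key])
```

A user agent could take such a dictionary and format the citation
however the user prefers, or search for the source elsewhere if the
original link has died.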

I do not believe localization of quotation punctuation by user agents to
be a key feature; so far that requirement has largely proved a
regrettable roadblock to implementation. If it is specified, the spec
itself should enumerate what punctuation should be used for what
languages. But browsers should be obligated to expose quotations and
provide easy access to their sources. One way to do the latter is to
provide a link through to sources via the context menu.

--
Benjamin Hawkes-Lewis
Received on Thursday, 12 April 2007 23:17:56 UTC