Re: CSS and quotation typography

David Woolley writes:

> I must admit that I can't find it as clearly written as I thought, but
> you need to go back to 1989 to really get the picture.

Ah, thanks for digging that up for me. Two quibbles:

1) As far as I can tell, that document is strictly a design for the
   *web*, rather than HTML per se. That is to say, the development of a
   variety of media types for web nodes, some more complex than others,
   is not inconsistent with "authorship" becoming "universal". It is not
   (I think) until HTML 3.2 that HTML claims to be "the", as opposed to
   *a*, markup language for the web.

2) More importantly, "universal" authorship is clearly much broader
   than just authorship by the "reasonably intelligent". How well would
   those with cognitive difficulties distinguish between the HTML 1.0
   draft's cocktail of concepts like citation, emphasis, and definition
   and presentational elements like <i> and <b> [1]? Arguably, HTML was
   broken for universal authorship long before big business got in on
   the game.

> I'd certainly agree that by the time that was written, HTML had lost
> out in its simplicity objectives, to commercialization.

If HTML is no longer remotely simple, then I don't see how an appeal to
a prelapsarian design goal of radical simplicity (even if such a goal
can be substantiated) works as an objection to <q> in particular. 

> A lot of languages are designed for technical authors

It's true that DocBook "is particularly well suited to books and papers
about computer hardware and software (though it is by no means limited
to these applications)" [2]. But Korpela's language wasn't pitched at
technical authors, and authors in the fields of mathematics, science,
and even the humanities use LaTeX, which also uses a special syntax for
inserting quotation punctuation [3].

As for the observation that DocBook "has a more sophisticated audience
than HTML", I think that's a pretty theoretical objection. The majority
of people who think they are writing HTML are not authoring anything of
the sort; they are swimming in the treacherous waters of hypermedia tag
soup.

> "web designer" is not a job title describing a technical author

True, but a web designer *should* understand HTML, which includes far
more complex components than <q>. Designers also tend to be people
with an interest in correct typography.

> It's an unnecessary abstraction for such users.

Why are you so sure of this? What do you think the point of semantic
abstraction is usually? What advantage does <p> offer over <div> or
<em> over <i> that <q> doesn't offer over, or add to, ‘’ and friends?

> It is almost one of the least used.  This is the evidence I am using,
> although one has to admit that a minor factor is the lack of backward
> compatibility.

Given that even Internet Explorer 7 won't support <q> and that the
implementation in other browsers is dire, this is hardly a matter of
"backward compatibility". I really can't see why you think the lack of
support is only a "minor" factor. Almost every discussion of <q> I've
seen, other than the HTML specifications themselves, notes that the
element is useless because of poor support that can only be fixed with
JavaScript. The absence of current usage just isn't persuasive evidence
that <q> would not be used if properly implemented.

> Unfortunately (using England as an example) formal grammar is no longer
> taught

Time for some damned lies and statistics. Punctuation of quotations with
inverted commas is part of the National Curriculum, Key Stage 2 [5]. 63%
of English 11 year olds passed Key Stage 2 tests to the required
standard for writing in 2005 [6]. 44% of adults have literacy roughly
equivalent to the required standards for Key Stage 2 [7].

> that means that any reasonable punctuation is something of an
> optomistic expectation, rather than an indication that SGML/XML markup
> of the punctuation units would improve the situation.

Do people have a better understanding of the concept of quotation or its
correct punctuation? It seems to me the latter is impossible without the
former.

> What's problem C?

Sorry, see "Problem C: punctuation for wrapped lines of <q>" in my original
post [8].

> As I remember it, this limitation is being fixed in XHTML 2.0.

I'm not sure what you're thinking of here. (If it's <l> then no, that's
no improvement over <br> since the line breaks in Problem C result from
width-dependent line wrapping *not* from semantic/structural line
divisions.)

> By normal punctuation, I mean that punctuation that is possible without
> a markup language.

Possible where? In text/plain;charset=utf-8? At a printing house? With a
typewriter? On a postcard?

> Failure to implement in browsers is always a problem and
> specificications can only get round this by creating commercially
> attractive features

There's little gain in this if the (purportedly) non-commercial features
don't get implemented too.

> Jaws tends not to lead, but rather to do things which help with pages
> coded according to de facto current web design practice, so its lack
> of <q> support probably more reflects the lack of use of <q>.

Yes, it's a self-fulfilling prophecy. Anyhow, Jaws 7 came with support
for Firefox 1.5, but not for the <q> element.

> Again, technical authoring isn't where browsers make money, it is
> advertising and thin client applications.

How did implementing <acronym> or <em> help browsers make money where
<q> would not? 

To (attempt to) sum up your position, we should keep the status quo of a
broken <q> because:

1) (X)HTML + CSS must be simple enough to be a universal authoring
   medium and the under-educated and/or cognitively challenged won't
   understand the concept of <q>.

2) The "reasonably intelligent" who are well educated but not technical
   authors have no need for <q>.

I'm sorry to say your first argument seems an irrelevant consideration.
Including a broken <q> makes the HTML specification even worse for the
under-educated and/or cognitively challenged. And excising <q> would not
make HTML a suitable markup language for them. You don't need to get rid
of <q>, you need to create a *radically simpler markup language* or
*more sophisticated authoring tools with some level of natural language
processing*. Both are extremely worthy causes, and I'd seriously like to
see more discussion of them at the W3C and in the web community
generally, but I don't think this is a good rationale for not improving
<q>.

As for the second argument, here's a quick list of what we'd get from a
properly implemented <q>:

a) a standardized way to bind a quotation to a citation

b) a semantic element with which to style quotations (e.g. with a
   different background color)

c) an element to distinguish quotations from things that look like
   quotations but aren't (scare quoted phrases, new terms, article
   titles, and so on)

d) an element that shows unambiguously where a quotation begins and
   ends, important information that is obscured by some punctuation
   styles (such as that typically employed by US English)

e) an element that makes it easier to track quotations (e.g. tracing
   the influence of scholarship, the impact of political soundbites, or
   the popularity of comic catchphrases)

f) a more consistent (X)HTML specification, given the existence of
   <blockquote>

g) an element that can be recognized by assistive technology even if
   the surrounding punctuation is unfamiliar or wrong

And also, if quotation marks are added automatically:

h) no need to remember key sequences or use a picker to insert
   typographically correct punctuation

i) no need to even learn how to correctly punctuate quotations for a
   given language (or pay someone for that expertise)

j) no problem with poorly designed software turning straight quotation
   marks in measurements, transcriptions, or code samples into "smart"
   quotes [e.g. 9]

k) the ability to switch between different punctuation styles with ease
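
As a rough illustration of (b) and (k), here is a minimal sketch that
assumes a browser implementing CSS 2 generated content and the `quotes`
property for <q>. It follows the pattern of the example in the CSS 2
specification's section on `quotes`; the background color and the
particular marks chosen for each language are illustrative assumptions,
not recommendations.

```css
/* (b) Style quotations as quotations. The color is an arbitrary choice. */
q { background-color: #ffffcc; }

/* Have the browser generate the marks around every <q>. */
q:before { content: open-quote; }
q:after  { content: close-quote; }

/* (k) Choose the punctuation style per language: the first pair is for
   outer quotations, the second pair for quotations nested inside them.
   English: curly double marks, then curly single marks. */
q:lang(en) { quotes: "\201C" "\201D" "\2018" "\2019"; }

/* French (assumed style): guillemets with no-break spaces (\A0). */
q:lang(fr) { quotes: "\AB\A0" "\A0\BB" "\2039\A0" "\A0\203A"; }
```

With rules like these, changing a document's punctuation style means
editing one stylesheet rule rather than every quotation in the text.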

Others may well be able to think of benefits not on this list. Note that
none of the advantages listed are peculiar to technical authoring (that
is, writing about the field of technology).

I defy anyone to think of a serious advantage to including a *broken*
<q> in the (X)HTML + CSS standards.

References
----------

 [1] http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt

 [2] http://www.docbook.org/whatis

 [3] http://www.latex-project.org/guides/usrguide.pdf

 [4] http://www.w3.org/TR/WCAG10-HTML-TECHS/#text-quotes

 [5] http://www.nc.uk.net/nc/contents/En-2-3-POS.html#N109BF

 [6] http://www.literacytrust.org.uk/Database/stats/englandstats.html#graph

 [7] http://www.literacytrust.org.uk/Database/stats/adultstats.html#England

 [8] http://lists.w3.org/Archives/Public/www-style/2006Sep/0141.html

 [9] http://blogs.msdn.com/ie/archive/2006/08/28/728654.aspx

---------------------
Benjamin Hawkes-Lewis

Received on Monday, 18 September 2006 12:59:53 UTC