RE: Checkpoint 3.7: Big Hurdle for Double-AA/Triple-AAA Compliance

On Fri, 23 Jul 1999, Bruce Bailey wrote:

> 8859-1 does not support left and right quote marks, but Windows has 
> co-opted the non-displayed characters 

This is surely common knowledge.  These characters correspond to
Unicode code points which are located (well) above 255 in the HTML
Document Character Set.

> from &#128; (x80) through &#159; 

These notations are explicitly undefined in HTML.  Please do not
confuse the argument even further by using these undefined notations
in your presentation: the discussion is hard enough, without having to
deal with this kind of problem too.  These character code values would
be technically legal as 8-bit coded characters in a datastream whose
coding had been correctly announced with an appropriate charset value
(one of those charset values registered at the IANA repository).  
However, there is no requirement for client agents to accept any
arbitrary coding that the server may offer.  Best results would be
achieved by using one of the internationally standardised codings,
rather than a vendor-defined one.
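
To illustrate why the announced coding matters, here is a minimal
sketch in Python.  It is only an illustration: the codec names
"cp1252" and "latin-1" are Python's own labels for windows-1252 and
ISO-8859-1, and the variable names are made up for the example.  The
same two byte values are decoded under each coding in turn.

```python
# Bytes 0x93 and 0x94 are what Windows software typically emits for
# left and right double quotation marks.
raw = bytes([0x93, 0x94])

# Decoded under the vendor coding (windows-1252, "cp1252" in Python),
# they map to Unicode code points well above 255.
as_windows = raw.decode("cp1252")
windows_points = [f"U+{ord(ch):04X}" for ch in as_windows]

# Decoded as ISO-8859-1 ("latin-1"), the same bytes land on
# U+0093 and U+0094, which are control positions, not quote marks.
as_latin1 = raw.decode("latin-1")
latin1_points = [f"U+{ord(ch):04X}" for ch in as_latin1]

print(windows_points)  # ['U+201C', 'U+201D']
print(latin1_points)   # ['U+0093', 'U+0094']
```

So the bytes only mean "quotation mark" when the vendor coding has
been announced and the client chooses to honour it.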

> several other characters that Mac/PC users have taken for granted for years 
> (dagger, emdash, (tm), etc.).

Be that as it may...

> Unicode supports smart quotes (and much more), but the 3x versions of MS IE 
> and 4x (!) version of Netscape Navigator do not display them.  

It isn't really true that NN 4.* versions cannot display them, though
it may be that some readers don't have their NS4.* properly set up.  
If I display this page for instance

http://ppewww.ph.gla.ac.uk/~flavell/unicode/unidata20.html 

on Win NS4.* with the Unicode font configured to be e.g. Verdana (I'm
downgrading it from my usual Cyberbit unicode font for the purpose of
this discussion) then I see a good repertoire of quotation marks from
U+2018 to U+201E, as well as your daggers at U+2020 and U+2021.  The
TM is also in its proper place (on the next page, of course).

> The same is 
> true for the named entities (&ldquo; ... &rdquo;) and for <Q> ... </Q> 
> (which actually even have LESS support).

I believe that the lack of support for <Q> was already appreciated by
the participants in this discussion, and is a matter of some concern,
despite the strong attractions of this construct in accessibility
terms.

> The most cross-platform-compatible way to get proper quote mark is with 
> &#147; and &#148; 

Well, I'm sorry, but as far as I'm concerned, these constructs are
undefined in HTML, and I assure you that it does NOT work with "all"
browsers.  There is a correct way of representing these characters in
HTML, and nothing seems to be gained at this point in trying to
reverse the progress of HTML in order to pander to the ad hoc
behaviour of some older browser versions.
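
For anyone with existing pages to repair, the following is a rough
sketch in Python of one way to rewrite the undefined references into
defined ones.  It is an assumption-laden illustration, not the
"demoroniser" tool or any standard utility: the function name
repair_references is hypothetical, and it assumes the author meant the
windows-1252 character at that byte value.

```python
import re

# Matches decimal references in the range 128..159 only, e.g. &#147;
_WINDOWS_NCR = re.compile(r"&#(12[89]|1[345][0-9]);")


def repair_references(html: str) -> str:
    """Rewrite &#128;..&#159; as references to the intended Unicode code points.

    Hypothetical helper: it assumes each such reference was meant as the
    windows-1252 character with that byte value.
    """

    def _swap(match: re.Match) -> str:
        byte_value = int(match.group(1))
        try:
            char = bytes([byte_value]).decode("cp1252")
        except UnicodeDecodeError:
            # A few byte values are unassigned in Python's cp1252 codec;
            # leave those references untouched.
            return match.group(0)
        return f"&#{ord(char)};"

    return _WINDOWS_NCR.sub(_swap, html)


print(repair_references("&#147;quoted&#148;"))  # &#8220;quoted&#8221;
```

References outside the 128..159 range, such as &#233;, are left alone,
since those are already defined.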

> BTW, 
> the W3C validator DOES approve the use of &#147; and the like.

This is a subtle technical issue, but these constructs are not
syntactically invalid (and therefore could not be rejected by a formal
validator): their meaning, however, is not defined by the HTML
specification.  The HTML document character set is defined to be
ISO 10646/Unicode, and these positions of the document character set
are reserved for control functions, not populated with displayable
characters.
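
The status of those positions can be checked against the Unicode
character database.  The sketch below uses Python's standard
unicodedata module; the variable names are mine, chosen only for the
example.

```python
import unicodedata

# Numeric character references refer to the document character set
# (ISO 10646/Unicode), so &#128; .. &#159; address U+0080 .. U+009F.
c1_range = range(0x80, 0xA0)

# Unicode general category "Cc" means "control".
categories = {unicodedata.category(chr(cp)) for cp in c1_range}

# The quotation marks themselves are defined elsewhere, above 255.
left_quote = unicodedata.name("\u201c")
right_quote = unicodedata.name("\u201d")

print(categories)   # {'Cc'}
print(left_quote)   # LEFT DOUBLE QUOTATION MARK
print(right_quote)  # RIGHT DOUBLE QUOTATION MARK
```

Every position in that range reports as a control, which is why a
validator can accept the syntax while the meaning remains undefined.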

> I would love to hear from someone to tell me if &#147; ... &#148; works 
> with Un*x browsers.

With standard fonts, the characters disappear entirely, for reasons
that should be obvious (and are trenchantly documented in the writeup
for the "demoroniser" tool).  You could make it work if you visit
every unix user and force them to install non-standard unix fonts.


This was unfortunate.  There are some very real problems to be
discussed and tackled, but I feel that my response here has done
nothing to move matters forward, but has merely made an effort to
regain some of the ground that was just lost.

best regards

Received on Friday, 23 July 1999 17:57:23 UTC