- From: Richard Ishida <ishida@w3.org>
- Date: Fri, 04 Nov 2011 15:33:11 +0000
- To: www International <www-international@w3.org>
Before I send a note to the Unicode Consortium, I thought I'd check for
feedback here. Looking through the list of quotation marks that Ian
Hickson {1} just added to the HTML5 spec I noticed one or two things
that look like anomalies (in the Unicode data). (That table is generated
automatically from the CLDR XML files.)
[1] Two locales, af and tg, have non-paired punctuation marks for
secondary quotes. The tg data is not yet confirmed, but the af data is.
Is this really correct?
[2] The Arabic entry has the following:
'\201c' '\201d' '\2018' '\2019'
i.e.
“ U+201C LEFT DOUBLE QUOTATION MARK
” U+201D RIGHT DOUBLE QUOTATION MARK
‘ U+2018 LEFT SINGLE QUOTATION MARK
’ U+2019 RIGHT SINGLE QUOTATION MARK
which corresponds to
quotationStart quotationEnd alternateQuotationStart alternateQuotationEnd
I think this is wrong. Since these are not mirrored characters in
Unicode, surely the order should be
” U+201D RIGHT DOUBLE QUOTATION MARK
“ U+201C LEFT DOUBLE QUOTATION MARK
’ U+2019 RIGHT SINGLE QUOTATION MARK
‘ U+2018 LEFT SINGLE QUOTATION MARK
The same applies to Hebrew and, I assume, to other languages when they
are written in RTL scripts.
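Written as a style rule, the order suggested above for Arabic might look
roughly like the sketch below. The plain :lang(ar) selector is a
simplifying assumption; the selectors in the spec's generated table may
well differ.

```css
/* Sketch only: suggested default for Arabic, with the RIGHT marks used
   as the opening quotes. The :lang(ar) selector is an assumption, not
   the exact selector from the spec's table. */
:lang(ar) {
  /* quotationStart quotationEnd alternateQuotationStart alternateQuotationEnd */
  quotes: '\201d' '\201c' '\2019' '\2018';
}
```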
(Note, btw, that these assignments are only default settings. They can
be changed using CSS if desired, e.g. to substitute angle brackets for
quotes in Arabic text.)
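As a rough illustration of such an override, an author stylesheet could
set the quotes property for Arabic content along these lines. The
selector and the choice and order of characters here are illustrative
assumptions only, not a recommendation for Arabic typography.

```css
/* Hypothetical author override: use angle quotation marks instead of
   the default quotes for Arabic text. Characters and their order are
   illustrative; pick whatever suits the content. */
:lang(ar) {
  /* primary open, primary close, secondary open, secondary close */
  quotes: '\00ab' '\00bb' '\2039' '\203a';
}
```

Because the values are just pairs of strings, the first pair is used for
outer quotations and the second for quotations nested inside them.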
Any thoughts on this?
RI
PS: (I guess I need to say ;-) Please keep replies to the questions
above, rather than moving the discussion (at least in this thread) to
whether the q element should or should not automatically apply quotation
marks and if so all the pitfalls that that may entail.
{1} http://dev.w3.org/html5/spec/rendering.html#quotes
--
Richard Ishida
Internationalization Activity Lead
W3C (World Wide Web Consortium)
http://www.w3.org/International/
http://rishida.net/
Received on Friday, 4 November 2011 15:33:40 UTC