- From: Masayasu Ishikawa <mimasa@w3.org>
- Date: Tue, 03 Aug 2004 14:40:49 +0900 (JST)
- To: Christian.Hujer@itcqis.com
- Cc: www-html@w3.org, www-html-editor@w3.org
Christian Wolfgang Hujer <Christian.Hujer@itcqis.com> wrote:

> Section 9.8 The quote element
> What's the rationale behind requiring the author to add quotes via style or
> content instead of inserting them by default ("default stylesheet")?

This question comes up frequently, so I'll explain the rationale behind this.

Short summary:

The q element in earlier versions of (X)HTML placed the burden of adding "proper" quotation marks on the wrong side. The quote element in XHTML 2.0 shifts the burden of adding "proper" quotation marks from user agents to authors, who know what the "proper" quotation marks for their documents are.

Longer story:

Back in 2001, the HTML Working Group reviewed all elements/attributes in the XHTML namespace to decide whether they should be carried over to XHTML 2.0. A question arose whether the q element should be altered to NOT supply the quotation marks by default, and we discussed it with the I18N WG and the CSS WG.

The basic problem is that the q element requires arcane knowledge of language-sensitive quotation marks, and no user agent would be able to capture all the possible combinations of all languages around the world. So it would be unavoidable that each user agent would end up supporting only a certain subset of language-sensitive quotation marks, which may differ from one user agent to another; the least common denominator would be quite small, or even empty. So the result is unpredictable, and authors can't be sure what kind of quotation marks will be rendered, even though they do know what kind of quotation marks they intended.

While the HTML 4 spec didn't indicate what a user agent should minimally do [1], RFC 2070 included the following note [2]:

    NOTE -- minimal support for the Q element is to surround the contents with some kind of quotes, like the plain ASCII double quotes.
    As this is rather easy to implement, and as the lack of any visible quotes may affect the perceived meaning of the text, user-agent implementors are strongly requested to provide at least this minimal level of support.

And this fallback behavior is another reason why the q element was not used widely. In the early days, the main reason was of course the complete lack of support. However, by 2001 many "modern" browsers provided at least "minimal" support for the q element. To list a few (caution: this is the implementation status as of 2001; these may have been improved since then):

- Lynx has supported nested handling of the q element since 27 May 1996, alternating double quotes and single quotes, with directional start and end single quotes (i.e. something like "... `...' ...").
- Opera has supported the q element since version 4 (but only minimally), and also supports the relevant CSS properties.
- Mozilla/Netscape 6 also support it, but they just insert " around <q>...</q>, in a non-language-sensitive manner. They also support the relevant CSS properties, but didn't handle nesting of quotes properly at that time.
- Amaya alternates " and ', but it's not language-sensitive.
- Alis Tango allows quotation marks to be configured, but strangely its configuration is affected by the language of the *user interface*, so if a user chooses the Japanese UI, Tango inserts Japanese quotation marks regardless of the language of the document, even in an English or French context.
- IE5/Mac tries to be somewhat language-sensitive, but its behavior is sometimes strange, e.g. it uses the combination of U+201C - U+201D and U+2018 - U+2019 if the language is "en", but it merely uses " and ' for "en-US", "en-GB" and so on, and for some languages it uses strange quotation marks. It doesn't support the relevant CSS properties or other means to override the default quotation marks.
- iCab implements the q element in a language-sensitive manner to some extent, but doesn't provide a way to override the default quotation marks.
- IE/Win lacks support for the q element in all versions.

This situation effectively shows that the "minimal" level of support for the q element is certainly not difficult, but very few implementors dare to go beyond that level. Ironically, IE5/Mac and iCab tried to implement it far better than other user agents, but neither of them provided a way to override the default quotation marks, so, for example, neither of them renders Japanese quotation marks correctly, yet authors cannot override the poor "fallback" quotation marks on those user agents.

This situation rather discourages the use of the q element, e.g. even if a French author does know what the French quotation marks should be, the specification says that authors should not put quotation marks by themselves around q, and most browsers just end up with ", which is not at all satisfactory. Given that situation, it is quite possible that some authors just insert French quotation marks directly and don't use the q element at all.

Even the latest draft of "HTML Techniques for WCAG 2.0" says as follows [3]:

    The q element marks up inline quotations.
    NOTE: The q element, though designed for semantic markup, is unsupported, or poorly-supported, in most browsers. So this is a future technique.

This is not a document written in the last century, but a document written in 2004. Probably the "future" will never come. Not using appropriate markup for quotations is worse than not having appropriate quotation marks.

Another difficult aspect of handling language-sensitive quotation marks is that existing practice varies as to whether quotation marks are considered part of the content of the parent of the quoted text, or of the quoted text itself. We researched a bunch of publications, only to find that there's no consistent rule across the world.
For example, the quotation marks around English quoted text inside French text are typically rendered as French quotation marks. On the other hand, when languages like Chinese, German, Indonesian, Korean, or Malay are quoted inside Japanese text, quotation marks are typically rendered in the language of the *quoted text*, not as Japanese quotation marks. These are all real-world examples, and they effectively show that there are diverse practices around the world, and that it is not at all trivial to determine the "proper" quotation marks in a given context. The rule may even differ by local convention, or by the author's preference.

If we required that user agents have default style rules, implementors would have to prepare a great number of language-sensitive style rules, and even if they did a great job, they wouldn't be able to cover all possible combinations of the various languages around the world, and even if they could, the result might not match the author's preference/convention.

On the other hand, it is rather rare that a document includes multilingual quotations, and authors only have to provide the few style rules that are necessary for their documents. And they do know their preference/convention.

So we concluded that it would be reasonable to place the burden of adding "proper" quotation marks on authors rather than implementors. The I18N WG recommended styling as the preferable way, and encouraged CSS implementors to support the relevant features more widely and consistently. Then each author may have their own default style rules, and may include them in their author style sheet. We could provide some sample style rules, but they MUST NOT be in the default XHTML 2.0 style sheet.

That's what was agreed between the HTML, I18N, and CSS WGs more than three years ago, and why the quote element doesn't add quotes by default.
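
As an illustration of the kind of author style rules discussed above, here is a minimal sketch. It is not taken from the XHTML 2.0 draft or from any working group document. It assumes a user agent that supports the CSS2 'quotes' property, generated content via :before/:after, the :lang() pseudo-class and the child combinator, applied to the XHTML 2.0 quote element. The languages covered and the marks chosen for each are only examples; an author would substitute their own convention.

```css
/* Illustrative author style sheet; every mark chosen here is an assumption. */

/* Generate the marks from whatever 'quotes' value applies to the element. */
quote:before { content: open-quote; }
quote:after  { content: close-quote; }

/* Fallback: English-style curly quotes, double outside and single inside. */
quote { quotes: "\201C" "\201D" "\2018" "\2019"; }

/* Convention A: marks follow the language of the quoted text
   (the practice described above for quotations inside Japanese text). */
quote:lang(ja) { quotes: "\300C" "\300D" "\300E" "\300F"; }
quote:lang(de) { quotes: "\201E" "\201C" "\201A" "\2018"; }

/* Convention B: marks follow the language of the surrounding text
   (the practice described above for quotations inside French text).
   This rule has the same specificity as the rules above, so it is
   placed last to win whenever the parent element is French. */
:lang(fr) > quote { quotes: "\AB\A0" "\A0\BB" "\201C" "\201D"; }
```

With rules like these, a German quotation inside a Japanese paragraph would get German marks, while an English quotation inside a French paragraph would get French guillemets. An author who prefers the opposite convention only has to change a handful of rules.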
[1] http://www.w3.org/TR/html4/struct/text.html#edef-Q
[2] http://www.rfc-editor.org/rfc/rfc2070.txt
[3] http://www.w3.org/TR/2004/WD-WCAG20-HTML-TECHS-20040730/#q

Regards,
--
Masayasu Ishikawa / mimasa@w3.org
W3C - World Wide Web Consortium
Received on Tuesday, 3 August 2004 01:40:58 UTC