- From: Ernest Cline <ernestcline@mindspring.com>
- Date: Fri, 12 Dec 2003 19:15:46 -0500
- To: "W3C HTML List" <www-html@w3.org>

In order to handle quotes correctly we must be able to handle the quotation marks correctly. Because in English at least there can be a difference in how quote marks are handled in a block quote and in an inline quote, XHTML 2 needs to be able to distinguish between the two and handle quotation marks correctly. This is fundamental to the nature of a quotation and not just presentational. Hence, XHTML 2 should be able to handle this even in the absence of styling.

There are three schemes I wish to consider. The first would be for quotation marks to be included with the content of the quotation element, as is currently the case for the (X)HTML <blockquote> element. The second would be to use heuristics to determine what quote marks to use, as is currently the case for the (X)HTML <q> element. The third would be to use a child element, like the <rp> element from the XRuby Module, to provide optional quotes that a user agent could include as warranted.

I will defer as long as possible the issue of block vs. inline, as this is clearly secondary to the matter of quotation marks. I will begin by examining how all three approaches work when the content is part of a single document, and then proceed to consider how they work when inclusion or transclusion is done.

QUOTATION MARKS AS CONTENT

With quotation marks as content, the first question we need to consider is where they get put.

Example 1A:

    <!-- XHTML2 fragment -->
    <p>Text1 "<quote>Text2</quote>" Text3</p>

Example 1B:

    <!-- XHTML2 fragment -->
    <p>Text1 <quote>"Text2"</quote> Text3</p>

If <quote> is displayed inline, there is no display difference between Examples 1A and 1B. If <quote> is displayed as a block, however, Example 1A leaves the quotation marks stranded in the surrounding text, while Example 1B displays correctly. Thus, if quotation marks are to be handled only as content, they must be inside the quotation element, unless quote is to be restricted to an inline-only element, which is clearly not acceptable.

This placement causes problems both from the viewpoint of styling and from the viewpoint of inclusion. Whether quote marks are used for a quotation given a block presentation is a stylistic preference. If the quote marks are not desired, there is no way in CSS to remove them or alter them. With inclusion, it is impossible to shift to the correct set of quotation marks for embedded quotes.

Example 2A:

    <!-- file "example.2" -->
    <quote>"Look at Spot,"</quote> said Dick.

    <!-- XHTML2 fragment -->
    When McGuffey wrote, <quote>"<xi:include href="example.2"/>"</quote>

Example 2B:

    <!-- file "example.2" -->
    <quote>"Look at Spot,"</quote> said Dick.

    <!-- XHTML2 fragment -->
    <p><xi:include href="example.2"/>
    <quote>"See Spot run."</quote></p>

In Example 2A, a correct presentation should change the quotation marks around "Look at Spot," to single quote marks, while in Example 2B they should remain double quotation marks. Hardcoding one type of quotation mark causes problems with both styling and inclusion. As a result, it can be concluded that XHTML should not handle quotation marks only as content.

QUOTATION MARKS AS A SUBELEMENT

For the same reasons as given earlier, quotation marks given as a subelement should have the subelement contained inside the quotation element. The advantage of using an element is that it allows for alternatives to be provided.

Example 3A:

    <!-- file "example.3" -->
    <q><qc>"<qc>'</qc></qc>Look at Spot,<qc>"<qc>'</qc></qc></q> said Dick.

    <!-- XHTML2 fragment -->
    When McGuffey wrote, <q>"<xi:include href="example.3"/>"</q>

Example 3B:

    <!-- file "example.3" -->
    <q><qc>"<qc>'</qc></qc>Look at Spot,<qc>"<qc>'</qc></qc></q> said Dick.

    <!-- XHTML2 fragment -->
    <p><xi:include href="example.3"/>
    <q>"See Spot run."</q></p>

This solves the problem we had earlier, as the user agent can select the appropriate quote marks in both Examples 3A and 3B. The usage of a subelement also solves the quotation mark problem for block quotations.

It does so because styling can specify alternate quotation marks simply by making only the desired marks visible.

All is not rosy though. There remain two potential problems. One problem is how to handle included quotes in multilingual documents correctly. The problem arises from the fact that the style of the quote marks for inline quotes is derived not from the language of the text that is being quoted, but from the language that is used to frame the quote.

Hopefully you'll let me get away with only describing the example I have in mind. My knowledge of XPointer and XPath is only sufficient for me to say it can be done, not to give the exact code needed to do it.

Imagine a source document in French that contains a quote in Latin. Now another document, in English, transcludes the quote. The problem comes in what to transclude. If the quotation element is transcluded, then we get the French double angle bracket quote characters coming along for the ride, which is not desirable. If only the content inside the quote is transcluded, then we lose access to the attributes of the quotation element, such as xml:lang. However, this is not as dire as it may seem at first glance. Technically, the proper way for the French document to handle this is:

    <q><qc/><span xml:lang="la">quoted text</span><qc/></q>

not:

    <q xml:lang="la"><qc/>quoted text<qc/></q>

as the quote marks are French and not Latin.

However, an XML Include implementation is only required to support the element() scheme, not the more complicated xpointer() scheme. To ensure proper handling of quotes without imposing a higher level of required competence for XML Include when used with XHTML, it would be useful to require that the quoted text, if this content model is followed, be enclosed inside another element. This element need not be a special-purpose element though; simply excluding PCDATA from the content model of the quotation element would be sufficient.

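To make the subelement scheme concrete, here is a rough Python sketch of how a user agent might choose among nested <qc> alternatives according to how deeply a quotation is embedded. It is only an illustration under my own assumptions: the function names pick_mark and render are hypothetical, the included file is pasted in by hand rather than pulled in through XML Include, and the outer quotation is given <qc> children of its own instead of literal quote characters.

```python
import xml.etree.ElementTree as ET

def pick_mark(qc, level):
    """Return the quote mark for the given nesting level.

    The outermost <qc> holds the mark for level 0 and each nested <qc>
    holds the mark for the next level down (assumed reading of Example 3).
    """
    node = qc
    for _ in range(level):
        inner = node.find("qc")
        if inner is None:
            break  # no deeper alternative given: reuse the deepest one
        node = inner
    return node.text or ""

def render(elem, level=0):
    """Flatten an element to plain text, choosing marks by quotation depth.

    `level` counts the <q> ancestors of `elem` (0 for an outermost quote).
    """
    child_level = level + 1 if elem.tag == "q" else level
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "qc":
            parts.append(pick_mark(child, level))
        else:
            parts.append(render(child, child_level))
        parts.append(child.tail or "")
    return "".join(parts)

# The quotation from file "example.3", inlined by hand for this sketch.
INNER = "<q><qc>\"<qc>'</qc></qc>Look at Spot,<qc>\"<qc>'</qc></qc></q>"

standalone = render(ET.fromstring("<p>" + INNER + " said Dick.</p>"))
embedded = render(ET.fromstring(
    "<p>When McGuffey wrote, <q><qc>\"<qc>'</qc></qc>"
    + INNER + " said Dick.<qc>\"<qc>'</qc></qc></q></p>"
))
print(standalone)
print(embedded)
```

On its own the quotation keeps its double marks; embedded inside another quotation, the same markup yields single marks, which is the behaviour Examples 3A and 3B call for.
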
Limiting the quoted text to a single subelement would also help transclusions, but it is not essential, and this model is getting cluttered with elements already. Indeed, the other problem is that we're having to add a lot of elements to an XHTML document just to support quote marks correctly. Instead of a single element, this scheme needs six elements just to handle a simple English quote correctly in all conceivable circumstances. (The quotation element, a pair of nested quote mark elements and a quotation text element.) This model works, but it is very kludgy.

QUOTATION MARKS BY HEURISTIC

This would mean going back to having the user agent provide the quoting characters, as specified in HTML 4/XHTML 1 for the <q> element. There are two main problems with this. The first is that a commonly used user agent for HTML 4 has never supported adding the quotes, nor the associated CSS quotes property. Thus it may prove problematic to get even a basic version of this widely supported. The second is that the proper quote marks depend heavily upon the language; it is not reasonable to expect a user agent to know the quoting conventions of all of the hundreds of languages specifiable by RFC 3066.

Automatic quote mark adding should be given a minimal level of support that is required for conformance, while encouraging the user agent to improve the performance for languages it knows. An obvious minimal level of support would be equivalent to the following CSS style rules:

    * { quotes: "\0022" "\0022" "'" "'" }
    q:before { content: open-quote }
    q:after { content: close-quote }

and then have any more intricate level of support come from styling. (Whether from CSS or some other source.) Then a user agent could provide the following CSS or its equivalent:

    :lang(en)>*, :root:lang(en) { quotes: "\201C" "\201D" "\2018" "\2019"}
    :lang(fr)>*, :root:lang(fr) { quotes: "\00AB" "\00BB" "\2039" "\203A"}
    etc.

but it wouldn't be required to do so.

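For those who would rather see the heuristic as an algorithm than as style rules, the following Python sketch does roughly what the CSS above does. The names DEFAULT_QUOTES, KNOWN_QUOTES, quote_pair and wrap are my own inventions for illustration, and the sketch assumes the caller passes in the language that frames the quote, not the language of the quoted text.

```python
# Required minimum: straight double marks, then straight single marks.
DEFAULT_QUOTES = ('"', '"', "'", "'")

# Optional per-language tables a user agent may happen to know.
# The values are the ones given in the CSS rules above.
KNOWN_QUOTES = {
    "en": ("\u201C", "\u201D", "\u2018", "\u2019"),
    "fr": ("\u00AB", "\u00BB", "\u2039", "\u203A"),
}

def quote_pair(lang, depth):
    """Return (open, close) marks for a quote at `depth` (0 = outermost).

    `lang` is an RFC 3066 style tag such as "en" or "fr-CA"; only the
    primary subtag is consulted, and unknown languages fall back to
    DEFAULT_QUOTES.
    """
    primary = (lang or "").split("-")[0].lower()
    marks = KNOWN_QUOTES.get(primary, DEFAULT_QUOTES)
    pairs = len(marks) // 2
    index = min(depth, pairs - 1)  # deeper levels reuse the last pair
    return marks[2 * index], marks[2 * index + 1]

def wrap(text, lang, depth=0):
    """Surround `text` with the marks chosen for `lang` and `depth`."""
    open_mark, close_mark = quote_pair(lang, depth)
    return open_mark + text + close_mark

print(wrap("Look at Spot,", "en"))           # known language, outermost quote
print(wrap("Look at Spot,", "en", depth=1))  # same quote embedded in another
print(wrap("Look at Spot,", "tlh"))          # unknown language: the minimum
```

The point of the fallback is that a conforming user agent only has to ship DEFAULT_QUOTES; everything in KNOWN_QUOTES is an optional improvement.
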
And of course, if a user agent supports styling, the author would be able to customize the quote characters into what he feels is most pleasant to the eye, or to supply them himself if he is using a language which he expects user agents may not support.

A FINAL COMMENT

There is one last question that needs to be considered: the handling of other punctuation adjacent to a quote and how it interacts with the quotation marks of the quote. The general rule seems to be to place the additional marks outside the quote, but in American English it is common to place an adjacent comma or period inside the final quote mark instead of outside. Unfortunately I can't think of a very good solution for this. The best I can think of with either of the two methods outlined above would be one of:

Example 4A:

    <q><qc>"<qc>'</qc></qc>Text<qc>."<qc>.'</qc></qc></q>

Example 4B:

    <style>
    q.ip {quotes: "\201C" ".\201D" "\2018" ".\2019"}
    span.ip {display: none}
    </style>
    ...
    <q class="ip">Text</q><span class="ip">.</span>

Since the difference between placing such punctuation inside or outside is purely presentational, I don't mind relying upon styling in this case.

IN SUMMARY

To handle quotes correctly will require either a very complicated and mostly redundant structure for the quotation element that hand coders will absolutely despise, or reverting to what HTML 4 calls for, i.e., for the user agent to add the quote marks.

Ernest Cline
ernestcline@mindspring.com

Received on Friday, 12 December 2003 19:16:11 UTC