- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Fri, 13 Feb 2004 09:07:13 +0200 (EET)
- To: www-html@w3.org
On Thu, 12 Feb 2004, Ernest Cline wrote:

> The only benefit I can see to having <q> provide the quotation marks
> instead of making them content is that it makes providing correctly
> nested quotation marks in transcluded portion of documents easy.

I hadn't thought of that. Yes, it makes sense in principle.

> The problem is tho, support for transclusion is extremely limited at
> present.

Yes and no - there's the SGML way that always was formally part of HTML
but was never supported, and history repeats itself in the X* world, but
there are in fact several ways to perform inclusion using other methods,
such as SSI.

But HTML itself is unsuitable for "transclusion" in a far more serious
way than the problem addressed by the idea of <q>. To begin with, HTML
(including XHTML) is defined so that you cannot simply include a
document verbatim into another. There's really no reason why things
_couldn't_ be that way. A document is an element, which is a tree
structure. There's no logical reason why the root element could not
re-appear in the tree. But allowing this would be a major change.

On the more practical side, assume that we include (sorry, transclude)
just a fragment of a document, an element or sequence of elements that
can validly be put inside a <blockquote>. What does this _mean_, apart
from the fact that virtually all interested parties will understand the
element as meaning 'indent'? For example, does any software that creates
a table of contents from the headings in a document (this is one of the
few kinds of meaningful processing of structural markup that really
exists) actually pay attention to <blockquote>? If <blockquote> contains
any headings, they should surely be excluded from the ToC. Or does any
software that checks the recommended use of headings (no skipping of
levels etc.) process <blockquote> adequately, effectively as a separate
realm?
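To make the table-of-contents question concrete, here is a minimal
sketch of a ToC builder that treats <blockquote> as a separate realm and
skips any headings inside it. It is an illustration only, not a
description of any existing tool; the names TocBuilder and build_toc are
invented for this example, and it relies on Python's standard
html.parser module.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class TocBuilder(HTMLParser):
    """Collect (level, text) for headings that are not inside <blockquote>."""

    def __init__(self):
        super().__init__()
        self.toc = []
        self._quote_depth = 0  # nesting depth of open <blockquote> elements
        self._level = None     # level of the heading being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self._quote_depth += 1
        elif tag in HEADINGS and self._quote_depth == 0:
            # Only headings outside any quotation belong to this document.
            self._level = int(tag[1])
            self._text = []

    def handle_endtag(self, tag):
        if tag == "blockquote" and self._quote_depth > 0:
            self._quote_depth -= 1
        elif tag in HEADINGS and self._level is not None:
            self.toc.append((self._level, "".join(self._text).strip()))
            self._level = None

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)


def build_toc(html):
    """Return the list of (level, heading text) pairs for the document."""
    parser = TocBuilder()
    parser.feed(html)
    parser.close()
    return parser.toc
```

Given a document with a heading inside a <blockquote>, build_toc lists
only the headings that belong to the quoting document itself.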
My point is that nobody takes quotation markup seriously now except a
few enthusiasts just for the sake of principle.

> <q> was just simply ahead of its time

Actually the problem was that it was not taken into HTML _from the
beginning_. There would be no fundamental problem then.

> It also doesn't help that a major browser implements <q> incorrectly.
> which contributes to why authors don't use it.

Yes, and it's too late to change that now. Authors are used to using
quotation marks. Such things don't change easily. Actually it is easier
to change a few browsers than to change people.

And there's nothing fundamentally wrong with quotation marks. If you
write any software that tries to recognize quotations from Web pages, it
would be just a theoretical exercise to play with <q> or <blockquote>,
and the latter would give you wrong results far more often than not.
Recognizing "..." would be much more relevant.

So how does this compute regarding markup? Well, what we would actually
need is markup for indicating that "..." is _not_ a quotation, when it's
actually used for something else! What I mean is that markup could be
used to disambiguate the meaning of quotation marks, rather than replace
them.

> However, even if <q> and transclusion worked as they should work,
> there would still be the problem that the flattened text that results from
> stripping away the markup is not the same as one would want if one
> produced a plain text file, which should be the goal for a Text Markup
> Language, whether or not it is eXtensible or Hyper.

I don't quite agree. Markup can be _essential_ in the sense that the
fundamental meaning of text is thoroughly changed by it. It is debatable
whether <h1> is essential. <blockquote> is, in the general case. If you
remove <ol> and <li> markup, does the resulting string of characters
really correspond to the intended meaning?
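The <ol> question can be tried out directly. The following is a minimal
sketch, assuming a naive "flattening" that simply discards every tag and
keeps the character data; the names TextOnly and strip_markup are
invented for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keep character data only; every tag is silently dropped."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_markup(html):
    """Return the text that remains once all markup is removed."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


steps = "<ol><li>Open the file</li><li>Edit it</li></ol>"
print(strip_markup(steps))  # prints: Open the fileEdit it
```

The item boundaries and the implied numbering are gone from the result,
so the flattened string no longer says what the list said.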
The HTML 2.0 specification required that <em> and <strong> be rendered
as different from each other and from normal text. To me, this reflects
an idea of essential markup. (For some odd reason, the requirement has
been dropped, and some browsers actually fail to comply with it.)

Some markup can just be omitted without affecting the fundamental
meaning of a document. Some markup is essential. The HTML specifications
have never tried to draw the line. It's not simply structural vs.
presentational. Structural markup can be non-essential (and in poor
usage, presentational markup can be essential).

> <quote><mark>"</mark>This is a quotation.<mark>"</mark></quote>

We might just as well have

  <question>How are you<mark>?</mark></question>

and this might make sense for some applications. But if a markup
language contains a large number of elements and attributes that _could_
be used, it can become very confusing.

There's already the WAI recommendation that says that quotations should
be indicated in markup and not using quotation marks. This is a symptom
of building recommendations on wishful thinking (and actually reduces
accessibility if followed).

Actually, I think it might be best to start from scratch. Deprecate
<blockquote> and <q>, and say that quotations should be indicated as
quotations using suitable wordings and punctuation characters. Define
<quote> as an element that can be used both at block level and as inline
markup, which authors _may_ use to indicate quotations for the purpose
of automated analysis and processing and which should not be expected to
affect rendering by default in any way but which user agents _may_ use
as additional information when e.g. choosing how some text is spoken.
That is, a UA could decide that <quote>"..."</quote> is an actual
quotation whereas "..." might be just an indication of "metaphorical"
use of a word, for example. (The whole block vs. inline distinction is a
mess, and should not be carried over to any new markup elements.)
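As a rough illustration of how a user agent might use such markup, the
sketch below finds "..." spans in the text and labels each one as an
actual quotation only when it occurs inside the proposed <quote>
element. Note that <quote> is the hypothetical element suggested above,
not part of any HTML specification, and that the names QuoteClassifier
and classify_quoted_spans are invented for this example. Only straight
double quotes within a single text run are recognized.

```python
import re
from html.parser import HTMLParser

# A pair of straight double quotes with anything but a quote in between.
QUOTED = re.compile(r'"[^"]*"')


class QuoteClassifier(HTMLParser):
    """Label each "..." span by whether it sits inside a <quote> element."""

    def __init__(self):
        super().__init__()
        self.spans = []   # list of (text, is_quotation)
        self._depth = 0   # nesting depth of open <quote> elements

    def handle_starttag(self, tag, attrs):
        if tag == "quote":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "quote" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        for match in QUOTED.finditer(data):
            self.spans.append((match.group(0), self._depth > 0))


def classify_quoted_spans(html):
    """Return (span, is_quotation) pairs in document order."""
    parser = QuoteClassifier()
    parser.feed(html)
    parser.close()
    return parser.spans
```

The quotation marks stay in the content either way; the markup only adds
the information a program needs to tell the two uses apart.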
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
Received on Friday, 13 February 2004 02:07:21 UTC