- From: Scott E. Preece <preece@predator.urbana.mcd.mot.com>
- Date: Wed, 30 Oct 1996 09:07:10 -0600
- To: davidp@earthlink.net
- CC: www-html@w3.org
From: "David Perrell" <davidp@earthlink.net>
|
| Scott E. Preece wrote:
| > Similarly, I believe your example in the second paragraph to be
| > totally broken - non-nesting tags simply aren't allowed, ever, and
| > all the browser can do is try to guess what you really meant.
|
| Broken or no, the sequence is clear and guessing is unwarranted. I was
| pleased to find that IE rendered this -- IMO -- logically.
---

Well, you can say that all you like, but in fact the sequence is *not*
clear, it is inherently ambiguous and there is no "right" way to render
it. You're saying "Oh, well, the author must have meant to have those
two tags intertwined and not nesting, so let's render it that way."
Not only is this a guess (I can't say why you seem to think it isn't),
but any parser with any notion of SGML (and I've never before heard
anyone accuse Netscape of taking SGML parsing too strictly) is never
going to make that guess, because hierarchical nesting of markup is way
too essential a part of SGML to ignore. You cannot get to your
"logical" interpretation without totally ignoring SGML parsing.

In another note you write...

| Carl Morris wrote:
| > ERROR... :) I would hope that instead of guessing, logic would be
| > applied, and the simplest way out would be taken, since the <TT> is
| > inside the <I>, and the <I> has now been closed, logic says "there is
| > no way to leave the <TT> open... but this would not be the first
| > thing that followed logic...
|
| Logic says there is no way to leave TT open? Is it written in a DTD
| that "TT must look precisely like output from a teletypewriter, which
| has no italic, bold or underline capabilities"? Which monospaced font
| to use with TT is left to the browser; if that font is Courier and has
| an italic form, is it logical to make unneeded dictates about whether
| or not an author is *allowed* to specify it?
---

Again, when he says "there is no way to leave the <TT> open" he means
exactly that.
SGML does not allow for non-hierarchical markup. It is *impossible* to
have an element start inside another element and end outside it. SGML
simply does not allow you to represent that concept as elements (you
*could* represent it using a DTD that included elements that signalled
the beginning and end of regions, but the HTML DTD doesn't do that - it
wraps regions as elements). Yes, this does make it hard to represent
certain real-world situations, including the one you used in your
example, but that doesn't change the fact that SGML simply doesn't work
that way, and HTML is SGML.

Most of your paragraph, however, seems off the point. It *is* possible
to nest an I element inside a TT element or vice versa. Morris's point
is just that the inner element must, logically, end at the end of the
containing element. Section 5.7 of RFC 1866 specifically leaves
ambiguous the rendering when you nest one "phrase" element inside
another: the browser may apply both fonts or only the inner one (that
is, if you have <I>italic <TT>mono</TT></I>, the "italic" has to be in
italic and the "mono" has to be monospaced; the browser *may* use an
italic monospaced font for "mono", but is not required to do so).

And in another note:

| I'm attempting to apply logic to the treatment of cases where
| the rules are broken and yet there is logic in the construct that
| breaks them. There is no logic in <I>italic text</B>, but there is in
|
| > <TT>hello <I>good-bye</TT> maybe?</I>
---

My point is that while a human may be able to guess what the author
meant by that markup (because it is reasonably easy to imagine a markup
language in which that would be a legal expression), there is *no* SGML
logic in it. It is not a valid logical expression in SGML. A browser
presented with that markup must either guess or throw up its hands and
display an error message. There are no other choices.
---

| if the text can be both monospaced and italic.
| Just because <TT>hello <I>good-bye</TT> maybe?</I> is invalid HTML,
| logic does not dictate that </TT> must always indicate the end of an
| italicized section. The 3.2 ref spec calls for start and end tags for
| all text and phrase markup.
---

No, </TT> does not always end an italicized phrase, only when the start
of the I markup was inside the TT phrase. That is, if you have the
markup:

<I>This is italicized <TT>and this is monospaced</TT> and this is still italicized</I>

the </TT> does not close the I element. However, the "and this is
monospaced" may or may not be italicized as well as monospaced, at the
browser's discretion.
---

| > ... I want the browser I use
| > to help me find errors...
|
| Terminating one tag with a different end tag is bad guessing that is
| unlikely to help find errors. It was IE's logical (IMO) refusal to
| terminate <TT> with </I> that started this thread.
---

I want my authoring tools to help me find errors. I want my reading
tools to faithfully present the markup they are given, but I don't want
to have to read error messages when the author has misused HTML.

scott

--
scott preece
motorola/mcg urbana design center
1101 e. university, urbana, il 61801
phone: 217-384-8589
fax: 217-384-8550
internet mail: preece@urbana.mcd.mot.com
Received on Wednesday, 30 October 1996 10:08:29 UTC
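
As a rough illustration of the nesting rule argued in the message above,
here is a minimal Python sketch of a stack-based check. It is not how any
particular browser or SGML parser works: the function name
first_nesting_error and the simplified tag pattern are assumptions made
for this example, and the sketch ignores DTD features such as omitted
tags and empty elements. It only shows why an end tag that does not match
the most recently opened element leaves nothing to do but guess or report
an error.

```python
import re

# Hypothetical, simplified tag pattern: <NAME ...> or </NAME>.
# Real SGML/HTML tokenizing is considerably more involved.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")


def first_nesting_error(markup):
    """Return None if the tags nest hierarchically, else describe the first violation."""
    open_elements = []  # stack of element names that are currently open
    for match in TAG.finditer(markup):
        is_end_tag = match.group(1) == "/"
        name = match.group(2).upper()
        if not is_end_tag:
            open_elements.append(name)
        elif not open_elements:
            return f"</{name}> has no open element to close"
        elif open_elements[-1] != name:
            # The end tag does not match the innermost open element,
            # so the markup is not hierarchical.
            return f"</{name}> found while <{open_elements[-1]}> is still open"
        else:
            open_elements.pop()
    if open_elements:
        return f"<{open_elements[-1]}> is never closed"
    return None


if __name__ == "__main__":
    for sample in (
        "<I>italic <TT>mono</TT></I>",
        "<TT>hello <I>good-bye</TT> maybe?</I>",
        "<I>italic text</B>",
    ):
        print(sample, "->", first_nesting_error(sample))
```

Under these assumptions the properly nested example passes, while both
the overlapping TT/I example and the <I>...</B> example fail at the first
end tag, for the same reason: that end tag does not match the innermost
open element.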