Re: <IMAGE>? <TT> == <I>? toHell(NS)

Scott E. Preece (preece@predator.urbana.mcd.mot.com)
Thu, 31 Oct 1996 10:06:49 -0600


Date: Thu, 31 Oct 1996 10:06:49 -0600
Message-Id: <199610311606.KAA07740@predator.urbana.mcd.mot.com>
From: "Scott E. Preece" <preece@predator.urbana.mcd.mot.com>
To: davidp@earthlink.net
CC: www-html@w3.org
In-reply-to: "David Perrell"'s message of Wed, 30 Oct 1996 11:55:29 -0800
Subject: Re: <IMAGE>? <TT> == <I>? toHell(NS)

  From: "David Perrell" <davidp@earthlink.net>

| > it.  You're saying "Oh, well, the author must have meant to have those
| > two tags intertwined and not nesting, so let's render it that way."  Not
| > only is this a guess (I can't say why you seem to think it isn't)
| 
| I consider it an assumption that is not in conflict with the 3.2 ref
| spec. You say treatment of bad HTML is undefined, then defend NS's
| inconsistent behavior in the face of it as reasonable guessing. I think
| it is not that difficult to formulate a consistent set of assumptions
| to apply in the face of bad markup (and noted that IE fails to). I
| don't think it needs to be law. I think it should be in the form of
| recommendations in a reference spec, and considered good manners to
| abide by. Outside of thinking NSN's treatment of <TT>text</I> is not at
| all desirable and in conflict with the closing tag rule, I don't much
| care what the assumptions are.
---

(a) As I said, NSN didn't handle this case the way *I* would have chosen
(which is also not the way you would); the only sense in which I am
"defending" NSN applies equally to MSIE - the standard says nothing
about what to do with illegal markup, so the browser is free to do
whatever it thinks best.

(b) The "closing tag rule" is a constraint on *authors*, not on the
browser.  All it does for the browser is allow the browser to say, with
less analysis than would otherwise be required, that the markup is
broken.

(c) If there were such recommendations in the standard, then it would be
a defect not to follow them.  There aren't, so the browser vendor is
unconstrained.  If such text were proposed, we could have an interesting
discussion about what the best behavior would be...

---
| 
| > SGML does not allow for non-hierarchical markup.  It is
| > *impossible* to have an element start inside another element and end
| > outside it.
| 
| No more impossible than closing a tag that isn't open. The construct
| that started this sub-thread was <TT>text</I>. Opening and closing tags
| are required. There is no opening tag for italic. Therefore <TT> is not
| contained in italic markup and <TT> should not be terminated by </I>.
---

The impossibility is in the parsing, not in the authoring.  The tags
just guide the parsing; they aren't part of the information model.

SGML parses to a tree of elements.  It is impossible for that tree to
represent a situation in which an element starts inside another element
and finishes outside it.  The fact that the DTD requires end-tags
simply removes a possible author convenience; it does not change the
parsing model.  At the point in your example where the browser sees the
</I> it can ignore it or try to recover from it or display an error
message or whatever else seems appropriate to the circumstances.  The
choice Netscape made in this instance is going to be right sometimes and
wrong sometimes; I lack statistical evidence to guess whether it's right
more often or wrong more often.
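
To make the tree point concrete, here is a rough sketch in Python of a
stack-based parser for simple tag soup.  It is not how any real browser or
SGML parser works; the function name parse, the (name, children) node shape
and the on_stray_end policy knob are all invented for illustration.  It only
shows that an end-tag which does not match the open element has no place in
the tree, so the parser has to pick a policy: here either ignore it or
report an error (a smarter "recover" policy is not sketched).

```python
import re

# Matches simple start-tags and end-tags without attributes, e.g. <TT> or </I>.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)>")

def parse(markup, on_stray_end="ignore"):
    """Build a tree of (name, children) nodes from simple markup.

    on_stray_end is a hypothetical policy knob: "ignore" drops an end-tag
    that does not match the currently open element, "error" raises
    ValueError instead.
    """
    root = ("#root", [])
    stack = [root]              # path from the root to the open element
    pos = 0
    for m in TAG.finditer(markup):
        text = markup[pos:m.start()]
        if text:
            stack[-1][1].append(text)   # text belongs to the open element
        pos = m.end()
        closing, name = m.group(1), m.group(2).upper()
        if not closing:
            node = (name, [])
            stack[-1][1].append(node)   # a new element is always a child
            stack.append(node)
        elif stack[-1][0] == name:
            stack.pop()                 # end-tag closes the open element
        elif on_stray_end == "error":
            raise ValueError(f"end-tag </{name}> does not match open element")
        # otherwise the stray end-tag is silently ignored
    tail = markup[pos:]
    if tail:
        stack[-1][1].append(tail)
    return root                         # elements still open are closed implicitly
```

With the default policy, parse("<TT>text</I>") leaves the text inside the TT
element and drops the </I>; with on_stray_end="error" the same input raises
ValueError.  Either way the result is a tree, and there is no node shape in
it that could say "started inside TT, ended outside".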

scott

--
scott preece
motorola/mcg urbana design center	1101 e. university, urbana, il   61801
phone:	217-384-8589			  fax:	217-384-8550
internet mail:	preece@urbana.mcd.mot.com