- From: Sam Ruby <rubys@us.ibm.com>
- Date: Fri, 03 Aug 2007 19:32:35 -0400
- To: Dan Connolly <connolly@w3.org>
- CC: public-html@w3.org
Dan Connolly wrote:
> On Fri, 2007-08-03 at 15:42 +0300, Henri Sivonen wrote:
>> On Aug 2, 2007, at 18:16, Sam Ruby wrote:
> [...]
>>> * The notion of “enclosing element” is problematic in the face
>>> of adoption agency algorithms and the like. The prudent thing to
>>> do is to define any case where reparenting would change the meaning
>>> of any element to be a (recoverable) error. This would affect very
>>> few users or documents. It would be a bitch to code in a
>>> conformance checker, but that’s not the spec’s writer’s concern. :-)
>> Reparenting is already an error.
>
> Care to elaborate? If anybody has a moment to walk me/us thru the
> relevant parts of the spec, or just point to it, I'd appreciate it.
I'll take that one. I'm not sure how much of the spec you are familiar
with, so if I cover something that you are already aware of, just skim
over it. It might be helpful to others.
Much has been said about how requiring quotes around attribute values
and slashes in empty tags caused XHTML to "fail", but IMHO, that misses
one of the more interesting stories. Consider the following fragment:
<table><tr><td>x</td></tr><o:p>y</o:p></table>
Now load it in your favorite browser. It may surprise you, but you will
see a 'y' followed by an 'x' on a separate line.
Note that this fragment is well formed in the XML sense -- not namespace
well formed, just simple well-formed, but we'll come back to that in a
minute. So in one sense, it is unambiguous, but in another sense it is
horribly wrong in that not only is <o:p> not defined, it isn't something
that you would expect to find in a table.
What's going on here? Well, the <o:p> element got a foster parent, as
defined here:
http://www.whatwg.org/specs/web-apps/current-work/#foster
(you might want to scroll back a few lines from that anchor to see the
context, namely "anything else" within a <table>)
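For anyone who would rather see the effect than read the spec, here is a rough sketch in Python of what foster parenting does to the fragment above. It is a simplification, not the spec's algorithm: the real thing runs inside the tree construction stage of the parser and also deals with implied elements such as <tbody>. The <body> wrapper, the urn:example:o namespace URI, and the foster_parent helper are assumptions of mine, added only so that a stock XML parser will accept the input.

```python
import xml.etree.ElementTree as ET

# The <body> wrapper and the xmlns:o binding are assumptions added so that a
# namespace-aware XML parser accepts the fragment from the message.
SOURCE = (
    '<body xmlns:o="urn:example:o">'
    "<table><tr><td>x</td></tr><o:p>y</o:p></table>"
    "</body>"
)

# Elements allowed directly inside <table>; this sketch fosters anything else.
TABLE_CHILDREN = {"caption", "colgroup", "thead", "tbody", "tfoot", "tr"}


def foster_parent(parent):
    """Move misplaced children of each <table> in `parent` to just before it."""
    for table in [child for child in parent if child.tag == "table"]:
        for child in list(table):
            if child.tag not in TABLE_CHILDREN:
                table.remove(child)
                # Insert immediately before the table, keeping source order.
                parent.insert(list(parent).index(table), child)


body = ET.fromstring(SOURCE)
before = "".join(body.itertext())  # text in document order as written
foster_parent(body)
after = "".join(body.itertext())   # text in document order after fostering
print(before, "->", after)
```

The text order flips from the order in the source to the order the browser displays, because the <o:p> element now sits in front of the table instead of inside it.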
Why would any browser do such an idiotic thing? Well, the truth is that
they have all been reverse engineering each other for so long that it may
be hard to tell. These days most people reverse engineer IE, and a lot of
behaviors in IE were in turn reverse engineered from NN. In any case, they
now all do it. And any new entrant into this field had better do it
too, lest they render some portion of the web "wrong".
And furthermore, they had better not just render it this way, they had
better represent it that way in the DOM, as that's what scripts will be
expecting.
And with that, HTML5 was born. Instead of everybody reverse engineering
each other and IE over silly things like what does " " mean, the
bright idea was to write down one definition and have everybody migrate
towards it. This isn't as easy as it looks, which just makes it all the
more valuable to have it done. As long as the behavior isn't totally
insane, it is included in the spec. (Just for fun, search for "Don't
ask" in the spec.)
Now, what does this have to do with the question at hand? Imagine the
fragment defined above included in a larger document that declares
xmlns:o at both the root level and on the table element. Which one applies?
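To make the ambiguity concrete, here is a toy model in Python. It is only a sketch: the Node class, the resolve_prefix helper, and the two urn:example URIs are all made up for illustration, and the prefix lookup is the usual "nearest enclosing declaration" rule rather than anything taken from the HTML5 draft.

```python
class Node:
    """Minimal element node: a name, attributes, a parent, and children."""

    def __init__(self, name, attrs=None, parent=None):
        self.name = name
        self.attrs = dict(attrs or {})
        self.parent = None
        self.children = []
        if parent is not None:
            parent.append(self)

    def append(self, child):
        child.parent = self
        self.children.append(child)

    def insert_before(self, child, ref):
        """Detach `child` from its current parent and insert it before `ref`."""
        if child.parent is not None:
            child.parent.children.remove(child)
        child.parent = self
        self.children.insert(self.children.index(ref), child)


def resolve_prefix(node, prefix):
    """Return the URI bound to `prefix` by the nearest enclosing declaration."""
    key = "xmlns:" + prefix
    while node is not None:
        if key in node.attrs:
            return node.attrs[key]
        node = node.parent
    return None


# Hypothetical URIs: one declared at the root, another on the table.
html = Node("html", {"xmlns:o": "urn:example:root"})
body = Node("body", parent=html)
table = Node("table", {"xmlns:o": "urn:example:table"}, parent=body)
o_p = Node("o:p", parent=table)

as_written = resolve_prefix(o_p, "o")    # enclosing element is <table>

# Foster parenting moves <o:p> out of the table, to just before it.
table.parent.insert_before(o_p, table)
after_foster = resolve_prefix(o_p, "o")  # <table> no longer encloses it

print(as_written, after_foster)
```

Resolved against the markup as written, the prefix picks up the declaration on the table; resolved against the tree after reparenting, it picks up the one at the root. Same bytes, two defensible answers.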
My answer: the spec should pick one, and I don't particularly care which
one, but everybody should do it the same way. And a parse error should
be generated.
Henri's point is that a parse error is already generated in this
situation, and in every similar one.
That's fine; I would suggest that this case deserves two parse errors,
then. Parse errors are cheap. :-)
- Sam Ruby
Received on Friday, 3 August 2007 23:32:42 UTC