[Bug 6777] In HTML documents, no-namespace expression must match http://www.w3.org/1999/xhtml nodes

http://www.w3.org/Bugs/Public/show_bug.cgi?id=6777

--- Comment #11 from Henri Sivonen <hsivonen@iki.fi>  2009-04-06 15:04:04 ---
(In reply to comment #8)
> It is unreasonable to expect the XPath specification to request special
> treatment for one class of documents. 

text/html is a notable class of documents.

> You are essentially proposing to fork the XPath specification so 
> that different rules apply depending on the input document and 
> the processor.

I'm not proposing different rules by processor. I'm proposing a different rule
to apply depending on the HTMLness flag of the owner document. Non-browser
XPath processors don't need such a flag and would, therefore, be unaffected.
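
To make the proposed rule concrete, here is a minimal sketch in Python of how
an element name test might be matched under it. The Element class, the
owner_is_html field (standing in for the HTMLness flag) and the
name_test_matches function are hypothetical names chosen for illustration, not
part of any spec or implementation. Case handling and attribute name tests are
left out.

```python
from dataclasses import dataclass
from typing import Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"


@dataclass(frozen=True)
class Element:
    """Hypothetical, stripped-down element node."""
    local_name: str
    namespace: Optional[str]  # None means "no namespace"
    owner_is_html: bool       # assumed stand-in for the owner document's HTMLness flag


def name_test_matches(test_ns: Optional[str], test_local: str, el: Element) -> bool:
    """Return True if the element name test (test_ns, test_local) selects el."""
    if test_local != el.local_name:
        return False
    # Ordinary XPath 1.0 behaviour: the expanded names must be equal.
    if test_ns == el.namespace:
        return True
    # Proposed extra rule: in an HTML document, a no-namespace name test
    # also matches elements in the XHTML namespace.
    return test_ns is None and el.owner_is_html and el.namespace == XHTML_NS


html_p = Element("p", XHTML_NS, owner_is_html=True)    # parsed from text/html
xhtml_p = Element("p", XHTML_NS, owner_is_html=False)  # parsed from application/xhtml+xml
```

With these definitions, name_test_matches(None, "p", html_p) is true and
name_test_matches(None, "p", xhtml_p) is false, while a test that names the
XHTML namespace matches both nodes. A processor that never sets the flag only
ever takes the ordinary path.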

> The same result should entail whether an XPath
> expression is evaluated by a DOM inside a browser or by an external processor
> outside the browser not using the DOM at all. You are proposing that these two
> cases would produce different results because one would understand the case of
> an owner document and one would not.

If a processor accepts XPath expressions from JavaScript programs that are out
there on the Web, a no-namespace expression needs to match nodes parsed from
text/html. If a non-DOM processor doesn't accept XPath expressions from
existing JavaScript programs, it is unaffected.

This wouldn't be the first time that APIs are subtly different in the browser
and in server-side Java, BTW. For example, the DOM getAttribute method must return
null in browsers when the attribute is missing. Returning the empty string like
the spec says would Break the Web.

> This is an ugly idea that would significantly increase the complexity and
> learning curve for XPath, as well as break much existing software that is
> designed to  process HTML documents using the current well-defined XPath data
> model.

The point of the change is to reduce the differences between DOM/Infoset
representations of equivalent text/html and application/xhtml+xml documents.
This, for example, removes the need to make Selectors behave as if HTML
elements in HTML documents were in the http://www.w3.org/1999/xhtml namespace,
because with this change they simply are in that namespace without "as if".
This should be considered an architectural win.

In fact, while implementing this change, this XPath issue was the only case
where elegance was reduced instead of being increased. It's unfortunate that
this is, logically, the one case that people working primarily with XML
notice.

> The correct solution is simple: require namespace well-formedness for HTML 5
> documents. 

Into which namespace would you assign HTML element nodes?

> Until the spec takes that simple step, you're going to find yourself
> asking for one special case after another. This is not the first and it will
> not be the last. 

What step, precisely, would you like HTML5 to take?

> Almost everything the W3C has done for the last 12 years has been predicated on
> the notion of namespace well-formedness. You may be right that this was the
> wrong decision, and that we need to throw out 12 years of deployed tools and
> technologies and start over. However, don't expect that we can retrofit your
> new model onto the existing stack.

The way I see it, HTML is what is being retrofitted to the namespace-enabled
stack here.

> If HTML 5 won't accept namespace
> well-formedness, then it's going to have to build its own replacements for
> XPath, XQuery, etc. The existing ones just won't work. 

The whole point of making text/html and application/xhtml+xml DOMs / Infosets
consistent is to reduce the number of special cases and to enable the use of
the same above-DOM/Infoset technologies for both. 

Unfortunately, in the case of the DOM Level 3 XPath API, existing content has been
deployed prior to this harmonization. (Obviously, we'd have no issue here if
HTML and XHTML had been namespace-wise consistent ever since DOM Level 2.)

(In reply to comment #9)
> >breaking existing content that uses the API would not be good
> 
> OK, so we're talking about the DOM level 3 API to XPath 1.0. You're changing
> the DOM representation of the content so that the elements will be in the XHTML
> namespace instead of the null namespace, and you want to change the semantics
> of XPath 1.0 so that it behaves as if you had not made this fundamental change.
> Is that right?

Looking backward on existing content, that is right. I'm now pursuing this in
the context of the spec defining document.evaluate() instead of pursuing this
in the context of the spec defining XPath itself.

Looking forward, the same XPath name expressions that use the
http://www.w3.org/1999/xhtml namespace will work on trees originating from
both text/html and application/xhtml+xml.



Received on Monday, 6 April 2009 15:04:18 UTC