- From: Ivan Herman via GitHub <sysbot+gh@w3.org>
- Date: Thu, 25 Feb 2016 15:00:11 +0000
- To: public-annotation@w3.org
While I am in favor of having an XPath selector, there are some issues we should be aware of if the WG accepts this proposal. These are all sub-issues that must be reflected, somehow, in the final document.

### XPath and DOM

Formally, XPath is defined through a separate [XPath datamodel](https://www.w3.org/TR/xpath-datamodel-31/) document. That document, essentially, says that it relies on the (XML) infoset specification. The infoset describes XML documents, whereas HTML5 is not XML. I have asked our staff colleague (Carine), and this is what she said:

> The Web Annotation WG can use the XPath/XQuery data model if they need to, as long as they carefully study compatibility with the constructs to which they want to apply it. We used to have such a document for DOM Level 3, https://www.w3.org/TR/DOM-Level-3-XPath/xpath.html
>
> That could be a good starting point to evaluate whether DOM4 has departed too much from the original tree model. (I doubt it has)

I think the only thing we can/should do is to add a note in the spec, referring to the DOM 3 document, so that authors/implementers are aware of how XPath is used and defined. (Note that there is no reference to XPath in the [DOM4 spec](http://www.w3.org/TR/2015/REC-dom-20151119/).)

### XPath and HTML5

In any case, what this means is that XPath works on top of the DOM and *not* on top of the original HTML source. This needs to be emphasized in the spec, because the HTML5 parser may slightly rearrange the original HTML code, which may affect the validity of an XPath expression.

A possible reference is: https://www.w3.org/TR/html5/syntax.html which describes the parser (and is therefore hell to read...). However, there are some important internal references to that section.

One is: https://www.w3.org/TR/html5/syntax.html#optional-tags which lists the tags that may be missing in the HTML but will be added in the DOM (e.g., the `tbody` element if missing).
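
To make the `tbody` case concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: it does not run an HTML5 parser. The "DOM-shaped" tree is written by hand to approximate what the parser would build, and the names `SOURCE_MARKUP`, `DOM_SHAPED_MARKUP`, and `count_matches` are made up for this example. Note also that `xml.etree.ElementTree` supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# The table as an author might write it in the HTML source: no <tbody> tag.
SOURCE_MARKUP = "<html><body><table><tr><td>cell</td></tr></table></body></html>"

# Hand-written approximation (an assumption, not parser output) of the tree
# an HTML5 parser would build: <head> and <tbody> have been added.
DOM_SHAPED_MARKUP = (
    "<html><head/><body><table><tbody><tr><td>cell</td></tr></tbody>"
    "</table></body></html>"
)

# A path written by looking at the source, and one written against the DOM.
SOURCE_PATH = "./body/table/tr/td"
DOM_PATH = "./body/table/tbody/tr/td"


def count_matches(markup: str, path: str) -> int:
    """Parse the markup as XML and count the elements selected by the path."""
    root = ET.fromstring(markup)
    return len(root.findall(path))


if __name__ == "__main__":
    for label, markup in (("source", SOURCE_MARKUP), ("DOM", DOM_SHAPED_MARKUP)):
        for path in (SOURCE_PATH, DOM_PATH):
            print(f"{label:6} {path:28} -> {count_matches(markup, path)} match(es)")
```

The path derived from the source text selects nothing in the DOM-shaped tree, and the DOM path selects nothing in the source-shaped tree, which is why a selector has to be defined against the DOM.
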
Wherever that section says a start tag can be omitted, the parser will add the element to the DOM, e.g., `html`, `head`, `body`, `colgroup`, or `tbody`.

Another one is: https://www.w3.org/TR/html5/syntax.html#an-introduction-to-error-handling-and-strange-cases-in-the-parser with all kinds of nasty situations that the parser has to take care of (and which lead to DOM modifications).

Again, what we can/should do is to add a note in the document drawing attention to this type of problem.

### Normative reference issue

Another problem is the status of the XPath documents (I mean the latest, 3.1 versions). At the moment, all documents are in CR, meaning that they would be inappropriate as normative references from a Rec. Some in the reference chain have been in CR for more than a year… However, here is the info I got from Carine:

> … it's expected to go to PR along with the other ones in the near future […] Working closely with the developer community, we expect to show evidence of implementations by approximately 1 March 2016. […] It should be in PR before autumn 2016.

If that happens, then we may be fine. But we will have to keep an eye on this to see if there are delays...

--
GitHub Notification of comment by iherman
Please view or discuss this issue at https://github.com/w3c/web-annotation/issues/95#issuecomment-188822574 using your GitHub account
Received on Thursday, 25 February 2016 15:00:13 UTC