- From: Jonas Sicking <jonas@sicking.cc>
- Date: Tue, 22 Sep 2009 11:14:32 -0700
- To: Shane McCarron <shane@aptest.com>
- Cc: Henri Sivonen <hsivonen@iki.fi>, Mark Birbeck <mark.birbeck@webbackplane.com>, HTMLWG WG <public-html@w3.org>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
On Tue, Sep 22, 2009 at 9:48 AM, Shane McCarron <shane@aptest.com> wrote:
>
> Henri Sivonen wrote:
>>
>> How would you characterize the ongoing denial that the syntax
>> xmlns:p="http://example.com/" is problematic?
>> http://lists.w3.org/Archives/Public/public-html/2009Sep/0843.html
>> http://lists.w3.org/Archives/Public/public-html/2009Sep/0790.html
>>
>> How can the problem be meaningfully resolved when you aren't even
>> admitting there's a problem to discuss?
>
> Because, Henri, we don't grok the problem. I am slowly beginning to
> understand that this might be due to our talking past one another. The W3C
> has a Recommendation that defines the Syntax of RDFa *input* and the
> extraction of RDF triples from that *input*. It defines this as an
> extension to XHTML. XHTML Modularization provides the structure for a host
> language. The Recommendation is carefully vague about how that input is
> parsed because that is properly the job of the host language.
>
> In the RDFa in HTML document, Manu has deferred the syntax and extraction to
> the existing Recommendation, and has deferred the parsing of the input to
> the host language specification (HTML5).
>
> Jonas, Maciej, and you have pointed out that (my translation here) since it
> is possible for the *input* to be altered on its way to the code that would
> perform the extraction, it is important we define the rules for that
> extraction more tightly. In particular, it is possible that the syntax of
> an 'xmlns:' declaration attribute may not be readily available. It is also
> possible that, depending on the form of the *input* document, the
> declaration attribute may manifest in different ways on its way through the
> toolchain (e.g., showing up as a literal 'xmlns:foo' in HTML mode, and as
> 'foo' in the XMLNS namespace in XML mode). However, I don't think *anyone*
> has said that the declaration will not be present in some form if it was
> present in the original *input*.
> And that's how the processing rules are written.
>
> Section 5.5 defines the way in which prefix mappings are defined and
> remembered by an implementation, not how they are pulled from the data
> stream by that implementation. To the RDFa Task Force, these are
> implementation details. Depending upon your implementation strategy and
> environment, you will need to find the things the RDFa extraction process
> cares about, and act upon them to generate the triples. We really, really,
> really don't care how you do this. What we care about is that each engine
> emits the same triples in the end. That's why there is a test suite, and
> it's why there were lots of independent implementations with completely
> different strategies long before the specification was complete.
>
> Regardless, I agree there is room to tighten the language to ensure that
> implementors have the proper guidance, and that edge conditions, even
> pathological ones, have clear, consistent rules. I have proposed that we
> augment the text in RDFa Syntax section 5.5 step 2 to directly address this
> problem, and am updating my proposed errata text now. I hope that, when
> that is ready, you will continue to help by letting us know if it satisfies
> your objections.

I would say there are two separate things that are missing:

The most substantial one is how to do prefix mappings in a DOM or an HTML
document. Prefix mapping is currently defined using the Namespaces in XML
recommendation. However, this recommendation only defines how prefix
mappings are done in a serialized XML document. I hope we can all agree
that neither DOMs (an in-memory data structure) nor HTML documents are XML
documents.

For example, if I have a DOM and I want to map the prefix "foo", which of
the following algorithms should I use:

1. Call Node.lookupNamespaceURI as defined by DOM Level 3, using "foo" as
   the prefix argument.
2. Walk up the parent chain looking for an element with an attribute with
   localName "foo" and namespace "http://www.w3.org/2000/xmlns/", and then
   use the value of that attribute.
3. Walk up the parent chain looking for an element with an attribute with
   name "xmlns:foo", and then use the value of that attribute.
4. Walk up the parent chain looking for either the attribute in 2 or 3,
   and if both are specified use some prioritization order.
5. Walk up the parent chain looking for either the attribute in 2 if the
   document was parsed as XHTML, or the attribute in 3 if the document was
   parsed as HTML.
6. Do something else?

Any of 1 to 5 (as well as possibly 6) seems equally valid to me, and as far
as I can tell there really is no specified answer.

Likewise, how do I find out what the 'cc' prefix is mapped to for the <a>
element in the following serialized HTML document?

<!DOCTYPE html>
<html xmlns:cc="http://example.org/myNamespace#">
  <head><title>HTML+RDFa example</title></head>
  <body>
    <table xmlns:cc="http://creativecommons.org/ns#">
      <a rel="cc:license" href="http://creativecommons.org/licenses/by-nc-nd/3.0/">
        Creative Commons License
      </a>
      <tr><td>Example table</td></tr>
    </table>
  </body>
</html>

As far as I can see, the Namespaces in XML recommendation can't help me in
either situation. For the DOM, it doesn't deal with in-memory data models
at all; it only deals with serialized XML documents. For the HTML
document, if I try to apply XML processing as prescribed by Namespaces in
XML, I conclude that 'cc' maps to "http://creativecommons.org/ns#", when in
reality I suspect it should map to "http://example.org/myNamespace#".

The second, IMHO lesser, problem is that no processing is defined anywhere
for non-XML documents. Even something as basic as reading a rel attribute
out of a document is only defined for XML documents. All normative
requirements refer to XML processing and thus apply no more to HTML
documents than to GIF images.
I consider this less of a problem because I think it's fairly obvious that
data is read out of a DOM using the getAttribute function, and out of an
HTML document by first parsing it to a DOM and then calling getAttribute.
But it really should be formally defined somewhere.

/ Jonas
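
To make the difference between the candidate algorithms concrete, here is a
minimal sketch of options 2, 3, and 5 from the numbered list in the message,
using Python's xml.dom.minidom as a stand-in DOM. The function names and the
small hand-built tree are hypothetical and chosen only for illustration;
nothing here is defined by RDFa, DOM Level 3, or HTML5. One attribute is
created the way a namespace-aware XML parser would expose it, the other the
way an HTML parser presumably would (a plain attribute with no namespace).

```python
from xml.dom import minidom

XMLNS_NS = "http://www.w3.org/2000/xmlns/"


def ancestors_and_self(node):
    """Yield the element itself, then each ancestor element in turn."""
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        yield node
        node = node.parentNode


def lookup_namespaced(element, prefix):
    """Option 2: attribute with localName == prefix in the XMLNS namespace."""
    for el in ancestors_and_self(element):
        if el.hasAttributeNS(XMLNS_NS, prefix):
            return el.getAttributeNS(XMLNS_NS, prefix)
    return None


def lookup_literal(element, prefix):
    """Option 3: attribute whose name is literally 'xmlns:<prefix>'."""
    name = "xmlns:" + prefix
    for el in ancestors_and_self(element):
        if el.hasAttribute(name):
            return el.getAttribute(name)
    return None


def lookup_by_parse_mode(element, prefix, parsed_as_xml):
    """Option 5: pick option 2 or option 3 based on how the input was parsed."""
    if parsed_as_xml:
        return lookup_namespaced(element, prefix)
    return lookup_literal(element, prefix)


# Illustrative tree: <html xmlns:cc=...><a xmlns:p=...></a></html>
doc = minidom.Document()
root = doc.createElement("html")
doc.appendChild(root)
# Namespaced declaration, as an XML parser would expose it.
root.setAttributeNS(XMLNS_NS, "xmlns:cc", "http://example.org/myNamespace#")
link = doc.createElement("a")
# Plain attribute with no namespace, standing in for HTML-parser output.
link.setAttribute("xmlns:p", "http://example.com/")
root.appendChild(link)

print(lookup_namespaced(link, "cc"))  # found on the <html> ancestor
print(lookup_namespaced(link, "p"))   # not found: attribute has no namespace
print(lookup_literal(link, "p"))      # found by its literal name
```

The same "declaration" is visible to one lookup and invisible to the other,
so two extractors that picked different options could disagree on the
resulting triples. Note also that a lookup by qualified name may match a
namespaced declaration as well, so option 3 is not strictly the complement
of option 2.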
Received on Tuesday, 22 September 2009 18:15:34 UTC