- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Fri, 27 Apr 2007 02:43:34 -0400
- To: <public-grddl-comments@w3.org>
First of all, thanks for doing this work! I am glad to see it progressing. Here are some comments/questions based on some review (though incomplete) of the GRDDL spec:
http://www.w3.org/2004/01/rdxh/spec

1. As a document consumer, I do not really care *how* an XML document is transformed into RDF, I just care that my GRDDL-aware agent can execute an appropriate transformation function and that function produces the right triples. Suppose a GRDDL transformation author wishes to provide transformation functions both in XSLT and in Javascript, as equivalent, alternate means of transforming XML to RDF. Section 6 says:
http://www.w3.org/2004/01/rdxh/spec#txforms
[[
Developers of transformations should make available representations in widely-supported formats . . . .
]]
Is the intent here that content negotiation should be used to permit a GRDDL-aware agent to retrieve the transformation function in its desired language (either XSLT or Javascript)? If so, this sounds good.

But now I am wondering how the GRDDL-aware agent can also specify its desired GRDDL result format (e.g., RDF/XML, N3, etc.). Since a specific transformation function would only produce one result format, logically it would make sense to specify the desired result format *and* the desired transformation function language using content negotiation. So for example, if my GRDDL-aware agent knows how to execute either XSLT or XSLT2, and wants the result in N3 format, it should be able to specify that it wants to receive either:

  XSLT + N3
  XSLT2 + N3

How would the GRDDL transformation developer support this? How would the GRDDL-aware agent specify its preferences?

2. Why are GRDDL transformations limited to root elements? Could separate GRDDL transformations be specified for subtrees of an XML document? Suppose I have two XML documents, Cats.xml and Dogs.xml, each having its own GRDDL transformation, and I later combine them into a larger document, Pets.xml, as subtrees.
How would I specify the GRDDL transformation for Pets.xml in terms of the GRDDL transformations of Cats.xml and Dogs.xml?

3. Are GRDDL transformations deterministic or not? The spec seems to be saying that two different GRDDL-aware agents, both conforming to the spec, could yield different RDF triples for the same XML document. Section 6:
http://www.w3.org/2004/01/rdxh/spec#txforms
[[
This specification is purposely silent on the question of which XML processors are employed by or for GRDDL-aware agents. Whether or not processing of XInclude, XML Validity, XML Schema Validity, XML Signatures or XML Decryption take place is implementation-defined. There is no universal expectation that an XSLT processor will call on such processing before executing a GRDDL transformation. Therefore, it is suggested that GRDDL transformations be written so that they perform all expected pre-processing, including processing of related DTDs, Schemas and namespaces. Such measure can be avoided for documents which do not require such pre-processing to yield an infoset that is faithful. That is, for documents which do not reference XInclude, DTDs, XML Schemas and so on. Document authors, particularly XHTML document authors, who wish their documents to be unambiguous when used with GRDDL should avoid dependencies on an external DTD subset
]]
That seems to be saying that if the GRDDL transformation is written carefully, or if the input XML document is written in a restricted subset of XML, then the result is deterministic (i.e., the transform always produces the same RDF triples given the same input), otherwise the result is non-deterministic (i.e., different implementations conforming to the GRDDL spec may legitimately produce different RDF triples).

I find this somewhat troubling, because a key purpose of expressing information in RDF is to be clear about what is being asserted. So if it isn't clear what is being asserted, that seems to somewhat defeat the purpose.
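
To make the concern concrete, here is a minimal sketch (in Python, not taken from the spec) of two agents that differ only in whether they process XInclude before gleaning triples. Everything in it is an assumption made for illustration: the contents of Cats.xml and Pets.xml, the `load` and `glean` helpers, and the `ex:Cat` vocabulary are hypothetical, and `glean` is only a stand-in for a real GRDDL transformation.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical documents, held in memory so the sketch is self-contained.
DOCS = {
    "Cats.xml": "<cats><cat name='Tom'/></cats>",
}
PETS = (
    "<pets xmlns:xi='http://www.w3.org/2001/XInclude'>"
    "<xi:include href='Cats.xml' parse='xml'/>"
    "</pets>"
)


def load(href, parse, encoding=None):
    """Loader for ElementInclude that reads from DOCS instead of the network."""
    return ET.fromstring(DOCS[href])


def glean(root):
    """Stand-in for a GRDDL transformation: one triple per <cat> element."""
    return sorted((c.get("name"), "rdf:type", "ex:Cat") for c in root.iter("cat"))


def agent_without_xinclude(xml_text):
    # This agent hands the parsed document to the transformation as-is.
    return glean(ET.fromstring(xml_text))


def agent_with_xinclude(xml_text):
    # This agent expands xi:include elements before transforming.
    root = ET.fromstring(xml_text)
    ElementInclude.include(root, loader=load)
    return glean(root)


triples_a = agent_without_xinclude(PETS)
triples_b = agent_with_xinclude(PETS)
print(triples_a)
print(triples_b)
```

Under these assumptions the first agent finds no `cat` elements and the second finds one, so the same source document yields two different sets of triples even though neither agent did anything the quoted text rules out.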

First, I think we should assume that XML document authors cannot (in general) limit their documents to using only a particular subset of XML, because the authors may have little or no control over the schema and other conventions to which their documents must conform. Therefore (if I have understood the GRDDL spec correctly) in order to achieve unambiguous transformations, the burden would be on GRDDL transformation authors to write their transformations in the proper way to achieve determinism. To my mind this raises two issues:

- Why should GRDDL transformation authors be permitted to write ambiguous transformations, given that a key purpose of expressing information in RDF is to be unambiguous?

- If there is a really good reason why GRDDL transformations should not be required to be unambiguous, then it seems critical that the GRDDL spec should strongly encourage unambiguous transformations, both by providing very clear and prominent guidelines, and, ideally, by providing a validator (or GRDDL "lint") that could ensure that those guidelines were met. Is this planned?

4. Regarding Section 2:
http://www.w3.org/2004/01/rdxh/spec#grddl-xml
[[
2. To resolve the relative URI reference glean_title.xsl to absolute form, we use the base URI of this XML element, which is http://www.w3.org/2001/sw/grddl-wg/td/titleauthor.html in this example.
]]
It is not clear where the base URI in this example is coming from. Does the sentence above mean:
[[
2. To resolve the relative URI reference glean_title.xsl to absolute form, we use the base URI of this XML element, which *we shall assume* is http://www.w3.org/2001/sw/grddl-wg/td/titleauthor.html in this example.
]]

Thanks

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software
Received on Friday, 27 April 2007 06:50:02 UTC