- From: Booth, David (HP Software - Boston) <dbooth@hp.com>
- Date: Thu, 28 Jun 2007 10:29:39 -0400
- To: <public-grddl-comments@w3.org>
Per the working group's decision yesterday
http://www.w3.org/2007/06/27-grddl-wg-minutes.html#item05
to adopt proposal 3c
http://lists.w3.org/Archives/Public/public-grddl-wg/2007Jun/0333.html
I am satisfied with this resolution.

David Booth, Ph.D.
HP Software
+1 617 629 8881 office | dbooth@hp.com
http://www.hp.com/go/software

Opinions expressed herein are those of the author and do not represent
the official views of HP unless explicitly stated otherwise.

> -----Original Message-----
> From: public-grddl-comments-request@w3.org
> [mailto:public-grddl-comments-request@w3.org] On Behalf Of
> Booth, David (HP Software - Boston)
> Sent: Tuesday, May 29, 2007 12:33 AM
> To: public-grddl-comments@w3.org
> Cc: Jeremy Carroll; McBride, Brian
> Subject: issue-dbooth-3: Ambiguity in an XML document's
> intended GRDDL results
>
> This is a personal comment -- not on behalf of HP.
>
> This comment is about ambiguity in an XML instance document's
> *intended* GRDDL results. Such ambiguity should be distinguished from
> cases where the GRDDL-aware agent *knowingly* chooses to deviate from
> the GRDDL transformation author's expressed intent (for security or
> other reasons), and thus accepts responsibility for any differences
> between the computed results and the GRDDL transformation author's
> intended results.
>
> Definition: By "XML instance document" I am referring to a concrete
> "representation" in the TAG WebArch sense -- not an "information
> resource".
>
> POINT 1: For any XML instance document, to the extent possible, the
> GRDDL spec should make it clear exactly what are the intended GRDDL
> results for that XML instance document. Two implementations
> faithfully implementing the GRDDL spec should come to the same
> conclusions about what those intended GRDDL results should be, i.e.,
> there should be no ambiguity.
>
> I do not think the GRDDL specification should be considered finished
> until the spec makes this clear, given that:
> - GRDDL is the cornerstone for bridging the worlds of XML and RDF.
> - A key purpose in expressing semantics in RDF is to make them
>   *unambiguous*.
> - GRDDL is on track to become a W3C Recommendation.
> - GRDDL may have quite a long life. Both XML and RDF have been around
>   for several years with little change, and show no signs of being
>   replaced. I see no reason why GRDDL should not have a similar
>   lifespan.
>
> POINT 2: At present, it is not clear what is the view of the Working
> Group (WG) toward ambiguity in an XML document's intended GRDDL
> results, i.e., whether the WG believes:
>
> a. it is a problem, but we do not know a solution;
> b. it is a problem now, but we expect the problem to go away
>    when the XProc or some other spec is completed; or
> c. the WG does not consider it a problem.
>
> I would vehemently object to position c, for the reasons above. In
> the case of position a, I believe there *are* ways to reduce or
> eliminate such unintended ambiguity, and I will be happy to suggest
> ways to do so. In the case of position b, I think it is important
> that the WG make clear exactly *how* XProc or some other spec is
> intended to make the problem go away, and indicate that in the spec.
> At present, the spec explicitly allows the intended results to be
> implementation defined, which IMO is unacceptable for a spec of this
> kind.
>
> POINT 3: The spec needs to define a notion of "complete GRDDL results"
> for a given XML instance document. It is good that the specification
> describes how partial GRDDL results can be determined, because partial
> results may be adequate for many applications.
> But the spec also needs to clearly define what constitutes the
> *complete* GRDDL results indicated by a given XML instance document,
> i.e., all and only the intended GRDDL results for all GRDDL
> transformations indicated by that XML instance document.
>
> This is particularly important in supporting applications in which
> GRDDL is used to express the *entire* semantics of an XML instance
> document, such as a messaging application as described in
> issue-dbooth-9a,
> http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html
> i.e., where custom XML document types are created or treated as custom
> serializations of RDF, as described in
> http://dbooth.org/2007/rdf-and-soa/rdf-and-soa-paper.htm .
> One must be able to say with clarity: "For this XML instance document,
> the complete GRDDL results are intended to be precisely the following
> RDF triples -- no more and no less."
>
> (Note that the spec currently defines GRDDL results in relation to
> information resources rather than XML instance documents (i.e.,
> representations), and this is needed for namespace and profile URIs,
> but it is not sufficient. GRDDL results *also* need to be defined in
> terms of XML instance documents (i.e., representations), because as
> pointed out in issue-dbooth-9a,
> http://lists.w3.org/Archives/Public/public-grddl-comments/2007AprJun/0069.html ,
> it *always* makes sense to talk about the GRDDL results of an XML
> instance document, but it does *not* always make sense to talk about
> the GRDDL results of an information resource.)
>
> Tellingly, I notice that the WG has routinely been using an implicit
> concept of the complete GRDDL results (though not using this term)
> when discussing and comparing test results, for example when two
> testers talk about whether they got "the same" results for a
> particular test case.
>
> Furthermore, the algorithm given in sec 7 of the GRDDL spec
> http://www.w3.org/2004/01/rdxh/spec#sec_agt
> describes most of the process needed to determine the complete GRDDL
> results for a particular XML instance document, but:
> - it does not define a conformance term for people to use;
> - it is defined in terms of a URI as a starting point, which
>   introduces much more ambiguity than being defined in terms of an
>   XML instance document as the starting point;
> - it is intended for describing partial GRDDL results; and
> - more needs to be nailed down to define the notion of complete GRDDL
>   results.
>
> Namespace and profile information URIs make it much more difficult to
> define the notion of complete GRDDL results, because there is no
> guarantee that the GRDDL processor is able to retrieve the correct
> namespace or profile representation that specifies all of the intended
> grddl:namespaceTransformations or grddl:profileTransformations that
> the author intended should be applied. However, this difficulty can be
> overcome by adding something to the Faithful Renditions section to the
> effect that:
>
>   "By specifying a GRDDL namespace transformation or profile
>   transformation in a representation of a namespace or profile
>   information resource, the creator of that namespace or
>   profile states that every other representation of that same
>   information resource that also specifies a GRDDL namespace
>   transformation or profile transformation is functionally
>   equivalent."
>
> If desired, I can describe in more detail how this can be done.
>
> This approach will work when namespace and profile documents have
> representations available that define GRDDL transformations.
> But many XML instance documents will need to make use of namespaces or
> profile documents that will not have such representations available,
> and since the dependency for defining complete GRDDL results is
> recursive through all namespace and profile documents, it seems likely
> that in many cases this approach will be infeasible. Therefore, the
> GRDDL spec should also define a short-cut mechanism to allow an XML
> instance document to specify, for example, a
> grddl:completeTransformation attribute whose presence would indicate
> that namespace and profile documents do *not* need to be processed in
> order to determine the complete GRDDL results.
>
> To cover xhtml document types that cannot contain
> grddl:completeTransformation annotations directly, this approach
> *could* also be extended by defining a
> grddl:completeProfileTransformation property whose presence would have
> a similar effect of saying: "there is no need to look at any other
> profile documents". However it may be less important to know the
> complete GRDDL results for xhtml documents than it is for XML
> documents in general, so such an attribute may not be necessary.
>
> POINT 4: The Faithful Rendition section is excellent for making clear
> how the semantics of GRDDL results should be interpreted. However, I
> will note that its intent is somewhat unclear, as it could mean either
> or both of:
>
> - The RDF results of a GRDDL transformation reflect real-life
>   semantics of the input XML instance document, however these
>   semantics may be a subset of the full semantics of that document.
>   (In essence, they are whatever subset of the full semantics the
>   GRDDL transformation author has chosen to expose via GRDDL.)
>
> - GRDDL results for a given XML instance document may be ambiguous
>   (implementation defined), and it is the GRDDL transformation
>   author's responsibility to anticipate this ambiguity and ensure that
>   the results reflect real-life semantics of the input XML instance
>   document anyway.
>
> I like the first interpretation, and I consider that as a feature of
> the spec. I do not like the second -- and I view it as a bug in the
> spec -- because it merely foists the ambiguity problem off to the
> GRDDL transformation author, and as I point out below, AFAICT it is
> not even *possible* for the GRDDL transformation author to always
> write transformations that produce correct, unambiguous results.
>
> POINT 5: In discussing the Faithful Rendition assurance, Section 6
> explicitly says: "Therefore, it is suggested that GRDDL
> transformations be written so that they perform all expected
> pre-processing . . . .". But if the GRDDL transformation requires a
> particular sequence of pre-processing, or it requires there to be *no*
> pre-processing, then AFAICT it is not possible for the transformation
> author to control this if pre-processing is explicitly permitted to be
> arbitrarily chosen by the implementation before the GRDDL
> transformation ever sees the input.
>
> For example, suppose my schema includes blocks of XML code from other
> documents, and I define a <myns:quote> tag to prevent the embedded
> chunks of XML from being interpreted, and suppose that one of those
> embedded chunks uses xinclude:
>
>   <myns:myDoc . . . >
>     <myns:quote>
>       <otherNs:whatever>
>         <xi:include href="http://example.org/do-not-expand" />
>       </otherNs:whatever>
>     </myns:quote>
>   </myns:myDoc>
>
> When this document is GRDDL transformed, the entire chunk of XML
> inside the <myns:quote> element is supposed to become the value of an
> RDF property *verbatim*, without expanding the xi:include directive.
> If the XML parser is permitted to expand or not expand the xi:include
> directive at its discretion, before the GRDDL transformation even sees
> it, then it is not possible for the GRDDL transformation author to
> ensure that correct results will be produced.
>
> Again, please let me know how I can be most helpful in resolving this
> issue.
>
> Thanks,
>
> David Booth, Ph.D.
> HP Software
> +1 617 629 8881 office | dbooth@hp.com
> http://www.hp.com/go/software
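
The xi:include concern in POINT 5 of the quoted message can be illustrated with a rough sketch. It uses Python's standard xml.etree.ElementInclude module as a stand-in for a parser that may or may not expand XInclude before a transformation runs; it is not a GRDDL implementation. The namespace URIs bound to myns and otherNs, the stub_loader function, and the <included> element are assumptions made for illustration only and do not come from the original message.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# The document from POINT 5. The myns/otherNs namespace URIs are
# hypothetical; the original message elides them.
DOC = """\
<myns:myDoc xmlns:myns="http://example.org/myns"
            xmlns:otherNs="http://example.org/otherNs"
            xmlns:xi="http://www.w3.org/2001/XInclude">
  <myns:quote>
    <otherNs:whatever>
      <xi:include href="http://example.org/do-not-expand"/>
    </otherNs:whatever>
  </myns:quote>
</myns:myDoc>
"""

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
QUOTE = "{http://example.org/myns}quote"


def stub_loader(href, parse, encoding=None):
    # Hypothetical loader: returns a placeholder element instead of
    # fetching href over the network.
    return ET.fromstring("<included>content of %s</included>" % href)


def tags_inside_quote(expand):
    """Return the element tags a transformation would see under <myns:quote>."""
    root = ET.fromstring(DOC)
    if expand:
        # Pre-processing step chosen by the implementation, not by the
        # transformation author.
        ElementInclude.include(root, loader=stub_loader)
    quote = root.find(QUOTE)
    return [el.tag for el in quote.iter() if el is not quote]


if __name__ == "__main__":
    print("without XInclude expansion:", tags_inside_quote(expand=False))
    print("with XInclude expansion:   ", tags_inside_quote(expand=True))
```

In this sketch the two runs hand different trees to whatever comes next: one still contains the xi:include element, the other contains the substituted content. A transformation that copies the contents of <myns:quote> into an RDF literal would therefore produce different literals depending on a choice it cannot observe.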
Received on Thursday, 28 June 2007 14:29:52 UTC