Re: Comments on GRDDL (using 3rd-party XML schemas with GRDDL) [OK?] from C. M. Sperberg-McQueen on 2007-07-24 (public-grddl-comments@w3.org from July to September 2007)

From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
Date: Tue, 24 Jul 2007 12:12:21 -0600
To: "Booth, David (HP Software - Boston)" <dbooth@hp.com>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, "Dan Connolly" <connolly@w3.org>, "Andrew Eisenberg" <andrew.eisenberg@us.ibm.com>, <public-grddl-comments@w3.org>, <w3c-xsl-query@w3.org>
Message-Id: <B2D6D94D-C854-4264-98CC-E006AA99685D@acm.org>

On 18 Jul 2007, at 22:26 , Booth, David (HP Software - Boston) wrote:

 > This is not an official response from the GRDDL WG, but just a
 > couple of points that I hope will aid this discussion.

 > 1. A key concept of GRDDL is that the XML document author has
 > *authorized* the resulting RDF as a "faithful rendition" of the
 > original XML document: http://www.w3.org/TR/grddl/#sec_rend

 > The central idea is that there must be a clear chain of authority
 > leading from the XML document to the resulting RDF, either via
 > direct annotations or by explicit reference to a namespace or
 > profile document.

It seems to me to be a surprising and unfortunate lurch toward a
closed-world assumption, to require that the RDF representation of a
document originate with, or be authorized by, the creator of the
document.  I think that's a pretty severe design error, and one that
surprises me, coming from the Semantic Web activity.

But I'll accept that as a ground rule, if you tell me that that's what
you intended.  The problem reported to you from the XML Schema, XML
Query, and XSL Working Groups is that if GRDDL works on some schema
documents (by which I mean: GRDDL can locate transformations described
or mentioned in some schema documents), but only on schema documents
dereferenceable from a namespace name, then GRDDL is imposing a
restriction which has nothing to do with guaranteeing a clear chain of
authority from the XML document.

From your description I take the rationale for the restriction to be
roughly this:

   (1) Schema documents dereferenceable from the namespace name
       are clearly authorized by the owner of the namespace.

       (I think this is true, or at least plausible.)

   (2) Document authors who use a namespace clearly authorize the
       interpretation of their document according to the rules
       specified by the namespace owner.

       (Debatable, and clearly not true of any of the many many
       documents which commit tag abuse -- for those documents, the
       author clearly wishes that the interpretation of the document be
       that specified by the particular processor they have in mind,
       NOT the interpretation specified by the creator of the
       vocabulary.  But I have a certain sympathy with this premise.)

   (3) So GRDDL transforms pointed to from schema documents
       dereferenceable from the namespace name are authorized by the
       document author.

   (4) Schema documents pointed to from the document itself are
       clearly accepted by the document author as having authority.

   (5) So GRDDL transforms pointed to from schema documents
       pointed to from the document are authorized by the
       document author.

   (6) Since document authors have authorized GRDDL transforms
       pointed to from schema documents findable at the namespace
       name, or from schema documents pointed to from the document
       itself, GRDDL should support such schema documents.

       (So far, so good.)

   (7) Schema documents for a namespace which are not at the namespace
       name and not pointed to from the document, but found elsewhere
       (e.g. in a local repository), are not authorized by the
       author of the document.

       (Not true.)

   (8) Since document authors have not authorized GRDDL transforms
       pointed to from schema documents not findable at the namespace
       name and not pointed to from the document itself, GRDDL should
       not support such schema documents.

       (Since the premise is untrue, the conclusion is unsound.)

If I have understood your logic, the problem is that GRDDL is
attributing properties to schema documents and their locations which
don't seem to have any basis in the relevant specs (namely XML Schema
1.0 and 1.1).  Nothing in the XML Schema spec licenses the inference
that schema documents dereferenceable from a namespace name have
more authority than others.  The choice of schema documents is not
made by the author of the document but by the user or user agent
asking for validation to be performed.
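
To make that last point concrete, here is a minimal sketch of how a
validating processor might pick a schema document.  Everything in it is
an assumption made for illustration: the Instance and Processor
classes, the catalog and honor_hints settings, and the example.org
URIs are hypothetical and are not taken from the XML Schema or GRDDL
specs.  The only thing it is meant to show is that the same instance
document can end up paired with different schema documents depending
on how the processor is configured by whoever invokes it.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Instance:
    """Hypothetical stand-in for an XML instance document."""
    namespace: str
    # Value the author may have put in xsi:schemaLocation; only a hint.
    schema_location_hint: Optional[str] = None


@dataclass
class Processor:
    """Hypothetical validator configured by the user, not the author."""
    # Local repository: namespace name -> schema document location.
    catalog: Dict[str, str] = field(default_factory=dict)
    # Whether to follow the hint carried by the instance document.
    honor_hints: bool = True

    def choose_schema(self, doc: Instance) -> str:
        # 1. A locally configured schema document wins if present.
        if doc.namespace in self.catalog:
            return self.catalog[doc.namespace]
        # 2. Otherwise the processor may follow the author's hint.
        if self.honor_hints and doc.schema_location_hint:
            return doc.schema_location_hint
        # 3. Otherwise it falls back to dereferencing the namespace name.
        return doc.namespace


doc = Instance("http://example.org/ns", "http://example.org/hint.xsd")
local = Processor(catalog={"http://example.org/ns": "/local/ns.xsd"})
print(Processor().choose_schema(doc))
print(local.choose_schema(doc))
print(Processor(honor_hints=False).choose_schema(doc))
```

In this sketch the three calls select three different schema documents
for one unchanged instance document, and in none of them does the
location of the schema document by itself say anything about who
authorized it.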

 > 2. The use (or non-use) of XML Schema is irrelevant to GRDDL.

That's one reason it so mystifies me to see GRDDL making rules about
which schema documents are to be trusted and which are not.  Isn't
that something that should be out of scope for the GRDDL spec?

 > Independent of GRDDL, the same XML document may be used by different
 > applications that wish to transform that XML document to different
 > kinds of RDF for different purposes that are *not* necessarily
 > "faithful renditions" of the original XML document, just as the same
 > XML document may be used with different XML schemas, and not all of
 > them may be sanctioned by the original XML document author.  GRDDL
 > is not designed to support such arbitrary transformations.

Why on earth not?  (But this gets back to the closed- vs. open-world
attitude.)

 > It is only designed to support those that are demonstrably intended
 > to produce a "Faithful Rendition" as indicated by a clear chain of
 > authority leading from the original XML document as mentioned above.

 > This chain of authority is what permits XML document consumers to
 > reliably follow their noses from the XML document to authorized RDF
 > results *without* resorting to out-of-band communication between XML
 > document producers and consumers.

 > If XML document consumers are intended to use a transformation that
 > is *not* indicated directly or indirectly by the XML document -- a
 > 3rd party transformation --

How on earth do you get from the proposition "Transformation T is not
indicated directly by the XML document" to "Transformation T is a
third-party transformation" ?!  It could be a third-party
transformation.  It could be a transform specified by the owner of the
namespace; it could be a transform specified by the author of the
document.  How can you conclude that it's neither of those?

--Michael Sperberg-McQueen
Received on Tuesday, 24 July 2007 18:12:41 UTC