Re: The XML Schema GRDDL Story from Dominique Hazaël-Massieux on 2004-12-16 (www-archive@w3.org from December 2004)

From: Dominique Hazaël-Massieux <dom@w3.org>
Date: Thu, 16 Dec 2004 11:40:41 +0100
To: Dan Connolly <connolly@w3.org>
Cc: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, Eric Miller <em@w3.org>, Ben Adida <ben@mit.edu>, www-archive@w3.org
Message-Id: <1103193641.2786.128.camel@stratustier>

On Wed 15/12/2004 at 20:44, Dan Connolly wrote:
> >   - follow the hints given in schemaLocation attributes
> 
> That's what we're considering. data-view:transformation
> attributes serve a similar purpose, meanwhile.

I guess the point is:
- we want GRDDL to be able to work on families of documents sharing
similar structures/semantics
- given how GRDDL works, these document families should be identifiable
by a URI, preferably a dereferenceable one

Currently, GRDDL only groks the implicit convention that documents
whose root elements are in the same namespace belong to a single
family - this convention is not backed by any spec, and is even wrong in
some cases, e.g. the infamous XSLT LRE example.
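
To make the root namespace convention concrete, here is a minimal
sketch of how a GRDDL-like agent might pick the family URI from the
root element. The function name and the sample document are mine, not
taken from any GRDDL implementation; the sample is my reading of the LRE
case, i.e. a simplified XSLT stylesheet whose root happens to be in the
XHTML namespace.

```python
import xml.etree.ElementTree as ET


def root_namespace(xml_text):
    """Return the namespace URI of the root element, or None if unqualified.

    Hypothetical helper: it only illustrates the root-namespace convention.
    """
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace-uri}local-name".
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


# Assumed example: a stylesheet written as a literal result element (LRE).
# The root is in the XHTML namespace, yet the document is a stylesheet,
# not an XHTML page.
lre_stylesheet = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xsl:version="1.0">
  <body><xsl:value-of select="/doc/title"/></body>
</html>"""

print(root_namespace(lre_stylesheet))
```

The convention files this stylesheet under the XHTML family, which is
exactly the kind of case where it goes wrong.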

xsi:schemaLocation is another way of identifying a family of documents,
and given what XML Schemas are, there is a strong chance that documents
whose root elements share an xsi:schemaLocation share a common
structure/semantics (*) - I can't tell if this is stronger than the root
namespace convention in practice.
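
For illustration, a rough sketch of reading the xsi:schemaLocation hints
off the root element only, as discussed above. The attribute value is a
whitespace-separated list of (namespace URI, schema location) pairs; the
function name and the example.org URIs are made up for the example.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_locations(xml_text):
    """Return (namespace URI, schema location) pairs from the root element.

    Hypothetical helper; attributes on descendant elements are ignored.
    """
    root = ET.fromstring(xml_text)
    tokens = root.get("{%s}schemaLocation" % XSI_NS, "").split()
    # Tokens alternate: namespace URI, then the location hint for it.
    return list(zip(tokens[0::2], tokens[1::2]))


# Assumed sample document; the URIs are placeholders.
doc = """<po:order xmlns:po="http://example.org/po"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://example.org/po http://example.org/po.xsd"/>"""

for namespace, location in schema_locations(doc):
    print(namespace, "->", location)
```

The location half of each pair is what would play the role of the
family URI to dereference.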

(the profile attribute is another type of family of documents
identifier, for XHTML docs)

Another difference is, as Michael mentions, the control over the
dereferenceable URI used to identify the document. With GRDDL, the
following parties are involved:
A - one associating a family of documents with a set of GRDDL
transformations, thus defining the semantics attached to the documents
adhering to this family
B - one associating a document with a family of documents, by referencing
the URI of this family in a GRDDL-understandable way
C - one using the (direct or indirect) association between the document
and a set of GRDDL transformations to get the annotated semantics of the
document

With the namespace document solution, (A) is the one that owns the
namespace document, usually the party that actually created and defined
what the namespace is about.

With dataview:transformation, (A) is someone with write access to the
document.

Having xsi:schemaLocation recognized by GRDDL could allow a middle party
to come into play: the one controlling the specific schema used to
validate the document. While this can probably be achieved using, for
instance, an intermediate dataview:transformation, the benefit of relying
on xsi:schemaLocation is that it is already deployed, and bound to a
well-defined validation mechanism that allows some structure/semantics
consistency checking.

Now dataview:transformation could also be used for some sort of semantic
validation (doc -> RDF/XML -> RDF/OWL consistency checking)...

So I guess the questions to consider on whether or not to have GRDDL grok
xsi:schemaLocation are:
- can it be used to identify a family of documents with consistent
structure/semantics? I think it clearly can
- does it allow for a large deployment of semantics for XML documents? I
have no idea; I don't know how much xsi:schemaLocation is used in the
real world, nor if it would be easier to get the schemas referenced by
these to integrate GRDDL than with the namespace document case.
- how big is the cost of implementing this additional dereferencing
property? The main cost, as I see it, is that it opens up the possibility
of deep recursion when GRDDL-ing a document; it's hard to assess
whether this is important or not without more GRDDL deployment.
- where do we stop wrt integration of new syntactic links in GRDDL? I
don't know whether RelaxNG, NRL and co have a mechanism similar to XML
Schema's to bind documents and their schemas, but if/when they do, should
they be integrated into GRDDL too?
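
On the recursion cost mentioned in the list above, one possible way to
keep it bounded is a depth limit plus a record of the URIs already
visited. This is only a sketch under my own assumptions: the two
dictionaries stand in for whatever dereferencing would return, and
nothing here is prescribed by GRDDL.

```python
def collect_transformations(start, links, transforms, max_depth=3):
    """Walk family links from `start` and return the transformations found.

    links:      hypothetical map from a URI to the URIs it points at
                (namespace document, schema, profile, ...)
    transforms: hypothetical map from a URI to the transformations it declares
    """
    found, seen = [], set()
    frontier = [(start, 0)]
    while frontier:
        uri, depth = frontier.pop(0)
        # Skip URIs already processed (cycles) or beyond the depth limit.
        if uri in seen or depth > max_depth:
            continue
        seen.add(uri)
        found.extend(transforms.get(uri, []))
        frontier.extend((nxt, depth + 1) for nxt in links.get(uri, []))
    return found


# Assumed link graph with a cycle between the schema and the namespace doc.
links = {"doc": ["schema"], "schema": ["ns"], "ns": ["schema"]}
transforms = {"ns": ["t.xsl"]}

print(collect_transformations("doc", links, transforms))
```

Every new kind of link (xsi:schemaLocation, or whatever RelaxNG or NRL
might offer) just adds entries to the link map, so the same guard would
cover them.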

Dom

(*) I haven't put any thought into having GRDDL grok
xsi:schemaLocation on any element; I think it would likely fall into the
quotation/reification issues...
-- 
Dominique Hazaël-Massieux - http://www.w3.org/People/Dom/
W3C/ERCIM
mailto:dom@w3.org

Received on Thursday, 16 December 2004 10:41:00 UTC