- From: Dan Brickley <danbri@w3.org>
- Date: Sat, 9 Sep 2000 10:52:32 -0400 (EDT)
- To: James Tauber <JTauber@bowstreet.com>
- cc: "'Jonathan Borden'" <jborden@mediaone.net>, www-rdf-interest@w3.org
(sorry, this is a bit long.)

Summary: infoset vs application dataset distinction is important. Some excerpts from Cambridge Communique to this effect. Speculation about using Schematron for the application dataset mapping problem.

On Fri, 8 Sep 2000, James Tauber wrote:

> > Suppose we have arbitrary XML
> >
> > <person>
> >   <name type='full'>
> >     <first>John</first>
> >     <last>Doe</last>
> >   </name>
> >   <name type='nickname'>Johnny Dee</name>
> > </person>
> >
> > is <name> a property of the person, or is name an instance of
> > a class which has properties <first> and <last>?
>
> That's up to the designer of the XML schema. It could be one, the other or
> both. What I am suggesting is the designer of the XML schema is the one that
> specifies how an instance maps to RDF triples.

I think that's right, if we want the triples to represent some meaningful entity/relationship style model rather than simply be an edge-labelled graph version of the DOM. I think Ora raised a similar point a week or two back; we should be wary of mechanically shovelling any/all XML into RDF and expecting something meaningful at the end.

One slippery point here is that quite a few people have (very productively) been looking at the latter scenario as well. Some of the early XML Query proposals projected arbitrary XML into an RDF-like graph model, *without taking into account the intentions behind the XML vocabularies being used*. While I can see value in both approaches, it's important to distinguish between them. The Cambridge Communique gives us some conceptual machinery that might help here.

Excerpts from...
http://www.w3.org/TR/1999/NOTE-schema-arch-19991007

  The Cambridge Communique
  W3C NOTE 7 October 1999

  1. The XML data model is the XML Information Set being specified by the XML Information Set Working Group. Other data models exist, both generic and application-specific. RDF is an example of one such generic data model. [...]
  2. An XML Schema schema document will be able to hold declarations for validating instance documents. It should also be able to hold declarations for mapping from instance document XML infosets to application-oriented data structures. [...]

  4. The extension mechanism should be appropriate for use to incorporate declarations ("mapping declarations") to aid the construction of application-oriented data structures (e.g. ones implementing the RDF model) as part of the schema-validation and XML infoset construction process. This facility should not be exclusive to RDF, but should also be useable to guide the construction of data structures conforming to other data models, e.g. UML.

  5. Such mapping declarations should ideally also be useable by other schema processors to map in the other direction, i.e. from application-oriented data structures to XML infosets.

By now it has become pretty clear that *both* the XML infoset data structures (elements + attributes stuff) *and* application-oriented data structures (eg. entity-relationships models, UML, RDF models) can be represented in edge-labelled graphs.

The thing that we need to be most careful about is talk of turning 'any arbitrary XML into RDF', as if there were a sole, simple answer to this challenge. ('Colloquial XML' is one phrase I've heard used btw).

I can think of lots of RDF-ifications of any chunk of 'colloquial' XML. In particular, two broad categories: one where we reflect infoset constructs directly into RDF, another where we reflect the XML-encoded "application data structures" into RDF without preserving details of that encoding. The latter seems to me to be one holy grail of web-data aggregation: we might have two differently serialized chunks of application data that were talking about the same stuff, and use Cambridge Communique-style mapping techniques to form a common representation.
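
To make the two categories concrete, here is a rough sketch (Python, standard library only) that turns James's <person> example into triples both ways. The "infoset:" and "ex:" property names, the blank-node labels, and the choice to attach first/last/nickname directly to the person are all assumptions made up for illustration; they are not from any schema or from the Communique, and a real mapping would be whatever the schema designer declares.

```python
import xml.etree.ElementTree as ET

PERSON_XML = """<person>
  <name type='full'>
    <first>John</first>
    <last>Doe</last>
  </name>
  <name type='nickname'>Johnny Dee</name>
</person>"""


def infoset_triples(root):
    """Category 1: mirror elements, attributes and text as graph arcs.

    The 'infoset:' property names are hypothetical placeholders.
    """
    triples = []
    counter = 0

    def visit(elem):
        nonlocal counter
        node = f"_:n{counter}"  # one blank node per element
        counter += 1
        triples.append((node, "infoset:elementName", elem.tag))
        for key, value in elem.attrib.items():
            triples.append((node, f"infoset:attribute/{key}", value))
        text = (elem.text or "").strip()
        if text:
            triples.append((node, "infoset:text", text))
        for child in elem:
            triples.append((node, "infoset:child", visit(child)))
        return node

    visit(root)
    return triples


def application_triples(root):
    """Category 2: one possible designer-chosen mapping.

    Here <name> is treated as an encoding detail and its content becomes
    properties of the person. The 'ex:' names are hypothetical.
    """
    person = "_:person"
    triples = []
    for name in root.findall("name"):
        kind = name.get("type")
        if kind == "full":
            triples.append((person, "ex:firstName", name.findtext("first")))
            triples.append((person, "ex:lastName", name.findtext("last")))
        elif kind == "nickname":
            triples.append((person, "ex:nickname", (name.text or "").strip()))
    return triples


if __name__ == "__main__":
    root = ET.fromstring(PERSON_XML)
    for triple in infoset_triples(root):
        print("infoset:    ", triple)
    for triple in application_triples(root):
        print("application:", triple)
```

The first function needs no knowledge of the vocabulary and keeps every encoding detail; the second throws the encoding away but only works because someone decided what <name type='...'> means. A different designer could just as well map <name> to a class with first and last properties.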
The alternative approach, infoset-over-RDF, has its uses too, so long as we don't make the mistake of assuming that nodes and arcs are an end in themselves...

So, I look forward to seeing how Redfoot shapes up. I'm wondering if Schematron might be an interesting model to follow, at least in its broad approach to using XSLT. See
http://www.ascc.net/xml/resource/schematron/schematron.html
and the paper at
http://www.ascc.net/xml/resource/schematron/Schematron2000.html
(which the former page mentions as in need of corrections, but is still a good read).

In particular, Schematron-RDF is intriguing. This "creates RDF statements for each detected pattern in a schema"...

Dan

ps. another reference to a SOAP/RDF thread from xml-dev some time back;
http://lists.w3.org/Archives/Public/www-rdf-interest/2000May/0114.html
quoting a helpful clarification from Andrew Layman that's (temporarily I hope) 404-ing at
http://xml.org/archives/xml-dev/2000/05/0335.html
Received on Saturday, 9 September 2000 10:52:34 UTC