- From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
- Date: Mon, 03 Dec 2001 09:41:23 -0500
- To: Patrick.Stickler@nokia.com
- Cc: www-rdf-interest@w3.org
I think that we are in the midst of a disagreement over what (an implementation of) RDF is.

My view is that an implementation of RDF, or RDF Schema, or RDF plus datatypes, is supposed to implement the specification of RDF, or RDF Schema, or RDF plus datatypes. The implementation is free to do this in any effective way that it chooses, but it is not free to deviate from the specification either by removal or by addition.

That's it. Simple, no? What could be the problem?

Well, the problem is that the RDF and RDFS documents are silent on what the specification of RDF or RDFS is! This is very surprising, but let's see what they do say.

The RDF Model and Syntax Specification (RDF MSS) does define a formal grammar for RDF and does provide some indication of a mapping from this formal grammar into RDF graphs. However, there is no interface defined for accessing RDF graphs, nor is there any interface for an RDF implementation to indicate whether the input it is given is actually syntactically valid RDF. In the absence of such indications it is permissible (permissible but not reasonable, by the way) to implement RDF as a sink that accepts any input and produces no output at all.

Of course any reasonable implementor of RDF goes farther than this, and any reasonable reader of the RDF MSS reads more than this into the specification. A typical response is to believe that the RDF MSS also specifies full access to the RDF graph and an out-of-band indication of syntactic errors. Under this reading, an RDF implementation is required to parse RDF syntax, to construct the RDF graph that corresponds to that input, and to provide access to the graph for use by applications.

So far so good. There is now a reasonable specification and a reasonable thing for implementations to do.

However, now along comes the RDF Core Working Group and they, perhaps inadvertently, provide a different specification of what an RDF implementation is supposed to do. What is this specification? It is the model theory.
The model theory provides the meaning for RDF and RDF Schema and, moreover, it provides an interface to this meaning, via entailment. Now reasonable implementors and reasonable readers have a different and, some would claim (myself among them), much better specification. An RDF implementation is supposed to accept RDF syntax and answer entailment questions, nothing more and nothing less.

There now seems to be an impasse. There are two very different, competing specifications for RDF. What is an implementor supposed to do?

All is *not* lost, provided that the RDF Core Working Group does its job correctly. It should turn out that the two specifications are the same or, more precisely, that the graph specification and interface is just a more concrete description of what is happening in the model theory. That is, an implementation that constructs a graph and allows access to this graph is just providing an alternative interface to the model theory and entailment. (Of course, this has not yet been proven.)

Now along come datatypes, and the whole point of this note.

The datatype model theory is going to end up saying quite a lot about datatypes. It will provide a meaning for the datatype constructions, including which datatype constructions make sense and which don't. This will mean that entailment has to take into account the meaning of such constructions. For example, the datatype model theory is going to have to answer under what conditions

    <John> <age> "10".

entails

    <John> <age> "010".

Any RDF implementation that does not produce the answers demanded by the model theory will not be in compliance with the model theory's specification of RDF. (Note that an RDF implementation is free to use any means to implement this entailment, such as passing all its input through an XML Schema validator that produces native, canonical representations for typed literals.)

Now what about the RDF graph specification for RDF?
Well, it either has to comply with the model theory or there will be two differing specifications for RDF plus datatypes, a very unhappy state of affairs.

So the disagreement here appears to be that I am looking at a model theory, suitably extended for datatypes, and inferring what RDF has to do based on this specification. You appear to be looking at a graph specification that does not correspond to the model theory.

I claim that your graph specification, aside from not matching the model theory, does not capture a reasonable specification of RDF plus datatypes. I further claim that there are ways of extending the graph specification that put all or almost all datatype syntax issues, including the lexical-to-value mapping, in a syntax phase that precedes any RDF-specific processing. (Note that this does not work for all datatype specifications: some specifications need access to a black-box lexical-to-value mapping at a later phase.)

Even further, I claim that there is no way to implement a reasonable view of the RDF specification without some processing of the datatype syntax within an RDF implementation itself, if only to determine what is syntactically valid. Given that an RDF plus datatypes implementation will have to process the datatypes, why not then provide a native interface to the underlying data? This interface will be much easier for applications than requiring them to accept pairs consisting of a lexical form and a type.

There is nothing in the above that requires XML Schema, by the way. A datatype extension for RDF that uses another datatype scheme could be devised. It would also be possible to parameterize the RDF specification so that any compatible datatype scheme could be used. It would also be possible, but somewhat harder, to parameterize a native interface. It would be somewhat easier, and I think probably the most reasonable path, to provide a parameterized interface in terms of a subset of the datatype scheme.
For example, for XML Schema, the interface could pass a pair like <integer,10> or even <integer,"10"> instead of <decimal with 0 fractionDigits union string,"010">. This would be much easier for applications to handle than requiring them to understand all of XML Schema's constructed datatypes.

Peter F. Patel-Schneider
Bell Labs Research
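
As a rough illustration of the entailment question and the <integer,10> style of interface discussed above, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the function names, the two-entry datatype table, and the premise that the datatype applying to <age> is already known. It does not reproduce any answer actually given by the model theory or by XML Schema; it only shows one way the "10" versus "010" question could come out differently depending on the datatype.

```python
import re

def parse_integer(lexical):
    """Hypothetical lexical-to-value mapping for an integer datatype.

    Accepts an optional sign followed by decimal digits and rejects
    anything else, so syntactic validity is checked in the same step.
    """
    if re.fullmatch(r"[+-]?[0-9]+", lexical) is None:
        raise ValueError("not a valid integer lexical form: %r" % lexical)
    return int(lexical)

# Illustrative table of lexical-to-value mappings (names are assumptions).
LEXICAL_TO_VALUE = {
    "integer": parse_integer,
    "string": str,  # a string denotes itself
}

def triple_entails(premise, conclusion, datatype):
    """Assumed entailment test between two (subject, predicate, literal)
    triples: same subject and predicate, and literals that map to the
    same value under the given datatype."""
    s1, p1, lex1 = premise
    s2, p2, lex2 = conclusion
    to_value = LEXICAL_TO_VALUE[datatype]
    return (s1, p1) == (s2, p2) and to_value(lex1) == to_value(lex2)

def native_pair(datatype, lexical):
    """Return a (type name, native value) pair such as ("integer", 10),
    rather than handing the application a lexical form plus a type."""
    return (datatype, LEXICAL_TO_VALUE[datatype](lexical))

premise = ("John", "age", "10")
conclusion = ("John", "age", "010")
print(triple_entails(premise, conclusion, "integer"))  # True
print(triple_entails(premise, conclusion, "string"))   # False
print(native_pair("integer", "010"))                   # ('integer', 10)
```

Under these assumptions the entailment holds when <age> takes integers and fails when it takes strings, and an application receiving the pair never has to look at the lexical form "010" at all.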
Received on Monday, 3 December 2001 09:42:37 UTC