Re: Comments on Last-Call Working Draft of RDF 1.1 Semantics from Michael Schneider on 2013-12-13 (public-rdf-comments@w3.org from December 2013)

From: Michael Schneider <schneid@fzi.de>
Date: Fri, 13 Dec 2013 23:02:58 +0100
To: Pat Hayes <phayes@ihmc.us>
CC: Guus Schreiber <guus.schreiber@vu.nl>, "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>
Message-ID: <52AB8412.9080903@fzi.de>
Hi Pat!

As you asked me in another mail to reply to your technical points, 
I will come back in this mail to one particular point that you 
made in your long answer to my original reply. I consider this 
particular point of major relevance for this discussion, as I believe 
that it represents the fundamental difference between our views:

On 07.12.2013 11:01, Pat Hayes wrote:
> Michael, let me try again to explain the relationship between
> the 2004 and 2013 treatments of how datatypes are identified.
> [...]
> In 2004, the D parameter was a function from a set of IRIs
> to datatypes, and given the generality of the way the
> semantics were stated, this was an arbitrary function.

In any case, in 2004, each single mapping <u,d> in D determined, for a 
given specific entailment regime D-X, which datatype d (i.e. the 
combination of a lexical space, a value space, and a lexical-to-value 
mapping) the datatype IRI u refers to. This was an explicit(!) statement 
of the information required to fully determine the formal meaning of the 
entailment regime D-X.
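
As a minimal illustration (not taken from any RDF specification), such 
an explicit datatype map can be sketched in Python. All names here 
(Datatype, D, literal_value) are hypothetical, and xsd:boolean is used 
only because its lexical space is small enough to write out in full:

```python
from dataclasses import dataclass
from typing import Callable, Dict

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Datatype:
    """A datatype d: lexical space, value space, lexical-to-value mapping."""
    in_lexical_space: Callable[[str], bool]
    in_value_space: Callable[[object], bool]
    lex2value: Callable[[str], object]


# xsd:boolean has the lexical forms "true", "false", "1" and "0".
xsd_boolean = Datatype(
    in_lexical_space=lambda s: s in {"true", "false", "1", "0"},
    in_value_space=lambda v: isinstance(v, bool),
    lex2value=lambda s: s in {"true", "1"},
)

# D as an explicit set of pairs <u, d>: the entailment regime itself
# states which datatype each IRI refers to.
D: Dict[str, Datatype] = {XSD + "boolean": xsd_boolean}


def literal_value(D: Dict[str, Datatype], lexical: str, datatype_iri: str) -> object:
    """Value of a typed literal under D (hypothetical helper).

    Raises KeyError if the IRI is not in D, and ValueError if the
    lexical form is outside the datatype's lexical space.
    """
    d = D[datatype_iri]
    if not d.in_lexical_space(lexical):
        raise ValueError("ill-typed literal")
    return d.lex2value(lexical)
```

With D given this way, the value of "1"^^xsd:boolean follows from D 
alone, without consulting anything outside the definition.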

Now to the actual point I want to make:

> [...]
> Now, let us examine this carefully. How exactly are we referring
> to datatypes here? In fact, the only way we have available is
> either to use a phrase like "the datatype referred to as "int"
> in the XML Schema (part 2) specification document", or, less
> ambiguously and more compactly, to use the URI-reference naming
> convention specified in that very document. I used this convention
> myself when writing the equation displayed above, to refer to the
> datatype called "anyURI". And this is inevitable: the only way we
> have available to refer to a datatype is to use Web conventions,
> defined by documents and specifications external to RDF, which
> define what datatype a certain IRI is the name of.

What is meant by this statement? Obviously, it is entirely the 
responsibility of an entailment regime D-X to specify the associated 
datatype for each of the IRIs mentioned in D. Otherwise the 
entailment regime would not be sufficiently specified, or might even 
not be well-defined.

In particular, one cannot rely on any "web conventions" to provide the 
definition of what the datatype IRIs in D denote. The formal 
meaning of an entailment regime and the mathematical results that one 
can draw from it must not depend on the current state of the Semantic 
Web! Otherwise I could simply "hijack" an existing entailment regime by 
publishing a nonsense datatype on the web and giving it an IRI used by 
some existing entailment regime. The RDF spec may perhaps provide a 
small amount of protection against this kind of "semantic hijacking 
attack" for the small number of XSD types by adding some semantic 
restrictions on their names, as is currently done in the RDF 1.1 
Concepts spec, but it cannot do so for every datatype around, such as 
datatypes for physical units or domain-specific custom datatypes, or 
even for datatypes to be invented in the future. After all, it's fully 
up to the entailment regime to make precisely and completely clear 
what's in and out of its scope!

> There are no "mappings" available to an RDF processor, other
> than those defined by such external specifications or
> conventions. Such a processor is simply presented with an IRI
> used in a typed literal, and it - the processor - either
> 'knows' what datatype this IRI is being conventionally used
> to denote, or it does not. If it does, then this externally
> defined meaning of the IRI is what it should be interpreting
> the IRI to mean.

If such a processor, for a given specific entailment regime D-X and an 
IRI u mentioned in D, does not know what datatype d is denoted by u, 
then such a processor is simply /not compliant/ with D-X. Plain and 
simple! It is the entailment regime that specifies what it means for a 
processor to be correct and complete for this entailment regime, and 
nothing else.

And the semantics specification that underlies such an entailment 
regime, here: the RDF 1.1 Semantics, has to provide all technical means 
such that it is always possible for the author of an entailment regime 
to define her entailment regime in a way that makes it possible to check 
(prove!) for every semantic processor whether it is sound and/or 
complete with regard to this entailment regime.

And this requirement is not restricted to processing tools: it must, of 
course, also be possible to decide mathematically for every RDF graph G 
or pair of RDF graphs (G1,G2), whether G is satisfiable w.r.t. D-X, or 
not, or whether G1 entails G2, or not. Same for proving correctness and 
completeness of algorithmic reasoning methods for the entailment regime. 
Same for proving semantic relationships between this and other 
entailment regimes, such as that one is a semantic extension of the 
other. And so on.

> In other words, it should implicitly *use the Web and the
> external-to-RDF world* to determine what datatype is being
> referred to by the type IRI. And if it cannot do this -
> If it does not recognize the IRI as one that is mapped
> to an IRI by any known specification - then there are
> essentially no useful inferences it can make about the
> literal.

If I followed your argument, this would be like saying that if someone 
who is trying to prove or disprove whether a given graph G1 
entails another graph G2 does /not/ know the meaning of one of the 
datatype IRIs in D, then the result may be different from the result 
obtained by someone who /does/ know the associated datatype. So the 
entailment depends on the particular state of knowledge of that person? 
Are we still in a technology/logic field here?
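
To make the concern concrete, here is a small Python sketch under 
simplifying assumptions: a datatype is reduced to a lexical-space test 
plus a lexical-to-value function, xsd:integer is approximated by a 
regular expression and Python's int, and same_value, D_intended and 
D_other are hypothetical names. It is not an entailment checker; it 
only asks whether two typed literals with the same datatype IRI denote 
the same value, which is the step on which an entailment such as

    { s p "01"^^xsd:integer }  entails  { s p "1"^^xsd:integer }

depends:

```python
import re
from typing import Callable, Dict, Tuple

# Simplified datatype: (lexical-space test, lexical-to-value mapping).
Datatype = Tuple[Callable[[str], bool], Callable[[str], object]]

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Approximation of xsd:integer, for illustration only.
integer_dt: Datatype = (
    lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    int,
)

# A different datatype attached to the same IRI: every string is a
# lexical form and denotes itself.
string_like_dt: Datatype = (lambda s: True, lambda s: s)

D_intended: Dict[str, Datatype] = {XSD_INTEGER: integer_dt}
D_other: Dict[str, Datatype] = {XSD_INTEGER: string_like_dt}


def same_value(D: Dict[str, Datatype], iri: str, lex1: str, lex2: str) -> bool:
    """True if both lexical forms are well-typed under D[iri] and denote
    the same value."""
    in_lexical_space, lex2value = D[iri]
    if not (in_lexical_space(lex1) and in_lexical_space(lex2)):
        return False
    return lex2value(lex1) == lex2value(lex2)


print(same_value(D_intended, XSD_INTEGER, "01", "1"))  # True
print(same_value(D_other, XSD_INTEGER, "01", "1"))     # False
```

The two answers differ only because the pair <u,d> differs, which is 
why the choice of d for each u has to be fixed by the entailment regime 
rather than by whatever a given reader or processor happens to know.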

Again: the meaning of the IRI needs to be fully defined by the 
entailment regime itself, and has to be technically self-contained. This 
does not exclude, of course, that the entailment regime refers to 
external specifications of datatypes, like for the XSD datatypes, which 
are editorially external to the definition of D-RDFS (if we regard a D 
consisting of XSD datatypes). But it remains exclusively up to the 
definition of the entailment regime to say that "this particular IRI is 
associated with that particular datatype (from that particular spec, if 
needed)".

In other words: I couldn't disagree more with your view here. If this is 
what is behind this change (and I really hadn't expected this), then 
it is clear that we are so far apart that we won't find any agreement. 
With the sad consequence that my only choice will then be - and, believe 
me, I'm going to hate to do this - to formally object. A formal 
semantics specification needs to be a formal semantics specification, at 
least, and not draw parts of its formal meaning from what's available in 
the world around!

Regards,
Michael
Received on Friday, 13 December 2013 22:03:23 UTC