Re: Comments on Last-Call Working Draft of RDF 1.1 Semantics

On Dec 13, 2013, at 2:02 PM, Michael Schneider <schneid@fzi.de> wrote:

> Hi Pat!
> 
> As you have asked me in another mail to reply to your technical points, I will come back in this mail and reply to one particular point that you made in your long answer to my original reply. I consider this particular point of major relevance for this discussion, as I believe that it represents the fundamental differences between our views:

Having read to the end of your email, I agree. I rather suspected that the real reason for your (and maybe also Antoine's) stubbornness over what seems like a very small matter was that the new wording violates a kind of logical/mathematical purity which you feel is, or should be, inviolable. And if I am right, then this is indeed a fundamental difference between our views. 

> Am 07.12.2013 11:01, schrieb Pat Hayes:
>> Michael, let me try again to explain the relationship between
>> the 2004 and 2013 treatments of how datatypes are identified.
>> [...]
>> In 2004, the D parameter was a function from a set of IRIs
>> to datatypes, and given the generality of the way the
>> semantics were stated, this was an arbitrary function.
> 
> In any case, in 2004, each singular mapping <u,d> in D did determine, for a given specific entailment regime D-X, which datatype IRI u refers to which datatype d (i.e. the combination of a lexical space, value space, and lex2value mapping). This has been an explicit(!) statement of the information required to fully determine the formal meaning of the entailment regime D-X.

Suppose it had been phrased without using the terminology "datatype map", but defined simply as a fixed interpretation mapping on the set of datatype IRIs (ie the D of the 2013 draft). This would have been just as exact, do you agree? The only difference between that and the current draft is that the current draft assumes in addition that the fixed interpretation of the D vocabulary is determined by Web naming conventions external to RDF. It seems then that your problem is with this second assumption.
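
As a rough illustration of that comparison, here is a minimal sketch of a datatype map viewed as a fixed interpretation mapping on datatype IRIs. It is an assumption-laden toy, not text from either the 2004 or the 2013 document: the names `Datatype`, `DATATYPE_MAP` and `literal_value` are invented for this example, and only two XSD datatypes are modeled, in simplified form.

```python
from dataclasses import dataclass
from typing import Callable

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Datatype:
    """Hypothetical stand-in for a datatype: a lexical space plus a
    lexical-to-value mapping (the value space is left implicit)."""
    in_lexical_space: Callable[[str], bool]
    lexical_to_value: Callable[[str], object]


def _is_integer_lexical(s: str) -> bool:
    # Simplified xsd:integer lexical space: optional sign, then ASCII digits.
    body = s[1:] if s[:1] in ("+", "-") else s
    return body.isascii() and body.isdigit()


# The "datatype map": a fixed mapping from datatype IRIs to datatypes.
# Whether one calls this a parameter of the entailment regime or the
# interpretation of the IRIs in D, it is the same mathematical object.
DATATYPE_MAP = {
    XSD + "integer": Datatype(_is_integer_lexical, int),
    XSD + "boolean": Datatype(
        lambda s: s in ("true", "false", "1", "0"),
        lambda s: s in ("true", "1"),
    ),
}


def literal_value(lexical: str, datatype_iri: str, dmap=DATATYPE_MAP):
    """Return the value of a typed literal under the given map.

    Raises KeyError if the IRI is not in the map, and ValueError if the
    lexical form is outside the datatype's lexical space (ill-typed).
    """
    datatype = dmap[datatype_iri]
    if not datatype.in_lexical_space(lexical):
        raise ValueError(f"ill-typed literal: {lexical!r}^^<{datatype_iri}>")
    return datatype.lexical_to_value(lexical)
```

For example, `literal_value("042", XSD + "integer")` gives the integer 42. The sketch deliberately says nothing about where the dictionary comes from; that is the one point on which the two formulations differ.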

> 
> Now to the actual point I am about:
> 
>> [...]
>> Now, let us examine this carefully. How exactly are we referring
>> to datatypes here? In fact, the only way we have available is
>> either to use a phrase like "the datatype referred to as "int"
>> in the XML Schema (part 2) specification document", or, less
>> ambiguously and more compactly, to use the URI-reference naming
>> convention specified in that very document. I used this convention
>> myself when writing the equation displayed above, to refer to the
>> datatype called "anyURI". And this is inevitable: the only way we
>> have available to refer to a datatype is to use Web conventions,
>> defined by documents and specifications external to RDF, which
>> define what datatype a certain IRI is the name of.
> 
> What is meant by this statement? Obviously, it is entirely the responsibility of an entailment regime D-X to specify what the associated datatype is for each of the IRIs mentioned in D. Otherwise the entailment regime would not be sufficiently specified, or may even be non-well-defined.

Of course. But you seem to have missed my point. The only way to do this sufficient specification is to use the IRI naming the datatype, either directly or via some phrasing along the lines of "the datatype referred to as <IRI> in <document>", or perhaps "The datatype which <document> specifies as being the one corresponding to <IRI>". And all of these appeal, one way or the other, to the same Web naming conventions that the current Semantics draft refers to directly. 

> 
> In particular, one cannot rely on any "web conventions" to provide the definition of what the datatype IRIs in D are denoting.

I profoundly disagree. One has to rely on them, because there is no other way to do it. (If you disagree, I challenge you to show how else to do it.)

> The formal meaning of an entailment regime and the mathematical results that one can draw from it must not depend on the current state of the Semantic Web!

Why not? Seems to me that if this is indeed true, then entailment regimes would have little relevance for the semantic web. 

> Otherwise I could simply "hijack" an existing entailment regime by publishing a nonsense datatype on the web and give it an IRI of some existing entailment regime.

That would clearly not suffice to change what the IRI identifies, particularly if you are not the owner of the IRI in question. 

> The RDF spec may perhaps provide a small amount of protection against this kind of "semantic hijacking attack" for the small number of XSD types by adding some semantic restrictions on their names, as is currently done in the RDF-1.1 Concepts spec, but it cannot do so for every datatype around, such as datatypes for physical units or domain-specific custom datatypes, or even for datatypes to be invented in the future. After all, it's fully up to the entailment regime to make precisely and completely clear what's in and out of its scope!
> 
>> There are no "mappings" available to an RDF processor, other
>> than those defined by such external specifications or
>> conventions. Such a processor is simply presented with an IRI
>> used in a typed literal, and it - the processor - either
>> 'knows' what datatype this IRI is being conventionally used
>> to denote, or it does not. If it does, then this externally
>> defined meaning of the IRI is what it should be interpreting
>> the IRI to mean.
> 
> If such a processor, for a given specific entailment regime D-X and an IRI u mentioned in D, does not know what datatype d is denoted by u, then such a processor is simply /not compliant/ for D-X. Plain and simple! It is the entailment regime that specifies what it means for a processor to be correct and complete for this entailment regime, and nothing else.

But the only access to the datatype is through the IRI. Datatype maps are not transmitted or shared over the Web. A datatype map is simply an interpretation map applied to the datatype IRI vocabulary D. In the general case, a processor has no access to this interpretation map other than by using standard Web naming conventions and protocols to determine what the IRIs identify. So how can any RDF engine or processor know whether or not it is compliant to an arbitrary datatype map?
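
The situation of a processor that either does or does not 'know' a datatype IRI can be sketched as follows. This is an illustrative toy under stated assumptions, not an implementation of any RDF engine: `RECOGNIZED`, `parse_integer` and `same_value` are hypothetical names, only a simplified `xsd:integer` is recognized, and `http://example.org/dt#custom` is a made-up IRI standing in for a datatype the processor has never heard of.

```python
from typing import Callable, Dict, Optional, Tuple

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# A typed literal as a (lexical form, datatype IRI) pair.
Literal = Tuple[str, str]


def parse_integer(lexical: str) -> Optional[int]:
    """Simplified xsd:integer lexical-to-value mapping; None if ill-typed."""
    body = lexical[1:] if lexical[:1] in ("+", "-") else lexical
    if not (body.isascii() and body.isdigit()):
        return None
    return int(lexical)


# The datatype IRIs this hypothetical processor recognizes. All it is ever
# handed is the IRI; what the IRI identifies was learned from a published
# specification, not from the RDF it is processing.
RECOGNIZED: Dict[str, Callable[[str], Optional[object]]] = {
    XSD_INTEGER: parse_integer,
}


def same_value(a: Literal, b: Literal, recognized=RECOGNIZED) -> Optional[bool]:
    """True or False when the processor can decide whether two literals
    denote the same value; None when it cannot tell."""
    if a == b:
        return True  # identical literals always denote the same thing
    l2v_a = recognized.get(a[1])
    l2v_b = recognized.get(b[1])
    if l2v_a is None or l2v_b is None:
        return None  # unrecognized datatype IRI: the literal is opaque
    value_a, value_b = l2v_a(a[0]), l2v_b(b[0])
    if value_a is None or value_b is None:
        return None  # ill-typed literal: it has no value to compare
    return value_a == value_b
```

With `xsd:integer` recognized, `same_value(("042", XSD_INTEGER), ("42", XSD_INTEGER))` is `True`; the same two lexical forms typed with `http://example.org/dt#custom` give `None`, because the processor has nothing but the IRI to go on.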

> 
> And the semantics specification that underlies such an entailment regime, here: the RDF 1.1 Semantics, has to provide all technical means such that it is always possible for the author of an entailment regime to define her entailment regime in a way that makes it possible to check (prove!) for every semantic processor whether it is sound and/or complete with regard to this entailment regime.

Yes. What you have to do is exactly what the XML Schema authors did: publish a spec document which defines the datatypes in full detail and specifies which IRIs are to be used to identify them. This is exactly what the semantics says is required. [1] But note, it is not enough to just privately invent this datatype map and not tell anyone about it. The mere *existence* of the datatype map is not enough. The map has to be accessible by a Web-mediated route which starts with the IRI itself, ie you can (in principle, at any rate) take the IRI and use it to access the published information source that tells you what the IRI is used to refer to. All of this is discussed at length in other W3C publications, of course, and is implicitly referenced by the use of the term "identifies" when talking about IRIs.

> 
> And this requirement is not restricted to processing tools: it also has, of course, to be possible to decide mathematically for every RDF graph G or pair of RDF graphs (G1,G2), whether G is satisfiable w.r.t. D-X, or not, or whether G1 entails G2, or not. Same for proving correctness and completeness of algorithmic reasoning methods for the entailment regime. Same for proving semantic relationships between this and other entailment regimes, such as that one is a semantic extension of the other. And so on.
> 
>> In other words, it should implicitly *use the Web and the
>> external-to-RDF world* to determine what datatype is being
>> referred to by the type IRI. And if it cannot do this -
>> if it does not recognize the IRI as one that is mapped
>> to a datatype by any known specification - then there are
>> essentially no useful inferences it can make about the
>> literal.
> 
> If I'd followed your argumentation, then this would be like saying that if someone who is trying to prove or disprove whether a given graph G1 entails another graph G2 does /not/ know the meaning of one of the datatype IRIs in D, then the result may be different from the result obtained by someone who /does/ know the associated datatype? So the entailment depends on the particular state of knowledge of that person? Are we still in a technology/logics field here?
> 
> Again: the meaning of the IRI needs to be fully defined by the entailment regime itself, and has to be technically self-contained.

Why? 

> This does not exclude, of course, that the entailment regime refers to external specifications of datatypes, like for the XSD datatypes, which are editorially external to the definition of D-RDFS (if we regard a D consisting of XSD datatypes). But it remains exclusively up to the definition of the entailment regime to say that "this particular IRI is associated with that particular datatype (from that particular spec, if needed)".
> 
> In other words: I couldn't disagree more with your view here.

Or I with yours.

> If this is what is behind this change (and I really hadn't expected this), then it is clear that we are so far apart that we won't find any agreement.

Perhaps so, although it seems to me to be a storm in a tea-cup.

> With the sad consequence that my only choice will then be - and, believe me, I'm going to hate to do this - to formally object. A formal semantics specification needs to be a formal semantics specification, at least, and not draw parts of its formal meaning from what's available in the world around!

I think we can focus all the heat into this one sentence. Indeed, we disagree here. I believe, and indeed have always believed, that in a Web context, a formal semantics specification *must* be connected to the world around, as you put it, if it is to be of any real utility. 

You are clearly horrified by the idea that the meaning of some RDF might be influenced by what is published on the Web. To me, this seems absurd. It is like a linguist being horrified by the fact that word meanings are in part determined by how words are used. If (in an unlikely but possible scenario) an IRI has its meaning changed by the external authority which is responsible for defining that meaning – for example, if a datatype specification were to be changed without changing the IRIs used to denote the datatypes (as I say, highly unlikely and in gross violation of all best Web practice, but logically possible) – then the meaning of any RDF containing this IRI would indeed change. This is just a fact of life, and I see no reason why the RDF semantics should be artificially insulated from the unpalatable consequences of such an event. 

Pat

[1] The most recent draft has some wording added in section 7 which tries to briefly spell this out explicitly. It also, by the way, uses the term "datatype map" in several places to try to establish a clear link to the 2004 style of expression.

> 
> Regards,
> Michael
> 
> 

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Saturday, 14 December 2013 11:03:17 UTC