Re: Comments on Last-Call Working Draft of RDF 1.1 Semantics from Pat Hayes on 2013-12-10 (public-rdf-comments@w3.org from December 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 9 Dec 2013 21:12:55 -0600
To: Michael Schneider <schneid@fzi.de>
Cc: "public-rdf-comments@w3.org Comments" <public-rdf-comments@w3.org>
Message-Id: <2BDF3648-2058-45F4-A693-C55A6D187C65@ihmc.us>
On Dec 9, 2013, at 4:15 PM, Michael Schneider <schneid@fzi.de> wrote:

> Pat,
> 
> you are the only one whose mails tend to be even longer than mine.

I am flattered :-)

> This change must really be extremely important to you to get it in, after so many years of everyone else apparently being happy with the original definition.

Not everyone else was happy with the original definition, or indeed with the entire RDF semantics document, which has been extensively rewritten in response to this unhappiness. A constant concern in this rewriting process was to simplify and rationalize things where possible. Many small changes to the semantic description were made with this as a motivating concern (eg, eliminating the notion of a vocabulary, rationalizing the descriptions of the built-in datatypes, defining ill-formed literals to be inconsistent.) The change to which you object was one of these. 

> But I hear you saying below that the problem "had been widely noted", and I guess that everyone probably just waited for the next RDF WG to finally make the change. Too sad, that no one ever told me all the time, not even when I have been a "Semantics editor" myself...

The people who were unhappy were not specification editors or logically trained readers, but rather people concerned with RDF uptake by software developers, and the rather steep initial learning barrier that the RDF specifications presented to the average programmer. 

> Anyways, I think, all points have been made. I have made mine, at least, and will stick with them.

I believe I have responded to all your technical points in some detail. I would be interested in seeing your detailed reply to my responses to your arguments.

> There will be no further involvement into the discussion from my side, except for answering concrete requests, e.g. for clarification, etc. It's now up to the WG to make a decision. What I can say is that if this change makes it into PR,
> I'm going to formally object, and my basic line of argumentation should be clear by now.

It is not clear to me. As the change does not affect any entailments and does not alter the actual interpretation structures being described, and as the documents now (in response to your original comment) exactly define the 2004 notion of datatype map so as to provide backwards compatibility with earlier specifications which use the concept, I do not see what the basis is for your objection, other than that you prefer the older style of exposition. 

Pat



> 
> Best,
> Michael
> 
> Am 09.12.2013 10:07, schrieb Pat Hayes:
>> More (unofficial) replies from me.
>> -Pat
>> 
>> On Dec 8, 2013, at 2:28 PM, Michael Schneider <schneid@fzi.de> wrote:
>> 
>>> Hi Richard!
>>> 
>>> Am 07.12.2013 02:52, schrieb Richard Cyganiak:
>>>> Hi Michael,
>>>> 
>>>> An unofficial response from the sidelines.
>>>> I always make a fool of myself when I comment
>>>> on Semantics matters, so I’m already regretting this email.
>>> 
>>> Very good that you step into this discussion, as you
>>> are one of the editors of the RDF Concepts document,
>>> and I hadn't realized yet that my points hit
>>> this document as well.
>>> 
>>>> But doesn’t http://www.w3.org/TR/rdf11-concepts/#datatype-maps
>>>> answer your concern? It says that XSD IRIs MUST denote their
>>>> respective datatypes. This is normative for Semantics.
>>> 
>>> But what about all the other, non-XSD datatypes, that are
>>> around (in OWL and elsewhere) or that can be invented as
>>> custom datatypes (e.g. a datatype for representing complex
>>> numbers)? What does the "MUST denote their respective
>>> datatypes" mean then?
>> 
>> It means, the IRI that is used to identify that datatype must indeed uniquely identify it, and be understood to so identify it throughout any RDF processing which involves that datatype IRI, i.e., in brief, it must be what is called an "identifier" in a large swathe of Web specifications and recommendations and TAG discussions.
>> 
>>> In the RDF 2004 spec, this was perfectly clear: once you have
>>> provided an explicit datatype map D, i.e. a set of pairs <aaa,x>
>>> of datatype IRIs aaa and their respective datatypes x (the
>>> latter, for example, given in terms of a pointer to another specification that defines that datatype)
>> 
>> This is exactly the same situation we have with the new description. Once you are provided with a datatype IRI which identifies a datatype (as you say, typically given in terms of a pointer to another specification; indeed, typically this pointer is the root IRI of the datatype IRI itself) then the "datatype map" is fixed. Indeed, this datatype map is simply the interpretation mapping of that datatype IRI, if we require (as we do in the semantic condition) that D-interpretations must interpret 'recognized' IRIs (those in D) to denote the datatypes they identify.
>> 
>>> , then the "General
>>> semantic conditions for datatypes", as defined in the old
>>> spec, would do the rest for you, e.g. the first of these
>>> semantic conditions is:
>>> 
>>>  "if <aaa,x> is in D then I(aaa) = x"
>>> 
>>> To be read as: "the datatype IRI aaa denotes its
>>> 'respective datatype' x."
>> 
>> Exactly. But notice what the 2004 description does: it introduces a new, unmotivated and *arbitrary* mapping from IRIs to values, then insists that the interpretation mapping be identical to this new mapping on datatype IRIs. All this amounts to is exactly what you just said: the datatype IRIs *denote* their respective datatypes. So the 2013 description starts with this idea: we *assume* that the vocabulary D has a fixed interpretation in which each IRI *denotes* a datatype. A D-interpretation is then just an interpretation which extends this fixed interpretation of D. This is no more nor less arbitrary or undefined than the 2004 description, in which the datatype map was arbitrary: it simply hands over to some external authority the task of assigning the interpretation of datatypes to IRIs. Which, as I tried to explain in more detail in my previous email response to you, reflects the actual reality of how datatypes are assigned to IRIs on the Web. And the 2013 way of describing this is simpler, because it does not introduce a new concept and immediately discard it in favor of an interpretation mapping, but talks throughout in terms of interpretations of IRIs.
>> 
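To make the comparison above concrete, here is a minimal sketch in Python of the two styles of description. It is an illustration only, not text from either specification: datatypes are modelled as plain strings, and the names EXTERNAL_DENOTATION, satisfies_2004 and satisfies_2013 are hypothetical.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: stands in for what the external (XSD) specification says
# each datatype IRI identifies. Datatypes are modelled as plain strings.
EXTERNAL_DENOTATION = {
    XSD + "integer": "the XSD integer datatype",
    XSD + "string": "the XSD string datatype",
}

def satisfies_2004(interp, datatype_map):
    """2004 style: for every pair <aaa, x> in D, require I(aaa) = x."""
    return all(interp.get(iri) == dt for iri, dt in datatype_map.items())

def satisfies_2013(interp, recognized):
    """2013 style: I must agree with the fixed denotation of each recognized IRI."""
    return all(interp.get(iri) == EXTERNAL_DENOTATION[iri] for iri in recognized)

# An interpretation mapping: IRIs (datatype IRIs and others) to what they denote.
I = {
    XSD + "integer": "the XSD integer datatype",
    XSD + "string": "the XSD string datatype",
    "http://example.org/alice": "some resource",
}

# 2004: D is a set of <IRI, datatype> pairs (here, one that follows the XSD spec).
D_2004 = dict(EXTERNAL_DENOTATION)
# 2013: D is just the set of recognized datatype IRIs.
D_2013 = set(EXTERNAL_DENOTATION)

print(satisfies_2004(I, D_2004), satisfies_2013(I, D_2013))
```

In this sketch the 2004 check needs the extra mapping D_2004 only in order to compare the interpretation against it, whereas the 2013 check compares the interpretation directly with the externally fixed denotations.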
>>> Now, indeed, the old form in its generality allowed for
>>> defining datatype maps where an XSD IRI is mapped to some
>>> datatype which is not the corresponding XSD datatype.
>>> Neither do I see this as a problem (it's like as you can
>>> write non-terminating or "non-intuitive" programs
>>> in any programming language), nor can such "evil" stuff
>>> be excluded entirely, anyways.
>> 
>> I agree it is not a major problem in practice, but it is a flaw in the specifications. Your analogy with unintuitive programs is beside the point. Of course we cannot prevent users writing crazy RDF, but this was a craziness in the *specification*, not in user-written RDF.
>> 
>>> As I have pointed out in
>>> my original mail, you can associate any XSD URI with any
>>> other datatype easily as soon as you have equality in
>>> your entailment regime, i.e., owl:sameAs.
>> 
>> It is only an aside, but I think this is not correct. The RDF specs refer to the XSD specs, and the XSD specs state what it is that the XSD IRIs refer to. So, any owl:sameAs assertion which has the OWL consequence that such an IRI refers to something other than what XSD says it does, is inconsistent according to the OWL+RDF specs. An inconsistent assertion does not associate anything to anything.
>> 
>>> Of course,
>>> with the restriction given above, this would then
>>> easily lead to unsatisfiable RDF graphs, if the
>>> value spaces of the datatypes are disjoint
>>> (as it is the case for, e.g., xsd:string and xsd:integer).
>> 
>> My point above does not rely on the disjointness of the value spaces. XSD asserts that the datatypes themselves are pairwise distinct.
>> 
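The argument about owl:sameAs can be sketched as follows. This is a deliberately simplified illustration, not an implementation of OWL semantics: it assumes only that recognized XSD IRIs have fixed, pairwise distinct denotations (the hypothetical FIXED table), and treats a set of owl:sameAs pairs as inconsistent when it forces two such IRIs to co-denote.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: fixed, pairwise distinct denotations given by the XSD specification.
FIXED = {
    XSD + "string": "xsd-string-datatype",
    XSD + "integer": "xsd-integer-datatype",
}

def same_as_classes(pairs):
    """Group IRIs into equivalence classes induced by owl:sameAs pairs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), set()).add(x)
    return list(classes.values())

def consistent(pairs):
    """False if some class contains IRIs with different fixed denotations."""
    for cls in same_as_classes(pairs):
        denotations = {FIXED[iri] for iri in cls if iri in FIXED}
        if len(denotations) > 1:
            return False
    return True

# Equating two XSD datatype IRIs is inconsistent, whatever their value spaces.
print(consistent([(XSD + "string", XSD + "integer")]))
# Giving an XSD datatype a second name is harmless.
print(consistent([(XSD + "string", "http://example.org/myString")]))
```

Note that the check never inspects value spaces; it relies only on the datatypes themselves being distinct, which is the point made above.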
>>> But in any way there is no method to stop people from doing
>>> crazy stuff.
>> 
>> But we can have specs which say when they are being inconsistent, as we do when they write an ill-typed literal.
>> 
>>> And as I said, I don't consider this
>>> to be a problem at all, let people fool around if they like,
>>> I don't have to buy their stuff. And you have this option
>>> of fooling around in virtually any (non-trivial) technology,
>>> e.g. in programming languages, but no one ever complains.
>>> 
>>> But if the Working Group believes that it is still a good
>>> idea to include such a restriction on XSD datatype IRIs,
>>> then just add a sentence corresponding to the above one
>>> to the spec:
>>> 
>>>  "Given a datatype map D and a mapping <aaa,x> in D,
>>>  then if aaa is an XSD IRI, then x MUST be
>>>  the respective datatype (as defined in the XSD spec)."
>> 
>> Both the 2004 and 2013 documents already make this imposition. But look at how they do it. In the 2013 version, it says that RDF must conform to the XSD specifications of what the XSD datatype IRIs mean. This is simply a reinforcement of what all Web users expect, the standard way to determine meanings of IRIs on the Web. The 2004 version said that there is a new, special, kind of mapping (different from an interpretation map) which applies only to IRIs denoting datatypes; that this map must be defined, but its only purpose is that D-interpretations must agree with it; and in the XSD case, it must map each XSD IRI to the datatype that the XSD spec says it identifies. So there is a map which agrees with the XSD specs and the D-interpretation agrees with that map. Which is just a more complicated way of saying what the 2013 specs say, but in non-standard terminology and using a construct which is neither intuitive nor necessary. The complication introduced by the 'datatype map' serves no useful function.
>> 
>>> Then, you have the restriction that you want, without the
>>> need to change the original representation formalism for
>>> datatype semantics - the two things, datatype maps and
>>> restrictions on datatype IRIs, have really nothing to
>>> do with each other. Personally, I would be ok with this
>>> treatment (knowing well that I can still happily garble
>>> up the whole datatype semantics with owl:sameAs ;-)).
>>> 
>>>> Can’t specs that currently use datatype maps in their
>>>> formalism simply continue to do so? They just need to
>>>> state that the IRIs in the datatype map are considered
>>>> recognized datatype IRIs (to be technically compatible
>>>> with RDF 1.1 Semantics), and add a requirement that certain
>>>> datatype IRIs, if present in a datatype map, MUST be paired
>>>> with certain datatypes (to be compatible with 1.1 Concepts).
>>>> It’s not like RDF 1.1 is outlawing the datatype map construct.
>>>> It just doesn’t use it to define its own semantics.
>>> 
>>> Well, other specs may perhaps continue to use datatype maps,
>>> but then imagine what an embarrassment this would be for
>>> the RDF WG: by going on with the old datatype maps
>>> even in the next spec, the other spec's WG would clearly
>>> confirm that it prefers the old way over the new way,
>>> so the old way has always been good enough for that spec,
>>> while the new one is not good enough to switch to it.
>>> Another round of spec wars in the SW...
>> 
>> Oh, bullshit. I fully expect that future spec WGs will find the 2013 treatment of datatypes clearer and simpler than the 2004 treatment and will use it in preference. If they wish to go on using the term "datatype map" for a D-interpretation mapping restricted to datatype IRIs, the only embarrassment I will feel is a slight sense of regret that I was so lazy as to have introduced the clumsy term in 2004.
>> 
>>> But I think there may be a general misunderstanding here
>>> of what I am about primarily. I'm not in the first
>>> place interested in the question whether the particular
>>> proposed change is appropriate or not (although I consider
>>> it to be confusing, awkward, and a bad idea). What I am
>>> primarily about is that there is no need for any change
>>> whatsoever!
>> 
>> The hallmark of significance used by the WG for changes to RDF semantics has always been, does it change any entailments? And this does not. It is purely a change in the *style of description* of the semantics, rather than to the semantics itself. The actual interpretation structures it describes are mathematically identical.
>> 
>>> It appears to me that the WG has easily
>>> accepted such a need as a fact, but, as I have pointed
>>> out in my longish earlier mail, all evidence known to
>>> me goes right against such a need.
>>> 
>>> Datatype maps in their current form have been in use
>>> so long and so widely and examined with such intensity,
>>> including by myself, and without any indication for
>>> problems ever raised, that the only thing I can say
>>> about it is that the original datatype semantics in
>>> their precise form have to be considered robust
>>> technology by now.
>> 
>> They are not technology at all, robust or otherwise. They are a purely mathematical device used to describe D-interpretations. Any 2013 D-interpretation is a 2004-D-interpretation where the datatype map is defined by the interpretation map. Any 2004-D-interpretation [[which conforms to whatever external specifications define the meanings of the datatype IRIs in D]] is also a 2013-D-interpretation. Any 2004-D-interpretation which does *not* conform in [[this way]] is insane, should never have been allowed as a legal interpretation, violates basic Web assumptions about IRI identification, will be useless for interoperability and will probably have been prohibited in any case by the specification which defines the particular extension of RDF which uses those datatypes.
>> 
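The correspondence claimed above between 2004 and 2013 D-interpretations can be sketched in Python. This is an illustration under stated assumptions, not a definition from either specification: datatypes are modelled as strings, SPEC is a hypothetical stand-in for the external specification that fixes the meaning of each datatype IRI, and all function names are invented for the example.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumption: stands in for the external specification of the datatype IRIs.
SPEC = {
    XSD + "integer": "xsd-integer-datatype",
    XSD + "string": "xsd-string-datatype",
}

def datatype_map_of(interp, recognized):
    """Recover a 2004-style datatype map as I restricted to the recognized IRIs."""
    return {iri: interp[iri] for iri in recognized}

def conforms(datatype_map):
    """True when every pair in the map agrees with the external specification."""
    return all(SPEC.get(iri) == dt for iri, dt in datatype_map.items())

def is_2004_d_interpretation(interp, datatype_map):
    """2004: I(aaa) = x for every pair <aaa, x> in the datatype map."""
    return all(interp.get(iri) == dt for iri, dt in datatype_map.items())

def is_2013_d_interpretation(interp, recognized):
    """2013: I gives every recognized IRI the datatype it identifies."""
    return all(interp.get(iri) == SPEC.get(iri) for iri in recognized)

good = {**SPEC, "http://example.org/x": "some resource"}
recognized = set(SPEC)

# 2013 -> 2004: the datatype map is just the interpretation map restricted to D.
derived = datatype_map_of(good, recognized)
print(is_2004_d_interpretation(good, derived), conforms(derived))

# 2004 -> 2013 holds for conforming maps ...
print(is_2013_d_interpretation(good, set(derived)))

# ... but a pathological map is legal in the 2004 formalism and not in 2013.
bad_map = {XSD + "integer": "xsd-string-datatype"}
bad = dict(bad_map)
print(is_2004_d_interpretation(bad, bad_map), is_2013_d_interpretation(bad, set(bad_map)))
```

The last case is the only place where the two descriptions differ in this sketch: the 2004 check accepts an interpretation built on a non-conforming map, while the 2013 check rejects it.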
>>> The change, as also pointed out
>>> by Antoine Zimmermann in his original discussion of
>>> the topic, comes completely out of the blue now!
>>> 
>>> I mean, if there really was a problem, why hasn't
>>> this problem been brought up during the LCWD or CR
>>> phase of the SPARQL 1.1 spec, which was finalized just
>>> in Spring this year? Has there been any discussion
>>> between the two working groups on this change?
>> 
>> It is not a matter of sufficient importance to require such elaborate discussion. The problem which had been widely noted, as reported by several members of the WG, was that this section of the 2004 Semantics specification was particularly opaque and hard to follow, and that many readers were puzzled as to why datatype IRIs needed such a different and special treatment from other IRIs. (One confusion which was particularly common and problematic was between the datatype map and the L2V mapping of a particular datatype.)
>> 
>>> And if the problem was the issue with the "pathological
>>> mappings" for XSD datatypes
>> 
>> That was not the primary motivation, but it does serve as further motivation for the change.
>> 
>> Speaking as a Semantics editor, my chief motivation for this change, apart from its simplicity and clarity, is that it introduces into the RDF semantics some slight hint of the reality of how meanings are specified on the Web. A purely mathematical - model-theoretic - semantics cannot by itself recognize the reality of how IRI meanings are imposed by specifications and used by other specifications. I struggled with how to describe this reality in 2004, and failed, and the simplistic but clumsy idea of a 'datatype map' was invented to patch over the gap that was left.
>> 
>> I sincerely wish that we could incorporate a lot more of Web reality into the RDF formal semantics, but of course to do so now would ruffle so many traditionalist feathers that it is probably impossible.
>> 
>>> , then I have given a
>>> solution above, without the need for replacing the
>>> original notion of a datatype map. And again: the
>>> proposed change would by no means remove that problem,
>>> as I will always be able to fool around with datatypes,
>>> if only I have enough semantic power, as with owl:sameAs.
>>> 
>>> I say that the original datatype maps were perfectly
>>> ok, clear and simple enough, at least for me and
>>> for several other WGs, and for several textbook writers,
>>> and for several university lecturers, and so on.
>> 
>> But not for a large number of potential RDF users and developers. Not, in particular, for an entire community of 'linked data' enthusiasts who wish to use RDF in the wider world.
>> 
>>> The change does not solve any existing problem
>>> (including the one discussed above), so why should
>>> there be a change at all?
>> 
>> 
>> RDF is widely perceived as so complicated and arcane that it cannot be practically used; this negative reputation is so widespread that at least one new SW specification was deliberately drafted so as to not even mention RDF. There is a real issue here that we ignore at our peril, and it has been a constant background motivation for the WG discussions. Perhaps this debate you and I are having over this tiny simplification (I will not even honor it with the title of "change" since no interpretation structures or entailments are changed) can be put into some perspective by the observation that the WG did at one point seriously consider removing the normative semantics of RDF altogether, and may have done so if our charter had not prohibited it. Now that *would* have been a real change.
>> 
>> Pat
>> 
>>>  So no problems, so no need for
>>> a change, so no need to discuss the change by other WGs,
>>> so no danger of interoperability issues, spec wars,
>>> or whatever (and, on a personal note, no tons of mails
>>> in the following years by people wanting to know from
>>> /me/ how this new formalization relates to the good
>>> old datatype maps they were accustomed to :-]).
>>> 
>>> So, please, just don't do anything about the original
>>> definition of datatype maps, because there is absolutely
>>> no need to change anything, because there's nothing wrong
>>> with them - that's all I am wishing for Christmas! :-)
>>> 
>>>> Best,
>>>> Richard
>>> 
>>> Cheers,
>>> Michael
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 home
>> 40 South Alcaniz St.            (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile (preferred)
>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
> 

Received on Tuesday, 10 December 2013 03:13:25 UTC