Re: Resolution needed: ISSUE-165: datatype map

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 6 Jan 2014 09:40:20 +0000
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <E94E8F24-CB1D-41EE-BAD6-C09BA6A64ED4@cyganiak.de>
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>

Hi Antoine,

Happy new year. Hope you had a good vacation.

The response below has been sitting in my drafts folder. I wrote it last year, and wanted to look it over once more before sending. Then the holidays intervened. This conversation is now perhaps becoming a bit stale, given that we have a WG resolution. But since a formal objection was threatened over this issue, and since your last email contains quite a few assertions that I think are confused or mistaken, I feel it might be worth trying to get to the bottom of the disagreement. So, here we go. The full conversation is still attached after my email, in case you don’t remember the full context of the bits I quote.

On 18 Dec 2013, at 14:44, Antoine Zimmermann <antoine.zimmermann@emse.fr> wrote:
> You are referring to some kind of social agreement with a documented specification.


The idea of publishing specifications as a means of establishing conformance of implementations is the core tenet of every standardisation body, and shouldn’t be dismissed as “some kind of social agreement”.

> Pat is more specific, since he refers to "Web machinery". By saying Web machinery, it seems that it excludes datatype maps that would be decided at face-to-face meetings to implement the D-entailment in a company's information system.

Pat writes: “In practice, this can be achieved by…”

I don’t think this can be read as excluding other approaches.

> you mistakenly assume that the datatype to which IRIs map is unspecified

I don’t assume that. I know that RDF 2004 requires all the datatypes to be in the datatype map. My point was that *you* left the datatype unspecified in *your* example entailment regime, making it invalid/incomplete/broken/untestable in both RDF 2004 and RDF 1.1.

>> Implementations of D-Entailment in RDF 2004 cannot be tested for conformance, unless you fix D and define all the datatypes.
> 
> Wrong.

Not wrong.

If I have an implementation, let’s call it SuperRDF, and claim that it conforms to D-Entailment as defined in RDF 2004, how would you test that? Mind you, I have not specified what D is (I have not fixed D).

The answer is you can’t tell if SuperRDF conforms. You need to know D before you can test it. That’s the same in RDF 2004 and in RDF 1.1.

(In RDF 1.1 you *additionally* need to know the referents of the IRIs in D, while in RDF 2004 that information is “in” D. It’s still the same information that you need to know, just packaged using different machinery.)
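
To make this concrete, here is a minimal sketch in Python. It is not taken from either spec. The IRI `http://example.org/geometry`, the function `toy_geometry` and its "longitude wraps around at 360 degrees" rule are all invented for illustration, and comparing literal values is only one ingredient of D-entailment. The point is only that the check has no answer until someone supplies the datatype behind the IRI:

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical IRI standing in for ex:geometry.
EX_GEOMETRY = "http://example.org/geometry"

# A datatype is modelled here only by its lexical-to-value mapping.
LexicalToValue = Callable[[str], object]


def toy_geometry(lexical: str) -> Tuple[float, float]:
    """Invented spatial datatype: "(lon,lat)" with lon normalised to [-180, 180)."""
    lon_text, lat_text = lexical.strip("()").split(",")
    lon = (float(lon_text) + 180.0) % 360.0 - 180.0
    return (lon, float(lat_text))


def same_value(iri: str, lex1: str, lex2: str,
               referents: Dict[str, LexicalToValue]) -> Optional[bool]:
    """Do two literals typed with `iri` denote the same value?

    Returns None when no datatype has been supplied for `iri`,
    i.e. when the question cannot be answered at all.
    """
    datatype = referents.get(iri)
    if datatype is None:
        return None
    return datatype(lex1) == datatype(lex2)


# Only a name for the datatype is given: nothing can be tested.
print(same_value(EX_GEOMETRY, "(200,60)", "(-160,60.0)", {}))  # None

# The datatype itself is given: the check becomes decidable.
print(same_value(EX_GEOMETRY, "(200,60)", "(-160,60.0)",
                 {EX_GEOMETRY: toy_geometry}))  # True
```

Whether the `referents` information arrives as a 2004-style datatype map or as a specification stating what each recognised IRI denotes makes no difference to this check.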

> {ex:geometry}-entailment is not specified, so implementations have to declare they implement "{ex:geometry}-entailment with ex:geometry identifying spatial datatype".

I guess that typically it will not be *implementations* that declare this, but *specifications* will declare the referents, and implementations will claim conformance to the specification.

> But if this is implementation specific, it means that two implementations of {ex:geometry}-entailment may conform while having different results.

That’s confused. You conform *to a specification*. So talking of conformance only makes sense with respect to a particular specification. 

In the absence of a specification that states the referent of ex:geometry, {ex:geometry}-entailment is not fully defined and violates a MUST clause in RDF 1.1 Semantics. It is pointless to talk about {ex:geometry}-entailment as an example when there’s any doubt or ambiguity about the referent of the IRI.

> {(xsd:int,d1),(ex:geometry,d2)} […] In practice, you don't write the pairs in line as I do here. […] For the second pair, any ways of explaining what datatype is associated with ex:geometry is ok.

So you’re saying that you prefer the datatype map approach, because it puts the full definition of the entailment regime “inside” the D.

And you’re opposed to the datatype-IRI-based approach, because it doesn’t put everything “inside” the D, but requires additional outside information for a full definition of the entailment regime. Okay.

But then you go on to tell me that in practice, people can’t write everything “inside” the D anyway, and rely on additional outside information either way. I simply fail to see the practical difference. The mathematical machinery is a bit different, fine. But why does this matter? Surely there is nothing *wrong* with placing that association “outside” of the D, using the mechanism of denotation.

>> You can’t say whether an implementation conforms to that either.
> 
> You have both the IRI and the datatype it maps to, so of course you can.

This is just completely muddled.

This is how the conversation went.

You said: “Here’s an example entailment regime called {ex:geometry}-entailment. It’s impossible to test whether an implementation conforms!”

Me: “That’s not a valid D-entailment regime! RDF 1.1 Semantics says you MUST provide the datatype by declaring the IRI’s referent! So obviously you can’t test conformance to this, and you couldn’t either in RDF 2004!”

You: “But RDF 2004 says you MUST provide the datatype as part of the datatype map! So obviously you can test conformance!”

Me, now: “Yeah, but your example *didn’t* provide the datatype, so you *can’t* test conformance.”

The point is that both RDF 2004 and RDF 1.1 *require* you to provide the datatype. If you *do* it, you can test conformance with either. If you *don’t*, you’re in violation of both specs. The example that you gave to illustrate an alleged problem with RDF 1.1 *didn’t* provide the datatype, but just a name for it, ex:geometry. Thus, the example was broken, regardless of RDF version.

Only the mechanism by which you associate IRIs with datatypes differs between the two approaches. In the one case, you define a dedicated map. In the other case, you define a set of IRIs and call it D, and then define that those IRIs denote certain datatypes.
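
The two packagings can be put side by side in a small Python sketch. This is an illustration under stated assumptions, not text from either spec: the IRI `http://example.org/geometry`, the strings standing in for datatypes, and the helper `restriction` are all made up here. It follows the WG response quoted further down, which describes the datatype map as the restriction of a D-interpretation mapping to the set D of recognized datatype IRIs:

```python
from typing import Dict, FrozenSet

XSD_INT = "http://www.w3.org/2001/XMLSchema#int"
EX_GEOMETRY = "http://example.org/geometry"  # hypothetical IRI

# Plain strings stand in for the actual datatype definitions.
d1 = "the datatype 'int' defined by XSD"
d2 = "a spatial datatype defined by some external specification"

# RDF 2004 style: the association lives in a dedicated datatype map.
datatype_map: Dict[str, str] = {XSD_INT: d1, EX_GEOMETRY: d2}

# RDF 1.1 style: D is just a set of recognized IRIs...
recognized: FrozenSet[str] = frozenset({XSD_INT, EX_GEOMETRY})

# ...and the association is part of what the IRIs denote. A real
# interpretation covers many more IRIs than the ones in D.
interpretation: Dict[str, str] = {
    XSD_INT: d1,
    EX_GEOMETRY: d2,
    "http://example.org/s": "some other resource",
}


def restriction(interp: Dict[str, str], iris: FrozenSet[str]) -> Dict[str, str]:
    """Restrict an interpretation mapping to a given set of IRIs."""
    return {iri: interp[iri] for iri in iris}


# Restricting the interpretation to D recovers the 2004 datatype map.
assert restriction(interpretation, recognized) == datatype_map
```

Either way, the same pairs have to be supplied by someone before conformance can be tested.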

> The spec pretends to define the family of entailment regimes called D-entailment. But you are saying it does not in fact, it only defines a template for such entailment regimes,

In the absence of precise definitions, “family of entailment regimes” and “template for entailment regimes” sound pretty much the same to me. 

> where you have to fill in the mapping from IRIs in D to datatypes (that is to say, implementations have to provide the mapping anyway, which was what RDF 2004 already said, more explicitly).

Yes, they say the same thing. RDF 2004 perhaps says it more explicitly. RDF 1.1 says it by using the notion that IRIs identify datatypes, and thus gets away without needing *yet another* mechanism for associating IRIs and things.

On an architectural level, and considering all the confusion between “identifying” and “denoting”, the worst thing about the datatype map approach is that it has thrown in yet another notion—“being associated in a datatype map”—which is distinct from the other two, and unnecessary. It is true that a better time to address this problem would have been back in 2004. But it’s never too late to fix mistakes.

Best,
Richard



> The following is your interpretation of the spec. Although it is a sensible interpretation, and I hope everyone will be able to understand it like this, it is not what RDF 1.1 Semantics says.
> 
> But in any case, Semantics does not describe how an agreement on what IRIs denote is to be defined or provided. Concepts already tackles this subject a bit.
> 
> But I am not even certain that your interpretation is what Pat has in mind. You are referring to some kind of social agreement with a documented specification. Pat is more specific, since he refers to "Web machinery". By saying Web machinery, it seems that it excludes datatype maps that would be decided at face-to-face meetings to implement the D-entailment in a company's information system.
> 
> Because of the necessity to rely on "Web machinery" (assuming this interpretation of the spec), such a scenario would lead to a non-conforming implementation.  So in fact, the conformance is subject to interpretation of the spec.  This was not true in 2004: 2004-style D-entailment does not leave anything to interpretation of the spec (although you suspect it did, but you mistakenly assume that the datatype to which IRIs map is unspecified).
> 
> 
> Few comments below.
> 
> 
>> You probably think of D-Entailment as an entailment regime. An
>> entailment regime that is parameterised by a datatype map.
>> 
>> But D-Entailment is not really an entailment regime, but more a
>> template for entailment regimes. “Batteries not included”.
>> 
>> How do you turn this template into an actual entailment regime?
>> 
>> You write a specification. A very simple specification. You call it
>> something like My-Entailment. My-Entailment becomes an actual
>> entailment regime by defining it in a specification. The
>> specification states the following:
>> 
>> 1) That My-Entailment is a kind of D-Entailment. 2) What the set of
>> recognised datatype IRIs for My-Entailment is. 3) For every datatype
>> IRI in that set, you state the referent, or normatively reference a
>> spec that already does this.
>> 
>> An example:
>> 
>> [[ My-Entailment is a form of D-Entailment [RDF11-MT]. An
>> implementation of My-Entailment MUST recognise the datatype IRIs
>> xsd:string, xsd:decimal, xsd:boolean, rdf:langString, rdf:HTML, and
>> geo:GMLLiteral, as defined in [RDF11-CONCEPTS] and [GEOSPARQL]. ]]
>> 
>> Note that this entails the following sentence: “In My-Entailment, the
>> IRI geo:GMLLiteral MUST denote GeoSPARQL’s GMLLiteral datatype.” So,
>> if you like, you can read this document as placing a semantic
>> condition on My-interpretations:
>> 
>> I(geo:GMLLiteral) = the datatype defined in [GEOSPARQL]
>> 
>>> Is U a global mapping that all implementations must use?  Is it
>>> implementation specific?
>> 
>> U is not defined by RDF Semantics but MUST be defined by the specific
>> entailment regime:
>> 
>> [[ RDF processors may recognize other datatype IRIs, but when other
>> datatype IRIs are recognized, the mapping between the datatype IRI
>> and the datatype it refers to MUST be specified unambiguously, and
>> MUST be fixed during all RDF transformations or manipulations. ]]
>> 
>> So, it is not global. It is entailment regime specific.
>> 
>>> Is it defined on all IRIs? On a predefined subset of the IRIs?
>> 
>> That question is irrelevant for IRIs outside of D, and the
>> requirements for IRIs in D are clearly stated, see quote above.
>> 
>>> Does it depend on the set D?
>> 
>> No, it depends on the entailment regime.
>> 
>>> If it does depend on D, is it possible that for D1 =
>>> {http://ex.com/d1} and D2 = {http://ex.com/d1,http://ex.com/d2},
>>> the U associated with D1 differ from the U associated with D2 on
>>> the interpretation of http://ex.com/d1?
>> 
>> It doesn’t depend on D.
>> 
>>> If this U is global, then there are problems in determining
>>> conformance. For instance, if a datatype (with IRI ex:d) evolves
>>> from version v1 to v2, and an {ex:d}-entailment implementation uses
>>> v1 while another implementation uses v2, one of these two
>>> implementations is necessarily non-conformant. But which one of
>>> them?
>> 
>> It’s not global.
>> 
>>> In fact, can D-entailment implementations be tested for conformance
>>> at all?  Yes, if D only includes datatype IRIs that are W3C
>>> standards. No otherwise.
>> 
>> How is this different from RDF 2004? Implementations of D-Entailment
>> in RDF 2004 cannot be tested for conformance, unless you fix D and
>> define all the datatypes.
> 
> Wrong. For all datatype maps D, the fact that a graph D-entails another graph is fully specified by RDF 2004. If you pretend to implement D-entailment with a specific D, it is verifiable from RDF 1.0 Semantics whether your implementation conforms or not.
> 
> Of course, there is no single D-entailment regime, it is a family of entailment regimes parameterised by a datatype map. So what RDF 1.0 defines is something like {(xsd:int,d1),(ex:geometry,d2)}-entailment, where d1 is the datatype "int" from XSD and d2 is the datatype defined in the spatial extension of RDF and SPARQL. Since you have d1 and d2, all you have to do is check RDF Semantics to verify whether:
> 
> :s :p "(200,60)"^^ex:geometry .
> 
> {(xsd:int,d1),(ex:geometry,d2)}-entails:
> 
> :s :p "(-160,60.0)"^^ex:geometry .
> 
> In practice, you don't write the pairs in line as I do here. Because of the constraints we put in Concepts about XSD datatypes, it is not necessary to provide the first pair, the IRI suffices. For the second pair, any ways of explaining what datatype is associated with ex:geometry is ok. If you are a linked data enthusiast, you'd likely put a document that you can fetch when dereferencing ex:geometry.
> 
> Now, with a simple set, e.g., {ex:geometry}-entailment is not specified, so implementations have to declare they implement "{ex:geometry}-entailment with ex:geometry identifying spatial datatype". But if this is implementation specific, it means that two implementations of {ex:geometry}-entailment may conform while having different results. So it leads to the same situation where D-entailment must be specified in terms of a mapping rather than a set. But again, all of this is not clearly stated, and I'm not sure everyone has the same interpretation.
> 
> 
>>> For instance, I have an implementation of
>>> {http://ex.com/}-entailment. […] Is it a conforming
>>> {http://ex.com/}-entailment implementation?
>> 
>> 
>> Show me the definition of {http://ex.com/}-entailment, and I can tell
>> you. With the information you have given me, your entailment regime
>> is not fully specified and violates the spec.
> 
> The spec pretends to define the family of entailment regimes called D-entailment. But you are saying it does not in fact, it only defines a template for such entailment regimes, where you have to fill in the mapping from IRIs in D to datatypes (that is to say, implementations have to provide the mapping anyway, which was what RDF 2004 already said, more explicitly).
> 
> 
>> Again, RDF Semantics says:
>> 
>> [[ RDF processors may recognize other datatype IRIs, but when other
>> datatype IRIs are recognized, the mapping between the datatype IRI
>> and the datatype it refers to MUST be specified unambiguously, and
>> MUST be fixed during all RDF transformations or manipulations. ]]
>> 
>> So your question is like asking in RDF 2004 about conformance to
>> D-Entailment with a datatype map described as “consisting of the IRI
>> http://ex.com/ mapped to an unspecified datatype”.
> 
> In RDF 2004, the datatype is part of the map. The letter D is a place holder for a mapping, there is no one entailment regime called "D-entailment" (similarly to RDF 1.1, where D is a place holder for a set). It is thus fully specified in 2004-style D-entailment, not quite in RDF 1.1.
> 
> 
>> You can’t say
>> whether an implementation conforms to that either.
> 
> You have both the IRI and the datatype it maps to, so of course you can.
> 
> 
> AZ.
> 
>> 
>>> I do not note this, it is not editorial because it changes how
>>> conformance can or cannot be tested, as explained above.
>> 
>> I don’t think that’s true, see above.
>> 
>> Best, Richard
>> 
>> 
>> 
>> 
>>> On 17 Dec 2013, at 22:30, Antoine Zimmermann
>>> <antoine.zimmermann@emse.fr> wrote:
>>> 
>>> Pat,
>>> 
>>> 
>>> I'm sorry but I very strongly disagree with your response. It
>>> reinforces my decision to formally object.
>>> 
>>> 
>>> 
>>> Here are remarks that also relate to previous emails:
>>> 
>>> First, I do not think it is necessary to introduce "datatype maps"
>>> in Concepts. It is sufficient to say that implementation may
>>> recognise a set of IRIs to be denoting datatypes, in which case any
>>> literals typed with these IRIs must be interpreted according to
>>> their respective datatype (the current text pretty much says this,
>>> so it's ok).
>>> 
>>> But in the formalisation of this concept in model theory, some kind
>>> of mapping from a set of IRIs to datatypes must be introduced and
>>> used in the semantic conditions. It is quite logical to name this
>>> mapping "datatype map" in accordance with what was standardised in
>>> 2004, and to what is used in various other specifications.
>>> 
>>> 
>>> The text of RDF 1.1 Semantics CR clearly says that D-entailment
>>> works assuming that there is an association between some datatype
>>> IRIs and datatypes, which is another way of saying that there is a
>>> mapping from a set of IRIs to datatypes. Let us now name this
>>> mapping U, to avoid relying on the notion of datatype map.
>>> 
>>> The semantic conditions are expressing, in a different way, that a
>>> D-interpretation I is such that for any recognised datatype IRI x,
>>> I(x) = U(x). The problem is that, by meticulously avoiding
>>> introducing and using a name for this mapping U, it is not clear
>>> what is the scope of U and the precise definition of it.
>>> 
>>> Is U a global mapping that all implementations must use?  Is it
>>> implementation specific?  Is it defined on all IRIs?  On a
>>> predefined subset of the IRIs?  Does it depend on the set D?  If it
>>> does depend on D, is it possible that for D1 = {http://ex.com/d1}
>>> and D2 = {http://ex.com/d1,http://ex.com/d2}, the U associated with
>>> D1 differ from the U associated with D2 on the interpretation of
>>> http://ex.com/d1?
>>> 
>>> If this U is global, then there are problems in determining
>>> conformance. For instance, if a datatype (with IRI ex:d) evolves
>>> from version v1 to v2, and an {ex:d}-entailment implementation uses
>>> v1 while another implementation uses v2, one of these two
>>> implementations is necessarily non-conformant. But which one of
>>> them?
>>> 
>>> In fact, can D-entailment implementations be tested for conformance
>>> at all?  Yes, if D only includes datatype IRIs that are W3C
>>> standards. No otherwise.
>>> 
>>> For instance, I have an implementation of
>>> {http://ex.com/}-entailment. My implementation says the following:
>>> 
>>> { :s  :p  "abc"^^<http://ex.com/> }
>>> 
>>> entails:
>>> 
>>> { :s  :p  "123"^^<http://ex.com/> }
>>> 
>>> Is it a conforming {http://ex.com/}-entailment implementation?
>>> 
>>> 
>>> I find these issues to be way beyond editorial.
>>> 
>>> 
>>> Now some comments on your text below:
>>> 
>>> "the restriction of a D-interpretation mapping to the set D of
>>> recognized datatype IRIs"
>>> 
>>> This would be the definition of a datatype map if and only if a
>>> D-interpretation was constrained to have its recognised datatype
>>> IRIs denote the associated datatypes (that is, constrained to use a
>>> given mapping from datatype IRIs to datatypes, a.k.a. a datatype
>>> map).
>>> 
>>> There is a kind of circular argument: we say that the
>>> interpretation of datatype IRIs is constrained by a certain mapping
>>> that we do not name, then say that the restriction (which is in
>>> fact the unnamed mapping) is what was named "datatype map" before.
>>> 
>>> 
>>> "The newer style of description is more intuitive, less artificial,
>>> simpler (fewer semantic clauses, fewer new concepts introduced)"
>>> 
>>> That it is more intuitive, less artificial and simpler is
>>> subjective, and I disagree. But even if it were the case, it is
>>> incomplete (it does not have enough information to describe
>>> conformance, as I mentioned above). The fact that fewer semantic
>>> clauses are present is false.  I made explicit all the changes that
>>> should be made to RDF 1.1 Semantics in order to reintroduce
>>> datatype map. It results in exactly the same number of semantic
>>> clauses. To say that it introduces fewer new concepts is dishonest:
>>> you behave as if RDF semantics has been defined without datatype
>>> map and that Michael and I are trying to impose a new concept that
>>> would fundamentally change RDF. It is the opposite: D-entailment
>>> has been defined in terms of datatype map, as well as it is in
>>> other specifications, and you are trying to impose a change to the
>>> existing standard. By doing so, you even added a new concept,
>>> recognised datatype IRIs.
>>> 
>>> 
>>> "more directly related to concepts in wide use in other Web
>>> standards and literature, such as the 2004 Architecture of the
>>> Web"
>>> 
>>> I don't see how this is true but in any case, it may relate more to
>>> other Web standards, but it disconnects it from other existing Web
>>> standards like OWL 2 RDF-based semantics, SPARQL 1.1 Entailment
>>> regimes and RIF OWL/RDF compatibility (as well as tons of
>>> publications).
>>> 
>>> 
>>> "It also introduces the useful terminology of "recognition" of a
>>> datatype IRI"
>>> 
>>> Introducing this notion does not require any change at all to RDF.
>>> 
>>> 
>>> "We also note that the changes to which you objected are
>>> editorial"
>>> 
>>> Who are "we"?  I do not note this, it is not editorial because it
>>> changes how conformance can or cannot be tested, as explained above.
>>> 
>>> 
>>> 
>>> 
>>> AZ.
>>> 
>>> 
>>> On 17/12/2013 09:07, Pat Hayes wrote:
>>>> 
>>>> 
>>>> On Dec 16, 2013, at 12:13 PM, Guus Schreiber
>>>> <guus.schreiber@vu.nl> wrote:
>>>> 
>>>>> Peter, Pat,
>>>>> 
>>>>> For this week's telecon we need a proposed resolution to
>>>>> resolve ISSUE-165 (Datatype maps). I assume that the proposal
>>>>> will be to keep to the current state of affairs, stating
>>>>> briefly the rationale.
>>>> 
>>>> Yes, noting the textual modifications that have been made in
>>>> response to the comments.
>>>> 
>>>>> Could you propose text?
>>>> 
>>>> Below.
>>>> 
>>>>> 
>>>>> Thanks, Guus
>>>> 
>>>> --------------------------
>>>> 
>>>> Thank you for your comment concerning datatype maps, noted in
>>>> http://lists.w3.org/Archives/Public/public-rdf-comments/2013Oct/0067.html
>>>> which was recorded by the WG as ISSUE-165
>>>> (https://www.w3.org/2011/rdf-wg/track/issues/165).
>>>> 
>>>> You requested that we "bring back the old notion of a datatype
>>>> map." In subsequent correspondence, we explained that the idea is
>>>> in fact still present in the newer description, it being the
>>>> restriction of a D-interpretation mapping to the set D of
>>>> recognized datatype IRIs. Since your email suggested that this
>>>> was not as clear as we had intended, we have re-worded parts of
>>>> the relevant section 7 to give this as an explicit definition of
>>>> the 2004 concept of 'datatype map' and added a sentence to
>>>> clarify how other specifications and recommendations which refer
>>>> to and impose extra conditions on datatype maps, can be
>>>> interpreted as applying to the newer form of description. We also
>>>> added a sentence clarifying how external specifications of
>>>> datatypes can typically define both the type itself and the fixed
>>>> interpretation of its referring IRI, using the "datatype map"
>>>> language to help make the connection clear.
>>>> 
>>>> You also objected that there was no motivation for making the
>>>> change to the way that the semantics is described. Here we
>>>> disagree. The newer style of description is more intuitive, less
>>>> artificial, simpler (fewer semantic clauses, fewer new concepts
>>>> introduced), more uniform with the rest of the semantic
>>>> description (the mapping in question is simply a partial
>>>> interpretation mapping) and more directly related to concepts in
>>>> wide use in other Web standards and literature, such as the 2004
>>>> Architecture of the Web (http://www.w3.org/TR/webarch/) document.
>>>> It also introduces the useful terminology of "recognition" of a
>>>> datatype IRI, which is used throughout the document and also in
>>>> the Concepts document, and which we anticipate will be useful
>>>> more generally.
>>>> 
>>>> We also note that the changes to which you objected are editorial
>>>> and descriptive rather than substantive, since no semantic
>>>> structures are changed, and no entailments are changed.
>>>> 
>>>> Please check the wording changes referred to above in the latest
>>>> version of the Semantics document, section 7, and respond to
>>>> this list indicating whether this response resolves the issue
>>>> raised by your comment, including [RESOLVED] in the subject line
>>>> if it does resolve this to your satisfaction.
>>>> 
>>>> 
>>>> ------------------------------------------------------------
>>>> IHMC                                     (850)434 8903 home
>>>> 40 South Alcaniz St.                     (850)202 4416 office
>>>> Pensacola                                (850)202 4440 fax
>>>> FL 32502                                 (850)291 0667 mobile (preferred)
>>>> phayes@ihmc.us       http://www.ihmc.us/users/phayes
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> -- Antoine Zimmermann ISCOD / LSTI - Institut Henri Fayol École
>>> Nationale Supérieure des Mines de Saint-Étienne 158 cours Fauriel
>>> 42023 Saint-Étienne Cedex 2 France Tél:+33(0)4 77 42 66 03
>>> Fax:+33(0)4 77 42 66 66 http://zimmer.aprilfoolsreview.com/
>> 
>> 
> 
> 
> -- 
> Antoine Zimmermann
> ISCOD / LSTI - Institut Henri Fayol
> École Nationale Supérieure des Mines de Saint-Étienne
> 158 cours Fauriel
> 42023 Saint-Étienne Cedex 2
> France
> Tél:+33(0)4 77 42 66 03
> Fax:+33(0)4 77 42 66 66
> http://zimmer.aprilfoolsreview.com/
> 
Received on Monday, 6 January 2014 09:40:43 UTC
