Re: comments on Antoine's draft from Pat Hayes on 2013-12-14 (public-rdf-wg@w3.org from December 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Sat, 14 Dec 2013 00:47:31 -0800
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: RDF WG <public-rdf-wg@w3.org>
Message-Id: <F0E865A7-E325-4B74-BECD-19CE9028AE1C@ihmc.us>

On Dec 13, 2013, at 1:24 AM, Antoine Zimmermann <antoine.zimmermann@emse.fr> wrote:

> Some comments below, deleting pieces that are not relevant.

Ditto.

> 
>>>> 3.2 I think this is misleading. We have formally decided that
>>>> datasets are single bnode scopes, so to treat bnodes in two
>>>> named graphs as distinct, ie to merge their graphs rather than
>>>> take their union, is just wrong.
>>> 
>>> This document precisely avoids saying that such and such choices
>>> are wrong, which would lead people to think that there are
>>> legitimate and illegitimate dataset semantics.  We have not gotten
>>> to this level of requirements for dataset semantics.  It seems to
>>> me pretty straightforward to say that a dataset is true in an
>>> interpretation if all the graphs in it are true in that
>>> interpretation.  This corresponds to applying a merge operation.
>> 
>> It is a merge only if the graphs share no bnodes. But we have
>> decided, formally, that bnodes in graphs in a dataset are shared, ie
>> their scope is the dataset rather than the local graph. So take the
>> example
>> 
>> { } :1 { :a :p _:x } :2 { :b :q _:x }
>> 
>> and the merge
>> 
>> :a :p _:x :b :q _:y
>> 
>> There are interpretations which satisfy the merge but do not make all
>> the graphs in the dataset true, so if dataset truth means the truth
>> of all the graphs in it, then the merge does not entail the dataset.
>> But the union will always be equivalent to the truth of all the
>> graphs.
> 
> What?!  All interpretations that satisfy the merge obviously satisfy the two graphs!  

Not if they share a bnode. Consider the interpretation I with universe {A B P Q} and IEXT(P)={<A, P>}, IEXT(Q)={<B, Q>}. This satisfies the merge, but it does not satisfy all the graphs in the dataset. I(:1) is true only when _:x is mapped to P, but I(:2) is true only when _:x is mapped to Q. So there is no assignment of values to the bnode _:x which makes both graphs true. Of course if we consider each graph in isolation, then it is true in that interpretation, but that ignores the fact that the bnode is not local to the graph, but is shared between both graphs. 
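
The counterexample above can be checked mechanically. The following is a minimal sketch, not a full RDF semantics: it brute-forces blank node assignments over the four-element universe and tests simple satisfaction. The IRI mapping (:a to A, :b to B, :p to P, :q to Q) is an assumption that the message implies but does not state, and the string encoding of terms is purely illustrative.

```python
from itertools import product

# Interpretation I from the message. The IRI mapping IS is an assumption.
UNIVERSE = ["A", "B", "P", "Q"]
IS = {":a": "A", ":b": "B", ":p": "P", ":q": "Q"}
IEXT = {"P": {("A", "P")}, "Q": {("B", "Q")}}


def bnodes(triples):
    """Collect the blank node identifiers (terms starting with '_:')."""
    return sorted({t for triple in triples for t in triple if t.startswith("_:")})


def satisfiable(triples):
    """True if ONE assignment of all blank nodes makes every triple true in I."""
    names = bnodes(triples)
    for values in product(UNIVERSE, repeat=len(names)):
        assignment = dict(zip(names, values))

        def denote(term):
            return assignment[term] if term in assignment else IS[term]

        if all((denote(s), denote(o)) in IEXT.get(denote(p), set())
               for s, p, o in triples):
            return True
    return False


g1 = [(":a", ":p", "_:x")]                       # graph named :1
g2 = [(":b", ":q", "_:x")]                       # graph named :2
merge = [(":a", ":p", "_:x"), (":b", ":q", "_:y")]  # bnodes standardized apart
union = g1 + g2                                  # _:x shared across both graphs

print(satisfiable(merge))                   # True
print(satisfiable(g1), satisfiable(g2))     # True True
print(satisfiable(union))                   # False
```

Each graph taken in isolation is satisfiable, and so is the merge, but no single value for _:x satisfies both graphs at once, which is the reading in which the bnode scope is the whole dataset.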

> You even wrote the proof yourself in RDF 2004

That assumed that they did not share any bnodes, but we have explicitly ruled that bnodes in datasets are shared between the graphs. If this has any meaning at all, it requires us to treat bnode mappings in interpretations as applying across all the graphs simultaneously. 

> (http://www.w3.org/TR/rdf-mt/#mergelemprf).
> 
> Should I really show the proof to you here?

No need :-)

> 
> 
> > [...]
> 
>>>> Also, "as a way to refer to the RDF graph" // "as a way to
>>>> identify..." , since Semantics draws this distinction carefully.
>>> 
>>> The text beginning each presentation of a distinct formal semantics
>>> is meant to be informal and intuitive. I'll try to be rigorous as
>>> much as I can, but the use of common words, with their ambiguity,
>>> may be legitimate in such a style of presentation.  I am open to
>>> suggestion, though.  In this case, even if Semantics makes the
>>> distinction, I don't see why it should not be "identify" here.
>> 
>> Well, if some other natural word can be found, it would be better to
>> avoid a terminology clash with part of the existing normative
>> documents.
> 
> But isn't "identify" precisely the right term here?

Sorry, yes of course it is. I got muddled there for a moment.

> 
> 
>> [...]
> 
>>>> " the presence of blank nodes as graph names can be problematic
>>>> because a named graph entails an infinity of other named graphs
>>>> where only the graph name is changed to a different blank node."
>>>> I disagree. If there are n graph names, then there are at most
>>>> 2^n distinct bnode generalizations. Just changing the bnodeID
>>>> does not change a graph into a different graph. And in any case,
>>>> the situation in datasets is no worse than in RDF graphs, so I
>>>> think this is a non-issue.
>>> 
>>> This may be a non-issue, but here I'm not talking about bnode ID.
>>> In this dataset semantics, any blank node used as a graph name can
>>> be replaced by another unused blank node.
>> 
>> But it has to be replaced consistently throughout the dataset, or
>> else it is a different dataset. Right?
> 
> Yes.
> 
>>> There is an infinite number of blank nodes from which to
>>> choose.
>> 
>> Bnodes are not distinct 'things', they are just 'places' in a graph
>> (or in this case, a dataset.) That is why we treat graph-equivalent
>> graphs (ie 1:1 substitution of bnodes) as identical. Concepts defines
>> a similar equivalence for datasets.
>> 
>>> This may also lead to having blank nodes used inside named graphs
>>> become the same as or different from bnodes used as graph names.
>>> E.g.,
>>> 
>>> _:b { _:b  dc:created  "2013-12-10"^^xsd:date } _:d { ex:a  ex:b
>>> ex:c }
>>> 
>>> is equivalent (according to this particular semantics) to:
>>> 
>>> _:c { _:b  dc:created  "2013-12-10"^^xsd:date } _:b { ex:a  ex:b
>>> ex:c }
>> 
>> Unless this is a typo, I don't follow. How can you replace the bnode
>> _:b by _:c when it is used as a label but not when it is used inside
>> the graphs?
> 
> It's not a typo. What I say is that the two datasets are equivalent according to this semantics.
> 
>> That should not be permissible in *any* semantics.
> 
> From your reaction to this, it seems that it *is* an issue.
> 
>> Did you mean this?
>> 
>> _:c { _:c dc:created "2013-12-10"^^xsd:date } _:b { ex:a  ex:b  ex:c
>> }
>> 
>> This is equivalent to your first example.
> 
> Strictly speaking, the graphs inside the named graph pairs are not equal. With just a tad bit of abuse, we can say that isomorphic graphs are equal, in which case yes, it is equivalent to my first example. But my second example is also equivalent in this case.

I meant graph-equivalent not logically equivalent. But I now see your point. Indeed, I think this is such a counterintuitive property of this semantics that it should be called out and remarked on a bit more explicitly.
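
The graph-equivalence reading (a consistent 1:1 substitution of bnodes across the whole dataset) can be sketched as a brute-force check. This is only an illustration under assumptions: datasets are encoded as a hypothetical dict from graph name to a set of triples, the default graph is omitted, and the code tests the syntactic notion only. It does not model logical equivalence under the semantics being discussed.

```python
from itertools import permutations


def is_bnode(term):
    return term.startswith("_:")


def dataset_bnodes(dataset):
    """All blank nodes used as graph names or inside the graphs."""
    found = set()
    for name, triples in dataset.items():
        if is_bnode(name):
            found.add(name)
        for triple in triples:
            found.update(t for t in triple if is_bnode(t))
    return sorted(found)


def rename(dataset, mapping):
    """Apply one bnode mapping consistently to names and triples alike."""
    return {
        mapping.get(name, name): frozenset(
            tuple(mapping.get(t, t) for t in triple) for triple in triples
        )
        for name, triples in dataset.items()
    }


def isomorphic(d1, d2):
    """True if some bijection between the bnodes maps d1 exactly onto d2."""
    b1, b2 = dataset_bnodes(d1), dataset_bnodes(d2)
    if len(b1) != len(b2):
        return False
    target = {name: frozenset(triples) for name, triples in d2.items()}
    return any(rename(d1, dict(zip(b1, perm))) == target
               for perm in permutations(b2))


DATE = '"2013-12-10"^^xsd:date'
ABC = ("ex:a", "ex:b", "ex:c")
first = {"_:b": {("_:b", "dc:created", DATE)}, "_:d": {ABC}}
second = {"_:c": {("_:b", "dc:created", DATE)}, "_:b": {ABC}}      # Antoine's variant
consistent = {"_:c": {("_:c", "dc:created", DATE)}, "_:b": {ABC}}  # Pat's variant

print(isomorphic(first, consistent))  # True
print(isomorphic(first, second))      # False
```

The first dataset and the consistently renamed one are isomorphic, while Antoine's second example is not. That the semantics nonetheless makes it logically equivalent is the counterintuitive property at issue.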

>>> But after all, as you say, this is not relevant as a drawback.
>>> 
>>> 
>>>> "Therefore, any entailment regime that recognizes datatypes and
>>>> use this semantics has to be able to ..."  Why "that recognizes
>>>> datatypes"? Any entailment regime that extends this semantics has
>>>> to 'know about' graphs and their identity conditions. It is
>>>> *like* typed literals, but it's not actually a new datatype. (It
>>>> could be, of course, and then we would have graph literals.)
>>>> 
>>>> 3.4 Second sentence: "From the truth of these triples, it is
>>>> possible to infer knowledge that it is convenient to make part of
>>>> the named graph." ?? Do you mean to say that graphs must be
>>>> deductively closed? Surely not, but then what does this mean?
>>> 
>>> The formulation was clumsy. I reformulated to: "From the truth of
>>> these triples according to the graph semantics, follows the truth
>>> of the named graph pair."
>> 
>> I think the key point is that it would be valid to add valid
>> entailments to any named graph, in this semantics. Whereas *any
>> change at all* to a named graph would be invalid according to the
>> naming semantics. This is a very sharp and vivid way to distinguish
>> them.
> 
> With this comment, do you imply that this distinction should be stressed more?

Well, it is up to you, but I can say that a lot of the document became clearer for me when I thought of it in this way. 

>>>> ".. one wants to allow different view points to be expressed and
>>>> reasoned with, without creating a conflict or inconsistency." I
>>>> don't like this. Technically, this is different from the time
>>>> and provenance cases, and it appeals to a different logic. The
>>>> latter are like having an extra parameter (what you describe as
>>>> the quad case, later) but the usage where separate graphs are
>>>> used to insulate against contradictions is different, because no
>>>> extra parameter is implied. But maybe this is getting too subtle
>>>> :-)
>>> 
>>> There are cases where you want to isolate the content of an RDF
>>> document and draw conclusion from this content only. The content as
>>> well as the conclusions are attached to a graph name to keep track
>>> where the conclusion comes from. You may want to do this
>>> independently from the graph being consistent or not, and
>>> independently from the graph being in contradiction with another
>>> graph in the store.
>>> 
>>> I haven't made a change for the moment, since you say it may be too
>>> subtle.
>> 
>> It probably is. There are several journal papers waiting to be
>> written about this stuff, and that would be the place to go into such
>> matters.
> 
> And there are several journal and conference papers already written on this matter.

Do you refer to them all? If not, why not add the references?

> 
>> [...]
> 
>>>> 3.5 It's not exactly clear how this differs from the 3.4 case when
>>>> the 'context' is, for example, times. [ <a b c> true in d ], and
>>>> [ <a b c d> true ], are pretty interchangeable. But again, this
>>>> is perhaps too picky.
>>> 
>>> <a b c> true in d may not mean the same as <a b c d> true unless it
>>> is specified like this.
>> 
>> Yes, of course. What I meant to say is that one can impose isomorphic
>> truth conditions on either syntactic pattern. They are functionally
>> interchangeable, so to speak. So there is no important distinction in
>> kind between triples-plus-contexts and quads; they are simply
>> syntactic variants.
> 
> Yes, but the quad semantics presented there modifies the structure of graph interpretations with ternary relations, while the other semantics simply reuse the graph interpretation structure as is. So, in order to make the quad semantics isomorphic to the dataset semantics, you need to rewrite the semantic conditions using the ternary relations, for each semantic extension (simple-quad semantics, RDF-quad semantics, RDFS-quad semantics, etc).

Well, I think adding some kind of context is not really re-using the interpretations "as is" since the truth conditions change. But OK, let's not argue about it, I just wanted to suggest that there might be more in common, which maybe could simplify the options a little. But I don't want to push for a change unless you want to make one. 
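
The sense in which triples-plus-contexts and quads are syntactic variants can be illustrated with a small round-trip. This is a sketch under assumptions: the dict and tuple encodings and the function names are hypothetical, and the code only regroups syntax. It says nothing about whether the two forms are given the same truth conditions, which is the point Antoine raises.

```python
def to_quads(named_graphs):
    """Flatten {context: {(s, p, o), ...}} into a set of (s, p, o, context)."""
    return {(s, p, o, name)
            for name, triples in named_graphs.items()
            for (s, p, o) in triples}


def to_named_graphs(quads):
    """Regroup (s, p, o, context) quads by their fourth element."""
    grouped = {}
    for s, p, o, name in quads:
        grouped.setdefault(name, set()).add((s, p, o))
    return grouped


contexts = {
    "d": {("a", "b", "c")},
    "e": {("a", "b", "c"), ("x", "y", "z")},
}
quads = to_quads(contexts)

print(("a", "b", "c", "d") in quads)        # True
print(to_named_graphs(quads) == contexts)   # True
```

One caveat of this encoding: an empty named graph produces no quads, so it does not survive the round-trip.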

Pat

------------------------------------------------------------
IHMC                                     (850)434 8903 home
40 South Alcaniz St.            (850)202 4416   office
Pensacola                            (850)202 4440   fax
FL 32502                              (850)291 0667   mobile (preferred)
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Saturday, 14 December 2013 08:48:04 UTC