Re: weakness of embedded triples from Pierre-Antoine Champin on 2020-10-22 (public-rdf-star@w3.org from October 2020)

From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Date: Thu, 22 Oct 2020 17:06:09 +0200
To: thomas lörtsch <tl@rat.io>
Cc: public-rdf-star@w3.org
Message-ID: <11976d2a-34b3-223b-0c91-729be97e832d@ercim.eu>
Dear Thomas,

my point is not to dismiss the fact that we *could* handle embedded
triples as some kind of complex IRIs

On 22/10/2020 00:51, thomas lörtsch wrote:

>> (...)
>> So yes, I believe we need a new semantics, because embedded triples have
>> an internal structure that we need to take into account.
> That is another discussion
Well, that was the original discussion :)
> (your arguments above relate to technical issues only)
I may have been carried away...
>  and I’d like to have that. However I seem to have a fundamental problem understanding the semantic problem (nobody is surprised here). Taking a statement that is asserted and annotated in RDF*, in a currently en vogue syntax:
>
>    :a  :b  :c  [[ :d  :e ]] .
>
> What internal structure are you referring to that has to be taken into account? The statement is stated, and annotated, and IIUC that’s about it. Ensuring that the annotation refers to this specific statement is a syntactic problem and a statement identifier composed of subject, predicate and object and encoded in an IRI is a syntactic solution to that problem. Paraphrasing your example above:
>
>    :a  :b  :c .
>    :triple:a:b:c  :d  :e .
>
> What "internal structure" has changed here? In what ways could this syntax convey a different meaning than the one above?

Again, that is fine when no blank nodes are involved. But if you replace
:c with _:x, you would get something like:

    :a :b _:x.
    :triple:a:b:_:x :d :e.

Then, RDF semantics says you can replace _:x with _:y without changing
the meaning of the graph. This renaming only impacts the first triple;
from the point of view of standard RDF semantics, the second triple
contains 3 IRIs, so it is kept unchanged by the renaming. So under
standard RDF semantics, we can replace the graph above with:

    :a :b _:y.
    :triple:a:b:_:x :d :e.

and we have lost the connection between the asserted triple and the
annotated triple. That's why we need to have an extended semantics for RDF*.

(more precisely: that's why the trick of encoding embedded triples into
IRIs does not work. There might be a smarter encoding of RDF* into RDF,
which would allow us to rely on the standard semantics, but I seriously
doubt it)
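
To make the renaming argument concrete, here is a minimal sketch in Python. It is only an illustration under assumptions of my own: triples are modelled as plain tuples of strings, blank nodes are the strings starting with "_:", and encode_triple_as_iri and rename_bnode are hypothetical helpers, not part of any RDF library or of the RDF* proposal.

```python
# Hypothetical model: a graph is a set of (s, p, o) tuples of strings.
# Terms starting with "_:" are blank nodes; every other term is treated
# as an opaque IRI, as standard RDF semantics would treat it.

def encode_triple_as_iri(s, p, o):
    """Naive (assumed) encoding of a triple into one IRI-like string."""
    return ":triple:" + ":".join(term.lstrip(":") for term in (s, p, o))

def rename_bnode(graph, old, new):
    """Rename a blank node. Only terms that ARE the blank node change;
    an IRI that merely contains the label in its text is left alone."""
    return {
        tuple(new if term == old else term for term in triple)
        for triple in graph
    }

asserted = (":a", ":b", "_:x")
annotation = (encode_triple_as_iri(*asserted), ":d", ":e")
graph = {asserted, annotation}

renamed = rename_bnode(graph, "_:x", "_:y")

# After renaming, the annotation we would need in order to stay
# connected to the asserted triple has a different subject IRI.
needed = (encode_triple_as_iri(":a", ":b", "_:y"), ":d", ":e")
link_intact = needed in renamed

for triple in sorted(renamed):
    print(" ".join(triple), ".")
print("annotation still matches asserted triple:", link_intact)
```

The renamed graph still holds the annotation on :triple:a:b:_:x, while the asserted triple now uses _:y, so nothing in the graph ties the two together any more.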

> Thomas
>
>
> [0] Aidan Hogan, 2017, Canonical Forms for Isomorphic and Equivalent RDF Graphs: Algorithms for Leaning and Labelling Blank Nodes
>
>
>>   best
>>
>>> Thomas
>>>
>>>
>>> [0] https://www.w3.org/TR/rdf11-concepts/
>>>
>>>> Also, note that the semantics' goal is not to prescribe a particular implementation method; it is to ensure that different implementations remain interoperable.
>>>>  pa
>>>>
>>>>> On Mon, Oct 19, 2020 at 11:07 AM Pavel Klinov
>>>>> <pavel@stardog.com>
>>>>> wrote:
>>>>>
>>>>>> Yeah right. We have a mechanism in place to avoid using the same Skolem constant for bnodes with the same lexical form occurring in multiple RDF datasets (eg. when loading multiple files) but that's pretty much it. IIRC it's called something like "standardising apart" in one of the RDF docs.
>>>>>>
>>>>>> Cheers,
>>>>>> Pavel
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 19, 2020 at 10:54 AM Pierre-Antoine Champin
>>>>>> <pierre-antoine.champin@ercim.eu>
>>>>>> wrote:
>>>>>>
>>>>>>> Dear all,
>>>>>>>
>>>>>>> Holger, Pavel: I assume that blank nodes are internally skolemized, so indeed, internally, you only have IRIs and literals. Correct?
>>>>>>>
>>>>>>> On 19/10/2020 10:28, Holger Knublauch wrote:
>>>>>>>
>>>>>>> Similar situation here at TopQuadrant, see
>>>>>>>
>>>>>>>
>>>>>>> http://datashapes.org/reification.html#uriReification
>>>>>>>
>>>>>>>
>>>>>>> Holger
>>>>>>>
>>>>>>>
>>>>>>> On 10/19/2020 6:24 PM, Pavel Klinov wrote:
>>>>>>>
>>>>>>> This is roughly how Stardog supports RDF* and so far we find it sufficient in the enterprise context. It's pretty easily understood by users familiar with edge properties in the property graph data model, which is one of the most important factors for us.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Pavel
>>>>>>>
>>>>>>> On Sat, Oct 17, 2020 at 9:54 PM Martynas Jusevičius
>>>>>>> <martynas@atomgraph.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Does RDF* need new semantics at all? Couldn't it be a syntax-level
>>>>>>>> convention for unique triple IDs?
>>>>>>>>
>>>>>>>> E.g. <<s>, <p>, <o>> being syntactic sugar for
>>>>>>>> uri(concat("urn:rdf:id:", hash(str(<s>)), hash(str(<p>)),
>>>>>>>> hash(str(<o>)))).
>>>>>>>>
>>>>>>>> For example, the triple
>>>>>>>>
>>>>>>>> <
>>>>>>>> <https://www.w3.org/People/Berners-Lee/card>
>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>
>>>>>>>> >
>>>>>>>> gives
>>>>>>>>
>>>>>>>> URI(CONCAT("urn:rdf:id:",
>>>>>>>> SHA1(STR(
>>>>>>>> <https://www.w3.org/People/Berners-Lee/card>
>>>>>>>> )),
>>>>>>>> SHA1(STR(
>>>>>>>> <http://xmlns.com/foaf/0.1/primaryTopic>
>>>>>>>> )),
>>>>>>>> SHA1(STR(
>>>>>>>> <https://www.w3.org/People/Berners-Lee/card#i>
>>>>>>>> ))))
>>>>>>>>
>>>>>>>> gives
>>>>>>>>
>>>>>>>> <urn:rdf:id:63874e34ff5f326e67e888f6818f72d5033ecb343cadd8c2120281d72cefce4481485c937b6a95a656beaa67c13db29f3d7be801328b7c9125976c5f>
>>>>>>>>
>>>>>>>> which essentially would become the "5th element", in addition to quads.
>>>>>>>>
>>>>>>>> On Thu, Oct 15, 2020 at 1:38 PM Pierre-Antoine Champin
>>>>>>>>
>>>>>>>> <pierre-antoine.champin@ercim.eu>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On 14/10/2020 23:13, Peter F. Patel-Schneider wrote:
>>>>>>>>>
>>>>>>>>> Let's make the height example even more stark.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6.0"^^xsd:decimal >> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> does not imply
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6.00"^^xsd:decimal >> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I would hope that any Tom, Dick, and Lois can realize that these two literals
>>>>>>>>> are the same.
>>>>>>>>>
>>>>>>>>> I see your point, but this is really a matter of deciding where you put the boundary...
>>>>>>>>>
>>>>>>>>> So I would still prefer to be radical here and consider any lexical difference as potentially significant.
>>>>>>>>>
>>>>>>>>> If you want to stick to literals that have to be supported in RDF
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-US >> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> does not imply
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :name "Clark"@en-us >> .
>>>>>>>>>
>>>>>>>>> Are "Clark"@en-US and "Clark"@en-us really different literals, for the abstract syntax??
>>>>>>>>>
>>>>>>>>> I would have thought they are the same (and so the implication above would hold).
>>>>>>>>>
>>>>>>>>> Reading the spec again, I realize that things are not so clear: "Lexical representations of language tags MAY be converted to lower case", and then Literal term equality requires that language tags "compare equal, character by character". So these 2 literals MAY be considered equal, and the implication MAY hold... :-/ Add to this that BCP47 explicitly states that language tags are case insensitive... I'd say that we are in a gray area here.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> peter
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 10/14/20 4:45 PM, Doerthe Arndt wrote:
>>>>>>>>>
>>>>>>>>> Dear Peter,
>>>>>>>>>
>>>>>>>>> you are right with both observations. The question is whether we want that
>>>>>>>>> behavior or not.
>>>>>>>>>
>>>>>>>>> In
>>>>>>>>> https://w3c.github.io/rdf-star/
>>>>>>>>> there is a section on referential opacity.
>>>>>>>>> The main claim there is that triples are referentially opaque.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> But embedded triples are much weaker than just being referentially opaque.  To
>>>>>>>>> see this consider the following RDF* graph under the RDF* version of RDF
>>>>>>>>> entailment recognizing xsd:decimal and xsd:integer.
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
>>>>>>>>>
>>>>>>>>> In this semantics "6"^^xsd:decimal means the same as "6"^^xsd:integer so one
>>>>>>>>> would expect that
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >> .
>>>>>>>>>
>>>>>>>>> is RDF*-entailed.
>>>>>>>>>
>>>>>>>>> But it is not.  There are two reasons for this.
>>>>>>>>>
>>>>>>>>> First, there is no requirement that satisfying interpretations for the first
>>>>>>>>> graph map < :clarkKent :height "6"^^xsd:integer > to anything and if a
>>>>>>>>> satisfying interpretation does map the triple there is no requirement that its
>>>>>>>>> ITEXT mapping gives the triple its correct meaning.  (The value of ITEXT for
>>>>>>>>> the triple could have the real number pi as its third element.)
>>>>>>>>>
>>>>>>>>> Second, "6"^^xsd:integer is a different node from "6"^^xsd:decimal. So even if
>>>>>>>>> the interpretation treats the second embedded triple nicely, and thus gives it
>>>>>>>>> the same meaning as the first embedded triple, they are still two different
>>>>>>>>> triples and :loisLane can believe one but not the other.  So very little of
>>>>>>>>> the semantics of RDF gets into embedded triples.
>>>>>>>>>
>>>>>>>>> We wanted different representations to be treated differently even
>>>>>>>>> if they have the same meaning. The reason for that is that we expected that
>>>>>>>>> RDF* would also be used to make statements about triples as they were
>>>>>>>>> stated, for example to be able to explain the reasoning performed on the
>>>>>>>>> triples but also for simple provenance. In these cases there should be a
>>>>>>>>> difference between
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:decimal >> .
>>>>>>>>>
>>>>>>>>> and
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :height "6"^^xsd:integer >> .
>>>>>>>>>
>>>>>>>>> since we still talk about different representations.
>>>>>>>>>
>>>>>>>>> Each triple is, in effect, its own context.  So, in an RDFS version of RDF*,
>>>>>>>>> even if :loisLane believes several triples that should imply another, they
>>>>>>>>> generally don't.  For example:
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :man >> .
>>>>>>>>> :loisLane :believes << :man rdfs:subClassOf :human >> .
>>>>>>>>>
>>>>>>>>> Does not imply
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent rdf:type :human >> .
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So embedded triples are incredibly weak in RDF*.   Making them useful will
>>>>>>>>> likely require quite a bit of work.
>>>>>>>>>
>>>>>>>>> Here, "useful" depends again on your intended use. We wanted to have a
>>>>>>>>> rather weak semantics which allows users with more complex use cases to add
>>>>>>>>> their semantics. It is easier to make the semantics more complex by adding
>>>>>>>>> extensions than to ignore certain parts. I for example remember that Jos De
>>>>>>>>> Roo announced some time ago that his EYE reasoner supports rules on RDF*. Of
>>>>>>>>> course that alone would not allow you to cover all cases, but it could be
>>>>>>>>> very helpful in practice.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On the other hand, there are some unusual inferences that can be made in
>>>>>>>>> RDF*.  In an RDF* version of RDFS++ it is possible to state that two triples
>>>>>>>>> are the same.   The graph
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :superman :can :fly >>.
>>>>>>>>> << :superman :can :fly >> owl:sameAs << :clarkKent :can :fly >> .
>>>>>>>>>
>>>>>>>>> is consistent here and implies
>>>>>>>>>
>>>>>>>>> :superman owl:sameAs :clarkKent .
>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
>>>>>>>>>
>>>>>>>>> This last case is an interesting one. We indeed wanted the triple
>>>>>>>>>
>>>>>>>>> :loisLane :believes << :clarkKent :can :fly >>.
>>>>>>>>>
>>>>>>>>> to be a consequence of your statements. The question is whether
>>>>>>>>>
>>>>>>>>> :superman owl:sameAs :clarkKent .
>>>>>>>>>
>>>>>>>>> should follow (it does indeed follow, just as you describe). We made the
>>>>>>>>> semantics of embedded triples the way it is to be able to deal with blank
>>>>>>>>> nodes. Here, I can't give a concrete answer whether (at least to my
>>>>>>>>> understanding) it should be that way. I will think about it (and read
>>>>>>>>> Pierre-Antoine's thoughts in the mean time, which just arrived as well) and
>>>>>>>>> come back to you.
>>>>>>>>>
>>>>>>>>> Kind regards,
>>>>>>>>> Doerthe
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
Received on Thursday, 22 October 2020 15:06:18 UTC