Re: RDF* and reification from Pierre-Antoine Champin on 2021-02-03 (public-rdf-star@w3.org from February 2021)

From: Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu>
Date: Wed, 3 Feb 2021 22:09:05 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, public-rdf-star@w3.org
Message-ID: <c2b49fee-fa53-c93f-85fc-410c7fb87149@ercim.eu>

On 28/01/2021 01:59, Peter F. Patel-Schneider wrote:
> On 1/25/21 4:21 AM, Pierre-Antoine Champin wrote:
>
>> Peter,
>>
>>> Uniqueness can be achieved by using the same blank node for
>>> an embedded triple, which is done during the construction of RDF graphs from
>>> surface syntaxes.
>>
>> I am concerned about the fact that uniqueness is *only* achieved during the
>> construction, and not enforced at the semantic level. I fear that this may
>> cause undesired entailments, or on the contrary miss some desired
>> entailments. Does this concern make sense to you (even if you don't agree on
>> the proposed solution)?
>>
>> One example of "missed" entailment would be (I think) with the
>> "malformed-literal-bnode" test
>> (https://w3c.github.io/rdf-star/tests/semantics/manifest.html#malformed-literal-bnode).
>> But granted, that one is a corner-case, that maybe not everybody agrees on
>> anyway.
> I don't see why this entailment would be missed.

(This is based on your proposal here:
https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0059.html)

* the input of the test has an rdf*:object arc, but no rdf:object arc
(because o is a malformed literal)

* the expected entailed graph has a spurious rdf:object arc, and lacks
the rdf*:object arc (because o is a blank node)

>> Another more concerning example (IMO) is the one I pointed out in
>> https://lists.w3.org/Archives/Public/public-rdf-star/2021Jan/0060.html. I
>> repeat it below for completeness:
>>
>> Consider the following RDF* graphs (serialized in Turtle*, assuming the
>> adequate prefix declaration) :
>>
>>       G1:  << :s :p :o >> a :A.
>>       G2:  << :s :p :o >> a :B.
>>       G3:  _:x a :A, :B.
>>
>> Does the merging of G1 and G2 entail G3? If you merge them before the
>> mapping (straightforwardly extending the definition of merging to the RDF*
>> abstract syntax), they do. And for that reason, I personally think they should.
>>
>> If you merge them after the mapping, the merging will ensure that the blank
>> nodes generated respectively in G1 and G2 are distinct, and so the result
>> will *not* entail G3. The mapping from RDF* to RDF is lossy, as it does not
>> preserve the identity of RDF* embedded triples in the RDF abstract syntax.
> The blank node method for embedded triples does not preserve uniqueness when
> merging.  Implementations that perform merging have to use a modified version
> of merging.
... and of union, I just realized (because the mapping of the union of 
two RDF* graphs is not the union of the mapping of each of them).

So the mapping from RDF* to RDF does not produce a "standard" RDF graph, 
it produces an RDF-graph-that-needs-special-treatment-for-merging-and-union.
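
To make the difference concrete, here is a minimal Python sketch of the G1/G2 example quoted above. It is only an illustration under my own assumptions: graphs are sets of tuples, an embedded triple is a nested tuple, and map_to_rdf is a hypothetical helper that uses the plain rdf:subject/rdf:predicate/rdf:object arcs with one blank node per embedded triple per graph. It is not taken from any implementation or from the draft.

```python
from itertools import count

_ids = count()  # shared counter, so bnodes minted by separate calls never clash


def map_to_rdf(graph):
    """Replace each embedded triple by a blank node plus reification arcs.

    Within one call, the same embedded triple always gets the same blank node
    (uniqueness "by graph"); across calls, the blank nodes are distinct.
    """
    table, out = {}, set()

    def node(term):
        if not isinstance(term, tuple):  # IRI, literal or ordinary bnode
            return term
        if term not in table:
            b = table[term] = f"_:t{next(_ids)}"
            s, p, o = (node(t) for t in term)
            out.update({(b, "rdf:subject", s),
                        (b, "rdf:predicate", p),
                        (b, "rdf:object", o)})
        return table[term]

    for s, p, o in graph:
        out.add((node(s), p, node(o)))
    return out


def instances(graph, cls):
    """Nodes typed with cls in a plain RDF graph."""
    return {s for s, p, o in graph if p == "rdf:type" and o == cls}


G1 = {((":s", ":p", ":o"), "rdf:type", ":A")}
G2 = {((":s", ":p", ":o"), "rdf:type", ":B")}

# Merge at the RDF* level (no blank nodes yet, so merge == union), then map.
merged_then_mapped = map_to_rdf(G1 | G2)
# Map each graph first; the minted bnodes are disjoint, so this union is a merge.
mapped_then_merged = map_to_rdf(G1) | map_to_rdf(G2)

# G3 (_:x a :A, :B) needs one node that is both an :A and a :B.
both_1 = instances(merged_then_mapped, ":A") & instances(merged_then_mapped, ":B")
both_2 = instances(mapped_then_merged, ":A") & instances(mapped_then_merged, ":B")
```

Under these assumptions, both_1 contains exactly one blank node, so G3 is simply entailed, while both_2 is empty, so it is not.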

The whole mapping idea was (IIUC) to get rid of the new kind of node
that RDF* introduced. But in effect, the blank nodes used to represent
embedded triples require special treatment for union and merging. I
can't help but consider them a somewhat new kind of blank node...

It seems to me that the extra machinery that you insist on keeping out 
of the semantics shows up elsewhere... I still don't understand why you 
prefer it that way, but I guess I could live with that. I'll try to 
capture that in a PR.

>> I don't have any idea right now on how to make a lossless mapping. The
>> mapping proposed in PR81 is lossy as well, but the additional machinery in
>> the semantics aims at re-introducing the missing information, i.e. two blank
>> nodes representing triples with the same subject, predicate and object are
>> forced to denote the same thing.
>>
>> Maybe there is a simpler solution to avoid the "merging issue" described
>> above -- and I am all for simplicity. But I think this issue needs addressing.
>>
>> Or do you consider that this "merging issue" is not a problem?
> I do wonder whether any RDF implementations actually merge RDF graphs.

Maybe not explicitly. But all implementations I know allow you to load 
multiple files into a single graph object, and they ensure that blank 
nodes coming from each (graph parsed from a) file do not clash. 
Effectively, this is a merge.

To achieve the same thing in RDF*, some post-processing would have to be 
run after loading each file, in order to merge blank nodes having the 
same subject/predicate/object...
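
A rough sketch of what such a post-processing step could look like, in Python. The function name coalesce_reified and the string-based graph representation are my own assumptions for illustration. The sketch also assumes that every blank node standing for an embedded triple carries exactly one rdf:subject, one rdf:predicate and one rdf:object arc, and it makes a single pass, so nested embedded triples would need it to be repeated until nothing changes.

```python
REIF = ("rdf:subject", "rdf:predicate", "rdf:object")


def coalesce_reified(graph):
    """Merge blank nodes that describe the same (subject, predicate, object).

    graph is a set of (s, p, o) tuples whose terms are strings; blank nodes
    are the strings starting with "_:".
    """
    # Collect the reification arcs attached to each blank node.
    desc = {}
    for s, p, o in graph:
        if s.startswith("_:") and p in REIF:
            desc.setdefault(s, {})[p] = o

    # Pick one canonical blank node per described triple.
    canonical, rename = {}, {}
    for b in sorted(desc):  # sorted only to make the choice deterministic
        d = desc[b]
        if len(d) == 3:  # skip blank nodes with an incomplete description
            key = tuple(d[p] for p in REIF)
            rename[b] = canonical.setdefault(key, b)

    # Rewrite every triple with the canonical blank nodes.
    return {(rename.get(s, s), p, rename.get(o, o)) for s, p, o in graph}


# Two files loaded into one graph, each with its own bnode for << :s :p :o >>.
loaded = {
    ("_:b0", "rdf:subject", ":s"), ("_:b0", "rdf:predicate", ":p"),
    ("_:b0", "rdf:object", ":o"), ("_:b0", "rdf:type", ":A"),
    ("_:b1", "rdf:subject", ":s"), ("_:b1", "rdf:predicate", ":p"),
    ("_:b1", "rdf:object", ":o"), ("_:b1", "rdf:type", ":B"),
}
fixed = coalesce_reified(loaded)
```

Note that a step like this would also merge blank nodes from ordinary, hand-written reification, unless dedicated predicates are used to tell the two apart.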

> (Sometimes I wonder whether any RDF implementations actually compute RDF
> entailments.)
I have indeed never seen an implementation of RDF entailment alone, but 
RDFS inference engines are meant to support it. Whether this is complete 
or not... ;->

   pa

>>    best
>>
>>
>> On 22/01/2021 19:20, Peter F. Patel-Schneider wrote:
>>> As I've mentioned several times it turns out that reification can be used in
>>> many different ways, each producing a different variation of RDF*.   I've also
>>> sent out several examples of defining RDF* using reification.
>>>
>>>
>>> If << s p o >> is just replaced by
>>>
>>> _:b rdf:subject s .
>>>
>>> _:b rdf:predicate p .
>>>
>>> _:b rdf:object o .
>>>
>>> with a different blank node for each occurrence of the embedded triple then
>>> you get transparency and non-uniqueness.
>>>
>>> If you require using the same blank node for a triple then you get
>>> uniqueness.  Uniqueness can be by document, by graph, or universal.  (Of
>>> course, using the same blank node in multiple RDF graphs doesn't always get
>>> what you might think it does.)
>>>
>>> If you add extra links for non-blank subjects, predicates, or objects of
>>> embedded triples that link to literal versions of the subject, predicate, and
>>> object then you get a semi-opaque version.  The literals can just be strings
>>> whose values are a canonical representation of the subject, predicate, or
>>> object.
>>>
>>> If these links are also added for blank node subjects, predicates, or objects
>>> then you get a fully opaque version.
>>>
>>>
>>> So it is possible to define several versions of RDF* with very minimal
>>> additions to RDF.   Several versions of opacity can be achieved by using three
>>> new predicates.   Uniqueness can be achieved by using the same blank node for
>>> an embedded triple, which is done during the construction of RDF graphs from
>>> surface syntaxes.
>>>
>>>
>>> peter
>>>
>>>
>>>
>>>
Received on Wednesday, 3 February 2021 21:09:11 UTC