Re: RDF* vs RDF vs named graphs from thomas lörtsch on 2020-11-30 (public-rdf-star@w3.org from November 2020)

From: thomas lörtsch <tl@rat.io>
Date: Mon, 30 Nov 2020 20:38:49 +0100
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: public-rdf-star@w3.org
Message-Id: <722CB291-4F25-4427-8B3F-B1683637DCB6@rat.io>
On 30. Nov 2020, at 18:57, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> 
> Agreed, but that makes them no different from IRIs or other blank nodes.  And,
> in some sense, makes them no different from literals.  Although the RDF
> semantics says what string literals denote, nowhere is this denotation made
> identical to whatever you or I think of as a string.  There is no way for
> formal specifications to directly connect to anything that happens in the real
> world, assuming that there really is a real world, or in our minds, assuming
> that we have minds.
> 
> And there is no reason that two occurrences of an embedded triple in an RDF
> document must denote the same entity.  RDF* generally has this requirement,
> and one of the test cases needs it, but it is entirely reasonable to consider
> a version of RDF* that does not.  The only thing that you lose is that you
> don't get that information associated with one occurrence is associated with
> the other.

The semantic web operates under the assumption of shared understanding about what an IRI, a literal or a blank node denotes. IMO there is absolutely no reason to think that this assumption shouldn’t or can’t hold for embedded triples (or names denoting them, should they get names by mapping to reification or whatever). As soon as they become resources in the universe of discourse we may disagree about their properties - you may find them big, I may find them blue - but we have to come to some shared understanding about what they denote, which probably is informed by what their subject, predicate and object denote, which again is as well-defined in RDF as it can be under the constraints you outline above concerning the connection to the real world.

To become a subject of discourse, however, a resource requires a proper address. An embedded triple only addresses a type. An occurrence is addressed by type and location. That location may be a graph, a document or any other snippet of RDF in which the triple surfaces. That has not much to do with meaning, minds and other metaphysical arrangements.
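
To make the type/occurrence distinction a bit more tangible, here is a rough Python sketch. It is only an illustration and not part of any RDF* proposal: the names Triple, Occurrence, node_for_type and fresh_node are hypothetical, and graph names such as ":g1" are made up. It contrasts the injective triple-to-blank-node mapping with the "fresh blank node for each <<>>" variant quoted below, and shows an occurrence addressed by type plus location.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Triple:
    """A triple type: identified only by subject, predicate and object."""
    s: str
    p: str
    o: str


@dataclass(frozen=True)
class Occurrence:
    """A triple type plus the location (here: a graph name) where it surfaces."""
    triple: Triple
    location: str


t = Triple(":alice", ":knows", ":bob")

# A set-based graph cannot hold the same triple twice: duplicates collapse.
graph = {t, Triple(":alice", ":knows", ":bob")}

# Variant 1: injective mapping, one blank node per triple type.
_type_ids = count()
_type_to_node = {}


def node_for_type(triple):
    """Return the single blank node label that stands for this triple type."""
    if triple not in _type_to_node:
        _type_to_node[triple] = f"_:t{next(_type_ids)}"
    return _type_to_node[triple]


# Variant 2: a fresh blank node for every <<>>.
_fresh_ids = count()


def fresh_node(triple):
    """Return a new blank node label on every call.

    The argument is deliberately unused: nothing records which occurrence
    the new node is supposed to stand for.
    """
    return f"_:b{next(_fresh_ids)}"


same_name = node_for_type(t) == node_for_type(t)    # one name per type
distinct_names = fresh_node(t) != fresh_node(t)     # many names, same type

# An occurrence needs an address: type plus location.
occ_g1 = Occurrence(t, ":g1")
occ_g2 = Occurrence(t, ":g2")
```

In this sketch the fresh blank nodes give different names to descriptions of the same triple type, but only the Occurrence values say where the triple actually surfaces; that location is the missing piece of the address.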

Thomas


> peter
> 
> 
> On 11/30/20 12:13 PM, thomas lörtsch wrote:
>>> On 30. Nov 2020, at 15:07, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>> 
>>> It's actually quite easy to come up with a version of RDF* (both as a mapping
>>> to RDF and as a separate semantics) that is both set-based and allows multiple
>>> embedded triples for the same subject, predicate, and object.  This is easiest
>>> to see in the mapping - instead of having an injective mapping from triples to
>>> blank nodes just use a fresh blank node for each <<>>.   This use of fresh
>>> blank nodes is already done in Turtle.
>> Except that you don’t know what those blank nodes refer to. They describe occurrences of the same triple (or rather _triple type_ to be verbose but unambiguous) and give those descriptions different names. They don't refer to specific triple occurrences in specific graphs, not even in the same graph (as one might think and which could be a useful default). Or rather: by definition they do refer to occurrences but we don’t know which.(*)
>> 
>> But let’s say we extend RDF reification to be able to refer to occurrences in the local graph. Let’s say two reifications refer to occurrences of the same triple type by different names (fresh blank nodes, re-used blank nodes, IRIs, whatever). Wouldn’t you agree that those names then necessarily are owl:sameAs? If you didn’t, wouldn’t all hell break loose?
>> 
>> Thomas
>> 
>> 
>> (*) I hope I got it right and I have to admit that I start to like this weirdness - but only because I start to understand how it doesn’t try to define some half-baked solution but just omits one essential piece. The rest however is well defined. I wouldn’t call this a weak semantics as Antoine did. It would be perfectly strong if only it was complete.
>> 
>> 
>> 
>>> peter
>>> 
>>> 
>>> 
>>> 
>>> On 11/30/20 8:55 AM, thomas lörtsch wrote:
>>>>> On 30. Nov 2020, at 12:21, Miel Vander Sande <miel.vandersande@meemoo.be> wrote:
>>>>> 
>>>>> That's valid and I also see the merit of having it as part of RDF* rather than N3 (then should be well aligned and the nquads/trig syntaxes would be rooted).  
>>>>> But AFAIK nobody actually officially dismissed including graph annotation,
>>>> That’s not true. Over the last one and a half years I brought up the issue of quads and Named Graphs several times in mails long and short and Olaf made it quite clear that he doesn’t want to discuss this matter. The RDF* space actually feels like a rather hostile territory w.r.t. quads, graphs etc. I understand that the failed attempt of the RDF 1.1 WG to standardize Named Graph semantics burned the ground and that it might seem easier to get things done if one evades the question, but actually: one just can’t.
>>>> 
>>>>> so why not gather the stakeholders from enterprise, make a joint request for scope expansion, and have a structured discussion there? Evidently, a broader scope should come with the necessary engagement to manage the debate, follow-up on issues, and draft the specs. Let's approach this positively, this group is an opportunity :)
>>>> We had that funny little moment of awkwardness in last week's call when the WikiData use case came up and nobody dared to spell it out: as long as RDF is based on sets there is no way to have the same assertion multiple times in the same graph, with different annotations. PG mode looks like it could do it but it can’t. There is no clear mapping between PG and SA mode. 
>>>> Pierre-Antoine proposes to introduce an intermediate blank node in SA mode, but that is exactly the opposite of a new approach to reification: one can model everything with blank nodes in RDF already, one always could.
>>>> Named graphs can do it, of course, but we then get to some kind of nesting or embedding of graphs. I’m quite convinced that nesting graphs (while not giving up on the triple-ish surface and serialization) is the way to go for a solution that can cater to all use cases! But is it well-investigated, proven, does it have a semantics covering all the corner cases? I’m not aware of such work. I’m working on it but I definitely can’t pull it off alone.
>>>> I did write some long-ish and one very long mail regarding this subject and I got very little support or even actual discussion on the nitty-gritty. Pat and Peter are dropping Named Graphs superiority from time to time on this list, which is good to keep the topic on the table, but of course not enough to solve it. Of course there are issues, e.g. I disagree with Pat on how graphs should be named.
>>>> 
>>>> In that respect I totally agree with you, Miel: where is the activity, warmly welcomed or not? But in the end this needs both: people willing to work on it, and this community to endorse the activity.
>>>> 
>>>> Best,
>>>> Thomas
>>>> 
>>>> 
>>>> P.S.:
>>>> 
>>>> <rant>
>>>> I don’t know why nobody reacts when I write to this list that Pierre-Antoine's semantics grounds RDF* in abstract triple types while Peter’s semantics grounds it in occurrences - and I couldn't think of a more fundamental difference in terms of semantics, certainly more fundamental than the question of whether they are referentially opaque or transparent.
>>>> 
>>>> I’d rather not have to work my way around RDF* later because it has subtly broken semantics. It would be much better if RDF* was formalized in a way that fits into the semantics of RDF, including sensible default assumptions for the semantic pieces missing in RDF, or extending it right away!
>>>> 
>>>> If RDF* ends up with a semantics that doesn’t only have gaps, like RDF Standard Reification and RDF Named Graphs, but is outright misguided, where is the value proposition that may convince PG store users to switch boats? What’s left then of the advantage of RDF over PG: standardized serializations? I think the PG community can pull that off on its own. If I had a vested interest in an RDF software and wanted to make inroads in the PG market rather sooner than later, I’d be careful nonetheless…
>>>> 
>>>> RDF* is a subset of what can be done with Named Graphs. It may very well be a very intuitive and useful subset, even giving the impression it was orthogonal to named graphs. But if RDF* isn't formalized in a way that relates it to Standard Reification and Named Graphs the risk is very high that in the end RDF* will only add to the mess of half-finished approaches to meta modelling in RDF: supporting some practical needs but not supporting others - and no poor soul that doesn’t crawl very deep into the rabbit hole of formal semantics will be able to understand which, and why. There goes intuitivity. 
>>>> </rant>
>>>> 
>>>> 
>>>>> Op ma 30 nov. 2020 om 11:56 schreef james anderson <james@dydra.com>:
>>>>> 
>>>>>> On 2020-11-30, at 09:48:18, Miel Vander Sande <miel.vandersande@meemoo.be> wrote:
>>>>>> 
>>>>>> ...
>>>>>> 
>>>>>> I appreciate the work this group is doing in terms of making the interpretation of reification clear and usable. Its main goal is still to provide compatibility with the PG world, where properties over a group of edges simply don't exist. I think this limited scope actually helps getting somewhere within reasonable time.
>>>>> in order for this effort to yield a useful result it will need to do more than "provide compatibility with the PG world".
>>>>> during the call last friday, one exchange included
>>>>> 
>>>>>   blake: I want to inquire a bit to see the aspects of embedded graph, embedded quad
>>>>>   <thomas> +1 to blake: keeping the possibility open to have embedded quads in the future
>>>>>   pchampin: A very good question by blake. There should be an issue for that in the repo. Yet another separate question 
>>>>>   ... that need to be checked and discussed 
>>>>>   ... Anyone wants to react?
>>>>>   <pchampin> ACTION: blake to submit an issue on embedded quads
>>>>> 
>>>>> that is, quads are seen as “something to be discussed”.
>>>>> the statistics on our sites suggest a stronger imperative.
>>>>> while triples dominate quads on a free site by a ratio of five to one, which would suggest that to claim pg-compatibility suffices, on an enterprise site the ratio is fifty to one in the opposite direction.
>>>>> in those contexts, if rdf* does not provide for quads, it will be of little use.
>>>>> 
>>>>> ---
>>>>> james anderson | james@dydra.com | http://dydra.com
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>
Received on Monday, 30 November 2020 19:39:08 UTC