Re: N-ary relation and rdf star

> On 6. Jan 2021, at 14:46, Andy Seaborne <andy@apache.org> wrote:
> 
> 
> 
> On 05/01/2021 18:05, thomas lörtsch wrote:
>>> On 5. Jan 2021, at 17:25, Andy Seaborne <andy@apache.org> wrote:
>>> 
>>> 
>>> 
>>> On 05/01/2021 14:24, thomas lörtsch wrote:
>>>>> On 5. Jan 2021, at 11:31, Pierre-Antoine Champin <pierre-antoine.champin@ercim.eu> wrote:
>>>>> 
>>>>> On 05/01/2021 09:36, Patrick J Hayes wrote:
>>>>> 
>>>>>>> On Jan 4, 2021, at 8:55 PM, Jeen Broekstra <jb@metaphacts.com> wrote:
>>>>>>> 
>>>>>>> On Tue, 5 Jan 2021, at 01:21, Joy lix wrote:
>>>>>>>> Dear All:
>>>>>>>> I'm learning RDF star. What is the difference between N-ary relations and RDF star?
>>>>>>>> N-ary relation : https://www.w3.org/TR/swbp-n-aryRelations/
>>>>>>> 
>>>>>>> RDF* is an extension of RDF that aims to make modeling certain kinds of n-ary relations easier.
>>>>>> 
>>>>>> Really?? That is not what I have been reading up until now. RDF* replaces RDF reification: what it encodes is assertions /about/ a triple. Such as provenance information. That is quite different from extending a triple to an n-tuple in order to encode n-ary relations.
>>>> We discussed the matter in some detail in December. The outcome was that an RDF* embedded triple doesn’t represent a specific triple occurrence but a triple type. So it is best understood not as an alternative to RDF reification but as a way to encode n-ary relations and in that respect an alternative to rdf:value.
>>>> If RDF* was to record provenance it would need a mechanism to refer to a specific occurrence e.g. in some document or named graph.
>>> 
>>> Indeed, RDF* does not directly solve this use case.
>>> RDF* does provide a building block whereby an application can capture that.
>>> 
>>> <<:s :p :o >> :usedIn <someDocument> .
>> Right, and that might be regarded as progress compared to RDF Standard Reification, but how much? It looks short and compact but when you actually want to attach assertions to that occurrence: how would you do that?
>> A slightly verbose way to go about it would be the following:
>> _:x rdf:type rdfx:Occurrence ;
>>     rdfx:ofTriple <<:s :p :o >> ;
>>     rdfx:inGraph <someDocument> .
>> Now you can start asserting stuff like date, source etc to the subject.
>> _:x :as :example ;
>>     :onDate "05.01.2021" .
>> It saves 2 triples compared to RDF Standard Reification. The ratio is slightly better in practice, as the type statement can be omitted, but it is still only a 2-triple difference. The cost is introducing a new (and very long) node type.
>> You might also nest the embedded triple in another embedded triple:
>> << <<:s :p :o >> :usedIn <someDocument> >>
>>   :as :example ;
>>   :onDate "05.01.2021" .
>> How would you translate that to ordinary triples? Would you use reification or an n-ary relation? Can that even be standardized? I fear it can’t be :-(
> 
> Personal opinion - if I have a complicated provenance problem, I'd use reification, being clear about stating/claims and statements.  

Reification and n-ary relations tackle the problem from opposite directions: n-ary relations refine a description from the inside by adding detail, reification annotates and comments from the outside. There are some clear-cut rules for when to use which, but many problems can be solved either way with more or less accuracy. "Always using reification when things get complicated" will not always be the right approach.
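
To make the two directions concrete, here is a minimal sketch using the Tom Hanks example quoted further down the thread. The example.org prefix, the :source property and :someDocument are assumptions for illustration only, and the n-ary variant follows the rdf:value pattern.

```turtle
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# n-ary relation: refine the description from the inside.
# The object becomes an intermediate node that carries the extra detail.
:tom_hanks :stars_in [
    rdf:value :movie123 ;
    :as "Captain Miller"
] .

# Reification: annotate from the outside.
# The plain triple stays as it is; a separate node describes it.
:tom_hanks :stars_in :movie123 .
[] rdf:type rdf:Statement ;
   rdf:subject :tom_hanks ;
   rdf:predicate :stars_in ;
   rdf:object :movie123 ;
   :source :someDocument .   # hypothetical provenance property
```

In this sketch the role is arguably part of the relation itself, while the source is a statement about the statement, which is roughly where the line between the two approaches falls.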

> The "triple instance" effect in fact comes from using blank nodes.

I don’t know what you mean.

> There is a non-technical dimension here - reification hasn't gained traction.

I’m not arguing for RDF Standard Reification, but it is the measure against which new approaches have to be compared. And if they introduce a new node type, they’d better compare very favorably.

I haven’t made up my mind yet whether defining the embedded triple as syntactic sugar for n-ary relations is a good idea or not.
It does let down the very common provenance use case, though, and that needs a solution, as otherwise the embedded triple will be used for reification anyway, almost guaranteeing another shit pile of semantically unsound triples that any future endeavour to solve the problem of meta modelling on the web will have to navigate. Not a pleasant prospect!

> RDF* has implementation experience. It has syntax and some practical/implementation advantages over reification.

But it doesn’t have a semantics yet, and the experience is that even the creator of RDF* until a few months ago touted it as syntactic sugar for RDF reification - whereas we now know that it is a variant of n-ary relations, really syntactic sugar for rdf:value - which is quite an experience. All the implementation experience etc. aside, RDF* is still a moving target.

> I don't think anyone is (realistically!) claiming RDF* is a complete replacement for reification.

I don’t think anybody is asking for that. Getting exactly the same is very hard, given all the subtle deviations in semantics that one can observe when getting close enough to any piece of syntax.
I was rather asking for even better reification, and not unreasonably so. My proposal, a few months ago, was to define that the embedded triple refers to an occurrence in the same graph. That doesn’t cover corner cases like Wikipedia’s multiple annotations, and it doesn’t allow annotating triples in distant graphs, but it serves the usual provenance use case, is indeed syntactic sugar for RDF Standard Reification, and halfway fills the gap in its semantics (the other half would be the Wikipedia use case and distant triples - not unsolvable either, as I showed). This is pragmatic AND sound AND intuitive.
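
A rough sketch of what that reading could expand to. This is only an illustration, not a worked-out mapping: the example.org prefix is an assumption, :onDate is borrowed from the example quoted above, and the sketch assumes that the annotated triple is itself asserted in the graph.

```turtle
@prefix :    <http://example.org/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# RDF* shorthand (not plain Turtle, shown as a comment):
#   << :s :p :o >> :onDate "05.01.2021" .

# Possible expansion: the triple occurs in this graph ...
:s :p :o .

# ... and the annotation hangs off a standard reification of that occurrence.
_:occ rdf:type rdf:Statement ;
      rdf:subject :s ;
      rdf:predicate :p ;
      rdf:object :o ;
      :onDate "05.01.2021" .
```

Annotating a triple in a distant graph would additionally need some way to name that graph, which this sketch deliberately leaves out.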

> The idea of labelling triple instances is interesting but it is not designed in detail and not yet proven.

Of course it’s not yet proven - nothing is. I was, however, trying to initiate a discussion, but the interest is indeed low. There is definitely a lack of enthusiasm here towards really tackling the problem at hand and much more hope of finding an easy way out. Not that I was ever a big fan of RDF*, but it has its merits. This lackluster approach to semantics jeopardizes those very merits.

> For example, a graph is a set of triples. What happens if there are two triples added?

I guess you mean inserting the same triple into a graph again when the triple first inserted has already been annotated. Indeed a problem that might require some housekeeping wrt un-annotated triples to solve properly. However, we have the same problem with unasserted but annotated triples in SA mode, oops. What happens when the same triple is then added for real? Not designed in detail, not proven - and so far not a problem, it seems, as I bleated about this a while ago but got no response. Will this change your mind?

> Or how to talk about triples in no known graphs - talking about disputed claims.


Why not just skip the graph identifier? And in SA mode simply not assert it? Maybe I don’t understand your question?

> I hope we will see all these issues worked on and developed in the future with proof-of-value usages.

What does this sentence even mean? Is this some gospel to ease the pain? How many tries do you think we have? How many brackets are there in the RDF-world of infinite backwards compatibility? How much patience and good faith in the user community?

To the contrary I hope we will not be too disappointed by the results of the yet-to-be-determined proof-of-value of the proposed RDF* semantics. My fear is that the semantics of usage will be all over the place, simply ignoring the very specific N3 inspired and superman-problem solving semantics of the spec. The current semantics seems mainly inspired by the desire to circumvent problems. That’s understandable and pragmatic, within reason. But if it doesn’t meet the use cases or doesn’t clearly NOT meet them - that will backfire. And that RDF* doesn’t even try to understand how it could fit into the bigger picture - graphs for example, annotating groups of triples - is hard to excuse.

Again: you argue that RDF* is the proven, pragmatic solution, but the usefulness of its proposed semantics is not proven at all, and its pragmatics in an interoperability context depend on that. If all we needed was PG style modelling, we could well use PG and be done with it. And I would have stopped bothering about RDF* quite a while ago if I thought we had an unlimited number of shots at this problem.

Thomas

>    Andy
> 
>>>> But RDF* doesn’t provide such a mechanism, and proposals to that effect have been met with a certain reluctance. Quite to the contrary, it was said that the provenance examples in the numerous RDF* papers have always been a mistake and that RDF* embedded triples were always intended to refer to triple types. So RDF* as of December 2020 doesn’t and can’t and doesn’t want to record provenance, quite explicitly, although it was presented and understood from the start as doing that and although it will no doubt be used for that no matter what a spec says. In a way this has a certain Groundhog Day feel to it.
>>>>> More generally, the goal is to mimic Property Graph's ability to annotate arcs, which is used for many use-cases. Provenance is one of them,
>>>> And now we will have the whole discussion again?
>>>>> N-ary relations is another one. For example, a common example in Property Graphs is "Tom Hanks stars in 'Saving Private Ryan' as Captain Miller", which could be represented in RDF* (using the annotation syntax):
>>>>> 
>>>>>   :tom_hanks :stars_in :movie123 {| :as "Captain Miller" |}.
>>>>>> Keeping things like this clear is why it is important to have a crisp semantics for RDF*.
>>>>> We are working on it ;) I expect to submit a new Pull Request soon...
>>>> Looking forward to it ;-)
>>>> Thomas
>>>>>   best
>>>>> 
>>>>>> 
>>>>>> Pat Hayes
>>>>>> 
>>>>>>> As the working group note you mention shows, there are other ways to represent such relations in RDF without needing an extension, but they all have certain drawbacks. RDF* hopes to overcome some of those drawbacks.
>>>>>>> 
>>>>>>>> This W3C Working Group Note  hasn't been updated since 2006. Can it still be used?
>>>>>>> 
>>>>>>> Yes. The approaches for modeling n-ary relations shown in that note are still as valid today as they were in 2006. It's just that RDF* gives you another option, that (hopefully, in some cases) will make things easier.
>>>>>>> 
>>>>>>> Kind regards,
>>>>>>> 
>>>>>>> Jeen
>>>>>>> --
>>>>>>> Dr Jeen Broekstra (he, him)
>>>>>>> principal software engineer
>>>>>>> 
>>>>>>> jb@metaphacts.com
>>>>>>> www.metaphacts.com

Received on Wednesday, 6 January 2021 18:39:59 UTC