Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from Hugh Glaser on 2022-09-23 (semantic-web@w3.org from September 2022)

From: Hugh Glaser <hugh@glasers.org>
Date: Fri, 23 Sep 2022 12:16:41 +0100
To: Pierre-Antoine Champin <pierre-antoine@w3.org>
Cc: David Booth <david@dbooth.org>, Semantic Web <semantic-web@w3.org>, Dan Brickley <danbri@google.com>
Message-Id: <4B031667-1079-4364-8BA7-D4CF7E96257C@glasers.org>
Hi.
I’ve been following much of this, although not all, and offer a few comments and maybe misgivings about the discussion.
Sorry if I have restarted things at the wrong moment, and for revisiting old ground.

It is not good, I think, to discuss RDF features for representing knowledge without substantial, realistic examples of how they would then be used in a practical application.
I have seen almost nothing of this.

I don’t think “lists” is a good term at all.
Lists were invented and named in a world where modern, sophisticated data structures were not available.
Once data structures became available, the use of lists per se became much rarer.
This is because it is much better to use proper identifiers for the relations, carrying meaning in the name, than generics such as CAR, CDR, “next”, CDDDR etc.
Although RDF is of course not a data structure, I would suggest that the same observation applies.
(That means that if we really want a “list” syntactic sugar, then the relation corresponding to “next” should be specified as part of it.)
In fact, almost every time I have thought that a list is what I wanted, it has turned out to be a bad fit, and in the end a natural knowledge structure with good, application-specific names for all the relations was much better.

If you start to talk about indexing the construct, then it isn’t really a list that you wanted any more; it is more like an array, possibly an easily mutable one.
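
To make the list/array distinction concrete, here is a minimal Python sketch. It assumes the usual rdf:first / rdf:rest cons-cell encoding of RDF collections; the dict-based cells and the helper names `cons` and `nth` are illustrative only and not taken from any RDF library.

```python
# Cons-cell encoding, loosely mirroring rdf:first / rdf:rest / rdf:nil.
NIL = None

def cons(first, rest):
    """Build one list cell holding an item and a link to the remaining cells."""
    return {"first": first, "rest": rest}

def nth(cell, index):
    """Return the item at a 0-based index by walking the 'rest' links."""
    while cell is not NIL:
        if index == 0:
            return cell["first"]
        cell = cell["rest"]
        index -= 1
    raise IndexError("list is shorter than the requested index")

# The dog show winners from the thread, once as a linked list, once as an array.
winners_list = cons(":ginger", cons(":bailey", NIL))
winners_array = [":ginger", ":bailey"]

print(nth(winners_list, 1))   # has to walk the "next" links to get there
print(winners_array[1])       # positional access is the native operation
```

With the cons-cell form, "who came Nth?" is always a traversal; with the array form it is the basic operation.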

My view of what is needed:
We want ordering of the sort provided simply by source-text order in things like XML and JSON.
The case where I feel the need for this is exemplified by the use of properties like dct:creator for publications.
Pierre-Antoine reports Dan getting it right below (of course!).
>   # Example 10 expanded
>   <#paper1> schema:creator
>       <#alice> {| ex:order 1 |},
>       <#bob> {| ex:order 2 |},
>       <#charlie> {| ex:order 3, ex:last |}.

But this is very much not what anyone would call a list.
As he points out, the crucial thing is that the schema:creator relation is asserted for all the authors.
And *also* there is somehow an ordering relationship among the objects of those triples.
But this is actually not a “next” ordering.
The knowledge I am representing is rarely such that, when I query it, knowing the third author I would like to find the fourth.
It is much more that I would like to know the third author (if there is one), or which position in the author list a particular author holds: Bob is the second author.
“Which papers with Alice as first author have Charlie as an author?” Or Charlie as third.
So that aspect of it looks much more like an array to me.
This is true of dog shows as well, I think: “Who came 6th?”, rather than “Who was next after 5th?”
I don’t know what this slightly complex structure would be called if it were a data structure in a programming language, but since I think that the schema:creator relation is often the primary knowledge, it is certainly misleading to call it a list.
Example 10 tries to capture this, I think.
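
The queries described above can be sketched in plain Python rather than RDF. This is only an illustration under stated assumptions: each tuple stands in for one schema:creator triple together with its ex:order annotation from Example 10, the second paper is invented, and all function names are hypothetical.

```python
# Each tuple stands in for one schema:creator triple plus its ex:order value:
# (paper, creator, order). "#paper2" is invented purely for illustration.
CREATORS = [
    ("#paper1", "#alice", 1),
    ("#paper1", "#bob", 2),
    ("#paper1", "#charlie", 3),
    ("#paper2", "#bob", 1),
    ("#paper2", "#charlie", 2),
]

def author_at(paper, position):
    """Who is the Nth author of a paper? None if there is no such position."""
    for p, creator, order in CREATORS:
        if p == paper and order == position:
            return creator
    return None

def position_of(paper, creator):
    """Which position does an author hold on a paper? None if not an author."""
    for p, c, order in CREATORS:
        if p == paper and c == creator:
            return order
    return None

def papers_with(first_author, other_author):
    """Papers led by first_author that also list other_author at any position."""
    led = {p for p, c, order in CREATORS if c == first_author and order == 1}
    return sorted(p for p, c, _ in CREATORS if p in led and c == other_author)

print(author_at("#paper1", 3))            # the third author, if there is one
print(position_of("#paper1", "#bob"))     # Bob's position in the author list
print(papers_with("#alice", "#charlie"))  # Alice first, Charlie anywhere
```

None of these lookups needs a "next" link: each goes straight from a position to an author or back, while the plain creator relation remains available on its own.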

I also don’t have any new suggestions for how this might be represented in RDF - sorry.

Best
Hugh
> On 23 Sep 2022, at 08:46, Pierre-Antoine Champin <pierre-antoine@w3.org> wrote:
> 
> Hi David,
> 
> On 22/09/2022 23:38, David Booth wrote:
>> On 9/22/22 16:34, Pierre-Antoine Champin wrote:
>>> I think it is useful to consider every proposed extension, and carefully consider whether it really requires an extension of the underlying data model, or whether it can be managed purely as syntactic sugar.
>> 
>> Agreed.  That would be best for backward compatibility.  And it occurs to me that some of these ideas for new "built-in" object types, such as arrays and composite objects, could actually be implemented as syntactic sugar for named graphs.  For example, this array of dog show winners:
>> 
>>   # Example 1
>>   :dogShow :winners ( :ginger :bailey ) .
>> 
>> might be treated as syntactic sugar for this TriG:
>> 
>>   # Example 1-expanded
>>   :dogShow :winners N2 .
>>   N2 { :dogShow :winners
>>          [
>>            0 :ginger ;
>>            1 :bailey
>>          ] .
>>      }
> 
> That's intriguing :)
> 
> But I'm not sure exactly what you gain here... Querying lists in SPARQL would be hardly any easier than it is today...
> 
>> 
>> where N2 is an auto-generated named graph name of some kind (TBD) -- perhaps a blank node, a relative URI, or a Skolem URI.  By "unblessing" N2, you get to "see" the triples that implement that list object.
>> 
>> And this composite diagnosis object, used for an n-ary relation:
>> 
>>   # Example 6
>>   :christine :diagnosis @[
>>     :disease :breastCancer ;
>>     :probability 0.8
>>   ] .
>> 
>> might be treated as syntactic sugar for this TriG:
>> 
>>   # Example 6-expanded
>>   :christine :diagnosis N6 .
>> 
>>   N6 { :christine :diagnosis
>>          [
>>            :disease :breastCancer ;
>>            :probability 0.8
>>          ] .
>>      }
>> 
>> And this RDF-star syntax:
>> 
>>   # Example 9
>>   :a :name "Alice" {|
>>       :statedBy :bob ;
>>       :recorded "2021-07-07"^^xsd:date
>>       |} .
>> 
>> could be syntactic sugar for this TriG:
>> 
>>   # Example 9-expanded
>>   :a :name "Alice" .
>>   N9 { :a :name "Alice" . }
>>   N9 :statedBy :bob ;
>>       :recorded "2021-07-07"^^xsd:date .
> 
> This pattern for expressing triples about triples has been extensively discussed as an alternative to RDF-star (or, as you propose here, as the plain RDF interpretation of Turtle-star / SPARQL-star).
> 
> The outcome of the discussion was, IIRC, that this was not ideal, because it overloads named graphs with new uses. Since a dataset is a flat collection of named graphs, there is no easy way to distinguish "quoted-triples-named graphs" from "plain-old-named-graphs"...
> 
> Also, I think I read that some implementations do not behave well with too many named graphs.
> 
> Now, about the elephant in the room: it may seem strange that I am strongly advocating against extending the core model of RDF, while being a co-editor of RDF-star [1], which does exactly that. For the record, I first leaned towards making RDF-star only syntactic sugar, but was eventually convinced that it deserved to become an integral part of the core model. I still believe, however, that such extensions should be the exception.
> 
>> 
>> This would have the benefit of supporting labeled property graphs, n-ary relations and arrays all under the same mechanism, without adding anything to the RDF core.
>> 
>> Thoughts?
> 
> For ordered lists, Dan Brickley made a suggestion some time ago (on a GitHub issue that I can't find right now, unfortunately): they could be encoded using RDF-star, like this:
> 
>   # Example 10 expanded
>   <#paper1> schema:creator
>       <#alice> {| ex:order 1 |},
>       <#bob> {| ex:order 2 |},
>       <#charlie> {| ex:order 3, ex:last |}.
> 
> It has the advantage of keeping the "simple" triple for each creator, and is quite easy to query in SPARQL. Of course, some syntactic sugar could be created to make this easier to write/read, e.g.:
> 
>   # Example 10 syntactic sugar
>   <#paper1> schema:creator (| <#alice> <#bob> <#charlie> |).
> 
> It occurs to me that RDF-star could be leveraged in a similar way with your Example 6:
> 
>   # Example 6 expanded with RDF-star
>   :christine :diagnosis _:d.
>   _:d
>     :disease :breastCancer {| ex:propertyOf _:d |};
>     :probability 0.8 {| ex:propertyOf _:d |}.
> 
> (although this does not quite capture the "closed-ness" of properties... work in progress)
> 
>   pa
> 
>> David Booth
>> 
Received on Friday, 23 September 2022 11:17:02 UTC