Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from Patrick J. Hayes on 2022-09-28 (semantic-web@w3.org from September 2022)

From: Patrick J. Hayes <phayes@ihmc.org>
Date: Wed, 28 Sep 2022 05:42:30 +0000
To: David Booth <david@dbooth.org>
CC: "semantic-web@w3.org" <semantic-web@w3.org>, "Patrick J. Hayes" <phayes@ihmc.us>
Message-ID: <C91BF4C6-4997-41D6-8DFC-0BE8DB3BAB8C@ihmc.org>
> On Sep 27, 2022, at 1:32 PM, David Booth <david@dbooth.org> wrote:
> 
> On 9/27/22 09:58, Pierre-Antoine Champin wrote:
>> lists do not only give you order, they give you "closedness": the first/rest ladder captures the fact that the list contains all these elements *and only them* (and in this particular order).
> 
> Small but important clarification: currently RDF lists do NOT give "closedness",

Sure they do, if you use the vocabulary properly. 

:thislist rdf:first :A .
:thislist rdf:rest :x1 .
:x1 rdf:first :B .
:x1 rdf:rest :x2 .
:x2 rdf:first :C .
:x2 rdf:rest rdf:nil .

tells you that :thislist is exactly ( :A :B :C ), with three items. You can't express this kind of 'closedness' with the container vocabulary because it provides no way to say 'and no more'.
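
A minimal sketch of how a consumer might read that ladder, assuming the six triples are held as plain Python tuples of strings. The `read_list` helper and the tuple representation are illustrative assumptions, not part of any RDF library:

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# The six triples from the example, as (subject, predicate, object) tuples.
TRIPLES = [
    (":thislist", RDF_FIRST, ":A"),
    (":thislist", RDF_REST, ":x1"),
    (":x1", RDF_FIRST, ":B"),
    (":x1", RDF_REST, ":x2"),
    (":x2", RDF_FIRST, ":C"),
    (":x2", RDF_REST, RDF_NIL),
]


def read_list(triples, head):
    """Walk the first/rest ladder from `head` until rdf:nil is reached."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle at {node}")
        seen.add(node)
        firsts = [o for s, p, o in triples if s == node and p == RDF_FIRST]
        rests = [o for s, p, o in triples if s == node and p == RDF_REST]
        # 'Proper' usage: exactly one rdf:first and one rdf:rest per node.
        if len(firsts) != 1 or len(rests) != 1:
            raise ValueError(f"{node} is not a well-formed list node")
        items.append(firsts[0])
        node = rests[0]
    return items


print(read_list(TRIPLES, ":thislist"))  # [':A', ':B', ':C']
```

Reaching rdf:nil is what carries the 'and no more': the walk ends there, so the result has exactly three items.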

Now, of course RDF does not impose 'proper' usage of the collection vocabulary, because it imposes hardly any syntactic restrictions on how any vocabulary is used. But you can define a semantic extension of RDF which does impose this, and indeed the specs mention this possibility explicitly.
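
To make the gap concrete, here is a rough sketch of the kind of check such an extension might enforce, assuming triples are plain Python tuples of strings. The `malformed_list_nodes` function is a hypothetical helper, and the rule it applies (exactly one rdf:first and one rdf:rest per list node) is one plausible reading of 'proper' usage, not a quotation from the specs:

```python
from collections import Counter

RDF_FIRST, RDF_REST = "rdf:first", "rdf:rest"


def malformed_list_nodes(triples):
    """Return subjects that do not have exactly one rdf:first and one rdf:rest."""
    firsts = Counter(s for s, p, _ in triples if p == RDF_FIRST)
    rests = Counter(s for s, p, _ in triples if p == RDF_REST)
    nodes = set(firsts) | set(rests)
    # Counter returns 0 for a missing key, so absent properties are caught too.
    return sorted(n for n in nodes if firsts[n] != 1 or rests[n] != 1)


# Legal RDF, but not 'proper' use of the collection vocabulary.
legal_but_improper = [
    (":l", RDF_FIRST, ":A"),
    (":l", RDF_FIRST, ":B"),      # two first elements on one node
    (":l", RDF_REST, "rdf:nil"),
    (":m", RDF_FIRST, ":C"),      # no rdf:rest, so the list never closes
]

print(malformed_list_nodes(legal_but_improper))  # [':l', ':m']
```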

> but "closedness" is definitely what we want (for the vast majority of use cases).  That is precisely one of the reasons I and others feel such a need for (closed) arrays.
> 
>> . . .
>>> Hugh Glaser wrote:
>>> I also worry that if I assert exactly the same knowledge twice, a paper could end up with two authorLists, certainly if bnodes got involved.
>> Indeed...
>> That's actually something that "lists as first class citizens" could help solve -- that is, if they were defined in such a way that two lists with exactly the same elements are in fact one and the same object. 

Whoo, wait a minute. Do you really want that kind of extensionality condition? That's not true in LISP, for example, and I am pretty sure it's not true in any programming language that uses linked lists as a data structure.
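
A small Python analogue of the LISP situation, offered as a sketch only. The `Cons` class is an illustrative stand-in for a cons cell and is not taken from the thread:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Cons:
    """A linked-list cell: one element plus a pointer to the rest."""
    first: Any
    rest: Optional["Cons"] = None


# Two lists built separately from exactly the same elements.
a = Cons(":A", Cons(":B", Cons(":C")))
b = Cons(":A", Cons(":B", Cons(":C")))

print(a == b)  # True: same elements in the same order
print(a is b)  # False: still two distinct objects
```

Extensionality would require the second answer to be True as well, which is not how linked lists normally behave.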
> 
> Yes, that is exactly what's needed, and it is readily attainable if we eliminate *explicit* blank nodes.  By explicit blank nodes, I mean blank nodes that are written like _:b42 in Turtle.  Implicit blank nodes, written with square brackets like "[ ... ]", do not cause problems because they are guaranteed to be acyclic.

Wait, wait, the non sequiturs are making me dizzy. First, extensionality (i.e. the condition same elements => same list) has nothing to do with blank nodes. Second, even implicit blank nodes are still blank nodes, so how can their explicitness be important? Third, how did being acyclic enter into the discussion suddenly? 

>  This means that it would be easy for tools to generate a consistent internal identifier for them, based recursively on their constituent elements.

And those would be blank node identifiers, right? So what you are suggesting here, if I follow you, is a scheme for letting systems generate their own bnodeIDs for the innards of list structures. Fine, but this is not a modification to RDF.
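
A rough sketch of what such a scheme could look like, assuming list items are already strings (IRIs or literal lexical forms). The `list_node_id` function, the SHA-256 choice, and the `_:l` label prefix are all assumptions made for illustration; this is not David's proposal as specified, nor an RDF canonicalization algorithm:

```python
import hashlib


def list_node_id(items):
    """Return a deterministic blank node label for the list with these items."""
    if not items:
        return "rdf:nil"
    # The label depends on the first item and, recursively, on the rest.
    rest_id = list_node_id(items[1:])
    payload = f"{items[0]}\x00{rest_id}".encode("utf-8")
    return "_:l" + hashlib.sha256(payload).hexdigest()[:16]


# Same elements in the same order give the same label...
assert list_node_id([":A", ":B"]) == list_node_id([":A", ":B"])
# ...while a different order gives a different one.
assert list_node_id([":A", ":B"]) != list_node_id([":B", ":A"])
```

The output is still an ordinary bnodeID, so a tool can do this today without any change to RDF itself.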

>  (This is closely related to RDF canonicalization, which becomes trivially easy without explicit blank nodes.)  Eliminating explicit blank nodes would mean that we'd lose the convenience of not having to allocate an IRI.  I think there are other ways to address that loss, but that's another topic.
> 
> > But that would depart from their current interpretation, and not
> > necessarily fit all use-cases,
> 
> Agreed, but it would fit the most common use cases.  It doesn't have to fit *all* use cases.
> 
>> so this is not something to decide lightly. This is the kind of semantic rabbit hole that Pat Hayes was warning about earlier in this thread (if I got his point correctly).
> 
> I hope Pat will correct me if I'm wrong, but my read of the discussion so far is that the semantics would not be a big problem: both arrays and composite objects can have very straight-forward -- and very similar -- semantics.  And it's clear to me at least that although the rdf:aboutEach functionality could be useful in some cases, it is not what we need as the basic array functionality.  The basic functionality that we need is for an assertion about an array to *only* be about that array -- not about every element in that array:
> 
>  ("apples" "bananas" "peaches") ex:length 3 .

If indeed that is all you want, Pat agrees this would be trivially straightforward. Pat is however very suspicious that this is in fact not all that people want, and that they will be writing things like 

:PatHayes :fatherOf (:SimonHayes :RobinHayes)

before the ink is dry on the specification document.
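
One way to picture the worry, as a hedged sketch: the same surface syntax admits at least two readings, and only the first matches the 'an assertion about an array is only about that array' semantics. Both function names are hypothetical and the tuple encoding is an assumption for illustration:

```python
def about_the_list(subject, predicate, items):
    """One triple whose object is the list itself, taken as a single thing."""
    return [(subject, predicate, tuple(items))]


def about_each_member(subject, predicate, items):
    """One triple per member, the reading a fatherOf author probably intends."""
    return [(subject, predicate, item) for item in items]


sons = [":SimonHayes", ":RobinHayes"]

print(about_the_list(":PatHayes", ":fatherOf", sons))
# [(':PatHayes', ':fatherOf', (':SimonHayes', ':RobinHayes'))]

print(about_each_member(":PatHayes", ":fatherOf", sons))
# [(':PatHayes', ':fatherOf', ':SimonHayes'), (':PatHayes', ':fatherOf', ':RobinHayes')]
```

Under the first reading the triple says Pat is the father of a list, which is presumably not what its author meant.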

> 
> And one other comment . . .
> 
> On 9/27/22 09:43, Pierre-Antoine Champin wrote:
> > If we can design other efficient design patterns for conveying order and
> > "closedness" (such as the one proposed above), I believe that the need
> > for representing lists would not be as pressing as suggested in this
> > thread.
> 
> Possibly.  But software developers have been using arrays for 60+ years, and they *expect* them.  So as a practical matter, I think the straight-forward solution is to add proper support for arrays, perhaps in a new higher-level RDF 2.0 syntax, to avoid breaking any existing RDF or tools.
> 
> As Manu Sporny put it, by not having proper array support in RDF, we're currently "giving developers a big fat middle finger in that area".

Both you and Manu miss the central point here. RDF is not intended to be a notation for software developers to create new structures with. It is not a developer toolkit. It was designed and intended to be an information exchange notation. As soon as you give it to developers and say, in effect, go ahead and play with this and build things with it, then all of its utility as an information exchange notation is lost, because whatever meaning one developer intends to express using the structures she develops will be opaque to any other developer and any user in a different development environment. 

There is a kind of universal understanding about what an RDF triple 'means': it says that a relation holds between two things, and it is therefore an assertion about a world consisting of these things with relations holding between them. This consensus of intended meaning is perhaps a bit shaky at the edges and under strain in some places, but still is kind of universally understood and accepted. But there is no such consensus AT ALL about what a list is supposed to mean, or what an array is supposed to mean, when used as part of an assertion about this common 'world' of things bound together by relations. And because there is no such consensus, as soon as these structures are given to developers to build with, what they build will have, at best, an idiosyncratic meaning private to the community in which the developer is working. At which point, the entire purpose of RDF is lost. Which is why any new syntactic extension to RDF should be given a semantics as part of its normative definition, and this should be one that is likely to survive the pressures of how developers are wanting to use the new structure. For example, if it is clear that some folk really want to use arrays to express n-ary relations, while others really want to use them to abbreviate conjunctions but with a closed-world assumption, then these two groups should be given distinct extensions to RDF syntax which will not be confused with each other, and each given a nice crisp semantics and tutorial examples, etc. 

Good luck, y'all.

Pat

> https://web.archive.org/web/20200119081859/https://manu.sporny.org/2014/json-ld-origins-2/
> 
> I think we should address that gap.
> 
> Thanks,
> David Booth
>
Received on Wednesday, 28 September 2022 05:42:49 UTC