Re: defining the semantics of lists

> On May 17, 2020, at 1:02 PM, David Booth <david@dbooth.org> wrote:
> 
> On 5/17/20 5:40 AM, thomas lörtsch wrote:
> > I'd like lists to be integrated into RDF first class because they are such an important and ubiquitous datastructure.
> +1
> 
> And we should call them "arrays", not "lists", as in most other data representations.  We got in the habit of calling them lists in RDF because historically they were represented as linked lists.  But there is no need for continuing that historical glitch if we add arrays as first-class objects in RDF.
> 
> Piecing them together as sets of triples using logic is really trying to use the wrong tool for the job.  They need to be first-class syntactic objects, so if one is malformed it is a syntax error -- not a logic problem.

I agree. But it would require (1) defining the exact syntax and semantics (or deliberate lack of semantics), (2) writing a standard document defining this extension, (3) getting it through the W3C process, and (4) re-tooling every RDF parser, and possibly every RDF inference engine, in existence.

Regarding #1, consider a triple using such a list/array/sequence in the object position:

:S :R [:o1, :o2, :o3] .

Does this mean that R holds between S and the list? Or that it holds between S and each element of the list (so this is just a handy abbreviation for three triples with IRI objects)? Or something else? Or can it be used in all these ways and perhaps others as well? In this last, ‘flexible’ option, arrays have no built-in RDF semantics, but then we need a way to tell readers (including inference engines) how a particular case is intended to be understood. For example, we might have special classes of properties telling us how they are supposed to behave when applied to arrays.
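
To make the difference between the first two readings concrete, here is a rough sketch in plain Python, with triples as tuples of strings. Everything in it is an assumption made for illustration: the function names, the "_:b0" style blank node labels, and the PROPERTY_MODE table (a stand-in for the "special classes of properties" idea) are hypothetical and not part of any RDF specification or library.

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF_NS + "first"
RDF_REST = RDF_NS + "rest"
RDF_NIL = RDF_NS + "nil"


def as_collection(s, p, items):
    """Reading 1: p relates s to the list itself.

    The list is encoded the way RDF collections are today, as a chain of
    rdf:first / rdf:rest triples ending in rdf:nil. The blank node labels
    are naive and would collide if this were called twice on one graph.
    """
    if not items:
        return [(s, p, RDF_NIL)]
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = [(s, p, nodes[0])]
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, item))
        triples.append((nodes[i], RDF_REST, rest))
    return triples


def as_distributed(s, p, items):
    """Reading 2: the array abbreviates one triple per element.

    Returned as a set, since an RDF graph is a set of triples: order and
    repeated elements are not preserved under this reading.
    """
    return {(s, p, item) for item in items}


# Hypothetical stand-in for "special classes of properties": a table that
# says how each property behaves when its object is an array.
PROPERTY_MODE = {":R": "distribute"}


def expand(s, p, items):
    """The 'flexible' option: let the property decide which reading applies."""
    if PROPERTY_MODE.get(p, "collection") == "distribute":
        return as_distributed(s, p, items)
    return set(as_collection(s, p, items))


collection = as_collection(":S", ":R", [":o1", ":o2", ":o3"])
distributed = as_distributed(":S", ":R", [":o1", ":o2", ":o3"])
```

Under the first reading the three-element array yields seven triples and keeps its order; under the second it yields three triples and the order is gone, so the two readings do not even agree on what information the array carries.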

And what would this mean?

 [:s1, :s2, :s3] :R :O .

What if both subject and object are arrays?
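
Even if we settle on the "abbreviation" reading, the case with arrays on both sides is still ambiguous. The sketch below shows two readings one could imagine, again in plain Python with triples as tuples of strings. The function names and both readings are assumptions for illustration only; neither is a proposal and neither comes from any specification.

```python
from itertools import product


def cross_product_reading(subjects, p, objects):
    """Every subject is related to every object (n * m triples)."""
    return {(s, p, o) for s, o in product(subjects, objects)}


def pairwise_reading(subjects, p, objects):
    """The i-th subject is related to the i-th object (n triples).

    This reading only makes sense when both arrays have the same length.
    """
    if len(subjects) != len(objects):
        raise ValueError("pairwise reading needs arrays of equal length")
    return {(s, p, o) for s, o in zip(subjects, objects)}


subjects = [":s1", ":s2", ":s3"]
objects = [":o1", ":o2", ":o3"]
crossed = cross_product_reading(subjects, ":R", objects)
paired = pairwise_reading(subjects, ":R", objects)
```

With three elements on each side the first reading gives nine triples and the second gives three, and a reader cannot tell from the syntax alone which was meant.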

Related question: if arrays are first-class entities, they can have names. If an IRI refers to – identifies, is the name of – an array, how do we understand a triple which uses that name to mention (instead of use) an array?  

Can an array contain blank nodes? Can one substitute an array for a blank node in the RDF instantiation process? How does this affect the basic entailment results (such as the RDF interpolation lemma)? 

It would be very helpful if people who want these things could tell us what they want to use them for. I suspect we will find a large variety of potential uses, so it will be impossible to standardize. This is what happened with RDF datasets: by the time they were considered for standardization, they were already in use in incompatible ways by people with skin in the game who were not willing to re-tool their commitments, so agreement on a clean semantics was already out of reach.

Pat

> 
> David Booth
> 

Received on Sunday, 17 May 2020 21:20:25 UTC