Re: defining the semantics of lists from Patrick J Hayes on 2020-05-15 (semantic-web@w3.org from May 2020)

From: Patrick J Hayes <phayes@ihmc.us>
Date: Thu, 14 May 2020 23:07:01 -0500
To: thomas lörtsch <tl@rat.io>
CC: Semantic Web <semantic-web@w3.org>
Message-ID: <377FF0B6-81F6-44A3-9E54-2EC5AE6E2961@ihmc.us>
Hi Thomas

Let me explain why the semantics of the RDF containers is the way it is. Several members of the RDF WG were surprised by this, but it kind of follows inevitably from other, more basic, design decisions of RDF. 

First, RDF is NOT designed to be a datastructure language: it is a descriptive language. It describes things. The semantics is entirely set up with this basic design decision in mind. And second, it describes things under an open-world assumption. That point (open-world vs. closed-world) was always controversial, but it was thrashed out very early in the design process and became a fundamental design choice, on the grounds that a Web-based description language can never assume that all the relevant data is known about some topic. So this means that given any piece of RDF, you can cut out some piece of it, or adjoin some more RDF to it, without anything breaking. 

So now, how could rdfx:ClosedSeq work? Presumably it would come along with a bunch of assertions about the first, second, third etc. elements of the seq, and maybe a way of saying that the one of them is the last item, so we might need rdfx:LastItemIn. Suppose however that we simply don’t have a triple that specifies the second element. Is this an error? Or just an incomplete description? If the latter, what if we omit the LastItemIn triple; then we don't know how long this seq is. Is that also an incomplete description (as the open-world assumption requires) or is it an error? What happens if we are told that A is the second item and also that B is the second item? Is that an error, or can we conclude that owl:sameAs A B ?  If we take the open-world choice in these cases then this is hardly distinguishable from that we have already. But if we say that incomplete or ‘excessive’ information is an error then we don’t really have an RDF graph, since the extra constraints amount to a fundamental change to the idea of graph syntax. A ‘legal’ RDF graph now is not just a set of triples: it has global constraints on what must be present or what is allowed to be present. This is of course possible, but it would change RDF fundamentally. 

Now, another way to go would be to say that RDF needs containers, but it doesn’t need to describe them using triples. We could just allow a new kind of construct as a node in a triple (in addition to IRIs, Bnodes and literals) and give it its own definition. We would have to invent new syntax to represent them, of course, which would break all known RDF engines, but maybe it would be worth it (?) Then sequences (etc) would be much more like conventional datastructures. Of course, the semantics would have to say something about these things, but not much. (For example, we might require that IRIs inside sequences denote the same thing as they do outside, basic things like that.) This would not break the open world assumption. 

Anyway, I hope this helps people think about what the issues are :-)

Best wishes

Pat


> On May 14, 2020, at 8:18 AM, thomas lörtsch <tl@rat.io> wrote:
> 
> I’m aware that the topic of lists in RDF can ingnite lively debate nearly as much as blank nodes so my apologies in advance. I have a very specific question and I don’t intend to discuss the use of lists in OWL, syntactic sugar in Turtle, querying in SPARQL or historic details about how some decisions came to be (although I do find all that very interesting, but another time...).
> 
> Lists from the RDF container vocabulary - rdf:Seq, rdf:Bag and rdf:Alt - can’t be closed whereas lists from the collection vocabulary - rdf:List - are always closed. Collections, in constrast to containers, are popularly considered to "have semantics" because of this closing characteristic. 
> The excruciatingly exact Lisp-style modelling of rdf:Lists through rdf:first/rest/nil properties does indeed leave no room for misunderstanding about the listiness and closedness of an rdf:List.
> What level of semantics could be provided for containers by explicitly defining either a closed container, e.g. rdfx:ClosedSeq, or an appropriate property, e.g. rdfx:hasLength, that implicitly closes a container? 
> 
> I reckon that semantics introduced per definition are always somewhat weaker than semantics that emenate naturally and unmistakably from a datastructure itself. However collections have a lot of disadvantages (that I promised above not to discuss) and I wonder how workable the semantics provided by defining an rdfx:ClosedSeq class or an rdfx:hasLength property would be. Would they be able to take some load, none at all, not enough, or almost the same as collections?
> 
> Thomas
Received on Friday, 15 May 2020 04:07:23 UTC