Re: Proposal to resolve ISSUE-102 (well-formed lists) from Pat Hayes on 2012-11-13 (public-rdf-wg@w3.org from November 2012)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 13 Nov 2012 02:15:33 -0800
To: Richard Cyganiak <richard@cyganiak.de>
Cc: RDF Working Group WG <public-rdf-wg@w3.org>
Message-Id: <64E32EF8-0AFB-49F8-9EC2-564841C56EF0@ihmc.us>

On Nov 12, 2012, at 9:05 AM, Richard Cyganiak wrote:

> Hi Pat,
> 
> On 12 Nov 2012, at 10:20, Pat Hayes wrote:
>> OK, after reading the email trail and thinking harder about this, some more comments.
>> 
>> 1. RDF only describes lists, it does not construct or process them. So there is no such thing as a "well-formed RDF list". What there might be is something like a well-formed RDF *description* of a list, but that raises a number of issues.
> 
> I think that RDF does not merely describe lists, but it expresses them, in the same way that it expresses triples and graphs.

Well, I hate to simply contradict you, but that's just plain not true, according to the current specifications. I think a lot of the problems we are having over this issue stem from this misunderstanding. 

> 
>> 2. What counts as a well-formed RDF description of a list? If this means the list which is described by the description is "well-formed" (which I take to mean, it is a LISP S-expression), then almost any description is well-formed. For example:
>> 
>> _:x rdf:first :a
>> _:x rdf:first :b
>> 
>> is true if :a owl:sameAs :b and _:x is the list (a), or indeed the list (a fred somethingelse); any list with a as its first element will do. 
> 
> The constraint that I have in mind is a constraint of the abstract syntax, not a constraint of the semantics.
> 
> To be precise: A well-formed list is either the node rdf:nil, or a blank node with exactly one incoming arc

Why exactly one? Not even LISP requires this. 

> and exactly two outgoing arcs, one of which is rdf:first, and the other is rdf:rest pointing to another well-formed list.

So these lists are "parts" of RDF graphs? They aren't actual graphs, since a single node can be a list. This is a whole new kind of entity in the RDF model. What do they describe? Themselves? Is this:

_:x rdf:first "1"^^xsd:number
_:x rdf:rest _:x1
_:x1 rdf:first "2"^^xsd:number
_:x1 rdf:rest rdf:nil

a list of literals? A list of numbers? A graph describing a list of numbers? A list describing a list of numbers? What does _:x denote? A list (in your sense)? Or a "semantic" list? What is the relationship between the list expressed and the list described? 
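
To make the proposed syntactic constraint concrete, here is a minimal sketch in Python of a checker for the definition quoted above, read literally. The representation of triples as tuples of strings, the name is_well_formed_list, and the ":s :p" triple that gives _:x its one incoming arc are all assumptions made for illustration; none of them comes from the specs or from the proposal.

```python
RDF_NIL = "rdf:nil"

def is_well_formed_list(graph, node, seen=None):
    """Check the quoted definition literally against a set of (s, p, o) triples."""
    seen = set() if seen is None else seen
    if node == RDF_NIL:
        return True
    # Must be a blank node, and must not be revisited (guards against cycles).
    if not node.startswith("_:") or node in seen:
        return False
    seen.add(node)
    incoming = [t for t in graph if t[2] == node]
    outgoing = [t for t in graph if t[0] == node]
    # "exactly one incoming arc and exactly two outgoing arcs"
    if len(incoming) != 1 or len(outgoing) != 2:
        return False
    arcs = {p: o for _, p, o in outgoing}
    # "one of which is rdf:first, and the other is rdf:rest"
    if set(arcs) != {"rdf:first", "rdf:rest"}:
        return False
    return is_well_formed_list(graph, arcs["rdf:rest"], seen)

# The four triples above, plus a hypothetical triple pointing at _:x.
example = {
    (":s", ":p", "_:x"),
    ("_:x", "rdf:first", '"1"^^xsd:number'),
    ("_:x", "rdf:rest", "_:x1"),
    ("_:x1", "rdf:first", '"2"^^xsd:number'),
    ("_:x1", "rdf:rest", "rdf:nil"),
}

# The earlier two-triple example, with the same hypothetical incoming arc.
two_firsts = {
    (":s", ":p", "_:x"),
    ("_:x", "rdf:first", ":a"),
    ("_:x", "rdf:first", ":b"),
}

print(is_well_formed_list(example, "_:x"))
print(is_well_formed_list(two_firsts, "_:x"))
```

Under this literal reading the four triples on their own would fail the check, since _:x then has no incoming arc at all. Note also that the checker only inspects the graph; it says nothing about what _:x denotes.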

> 
> So, the two triples you give would not be a well-formed list.
> 
> If :a sameAs :b, then the well-formed list (:a :fred :somethingelse) entails the well-formed list (:b :fred :somethingelse), as well as various non-well-formed lists, one of which you showed in your example.

So a well-formed list can correctly (validly) entail a non-well-formed list? That seems crazy to me. 
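
A small sketch of how that happens, in Python. The function substitute_equals is a crude, one-directional stand-in for owl:sameAs reasoning, and the blank node labels _:l1, _:l2, _:l3 are hypothetical; this is an illustration under those assumptions, not how any particular inference engine works.

```python
def substitute_equals(graph, a, b):
    """Return graph plus a copy of each triple with term a replaced by b.

    Only subjects and objects are rewritten, and only in the a-to-b
    direction (an assumption made to keep the sketch short).
    """
    def swap(term):
        return b if term == a else term
    return graph | {(swap(s), p, swap(o)) for s, p, o in graph}

# A description of the list (:a :fred :somethingelse).
graph = {
    ("_:l1", "rdf:first", ":a"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", ":fred"),
    ("_:l2", "rdf:rest", "_:l3"),
    ("_:l3", "rdf:first", ":somethingelse"),
    ("_:l3", "rdf:rest", "rdf:nil"),
}

# Given :a owl:sameAs :b, the entailed graph gains _:l1 rdf:first :b.
entailed = substitute_equals(graph, ":a", ":b")
out_arcs = sorted(t for t in entailed if t[0] == "_:l1")
print(len(out_arcs))  # 3 outgoing arcs, so "exactly two" no longer holds
```

The input satisfies the "exactly two outgoing arcs" condition at every node; the entailed graph does not, although every triple in it is a valid consequence.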

> 
>> 3. If the idea is to require that RDF graphs must give complete information about lists, so that every rdf:rest assertion is given all the way to rdf:nil, then this would break the basic entailment between graphs and subgraphs, so a strong -1 on that idea. In general, RDF is designed to allow partial information and to be open-world in its treatment of data, and this kind of a completeness assumption violates that basic design.
> 
> The idea is *not* that RDF graphs must give complete information about lists. The idea is that RDF graphs *SHOULD* give complete information about lists.

But why? You have not yet given any indication of why you feel that this is even desirable, let alone worth fixing in a standard.

> It doesn't break entailment between graphs and subgraphs; these entailments still hold in the case of well-formed lists, but they are highlighted as entailments that are not particularly useful, and therefore implementers of inference engines are discouraged from producing these entailments.

I don't agree that this is known certainly enough to be given out even as vague advice. The internal dynamics of inference engines are a topic best left to the implementors of those engines. 

> That's what the SHOULD means.
> 
> Allowing partial information and being open-world are great, of course. But there are cases where retaining only partial information means that all you have left is a useless carcass of triples.

I wish you would use less flowery language and stick to actual arguments. I can see many cases where all I might need to know about a single thing and a list is that the thing is in the list somewhere. That can be captured by a single triple, or at most two. I do not need to have all the rest of the list around in my subgraph of interest just in order to keep it "well-formed": the two that matter to me might be a carcass, but they are a useful carcass. 
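
For instance, here is a minimal Python sketch of pulling out just the triple that matters. The helper name subgraph_about and the blank node labels are hypothetical. The only fact relied on is that an RDF graph entails each of its subgraphs, which for triples held in a set is just the subset test.

```python
# A description of the list (:a :fred :somethingelse).
full = {
    ("_:l1", "rdf:first", ":a"),
    ("_:l1", "rdf:rest", "_:l2"),
    ("_:l2", "rdf:first", ":fred"),
    ("_:l2", "rdf:rest", "_:l3"),
    ("_:l3", "rdf:first", ":somethingelse"),
    ("_:l3", "rdf:rest", "rdf:nil"),
}

def subgraph_about(graph, value):
    """Keep only the rdf:first triples whose object is the given value."""
    return {t for t in graph if t[1] == "rdf:first" and t[2] == value}

# All I want to retain: :fred is the first element of some list cell.
carcass = subgraph_about(full, ":fred")
print(carcass)
print(carcass <= full)  # a subgraph, hence entailed by the full graph
```

The result is a single triple. It is not a "well-formed list" in the proposed sense, but it is exactly the information needed, and it follows from the full description.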

> Certainly the reason why we have the open-world assumption is not in order to allow inference engines to rip apart well-formed lists. I don't quite understand the motivation in defending their right to do so.

Have you ever written LISP code? It consists almost entirely of processes which "rip apart" S-expressions (and others which build them back up again, often in different ways). These are *data structures*, not precious relics to be preserved at all costs. 

> 
>> 4. If the idea is to encourage RDF authors to use the RDF list vocabulary in a certain obvious way, then make it part of a non-normative best practice guide. The current specs already say this, however, and even say that systems may require adherence to this and treat "ill-formed" descriptions as an error. So I don't see that any major change is needed.
> 
> The most relevant spec here is RDF Schema (as it defines the vocabulary in question), and it doesn't say anything to that effect. Neither does the Primer, which has the generally best description of the list construct in the current document set.
> 
> Yes, Semantics says useful things about lists, but it strikes me as the wrong place for this content, given that it imposes no semantic conditions whatsoever. It should just state that fact and move on. (Same for several other constructs that don't have any formal semantics associated.)

I'm happy with that. What we are arguing about is not where the information is placed in the documents, but what the information actually is. You want to change the very idea of "lists" in RDF to be a different notion than the one in the current specs. I disagree. 

Pat

> 
> Best,
> Richard
> 
> 
> 
>> 
>> Pat
>> 
>> 
>> 
>> On Nov 9, 2012, at 2:48 AM, Richard Cyganiak wrote:
>> 
>>> ISSUE-102: Shall we highlight Turtle's list structures as "Well-Formed Lists" in one of our Recs?
>>> http://www.w3.org/2011/rdf-wg/track/issues/102
>>> 
>>> 
>>> PROPOSAL: Define the concept “well-formed list” in detail in RDF Schema, including a nice diagram. State that any use of terms from the collections vocabulary SHOULD be part of a well-formed list. Update Semantics to remove discussion of collections in 3.3.3. Update Turtle and RDF/XML to refer to well-formed lists when introducing the respective syntax shorthands. Send an email to OWL WG comments list informing them of this and suggest that future versions of OWL do the same.
>>> 
>> 
>> ------------------------------------------------------------
>> IHMC                                     (850)434 8903 or (650)494 3973   
>> 40 South Alcaniz St.           (850)202 4416   office
>> Pensacola                            (850)202 4440   fax
>> FL 32502                              (850)291 0667   mobile
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Tuesday, 13 November 2012 10:16:08 UTC