Re: OWL and RDF lists from Thomas Lörtsch on 2022-09-04 (semantic-web@w3.org from September 2022)

From: Thomas Lörtsch <tl@rat.io>
Date: Sun, 4 Sep 2022 13:18:28 +0200
To: "Patrick J. Hayes" <phayes@ihmc.org>
Cc: Anthony Moretti <anthony.moretti@gmail.com>, David Booth <david@dbooth.org>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-Id: <B03366E8-169A-4194-9846-E1C54CA447E6@rat.io>
Hi Pat, Anthony,


I feel a bit bad because I opened this discussion without having the time to properly follow through. I was looking for advice rather than prepared to make more than a sketchy proposal. But I can add two or three hopefully meaningful bits:


LIST OBJECTS 

The main idea of my list object proposal is probably that a list *object* is opaque to RDF just like any IRI is, but can be fully described in RDF [0]. And since it is required to be syntactically valid RDF, a parser has little trouble deciphering it. The parser only has to be aware of the annotation syntax I made up in my example in [1] (the (@L :a :b :c ) syntax, which adds an annotation to the inside of the opening parenthesis, as I figured that might be easiest to parse).

So the list object itself
 (@L :a :b :c )
is opaque to RDF just like any IRI is.

However it is easily accessible to any RDF processor as it follows all of RDF’s syntactic rules. Therefore a description of this list object with an ordinary RDF list
 ( :a :b :c )
can quickly and easily be checked for accuracy at any time. This *list* that faithfully describes the *list object* can then serve all the purposes that any other ordinary RDF list can. It is a requirement on the object that this transformation is well-defined. The main difference from normal RDF lists is just that the object is known to be well-behaved in a certain way (complete, ordered, finite or whatever we define it to be).

[
Okay, on re-reading I see that there might be a bit missing: a definition of what it means that the processor can always check on the basis of the *list object* if what is said about the *list* is true. Now I should probably define if the processor is required to do that check and what it should do if the check returns 'false'. Well, that needs more thinking but my intuition is: such false statements should be rejected. The standard mode of operation would be that the processor works on the *list object*, not the ordinary list which is just its description. Sorry that my thoughts are still so unordered.
In a way we might say that the Open World Assumption still applies, we just automated the removal of false statements (w.r.t. list objects only, of course) that otherwise application logic or human intervention would enforce - as obviously a global decentralized and universally true information space is an unobtainable ideal.
]

So the assertion
 (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b :c )
is obviously true.

OTOH another possible assertion
 (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b :c :d )
is obviously false, as is 
 (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b )
- although both are of course legal per the open world semantics of RDF.
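
The check on these three assertions can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `ListObject`, `correctly_and_fully_described_by` and `reject_false_descriptions` are hypothetical, list items are plain strings, and dropping false descriptions follows the intuition voiced in the bracketed aside above rather than any settled rule.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class ListObject:
    """Hypothetical stand-in for (@L ...): complete, ordered, finite, immutable."""
    items: Tuple[str, ...]


def correctly_and_fully_described_by(obj: ListObject, rdf_list: Sequence[str]) -> bool:
    """True only if the ordinary list has exactly the object's items, in order."""
    return tuple(rdf_list) == obj.items


def reject_false_descriptions(
    obj: ListObject, candidates: Iterable[Sequence[str]]
) -> List[Sequence[str]]:
    """Keep only descriptions that pass the check (assumed policy: drop the rest)."""
    return [c for c in candidates if correctly_and_fully_described_by(obj, c)]


abc = ListObject((":a", ":b", ":c"))
print(correctly_and_fully_described_by(abc, [":a", ":b", ":c"]))        # True
print(correctly_and_fully_described_by(abc, [":a", ":b", ":c", ":d"]))  # False
print(correctly_and_fully_described_by(abc, [":a", ":b"]))              # False
```

The comparison is deliberately strict: a longer or shorter list fails, which is what "correctly and fully" seems to require.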

Adding an element to the list object, i.e.
 (@L :a :b :c :x )
creates a new object. So they are immutable? Should be - but I’m not entirely sure, see below under NAMING.

In general, an RDF processor that can handle list objects doesn’t need the standard RDF list but can work directly on the object; it can use it like a list structure in a programming language. There is no need for first-rest ladders or numbered members other than for backward compatibility.
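
For the backward-compatibility case, a list object could be expanded into the familiar rdf:first/rdf:rest chain on demand. The following Python sketch assumes that triples are plain string tuples and that blank node labels such as `_:n0` are made up locally; the function name `to_first_rest_ladder` is hypothetical.

```python
from typing import List, Sequence, Tuple

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
Triple = Tuple[str, str, str]


def to_first_rest_ladder(items: Sequence[str]) -> Tuple[str, List[Triple]]:
    """Return (head node, triples) encoding items as an rdf:first/rdf:rest chain."""
    if not items:
        # The empty list is denoted by rdf:nil and needs no triples.
        return RDF + "nil", []
    # Blank node labels _:n0, _:n1, ... are invented for this sketch.
    nodes = [f"_:n{i}" for i in range(len(items))]
    triples: List[Triple] = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples


head, triples = to_first_rest_ladder([":a", ":b", ":c"])
for triple in triples:
    print(triple)
```

A processor working directly on the list object would only need this expansion when talking to tools that expect ordinary RDF collections.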

Of course such lists are nestable. An interesting question is whether nested lists could also be normal lists that are governed by the usual open world assumption. I’m not sure if that is possible but it would of course be nice to retain that freedom. 

To answer Pat’s question whether the IRIs in the list object denote just like any IRI outside the list: a basic list object should certainly be defined that way, to blend in as seamlessly as possible with the rest of RDF. But we could also define other list objects with semantics in which list items are, e.g., referentially opaque. We will have to have this discussion w.r.t. named graphs (in the RDF 2.x sense, not in the Carroll et al. 2005 sense) at some point anyway, and there is no reason why we couldn’t apply the results to lists or other objects as well. Which brings me to the second point...


CONFIGURABLE OBJECTS

I agree with Anthony that we don’t need to constrain ourselves to lists, but I wanted to keep the example task simple to better understand the problem in principle. But a second step that goes from lists to other objects, e.g. defined as shapes in SHACL/ShEx, and defines a properly extensible syntax to describe them (something less clumsy than the @L above) would certainly be desirable.

We might of course also tackle the problem from the opposite direction and define a nesting syntax with configurable semantics that is applicable to any element of RDF - nodes, statements, graphs, what have you. Assuming the syntax of that new RDFsuper is curly braces we could have something like:
 {@UNA
  {@NUNA 
   :a :b {@CWA (@L:c :d :e), :f } .
   {@UNA :g} :h :i .
  } :x {@REFOP :y, :z }.
 }

where @UNA stands for Unique Name Assumption, @CWA for Closed World Assumption, @L for finite lists, @NUNA for No Unique Name Assumption (to override/lift that restriction on an inner element) and @REFOP for referential opacity. Also the annotations are made on nodes, statements and graphs alike. Anything is possible although not everything makes sense. Obviously I didn’t properly think all details of this example through. Nonetheless I’m not ruling out that such a radical approach has merit. One could still put a sticker on it saying "Use with care, and at your own risk". Generally I’m not a fan of premature restrictions (like no literals in subject position, no blank nodes as predicates).
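
One way to read the nesting is that each element inherits the annotations of its enclosing scopes, with inner annotations overriding conflicting outer ones. The Python sketch below illustrates that reading only; the rule table `OVERRIDES` and the function `effective_annotations` are hypothetical, and the assumption that only @UNA and @NUNA cancel each other is mine, not part of the example.

```python
from typing import FrozenSet, Iterable

# Hypothetical rule table: an annotation cancels these inherited annotations.
OVERRIDES = {"NUNA": {"UNA"}, "UNA": {"NUNA"}}


def effective_annotations(scope_path: Iterable[str]) -> FrozenSet[str]:
    """Fold annotations from outermost to innermost scope; inner ones win."""
    active: set = set()
    for annotation in scope_path:
        active -= OVERRIDES.get(annotation, set())
        active.add(annotation)
    return frozenset(active)


# Scopes enclosing the node :g in the example: {@UNA {@NUNA ... {@UNA :g}
print(sorted(effective_annotations(["UNA", "NUNA", "UNA"])))        # ['UNA']
# Scopes enclosing the list items: {@UNA {@NUNA ... {@CWA (@L :c :d :e)
print(sorted(effective_annotations(["UNA", "NUNA", "CWA", "L"])))  # ['CWA', 'L', 'NUNA']
```

Whether annotations such as @CWA or @REFOP should be inherited at all is exactly the kind of detail that would still need to be worked out.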


NAMING

Right now I am (or should be) bending my head over the discussions the RDF 1.1 WG had on graph naming and so I have an idea of how hard it is to define a sensible naming semantics. The usual problems with identity apply here just as anywhere else in RDF. 
A list asserted to be equal to an identifier, e.g.
 (@L :a :b :c ) owl:sameAs :ex:MyFirstListObject
will always be that list. Adding an :x creates a new list that is no longer owl:sameAs :ex:MyFirstListObject. Otherwise the assertion 
 (@L :a :b :c ) owl:sameAs (@L :a :b :c :x) 
would result, which is clearly false (and which a list object aware RDF processor can easily spot and report as an error).
So does defining a proper name require extra syntax? Probably...
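
The immutability intuition can be illustrated with Python tuples, which cannot be changed in place. This is a sketch under assumptions: a tuple of strings stands in for a list object, and the helper names `append_item` and `same_as_is_consistent` are hypothetical.

```python
from typing import Tuple

ListObject = Tuple[str, ...]  # assumption: an immutable tuple stands in for (@L ...)


def append_item(obj: ListObject, item: str) -> ListObject:
    """Return a new list object; the original is left untouched."""
    return obj + (item,)


def same_as_is_consistent(left: ListObject, right: ListObject) -> bool:
    """An owl:sameAs claim between two list objects can only hold if they are equal."""
    return left == right


my_first_list_object = (":a", ":b", ":c")
extended = append_item(my_first_list_object, ":x")

print(my_first_list_object)                                   # (':a', ':b', ':c')
print(same_as_is_consistent(my_first_list_object, extended))  # False
```

Because the name stays bound to the original value, a processor could flag any owl:sameAs statement that equates it with the extended list.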


Best,
Thomas


[0] I found that idea (later) in the archives: Graham Klyne had mentioned a similar approach in https://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Mar/0267.html but it seems that nobody answered. So maybe it is not stupid but still has some obvious flaw?
[1] https://lists.w3.org/Archives/Public/semantic-web/2022Aug/0051.html


> On 04.09.2022 at 09:11, Patrick J. Hayes <phayes@ihmc.org> wrote:
> 
> Hi Anthony
> 
> OK, composite value types. Along the lines of my previous email, if we introduce these into RDF syntax then we have to give them a semantics (or openly declare that they have no semantics, which IMO would be a serious mistake). So what DO they mean? Can you explain?
> 
> The truth conditions for the triple 
> 
>> :JoeBiden :hasAddress {@type: :Address, streetNumber: 1600, street: "Pennsylvania Avenue NW"}
> 
> are that it is true in I just when IEXT(I(:hasAddress)) contains the pair <I(:JoeBiden), I({@type:….})>, so we need to be able to say what the value of the interpretation mapping is when applied to a composite value type. Is this related in any way at all to the denotations of IRIs inside that expression? Is it a new kind of object, distinct from other things that the RDF describes? 
> 
> If we want to have semantic constraints like 2/4=1/2 for fractions, we will need to have some kind of semantics at least for the 'recognized' composite types. In fact these look to me very like a generalization of datatyped literals, where the datatype might take several strings instead of just one. That would be a useful generalization in any case, and an easy extension to RDF without materially changing anything in the basic syntax.
> 
> What slightly bothers me is that things like the :hasAddress case look to me like a shorthand for a bunch of triples with the same subject, itself being the encoding of an n-ary relation (in this case n=3: Joe, a number and a street name) whereas things like fractions and coordinates (and complex numbers, dates+times etc) seem more like one triple with a complex datatyped literal as the object. Which makes me suspect that there are two different semantics being applied to one syntax, which is almost always a bad idea. But perhaps I am worrying too much :-)
> 
> Pat
> PS, In practice, addresses are way more complicated. See https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/

Thanks! I was looking for something like that recently but got sidetracked :-)
> 
>> On Sep 2, 2022, at 1:51 AM, Anthony Moretti <anthony.moretti@gmail.com> wrote:
>> 
>> I think it might help to first break the problem into parts, and then the semantics question can be asked about each part separately.
>> 
>> A holistic approach to adding a collection syntax might consist of:
>>  1. Adding syntax for composite value types in the subject or object positions, which I've previously argued is missing from RDF.
>>  2. Adding syntax for extensionally defined collections, which can be thought of as simply special cases of composite value types, in the subject or object positions.
>>  3. Adding syntax for associating an IRI with a composite value type.
>> Composite value types are useful for things like:
>>  • Addresses
>>  • Coordinates
>>  • Polygons
>>  • Fractions
>> The components of the value are enough to uniquely identify the value, so an IRI is optional, and whether two values are equal or not depends upon a comparison operation that can be defined canonically for each type (for example, if we're talking about Fractions then 1/2 == 2/4 should be true).
>> 
>> An example of (1) might be something like:
>> 
>> :JoeBiden :hasAddress {@type: :Address, streetNumber: 1600, street: "Pennsylvania Avenue NW"}
>> 
>> Once you have that, collections are a special case, so an example of (2) might be something like:
>> 
>> :MichaelJordan :jerseyNumbers @set[23, 45]
>> 
>> With a full composite value type, if there wasn't a special collection syntax, it might look something like:
>> 
>> :MichaelJordan :jerseyNumbers {@type: :Set, 1: 23, 2: 45}
>> 
>> In my opinion, composite value types are closed atomic concepts, they only make sense as a whole, therefore for collections described in the manner of (2), no entailment should occur across the boundary of the collection and there shouldn't be any semantics, and therefore the collection is also fully defined and closed.
>> 
>> Examples of (3) might be something like:
>> 
>> :s: @set[:b, :c, :d]
>> :l: @list[:b, :c, :d]
>> :cs: @closedSet[:b, :c, :d]
>> :cl: @closedList[:b, :c, :d]
>> 
>> In my opinion, for collections described in this way no such boundary exists, entailments can be made across the boundary of the collection to the members, but they can depend upon other things like the class that the collection itself belongs to etc. It's still definitely not the case that all things said about a collection should automatically be said about each member though.
>> 
>> Anthony
>> 
>> On Thu, Sep 1, 2022 at 11:52 PM Patrick J. Hayes <phayes@ihmc.org> wrote:
>> 
>> 
>> > On Aug 16, 2022, at 11:20 AM, David Booth <david@dbooth.org> wrote:
>> > 
>> > On 8/16/22 12:56, Holger Knublauch wrote:
>> >> A next generation of RDF(-star) can hopefully get rid of rdf:Lists through reification with an index property.
>> > 
>> > That sounds surprising, given the widespread dislike of reification.  Do you have a pointer to an explanation?
>> > 
>> > Incidentally, I personally think RDF should natively support lists, which David Wood and James Leigh proposed at the 2009 W3C RDF Next Steps workshop:
>> > https://www.w3.org/2009/12/rdf-ws/papers/ws14
>> 
>> OK, that proposes to make lists part of the RDF /syntax/. So a triple with a list as subject is now a syntactically legal RDF expression:
>> 
>> [:a :b :c] :p :o .
>> 
>> and this is now syntactically legal RDF. 
>> 
>> That was easy, but now this ball lands in my court as semantics editor. If this is legal RDF syntax, it has to mean something, has to have a semantics. What does it mean? How is it to be interpreted? Or, if you don't like the S-word, what does this new RDF 'listriple' entail? What can be inferred from it, and what is it inferrable from? For example, is this valid:
>> 
>> from 
>> :a :p :o .
>> :b :p :o .
>> :c :p :o .
>> infer
>> [:a :b :c] :p :o .
>> 
>> ? Or the reverse? How about
>> 
>> from 
>> [:a :b :c] :p :o .
>> infer 
>> [:a :c :b] :p :o .
>> 
>> No? Because if you have the first one in both directions you can prove that this one is valid as well. And what if you have a list in both subject and object positions? Etc. 
>> 
>> Note, it is not enough to answer "sometimes" or "it depends". That kind of woolliness destroys the utility of RDF as an information exchange notation. If the answer depends on something else, then that something else has to also be somehow encoded into the RDF syntax, in order to disambiguate the list notation.
>> 
>> The RDF WG could never agree on what the semantics of list syntax might be. And - a private observation of my own - the people who most wanted to put lists into the syntax were usually the same ones who insisted that they could not, or should not, be given a semantics. I suspect that this is because those folk are slipping into thinking of RDF as a kind of programming language rather than a descriptive logical language. 
>> 
>> Now, one can take a completely different view of lists (and other things like lists), which is that rather than having them as part of the syntax of RDF they should be some of the things that RDF describes, just as it is used to describe wine, consumer goods, animal species and everything else in the wonderful world of Wikipedia. Then they don't have to be given an RDF semantics because they are in the RDF semantics universe along with everything else. And that is what the WG decided to do. But given the relative poverty of RDF as a descriptive language, it's impossible to put very tight constraints on what lists look like, so one gets the oddities described in https://www.w3.org/TR/rdf11-mt/#rdf-collections. 
>> 
>> OK, just a small growl from the dugout. Anyone who advocates having lists in RDF syntax, please don't speak until you have at least a sketch of what they are supposed to mean, that you are willing to commit to.
>> 
>> Pat Hayes
>> 
>> 
>
Received on Sunday, 4 September 2022 11:18:56 UTC