Re: OWL and RDF lists from Eric Prud'hommeaux on 2022-09-06 (semantic-web@w3.org from September 2022)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 6 Sep 2022 13:41:05 +0200
To: Anthony Moretti <anthony.moretti@gmail.com>
Cc: Thomas Lörtsch <tl@rat.io>, "Patrick J. Hayes" <phayes@ihmc.org>, David Booth <david@dbooth.org>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <Yxcx0TK0aFnlw9E6@w3.org>
I'm mostly interested in the OWL hack; we pay a heavy usability price
because the rules say you can't reason over rdf:* rather than limiting
the inferences *of* lists (e.g. no rdf:first/rest in rule heads). That
said, I think that these amount to the same discussion because the
rules you need to impose to make OWL tractable over RDF Lists seem to
be that, for inference purposes, they can't be composed by inferring
triples.


On Mon, Sep 05, 2022 at 12:13:32AM +0700, Anthony Moretti wrote:
> Hi Pat
> 
> > In fact these look to me very like a generalization of datatyped literals,
> > where the datatype might take several strings instead of just one. That
> > would be a useful generalization in any case, and an easy extension to RDF
> > without materially changing anything in the basic syntax.
> >
> 
> Yes, a generalization of datatyped literals, but in the same manner as JSON
> objects they have to take more than just strings (they have to take
> numbers, booleans, RDF literals, other composite values, etc.).
> 
> In a way, they're similar to blank nodes in Turtle, the differences being:
> 
>    - They don't expand to a blank node ID and sets of triples.
>    - They always have a "type" key, and the type is a Datatype.
> 
> They're also similar to JSON-LD value objects, but the value can be a
> composite.

I like the JSON Array object analogy and think we can learn a bunch of
things from it. I'm afraid the other analogies will distract us from
what's handled as a primitive type in most models.

Things to learn from e.g. JSON (and I presume ASN.1, but just try
looking for the abstract syntax for something called "Abstract
Syntax"...):

1. list is a (value) datatype that can contain other (value) datatypes.

   JSON is defined syntactically (mild PITA), but the abstract syntax
   appears to be:
     value: object | array | number | string | boolean | null
     object: mapping of string -> value
     array: sequence of value

   Stuffing "sequence of" into RDF's AS:
     triple: subject, predicate, object
     subject: IRI | BNode | NewList
     predicate: IRI
     object: IRI | BNode | Literal | NewList
     NewList: sequence of (IRI | BNode | Literal | NewList)
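
   A minimal Python sketch of that extended abstract syntax, assuming
   frozen dataclasses stand in for RDF terms. The class names (IRI,
   BNode, Literal, NewList, Triple) simply mirror the grammar above
   and are not taken from any RDF library.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class NewList:
    # "sequence of (IRI | BNode | Literal | NewList)"; a tuple keeps it
    # immutable and hashable, so a list is a value like any other term.
    items: Tuple["Term", ...] = ()


Term = Union[IRI, BNode, Literal, NewList]


@dataclass(frozen=True)
class Triple:
    subject: Union[IRI, BNode, NewList]  # no Literal in subject position
    predicate: IRI
    object: Term


# Hypothetical example data: a list nested inside another list,
# used in both subject and object position.
abc = NewList((Literal("a"), Literal("b"), Literal("c")))
nested = NewList((abc, IRI("http://example.org/x")))
t = Triple(nested, IRI("http://example.org/p"), abc)
```

   Because every class is frozen, two NewList values with the same
   members in the same order compare equal and hash alike, which is
   the "value datatype" behaviour point 1 asks for.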


2. lists can't be constructed from other primitives.

   There's no JSON string that means the same as `["a", "b", "c"]`.
   Javascript has some functions that might imply to you that it's the
   same as `{"1": "a", "2": "b", "3": "c"}` but those are merely API
   features.

   In RDF, we already have serializations that parse `("a" "b" "c")`
   to a first/rest ladder:
     _:a rdf:first "a" . _:a rdf:rest _:b .
     _:b rdf:first "b" . _:b rdf:rest _:c .
     _:c rdf:first "c" . _:c rdf:rest rdf:nil .
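
   A rough Python sketch of that expansion, assuming triples are plain
   (subject, predicate, object) tuples of strings. The function name
   to_ladder and the _:bN label scheme are made up for illustration;
   real parsers allocate blank nodes however they like.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def to_ladder(items):
    """Return (head, triples) for a first/rest ladder encoding `items`."""
    if not items:
        # The empty list is just rdf:nil, with no triples at all.
        return RDF_NIL, []
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        # Each cell points at the next cell; the last one points at rdf:nil.
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((node, RDF_FIRST, item))
        triples.append((node, RDF_REST, rest))
    return nodes[0], triples


head, triples = to_ladder(['"a"', '"b"', '"c"'])
for s, p, o in triples:
    print(s, p, o, ".")
```

   Three members cost six triples and three blank nodes, which is the
   overhead a native list would avoid.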

   People use them in SPARQL queries and graph APIs (but not OWL,
   except as syntactic constructs for axioms). It would be nice to
   rewind the clock and say those never existed, but we might be able
   to engineer our way to a point where the distinction between having
   first/rest ladders and having native lists is undetectable using
   RDF standards.

   It would be tempting to add List constructors to OWL, but that
   would require care that you interpret the entire list as a
   conjunction (i.e. you don't infer things on partial lists).


3. lists are lists, not sets or bags.

   `[1 2]` is arguably equivalent to `[1 2]`, but not `[2 1]`. I'm
   fine with not having bags, alts or seqs as primitive types. SPARQL
   and OWL primitives could provide some normalization for when the
   app developer knows they are sets.
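
   A small Python sketch of the distinction, assuming tuples stand in
   for native lists and frozensets for the normalization an app
   developer might opt into. as_list and as_set are hypothetical
   helper names, not SPARQL or OWL functions.

```python
def as_list(items):
    """Order-preserving value: equality depends on position."""
    return tuple(items)


def as_set(items):
    """Opt-in normalization: order and duplicates are discarded."""
    return frozenset(items)


same_order = as_list([1, 2]) == as_list([1, 2])      # lists with equal members in equal order
swapped = as_list([1, 2]) == as_list([2, 1])         # same members, different order
swapped_as_sets = as_set([1, 2]) == as_set([2, 1])   # compared only after normalizing
```

   Here same_order and swapped_as_sets are true while swapped is
   false: the list itself never forgets order, and set semantics only
   appear when the caller asks for them.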


I'm less motivated than Anthony by the idea of modeling n-ary
structures (Addresses) with lists; else we slip into slot names, etc.


How to morph first/rest ladders into native lists (beginning ideas):

This would be a long row to hoe, but it might be possible. For my
purposes, I'm happy to restrict inference of lists enough to keep OWL
tractable.

1. query - add lists to SPARQL results formats

2. linting - you're not allowed to construct bogus or non-terminated lists.

3. Add native List accessor functions to SPARQL. Conventional SPARQL
   rule bodies which match lists are mapped to native functions and
   the old first/rest syntax is treated as legacy.

4. something like the SPARQL adaptation but for OWL.
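
A minimal Python sketch of the linting step (2) combined with the
ladder-to-native mapping, assuming triples are (subject, predicate,
object) string tuples. The names collapse and BogusList are made up
for illustration, and the checks shown (exactly one rdf:first and one
rdf:rest per cell, no cycles) are only one guess at what "bogus" would
have to mean.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


class BogusList(ValueError):
    """Raised when a first/rest ladder is not a well-formed list."""


def collapse(head, triples):
    """Walk the ladder starting at `head` and return its members in order."""
    firsts, rests = {}, {}
    for s, p, o in triples:
        if p == RDF_FIRST:
            firsts.setdefault(s, []).append(o)
        elif p == RDF_REST:
            rests.setdefault(s, []).append(o)

    members, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise BogusList(f"cycle at {node}")
        seen.add(node)
        f, r = firsts.get(node, []), rests.get(node, [])
        # A missing rdf:rest means the list never terminates; extra
        # rdf:first or rdf:rest arcs make the list ambiguous.
        if len(f) != 1 or len(r) != 1:
            raise BogusList(f"{node} has {len(f)} rdf:first and {len(r)} rdf:rest")
        members.append(f[0])
        node = r[0]
    return members


good = [
    ("_:a", RDF_FIRST, '"a"'), ("_:a", RDF_REST, "_:b"),
    ("_:b", RDF_FIRST, '"b"'), ("_:b", RDF_REST, "_:c"),
    ("_:c", RDF_FIRST, '"c"'), ("_:c", RDF_REST, RDF_NIL),
]
native = collapse("_:a", good)

# Dropping the final rdf:rest arc leaves a non-terminated list.
try:
    collapse("_:a", good[:-1])
    rejected = False
except BogusList:
    rejected = True
```

A store that ran something like this on ingest could hand out the
native value and keep the ladder only for legacy consumers.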


> Once you have composite value types, value collections fall out as a
> special case. I say value collections because composite value types
> shouldn't contain any IRIs, I'll try to explain my view:
> 
> You can have a datatype hierarchy (I say hierarchy, but there are no
> subtypes):
> 
>     *Values* (don't have IDs)
>         Addresses
>         Coordinates
>         Floats
>         Ints
>         Lists
>         Sets
>         Strings
>         etc.
> 
> But it's entirely valid to "reify" any value by associating it with an ID,
> so you then have a corresponding class hierarchy:
> 
>     *Things* (have IDs)
>         Addresses
>         Coordinates
>         Floats
>         Ints
>         Lists
>         Sets
>         Strings
>         etc.
> 
> To be able to reason properly over any model, value types should only
> contain value types. That means if you want to have a collection that
> contains IRIs you need to use a collection from the second hierarchy, so a
> collection with an associated ID, so something like:
> 
>     :a: @list[:b, :c, :d]
> 
> Then you could use :a in other statements:
> 
>     :e :f :a
> 
> Or put it all in one line:
> 
>     :e :f {a: @list[:b, :c, :d]}
> 
> If it was just a collection of values you could use a collection from the
> first hierarchy, so a collection with no associated ID:
> 
>     :MichaelJordan :jerseyNumbers @set[23, 45]
> 
> Anthony
> 
> On Sun, Sep 4, 2022 at 6:18 PM Thomas Lörtsch <tl@rat.io> wrote:
> 
> > Hi Pat, Anthony,
> >
> >
> > I feel a bit bad because I opened this discussion without having the time
> > to properly follow through. I was rather looking for advice than prepared
> > to make a more than sketchy proposal. But I can add two or three hopefully
> > meaningful bits:
> >
> >
> > LIST OBJECTS
> >
> > The main idea of my list object proposal probably is that a list *object*
> > is opaque to RDF just like any IRI is, but it can be fully described in RDF
> > [0]. And since it is required to be syntactically valid RDF a parser has
> > little trouble to decipher it. The parser only has to be aware of that
> > annotation syntax I made up in my example in [1] (that (@L :a :b :c )
> > syntax that adds an annotation to the inside of the opening parenthesis as
> > I figured that might be easiest to parse).
> >
> > So the list object itself
> >         (@L :a :b :c )
> > is opaque to RDF just like any IRI is.
> >
> > However it is easily accessible to any RDF processor as it follows all of
> > RDF’s syntactic rules. Therefore a description of this list object with an
> > ordinary RDF list
> >         ( :a :b :c )
> > can quickly and easily be checked for accuracy anytime. This *list* that
> > faithfully describes the *list object* can then serve all intents and
> > purposes that any other, ordinary RDF lists can. It is a requirement to the
> > object that such transformation is well-defined. The main difference to
> > normal RDF lists is just that the object is known to be well-behaved in a
> > certain way (complete, ordered, finite or whatever we define it to be).
> >
> > [
> > Okay, on re-reading I see that there might be a bit missing: a definition
> > of what it means that the processor can always check on the basis of the
> > *list object* if what is said about the *list* is true. Now I should
> > probably define if the processor is required to do that check and what it
> > should do if the check returns 'false'. Well, that needs more thinking but
> > my intuition is: such false statements should be rejected. The standard
> > mode of operation would be that the processor works on the *list object*,
> > not the ordinary list which is just its description. Sorry that my thoughts
> > are still so unordered.
> > In a way we might say that the Open World Assumption still applies, we
> > just automated the removal of false statements (w.r.t. list objects only,
> > of course) that otherwise application logic or human intervention would
> > enforce - as obviously a global decentralized and universally true
> > information space is an unobtainable ideal.
> > ]
> >
> > So the assertion
> >         (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b :c )
> > is obviously true.
> >
> > OTOH another possible assertion
> >         (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b :c :d )
> > is obviously false, as is
> >         (@L :a :b :c ) ex:correctlyAndFullyDescribedBy ( :a :b )
> > - although both are of course legal per the open world semantics of RDF.
> >
> > Adding an element to the list object, i.e.
> >         (@L :a :b :c :x )
> > creates a new object. So they are immutable? Should be - but I’m not
> > entirely sure, see below under NAMING.
> >
> > In general an RDF processor that can handle list objects doesn’t need the
> > standard RDF list but can work directly on the object, it can use it like a
> > list structure in a programming language. There is no need for
> > first-rest-ladders or numbered members other than for backward
> > compatibility.
> >
> > Of course such lists are nestable. An interesting question is if nested
> > lists could also be normal lists that are ruled by the usual open world
> > assumption. I’m not sure if that is possible but it would of course be nice
> > to retain that freedom.
> >
> > To answer Pat’s question if the IRIs in the list object denote just like
> > any IRI outside the list: a basic list object should certainly be defined
> > that way to blend in as seamlessly as possible with the rest of RDF. But
> > we could also define other list objects with semantics in which list items
> > are i.e. referentially opaque. We will have to have this discussion w.r.t.
> > named graphs (in the RDF 2.x sense, not in the Carroll et al 2005 sense) at
> > some point anyway and there is no reason why we couldn’t employ the results
> > to lists or other objects as well. Which brings me to the second point...
> >
> >
> > CONFIGURABLE OBJECTS
> >
> > I agree with Anthony that we don’t need to constrain ourselves to lists
> > but I wanted to keep the example task simple to better understand the
> > problem in principle. But a second step that goes from lists to other
> > objects, i.e. defined as shapes in Shacl/Shex, and defines a properly
> > extensible syntax to describe them (something less clumsy than the @L
> > above) would certainly be desirable.
> >
> > We might of course also tackle the problem from the opposite direction and
> > define a nesting syntax with configurable semantics that is applicable to
> > any element of RDF - nodes, statements, graphs, what have you. Assuming the
> > syntax of that new RDFsuper is curly braces we could have something like:
> >         {@UNA
> >                 {@NUNA
> >                         :a :b {@CWA (@L:c :d :e), :f } .
> >                         {@UNA :g} :h :i .
> >                 } :x {@REFOP :y, :z }.
> >         }
> >
> > where @UNA stands for Unique Name Assumption, @CWA for Closed World
> > Assumption, @L for finite lists, @NUNA for No Unique Name Assumption (to
> > override/lift that restriction on an inner element) and @REFOP for
> > referential opacity. Also the annotations are made on nodes, statements and
> > graphs alike. Anything is possible although not everything makes sense.
> > Obviously I didn’t properly think all details of this example through.
> > Nonetheless I’m not ruling out that such a radical approach has merit. One
> > could still put a sticker on it saying "Use with care, and at your own
> > risk". Generally I’m not a fan of premature restrictions (like no literals
> > in subject position, no blank nodes as predicates).
> >
> >
> > NAMING
> >
> > Right now I am (or should be) bending my head over the discussions the RDF
> > 1.1 WG had on graph naming and so I have an idea of how hard it is to
> > define a sensible naming semantics. The usual problems with identity apply
> > here just as anywhere else in RDF.
> > A list asserted to be equal to an identifier, i.e.
> >         (@L :a :b :c ) owl:sameAs :ex:MyFirstListObject
> > will always be that list. Adding an :x creates a new list that is no
> > longer owl:sameAs :ex:MyFirstListObject. Otherwise the assertion
> >         (@L :a :b :c ) owl:sameAs (@L :a :b :c :x)
> > would result, which is clearly false (and which a list object aware RDF
> > processor can easily spot and report as an error).
> > So does defining a proper name require extra syntax? Probably...
> >
> >
> > Best,
> > Thomas
> >
> >
> > [0] I found that idea (later) in the archives: Graham Klyne had mentioned
> > a similar approach in
> > https://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Mar/0267.html but
> > it seems that nobody answered. So maybe it is not stupid but still has some
> > obvious flaw?
> > [1] https://lists.w3.org/Archives/Public/semantic-web/2022Aug/0051.html
> >
> >
> > > Am 04.09.2022 um 09:11 schrieb Patrick J. Hayes <phayes@ihmc.org>:
> > >
> > > Hi Anthony
> > >
> > > OK, composite value types. Along the lines of my previous email, if we
> > introduce these into RDF syntax then we have to give them a semantics (or
> > openly declare that they have no semantics, which IMO would be a serious
> > mistake). So what DO they mean? Can you explain?
> > >
> > > The truth conditions for the triple
> > >
> > >> :JoeBiden :hasAddress {@type: :Address, streetNumber: 1600, street:
> > "Pennsylvania Avenue NW"}
> > >
> > > are that it is true in I just when IEXT(I(:hasAddress)) contains the
> > pair <I(:JoeBiden), I({@type:….})>, so we need to be able to say what the
> > value of the interpretation mapping is when applied to a composite value
> > type. Is this related in any way at all to the denotations of IRIs inside
> > that expression? Is it a new kind of object, distinct from other things
> > that the RDF describes?
> > >
> > > If we want to have semantic constraints like 2/4=1/2 for fractions, we
> > will need to have some kind of semantics at least for the 'recognized'
> > composite types. In fact these look to me very like a generalization of
> > datatyped literals, where the datatype might take several strings instead
> > of just one. That would be a useful generalization in any case, and an easy
> > extension to RDF without materially changing anything in the basic syntax.
> > >
> > > What slightly bothers me is that things like the :hasAddress case look
> > to me like a shorthand for a bunch of triples with the same subject, itself
> > being the encoding of a n-ary relation (in this case n=3: Joe, a number and
> > a street name) whereas things like fractions and coordinates (and complex
> > numbers, dates+times etc) seem more like one triple with a complex
> > datatyped literal as the object. Which makes me suspect that there are two
> > different semantics being applied to one syntax, which is almost always a
> > bad idea. But perhaps I am worrying too much :-)
> > >
> > > Pat
> > > PS, In practice, addresses are way more complicated. See
> > https://www.mjt.me.uk/posts/falsehoods-programmers-believe-about-addresses/
> >
> > Thanks! I was looking for something like that recently but got sidetracked
> > :-)
> > >
> > >> On Sep 2, 2022, at 1:51 AM, Anthony Moretti <anthony.moretti@gmail.com>
> > wrote:
> > >>
> > >> I think it might help to first break the problem into parts, and then
> > the semantics question can be asked about each part separately.
> > >>
> > >> A holistic approach to adding a collection syntax might consist of:
> > >>      • Adding syntax for composite value types in the subject or object
> > positions, which I've previously argued is missing from RDF.
> > >>      • Adding syntax for extensionally defined collections—which can be
> > thought of as simply special cases of composite value types—in the subject
> > or object positions.
> > >>      • Adding syntax for associating an IRI with a composite value type.
> > >> Composite value types are useful for things like:
> > >>      • Addresses
> > >>      • Coordinates
> > >>      • Polygons
> > >>      • Fractions
> > >> The components of the value are enough to uniquely identify the value,
> > so an IRI is optional, and whether two values are equal or not depends upon
> > a comparison operation that can be defined canonically for each type (for
> > example, if we're talking about Fractions then 1/2 == 2/4 should be true).
> > >>
> > >> An example of (1) might be something like:
> > >>
> > >> :JoeBiden :hasAddress {@type: :Address, streetNumber: 1600, street:
> > "Pennsylvania Avenue NW"}
> > >>
> > >> Once you have that, collections are a special case, so an example of
> > (2) might be something like:
> > >>
> > >> :MichaelJordan :jerseyNumbers @set[23, 45]
> > >>
> > >> With a full composite value type, if there wasn't a special collection
> > syntax, it might look something like:
> > >>
> > >> :MichaelJordan :jerseyNumbers {@type: :Set, 1: 23, 2: 45}
> > >>
> > >> In my opinion, composite value types are closed atomic concepts, they
> > only make sense as a whole, therefore for collections described in the
> > manner of (2), no entailment should occur across the boundary of the
> > collection and there shouldn't be any semantics, and therefore the
> > collection is also fully defined and closed.
> > >>
> > >> Examples of (3) might be something like:
> > >>
> > >> :s: @set[:b, :c, :d]
> > >> :l: @list[:b, :c, :d]
> > >> :cs: @closedSet[:b, :c, :d]
> > >> :cl: @closedList[:b, :c, :d]
> > >>
> > >> In my opinion, for collections described in this way no such boundary
> > exists, entailments can be made across the boundary of the collection to
> > the members, but they can depend upon other things like the class that the
> > collection itself belongs to etc. It's still definitely not the case that
> > all things said about a collection should automatically be said about each
> > member though.
> > >>
> > >> Anthony
> > >>
> > >> On Thu, Sep 1, 2022 at 11:52 PM Patrick J. Hayes <phayes@ihmc.org>
> > wrote:
> > >>
> > >>
> > >> > On Aug 16, 2022, at 11:20 AM, David Booth <david@dbooth.org> wrote:
> > >> >
> > >> > On 8/16/22 12:56, Holger Knublauch wrote:
> > >> >> A next generation of RDF(-star) can hopefully get rid of rdf:Lists
> > through reification with an index property.
> > >> >
> > >> > That sounds surprising, given the widespread dislike of reification.
> > Do you have a pointer to an explanation?
> > >> >
> > >> > Incidentally, I personally think RDF should natively support lists,
> > which David Wood and James Leigh proposed at the 2009 W3C RDF Next Steps
> > workshop:
> > >> >
> > https://www.w3.org/2009/12/rdf-ws/papers/ws14
> > >>
> > >> OK, that proposes to make lists part of the RDF /syntax/. So a triple
> > with a list as subject is now a syntactically legal RDF expression:
> > >>
> > >> [:a :b :c] :p :o .
> > >>
> > >> and this is now syntactically legal RDF.
> > >>
> > >> That was easy, but now this ball lands in my court as semantics editor.
> > If this is legal RDF syntax, it has to mean something, has to have a
> > semantics. What does it mean? How is it to be interpreted? Or, if you don't
> > like the S-word, what does this new RDF 'listriple' entail? What can be
> > inferred from it, and what is it inferrable from? For example, is this
> > valid:
> > >>
> > >> from
> > >> :a :p :o .
> > >> :b :p :o .
> > >> :c :p :o .
> > >> infer
> > >> [:a :b :c] :p :o .
> > >>
> > >> ? Or the reverse? How about
> > >>
> > >> from
> > >> [:a :b :c] :p :o .
> > >> infer
> > >> [:a :c :b] :p :o .
> > >>
> > >> No? Because if you have the first one in both directions you can prove
> > that this one is valid as well. And what if you have a list in both subject
> > and object positions? Etc.
> > >>
> > >> Note, it is not enough to answer "sometimes" or "it depends". That kind
> > of woolliness destroys the utility of RDF as an information exchange
> > notation. If the answer depends on something else, then that something else
> > has to also be somehow encoded into the RDF syntax, in order to
> > disambiguate the list notation.
> > >>
> > >> The RDF WG could never agree on what the semantics of list syntax might
> > be. And - a private observation of my own - the people who most wanted to
> > put lists into the syntax were usually the same ones who insisted that they
> > could not, or should not, be given a semantics. I suspect that this is
> > because those folk are slipping into thinking of RDF as a kind of
> > programming language rather than a descriptive logical language.
> > >>
> > >> Now, one can take a completely different view of lists (and other
> > things like lists), which is that rather than having them as part of the
> > syntax of RDF they should be some of the things that RDF describes, just as
> > it is used to describe wine, consumer goods, animal species and everything
> > else in the wonderful world of wikipedia. Then they don't have to be given
> > an RDF semantics because they are in the RDF semantics universe along with
> > everything else. And that is what the WG decided to do. But given the
> > relative poverty of RDF as a descriptive language, it's impossible to put
> > very tight constraints on what lists look like, so one gets the oddities
> > described in https://www.w3.org/TR/rdf11-mt/#rdf-collections.
> > >>
> > >> OK, just a small growl from the dugout. Anyone who advocates having
> > lists in RDF syntax, please don't speak until you have at least a sketch of
> > what they are supposed to mean, that you are willing to commit to.
> > >>
> > >> Pat Hayes
> > >>
> > >>
> > >
> >
> >
Received on Tuesday, 6 September 2022 11:41:19 UTC