
Re: defining the semantics of lists

From: thomas lörtsch <tl@rat.io>
Date: Tue, 9 Jun 2020 13:54:48 +0200
Cc: semantic-web@w3.org
Message-Id: <4F0108F7-8216-4F89-9840-76DC20895EC3@rat.io>
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>

> On 8. Jun 2020, at 17:01, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
> OWL DL is just a distraction here.  

Rather a side effect that I tried to properly understand.

> Its solution to the problems of lists,
> etc., in RDF can't be used in RDF itself.
> RDF is very, very bad at tight definitions. 
> If all you want to do is to add new vocabulary, go ahead.  That's not going to
> solve the problems of lists, etc., in RDF though.

Well, to a certain extent it would. The question is of course whether the limited effect warrants the effort. But I have to understand the background before I can come to any conclusion about the best way forward.

I’m currently trying to understand whether a new datatype can solve the problems of lists in RDF in a way that provides significant advantages and therefore warrants the equally significant effort. A suitably defined datatype would allow encapsulating all the constraints on lists that one might want inside a single node. I’m not sure yet about the consequences though, especially how members of such lists, assuming they are URIs, relate to other occurrences of the same URI as nodes in other statements.

> If you want to add semantics to RDF or change the syntax of RDF you need to be
> very careful about saying what you want.  For example, requiring only one n'th
> member of a list will inevitably add equality to RDF, which is a major
> change.  If you want to add maximum length to lists then you are potentially
> destroying the monotonic nature of RDF.  So you need to be completely clear
> about what it is that you want.  The best way to do that is to state upfront
> what you want, without any reference to RDF.  Only then say how you want to
> represent this in RDF.  But lists and friends are hard to specify correctly. 
> (See https://en.wikipedia.org/wiki/Russell%27s_paradox for an example.)

I would try not to change the semantics of RDF all that much - by giving the 'last' property strictly descriptive semantics, just like the rest of the list vocabulary in RDF. So I would be very careful not to _require_ anything. I would attach some informal semantics similar to rdf:Lists in OWL DL - no branching, no gaps, no surplus members - but with no entailments and no guarantees (and in contrast to OWL DL syntax lists nothing in the core machinery would break from some non-wellformed Containers). Semantic extensions to RDF however could pick such finite lists up and turn the described intent into definite constraints as they see fit. Currently Containers do not even provide this descriptive property. So it would be an improvement with not too much cost and not too much effect - but in the end it might just strike a good balance. However I’m not sure yet, still investigating.
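
To make the "descriptive only" reading a bit more concrete, here is a minimal sketch in Python of what a consumer could do with such a 'last' property. It is only an illustration under assumptions: the IRI used for the 'last' property is a placeholder (nothing of the sort is standardized), triples are modelled as plain string tuples rather than through any RDF library, and the names `member_index` and `check_series` are made up for this example. The check reports gaps, branching and surplus members as warnings and never rejects or repairs the graph.

```python
import re
from collections import defaultdict

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Placeholder IRI for the proposed 'last' property (an assumption, not a standard term).
LAST = "http://example.org/rdfx#last"

# Container membership properties are rdf:_1, rdf:_2, ... (no rdf:_0, no leading zeros).
_MEMBER = re.compile(re.escape(RDF) + r"_([1-9][0-9]*)")


def member_index(iri):
    """Return n if iri is rdf:_n with n >= 1, otherwise None."""
    match = _MEMBER.fullmatch(iri)
    return int(match.group(1)) if match else None


def check_series(triples, container):
    """Return a list of warnings about one container; never raise, never repair."""
    slots = defaultdict(set)  # position -> set of values found at that position
    lasts = set()             # positions named by the 'last' property
    warnings = []
    for s, p, o in triples:
        if s != container:
            continue
        if p == LAST:
            n = member_index(o)
            if n is None:
                warnings.append("'last' value is not a membership property: " + o)
            else:
                lasts.add(n)
        else:
            n = member_index(p)
            if n is not None:
                slots[n].add(o)
    if len(lasts) != 1:
        warnings.append("expected one 'last' value, found %d" % len(lasts))
        return warnings
    last = next(iter(lasts))
    for n in range(1, last + 1):
        count = len(slots.get(n, ()))
        if count == 0:
            warnings.append("position %d has no value (gap)" % n)
        elif count > 1:
            warnings.append("position %d has %d values (branching)" % (n, count))
    for n in sorted(k for k in slots if k > last):
        warnings.append("position %d lies beyond the described end %d" % (n, last))
    return warnings
```

Run against a container with two rdf:_1 values and 'last' pointing at rdf:_1, this returns a single branching warning and leaves the decision about what to do with it to the application. A semantic extension could of course choose to treat the same findings as hard constraints instead.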

> And saying what you want technically is best done if you have a good and
> widely useful set of use cases so that your solution can be evaluated against
> some need.

I know I’m bad with test cases but at this early stage I still insist that lists are so ubiquitous that they don’t need one ;-)

> I think you are going to have to allow for empty lists, empty sets, and empty
> bags.   They are so useful that not allowing them is going to produce a lot of
> backlash.

Well, they are allowed to be empty, they just can’t be described as empty with the containerMembership-based 'last' property. But one could probably describe emptiness through some other mechanism, maybe SHACL or ShEx.


> peter

> On 6/8/20 5:42 AM, thomas lörtsch wrote:
>>> On 2. Jun 2020, at 20:00, Peter Patel-Schneider <pfpschneider@gmail.com> wrote:
>>> TLDR:  Without a firm notion of what chains are for and a specification
>>> of how they work it is not possible to determine whether they are non-
>>> problematic.
>> I’m not asking if they are problematic or not. Some of your questions I can’t answer without further discussion but those questions are quite specific to OWL DL. I still think the basic idea of what a closed list should be is reasonably clear. If and how such characteristics can be encoded in OWL DL I cannot say. The semantics of Collections in OWL DL are encoded in a few sentences of plain text. That wouldn’t be hard to do for the kind of closed Containers that I have sketched either.
>> I see from your questions that soundly defining a closed list in OWL DL is not trivial. I would however not embark on that effort if the membership property can’t be used on principle. That’s something that I’d like to be reasonably sure of beforehand.
>> What I’m asking is whether the approach of describing a list's size through reference to its membership properties is problematic. I assume that question can be answered without knowing all details about the handling of malformed lists and/or exact definitions of what constitutes malformedness.
>> It is really the question: is referring to a membership property in the object position safe in OWL DL or is it something that one should better not attempt to do?
>> I’ll try to answer some more of your questions below. A clarification first: I may have made a mistake by putting my proposal in an rdfx namespace. I just wanted to disambiguate proposed from existing vocabulary but the new stuff is intended for addition to RDF core.
>> Nomenclature: they were called 'rdf:Chain's first, but that name was later changed to the current 'rdf:Series'. A question of taste probably, but let's stick to the decision.
>>  Series  subClassOf  Container
>>  last    domain      Series
>>          range       ContainerMembershipProperty
>> The purpose of rdf:Series is to provide lists that are ordered and finite. I wanted to have these properties defined as tightly as possible, as we already have rdf:Containers for lists with very weak semantics. However, over the weekend I reread the old RDF 1.0 Primer and MT documents from 2004, which explain the semantics of RDF in adorable detail to stubborn civilians like me (I should file an error that those explanations were dropped from the 1.1 specs). That made me realize that in RDF, descriptions and handwaving are really and justifiably all we can get.
>> Describing that a list is meant to be finite still makes a lot of sense IMO (after all, it made enough sense to introduce the rdf:List vocabulary). Adding that capability to Containers, although we already have closed Collections, makes sense too, as I argued in some detail in several other places in this thread. I’m still undecided between introducing a new type rdf:Series or subclassing rdf:Seq, rdf:Bag and rdf:Alt to rdf:FiniteSeq, rdf:FiniteBag and rdf:FiniteAlt, but that decision is secondary to the question at hand, so let’s stay with a new type rdf:Series and a new property rdf:last.
>> But now on to your questions.
>> Anything can be a list item in an rdf:Series, also an rdf:Series. In RDF I can't think of any reason, or way, to regulate what can be the value of a list item. Does OWL DL demand to forbid certain constructs? Then we’ll have to do that. Forbidding a series to have itself as member wouldn’t hurt anything I can think of.
>> From setting a limit follows that rdf:Series is supposed to have no members beyond that limit. 
>> The very nature of membership properties suggests that they are ordered, without gaps and duplicates.
>> I’m woefully unfamiliar with the subtleties that differentiate a syntactic error from an inconsistency or whatever else there is available to express unhappiness. - I would wish for surplus members to be ignored but the graph still be valid. 
> Ignoring probably produces non-monotonicity.
>> - I guess entailing missing members as blank nodes is harmless enough to include it even in RDF.
> So you want some entailments, but which ones?  Is it going to be computationally
> difficult to do these entailments?
>> - I reckon that entailing identity from multiple occurrences of the same membership property can be considered a subtle way of punishing careless providers of malformed lists ;-) I’d prefer to see malformed lists issue a warning or be ignored but that’s probably outside the capabilities of OWL DL?
> Again, OWL DL doesn't count here.  
>> In RDF the 'last' property notifies the consumer about the intended properties of some list. The consumer CAN check whether the list actually meets that description, but if it doesn’t, they will have to come up with some way to deal with the situation on their own.
> This seems to go against the idea that there is some semantic impact of the
> last property.  You can't have it both ways.
>> I don’t see the pressing need for empty lists. If the list size was described by integers I would let them start from 0 but as membership properties start from _1 we don’t have that chance. However I think we can live with that. What one could probably do is define something to be of type rdf:Series without specifying any length but that’s of course not the same as saying that it is empty. Why do you think empty lists are important?
> See above.
>> Thomas
>>> On Mon, 2020-06-01 at 23:11 +0200, thomas lörtsch wrote:
>>>> Hi Peter,
>>>> my initial question when starting this thread was: what holds
>>>> rdf:Containers back from having the same semantic weight as
>>>> rdf:Lists? What could be done to overcome that deficit. There are
>>>> some reasons why I was asking that and not the least was to
>>>> understand semantics of lists - hence the subject line. With the
>>>> question came the idea for an answer: an rdfx:length property to
>>>> specify the length of a Container and thereby close it. With Pat's
>>>> help that developed into an rdfx:last property because that way we
>>>> don’t have to introduce arithmetic into the semantics. The intention
>>>> is exactly the same: provide a means to close rdf:Containers, make
>>>> them finite. Such a finite container could then be used instead of
>>>> rdf:Lists. It would have a less obtrusive syntax, need fewer triples,
>>>> perform much faster in the usual triple stores, would be easier to
>>>> query and could be used with OWL DL. The only real downside so far:
>>>> no syntactic sugar in Turtle - that would be another topic.
>>>> So the short answer to your questions is: the aim is to provide
>>>> Containers with a semantics similar to rdf:Lists, or rather to find
>>>> out what is needed to reach that goal. 
>>>> The longer answer is: rdf:Containers intuitively provide some
>>>> semantics already. Membership properties are ordered by integer
>>>> values, starting with 1 and incremented by 1 for each entry. What’s
>>>> missing compared to lists is to be able to close a container, to make
>>>> it finite. Most of your questions can intuitively be answered from
>>>> there:
>>>>> On 1. Jun 2020, at 18:54, Peter F. Patel-Schneider <
>>>>> pfpschneider@gmail.com> wrote:
>>>>> I've been looking at the suggestions to set up chains in RDF and I
>>>>> have a
>>>>> number of questions.
>>>>> What are chains supposed to be for?  
>>>> See above.
>>> Saying that chains are supposed to have the same semantics as something
>>> else doesn't say what chains are for.  There are lots of things that
>>> chains could be for - OWL DL says that rdf:first/rdf:rest is for syntax
>>> that needs a sequence of elements.
>>> You might have said that chains are for syntax, or for some more
>>> general purpose, perhaps platonic unchangeable simple lists, but
>>> without a purpose it is hard to determine whether any proposed
>>> specification is correct or useable or useful.
>>>>> What are the semantics of chains (saying a bit more than
>>>>> rdf:Collection is not adequate)?  
>>>> See above.
>>> Well, containers don't have significant semantics in RDF beyond their
>>> triple semantics, just some informal hand waving that leaves very many
>>> questions unanswered.
>>> See https://www.w3.org/TR/rdf11-mt/#rdf-containers for more on the lack
>>> of semantics of containers.
>>>>> Is a chain more than a bunch of triples?  
>>>> Yes, otherwise what would be the point?
>>> OK, what is this extra?
>>>>> What counts as a chain?  
>>>>> Can chains be circular?
>>>> No.
>>>>> Can they be infinite?  
>>>> No, that’s the whole point, remember?
>>>>> Can they share heads?
>>>> No, there can be only one membership property :_1 
>>> Oops, I meant sharing tails, as in two lists that share a tail. 
>>> Implementing as in RDF containers prevents sharing tails.
>>>>> Can they be multi-tailed?  
>>>> No, there can be only one membership property per integer
>>> This is different, and only shows up if there is a separate tail aspect
>>> of chains, which is not compatible with implementing as containers.
>>>>> Can they have multiple values for one position?  
>>>> Same question as before, isn’t it?
>>> No, but how is this specified or enforced?  What happens in an RDF
>>> graph that contains triples like:
>>> ex:C a rdfx:Chain .
>>> ex:C rdf:_1 ex:a .
>>> ex:C rdf:_1 ex:b .
>>> ex:C rdfx:last rdf:_1.
>>> Is this invalid syntax?  Does it imply that ex:a and ex:b denote the
>>> same individual?
>>>>> Can chains be elements of chains?  
>>>> Yes, but that is another topic
>>> Determining whether chains can be elements of chains is important, and
>>> could be answered either way.
>>>>> Can chains be ungrounded?
>>>> I don’t understand.
>>> Can a chain be an element of itself, maybe indirectly?  If so, the
>>> chain can be called ungrounded.  Is this allowed?
>>>>> Is it possible to have a chain with no elements?
>>>> No, as there is no :_0 membership property.
>>> That's very limiting.  Empty chains (or lists or sets or bags) are
>>> important.
>>>> Maybe you mean missing list entries? Any list items at list positions
>>>> less than or equal to the 'last' one that are not listed can be entailed
>>>> as being blank nodes.
>>> OK, here is a semantic upgrade from triples, and from containers.  How
>>> will this meaning be specified?
>>>>> In my view it is a good idea to determine what chains are supposed
>>>>> to be
>>>>> for and how they are supposed to work before the syntax (and any
>>>>> axiomatization) of chains is presented, not least so that it can be
>>>>> determined whether the syntax (and axiomatization, if present)
>>>>> actually
>>>>> supports what is wanted.
>>>> I think we did that already and I hope you have a clearer picture
>>>> now.
>>> There are still questions that need to be answered.  For example, are
>>> two chains with the same elements the same chain?  As you have stated
>>> above that there are some semantics beyond just triples, you need to
>>> say what these semantics are.
>>>>> Making chains be rdf:Seq plus an explicit stop point appears to
>>>>> answer some
>>>>> of these questions, but the implications of this setup should be
>>>>> spelled out
>>>>> explicitly.  Further not all the questions are answered.  
>>>> Will they ever be?
>>> Yes.  You can just say that "This is the meaning of chains."  That
>>> doesn't prohibit others from using chains in situations where more
>>> meaning is required or suitable, but it does limit what *the* meaning of
>>> chains is.  
>>> In essence, you are proposing a new kind of thing, like a set.  How
>>> does this thing work? 
>>>>> For example, what happens if there are two rdf:_<n> values in a
>>>>> chain for a particular <n>?
>>>> It is an inconsistency.
>>>> What happens if it happens to occur all the same is another question
>>>> that we probably can’t answer.
>>> You mean an RDF inconsistency?  That seems very harsh.  If any chain in
>>> an RDF graph has multiple values for any of its elements then the
>>> entire graph becomes unsatisfiable.
>>>>> (This might be particularly problematic if the two values are both
>>>>> container
>>>>> membership properties.)  What happens if there are values for
>>>>> rdf:_<n> with
>>>>> n greater than the stop point?  
>>>> It is an inconsistency.
>>>> What happens if it happens to occur all the same is another question
>>>> that we probably can’t answer.
>>>>> What happens if there are multiple stop points?  (This seems to be
>>>>> particularly problematic.)  
>>>> It is an inconsistency.
>>>> What happens if it happens to occur all the same is another question
>>>> that we probably can’t answer.
>>>>> What happens if the value of rdfx:last is not one of the
>>>>> rdf:_<n>?  
>>>> If there is a bigger rdf:_<n> then it is an inconsistency, see above.
>>> So any unusual aspects of a chain render the entire RDF graph
>>> meaningless.  That seems even more harsh.
>>>>> What happens if a chain is one of its own elements?
>>>> The question makes no sense to me. There are infinitely many ways to
>>>> construct infinitely large or infinitely deep nested structures at any
>>>> list item position. Why do you ask for that specific one? Is it an
>>>> achievable goal to develop a definition that successfully rules out
>>>> all pathological variants?
>>> This is something that is easy to do in RDF and similar patterns are
>>> common in RDF graphs.   Is the following acceptable as a chain:
>>> ex:C a rdfx:Chain .
>>> ex:C rdf:_1 ex:C .
>>> ex:C rdfx:last rdf:_1 .
>>> This is something that can be problematic, depending on what the
>>> meaning of chains is supposed to be.
>>> Adding a new thing to RDF requires consideration of how it interacts
>>> with the rest of RDF, including its "say anything about anything"
>>> philosophy.
>>>>> Are chains complicated because of the infinite vocabulary required?
>>>> That would be a question that I would hope you could answer.
>>> This question cannot be answered without a specification of what chains
>>> are supposed to be.  In some specifications the infinite vocabulary
>>> isn't more of a problem than it is in RDF.  In others it might be.
>>>>> One way to "specify" chains, of course, is to just say that they
>>>>> are a set of
>>>>> triples, and no more.  I don't think that this is what is desired
>>>>> here, though.
>>>> I don’t think it is helpful or promising or even possible to rule out
>>>> explicitly everything that you don’t think is desired. I hope however
>>>> that you have a better idea now of the issue at hand and we can get
>>>> to the real business. 
>>> I disagree strongly.   The specification of chains could be very weak -
>>> just triples, for example.  That specification is viable, but
>>> apparently not what you want.  Other specifications might add
>>> entailments, but then the interactions of these entailments with the
>>> rest of RDF needs to be investigated.   It's very much like adding a
>>> new construct to a programming language - its interactions with the
>>> other constructs of the language need to be analyzed to see what
>>> problems come up.  Without a firm specification of the construct it is
>>> not possible to determine what problems it will create.
>>>> Pat tried his best to teach me that RDF only _describes_, not
>>>> _prescribes_ things - and lists are just another thing. Now, I surely
>>>> need more time to fully wrap my head around that but it seems that
>>>> all your questions and all my answers aim at prescribing, not
>>>> describing. I noticed how the RDF Semantics document dances around
>>>> that topic by clarifying that it can’t preclude all sorts of
>>>> pathological containers and collections. Therefore I tried to come up
>>>> with the least intrusive way to describe an intent about some
>>>> container, namely that it has only a certain number of members.
>>>> Everything else comes from the integer order of the membership
>>>> properties. That was good enough for the last 20 years so it should
>>>> be good for the future too. I can wholeheartedly assure that my
>>>> honest intent is only to describe that some list of mine is finite
>>>> and not prescribe something about anybody else's list.
>>>> OWL DL makes some remarks what it expects from rdf:Lists, namely:
>>>> "When a list pattern is matched to G, all list variables _:xi and
>>>> _:xj with i ≠ j MUST be matched to different nodes; furthermore, it
>>>> MUST NOT be possible to match the list pattern to two maximal subsets
>>>> of G such that some list variable in the first pattern instance is
>>>> matched to the same node as some (possibly different) variable in the
>>>> second pattern instance. This is necessary in order to detect
>>>> malformed lists such as lists with internal cycles, lists that share
>>>> tails, and lists that cross."
>>>> https://www.w3.org/TR/2012/REC-owl2-mapping-to-rdf-20121211/#Mapping_from_RDF_Graphs_to_the_Structural_Specification
>>>> This is certainly more concise and complete than what I could come up
>>>> with right now but essentially it’s not rocket science either, it’s
>>>> plain text. If that’s all that’s needed, then: yes, I can do that too
>>>> for rdf:Containers with a 'last' property. And the gist of it can be
>>>> found above.
>>> OWL DL only handles some RDF graphs, and is quite explicit on which RDF
>>> graphs are not suitable for OWL DL.  So OWL DL doesn't have to handle
>>> all RDF graphs, which is not the case here.
>>>> I hope all this is enough so that you can now answer the question if
>>>> such a 'last' property on rdf:Containers would break anything in OWL
>>>> DL, catapult the ontology out of DL or otherwise look fishy and
>>>> suspicious to you.
>>> Again, as far as OWL DL is concerned, only certain RDF graphs are
>>> allowable.  As far as I can tell, some of these graphs could contain
>>> chains.
>>>> Thomas
>>>>> peter
>>>>> On 5/24/20 11:29 AM, thomas lörtsch wrote:
>>>>>> [Lots of earlier messages and some of this message snipped.]
>>>>>> I do now understand how the OWA prohibits any explicit closing of
>>>>>> a list in RDF, how RDF is all about _describing_ things, how only
>>>>>> single triples can be a bearer of truth, how RDF terms themselves
>>>>>> are not to be messed with and how the whole endeavour of formal
>>>>>> semantics under an OWA is walking a very thin line between what
>>>>>> may be inferred and what cannot be ruled out. Maybe. [0]
>>>>>> However I also lost practically all faith in the formal semantics
>>>>>> of Collections and Containers alike. If not even the simplest
>>>>>> syntactic constraints - only one head, no branching - can be
>>>>>> enforced then why bother at all with the semantics of a length
>>>>>> attribute? 
>>>>>> Why even consider an arithmetic extension? Notwithstanding its
>>>>>> usefulness in other contexts I’m not convinced that some
>>>>>> arithmetic extension can ground the semantics of an
>>>>>> rdfx:hasLength property when the rdf:Container it describes has
>>>>>> so little formal standing to build on.
>>>>>> One could make rdfx:hasLength an owl:AnnotationProperty so its
>>>>>> semantics would definitely be reduced to handwaving, providing a
>>>>>> hint to applications if some list probably is complete. Closing a
>>>>>> list was deemed useful before but it was implemented with a
>>>>>> verbose syntax and in OWL DL it's off limits for users. Lists are
>>>>>> so important in practice that IMO that’s reason enough to
>>>>>> introduce something along those lines, even with _very_ limited
>>>>>> formal semantics.
>>>>>> I was also pondering the graph based approach that Cory proposed
>>>>>> but for a basic construct like lists (and trees and tables that
>>>>>> can easily be built from it) it seems a waste. Graphs should be
>>>>>> used for all kinds of stuff, even for structural features like n-
>>>>>> ary relations, but lists - rather not. At least that’s my current
>>>>>> thinking.
>>>>>> I think it can be useful in a bigger context like being able to
>>>>>> express that in some application/source/universeOfDiscourse all
>>>>>> lists are closed. But I’d rather embed that in a semantic
>>>>>> extension that fixes a few more things and formally defines a
>>>>>> the Closed World Scenarios that applications often assume and
>>>>>> require.
>>>>>> Pat has in earlier mails suggested to mark the last item of a
>>>>>> list instead of providing a length attribute. That didn’t really
>>>>>> catch on with me because I lacked an idea how to do it. Meanwhile
>>>>>> the following vocabulary extension bubbled up in my head:
>>>>>> 	rdfx:Chain rdfs:subClassOf rdfs:Container .
>>>>>> 	rdfx:last rdfs:domain rdfx:Chain .
>>>>>> 	rdfx:last rdfs:range rdfs:ContainerMembershipProperty .
>>>>>> 	_:L  rdf:_1  "a" .
>>>>>> 	_:L  rdf:_2  "b" .
>>>>>> 	_:L  rdf:_3  "c" .
>>>>>> 	_:L  rdfx:last rdf:_3 .
>>>>>> I sort of like it but I’m not convinced that it's really more
>>>>>> elegant.
>>>>>> Fundamentally it doesn’t seem to make much difference:
>>>>>> - Containers still provide only a semantically weak base
>>>>>> - a missing 2nd slot would still need to be filled
>>>>>> - a surplus 4th slot would still need to be ignored
>>>>>> And maybe the counting business on ContainerMembershipProperties
>>>>>> would still require an arithmetic extension? Which would still
>>>>>> not be worth the trouble because it would only stand on
>>>>>> Collections’ shifting semantic sands?
>>>>>> BTW: I don’t like the name "Chain". I would prefer "Series" but
>>>>>> I’m not a native speaker and not sure if it captures the intended
>>>>>> purpose well enough. Also "Seq" and "Ser" are easy to confuse
>>>>>> (but "Ser" gets filed one after "Seq", so that’s good!). "Fin de
>>>>>> Seq" would of course be even nicer.
>>>>>> Thomas
>>>>>> [0] And that the RDF Semantics at https://www.w3.org/TR/rdf11-mt/
>>>>>> use the term "intent" although I got ridiculed for introducing
>>>>>> it a few mails ago: "The intended mode of use is that things of
>>>>>> type rdf:Bag are considered to be… " etc. Ha!
>>>>> [Lots of previous messages snipped.]
Received on Tuesday, 9 June 2020 11:55:16 UTC
