Re: defining the semantics of lists

Right, as has been pointed out by many, changing or extending RDF
semantics would be an extremely difficult task, but it isn't necessary.
Easier would be to make your own RDF collection/container vocabulary and
a validation language for it. The language would have its own semantics,
possibly backed by some formal logic. The language syntax could be RDF.
In many ways the Shapes Constraint Language (SHACL)[1] could serve as an
inspiration (it also uses RDF as its syntax): your language could use
SHACL, or it could be a SHACL extension. Overall, what you are going
for sounds like a generalization of an application of SHACL.
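
To illustrate what such a validation layer might do, here is a minimal
sketch in Python. It is not SHACL and not a proposal; the function name
validate_chain and the representation of triples as tuples of prefixed
strings are assumptions made only for this example. It checks one node
against the rdfx:last idea from this thread and reports problems instead
of declaring the graph inconsistent.

```python
import re
from collections import defaultdict

# Matches container membership properties written as prefixed names, e.g. "rdf:_3".
MEMBER = re.compile(r"^rdf:_([1-9][0-9]*)$")


def validate_chain(triples, node):
    """Return a list of violation messages for one chain node (hypothetical helper)."""
    slots = defaultdict(set)  # position -> set of values found at that position
    lasts = set()             # all objects of rdfx:last for this node
    for s, p, o in triples:
        if s != node:
            continue
        match = MEMBER.match(p)
        if match:
            slots[int(match.group(1))].add(o)
        elif p == "rdfx:last":
            lasts.add(o)

    problems = []
    if len(lasts) != 1:
        problems.append(f"expected exactly one rdfx:last, found {len(lasts)}")
    for pos in sorted(slots):
        if len(slots[pos]) > 1:
            problems.append(f"position {pos} has {len(slots[pos])} values")
    if len(lasts) == 1:
        match = MEMBER.match(next(iter(lasts)))
        if match is None:
            problems.append("rdfx:last is not a membership property")
        else:
            end = int(match.group(1))
            for pos in sorted(slots):
                if pos > end:
                    problems.append(f"position {pos} lies beyond rdfx:last")
            for pos in range(1, end + 1):
                if pos not in slots:
                    problems.append(f"position {pos} is missing")
    return problems


# Peter's example from further down the thread: two values for rdf:_1.
graph = [
    ("ex:C", "a", "rdfx:Chain"),
    ("ex:C", "rdf:_1", "ex:a"),
    ("ex:C", "rdf:_1", "ex:b"),
    ("ex:C", "rdfx:last", "rdf:_1"),
]
print(validate_chain(graph, "ex:C"))
```

The graph itself stays perfectly good RDF; only the report says it does
not fit this particular purpose.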

> But how could we enforce such constraint descriptions not just in
> applications but within the OWA realms of RDF?

Do you mean making entire RDF graphs invalid just because they don't fit
some constraints for a particular purpose? Why would that be desirable?
Even if a graph says something which is invalid for some purpose, it
could be valid for another purpose - a simple example being RDF
visualization.

As a side note, if you decide to go down the rabbit hole, you might want
to consider handling collections restricted to members of a particular
class/datatype or their subclasses.
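
A rough sketch of that last check, again in Python with triples as
tuples of prefixed strings. The helper names superclasses and
ill_typed_members, and the ex: terms, are made up for illustration.
Note the assumption that a member with no stated rdf:type counts as a
violation; that is a closed-world choice made by the validator, not
something RDF itself says.

```python
def superclasses(cls, triples):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def ill_typed_members(triples, node, required_class):
    """Return members of node that are not typed as required_class or a subclass."""
    bad = []
    for s, p, o in triples:
        if s == node and p.startswith("rdf:_"):
            # Collect the explicitly stated types of this member.
            types = {t for (ms, mp, t) in triples if ms == o and mp == "rdf:type"}
            if not any(required_class in superclasses(t, triples) for t in types):
                bad.append(o)
    return sorted(bad)


graph = [
    ("ex:Track", "rdfs:subClassOf", "ex:Work"),
    ("ex:t1", "rdf:type", "ex:Track"),
    ("ex:t2", "rdf:type", "ex:Person"),
    ("ex:cd", "rdf:_1", "ex:t1"),
    ("ex:cd", "rdf:_2", "ex:t2"),
]
print(ill_typed_members(graph, "ex:cd", "ex:Work"))
```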

Cheers,
Jiri

[1] https://www.w3.org/TR/shacl/

On 6/8/20 12:19 PM, thomas lörtsch wrote:
>> On 4. Jun 2020, at 01:40, Jiří Procházka <ojirio@gmail.com> wrote:
>>
>> This has been an interesting thread to follow, but from the start I've
>> felt a clearly stated use case is missing. This would clear up many
>> things, possibly pointing to a solution which doesn't require changes to
>> RDF(S) semantics at all.
>>
>>> I remember someone (Pat?) rightly saying that RDF was not a data
>>> structure language, but a KR (or something like that).
>>> So simply wanting the sort of things that programming languages have
>>> as data structures is not necessarily a useful thing to spend time on.
>>
>>> I much prefer predicates to be specific to the range and domain that
>>> they are working over.
>>
>> Agreed, personally I think the rdf:List, RDF containers vocabularies
>> should be used mainly as base classes/properties to be subclassed in
>> domain specific schema or perhaps not at all.
>>
>> That said, one might be describing things like APIs in RDF. In such
>> cases one might describe some examples of input, having to be careful to
>> not define them with an unintended structure (for example a branching or
>> looping list). If this was the use case, RDF authoring tools could
>> feature warnings for irregular structures.
>>
>> An interesting use case is describing APIs accepting RDF data as input.
>> Ideally it should be using a domain specific schema, but I can see it
>> often having constructs similar to rdf:List or RDF containers. The API
>> descriptions should also define how unexpected inputs are treated.
>> This could be standardized, separately to RDF specifications.
>>
>> These are the use cases which I thought of, I would like to know of
>> others. I don't think a change of RDF semantics would be needed for these.
> 
> Right, no need to change the RDF semantics. It's two things however:
> A what RDF can describe
> B how descriptions can be enforced as constraints
> 
> A
> Lists in RDF can be described as
> - closed (rdf:List)
> - ordered (rdf:List, rdf:Seq, rdf:Bag, rdf:Alt)
> - without duplicates (rdf:Alt)
> - with preferred member (rdf:Alt)
> This is a bit messy and not all combinations are available. Adding a 'last' attribute to Containers might ameliorate the situation. The rdf:Series class that I proposed mainly as an example to facilitate discussion could be extended to address all those properties and maybe more:
> 
>             // A new subclass of Container that very explicitly describes
>             // size, preferred alternative, ordering and duplicates.
>             // Ordering and duplicates default to TRUE.
>             // If no attributes are set Series corresponds to Seq.
>             // This container type has the combined semantic expressivity
>             // of Seq, Bag and Alt plus the capability to describe its size.
>             // The immutable property is a hint that such a newly defined
>             // construct could indeed be used to describe a few more things.
> 
>   Series     subClassOf  Container
>   last       domain      Series
>              range       ContainerMembershipProperty
>   pref       domain      Series
>              range       ContainerMembershipProperty
>   order      domain      Series
>              range       Boolean
>   dupes      domain      Series
>              range       Boolean
>   immutable  domain      Series     // why not reach for the stars...
>              range       Boolean
> 
> B
> Translating those descriptions to constraints can be tricky, as Peter's questions proved, but it certainly can be done in one way or the other. APIs described in RDF, OWL DL formalizations, rules, Shexl constraints are all viable options. But how could we enforce such constraint descriptions not just in applications but within the OWA realms of RDF? That requires closing the world around the description. Named Graphs come to mind. So basically:
> - define a way to guarantee sound naming semantics for Named Graphs,
> - define a way to attribute graphs with semantic extensions,
> - define such extensions (like e.g. an app friendly local CWA/NAF/UNA walled garden)
> - teach RDF consuming applications to process data according to the rules of such extensions locally, within those graphs
> and be done with it?
> 
> Thomas
> 
> 
>> Best,
>> Jiri
>>
>> On 6/4/20 12:02 AM, Hugh Glaser wrote:
>>> +1
>>>
>>> I've followed the long discussion (as best I could!).
>>> And what I kept wanting to know was an example of how whatever is being proposed might be used.
>>> I remember someone (Pat?) rightly saying that RDF was not a data structure language, but a KR (or something like that).
>>> So simply wanting the sort of things that programming languages have as data structures is not necessarily a useful thing to spend time on.
>>>
>>> And I think there are implications that raise questions.
>>> For example, if what you (plural) are trying to be able to do is something like tracks on a CD, then a predicate to connect things should be more like
>>> foo:nextTrack than simply foo:Next .
>>> How would this happen? - Can all the constructs proposed be sub-property'ed and/or sub-classed? Or however else it happens.
>>> And then would querying work?
>>> And there might be questions such as what happens when that CD is released with a bonus track?
>>>
>>> I would certainly not want anything as general as foo:next floating around my RDF.
>>> I much prefer predicates to be specific to the range and domain that they are working over.
>>>
>>> Mind you, I have to admit I have never used any of the rdf Seq/Collection etc.
>>> I haven't felt the need, and I prefer to represent the knowledge as close to its natural form as possible, rather than bend it to fit into some generic, and possibly slightly different form.
>>>
>>> It is the same problem in Lisp, of course - which is why things like Structs and Cobol Data Division were invented.
>>> If I could be given some view of the sorts of applications you are thinking of, I would be better able to understand whether the suggestions being made are fit for purpose.
>>>
>>> The bottom line, in case you are wondering:
>>> I'm not sure why I would need lists in RDF.
>>> And I have a nagging suspicion that people that want them are trying to do things in programming ways, rather than stepping back and doing the KR.
>>> But I have no evidence for that :-)
>>>
>>> Best
>>> Hugh
>>>
>>>> On 2 Jun 2020, at 19:00, Peter Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>>
>>>> TLDR:  Without a firm notion of what chains are for and a specification
>>>> of how they work it is not possible to determine whether they are non-
>>>> problematic.
>>>>
>>>> On Mon, 2020-06-01 at 23:11 +0200, thomas lörtsch wrote:
>>>>> Hi Peter,
>>>>>
>>>>> my initial question when starting this thread was: what holds
>>>>> rdf:Containers back from having the same semantic weight as
>>>>> rdf:Lists? What could be done to overcome that deficit. There are
>>>>> some reasons why I was asking that and not the least was to
>>>>> understand semantics of lists - hence the subject line. With the
>>>>> question came the idea for an answer: an rdfx:length property to
>>>>> specify the length of a Container and thereby close it. With Pat's
>>>>> help that developed into an rdfx:last property because that way we
>>>>> don’t have to introduce arithmetic into the semantics. The intention
>>>>> is exactly the same: provide a means to close rdf:Containers, make
>>>>> them finite. Such a finite container could then be used instead of
>>>>> rdf:Lists. It would have a less obtrusive syntax, need fewer triples,
>>>>> perform much faster in the usual triple stores, would be easier to
>>>>> query and could be used with OWL DL. The only real downside so far:
>>>>> no syntactic sugar in Turtle - that would be another topic.
>>>>>
>>>>> So the short answer to your questions is: the aim is to provide
>>>>> Containers with a semantics similar to rdf:Lists respectively to find
>>>>> out what is needed to reach that goal.
>>>>>
>>>>> The longer answer is: rdf:Containers intuitively provide some
>>>>> semantics already. Membership properties are ordered by integer
>>>>> values, starting with 1 and incremented by 1 for each entry. What’s
>>>>> missing compared to lists is to be able to close a container, to make
>>>>> it finite. Most of your questions can intuitively be answered from
>>>>> there:
>>>>>
>>>>>> On 1. Jun 2020, at 18:54, Peter F. Patel-Schneider <
>>>>>> pfpschneider@gmail.com> wrote:
>>>>>>
>>>>>> I've been looking at the suggestions to set up chains in RDF and I
>>>>>> have a
>>>>>> number of questions.
>>>>>>
>>>>>> What are chains supposed to be for?
>>>>>
>>>>> See above.
>>>>
>>>> Saying that chains are supposed to have the same semantics as something
>>>> else doesn't say what chains are for.  There are lots of things that
>>>> chains could be for - OWL DL says that rdf:first/rdf:rest is for syntax
>>>> that needs a sequence of elements.
>>>>
>>>> You might have said that chains are for syntax, or for some more
>>>> general purpose, perhaps platonic unchangeable simple lists, but
>>>> without a purpose it is hard to determine whether any proposed
>>>> specification is correct or useable or useful.
>>>>
>>>>>> What are the semantics of chains (saying a bit more than
>>>>>> rdf:Collection is not adequate)?
>>>>>
>>>>> See above.
>>>>
>>>> Well, containers don't have significant semantics in RDF beyond their
>>>> triple semantics, just some informal hand waving that leaves very many
>>>> questions unanswered.
>>>> See https://www.w3.org/TR/rdf11-mt/#rdf-containers for more on the lack
>>>> of semantics of containers.
>>>>
>>>>>> Is a chain more than a bunch of triples?
>>>>>
>>>>> Yes, otherwise what would be the point?
>>>>
>>>> OK, what is this extra?
>>>>
>>>>>> What counts as a chain?
>>>>>> Can chains be circular?
>>>>>
>>>>> No.
>>>>>
>>>>>> Can they be infinite?
>>>>>
>>>>> No, that’s the whole point, remember?
>>>>>
>>>>>> Can they share heads?
>>>>>
>>>>> No, there can be only one membership property :_1
>>>>
>>>> Oops, I meant sharing tails, as in two lists that share a tail.
>>>> Implementing as in RDF containers prevents sharing tails.
>>>>
>>>>>> Can they be multi-tailed?
>>>>>
>>>>> No, there can be only one membership property per integer
>>>>
>>>> This is different, and only shows up if there is a separate tail aspect
>>>> of chains, which is not compatible with implementing as containers.
>>>>
>>>>>> Can they have multiple values for one position?
>>>>>
>>>>> Same question as before, isn’t it?
>>>>
>>>> No, but how is this specified or enforced?  What happens in an RDF
>>>> graph that contains triples like:
>>>> ex:C a rdfx:Chain .
>>>> ex:C rdf:_1 ex:a .
>>>> ex:C rdf:_1 ex:b .
>>>> ex:C rdfx:last rdf:_1.
>>>> Is this invalid syntax?  Does it imply that ex:a and ex:b denote the
>>>> same individual?
>>>>>
>>>>>> Can chains be elements of chains?
>>>>>
>>>>> Yes, but that is another topic
>>>>
>>>> Determining whether chains can be elements of chains is important, and
>>>> could be answered either way.
>>>>
>>>>>> Can chains be ungrounded?
>>>>>
>>>>> I don’t understand.
>>>>
>>>> Can a chain be an element of itself, maybe indirectly?  If so, the
>>>> chain can be called ungrounded.  Is this allowed?
>>>>
>>>>>> Is it possible to have a chain with no elements?
>>>>>
>>>>> No, as there is no :_0 membership property.
>>>>
>>>> That's very limiting.  Empty chains (or lists or sets or bags) are
>>>> important.
>>>>
>>>>> Maybe you mean missing list entries? Any list items at list positions
>>>>> less than or equal to the 'last' one that are not listed can be entailed
>>>>> as being blank nodes.
>>>>
>>>> OK, here is a semantic upgrade from triples, and from containers.  How
>>>> will this meaning be specified?
>>>>
>>>>>
>>>>>> In my view it is a good idea to determine what chains are supposed
>>>>>> to be
>>>>>> for and how they are supposed to work before the syntax (and any
>>>>>> axiomatization) of chains is presented, not least so that it can be
>>>>>> determined whether the syntax (and axiomatization, if present)
>>>>>> actually
>>>>>> supports what is wanted.
>>>>>
>>>>> I think we did that already and I hope you have a clearer picture
>>>>> now.
>>>>
>>>> There are still questions that need to be answered.  For example, are
>>>> two chains with the same elements the same chain?  As you have stated
>>>> above that there are some semantics beyond just triples, you need to
>>>> say what these semantics are.
>>>>
>>>>>> Making chains be rdf:Seq plus an explicit stop point appears to
>>>>>> answer some
>>>>>> of these questions, but the implications of this setup should be
>>>>>> spelled out
>>>>>> explicitly.  Further not all the questions are answered.
>>>>>
>>>>> Will they ever be?
>>>>
>>>> Yes.  You can just say that "This is the meaning of chains."  That
>>>> doesn't prohibit others from using chains in situations where more
>>>> meaning is required or suitable, but it does limit what *the* meaning of
>>>> chains is.
>>>>
>>>> In essence, you are proposing a new kind of thing, like a set.  How
>>>> does this thing work?
>>>>
>>>>>> For example, what happens if there are two rdf:_<n> values in a
>>>>>> chain for a particular <n>?
>>>>>
>>>>> It is an inconsistency.
>>>>> What happens if it happens to occur all the same is another question
>>>>> that we probably can’t answer.
>>>>
>>>> You mean an RDF inconsistency?  That seems very harsh.  If any chain in
>>>> an RDF graph has multiple values for any of its elements then the
>>>> entire graph becomes unsatisfiable.
>>>>
>>>>>> (This might be particularly problematic if the two values are both
>>>>>> container
>>>>>> membership properties.)  What happens if there are values for
>>>>>> rdf:_<n> with
>>>>>> n greater than the stop point?
>>>>>
>>>>> It is an inconsistency.
>>>>> What happens if it happens to occur all the same is another question
>>>>> that we probably can’t answer.
>>>>>
>>>>>
>>>>>> What happens if there are multiple stop points?  (This seems to be
>>>>>> particularly problematic.)
>>>>>
>>>>> It is an inconsistency.
>>>>> What happens if it happens to occur all the same is another question
>>>>> that we probably can’t answer.
>>>>>
>>>>>
>>>>>> What happens if the value of rdfx:last is not one of the
>>>>>> rdf:_<n>?
>>>>>
>>>>> If there is a bigger rdf:_<n> then it is an inconsistency, see above.
>>>>
>>>> So any unusual aspects of a chain render the entire RDF graph
>>>> meaningless.  That seems even more harsh.
>>>>
>>>>
>>>>>> What happens if a chain is one of its own elements?
>>>>>
>>>>> The question makes no sense to me. There are infinitely many ways to
>>>>> construct infinitely large or infinitely deep nested structures at any
>>>>> list item position. Why do you ask for that specific one? Is it an
>>>>> achievable goal to develop a definition that successfully rules out
>>>>> all pathological variants?
>>>>
>>>> This is something that is easy to do in RDF and similar patterns are
>>>> common in RDF graphs.   Is the following acceptable as a chain:
>>>> ex:C a rdfx:Chain .
>>>> ex:C rdf:_1 ex:C .
>>>> ex:C rdfx:last rdf:_1 .
>>>>
>>>> This is something that can be problematic, depending on what the
>>>> meaning of chains is supposed to be.
>>>>
>>>> Adding a new thing to RDF requires consideration of how it interacts
>>>> with the rest of RDF, including its "say anything about anything"
>>>> philosophy.
>>>>
>>>>
>>>>>> Are chains complicated because of the infinite vocabulary required?
>>>>>
>>>>> That would be a question that I would hope you could answer.
>>>>
>>>> This question cannot be answered without a specification of what chains
>>>> are supposed to be.  In some specifications the infinite vocabulary
>>>> isn't more of a problem than it is in RDF.  In others it might be.
>>>>
>>>>>> One way to "specify" chains, of course, is to just say that they
>>>>>> are a set of
>>>>>> triples, and no more.  I don't think that this is what is desired
>>>>>> here, though.
>>>>>
>>>>> I don’t think it is helpful or promising or even possible to rule out
>>>>> explicitly everything that you don’t think is desired. I hope however
>>>>> that you have a better idea now of the issue at hand and we can get
>>>>> to the real business.
>>>>
>>>> I disagree strongly.   The specification of chains could be very weak -
>>>> just triples, for example.  That specification is viable, but
>>>> apparently not what you want.  Other specifications might add
>>>> entailments, but then the interactions of these entailments with the
>>>> rest of RDF needs to be investigated.   It's very much like adding a
>>>> new construct to a programming language - its interactions with the
>>>> other constructs of the language need to be analyzed to see what
>>>> problems come up.  Without a firm specification of the construct it is
>>>> not possible to determine what problems it will create.
>>>>
>>>>> Pat tried his best to teach  me that RDF only _describes_, not
>>>>> _prescribes_ things - and lists are just another thing. Now, I surely
>>>>> need more time to fully wrap my head around that but it seems that
>>>>> all your questions and all my answers aim at prescribing, not
>>>>> describing. I noticed how the RDF Semantics document dances around
>>>>> that topic by clarifying that it can’t preclude all sorts of
>>>>> pathological containers and collections. Therefore I tried to come up
>>>>> with the least intrusive way to describe an intent about some
>>>>> container, namely that it has only a certain number of members.
>>>>> Everything else comes from the integer order of the membership
>>>>> properties. That was good enough for the last 20 years so it should
>>>>> be good for the future too. I can wholeheartedly assure that my
>>>>> honest intent is only to describe that some list of mine is finite
>>>>> and not prescribe something about anybody else's list.
>>>>>
>>>>> OWL DL makes some remarks what it expects from rdf:Lists, namely:
>>>>> "When a list pattern is matched to G, all list variables _:xi and
>>>>> _:xj with i ≠ j MUST be matched to different nodes; furthermore, it
>>>>> MUST NOT be possible to match the list pattern to two maximal subsets
>>>>> of G such that some list variable in the first pattern instance is
>>>>> matched to the same node as some (possibly different) variable in the
>>>>> second pattern instance. This is necessary in order to detect
>>>>> malformed lists such as lists with internal cycles, lists that share
>>>>> tails, and lists that cross."
>>>>> https://www.w3.org/TR/2012/REC-owl2-mapping-to-rdf-20121211/#Mapping_from_RDF_Graphs_to_the_Structural_Specification
>>>>>
>>>>> This is certainly more concise and complete than what I could come up
>>>>> with right now but essentially it’s not rocket science either, it’s
>>>>> plain text. If that’s all that’s needed, then: yes, I can do that too
>>>>> for rdf:Containers with a 'last' property. And the gist of it can be
>>>>> found above.
>>>>
>>>> OWL DL only handles some RDF graphs, and is quite explicit on which RDF
>>>> graphs are not suitable for OWL DL.  So OWL DL doesn't have to handle
>>>> all RDF graphs, which is not the case here.
>>>>
>>>>> I hope all this is enough so that you can now answer the question if
>>>>> such a 'last' property on rdf:Containers would break anything in OWL
>>>>> DL, catapult the ontology out of DL or otherwise look fishy and
>>>>> suspicious to you.
>>>>
>>>> Again, as far as OWL DL is concerned, only certain RDF graphs are
>>>> allowable.  As far as I can tell, some of these graphs could contain
>>>> chains.
>>>>
>>>>>
>>>>> Thomas
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> peter
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 5/24/20 11:29 AM, thomas lörtsch wrote:
>>>>>>> [Lots of earlier messages and some of this message snipped.]
>>>>>>> I do now understand how the OWA prohibits any explicit closing of
>>>>>>> a list in RDF, how RDF is all about _describing_ things, how only
>>>>>>> single triples can be a bearer of truth, how RDF terms themselves
>>>>>>> are not to be messed with and how the whole endeavour of formal
>>>>>>> semantics under an OWA is walking a very thin line between what
>>>>>>> may be inferred and what cannot be ruled out. Maybe. [0]
>>>>>>> However I also lost practically all faith in the formal semantics
>>>>>>> of Collections and Containers alike. If not even the simplest
>>>>>>> syntactic constraints - only one head, no branching - can be
>>>>>>> enforced then why bother at all with the semantics of a length
>>>>>>> attribute?
>>>>>>>
>>>>>>> Why even consider an arithmetic extension? Notwithstanding its
>>>>>>> usefulness in other contexts I’m not convinced that some
>>>>>>> arithmetic extension can ground the semantics of an
>>>>>>> rdfx:hasLength property when the rdf:Container it describes has
>>>>>>> so little formal standing to build on.
>>>>>>>
>>>>>>> One could make rdfx:hasLength an owl:AnnotationProperty so its
>>>>>>> semantics would definitely be reduced to handwaving, providing a
>>>>>>> hint to applications if some list probably is complete. Closing a
>>>>>>> list was deemed useful before but it was implemented with a
>>>>>>> verbose syntax and in OWL DL it's off limits for users. Lists are
>>>>>>> so important in practice that IMO that’s reason enough to
>>>>>>> introduce something along those lines, even with _very_ limited
>>>>>>> formal semantics.
>>>>>>>
>>>>>>> I was also pondering the graph based approach that Cory proposed
>>>>>>> but for a basic construct like lists (and trees and tables that
>>>>>>> can easily be built from it) it seems a waste. Graphs should be
>>>>>>> used for all kinds of stuff, even for structural features like n-
>>>>>>> ary relations, but lists - rather not. At least that’s my current
>>>>>>> thinking.
>>>>>>> I think it can be useful in a bigger context like being able to
>>>>>>> express that in some application/source/universeOfDiscourse all
>>>>>>> lists are closed. But I’d rather embed that in a semantic
>>>>>>> extension that fixes a few more things and formally defines a
>>>>>>> Closed World Scenario that applications often assume and
>>>>>>> require.
>>>>>>>
>>>>>>> Pat has in earlier mails suggested to mark the last item of a
>>>>>>> list instead of providing a length attribute. That didn’t really
>>>>>>> catch on with me because I lacked an idea how to do it. Meanwhile
>>>>>>> the following vocabulary extension bubbled up in my head:
>>>>>>>
>>>>>>>  rdfx:Chain rdfs:subClassOf rdfs:Container .
>>>>>>>  rdfx:last rdfs:domain rdfx:Chain .
>>>>>>>  rdfx:last rdfs:range rdfs:ContainerMembershipProperty .
>>>>>>>
>>>>>>>  _:L  rdf:_1  "a" .
>>>>>>>  _:L  rdf:_2  "b" .
>>>>>>>  _:L  rdf:_3  "c" .
>>>>>>>  _:L  rdfx:last rdf:_3 .
>>>>>>>
>>>>>>> I sort of like it but I’m not convinced that it's really more
>>>>>>> elegant.
>>>>>>> Fundamentally it doesn’t seem to make much difference:
>>>>>>> - Containers still provide only a semantically weak base
>>>>>>> - a missing 2nd slot would still need to be filled
>>>>>>> - a surplus 4th slot would still need to be ignored
>>>>>>>
>>>>>>> And maybe the counting business on ContainerMembershipProperties
>>>>>>> would still require an arithmetic extension? Which would still
>>>>>>> not be worth the trouble because it would only stand on
>>>>>>> Collections’ shifting semantic sands?
>>>>>>>
>>>>>>>
>>>>>>> BTW: I don’t like the name "Chain". I would prefer "Series" but
>>>>>>> I’m not a native speaker and not sure if it captures the intended
>>>>>>> purpose well enough. Also "Seq" and "Ser" are easy to confuse
>>>>>>> (but "Ser" gets filed one after  "Seq", so that’s good!). "Fin de
>>>>>>> Seq" would of course be even nicer.
>>>>>>>
>>>>>>>
>>>>>>> Thomas
>>>>>>>
>>>>>>>
>>>>>>> [0] And that the RDF Semantics at https://www.w3.org/TR/rdf11-mt/
>>>>>>> use the term "intent" although I got ridiculed for introducing
>>>>>>> it a few mails ago: "The intended mode of use is that things of
>>>>>>> type rdf:Bag are considered to be… " etc. Ha!
>>>>>>>
>>>>>> [Lots of previous messages snipped.]
>>>
>>
> 

Received on Wednesday, 10 June 2020 20:30:45 UTC