RE: defining the semantics of lists

Pat,
>It is not easy to
> see how to do this, however. I have thought about this on and off for about
> a decade or more, and have not come up with a workable general way to do  it.

I know you have thought about these topics deeply, so where is the difficulty? For example, the triple {?Thing a NyExchangeListedStock} may only be asserted or inferred in NyExchangeGraph.

> data that it happens to have at any given
> moment - the ‘current graph’, so to speak.

Sure, temporality is its own can of worms (don't get me started). Whatever strategy someone may have adopted to ignore or express temporality would seem to be able to leverage a closed set capability. We could, for example, construct a closed temporal snapshot graph based on the timeframe of reified relationships. But many ontologies are limited to "now" so it would not be a factor.
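
A minimal sketch of such a snapshot, in plain Python rather than any RDF library. The `Relationship` record, its `valid_from`/`valid_to` fields and the `snapshot` function are hypothetical names for illustration, and the half-open validity interval is an assumption:

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class Relationship:
    """A reified relationship with a validity timeframe (hypothetical shape)."""
    subject: str
    predicate: str
    obj: str
    valid_from: date
    valid_to: Optional[date] = None  # None means still in effect


def snapshot(rels: Iterable[Relationship], at: date) -> Set[Tuple[str, str, str]]:
    """Return the plain triples in effect on the date `at`.

    Assumption: validity is the half-open interval [valid_from, valid_to).
    """
    return {
        (r.subject, r.predicate, r.obj)
        for r in rels
        if r.valid_from <= at and (r.valid_to is None or at < r.valid_to)
    }
```

The result is an ordinary set of triples for one point in time, which could then be treated as closed for the types and predicates of interest.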

> SPARQL can of course report that a  query has failed relative to some graph

At least one SPARQL engine, Stardog, could not construct "ultimate parent" due to the OWA.
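
To make the inference concrete, here is a rough Python sketch. It assumes the parentage records form a closed set, so a company with no recorded parent is taken to have none. `PARENT_OF`, `ultimate_parent` and the company names are hypothetical, not GLEIF data or any engine's API:

```python
from typing import Dict, Optional

# Hypothetical closed set of direct-parent facts (child -> parent).
# Assumption: the set is complete, so a missing key means "no parent".
PARENT_OF: Dict[str, str] = {
    "SubsidiaryA": "HoldingB",
    "HoldingB": "GroupC",
}


def ultimate_parent(company: str, parent_of: Dict[str, str]) -> Optional[str]:
    """Follow parent links until a company with no recorded parent is reached.

    Returns None when the company itself has no recorded parent. The last
    step is negation as failure: "no recorded parent" is read as "no parent",
    which is only sound if the set really is closed.
    """
    seen = {company}
    current = company
    while current in parent_of:
        current = parent_of[current]
        if current in seen:  # guard against cyclic data
            raise ValueError(f"cycle in parentage at {current}")
        seen.add(current)
    return None if current == company else current
```

Under an open-world reading the loop could never safely stop, since another parent might always exist elsewhere.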

> Nope, that does not work. Just listing the things is not enough, you also
> need a way to say what kinds of facts are being ‘closed’. For example, a list
> of employees might be complete in the sense that it lists them all, but not
> in the sense that it says everything that can be said about them. 

Yes, that is what I was suggesting in being able to express what types or predicates about specific types would be "closed" in a particular graph.
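
As a sketch only, assuming a graph is a plain set of triples and that closure is declared per (predicate, object) pattern. `GRAPH`, `CLOSED` and `holds` are hypothetical names, not part of RDF or any library:

```python
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical graph: a plain set of (subject, predicate, object) triples.
GRAPH: Set[Triple] = {
    ("acme", "type", "ListedStock"),
    ("acme", "ticker", "ACM"),
    ("initech", "type", "ListedStock"),
}

# Hypothetical closure declaration: the (predicate, object) patterns for
# which this graph is declared complete. Everything else stays open-world.
CLOSED: Set[Tuple[str, str]] = {("type", "ListedStock")}


def holds(triple: Triple, graph: Set[Triple],
          closed: Set[Tuple[str, str]]) -> Optional[bool]:
    """Return True if asserted, False if absent but closed, None if unknown."""
    if triple in graph:
        return True
    _, predicate, obj = triple
    if (predicate, obj) in closed:
        return False  # negation as failure, licensed by the declaration
    return None  # open world: absence says nothing
```

So the graph can be complete about which stocks are listed without claiming to say everything that can be said about them.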

> And order is irrelevant (or at any rate is a different topic.)

Agreed, but I included it to complete the requested "semantics of graphs". The discussion about first/next, etc. seems to be more of a data structure approach, whereas "ordered closed set" is more semantic.

-Cory

> -----Original Message-----
> From: Patrick J Hayes <phayes@ihmc.us>
> Sent: Monday, May 18, 2020 12:44 PM
> To: Cory Casanave <cory-c@modeldriven.com>
> Cc: thomas lörtsch <tl@rat.io>; Semantic Web <semantic-web@w3.org>
> Subject: Re: defining the semantics of lists
> 
> Hi Cory
> 
> > On May 18, 2020, at 11:26 AM, Cory Casanave <cory-c@modeldriven.com> wrote:
> >
> > Pat,
> > Your summary provides valuable and important context, but it seems to
> be missing an important aspect with respect to "It describes things". There
> are two aspects of "open-world", the one you cite "never assume that all
> the relevant data is known about some topic" and the existence criteria for
> topics (things). For some things, their inclusion in some known set is their
> differentia.
> >
> > For example, consider a N.Y. Stock exchange listed stock. The set of these
> stocks at any point in time is well known - you can't just "infer" one, it has
> to be listed - a very specific process and set of requirements. There is some
> information about that listed stock that is also curated by the exchange (it
> is a class, not just a predicate). So the inclusion in a closed set is part of the
> semantics of a listed stock. You can query the list of stocks, one is either
> listed or not. However, it is fine to say that anyone can say anything about
> such a stock.
> >
> > Another example; A company with an LEI (https://www.gleif.org/en/) -
> same situation, it is a managed set. One of the facts managed by GLEIF is
> company parentage, as reported under strict legal guidelines. A derivative
> fact is then the "ultimate parent" of any GLEIF listed company (as their
> parents have to be listed as well). Under a strict "open-world assumption" we
> can't determine the ultimate parent as we can never know if there may be
> another "out in the wild" - yet we do know.
> >
> > We could go on with examples; contracted customers of an organization,
> etc. There are also cases where we may choose to treat some set as closed
> for a particular reasoning task. Most examples seem to be based on social
> constructs.  Without closed sets there are needed inferences that can't be
> stated, the ultimate parent being just one example. This seems like more of
> an "open-world mandate" than assumption, as the assumption can't be
> changed.
> 
> Closed world rather than closed set, but yes I agree. There are lots of
> examples of ‘closed’ collections of data out there. But RDF had to adopt an
> open-world stance towards data taken as a whole, rather than a closed-
> world stance towards the data that it happens to have at any given
> moment - the ‘current graph’, so to speak. And in any case, the key signal of
> closed-world reasoning is negation as failure: if you can't prove something,
> assume it is false (because in a closed world, if it were true you would be
> able to prove it). And nothing about ‘falsity’ can be expressed in RDF, since
> it does not have negation, of any kind. SPARQL can of course report that a
> query has failed relative to some graph, so one can use negation-by-failure
> reasoning with RDF data, just not express the result itself in RDF.
> 
> >
> > We have an existing capability for managing sets of facts - the graph. The
> graph is already a "closed" set of facts. What we can't express is that a
> graph may be complete for a specific set of types or a specific set of
> predicates about a specific set of types. With that capability we could
> express a closed set - something that is REAL in our world.
> 
> Yes, that ability – to say explicitly, in the data, that a certain set of data is
> complete wrt some kinds of information – would enable closed worlds to
> be reasoned about in an open-world reasoning framework. It is not easy to
> see how to do this, however. I have thought about this on and off for about
> a decade or more, and have not come up with a workable general way to do
> it.
> >
> > A "list" is then just an ordering of the things in a closed graph.
> 
> Nope, that does not work. Just listing the things is not enough, you also
> need a way to say what kinds of facts are being ‘closed’. For example, a list
> of employees might be complete in the sense that it lists them all, but not
> in the sense that it says everything that can be said about them. And order
> is irrelevant (or at any rate is a different topic.)
> 
> Pat
> 
> > There are a few ways to express order, such as a value ordering by some
> predicate. The needed capability for closed sets and some ordering criteria
> would seem to satisfy the need for a semantics of lists.
> >
> > -Cory Casanave
> >
> >> -----Original Message-----
> >> From: Patrick J Hayes <phayes@ihmc.us>
> >> Sent: Friday, May 15, 2020 12:07 AM
> >> To: thomas lörtsch <tl@rat.io>
> >> Cc: Semantic Web <semantic-web@w3.org>
> >> Subject: Re: defining the semantics of lists
> >>
> >> Hi Thomas
> >>
> >> Let me explain why the semantics of the RDF containers is the way it is.
> >> Several members of the RDF WG were surprised by this, but it kind of
> >> follows inevitably from other, more basic, design decisions of RDF.
> >>
> >> First, RDF is NOT designed to be a datastructure language: it is a
> >> descriptive language. It describes things. The semantics is entirely
> >> set up with this basic design decision in mind. And second, it
> >> describes things under an open- world assumption. That point
> >> (open-world vs. closed-world) was always controversial, but it was
> >> thrashed out very early in the design process and became a
> >> fundamental design choice, on the grounds that a Web-based
> >> description language can never assume that all the relevant data is
> >> known about some topic. So this means that given any piece of RDF,
> >> you can cut out some piece of it, or adjoin some more RDF to it, without
> >> anything breaking.
> >>
> >> So now, how could rdfx:ClosedSeq work? Presumably it would come along
> >> with a bunch of assertions about the first, second, third etc.
> >> elements of the seq, and maybe a way of saying that the one of them
> >> is the last item, so we might need rdfx:LastItemIn. Suppose however
> >> that we simply don’t have a triple that specifies the second element.
> >> Is this an error? Or just an incomplete description? If the latter,
> >> what if we omit the LastItemIn triple; then we don't know how long
> >> this seq is. Is that also an incomplete description (as the
> >> open-world assumption requires) or is it an error? What happens if we
> >> are told that A is the second item and also that B is the second
> >> item? Is that an error, or can we conclude that owl:sameAs A B ?  If
> >> we take the open-world choice in these cases then this is hardly
> >> distinguishable from what we have already. But if we say that
> >> incomplete or ‘excessive’ information is an error then we don’t
> >> really have an RDF graph, since the extra constraints amount to a
> >> fundamental change to the idea of graph syntax. A ‘legal’ RDF graph
> >> now is not just a set of triples: it has global constraints on what must be
> >> present or what is allowed to be present. This is of course possible, but it
> >> would change RDF fundamentally.
> >>
> >> Now, another way to go would be to say that RDF needs containers, but
> >> it doesn’t need to describe them using triples. We could just allow a
> >> new kind of construct as a node in a triple (in addition to IRIs,
> >> Bnodes and literals) and give it its own definition. We would have to
> >> invent new syntax to represent them, of course, which would break all
> >> known RDF engines, but maybe it would be worth it (?) Then sequences
> >> (etc) would be much more like conventional datastructures. Of course,
> >> the semantics would have to say something about these things, but not
> >> much. (For example, we might require that IRIs inside sequences
> >> denote the same thing as they do outside, basic things like that.) This
> >> would not break the open world assumption.
> >>
> >> Anyway, I hope this helps people think about what the issues are :-)
> >>
> >> Best wishes
> >>
> >> Pat
> >>
> >>
> >>> On May 14, 2020, at 8:18 AM, thomas lörtsch <tl@rat.io> wrote:
> >>>
> >>> I’m aware that the topic of lists in RDF can ignite lively debate
> >>> nearly as
> >> much as blank nodes so my apologies in advance. I have a very
> >> specific question and I don’t intend to discuss the use of lists in
> >> OWL, syntactic sugar in Turtle, querying in SPARQL or historic
> >> details about how some decisions came to be (although I do find all
> >> that very interesting, but another time...).
> >>>
> >>> Lists from the RDF container vocabulary - rdf:Seq, rdf:Bag and
> >>> rdf:Alt -
> >> can’t be closed whereas lists from the collection vocabulary -
> >> rdf:List - are always closed. Collections, in contrast to
> >> containers, are popularly considered to "have semantics" because of this
> closing characteristic.
> >>> The excruciatingly exact Lisp-style modelling of rdf:Lists through
> >> rdf:first/rest/nil properties does indeed leave no room for
> >> misunderstanding about the listiness and closedness of an rdf:List.
> >>> What level of semantics could be provided for containers by
> >>> explicitly
> >> defining either a closed container, e.g. rdfx:ClosedSeq, or an
> >> appropriate property, e.g. rdfx:hasLength, that implicitly closes a
> container?
> >>>
> >>> I reckon that semantics introduced per definition are always
> >>> somewhat
> >> weaker than semantics that emanate naturally and unmistakably from a
> >> datastructure itself. However collections have a lot of disadvantages
> >> (that I promised above not to discuss) and I wonder how workable the
> >> semantics provided by defining an rdfx:ClosedSeq class or an
> >> rdfx:hasLength property would be. Would they be able to take some
> >> load, none at all, not enough, or almost the same as collections?
> >>>
> >>> Thomas
> >>
> >

Received on Monday, 18 May 2020 18:01:37 UTC