Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from Anthony Moretti on 2022-09-22 (semantic-web@w3.org from September 2022)

From: Anthony Moretti <anthony.moretti@gmail.com>
Date: Thu, 22 Sep 2022 09:55:24 +0700
To: Pierre-Antoine Champin <pierre-antoine@w3.org>
Cc: David Booth <david@dbooth.org>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <CACusdfRAhM_+0o1_UkP=NcOUO_9dEriOARpycjPg7FR4doizdg@mail.gmail.com>
Also, if it hasn't been clear, I'm proposing four types of identifiable
collection. I've only been giving two examples because I think two is
enough to demonstrate the idea. The four types:

  @set
  @list
  @closedSet
  @closedList

For value collections only two types are needed because value collections,
like all value types, are closed by default:

  @set
  @list

As is common:

  Set: unordered, with a uniqueness restriction.
  List: ordered, without a uniqueness restriction.
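
The four kinds can be modelled as two independent flags, ordered/unordered and open/closed. The Python sketch below is only an illustration. The `Collection` class, its field names, and the reading of "closed" as "the listed members are all there are" are assumptions, not part of any RDF specification or of the syntax proposed in this thread.

```python
from dataclasses import dataclass
from typing import Hashable, Tuple


@dataclass(frozen=True)
class Collection:
    """Hypothetical model of @set, @list, @closedSet and @closedList."""
    members: Tuple[Hashable, ...]
    ordered: bool  # True for @list / @closedList
    closed: bool   # True for @closedSet / @closedList (assumed meaning)

    def normalized(self):
        # Lists keep order and duplicates; sets drop both.
        return self.members if self.ordered else frozenset(self.members)

    def same_members(self, other: "Collection") -> bool:
        # Compare the listed members under set or list semantics.
        return (self.ordered == other.ordered
                and self.normalized() == other.normalized())

    def may_contain(self, item: Hashable) -> bool:
        # A closed collection rules out unlisted members; an open one cannot.
        return item in self.members or not self.closed


# Open set, as in the TeamUSA2020 example quoted below.
team = Collection((":SimoneBiles", ":MeganRapinoe", ":KevinDurant"),
                  ordered=False, closed=False)

# Closed list, as in the NycToSydneyFlight example quoted below.
flight = Collection((":NycToLAFlight", ":LALayover", ":LAToSydneyFlight"),
                    ordered=True, closed=True)

# Value set: closed, unordered, duplicates ignored.
jersey_numbers = Collection((23, 45), ordered=False, closed=True)
```

Under these assumptions, `team.may_contain(":AllysonFelix")` is true because the set is open, while `flight.may_contain(":TokyoLayover")` is false because the list is closed.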

Anthony






On Thu, Sep 22, 2022 at 9:28 AM Anthony Moretti <anthony.moretti@gmail.com>
wrote:

> Last one from me.
>
> If you wanted to strip it down to the minimum possible idea, a
> syntactically simpler way of grouping things than first-rest ladders, it
> might look like the following:
>
> Changes from last proposal:
>
>    - No inline definition of a collection - they take their own line, and
>    only the collection ID is used in other triples.
>    - Type information is described on its own line, the same as it is in
>    existing RDF.
>    - No compact representation of multiple triples.
>    - Delimiter syntax closer to Turtle than JSON.
>
>
> 1. Open set.
>
>   [
>     @id: :TeamUSA2020,
>     @set: (
>       :SimoneBiles,
>       :MeganRapinoe,
>       :KevinDurant,
>     )
>   ]
>   :TeamUSA2020 :type :OlympicTeam
>   :TeamUSA2020 :medalCount 113
>
> 2. Closed list.
>
>   [
>     @id: :NycToSydneyFlight,
>     @closedList: (
>       :NycToLAFlight,
>       :LALayover,
>       :LAToSydneyFlight,
>     )
>   ]
>   :NycToSydneyFlight :type :Flight
>   :NycToSydneyFlight :durationInHours 22
>
> 3. Value set (value collections are closed by definition, like all value
> types).
>
>   :MichaelJordan :jerseyNumbers @set(23, 45)
>
> Anthony
>
> On Wed, Sep 21, 2022 at 10:53 PM Anthony Moretti <
> anthony.moretti@gmail.com> wrote:
>
>> If the aim was to migrate everyone away from core RDF to these extra
>> layers then I'd agree, but if it's not, and if there's still a significant
>> percentage of people using core RDF, shouldn't we continue trying to make
>> core RDF more complete and maybe easier to use?
>>
>> People have been asking for a better way to do lists for so long; there's
>> something behind that, isn't there? Maybe there's something I'm not getting
>> though (if so, maybe someone could explain it to me). Happy to keep trying
>> to contribute ideas anyway.
>>
>> Anthony
>>
>> On Wed, Sep 21, 2022 at 4:46 PM Pierre-Antoine Champin <
>> pierre-antoine@w3.org> wrote:
>>
>>>
>>> On 20/09/2022 00:17, David Booth wrote:
>>> > Hi Pierre-Antoine,
>>> >
>>> > On 9/18/22 21:36, Pierre-Antoine Champin wrote:
>>> >> it seems to me that RDF + Shapes + Ontology gives you all this already:
>>> >
>>> > Agreed, but that kind of illustrates the point: yes, these things are
>>> > *possible* to do using RDF, but they are substantially more difficult
>>> > than they should be.  Objects (composed of key-value pairs, which can
>>> > also give us n-ary relations) and arrays are so basic to developers,
>>> > they should be *easy*, not merely possible.
>>>
>>> I totally agree.
>>>
>>> However, I am more inclined to handle this as an extra layer on top of
>>> the core of RDF, than by making the core more complex.
>>>
>>> For example, JSON-LD provides such an extra layer, as it is designed to
>>> let people use it as "plain" JSON if they want to, handling objects,
>>> lists... as one is used to. (I'm not implying, of course, that JSON-LD
>>> solves all the problems -- only illustrating my point above.)
>>>
>>>    pa
>>>
>>> >
>>> > According to the DB-Engines site, of the top 10 graph databases, RDF
>>> > databases have only 14% of the market.  And even that is probably an
>>> > over-count, because most of those RDF databases are actually
>>> > multi-modal, so it isn't clear how many of them are being used for
>>> > their RDF capability.
>>> > https://db-engines.com/en/ranking/graph+dbms
>>> >
>>> > If it were easier in RDF to do things that are trivially easy for
>>> > programmers to do in non-RDF applications, I think RDF could get much
>>> > greater uptake.
>>> >
>>> > Best wishes,
>>> > David Booth
>>> >
>>> >>
>>> >> - Shapes can be used to guarantee that any node with a :disease
>>> >> property also has a :probability property (and vice-versa) -- and
>>> >> that these properties can't have multiple values.
>>> >>
>>> >> - Ontologies can be used to guarantee that any two nodes with the
>>> >> same :disease and :probability values are owl:sameAs.
>>> >>
>>> >> All your examples would then work with the standard [] syntax instead
>>> >> of the new @[] syntax.
>>> >>
>>> >>
>>> >> Note that Shapes + Ontologies can also be used for lists,
>>> >> constraining first/rest ladders to be well-formed. Granted, this
>>> >> would require
>>> >>
>>> >> 1) to solve the problem of rdf:first/rdf:rest not being allowed in
>>> >> OWL A-boxes, and
>>> >> 2) to extend the SPARQL syntax to make it more convenient to query lists
>>> >>
>>> >> but none of it, in my opinion, calls for an extension of RDF itself.
>>> >>
>>> >>    pa
>>> >>
>>> >> On 18/09/2022 13:20, David Booth wrote:
>>> >>> Great discussion!  It seems that lists and n-ary relations are
>>> >>> closely related, in that one could view a list as a set of key-value
>>> >>> pairs (or predicate-object pairs) of an n-ary relation.
>>> >>>
>>> >>> For example, if the Turtle list syntax were used to express a
>>> >>> built-in list object -- or more properly an *array* object -- rather
>>> >>> than a first-rest ladder of triples, then this example:
>>> >>>
>>> >>>   # Example 1
>>> >>>   :dogShow :winners ( :ginger :bailey ) .
>>> >>>
>>> >>> might be almost equivalent to:
>>> >>>
>>> >>>   # Example 2
>>> >>>   :dogShow :winners [
>>> >>>     0 :ginger ;
>>> >>>     1 :bailey
>>> >>>   ] .
>>> >>>
>>> >>> if integers could be used as predicates, which they can in
>>> >>> generalized RDF.
>>> >>> https://www.w3.org/TR/rdf11-concepts/#section-generalized-rdf
>>> >>>
>>> >>> However, example 1 expresses a single triple, whereas example 2
>>> >>> expresses three triples.
>>> >>>
>>> >>> In languages that manipulate RDF, such as SPARQL and various
>>> >>> programming languages, it is always helpful to have ways to convert
>>> >>> between a built-in construct and its constituent parts, and this can
>>> >>> either be done implicitly or with explicit operators.  Implicit
>>> >>> conversion offers more convenience, but at the price of being more
>>> >>> error prone.  For example, if SPARQL did this conversion implicitly,
>>> >>> the ordered list of winners from example 1 above might be obtained by:
>>> >>>
>>> >>>   # Example 3: implicit conversion from list to set of triples
>>> >>>   SELECT ?winner ?index
>>> >>>   WHERE {
>>> >>>     :dogShow :winners [ ?index ?winner ]
>>> >>>   }
>>> >>>   ORDER BY ?index
>>> >>>
>>> >>> On the other hand, if an explicit "@[ ... ]" operator were instead
>>> >>> added to SPARQL, to convert a built-in list to its equivalent set of
>>> >>> explicit triples, then the query might look like this:
>>> >>>
>>> >>>   # Example 4: explicit conversion from list to set of triples
>>> >>>   SELECT ?winner ?index
>>> >>>   WHERE {
>>> >>>     :dogShow :winners @[ ?index ?winner ]
>>> >>>   }
>>> >>>   ORDER BY ?index
>>> >>>
>>> >>> I'm just making up a possible syntax here for illustrative purposes.
>>> >>> Some other syntax might be better.
>>> >>>
>>> >>> A method should also be provided to go the other direction: convert
>>> >>> a set of triples into the equivalent built-in object. And although I
>>> >>> think that sets and bags would also be useful, I think they could be
>>> >>> readily layered on top of lists/arrays if we get proper built-in
>>> >>> list/array support.
>>> >>>
>>> >>> Example 2 above is strikingly similar to a commonly used idiom for
>>> >>> encoding an n-ary relation:
>>> >>>
>>> >>>   # Example 5
>>> >>>   :christine :diagnosis [
>>> >>>     :disease :breastCancer ;
>>> >>>     :probability 0.8
>>> >>>   ] .
>>> >>>
>>> >>> Idioms for n-ary relations are explained in
>>> >>> https://www.w3.org/TR/swbp-n-aryRelations/
>>> >>>
>>> >>> This similarity that others have pointed out between lists and n-ary
>>> >>> relations seems like good news, because it suggests that if we can
>>> >>> figure out how to add one to RDF, we can also add the other, and
>>> >>> both are sorely needed for convenience.  For reasons why, see:
>>> >>> https://github.com/w3c/EasierRDF/issues/74
>>> >>> https://github.com/w3c/EasierRDF/issues/20
>>> >>>
>>> >>> Example 5 above is really a work-around for the lack of native n-ary
>>> >>> relations in RDF.  It expresses three triples:
>>> >>>
>>> >>>   # Example 5a -- ntriples for example 5
>>> >>>   :christine :diagnosis _:b0 .
>>> >>>   _:b0 :disease :breastCancer .
>>> >>>   _:b0 :probability 0.8 .
>>> >>>
>>> >>> However, inspired by example 4 above, perhaps a similar syntax could
>>> >>> be used to write an n-ary relation that would treat Christine's
>>> >>> suspected disease and probability as a single entity participating
>>> >>> in the :diagnosis relation:
>>> >>>
>>> >>>   # Example 6
>>> >>>   :christine :diagnosis @[
>>> >>>     :disease :breastCancer ;
>>> >>>     :probability 0.8
>>> >>>   ] .
>>> >>>
>>> >>> This differs from example 5 because example 6 expresses a *single*
>>> >>> triple that connects :christine with a diagnosis object -- not 3
>>> >>> triples.  The order in which the diagnosis properties are listed has
>>> >>> no effect -- they are a set:
>>> >>>
>>> >>>   # Example 7a: property order does not matter
>>> >>>   @[ :probability 0.8 ; :disease :breastCancer ]
>>> >>>      owl:sameAs  @[ :disease :breastCancer ; :probability 0.8 ] .
>>> >>>
>>> >>> and adding or removing a property makes it different:
>>> >>>
>>> >>>   # Example 7b
>>> >>>   @[ :probability 0.8 ; :disease :breastCancer ]
>>> >>>      :NOT_sameAs  @[ :disease :breastCancer ; :probability 0.8 ; :year 2022 ] .
>>> >>>
>>> >>> Trying to specify the same property twice should be a syntax error:
>>> >>>
>>> >>>   # Example 7c -- INVALID -- SYNTAX ERROR!
>>> >>>   :christine :diagnosis @[
>>> >>>     :disease :breastCancer ;
>>> >>>     :disease :colonCancer ;
>>> >>>     :probability 0.8
>>> >>>   ] .
>>> >>>
>>> >>> But the following would not be a syntax error, even if it may be
>>> >>> semantically wrong:
>>> >>>
>>> >>>   # Example 7d
>>> >>>   :malady owl:sameAs :disease .
>>> >>>   :christine :diagnosis @[
>>> >>>     :disease :breastCancer ;
>>> >>>     :malady :colonCancer ;
>>> >>>     :probability 0.8
>>> >>>   ] .
>>> >>>
>>> >>> And of course, these constructs could be nested as desired.
>>> >>>
>>> >>> I think something like this could meet the need for n-ary relations
>>> >>> in some future RDF syntax.  And based on previous comments by Pat
>>> >>> and Anthony, it sounds like the semantics would not be a problem.
>>> >>>
>>> >>> Thanks very much to Thomas, Pat, Anthony and others for a very
>>> >>> helpful discussion!
>>> >>>
>>> >>> David Booth
>>> >>>
>>> >
>>>
>>
Received on Thursday, 22 September 2022 02:55:50 UTC