Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from David Booth on 2022-09-19 (semantic-web@w3.org from September 2022)

From: David Booth <david@dbooth.org>
Date: Mon, 19 Sep 2022 18:17:05 -0400
To: Pierre-Antoine Champin <pierre-antoine@w3.org>, "semantic-web@w3.org" <semantic-web@w3.org>
Message-ID: <33a7c55c-8d4c-ba54-001c-360d963259ab@dbooth.org>

Hi Pierre-Antoine,

On 9/18/22 21:36, Pierre-Antoine Champin wrote:
> it seems to me that RDF + Shapes + Ontology gives you all this already:

Agreed, but that kind of illustrates the point: yes, these things are 
*possible* to do using RDF, but they are substantially more difficult 
than they should be.  Objects (composed of key-value pairs, which can 
also give us n-ary relations) and arrays are so basic to developers, 
they should be *easy*, not merely possible.

According to the DB-Engines site, of the top 10 graph databases, RDF 
databases have only 14% of the market.  And even that is probably an 
over-count, because most of those RDF databases are actually 
multi-modal, so it isn't clear how many of them are being used for their 
RDF capability.
https://db-engines.com/en/ranking/graph+dbms

If it were easier in RDF to do things that are trivially easy for 
programmers to do in non-RDF applications, I think RDF could get much 
greater uptake.

Best wishes,
David Booth

> 
> - Shapes can be used to guarantee that any node with a :disease property 
> also has a :probability property (and vice-versa) -- and that these 
> properties can't have multiple values.
> 
> - Ontologies can be used to guarantee that any two nodes with the same 
> :disease and :probability values are owl:sameAs.
> 
> All your examples would then work with the standard [] syntax instead of 
> the new @[] syntax.
> 
> 
> Note that Shapes + Ontologies can also be used for lists, constraining 
> first/rest ladders to be well-formed. Granted, this would require
> 
> 1) to solve the problem of rdf:first/rdf:rest not being allowed in OWL 
> A-boxes, and
> 2) to extend the SPARQL syntax to make it more convenient to query lists
> 
> but none of it, in my opinion, calls for an extension of RDF itself.
> 
>    pa
> 
> On 18/09/2022 13:20, David Booth wrote:
>> Great discussion!  It seems that lists and n-ary relations are closely 
>> related, in that one could view a list as a set of key-value pairs (or 
>> predicate-object pairs) of an n-ary relation.
>>
>> For example, if the Turtle list syntax were used to express a built-in 
>> list object -- or more properly an *array* object -- rather than a 
>> first-rest ladder of triples, then this example:
>>
>>   # Example 1
>>   :dogShow :winners ( :ginger :bailey ) .
>>
>> might be almost equivalent to:
>>
>>   # Example 2
>>   :dogShow :winners [
>>     0 :ginger ;
>>     1 :bailey
>>   ] .
>>
>> if integers could be used as predicates, which they can in generalized 
>> RDF. https://www.w3.org/TR/rdf11-concepts/#section-generalized-rdf
>>
>> However, example 1 expresses a single triple, whereas example 2 
>> expresses three triples.
>>
>> In languages that manipulate RDF, such as SPARQL and various 
>> programming languages, it is always helpful to have ways to convert 
>> between a built-in construct and its constituent parts, and this can 
>> either be done implicitly or with explicit operators.  Implicit 
>> conversion offers more convenience, but at the price of being more 
>> error prone.  For example, if SPARQL did this conversion implicitly, 
>> the ordered list of winners from example 1 above might be obtained by:
>>
>>   # Example 3: implicit conversion from list to set of triples
>>   SELECT ?winner ?index
>>   WHERE {
>>    :dogShow :winners [ ?index ?winner ]
>>    }
>>   ORDER BY ?index
>>
>> On the other hand, if an explicit "@[ ... ]" operator were instead 
>> added to SPARQL, to convert a built-in list to its equivalent set of 
>> explicit triples, then the query might look like this:
>>
>>   # Example 4: explicit conversion from list to set of triples
>>   SELECT ?winner ?index
>>   WHERE {
>>    :dogShow :winners @[ ?index ?winner ]
>>    }
>>   ORDER BY ?index
>>
>> I'm just making up a possible syntax here for illustrative purposes. 
>> Some other syntax might be better.
>>
>> A method should also be provided to go the other direction: convert a 
>> set of triples into the equivalent built-in object.  And although I 
>> think that sets and bags would also be useful, I think they could be 
>> readily layered on top of lists/arrays if we get proper built-in 
>> list/array support.
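To make the two directions concrete, here is a minimal Python sketch. It assumes the array hangs off a single node (written with the hypothetical label "_:winners") and that integers may be used as predicates, as in generalized RDF. The function names array_to_triples and triples_to_array are made up for illustration and are not part of any RDF library or proposal.

```python
from typing import List, Tuple

# (subject, integer predicate, object); integer predicates assume generalized RDF
Triple = Tuple[str, int, str]


def array_to_triples(node: str, items: List[str]) -> List[Triple]:
    """Expand an array into one (node, index, item) triple per element."""
    return [(node, index, item) for index, item in enumerate(items)]


def triples_to_array(node: str, triples: List[Triple]) -> List[str]:
    """Rebuild the array for `node`, ordering elements by integer predicate."""
    indexed = sorted((p, o) for s, p, o in triples if s == node)
    indices = [p for p, _ in indexed]
    # A well-formed array uses each index 0..n-1 exactly once.
    if indices != list(range(len(indexed))):
        raise ValueError("indices must be exactly 0..n-1, no gaps or repeats")
    return [o for _, o in indexed]


# Hypothetical node label; triple order in the input does not matter.
triples = array_to_triples("_:winners", [":ginger", ":bailey"])
print(triples_to_array("_:winners", list(reversed(triples))))
```

The index check is one possible design choice: it rejects sets of triples that do not correspond to any array, which is the kind of well-formedness a built-in array would give for free.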
>>
>> Example 2 above is strikingly similar to a commonly used idiom for 
>> encoding an n-ary relation:
>>
>>   # Example 5
>>   :christine :diagnosis [
>>     :disease :breastCancer ;
>>     :probability 0.8
>>   ] .
>>
>> Idioms for n-ary relations are explained in 
>> https://www.w3.org/TR/swbp-n-aryRelations/
>>
>> This similarity that others have pointed out between lists and n-ary 
>> relations seems like good news, because it suggests that if we can 
>> figure out how to add one to RDF, we can also add the other, and both 
>> are sorely needed for convenience.  For reasons why, see:
>> https://github.com/w3c/EasierRDF/issues/74
>> https://github.com/w3c/EasierRDF/issues/20
>>
>> Example 5 above is really a work-around for the lack of native n-ary 
>> relations in RDF.  It expresses three triples:
>>
>>   # Example 5a -- N-Triples-style expansion of example 5
>>   :christine :diagnosis _:b0 .
>>   _:b0 :disease :breastCancer .
>>   _:b0 :probability 0.8 .
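The expansion from example 5 to example 5a can be sketched in Python as follows. This is only an illustration: expand_nary is a hypothetical helper, and the blank node label "_:b0" is passed in rather than generated.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, object]


def expand_nary(subject: str, predicate: str,
                properties: Dict[str, object],
                bnode: str = "_:b0") -> List[Triple]:
    """Emit the blank-node work-around: one linking triple, then one
    triple per property of the n-ary relation."""
    triples: List[Triple] = [(subject, predicate, bnode)]
    triples.extend((bnode, key, value) for key, value in properties.items())
    return triples


# Two properties yield three triples, as in example 5a.
for triple in expand_nary(":christine", ":diagnosis",
                          {":disease": ":breastCancer", ":probability": 0.8}):
    print(triple)
```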
>>
>> However, inspired by example 4 above, perhaps a similar syntax could 
>> be used to write an n-ary relation that would treat Christine's 
>> suspected disease and probability as a single entity participating in 
>> the :diagnosis relation:
>>
>>   # Example 6
>>   :christine :diagnosis @[
>>     :disease :breastCancer ;
>>     :probability 0.8
>>   ] .
>>
>> This differs from example 5 because example 6 expresses a *single* 
>> triple that connects :christine with a diagnosis object -- not 3 
>> triples.  The order in which the diagnosis properties are listed has 
>> no effect -- they are a set:
>>
>>   # Example 7a: property order does not matter
>>   @[ :probability 0.8 ; :disease :breastCancer ]
>>      owl:sameAs  @[ :disease :breastCancer ; :probability 0.8 ] .
>>
>> and adding or removing a property makes it different:
>>
>>   # Example 7b
>>   @[ :probability 0.8 ; :disease :breastCancer ]
>>      :NOT_sameAs  @[ :disease :breastCancer ; :probability 0.8 ; :year 2022 ] .
>>
>> Trying to specify the same property twice should be a syntax error:
>>
>>   # Example 7c -- INVALID -- SYNTAX ERROR!
>>   :christine :diagnosis @[
>>     :disease :breastCancer ;
>>     :disease :colonCancer ;
>>     :probability 0.8
>>   ] .
>>
>> But the following would not be a syntax error, even if it may be 
>> semantically wrong:
>>
>>   # Example 7d
>>   :malady owl:sameAs :disease .
>>   :christine :diagnosis @[
>>     :disease :breastCancer ;
>>     :malady :colonCancer ;
>>     :probability 0.8
>>   ] .
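The behaviour described in examples 7a through 7d can be approximated in Python by modelling an "@[ ... ]" value as a frozenset of (property, value) pairs. This is a sketch under that assumption only; the nary function is hypothetical and does no owl:sameAs reasoning, so it treats :malady and :disease as different properties, just as the syntax check in example 7d would.

```python
from typing import FrozenSet, Iterable, Tuple

Pair = Tuple[str, object]


def nary(pairs: Iterable[Pair]) -> FrozenSet[Pair]:
    """Build an order-independent n-ary value; reject a repeated property."""
    pairs = list(pairs)
    keys = [key for key, _ in pairs]
    duplicates = {key for key in keys if keys.count(key) > 1}
    if duplicates:
        # Mirrors example 7c: the same property may not appear twice.
        raise ValueError(f"property specified more than once: {sorted(duplicates)}")
    return frozenset(pairs)


a = nary([(":probability", 0.8), (":disease", ":breastCancer")])
b = nary([(":disease", ":breastCancer"), (":probability", 0.8)])
c = nary(b | {(":year", 2022)})

assert a == b   # 7a: property order does not matter
assert a != c   # 7b: an extra property makes a different value

# 7d: accepted, because the check compares property names only.
d = nary([(":disease", ":breastCancer"), (":malady", ":colonCancer"),
          (":probability", 0.8)])
```

Using a frozenset also makes the value hashable, so it could itself appear as the object of a single triple, or be nested inside another such value.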
>>
>> And of course, these constructs could be nested as desired.
>>
>> I think something like this could meet the need for n-ary relations in 
>> some future RDF syntax.  And based on previous comments by Pat and 
>> Anthony, it sounds like the semantics would not be a problem.
>>
>> Thanks very much to Thomas, Pat, Anthony and others for a very helpful 
>> discussion!
>>
>> David Booth
>>
Received on Monday, 19 September 2022 22:17:21 UTC