Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists] from Anthony Moretti on 2022-09-24 (semantic-web@w3.org from September 2022)

From: Anthony Moretti <anthony.moretti@gmail.com>
Date: Sat, 24 Sep 2022 14:23:00 +0700
To: Dan Brickley <danbri@danbri.org>
Cc: Hugh Glaser <hugh@glasers.org>, Dan Brickley <danbri@google.com>, David Booth <david@dbooth.org>, Pierre-Antoine Champin <pierre-antoine@w3.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <CACusdfQoc_UKpTaU5bD5Oqen1cptV8em-V37ZJ-UjitD+KUsng@mail.gmail.com>
Thanks for the link, Dan. It's so hard to read XML!

It looks like their aboutEach is the same as object lists in Turtle and
arrays in JSON-LD, so useful as a shorthand for multiple triples with
repeated properties.

Section 3.5 of that spec, *"Containers versus repeated properties"*, goes
on to give this description:

  :Sue :publication :AnthologyOfTime
  :Sue :publication :ZoologicalReasoning
  :Sue :publication :GravitationalReflections

And, if I convert the last paragraph, also this description:

  :Sue :publications :Sue'sWorks
  :Sue'sWorks rdf:type rdf:Bag
  :Sue'sWorks rdf:_1 :AnthologyOfTime
  :Sue'sWorks rdf:_2 :ZoologicalReasoning
  :Sue'sWorks rdf:_3 :GravitationalReflections

But if you had a triple like the following:

  :publications :pluralOf :publication

You could link the two examples because the second example would entail the
first, and there wouldn't be any conflict if the graphs were merged.
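
A minimal sketch of that entailment in Python, under stated assumptions:
:pluralOf is a hypothetical property (not part of any standard vocabulary),
and the rule applied here is only one possible reading of it, namely that
every member of an rdf:Bag reached through the plural property is also a
value of the singular property. Triples are modelled as plain tuples of
strings and the function name entail_singular is made up for illustration.

```python
RDF_TYPE = "rdf:type"
RDF_BAG = "rdf:Bag"


def entail_singular(triples):
    """Return the triples implied by the hypothetical :pluralOf rule."""
    triples = set(triples)
    # Map each plural property to its singular counterpart.
    plural_of = {s: o for s, p, o in triples if p == ":pluralOf"}
    # Collect every resource explicitly typed as an rdf:Bag.
    bags = {s for s, p, o in triples if p == RDF_TYPE and o == RDF_BAG}

    inferred = set()
    for s, p, o in triples:
        if p in plural_of and o in bags:
            # Distribute the singular property over the bag's members,
            # i.e. the objects of rdf:_1, rdf:_2, ... on the bag.
            for bag, member_prop, member in triples:
                is_membership = (
                    member_prop.startswith("rdf:_")
                    and member_prop[5:].isdigit()
                )
                if bag == o and is_membership:
                    inferred.add((s, plural_of[p], member))
    return inferred


graph = {
    (":publications", ":pluralOf", ":publication"),
    (":Sue", ":publications", ":Sue'sWorks"),
    (":Sue'sWorks", RDF_TYPE, RDF_BAG),
    (":Sue'sWorks", "rdf:_1", ":AnthologyOfTime"),
    (":Sue'sWorks", "rdf:_2", ":ZoologicalReasoning"),
    (":Sue'sWorks", "rdf:_3", ":GravitationalReflections"),
}

for triple in sorted(entail_singular(graph)):
    print(*triple)
```

The inferred triples are exactly the three repeated-property statements from
the first description, so merging the two graphs adds nothing contradictory.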

Anthony

On Sat, Sep 24, 2022 at 12:00 PM Dan Brickley <danbri@danbri.org> wrote:

>
>
> On Fri, 23 Sep 2022 at 07:22, Hugh Glaser <hugh@glasers.org> wrote:
>
>> Hi.
>> I’ve been following much of this, although not all, and offer a few
>> comments and maybe misgivings about the discussion.
>> Sorry if I have just restarted things, at the wrong moment, and also to
>> be visiting old ground.
>>
>> It is not good, I think, to discuss RDF features of how to represent
>> knowledge without significant examples of how it would then be used in a
>> practical application, for real-looking examples.
>> I have seen almost nothing of this.
>>
>> I don’t think talking about “lists” at all is a good term.
>> Lists were invented and named in a world where modern, sophisticated,
>> data structures were not available.
>> Once data structures became available, the use of lists per se became
>> much rarer.
>> This is because it is much better to use proper identifiers for the
>> relations, carrying meaning in the name, than generics such as CAR, CDR,
>> “next”, CDDDR etc.
>> Although RDF is of course not a data structure, I would suggest that the
>> same observation applies.
>> (That means that if we really want a “list” syntactic sugar, then the
>> relation corresponding to “next” should be specified as part of it.)
>> In fact, almost every time I have thought that a list is what I wanted,
>> it has turned out to be a bad fit, and in the end a natural knowledge
>> structure with good, application-specific names for all the relations was
>> much better.
>>
>> If you start to talk about indexing the construct, then it isn’t really a
>> list that you wanted any more, it is more like an array, possibly easily
>> mutable.
>>
>> My view of what is needed:
>> We want ordering of the sort provided simply by source text ordering in
>> things like XML & JSON.
>> I think the time that I feel the need for this is exemplified by the use
>> of things like dct:creator for publications.
>> Pierre-Antoine reports Dan getting it right below (of course!).
>> >   # Example 10 expanded
>> >   <#paper1> schema:creator
>> >       <#alice> {| ex:order 1 |},
>> >       <#bob> {| ex:order 2 |},
>> >       <#charlie> {| ex:order 3, ex:last |}.
>>
>> But this is very much not what anyone would call a list.
>> As he points out, the crucial thing is that the schema:creator relation
>> is asserted for all the authors.
>
>
> I don’t entirely remember that discussion:) but yes it seems common to
> want the ordered and unordered versions of author list descriptions to
> present some kind of shared common face to the world. If you know I am the
> 3rd named author of a paper, you ought not be surprised to hear that I am an
> author of that paper!
>
> Btw the original (quarter century ago!) rdf specs attempted this via
> aboutEach but only in the xml syntax, so the expansion happens when parsing:
>
> https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#containers
>
> Ora at least will remember!
>
> Excerpting 3.3:
>
> “””3.3 Distributive Referents: Statements about Members of a Container
>
> Container structures give rise to an issue about statements: when a
> statement is made referring to a collection, what "thing" is the statement
> describing? Or in other words, to what object is the statement is
> referring? Is the statement describing the container itself or is the
> statement describing the members of the container? The object being
> described (in the XML syntax indicated by the about attribute) is in RDF
> called the *referent*.
>
> The following example:
>
> <rdf:Bag ID="pages">
>   <rdf:li resource="http://foo.org/foo.html" />
>   <rdf:li resource="http://bar.org/bar.html" />
> </rdf:Bag>
>
> <rdf:Description about="#pages">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
>
> expresses that "Ora Lassila" is the creator of the Bag "pages". It does
> not, however, say anything about the individual pages, the members of the
> Bag. The referent of the Description is the container (the Bag), not its
> members. One would sometimes like to write a statement about each of the
> contained objects individually, instead of the container itself. In order
> to express that "Ora Lassila" is the creator of each of the pages, a
> different kind of referent is called for, one that *distributes* over the
> members of the container. This referent in RDF is expressed using the
> aboutEach attribute:
>
>   [3a] idAboutAttr    ::= idAttr | aboutAttr | aboutEachAttr
>   [26] aboutEachAttr  ::= 'aboutEach="' URI-reference '"'
>
> As an example, if we wrote
>
> <rdf:Description aboutEach="#pages">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
>
> we would get the desired meaning. We will call the new referent type a *distributive
> referent*. Distributive referents allow us to "share structure" in an RDF
> Description. For example, when writing several Descriptions that all have
> a number of common statement parts (predicates and objects), the common
> parts can be shared among all the Descriptions, possibly resulting in
> space savings and more maintainable metadata. The value of an aboutEach attribute
> must be a container. Using a distributive referent on a container is the
> same as making all the statements about each of the members separately.
>
> No explicit graph representation of distributive referents is defined.
> Instead, in terms of the statements made, distributive referents are
> expanded into the individual statements about the individual container
> members (internally, implementations are free to retain information about
> the distributive referents - in order to save space, for example - as long
> as any querying functions work as if all of the statements were made
> individually). Thus, with respect to the resources "foo" and "bar", the
> above example is equivalent to
>
> <rdf:Description about="http://foo.org/foo.html">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
>
> <rdf:Description about="http://bar.org/bar.html">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>“””
>
>
> Cheers,
>
> Dan
>
>
>
>
>> And *also* there is somehow an ordering relationship among the objects of
>> those triples.
>> But this is actually not a “next” ordering.
>> The knowledge I am representing is rarely so that when I query it, if I
>> know the third author I would like to know the fourth.
>> It is much more that I would like to know the third author (if there is
>> one), or what ordinal in the author list a particular author is: Bob is the
>> second author.
>> “Which papers with Alice as first author have Charlie as an author?” Or
>> Charlie as third.
>> So that aspect of it looks much more like an array to me.
>> This is true of dog shows as well, I think:- “Who came 6th?”, rather than
>> “Who was next after 5th?"
>> I don’t know what to call this slightly complex structure, if it was a
>> data structure in a programming language, but since I think that the
>> schema:creator relation is often the primary knowledge, it is certainly
>> misleading to call it a list.
>> The example 10 tries to capture this, I think.
>>
>> I also don’t have any new suggestions for how this might be represented
>> in RDF - sorry.
>>
>> Best
>> Hugh
>> > On 23 Sep 2022, at 08:46, Pierre-Antoine Champin <pierre-antoine@w3.org>
>> wrote:
>> >
>> > Hi David,
>> >
>> > On 22/09/2022 23:38, David Booth wrote:
>> >> On 9/22/22 16:34, Pierre-Antoine Champin wrote:
>> >>> I think it is useful to consider every proposed extension, and
>> carefully consider whether it really requires an extension of the
>> underlying data model, or whether it can be managed purely as syntactic
>> sugar.
>> >>
>> >> Agreed.  That would be best for backward compatibility.  And it occurs
>> to me that some of these ideas for new "built-in" object types, such as
>> arrays and composite object, could actually be implemented as syntactic
>> sugar for named graphs.  For example, this array of dog show winners:
>> >>
>> >>   # Example 1
>> >>   :dogShow :winners ( :ginger :bailey ) .
>> >>
>> >> might be treated as syntactic sugar for this TriG:
>> >>
>> >>   # Example 1-expanded
>> >>   :dogShow :winners N2 .
>> >>   N2 { :dogShow :winners
>> >>          [
>> >>            0 :ginger ;
>> >>            1 :bailey
>> >>          ] .
>> >>      }
>> >
>> > That's intriguing :)
>> >
>> > But I'm not sure exactly what you gain here... Querying lists in SPARQL
>> would hardly be easier than it is today...
>> >
>> >>
>> >> where N2 is an auto-generated named graph name of some kind (TBD) --
>> perhaps a blank node, a relative URI, or a Skolem URI.  By "unblessing" N2,
>> you get to "see" the triples that implement that list object.
>> >>
>> >> And this composite diagnosis object, used for an n-ary relation:
>> >>
>> >>   # Example 6
>> >>   :christine :diagnosis @[
>> >>     :disease :breastCancer ;
>> >>     :probability 0.8
>> >>   ] .
>> >>
>> >> might be treated as syntactic sugar for this TriG:
>> >>
>> >>   # Example 6-expanded
>> >>   :christine :diagnosis N6 .
>> >>
>> >>   N6 { :christine :diagnosis
>> >>          [
>> >>            :disease :breastCancer ;
>> >>            :probability 0.8
>> >>          ] .
>> >>      }
>> >>
>> >> And this RDF-star syntax:
>> >>
>> >>   # Example 9
>> >>   :a :name "Alice" {|
>> >>       :statedBy :bob ;
>> >>       :recorded "2021-07-07"^^xsd:date
>> >>       |} .
>> >>
>> >> could be syntactic sugar for this TriG:
>> >>
>> >>   # Example 9-expanded
>> >>   :a :name "Alice" .
>> >>   N9 { :a :name "Alice" . }
>> >>   N9 :statedBy :bob ;
>> >>       :recorded "2021-07-07"^^xsd:date .
>> >
>> > This pattern for expressing triples about triples has been largely
>> discussed as an alternative to RDF-star (or, as you propose here, as the
>> plain RDF interpretation of Turtle-star / SPARQL-star).
>> >
>> > The outcome of the discussion was, IIRC, that this was not ideal,
>> because it overloads named graphs with new uses. Since a dataset is a flat
>> collection of named graphs, there is no easy way to distinguish
>> "quoted-triples-named graphs" from "plain-old-named-graphs"...
>> >
>> > Also, I think I read that some implementations do not behave well with
>> too many named graphs.
>> >
>> > Now, about the elephant in the room: it may seem strange that I am
>> strongly advocating against extending the core model of RDF, while being a
>> co-editor of RDF-star [1], which does exactly that. For the record, I first
>> leaned towards making RDF-star only syntactic sugar, but was eventually
>> convinced that it deserved to become an integral part of the core model. I
>> still believe, however, that such extensions should be the exception.
>> >
>> >>
>> >> This would have the benefit of supporting labeled property graphs,
>> n-ary relations and arrays all under the same mechanism, without adding
>> anything to the RDF core.
>> >>
>> >> Thoughts?
>> >
>> > For ordered lists, Dan Brickley made a suggestion some time ago (on a
>> github issue that I can't find right now, unfortunately): they could be
>> encoded using RDF-star, like that:
>> >
>> >   # Example 10 expanded
>> >   <#paper1> schema:creator
>> >       <#alice> {| ex:order 1 |},
>> >       <#bob> {| ex:order 2 |},
>> >       <#charlie> {| ex:order 3, ex:last |}.
>> >
>> > It has the advantage of keeping the "simple" triple for each creator,
>> and is quite easy to query in SPARQL. Of course, some syntactic sugar could
>> be created to make this easier to write/read, e.g.:
>> >
>> >   # Example 10 syntactic sugar
>> >   <#paper1> schema:creator (| <#alice> <#bob> <#charlie> |).
>> >
>> > It occurs to me that RDF-star could be leveraged in a similar way with
>> your Example 6:
>> >
>> >   # Example 6 expanded with RDF-star
>> >   :christine :diagnosis _:d.
>> >   _:d
>> >     :disease :breastCancer {| ex:propertyOf _:d |};
>> >     :probability 0.8 {| ex:propertyOf _:d |}.
>> >
>> > (although this does not quite capture the "closed-ness" of
>> properties... work in progress)
>> >
>> >   pa
>> >
>> >> David Booth
>> >>
>>
>>
>>
Received on Saturday, 24 September 2022 07:23:26 UTC