Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists]

On Fri, 23 Sep 2022 at 07:22, Hugh Glaser <hugh@glasers.org> wrote:

> Hi.
> I’ve been following much of this, although not all, and offer a few
> comments and maybe misgivings about the discussion.
> Sorry if I have just restarted things, at the wrong moment, and also to be
> visiting old ground.
>
> It is not good, I think, to discuss RDF features of how to represent
> knowledge without significant examples of how it would then be used in a
> practical application, for real-looking examples.
> I have seen almost nothing of this.
>
> I don’t think “list” is a good term to be using at all.
> Lists were invented and named in a world where modern, sophisticated, data
> structures were not available.
> Once data structures became available, the use of lists per se became much
> rarer.
> This is because it is much better to use proper identifiers for the
> relations, carrying meaning in the name, than generics such as CAR, CDR,
> “next”, CDDDR etc.
> Although RDF is of course not a data structure, I would suggest that the
> same observation applies.
> (That means that if we really want a “list” syntactic sugar, then the
> relation corresponding to “next” should be specified as part of it.)
> In fact, almost every time I have thought that a list is what I wanted, it
> has turned out to be a bad fit, and in the end a natural knowledge
> structure with good, application-specific names for all the relations was
> much better.
>
> If you start to talk about indexing the construct, then it isn’t really a
> list that you wanted any more, it is more like an array, possibly easily
> mutable.
>
> My view of what is needed:
> We want ordering of the sort provided by simple source-text ordering in
> things like XML & JSON.
> I think the time that I feel the need for this is exemplified by the use
> of things like dct:creator for publications.
> Pierre-Antoine reports Dan getting it right below (of course!).
> >   # Example 10 expanded
> >   <#paper1> schema:creator
> >       <#alice> {| ex:order 1 |},
> >       <#bob> {| ex:order 2 |},
> >       <#charlie> {| ex:order 3, ex:last |}.
>
> But this is very much not what anyone would call a list.
> As he points out, the crucial thing is that the schema:creator relation is
> asserted for all the authors.


I don’t entirely remember that discussion:) but yes it seems common to want
the ordered and unordered versions of author list descriptions to present
some kind of shared common face to the world. If you know I am the 3rd
named author of a paper, you ought not be surprised to hear that I am an
author of that paper!
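
To make that "shared common face" concrete, here is a rough sketch in Python, not a proposal. It assumes a toy in-memory model of the Example 10 data quoted above: the plain schema:creator triples are kept as ordinary triples, and the ex:order annotations sit beside them, keyed by the triple they describe. The helper names (creators, creator_at, position_of) are made up for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Plain triples: every author is asserted as a schema:creator of the paper.
triples: Set[Triple] = {
    ("#paper1", "schema:creator", "#alice"),
    ("#paper1", "schema:creator", "#bob"),
    ("#paper1", "schema:creator", "#charlie"),
}

# Annotations on those triples, standing in for {| ex:order n |}.
# This dict is an assumption of the sketch, not an RDF-star API.
order: Dict[Triple, int] = {
    ("#paper1", "schema:creator", "#alice"): 1,
    ("#paper1", "schema:creator", "#bob"): 2,
    ("#paper1", "schema:creator", "#charlie"): 3,
}


def creators(paper: str) -> Set[str]:
    """Unordered view: uses only the plain triples, ignoring annotations."""
    return {o for (s, p, o) in triples if s == paper and p == "schema:creator"}


def creator_at(paper: str, position: int) -> Optional[str]:
    """Ordered view: the creator annotated with the given position, if any."""
    for (s, p, o), n in order.items():
        if s == paper and p == "schema:creator" and n == position:
            return o
    return None


def position_of(paper: str, person: str) -> Optional[int]:
    """Ordinal of a given creator, or None if no order was recorded."""
    return order.get((paper, "schema:creator", person))
```

A consumer that knows nothing about ordering can still call creators("#paper1") and see all three authors, while one that cares can ask creator_at("#paper1", 3) or position_of("#paper1", "#bob").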

Btw the original (quarter century ago!) rdf specs attempted this via
aboutEach but only in the xml syntax, so the expansion happens when parsing:

https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#containers

Ora at least will remember!

Excerpting 3.3:

“””3.3 Distributive Referents: Statements about Members of a Container

Container structures give rise to an issue about statements: when a
statement is made referring to a collection, what "thing" is the statement
describing? Or in other words, to what object is the statement is
referring? Is the statement describing the container itself or is the
statement describing the members of the container? The object being
described (in the XML syntax indicated by the about attribute) is in RDF
called the *referent*.

The following example:

<rdf:Bag ID="pages">
  <rdf:li resource="http://foo.org/foo.html" />
  <rdf:li resource="http://bar.org/bar.html" />
</rdf:Bag>

<rdf:Description about="#pages">
  <s:Creator>Ora Lassila</s:Creator>
</rdf:Description>

expresses that "Ora Lassila" is the creator of the Bag "pages". It does
not, however, say anything about the individual pages, the members of the
Bag. The referent of the Description is the container (the Bag), not its
members. One would sometimes like to write a statement about each of the
contained objects individually, instead of the container itself. In order
to express that "Ora Lassila" is the creator of each of the pages, a
different kind of referent is called for, one that *distributes* over the
members of the container. This referent in RDF is expressed using the
aboutEach attribute:

  [3a] idAboutAttr    ::= idAttr | aboutAttr | aboutEachAttr
  [26] aboutEachAttr  ::= 'aboutEach="' URI-reference '"'

As an example, if we wrote

<rdf:Description aboutEach="#pages">
  <s:Creator>Ora Lassila</s:Creator>
</rdf:Description>

we would get the desired meaning. We will call the new referent type a
*distributive referent*. Distributive referents allow us to "share
structure" in an RDF
Description. For example, when writing several Descriptions that all have a
number of common statement parts (predicates and objects), the common parts
can be shared among all the Descriptions, possibly resulting in space
savings and more maintainable metadata. The value of an aboutEach attribute
must be a container. Using a distributive referent on a container is the
same as making all the statements about each of the members separately.

No explicit graph representation of distributive referents is defined.
Instead, in terms of the statements made, distributive referents are
expanded into the individual statements about the individual container
members (internally, implementations are free to retain information about
the distributive referents - in order to save space, for example - as long
as any querying functions work as if all of the statements were made
individually). Thus, with respect to the resources "foo" and "bar", the
above example is equivalent to

<rdf:Description about="http://foo.org/foo.html">
  <s:Creator>Ora Lassila</s:Creator>
</rdf:Description>

<rdf:Description about="http://bar.org/bar.html">
  <s:Creator>Ora Lassila</s:Creator>
</rdf:Description>“””
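
The expansion described in that excerpt can be sketched roughly as below. This is only an illustration in Python, assuming containers are held as a simple mapping from container ID to member URIs; the function name expand_about_each and the tuple representation of statements are inventions of this sketch, not anything from the 1999 spec or from a real parser.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str, str]


def expand_about_each(
    containers: Dict[str, List[str]],
    container_id: str,
    predicate: str,
    value: str,
) -> List[Statement]:
    """Expand one aboutEach description into one statement per member.

    Nothing is said about the container itself: only its members
    become subjects, which is the 'distributive' reading.
    """
    return [(member, predicate, value) for member in containers[container_id]]


# The Bag from the excerpt, as a hypothetical in-memory structure.
containers = {
    "#pages": ["http://foo.org/foo.html", "http://bar.org/bar.html"],
}

statements = expand_about_each(containers, "#pages", "s:Creator", "Ora Lassila")
```

Because the expansion happens at parse time, the resulting graph holds only the two per-page statements and no trace of the distributive referent.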


Cheers,

Dan




> And *also* there is somehow an ordering relationship among the objects of
> those triples.
> But this is actually not a “next” ordering.
> The knowledge I am representing is rarely such that when I query it, if I
> know the third author I would like to know the fourth.
> It is much more that I would like to know the third author (if there is
> one), or what ordinal in the author list a particular author is: Bob is the
> second author.
> “Which papers with Alice as first author have Charlie as an author?” Or
> Charlie as third.
> So that aspect of it looks much more like an array to me.
> This is true of dog shows as well, I think:- “Who came 6th?”, rather than
> “Who was next after 5th?"
> I don’t know what to call this slightly complex structure, if it was a
> data structure in a programming language, but since I think that the
> schema:creator relation is often the primary knowledge, it is certainly
> misleading to call it a list.
> The example 10 tries to capture this, I think.
>
> I also don’t have any new suggestions for how this might be represented in
> RDF - sorry.
>
> Best
> Hugh
> > On 23 Sep 2022, at 08:46, Pierre-Antoine Champin <pierre-antoine@w3.org>
> wrote:
> >
> > Hi David,
> >
> > On 22/09/2022 23:38, David Booth wrote:
> >> On 9/22/22 16:34, Pierre-Antoine Champin wrote:
> >>> I think it is useful to consider every proposed extension, and
> carefully consider whether it really requires an extension of the
> underlying data model, or whether it can be managed purely as syntactic
> sugar.
> >>
> >> Agreed.  That would be best for backward compatibility.  And it occurs
> to me that some of these ideas for new "built-in" object types, such as
> arrays and composite objects, could actually be implemented as syntactic
> sugar for named graphs.  For example, this array of dog show winners:
> >>
> >>   # Example 1
> >>   :dogShow :winners ( :ginger :bailey ) .
> >>
> >> might be treated as syntactic sugar for this TriG:
> >>
> >>   # Example 1-expanded
> >>   :dogShow :winners N2 .
> >>   N2 { :dogShow :winners
> >>          [
> >>            0 :ginger ;
> >>            1 :bailey
> >>          ] .
> >>      }
> >
> > That's intriguing :)
> >
> > But I'm not sure exactly what you gain here... Querying lists in SPARQL
> would be hardly easier than it is today...
> >
> >>
> >> where N2 is an auto-generated named graph name of some kind (TBD) --
> perhaps a blank node, a relative URI, or a Skolem URI.  By "unblessing" N2,
> you get to "see" the triples that implement that list object.
> >>
> >> And this composite diagnosis object, used for an n-ary relation:
> >>
> >>   # Example 6
> >>   :christine :diagnosis @[
> >>     :disease :breastCancer ;
> >>     :probability 0.8
> >>   ] .
> >>
> >> might be treated as syntactic sugar for this TriG:
> >>
> >>   # Example 6-expanded
> >>   :christine :diagnosis N6 .
> >>
> >>   N6 { :christine :diagnosis
> >>          [
> >>            :disease :breastCancer ;
> >>            :probability 0.8
> >>          ] .
> >>      }
> >>
> >> And this RDF-star syntax:
> >>
> >>   # Example 9
> >>   :a :name "Alice" {|
> >>       :statedBy :bob ;
> >>       :recorded "2021-07-07"^^xsd:date
> >>       |} .
> >>
> >> could be syntactic sugar for this TriG:
> >>
> >>   # Example 9-expanded
> >>   :a :name "Alice" .
> >>   N9 { :a :name "Alice" . }
> >>   N9 :statedBy :bob ;
> >>       :recorded "2021-07-07"^^xsd:date .
> >
> > This pattern for expressing triples about triples has been largely
> discussed as an alternative to RDF-star (or, as you propose here, as the
> plain RDF interpretation of Turtle-star / Sparql-star).
> >
> > The outcome of the discussion was, IIRC, that this was not ideal,
> because it overloads named graphs with new uses. Since a dataset is a flat
> collection of named graphs, there is no easy way to distinguish
> "quoted-triples-named graphs" from "plain-old-named-graphs"...
> >
> > Also, I think I read that some implementations do not behave well with
> too many named graphs.
> >
> > Now, about the elephant in the room: it may seem strange that I am
> strongly advocating against extending the core model of RDF, while being a
> co-editor of RDF-star [1], which does exactly that. For the record, I first
> leaned towards making RDF-star only syntactic sugar, but was eventually
> convinced that it deserved to become an integral part of the core model. I
> still believe, however, that such extensions should be the exception.
> >
> >>
> >> This would have the benefit of supporting labeled property graphs,
> n-ary relations and arrays all under the same mechanism, without adding
> anything to the RDF core.
> >>
> >> Thoughts?
> >
> > For ordered lists, Dan Brickley made a suggestion some time ago (on a
> github issue that I can't find right now, unfortunately): they could be
> encoded using RDF-star, like that:
> >
> >   # Example 10 expanded
> >   <#paper1> schema:creator
> >       <#alice> {| ex:order 1 |},
> >       <#bob> {| ex:order 2 |},
> >       <#charlie> {| ex:order 3, ex:last |}.
> >
> > It has the advantage of keeping the "simple" triple for each creator,
> and is quite easy to query in SPARQL. Of course, some syntactic sugar could
> be created to make this easier to write/read, e.g.:
> >
> >   # Example 10 syntactic sugar
> >   <#paper1> schema:creator (| <#alice> <#bob> <#charlie> |).
> >
> > It occurs to me that RDF-star could be leveraged in a similar way with
> your Example 6:
> >
> >   # Example 6 expanded with RDF-star
> >   :christine :diagnosis _:d.
> >   _:d
> >     :disease :breastCancer {| ex:propertyOf _:d |};
> >     :probability 0.8 {| ex:propertyOf _:d |}.
> >
> > (although this does not quite capture the "closed-ness" of properties...
> work in progress)
> >
> >   pa
> >
> >> David Booth
> >>
>
>
>

Received on Saturday, 24 September 2022 04:53:51 UTC