Re: RDF lists/arrays and n-ary relations [was Re: OWL and RDF lists]

Hi Dan,

Yes, aboutEach seems to capture exactly what I want, and I really like it.
But it may well be a good thing that it didn’t make the cut.

Since it is essentially a macro, if there is no support in the underlying store, problems of consistency will occur.
Asserting a new triple at the end of the ordered representation, for example (or deleting things), without adding (or deleting) the extra stuff will cause a mess.
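
To make that worry concrete, here is a minimal sketch in Python, assuming a toy in-memory set of triples rather than any real store. The names `ex:member`, `container_members`, `expand_about_each` and `missing_expansions`, and the extra baz.org page, are hypothetical and only for illustration; the foo/bar pages are the ones from the spec excerpt quoted below.

```python
# Toy illustration only: a "store" is just a Python set of (s, p, o) tuples.
# ex:member is a hypothetical membership predicate, not taken from any spec.

def container_members(store, container):
    """Return the members currently asserted for `container`."""
    return {o for (s, p, o) in store if s == container and p == "ex:member"}

def expand_about_each(store, container, predicate, obj):
    """Macro-style expansion: one triple per current member of the container."""
    return {(m, predicate, obj) for m in container_members(store, container)}

def missing_expansions(store, container, predicate, obj):
    """Members of `container` that lack the triple the expansion should have made."""
    return {m for m in container_members(store, container)
            if (m, predicate, obj) not in store}

store = {
    ("#pages", "ex:member", "http://foo.org/foo.html"),
    ("#pages", "ex:member", "http://bar.org/bar.html"),
}

# Expansion happens once, at load time, as a macro would.
store |= expand_about_each(store, "#pages", "s:Creator", "Ora Lassila")

# A later edit adds a member directly; nothing re-runs the expansion.
store.add(("#pages", "ex:member", "http://baz.org/baz.html"))

print(missing_expansions(store, "#pages", "s:Creator", "Ora Lassila"))
# prints {'http://baz.org/baz.html'}
```

The store has no record that the creator triples came from an expansion, so the new member silently ends up without one unless the application remembers to add it.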

With a lot of this stuff, proposals seem to be fine when there is a single assertion of the relevant data.
But when changes are made, it gets complicated, doesn’t it?

I’ll leave Anthony Moretti’s suggestion to cleverer people than me for any comment.

Best
Hugh

> On 24 Sep 2022, at 05:53, Dan Brickley <danbri@danbri.org> wrote:
> 
> 
> 
> On Fri, 23 Sep 2022 at 07:22, Hugh Glaser <hugh@glasers.org> wrote:
> Hi.
> I’ve been following much of this, although not all, and offer a few comments and maybe misgivings about the discussion.
> Sorry if I have restarted things at the wrong moment, and for revisiting old ground.
> 
> It is not good, I think, to discuss RDF features for representing knowledge without significant, realistic examples of how they would then be used in a practical application.
> I have seen almost nothing of this.
> 
> I don’t think “lists” is a good term at all.
> Lists were invented and named in a world where modern, sophisticated, data structures were not available.
> Once data structures became available, the use of lists per se became much rarer.
> This is because it is much better to use proper identifiers for the relations, carrying meaning in the name, than generics such as CAR, CDR, “next”, CDDDR etc.
> Although RDF is of course not a data structure, I would suggest that the same observation applies.
> (That means that if we really want a “list” syntactic sugar, then the relation corresponding to “next” should be specified as part of it.)
> In fact, almost every time I have thought that a list is what I wanted, it has turned out to be a bad fit, and in the end a natural knowledge structure with good, application-specific names for all the relations was much better.
> 
> If you start to talk about indexing the construct, then it isn’t really a list that you wanted any more, it is more like an array, possibly easily mutable.
> 
> My view of what is needed:
> We want ordering of the sort provided simply by source text ordering in things like XML & JSON.
> I think the time that I feel the need for this is exemplified by the use of things like dct:creator for publications.
> Pierre-Antoine reports Dan getting it right below (of course!).
> >   # Example 10 expanded
> >   <#paper1> schema:creator
> >       <#alice> {| ex:order 1 |},
> >       <#bob> {| ex:order 2 |},
> >       <#charlie> {| ex:order 3, ex:last |}.
> 
> But this is very much not what anyone would call a list.
> As he points out, the crucial thing is that the schema:creator relation is asserted for all the authors.
> 
> I don’t entirely remember that discussion:) but yes it seems common to want the ordered and unordered versions of author list descriptions to present some kind of shared common face to the world. If you know I am the 3rd named author of a paper, you ought not be surprised to hear that I am an author of that paper!
> 
> Btw the original (quarter century ago!) rdf specs attempted this via aboutEach but only in the xml syntax, so the expansion happens when parsing:
> 
> https://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#containers
> 
> Ora at least will remember!
> 
> Excerpting 3.3: 
> 
> “””3.3 Distributive Referents: Statements about Members of a Container
> Container structures give rise to an issue about statements: when a statement is made referring to a collection, what "thing" is the statement describing? Or in other words, to what object is the statement is referring? Is the statement describing the container itself or is the statement describing the members of the container? The object being described (in the XML syntax indicated by the about attribute) is in RDF called the referent. 
> 
> The following example: 
> 
> <rdf:Bag ID="pages">
>   <rdf:li resource="http://foo.org/foo.html" />
>   <rdf:li resource="http://bar.org/bar.html" />
> </rdf:Bag>
> 
> <rdf:Description about="#pages">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
> 
> expresses that "Ora Lassila" is the creator of the Bag "pages". It does not, however, say anything about the individual pages, the members of the Bag. The referent of the Description is the container (the Bag), not its members. One would sometimes like to write a statement about each of the contained objects individually, instead of the container itself. In order to express that "Ora Lassila" is the creator of each of the pages, a different kind of referent is called for, one that distributes over the members of the container. This referent in RDF is expressed using the aboutEach attribute:
> 
>   [3a] idAboutAttr    ::= idAttr | aboutAttr | aboutEachAttr
>   [26] aboutEachAttr  ::= 'aboutEach="' URI-reference '"'
> 
> As an example, if we wrote 
> 
> <rdf:Description aboutEach="#pages">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
> 
> we would get the desired meaning. We will call the new referent type a distributive referent. Distributive referents allow us to "share structure" in an RDF Description. For example, when writing several Descriptions that all have a number of common statement parts (predicates and objects), the common parts can be shared among all the Descriptions, possibly resulting in space savings and more maintainable metadata. The value of an aboutEach attribute must be a container. Using a distributive referent on a container is the same as making all the statements about each of the members separately. 
> 
> No explicit graph representation of distributive referents is defined. Instead, in terms of the statements made, distributive referents are expanded into the individual statements about the individual container members (internally, implementations are free to retain information about the distributive referents - in order to save space, for example - as long as any querying functions work as if all of the statements were made individually). Thus, with respect to the resources "foo" and "bar", the above example is equivalent to 
> 
> <rdf:Description about="http://foo.org/foo.html">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>
> 
> <rdf:Description about="http://bar.org/bar.html">
>   <s:Creator>Ora Lassila</s:Creator>
> </rdf:Description>“””
> 
> Cheers,
> 
> Dan
> 
> 
> 
> 
> And *also* there is somehow an ordering relationship among the objects of those triples.
> But this is actually not a “next” ordering.
> The knowledge I am representing is rarely so that when I query it, if I know the third author I would like to know the fourth.
> It is much more that I would like to know the third author (if there is one), or what ordinal in the author list a particular author is: Bob is the second author.
> “Which papers with Alice as first author have Charlie as an author?” Or Charlie as third.
> So that aspect of it looks much more like an array to me.
> This is true of dog shows as well, I think:- “Who came 6th?”, rather than “Who was next after 5th?”
> I don’t know what to call this slightly complex structure, if it was a data structure in a programming language, but since I think that the schema:creator relation is often the primary knowledge, it is certainly misleading to call it a list.
> Example 10 tries to capture this, I think.
> 
> I also don’t have any new suggestions for how this might be represented in RDF - sorry.
> 
> Best
> Hugh
> > On 23 Sep 2022, at 08:46, Pierre-Antoine Champin <pierre-antoine@w3.org> wrote:
> > 
> > Hi David,
> > 
> > On 22/09/2022 23:38, David Booth wrote:
> >> On 9/22/22 16:34, Pierre-Antoine Champin wrote:
> >>> I think it is useful to consider every proposed extension, and carefully consider whether it really requires an extension of the underlying data model, or whether it can be managed purely as syntactic sugar.
> >> 
> >> Agreed.  That would be best for backward compatibility.  And it occurs to me that some of these ideas for new "built-in" object types, such as arrays and composite object, could actually be implemented as syntactic sugar for named graphs.  For example, this array of dog show winners:
> >> 
> >>   # Example 1
> >>   :dogShow :winners ( :ginger :bailey ) .
> >> 
> >> might be treated as syntactic sugar for this TriG:
> >> 
> >>   # Example 1-expanded
> >>   :dogShow :winners N2 .
> >>   N2 { :dogShow :winners
> >>          [
> >>            0 :ginger ;
> >>            1 :bailey
> >>          ] .
> >>      }
> > 
> > That's intriguing :)
> > 
> > But I'm not sure exactly what you gain here... Querying lists in SPARQL would hardly be easier than it is today...
> > 
> >> 
> >> where N2 is an auto-generated named graph name of some kind (TBD) -- perhaps a blank node, a relative URI, or a Skolem URI.  By "unblessing" N2, you get to "see" the triples that implement that list object.
> >> 
> >> And this composite diagnosis object, used for an n-ary relation:
> >> 
> >>   # Example 6
> >>   :christine :diagnosis @[
> >>     :disease :breastCancer ;
> >>     :probability 0.8
> >>   ] .
> >> 
> >> might be treated as syntactic sugar for this TriG:
> >> 
> >>   # Example 6-expanded
> >>   :christine :diagnosis N6 .
> >> 
> >>   N6 { :christine :diagnosis
> >>          [
> >>            :disease :breastCancer ;
> >>            :probability 0.8
> >>          ] .
> >>      }
> >> 
> >> And this RDF-star syntax:
> >> 
> >>   # Example 9
> >>   :a :name "Alice" {|
> >>       :statedBy :bob ;
> >>       :recorded "2021-07-07"^^xsd:date
> >>       |} .
> >> 
> >> could be syntactic sugar for this TriG:
> >> 
> >>   # Example 9-expanded
> >>   :a :name "Alice" .
> >>   N9 { :a :name "Alice" . }
> >>   N9 :statedBy :bob ;
> >>       :recorded "2021-07-07"^^xsd:date .
> > 
> > This pattern for expressing triples about triples has been extensively discussed as an alternative to RDF-star (or, as you propose here, as the plain RDF interpretation of Turtle-star / SPARQL-star).
> > 
> > The outcome of the discussion was, IIRC, that this was not ideal, because it overloads named graphs with new uses. Since a dataset is a flat collection of named graphs, there is no easy way to distinguish "quoted-triples-named graphs" from "plain-old-named-graphs"...
> > 
> > Also, I think I read that some implementations do not behave well with too many named graphs.
> > 
> > Now, about the elephant in the room: it may seem strange that I am strongly advocating against extending the core model of RDF, while being a co-editor of RDF-star [1], which does exactly that. For the record, I first leaned towards making RDF-star only syntactic sugar, but was eventually convinced that it deserved to become an integral part of the core model. I still believe, however, that such extensions should be the exception.
> > 
> >> 
> >> This would have the benefit of supporting labeled property graphs, n-ary relations and arrays all under the same mechanism, without adding anything to the RDF core.
> >> 
> >> Thoughts?
> > 
> > For ordered lists, Dan Brickley made a suggestion some time ago (on a GitHub issue that I can't find right now, unfortunately): they could be encoded using RDF-star, like this:
> > 
> >   # Example 10 expanded
> >   <#paper1> schema:creator
> >       <#alice> {| ex:order 1 |},
> >       <#bob> {| ex:order 2 |},
> >       <#charlie> {| ex:order 3, ex:last |}.
> > 
> > It has the advantage of keeping the "simple" triple for each creator, and is quite easy to query in SPARQL. Of course, some syntactic sugar could be created to make this easier to write/read, e.g.:
> > 
> >   # Example 10 syntactic sugar
> >   <#paper1> schema:creator (| <#alice> <#bob> <#charlie> |).
> > 
> > It occurs to me that RDF-star could be leveraged in a similar way with your Example 6:
> > 
> >   # Example 6 expanded with RDF-star
> >   :christine :diagnosis _:d.
> >   _:d
> >     :disease :breastCancer {| ex:propertyOf _:d |};
> >     :probability 0.8 {| ex:propertyOf _:d |}.
> > 
> > (although this does not quite capture the "closed-ness" of properties... work in progress)
> > 
> >   pa
> > 
> >> David Booth
> >> 
> 
> 

Received on Tuesday, 27 September 2022 10:56:31 UTC