Re: RDF-star vs Wikidata for modelling Richard Burton

On 06/12/2021 17:50, Dan Brickley wrote:

>
>
> On Sun, 5 Dec 2021, 21:48 Cox, Simon (L&W, Clayton), 
> <Simon.Cox@csiro.au> wrote:
>
>     Isn't this example just a modelling issue?
>
>     Multiple marriages between the same two people are nevertheless
>     still multiple marriages.
>     So if the history was described in a marriage-centric way rather
>     than a person-centric way, then everything would be 'easy'.
>
>
> So the lurking question is: when to give up on overloading simple 
> binary relations with extra qualifiers?
>
> Having a "dumb downable" path to a basic binary relationship is 
> appealing. But you could imagine some rules/ontology 
> approach to inferring ?X marriedToOnceOrCurrently ?Y etc from details 
> of several marriages.
>
> But the further down that road you go, the more the "simple graph 
> database" folks will be skeptical

In LPGs as well, there is a point where modelling marriages or pipes as 
edges will hit a wall. More specifically, if you want to relate them to 
other *nodes* of the graph (link a marriage to its location, link a pipe 
to its manufacturer), you also need to reify them as nodes.
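
To make this concrete, here is a minimal sketch in plain Python. It is not 
tied to any LPG product: the Node and Edge classes, the reify and dumb_down 
helpers, and the "partner" and "located_in" labels are all made up for 
illustration, and the place is a placeholder. It shows the two Burton/Taylor 
marriages first as edges carrying literal properties, then reified as nodes 
so that one of them can be linked to another node, with the plain binary 
relation still derivable afterwards.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compared by identity, like records in a graph store
class Node:
    name: str
    props: dict = field(default_factory=dict)


@dataclass(eq=False)
class Edge:
    source: Node
    label: str
    target: Node
    props: dict = field(default_factory=dict)  # literal values only, no links to nodes


def reify(edge: Edge, name: str) -> tuple[Node, list[Edge]]:
    """Replace a qualified edge by a node that carries its properties,
    plus one plain edge to each of the original endpoints."""
    node = Node(name, dict(edge.props))
    return node, [Edge(node, "partner", edge.source),
                  Edge(node, "partner", edge.target)]


def dumb_down(edges: list[Edge], label: str = "spouse") -> set[tuple[str, str, str]]:
    """Derive the simple binary relation back from the reified marriages."""
    partners: dict[Node, list[Node]] = {}
    for e in edges:
        if e.label == "partner":
            partners.setdefault(e.source, []).append(e.target)
    return {(a.name, label, b.name)
            for people in partners.values()
            for a in people for b in people if a is not b}


burton, taylor = Node("Richard Burton"), Node("Elizabeth Taylor")

# Marriage-as-edge: fine as long as the qualifiers are literal values.
first = Edge(burton, "spouse", taylor, {"start": "1964-03-15", "end": "1974-06-26"})
second = Edge(burton, "spouse", taylor, {"start": "1975-10-10", "end": "1976-07-29"})

# Marriage-as-node: each marriage becomes something other edges can point from.
graph: list[Edge] = []
marriages: list[Node] = []
for i, e in enumerate([first, second], start=1):
    m, links = reify(e, f"marriage-{i}")
    marriages.append(m)
    graph.extend(links)

# Only now can a marriage be linked to another node (placeholder place).
somewhere = Node("some place (placeholder)")
graph.append(Edge(marriages[0], "located_in", somewhere))

binary = dumb_down(graph)
print(sorted(binary))
```

In the dumbed-down view the two marriages collapse into a single spouse 
pair per direction, which is the kind of information loss the thread is 
about.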

Interestingly, Andy Seaborne (thanks to him) pointed me to Chapter 3 of 
'Graph Databases' [1], more specifically the "Avoiding Anti-Patterns" 
section. (FTR, the book is (c) Neo Technology, so this is a position 
endorsed by one of the major LPG vendors).

 > In the general case, don’t encode entities into relationships. Use
 > relationships to convey semantics about how entities are related, and
 > the quality of those relationships.
 >
 > Domain entities aren’t always immediately visible in speech, so we must
 > think carefully about the nouns we’re actually dealing with. Verbing,
 > the language habit whereby a noun is transformed into a verb, can often
 > hide the presence of a noun and a corresponding domain entity. Technical
 > and business jargon is particularly rife with such neologisms: as we’ve
 > seen, we “email” one another, rather than send an email, “google” for
 > results, rather than search Google.

Strictly following these guidelines, one could dismiss both of Ora's 
examples as anti-patterns, as marriages and pipes are clearly nouns 
that represent entities of interest (and "marrying" or "married" 
could be considered "verbing" for marriages).

Note that I personally don't consider these examples badly designed. 
I think the boundary between "relationship with quality" and "entity" is 
blurry, and that it depends at least as much on 1) the features of the 
underlying data model and 2) your use cases as on the intrinsic 
ontological features of said entities/relationships.

   pa

[1] https://graphdatabases.com/


>
> Dan
>
>     -----Original Message-----
>     From: Peter Patel-Schneider <pfpschneider@gmail.com>
>     Sent: Saturday, 4 December, 2021 01:42
>     To: thomas lörtsch <tl@rat.io>; public-rdf-star@w3.org
>     Subject: RDF-star vs Wikidata for modelling Richard Burton
>
>     Although there is quite a bit about the Wikidata (actually
>     Wikibase) data model that I disagree with, I don't think that it
>     is fair to say that it is horrible.
>
>     As far as I can tell from
>     https://www.mediawiki.org/wiki/Wikibase/DataModel, there is no
>     ordering of the statements for an entity in Wikibase nor is there
>     any ordering of the statements for a property of an entity in
>     Wikibase.  The ordering that one sees on Wikidata pages is simply
>     an artifact of the display.  Although there may be some
>     information in Wikidata that drives this ordering it is not
>     representationally significant and can change without affecting
>     the meaning of the data in Wikidata.  (Well, at least the
>     information about Richard Burton, would not change simply because
>     ordering in the display of his marriage information changes.)
>
>     Where Wikibase is different from RDF* is that there is no
>     guarantee that there is only one statement (Wikibase's rough
>     analogue of a triple) with a given entity (subject), mainSnak property
>     (predicate), and mainSnak value (object).  This can be seen in
>     Richard Burton's spouses where there are two statements with
>     entity Richard Burton, mainSnak property spouse, and value
>     Elizabeth Taylor.  My understanding is that having no auxiliary
>     information (such as start and end times) associated with these
>     two statements would be something that should be somehow fixed up,
>     but as there are different start and end times for these two
>     statements then everything is in order.
>
>     So this use case (marrying the same person more than once) works
>     very well in Wikibase, but at the cost of potentially having what
>     might be called duplicate triples.  Users of Wikidata should be
>     aware of this possibility and arrange to do whatever their right
>     thing is when they encounter this situation.
>
>     RDF* does not handle this nearly as well.  There is discussion on
>     this very point in https://github.com/w3c/rdf-star/issues/36
>
>     Note that I'm not saying that the best way to model
>     serial-monogamy-with-possibility-of-remarriage-to-the-same-person
>     is via spouse statements.  I'm just saying that Wikibase can
>     handle decorated spouse triples much better than RDF* can.  This
>     use case provides an example where uniqueness hurts for more than
>     just beliefs or provenance and shows that the :occurence solution
>     has big problems.
>
>     peter
>
>
>     On Fri, 2021-12-03 at 12:22 +0100, thomas lörtsch wrote:
>     >
>     >
>     > Am 3. Dezember 2021 11:40:10 MEZ schrieb Pierre-Antoine Champin
>     > <pierre-antoine.champin@ercim.eu>:
>     > > On 03/12/2021 09:23, Dan Brickley wrote:
>     > >
>     > > > Yes, nice and concrete!
>     > > >
>     > > > What if Alice were Elizabeth Taylor, and Bob were Richard Burton
>     > >
>     > > great minds think alike... here is the example I recently proposed
>     > > for addition in the spec:
>     > >
>     > >
>     > > https://pr-preview.s3.amazonaws.com/w3c/rdf-star/pull/225.html#married-example
>     >
>     >
>     > You use a new property :occurrence instead of :occurrenceOf in the
>     > example before that example. Is there a reason for that?
>     >
>     > Also you use the abbreviated syntax. It might be helpful to use the
>     > unabbreviated syntax in all examples except one section where the
>     > abbreviated syntax is explained. Your example does however show that
>     > the abbreviated syntax doesn't make anything better w.r.t. the
>     > wikidata usecase.
>     >
>     > The way wikidata models this is arguably even more horrible. It is an
>     > ordered list with marriages of Richard Burton. Each marriage is a list
>     > entry which could be a blank node, etc. Elizabeth Taylor presumably
>     > has her own list of marriages and nothing but ardent querying connects
>     > the two lists I suppose.
>     >
>     >
>     > > and that was before reading Ora's use-cases... ;)
>     >
>     > The wikidata use case has been part of the RDF* use cases for a year
>     > now, or two?
>     >
>     >
>     > Maybe it would be helpful to name this problem properly - not
>     > "wikidata usecase" but rather "multiset problem", or "multiset use case"?
>     >
>     >
>     > Best,
>     > Thomas
>     >
>     >
>     > > > Wikidata's data model records their two marriages as edge
>     > > > annotations, on the
>     > > > https://m.wikidata.org/wiki/Property:P26 relationship linking them.
>     > > >
>     > > > Eg. https://m.wikidata.org/wiki/Q34851
>     > > > https://m.wikidata.org/wiki/Q151973
>     > > >
>     > > > From a Schema.org point of view, being able to express more of
>     > > > what Wikidata can do seems attractive for interop.
>     > > >
>     > > > Dan
>     > > >
>     > > > In Burton's entry we have :
>     > > > spouse
>     > > >
>     > > > Elizabeth Taylor
>     > > > start time 15 March 1964
>     > > > end time 26 June 1974
>     > > > series ordinal 2
>     > > > 1 reference
>     > > > The Peerage person ID p33443.htm#i334430 retrieved 7 August 2020
>     > > >
>     > > > Sybil Christopher
>     > > > start time 5 February 1949
>     > > > February 1949
>     > > > end time 5 December 1963
>     > > > series ordinal 1
>     > > > 1 reference
>     > > > The Peerage person ID p33443.htm#i334430 retrieved 7 August 2020
>     > > >
>     > > > Suzy Miller
>     > > > end time 1982
>     > > > start time 21 August 1976
>     > > > series ordinal 4
>     > > > 0 references
>     > > >
>     > > > Elizabeth Taylor
>     > > > start time 10 October 1975
>     > > > end time 29 July 1976
>     > > > series ordinal 3
>     > > > 0 references
>     > > >
>     > > > Sally Burton
>     > > > start time 3 July 1983
>     > > > end time 5 August 1984
>     > > > series ordinal 5
>     > > > 0 references
>     > > >
>     > > >
>     > > > On Thu, 2 Dec 2021, 21:12 Jos De Roo, <josderoo@gmail.com> wrote:
>     > > >
>     > > >     Hi Ora,
>     > > >
>     > > >     Very nice use cases and to me it looks quite natural to
>     > > >     express them as
>     > > >
>     > > >     :Bob :isMarriedTo :Alice .
>     > > >     [] :repr << :Bob :isMarriedTo :Alice >>; :since 2020 ; :source :NYTimes .
>     > > >     [] :repr << :Bob :isMarriedTo :Alice >>; :since 2021 ; :source :WashingtonPost .
>     > > >
>     > > >     :M1 :pipe :M2 .
>     > > >     [] :repr << :M1 :pipe :M2 >>; :size "DN 100"; :schedule "30" .
>     > > >     [] :repr << :M1 :pipe :M2 >>; :size "DN 125"; :schedule "10" .
>     > > >
>     > > >
>     > > >     Kind regards,
>     > > >     Jos
>     > > >
>     > > >     -- https://josd.github.io
>     > > >
>     > > >
>     > > >     On Thu, Dec 2, 2021 at 7:44 PM Lassila, Ora <ora@amazon.com> wrote:
>     > > >
>     > > >         Folks,
>     > > >
>     > > >         Attached is a document that outlines a couple of use
>     > > >         cases (variants of one modeling pattern, really) we
>     > > >         would like to submit for consideration by the upcoming
>     > > >         RDF-star Working Group. I am submitting these now just
>     > > >         in case this turns out to be relevant to how the
>     > > >         charter gets written. Comments are welcome, and I am
>     > > >         happy to discuss these use cases whenever.
>     > > >
>     > > >         Regards,
>     > > >
>     > > >         Ora
>     > > >
>     > > >         --
>     > > >
>     > > >         Dr. Ora Lassila
>     > > >
>     > > >         Principal Graph Technologist, Amazon Neptune
>     > > >
>     > > >         Amazon Web Services
>     > > >
>     > > > ora@amazon.com
>     > > >
>     >
>
>

Received on Tuesday, 7 December 2021 13:52:16 UTC