Re: publicly available RDF* datasets

> But that would be a really bad encoding which should never have been considered in the first place. At this point one needs just a little experience with ontology design. This would work:
>
> <JohnDoe presOf Bar> during _:x .
> _:x rdf:type TimePeriod .
> _:x startDate 1996 .
> _:x endDate 2002 .
>
> or of course a skolemization of it to avoid the bnode. Better still would be a typed literal using a datatype of time periods, if only we had such a thing.

That's a very good point for the case of time annotations. Thank you!

However, Wikidata qualifiers are used, and often overused, for a lot
of other annotations.

We could have things like "JohnDoe has been the 2nd president of Bar
between 1996 and 2002 and 4th president of Bar between 2008 and 2012,
replacing AliceDoe, according to Source".
This is what Wikidata contributors often encode in qualifiers. See, for
example: https://www.wikidata.org/wiki/Q7747#P39

If we still want to use blank nodes and be generic, I guess we would
end up with something like:

<JohnDoe presOf Bar> annotatedBy _:x .
_:x rdf:type Annotation .
_:x startDate 1996 .
_:x endDate 2002 .
_:x rank 2 .

<JohnDoe presOf Bar> annotatedBy _:y .
_:y rdf:type Annotation .
_:y startDate 2008 .
_:y endDate 2012 .
_:y rank 4 .
_:y replaces AliceDoe .
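
To make the grouping explicit, here is a rough Python sketch. It is only
a toy: plain tuples stand in for a triple store, a nested tuple stands in
for the quoted triple, and the names MAIN, TRIPLES and annotations() are
made up for this illustration, not part of any RDF* tool.

```python
# Toy in-memory triples; the nested tuple plays the role of the
# quoted triple <JohnDoe presOf Bar>. All names are illustrative.
MAIN = ("JohnDoe", "presOf", "Bar")

TRIPLES = [
    (MAIN, "annotatedBy", "_:x"),
    ("_:x", "rdf:type", "Annotation"),
    ("_:x", "startDate", 1996),
    ("_:x", "endDate", 2002),
    ("_:x", "rank", 2),
    (MAIN, "annotatedBy", "_:y"),
    ("_:y", "rdf:type", "Annotation"),
    ("_:y", "startDate", 2008),
    ("_:y", "endDate", 2012),
    ("_:y", "rank", 4),
    ("_:y", "replaces", "AliceDoe"),
]


def annotations(main, triples):
    """Return one dict of qualifiers per annotation node attached to `main`."""
    nodes = [o for s, p, o in triples if s == main and p == "annotatedBy"]
    result = []
    for node in nodes:
        # Everything hanging off the same node belongs to the same statement.
        props = {p: o for s, p, o in triples if s == node and p != "rdf:type"}
        result.append(props)
    return result


for qualifiers in annotations(MAIN, TRIPLES):
    print(qualifiers)
```

Because every qualifier of one Wikidata statement hangs off the same
node, the start date, end date and rank of each statement stay together.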

If we instead model each kind of annotation separately, I believe we
would be back at the initial problem. E.g., with this:

<JohnDoe presOf Bar> during _:x .
_:x rdf:type TimePeriod .
_:x startDate 1996 .
_:x endDate 2002 .
<JohnDoe presOf Bar> rank 2 .
<JohnDoe presOf Bar> during _:y .
_:y rdf:type TimePeriod .
_:y startDate 2008 .
_:y endDate 2012 .
<JohnDoe presOf Bar> rank 4 .

I don't see a way for a query to get the proper (position, time
period, rank) tuples for JohnDoe: nothing ties rank 2 to the 1996-2002
period rather than to the 2008-2012 one.
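
A rough Python sketch of the problem, again with plain tuples standing in
for a triple store and with made-up helper names (objects() and
period_rank_pairs() are only for this illustration). Joining the time
periods with the ranks can only give the full cross product:

```python
from itertools import product

# Toy in-memory triples for the "split" encoding; the nested tuple plays
# the role of the quoted triple <JohnDoe presOf Bar>.
MAIN = ("JohnDoe", "presOf", "Bar")

TRIPLES = [
    (MAIN, "during", "_:x"),
    ("_:x", "startDate", 1996),
    ("_:x", "endDate", 2002),
    (MAIN, "rank", 2),
    (MAIN, "during", "_:y"),
    ("_:y", "startDate", 2008),
    ("_:y", "endDate", 2012),
    (MAIN, "rank", 4),
]


def objects(subject, predicate, triples):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def period_rank_pairs(main, triples):
    """Pair every time period with every rank, as a naive join would."""
    periods = [
        (objects(n, "startDate", triples)[0], objects(n, "endDate", triples)[0])
        for n in objects(main, "during", triples)
    ]
    ranks = objects(main, "rank", triples)
    # Periods and ranks share only the quoted triple, so every combination
    # matches.
    return list(product(periods, ranks))


for pair in period_rank_pairs(MAIN, TRIPLES):
    print(pair)
```

This yields four (period, rank) pairs instead of two, including the wrong
ones such as rank 4 for 1996-2002.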

There are indeed much better ways to encode these facts, but my aim was
to find a proper encoding of the existing Wikidata ontology in RDF*,
not to build another ontology based on Wikidata content.

Do you see another way?

Best,

Thomas



On Wed, 2 Sep 2020 at 08:07, Patrick J Hayes <phayes@ihmc.us> wrote:
>
>
>
> > On Sep 1, 2020, at 8:41 AM, Thomas Pellissier Tanon <thomas@pellissier-tanon.fr> wrote:
> >
> > Hi!
> >
> > > I don't know if anyone has attempted it yet, but an RDFStar version of Wikidata could be very interesting. There are a lot of per-factoid annotations.
> >
> > I had a look at it while building YAGO 4. There are two challenges with Wikidata mapping to RDF*:
> >
> > 1. Different statements could have the same "main triple". For example, Wikidata could have a first statement stating that JohnDoe has been the president of Bar between 1996 and 2002 and another statement stating he has been the president of Bar between 2008 and 2012. A simple RDF* encoding would lead to:
> > <JohnDoe presidentOf Bar> startDate 1996 .
> > <JohnDoe presidentOf Bar> endDate 2002 .
> > <JohnDoe presidentOf Bar> startDate 2008 .
> > <JohnDoe presidentOf Bar> endDate 2012 .
> > This encoding might lead query and reasoning systems to assume that JohnDoe has been the president of Bar between 1996 and 2012, which is wrong.
>
> But that would be a really bad encoding which should never have been considered in the first place. At this point one needs just a little experience with ontology design. This would work:
>
> <JohnDoe presOf Bar> during _:x .
> _:x rdf:type TimePeriod .
> _:x startDate 1996 .
> _:x endDate 2002 .
>
> or of course a skolemization of it to avoid the bnode. Better still would be a typed literal using a datatype of time periods, if only we had such a thing.
>
> Pat Hayes
>
>
>
>
> >
> > 2. Wikidata contains "deprecated" statements that should not be asserted as facts. For example, we could have in an RDF*-like syntax:
> > <JohnDoe presidentOf Bar> prov:wasDerivedFrom Source ; wikibase:rank wikibase:Deprecated .
> > In this case the fact "JohnDoe presidentOf Bar" should not be asserted by itself.
> > So, an RDF*-Wikidata would only be valid in "SA" mode and not in "PG" mode.
> >
> > Thomas
> >
> >
> > On Tue, 1 Sep 2020 at 10:54, Dan Brickley <danbri@google.com> wrote:
> >> On Tue, 1 Sep 2020 at 08:53, Jeen Broekstra <jb@metaphacts.com> wrote:
> >>> Hi folks,
> >>> Does anyone have any pointers to publicly available datasets that make use of RDF*?
> >>> I am aware that Yago 4 makes some limited use of RDF* annotations, but I was curious if there are any other good examples that people use for testing, demonstration, or even production use.
> >> I don't know if anyone has attempted it yet, but an RDFStar version of Wikidata could be very interesting. There are a lot of per-factoid annotations.
> >> https://www.mediawiki.org/wiki/Wikibase/DataModel
> >>> Regards,
> >>> Jeen
> >>> --
> >>> Dr Jeen Broekstra (he, him)
> >>> principal software engineer
> >>> jb@metaphacts.com
> >>> www.metaphacts.com
> >
> >
> >
> >
>

Received on Wednesday, 2 September 2020 14:36:12 UTC