Re: Wikidata export in RDF

Hi Denny,
sorry for jumping into the thread a bit late.
Last year, in the Ontology Engineering Group, we published a whole provenance
dataset [1] relying on the Open Provenance Model [2], which also uses the n-ary
pattern to qualify some properties (in a very similar way to PROV). Although we
are moving towards PROV, it may show you how to publish and exploit your data,
with a lot of examples :)

If you are planning to add any provenance information (looking at the wiki,
the properties that may be useful for you are prov:wasDerivedFrom,
prov:wasRevisionOf, prov:wasInfluencedBy or prov:hadPrimarySource, as Jun
suggested), I would encourage you to align your approach with PROV's, as it
will make your records more interoperable.
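
As a rough sketch of what such an alignment could look like in Turtle: the ex:
IRIs for the statement and for the yearbook are made up for illustration (how
Wikidata will actually mint them is an assumption on my side), and only the
prov: terms come from PROV-O.

```turtle
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix ex:   <http://example.org/> .   # hypothetical namespace, illustration only

# Hypothetical IRI for the statement "Berlin has a population of 3,499,879"
ex:Berlin_population_statement
    a prov:Entity ;
    # link from the statement to the source it was taken from
    prov:hadPrimarySource ex:Statistical_Yearbook_Germany_2012 .

# Hypothetical IRI describing the source itself
ex:Statistical_Yearbook_Germany_2012
    a prov:Entity .
```

prov:hadPrimarySource is a specialisation of prov:wasDerivedFrom, so the more
general property could be used in the same position if the source is not
necessarily a primary one.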

Finally (and just for the record), you don't need to create ids for the
qualified statements when you want to add extra information. Sometimes creating
a blank node is enough. For example, the qualified ground triple could be
represented as:
w:Berlin s:Population [
        rdf:type o:Statement ;
        v:Population "3499879"^^xsd:integer ;
        q:As_of "2011-11-30"^^xsd:date ;
        q:Method w:Extrapolation ;
        rdfs:label "3,499,879 (As of Nov 30, 2011, Method Extrapolation)"@en
    ] .
The approach you have followed is also valid.
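
For comparison, here is a sketch of the same data with a named statement node
instead of the blank node above. The namespace IRIs and the statement name
w:Berlin_population_2011 are placeholders I invented for the example; the real
ones are whatever the wiki page defines.

```turtle
# Placeholder namespaces (assumptions, not the real Wikidata ones)
@prefix w:   <http://example.org/wiki/> .
@prefix s:   <http://example.org/statement/> .
@prefix v:   <http://example.org/value/> .
@prefix q:   <http://example.org/qualifier/> .
@prefix o:   <http://example.org/ontology/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The statement node now has its own (hypothetical) IRI
w:Berlin s:Population w:Berlin_population_2011 .

w:Berlin_population_2011
    rdf:type o:Statement ;
    v:Population "3499879"^^xsd:integer ;
    q:As_of "2011-11-30"^^xsd:date ;
    q:Method w:Extrapolation .
```

The trade-off is that a named statement can be referenced from other
documents (for instance to attach a source to it), while a blank node can only
be described inside the graph where it appears.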
I hope this helps.

Cheers,
Daniel

[1] http://webenemasuno.linkeddata.es/index_en.html (SPARQL with examples
    at http://webenemasuno.linkeddata.es/sparql_en.html)
[2] http://openprovenance.org/

2012/8/8 Denny Vrandečić <denny.vrandecic@wikimedia.de>

> Hi Hugh,
>
> thank you for the pointer. I had heard about CIDOC CRM, but I had not
> realized how close it is to what we are doing. My trouble is that
> there are at least 200 pages of specification for CIDOC CRM; I
> tried to take a look at it, but I do not have the time to become an
> expert in CIDOC CRM myself.
>
> I would like to invite someone either to create a draft of how our data
> model interplays with CIDOC CRM (E17 specifically, it seems) and what
> effect this has on our export (my assumption is that some of the URIs we
> use can actually be replaced by CIDOC URIs), or to have a discussion with
> me to see how they fit together.
>
> But thank you very much, this seems to be indeed much closer to
> Wikidata than I expected, by far.
>
> Cheers,
> Denny
>
>
> 2012/8/8 Hugh Glaser <hg@ecs.soton.ac.uk>:
> > Hi Denny,
> > Great stuff.
> > I've been watching the discussion, and am puzzled a bit about what you
> > are modelling.
> > This message gets me to ask :-)
> > What you are doing looks very like what cultural heritage (museums,
> > libraries, archeologists, etc.) do.
> > A model that seems to work very well for all this is CIDOC/CRM, which is
> > in active use or consideration by a wide range of organisations from
> > cultural heritage.
> > http://www.cidoc-crm.org/official_release_cidoc.html
> >
> > It is what we used for some work at the British Museum, and OCLC,
> > Europeana, and various archeological places, for example.
> >
> > CIDOC/CRM is an event-based model.
> > This means it looks slightly strange to people who are used to making
> > statements about things, and think they are true, rather than an opinion,
> > but it comes very naturally once the power is understood; and it is not
> > hard to query.
> >
> > But it copes with statements made at different times, and even
> > conflicting ones.
> >
> > Just wondering if it is a closely related application area, and if you
> > had considered it.
> > Best
> > Hugh
> >
> >
> > On 8 Aug 2012, at 13:05, Denny Vrandečić <denny.vrandecic@wikimedia.de>
> >  wrote:
> >
> >> Hi Jun,
> >>
> >> thank you for taking the time to look into the document and comment on it.
> >>
> >> I expect that Wikipedia will almost never be the source for a
> >> statement expressed in Wikidata, as Wikipedia will probably not be
> >> regarded as a reliable source.
> >>
> >> In the example you link to are two statements:
> >>
> >> 1) Berlin has a population of 3,499,879 as of Nov 30, 2011, the method
> >> for deriving this was an extrapolation
> >>
> >> 2) Berlin has a population of 8,000 as of the 15th century
> >>
> >> Statement 1 has in the example no sources, but a good source would be
> >> the statistical yearbook of Germany, 2012 edition.
> >>
> >> Statement 2 has one source in the example, and this could be, e.g. a
> >> scientific paper about the development of the population of European
> >> cities in medieval times.
> >>
> >> The method and the time are both not provenance information, but
> >> qualifiers of the statements and thus part of the statement.
> >>
> >> Every statement has an IRI. And the source will also have an IRI
> >> describing it (i.e. an IRI for the statistical yearbook, an IRI for
> >> the mentioned paper).
> >>
> >> What I did not figure out is: which property from the provenance
> >> ontology can I use to connect the statement IRI to the source IRI?
> >>
> >> Thank you for your help!
> >>
> >> Cheers,
> >> Denny
> >>
> >>
> >>
> >> 2012/8/8 Jun Zhao <jun.zhao@zoo.ox.ac.uk>:
> >>> Hi Denny,
> >>>
> >>> I have been looking for the motivation of this work on your page [1]. I
> >>> guessed that your main goal was trying to express facts about the same
> >>> entity but coming from different perspectives and sources? Did you get
> >>> all of these diverse facts from Wikipedia? It would be nice to have
> >>> provenance statements that go one step further than just saying
> >>> "Method Extrapolation" or "as of 15th century".
> >>>
> >>> The patterns you used here are highly related to PROV [2,3],
> >>> particularly the bundle and qualification structure of the latest PROV
> >>> data model. Please do not hesitate to ping us if you find any
> >>> impracticality or even problems in the current model. We will really
> >>> appreciate your feedback!
> >>>
> >>> [1] http://meta.wikimedia.org/wiki/Wikidata/Development/RDF
> >>> [2] http://www.w3.org/TR/prov-dm/
> >>> [3] http://www.w3.org/TR/prov-o/
> >>>
> >>> Good work!
> >>>
> >>> Cheers,
> >>>
> >>> Jun
> >>>
> >>>
> >>> On 08/08/2012 11:32, Denny Vrandečić wrote:
> >>>>
> >>>> Ivan,
> >>>>
> >>>> thank you! It is reassuring to hear that we are not without
> >>>> precedent :)
> >>>>
> >>>> We are investigating how we could use the provenance ontology, as we
> >>>> sure would like to reuse existing stuff instead of inventing something
> >>>> new.
> >>>>
> >>>> Cheers,
> >>>> Denny
> >>>>
> >>>> 2012/8/7 Ivan Herman <ivan@w3.org>:
> >>>>>
> >>>>> Denny,
> >>>>>
> >>>>> fwiw, the approach you take is very similar to what the Provenance
> >>>>> Working group took in the upcoming Prov vocabulary. Look, for
> >>>>> example, in
> >>>>>
> >>>>> http://www.w3.org/TR/prov-primer/
> >>>>>
> >>>>> and for 'qualifiedXXXX'. Essentially, if there is a property 'p'
> >>>>> then the 'qualifiedP' is another property whose range is an object of
> >>>>> a specific type that has the other information. Same as our
> >>>>> s:Population.
> >>>>>
> >>>>> As an aside, the good thing is that it may make it easier to use the
> >>>>> provenance vocabulary in your setup if you want to:-)
> >>>>>
> >>>>> Cheers
> >>>>>
> >>>>> Ivan
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Aug 6, 2012, at 12:03 , Denny Vrandečić wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> we have created the first draft of the Wikidata export in RDF.
> >>>>>>
> >>>>>> <http://meta.wikimedia.org/wiki/Wikidata/Development/RDF>
> >>>>>>
> >>>>>> I am inviting the Semantic Web and Linked Data community to a
> >>>>>> discussion about it.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Denny
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Project director Wikidata
> >>>>>> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
> >>>>>> Tel. +49-30-219 158 26-0 | http://wikimedia.de
> >>>>>>
> >>>>>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
> >>>>>> Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
> >>>>>> unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
> >>>>>> Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
> >>>>>>
> >>>>>
> >>>>>
> >>>>> ----
> >>>>> Ivan Herman, W3C Semantic Web Activity Lead
> >>>>> Home: http://www.w3.org/People/Ivan/
> >>>>> mobile: +31-641044153
> >>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>
> >>> --
> >>> Jun Zhao, PhD
> >>> EPSRC Postdoctoral Fellow
> >>> Department of Zoology
> >>> University of Oxford
> >>> Tinbergen Building, South Parks Road
> >>> Oxford, OX1 3PS, UK
> >>>
> >>
> >>
> >>
> >>
> >
>
>
>
>
>

Received on Wednesday, 8 August 2012 22:17:46 UTC