Re: Wikidata export in RDF

Hi Daniel,

thank you for the comments. This further validates the approach we
have selected. I am also happy to see the relevant Provenance ontology
properties listed for easier reference.

I dislike blank nodes for several reasons, and I do not see any
advantage for consumers or reusers of data when blank nodes are used.
I see a minor advantage for authors, as they are spared the work of
creating an IRI. If someone from the outside wanted to address a
statement from Wikidata, e.g. to state that they like it, or that they
consider it not true, etc., a blank node would not allow them to do
so. IRIs seem a more natural choice for a web that wants to further
interconnection and reuse.
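
To illustrate the point about addressing a statement from the outside,
here is a minimal sketch in plain Python (no RDF library). It only
mimics the rule that blank nodes from different documents are kept
apart when graphs are merged. All names in it (w:Berlin_Population_1,
ex:Alice, ex:disputes, the _g0/_g1 renaming suffix) are made up for
illustration and are not part of the Wikidata export.

```python
# Triples are plain (subject, predicate, object) tuples. Every name below
# (w:Berlin_Population_1, ex:Alice, ex:disputes) is hypothetical.

def is_blank(term):
    """Blank node labels are written _:label, as in Turtle."""
    return isinstance(term, str) and term.startswith("_:")

def merge(*graphs):
    """Merge graphs, renaming blank nodes so each graph keeps its own."""
    merged = set()
    for index, graph in enumerate(graphs):
        for triple in graph:
            merged.add(tuple(
                f"{term}_g{index}" if is_blank(term) else term
                for term in triple
            ))
    return merged

# Export with one statement named by an IRI and one given as a blank node.
export = {
    ("w:Berlin", "s:Population", "w:Berlin_Population_1"),
    ("w:Berlin_Population_1", "v:Population", 3499879),
    ("w:Berlin", "s:Population", "_:b0"),
    ("_:b0", "v:Population", 8000),
}

# A third party tries to comment on both statements from its own document.
comments = {
    ("ex:Alice", "ex:disputes", "w:Berlin_Population_1"),
    ("ex:Alice", "ex:disputes", "_:b0"),  # a different node, despite the label
}

def disputed_values(graph):
    """Population values of statements that someone disputes."""
    disputed = {o for (_, p, o) in graph if p == "ex:disputes"}
    return {o for (s, p, o) in graph if p == "v:Population" and s in disputed}

print(disputed_values(merge(export, comments)))  # prints {3499879}
```

Only the statement with an IRI is reached by the outside comment; the
comment aimed at _:b0 ends up attached to a fresh, unrelated node.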

Cheers,
Denny



2012/8/8 Daniel Garijo <dgarijo@fi.upm.es>:
> Hi Denny,
> sorry for jumping in a bit late in the thread.
> In the Ontology Engineering Group we published a whole provenance dataset
> [1] last year, relying on the Open Provenance Model [2], which also uses
> the n-ary pattern to qualify some properties (in a very similar way to
> PROV). Although we are moving towards PROV, it may show you how to publish
> and exploit your data, with a lot of examples :)
>
> If you are planning to add any provenance information (judging from the
> wiki, the properties that may be useful for you are prov:wasDerivedFrom,
> prov:wasRevisionOf, prov:wasInfluencedBy or prov:hadPrimarySource, as Jun
> suggested), I would encourage you to align your approach with PROV's; it
> will make your records more interoperable.
>
> Finally (and just for the record), you don't need to create ids for the
> qualified statements when you want to add extra information. Sometimes
> creating a blank node is enough. For example, the qualified ground triple
> could be represented as:
> w:Berlin s:Population [
>         rdf:type o:Statement ;
>         v:Population "3499879"^^xsd:integer ;
>         q:As_of "2011-11-30"^^xsd:date ;
>         q:Method w:Extrapolation ;
>         rdfs:label "3,499,879 (As of Nov 30, 2011, Method Extrapolation)"@en
>     ] .
> The approach you have followed is also valid.
> I hope this helps.
>
> Cheers,
> Daniel
>
> [1].- http://webenemasuno.linkeddata.es/index_en.html, (SPARQL with examples
> at http://webenemasuno.linkeddata.es/sparql_en.html)
> [2].- http://openprovenance.org/
>
> 2012/8/8 Denny Vrandečić <denny.vrandecic@wikimedia.de>
>>
>> Hi Hugh,
>>
>> thank you for the pointer. I had heard about CIDOC CRM, but I had not
>> realized how close it is to what we are doing. My trouble is that
>> there are now at least 200 pages of specification for CIDOC CRM; I
>> tried to take a look at it, but I do not have the time to become an
>> expert in CIDOC CRM myself.
>>
>> I would invite someone either to create a draft of how our data model
>> interplays with CIDOC CRM (E17 specifically, it seems) and what effect
>> this has on our export (my assumption is that some of the URIs we use
>> can actually be replaced by CIDOC URIs), or to have a discussion with
>> me to see how they fit together.
>>
>> But thank you very much, this seems to be indeed much closer to
>> Wikidata than I expected, by far.
>>
>> Cheers,
>> Denny
>>
>>
>> 2012/8/8 Hugh Glaser <hg@ecs.soton.ac.uk>:
>> > Hi Denny,
>> > Great stuff.
>> > I've been watching the discussion, and am puzzled a bit about what you
>> > are modelling.
>> > This message gets me to ask :-)
>> > What you are doing looks very like what cultural heritage (museums,
>> > libraries, archeologists, etc.) do.
>> > A model that seems to work very well for all this is CIDOC/CRM, which is
>> > in active use or consideration by a wide range of organisations from
>> > cultural heritage.
>> > http://www.cidoc-crm.org/official_release_cidoc.html
>> >
>> > It is what we used for some work at the British Museum, and OCLC,
>> > Europeana, and various archeological places, for example.
>> >
>> > CIDOC/CRM is an event-based model.
>> > This means it looks slightly strange to people who are used to making
>> > statements about things, and think they are true, rather than an opinion,
>> > but it comes very naturally once the power is understood; and it is not hard
>> > to query.
>> >
>> > But it copes with statements made at different times, and even
>> > conflicting ones.
>> >
>> > Just wondering if it is a closely related application area, and if you
>> > had considered it.
>> > Best
>> > Hugh
>> >
>> >
>> > On 8 Aug 2012, at 13:05, Denny Vrandečić <denny.vrandecic@wikimedia.de>
>> >  wrote:
>> >
>> >> Hi Jun,
>> >>
>> >> thank you for taking the time to look into the document and comment on it.
>> >>
>> >> I expect that Wikipedia will almost never be the source for a
>> >> statement expressed in Wikidata, as Wikipedia will probably not be
>> >> regarded as a reliable source.
>> >>
>> >> The example you link to contains two statements:
>> >>
>> >> 1) Berlin has a population of 3,499,879 as of Nov 30, 2011, the method
>> >> for deriving this was an extrapolation
>> >>
>> >> 2) Berlin has a population of 8,000 as of the 15th century
>> >>
>> >> Statement 1 has in the example no sources, but a good source would be
>> >> the statistical yearbook of Germany, 2012 edition.
>> >>
>> >> Statement 2 has one source in the example, and this could be, e.g., a
>> >> scientific paper about the development of the population of European
>> >> cities in medieval times.
>> >>
>> >> Neither the method nor the time is provenance information; both are
>> >> qualifiers of the statement and thus part of the statement.
>> >>
>> >> Every statement has an IRI. And the source will also have an IRI
>> >> describing it (i.e. an IRI for the statistical yearbook, an IRI for
>> >> the mentioned paper).
>> >>
>> >> What I did not figure out is: which property from the provenance
>> >> ontology can I use to connect the statement IRI to the source IRI?
>> >>
>> >> Thank you for your help!
>> >>
>> >> Cheers,
>> >> Denny
>> >>
>> >>
>> >>
>> >> 2012/8/8 Jun Zhao <jun.zhao@zoo.ox.ac.uk>:
>> >>> Hi Denny,
>> >>>
>> >>> I have been looking for the motivation of this work on your page [1]. I
>> >>> guessed that your main goal was to express facts about the same
>> >>> entity but coming from different perspectives and sources? Did you get
>> >>> all of these diverse facts from Wikipedia? It would be nice to have
>> >>> provenance statements that go one step further than just saying
>> >>> "Method Extrapolation" or "as of 15th century".
>> >>>
>> >>> The patterns you used here are highly related to PROV [2,3],
>> >>> particularly the bundle and qualification structure of the latest PROV
>> >>> data model. Please do not hesitate to ping us if you find any
>> >>> impracticality or even problems in the current model. We will really
>> >>> appreciate your feedback!
>> >>>
>> >>> [1] http://meta.wikimedia.org/wiki/Wikidata/Development/RDF
>> >>> [2] http://www.w3.org/TR/prov-dm/
>> >>> [3] http://www.w3.org/TR/prov-o/
>> >>>
>> >>> Good work!
>> >>>
>> >>> Cheers,
>> >>>
>> >>> Jun
>> >>>
>> >>>
>> >>> On 08/08/2012 11:32, Denny Vrandečić wrote:
>> >>>>
>> >>>> Ivan,
>> >>>>
>> >>>> thank you! It is reassuring to hear that we are not without
>> >>>> precedent :)
>> >>>>
>> >>>> We are investigating how we could use the provenance ontology, as we
>> >>>> would certainly like to reuse existing work instead of inventing
>> >>>> something new.
>> >>>>
>> >>>> Cheers,
>> >>>> Denny
>> >>>>
>> >>>> 2012/8/7 Ivan Herman <ivan@w3.org>:
>> >>>>>
>> >>>>> Denny,
>> >>>>>
>> >>>>> fwiw, the approach you take is very similar to what the Provenance
>> >>>>> Working Group took in the upcoming PROV vocabulary. Look, for
>> >>>>> example, in
>> >>>>>
>> >>>>> http://www.w3.org/TR/prov-primer/
>> >>>>>
>> >>>>> and search for 'qualifiedXXXX'. Essentially, if there is a property
>> >>>>> 'p', then 'qualifiedP' is another property whose range is an object
>> >>>>> of a specific type that carries the other information. Same as your
>> >>>>> s:Population.
>> >>>>>
>> >>>>> As an aside, the good thing is that it may make it easier to use the
>> >>>>> provenance vocabulary in your setup if you want to :-)
>> >>>>>
>> >>>>> Cheers
>> >>>>>
>> >>>>> Ivan
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> On Aug 6, 2012, at 12:03 , Denny Vrandečić wrote:
>> >>>>>
>> >>>>>> Hi all,
>> >>>>>>
>> >>>>>> we have created the first draft of the Wikidata export in RDF.
>> >>>>>>
>> >>>>>> <http://meta.wikimedia.org/wiki/Wikidata/Development/RDF>
>> >>>>>>
>> >>>>>> I am inviting the Semantic Web and Linked Data community to a
>> >>>>>> discussion about it.
>> >>>>>>
>> >>>>>> Cheers,
>> >>>>>> Denny
>> >>>>>>
>> >>>>>>
>> >>>>>> --
>> >>>>>> Project director Wikidata
>> >>>>>> Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
>> >>>>>> Tel. +49-30-219 158 26-0 | http://wikimedia.de
>> >>>>>>
>> >>>>>> Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens
>> >>>>>> e.V.
>> >>>>>> Eingetragen im Vereinsregister des Amtsgerichts
>> >>>>>> Berlin-Charlottenburg
>> >>>>>> unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
>> >>>>>> Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>> ----
>> >>>>> Ivan Herman, W3C Semantic Web Activity Lead
>> >>>>> Home: http://www.w3.org/People/Ivan/
>> >>>>> mobile: +31-641044153
>> >>>>> FOAF: http://www.ivan-herman.net/foaf.rdf
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>
>> >>> --
>> >>> Jun Zhao, PhD
>> >>> EPSRC Postdoctoral Fellow
>> >>> Department of Zoology
>> >>> University of Oxford
>> >>> Tinbergen Building, South Parks Road
>> >>> Oxford, OX1 3PS, UK
>> >>>
>> >
>



-- 
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg
unter der Nummer 23855 B. Als gemeinnützig anerkannt durch das
Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.

Received on Thursday, 9 August 2012 08:22:16 UTC