DBpedia/YAGO (was Re: Planned changes to the VIAF RDF)

Jeff, others,

I'm wondering what this has to do with the original post, but ok...

YAGO is in fact an ontology extracted mostly from Wikipedia, but with some extra cleaning and enrichment applied. It is the work of a computer science lab rather than of Wikipedians alone, so I'm not sure how anyone from the library community could influence it through Wikipedia.
Resources such as http://dbpedia.org/resource/Category:American_inventors are direct conversions of Wikipedia categories. Hence the use of SKOS, which avoids all the logical issues that a direct conversion to OWL classes would have caused.
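
To make the distinction concrete, here is a minimal sketch in plain Python (no RDF library assumed) built on the two Buckminster Fuller triples quoted below. The helper names and the parent category are my own illustrative assumptions, not anything DBpedia or YAGO defines. It only shows that class membership (rdf:type) and subject indexing (dcterms:subject) are separate statements, so a category can sit in a SKOS hierarchy without anyone claiming it is a class.

```python
# Triples are plain (subject, predicate, object) tuples.
RDF_TYPE = "rdf:type"
DCT_SUBJECT = "dcterms:subject"
SKOS_BROADER = "skos:broader"

FULLER = "http://dbpedia.org/resource/Buckminster_Fuller"
YAGO_CLASS = "http://dbpedia.org/class/yago/AmericanInventors"
CATEGORY = "http://dbpedia.org/resource/Category:American_inventors"
# Hypothetical parent category, included only to illustrate skos:broader.
PARENT_CATEGORY = "http://dbpedia.org/resource/Category:Inventors_by_nationality"

TRIPLES = [
    (FULLER, RDF_TYPE, YAGO_CLASS),             # class membership
    (FULLER, DCT_SUBJECT, CATEGORY),            # subject indexing
    (CATEGORY, SKOS_BROADER, PARENT_CATEGORY),  # concept hierarchy (assumed)
]


def instances_of(cls, triples):
    """Resources asserted to be members of a class via rdf:type."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}


def filed_under(concept, triples):
    """Resources indexed under a concept via dcterms:subject."""
    return {s for s, p, o in triples if p == DCT_SUBJECT and o == concept}


def broader_concepts(concept, triples):
    """Direct skos:broader parents of a concept."""
    return {o for s, p, o in triples if s == concept and p == SKOS_BROADER}


if __name__ == "__main__":
    print(instances_of(YAGO_CLASS, TRIPLES))
    print(filed_under(CATEGORY, TRIPLES))
    # The category has no instances: nothing is typed as the SKOS concept.
    print(instances_of(CATEGORY, TRIPLES))
```

Because skos:broader carries no subclass semantics, an odd category placement in Wikipedia only produces an odd browsing hierarchy, not a wrong inference about what kind of thing a resource is.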

Antoine



> Karen,
>
> Offline, Tom and I briefly examined Wikipedia's http://en.wikipedia.org/wiki/Category:American_inventors category, and it looks like DBpedia is coining both a skos:Concept and an rdfs:Class for this category. I assume this is a general pattern, but don't know for sure. Here's Buckminster Fuller as an example:
>
> <http://dbpedia.org/resource/Buckminster_Fuller>
> 	rdf:type <http://dbpedia.org/class/yago/AmericanInventors> ;
> 	dcterms:subject <http://dbpedia.org/resource/Category:American_inventors> .
>
> If you click on either of the two, the properties and individuals seem to sort out too well to be dumb luck. The yago rdfs:Class appears to have mostly people in it, and the category skos:Concept seems to deal with the conceptual hierarchy aspects. There are some oddballs, but somehow it seems to work reasonably well. Given the weight of DBpedia in the LOD cloud, I think it would be well worth learning the mechanisms and registering as an editor to start fixing the details and data from a library domain perspective.
>
> http://mappings.dbpedia.org/index.php/Main_Page#How_is_the_Mapping_and_the_Ontology_maintained.3F
>
> Jeff
>
>> -----Original Message-----
>> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
>> Sent: Wednesday, April 13, 2011 6:04 PM
>> To: Young,Jeff (OR)
>> Cc: Tom Morris; Dan Brickley; Ed Summers; public-lld@w3.org
>> Subject: RE: Planned changes to the VIAF RDF
>>
>> Quoting "Young,Jeff (OR)"<jyoung@oclc.org>:
>>
>>> I disagree that these aren't really rdf:types. An rdf:type is a
>>> named set of individuals. Individuals can have multiple types and
>>> Wikipedia category/list pages appear to be reasonable "pages" for
>>> managing individuals in named sets. We might agree that this or that
>>> set of individuals isn't worthy of being a named set, but
>>> that's life in an open world model.
>>
>> Is this different from an LCSH heading that goes something like:
>>
>> Aerospace writers
>>
>> ? Don't many subject headings create a set in this same way?
>>
>> kc
>>
>>>
>>> Jeff
>>>
>>>> -----Original Message-----
>>>> From: Tom Morris [mailto:tfmorris@gmail.com]
>>>> Sent: Wednesday, April 13, 2011 12:51 PM
>>>> To: Karen Coyle
>>>> Cc: Young,Jeff (OR); Dan Brickley; Ed Summers; public-lld@w3.org
>>>> Subject: Re: Planned changes to the VIAF RDF
>>>>
>>>> On Wed, Apr 13, 2011 at 11:19 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>>>> Quoting "Young,Jeff (OR)"<jyoung@oclc.org>:
>>>>>>
>>>>>> That's how DBpedia seems to do it and I think it's helpful that way. Here
>>>>>> are the types for Jane Austen:
>>>>>>
>>>>>> rdf:type
>>>>>>
>>>>>>     * foaf:Person
>>>>>>     * yago:EnglishWomenWriters
>>>>>>     * yago:PeopleFromHampshire
>>>>>>     * yago:Person100007846
>>>>>>     * yago:EnglishNovelists
>>>>>>     * yago:WomenNovelists
>>>>>>     * yago:EnglishRomanticFictionWriters
>>>>>>     * yago:PeopleFromReading,Berkshire
>>>>>>     * yago:19th-centuryEnglishPeople
>>>>>>     * yago:WomenOfTheRegencyEra
>>>>>>     * yago:18th-centuryEnglishPeople
>>>>
>>>> Those aren't really types.  It's just an indication that her Wikipedia
>>>> page was linked to from those various category/list pages.  Because
>>>> the categories are human curated, they can include all kinds of stuff
>>>> which doesn't make sense from a logical or type hierarchy point of
>>>> view.
>>>>
>>>>> Couldn't these be deduced from other data? Using this method, you would only
>>>>> retrieve entities that have been given these particular classes, but if you
>>>>> turned these into data available to queries...
>>>>>
>>>>> sex:female
>>>>> dates: (whatever)
>>>>> primaryLocation: England
>>>>> language: English
>>>>> wrote: (name of novel)
>>>>>   (name of novel) -->  has genre -->  romantic fiction
>>>>>   (name of novel) -->  has genre -->  fiction (inferred?)
>>>>>
>>>>> etc. then you would be able to retrieve all or most of the above, plus
>>>>> perhaps more. It seems to me that trying to characterize every possible
>>>>> combination goes against the basic concepts of linked data. Actually, it
>>>>> might not even be particularly good as a metadata practice.
>>>>
>>>> Absolutely.  You'd not only get better quality results by querying the
>>>> basic data directly, but you'd also get much more complete coverage
>>>> than Wikipedia categories provide.
>>>>
>>>> Tom
>>>>
>>>>>
>>>>> kc
>>>>>
>>>>>>
>>>>>> I admit the classes get a little crazy sometimes and wouldn't assume they
>>>>>> are used consistently, but I think most of them make intuitive sense.
>>>>>>
>>>>>> Jeff
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On
>>>>>>> Behalf Of Dan Brickley
>>>>>>> Sent: Wednesday, April 13, 2011 9:19 AM
>>>>>>> To: Ed Summers
>>>>>>> Cc: public-lld@w3.org
>>>>>>> Subject: Re: Planned changes to the VIAF RDF
>>>>>>>
>>>>>>> On 13 April 2011 14:50, Ed Summers<ehs@pobox.com>  wrote:
>>>>>>>> Hi Jeff,
>>>>>>>>
>>>>>>>> First, let me just say I'm a big fan of the simplifications that you
>>>>>>>> and Thom are proposing ... they are clearly a big improvement. But I
>>>>>>>> am wondering about the foaf:focus pattern that you are promoting.
>>>>>>>>
>>>>>>>> I know I've said this before privately in IRC to various people, but
>>>>>>>> it's probably worth asking aloud here. Is it really necessary to use
>>>>>>>> URIs to distinguish between the thing itself, and the concept of the
>>>>>>>> thing?
>>>>>>>
>>>>>>> As a loose rule, I see value in the latter when the thing figures in
>>>>>>> some SKOS scheme, either to be mentioned alongside other related
>>>>>>> entities (also indirectly as concepts) or so that
>>>>>>> person_123_as_politician, person_123_as_parent, person_123_as_author
>>>>>>> could be distinguished as different topics. There is value in that,
>>>>>>> both for using those topic URIs to characterise information, but also
>>>>>>> to talk in more detail about skills/expertise. Someone might be a
>>>>>>> world expert on "President George Bush snr. as a manager".
>>>>>>>
>>>>>>> I tend to see your question as a variant on "why bother using SKOS RDF
>>>>>>> to describe concepts of things, when I could just describe the world
>>>>>>> directly in RDF?".
>>>>>>>
>>>>>>> That's a fair question. I find
>>>>>>> http://www.w3.org/TR/2009/REC-skos-reference-20090818/#L1045 still a
>>>>>>> useful overview...
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Karen Coyle
>>>>> kcoyle@kcoyle.net http://kcoyle.net
>>>>> ph: 1-510-540-7596
>>>>> m: 1-510-435-8234
>>>>> skype: kcoylenet

Received on Thursday, 14 April 2011 08:01:30 UTC