Re: Planned changes to the VIAF RDF

On 13 Apr 2011, at 19:29, Tom Morris wrote:

> On Wed, Apr 13, 2011 at 1:04 PM, Young,Jeff (OR) <jyoung@oclc.org> wrote:
>> I disagree that these aren't really rdf:types. An rdf:type is a named set of individuals. Individuals can have multiple types and Wikipedia category/list pages appear to be reasonable "pages" for managing individuals in named sets. We might agree that this or that set of individuals isn't worthy of being a named set, but that's life in an open world model.
>> 
> 
> The issue is that the set isn't curated as an rdf:type, but as a
> Wikipedia category.  That means that if a Wikipedia editor thinks
> GenderDifferencesInBritishWriting (made up example) is something a
> reader would like to see in Category:EnglishWomenWriters, they go ahead
> and add it without any consideration for the fact that the page is not
> about a writer or a woman.
> 
> When the DBpedia importer assigns the type yago:EnglishWomenWriters to
> the entities derived from pages in this category, all kinds of logical
> inconsistencies will result.  You can't blame the Wikipedia editors
> for this since they never signed up to do data entry for DBpedia and
> there's no feedback mechanism for them to even learn that there might
> be a potential problem downstream.
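
The failure mode Tom describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page data, the page_kind table, and the class_expects rule are all hypothetical, and the real DBpedia/YAGO import is considerably more involved than this naive category-to-type mapping.

```python
# Hypothetical data: Wikipedia pages and the categories editors placed them in.
page_categories = {
    "Jane_Austen": {"EnglishWomenWriters", "EnglishNovelists"},
    # The made-up page from the thread: a topic, not a person.
    "GenderDifferencesInBritishWriting": {"EnglishWomenWriters"},
}

# Hypothetical independent knowledge about what each page actually describes.
page_kind = {
    "Jane_Austen": "Person",
    "GenderDifferencesInBritishWriting": "Topic",
}

# Assumption for illustration: members of these classes should all be persons.
class_expects = {
    "EnglishWomenWriters": "Person",
    "EnglishNovelists": "Person",
}


def assign_types(categories_by_page):
    """Naively turn every category membership into an rdf:type-style triple."""
    return [
        (page, "rdf:type", "yago:" + category)
        for page, categories in sorted(categories_by_page.items())
        for category in sorted(categories)
    ]


def find_inconsistencies(triples):
    """Return triples whose subject is not the kind of thing the class expects."""
    problems = []
    for subject, predicate, cls in triples:
        category = cls.split(":", 1)[1]
        expected = class_expects.get(category)
        if expected is not None and page_kind.get(subject) != expected:
            problems.append((subject, predicate, cls))
    return problems


if __name__ == "__main__":
    for triple in find_inconsistencies(assign_types(page_categories)):
        print("inconsistent:", triple)
```

The editor who filed the topic page under the category did nothing wrong by Wikipedia's rules; the inconsistency only appears once membership is read as a type assertion.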

These sorts of logical inconsistencies *do* have Wikipedia editors looking after them, in the offline (e.g. CD) projects. Martin may be able to say more, if you're interested.

-Jodi
 
> 
> Tom
> 
>> Jeff
>> 
>>> -----Original Message-----
>>> From: Tom Morris [mailto:tfmorris@gmail.com]
>>> Sent: Wednesday, April 13, 2011 12:51 PM
>>> To: Karen Coyle
>>> Cc: Young,Jeff (OR); Dan Brickley; Ed Summers; public-lld@w3.org
>>> Subject: Re: Planned changes to the VIAF RDF
>>> 
>>> On Wed, Apr 13, 2011 at 11:19 AM, Karen Coyle <kcoyle@kcoyle.net>
>>> wrote:
>>>> Quoting "Young,Jeff (OR)" <jyoung@oclc.org>:
>>>>> 
>>>>> That's how DBpedia seems to do it and I think it's helpful that way.
>>>>> Here are the types for Jane Austen:
>>>>> 
>>>>> rdf:type
>>>>> 
>>>>>    * foaf:Person
>>>>>    * yago:EnglishWomenWriters
>>>>>    * yago:PeopleFromHampshire
>>>>>    * yago:Person100007846
>>>>>    * yago:EnglishNovelists
>>>>>    * yago:WomenNovelists
>>>>>    * yago:EnglishRomanticFictionWriters
>>>>>    * yago:PeopleFromReading,Berkshire
>>>>>    * yago:19th-centuryEnglishPeople
>>>>>    * yago:WomenOfTheRegencyEra
>>>>>    * yago:18th-centuryEnglishPeople
>>> 
>>> Those aren't really types.  It's just an indication that her Wikipedia
>>> page was linked to from those various category/list pages.  Because
>>> the categories are human curated, they can include all kinds of stuff
>>> which doesn't make sense from a logical or type hierarchy point of
>>> view.
>>> 
>>>> Couldn't these be deduced from other data? Using this method, you would only
>>>> retrieve entities that have been given these particular classes, but if you
>>>> turned these into data available to queries...
>>>> 
>>>> sex:female
>>>> dates: (whatever)
>>>> primaryLocation: England
>>>> language: English
>>>> wrote: (name of novel)
>>>>  (name of novel) --> has genre --> romantic fiction
>>>>  (name of novel) --> has genre --> fiction (inferred?)
>>>> 
>>>> etc. then you would be able to retrieve all or most of the above, plus
>>>> perhaps more. It seems to me that trying to characterize every possible
>>>> combination goes against the basic concepts of linked data. Actually, it
>>>> might not even be particularly good as a metadata practice.
>>> 
>>> Absolutely.  You'd not only get better quality results by querying the
>>> basic data directly, but you'd also get much more complete coverage
>>> than Wikipedia categories provide.
>>> 
>>> Tom
>>> 
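
Karen's suggestion, which Tom endorses above, is to query the basic data instead of relying on precomposed classes. A minimal Python sketch of that idea follows. It assumes the property names from her list (sex, primaryLocation, language, wrote, genre); the records, the "broader" genre table, and the find_people helper are hypothetical illustrations, not part of VIAF or DBpedia.

```python
# Hypothetical records using the property names from Karen's sketch.
works = {
    "Pride and Prejudice": {"genre": {"romantic fiction"}},
    "Oliver Twist": {"genre": {"social novel"}},
}

people = {
    "Jane Austen": {
        "sex": "female",
        "primaryLocation": "England",
        "language": "English",
        "wrote": ["Pride and Prejudice"],
    },
    "Charles Dickens": {
        "sex": "male",
        "primaryLocation": "England",
        "language": "English",
        "wrote": ["Oliver Twist"],
    },
}

# Assumed genre hierarchy, standing in for the "(inferred?)" step.
broader = {"romantic fiction": "fiction", "social novel": "fiction"}


def genres_with_inferred(title):
    """Direct genres of a work plus the broader genres they imply."""
    direct = works[title]["genre"]
    return direct | {broader[g] for g in direct if g in broader}


def find_people(genre=None, **criteria):
    """Return names whose basic properties match all the given criteria."""
    matches = []
    for name, record in people.items():
        if any(record.get(key) != value for key, value in criteria.items()):
            continue
        if genre is not None and not any(
            genre in genres_with_inferred(title) for title in record["wrote"]
        ):
            continue
        matches.append(name)
    return sorted(matches)


if __name__ == "__main__":
    # Roughly what a precomposed class for English women writers of
    # romantic fiction would give, but derived from basic properties.
    print(find_people(genre="romantic fiction", sex="female",
                      primaryLocation="England"))
```

Any combination of criteria can be asked for at query time, so no class has to be minted in advance for each combination.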
>>>> 
>>>> kc
>>>> 
>>>>> 
>>>>> I admit the classes get a little crazy sometimes and wouldn't assume they
>>>>> are used consistently, but I think most of them make intuitive sense.
>>>>> 
>>>>> Jeff
>>>>> 
>>>>>> -----Original Message-----
>>>>>> From: public-lld-request@w3.org [mailto:public-lld-request@w3.org] On
>>>>>> Behalf Of Dan Brickley
>>>>>> Sent: Wednesday, April 13, 2011 9:19 AM
>>>>>> To: Ed Summers
>>>>>> Cc: public-lld@w3.org
>>>>>> Subject: Re: Planned changes to the VIAF RDF
>>>>>> 
>>>>>> On 13 April 2011 14:50, Ed Summers <ehs@pobox.com> wrote:
>>>>>>> Hi Jeff,
>>>>>>> 
>>>>>>> First, let me just say I'm a big fan of the simplifications that you
>>>>>>> and Thom are proposing ... they are clearly a big improvement. But I
>>>>>>> am wondering about the foaf:focus pattern that you are promoting.
>>>>>>> 
>>>>>>> I know I've said this before privately in IRC to various people, but
>>>>>>> it's probably worth asking aloud here. Is it really necessary to use
>>>>>>> URIs to distinguish between the thing itself, and the concept of the
>>>>>>> thing?
>>>>>> 
>>>>>> As a loose rule, I see value in the latter when the thing figures in
>>>>>> some SKOS scheme, either to be mentioned alongside other related
>>>>>> entities (also indirectly as concepts) or so that
>>>>>> person_123_as_politician, person_123_as_parent, person_123_as_author
>>>>>> could be distinguished as different topics. There is value in that,
>>>>>> both for using those topic URIs to characterise information, but also
>>>>>> to talk in more detail about skills/expertise. Someone might be a
>>>>>> world expert on "President George Bush snr. as a manager".
>>>>>> 
>>>>>> I tend to see your question as a variant on "why bother using SKOS RDF
>>>>>> to describe concepts of things, when I could just describe the world
>>>>>> directly in RDF?".
>>>>>> 
>>>>>> That's a fair question. I find
>>>>>> http://www.w3.org/TR/2009/REC-skos-reference-20090818/#L1045 still a
>>>>>> useful overview...
>>>>>> 
>>>>>> Dan
>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Karen Coyle
>>>> kcoyle@kcoyle.net http://kcoyle.net
>>>> ph: 1-510-540-7596
>>>> m: 1-510-435-8234
>>>> skype: kcoylenet
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 
>> 
> 

Received on Thursday, 14 April 2011 13:15:54 UTC