- From: Tim Berners-Lee <timbl@w3.org>
- Date: Sun, 20 May 2007 13:07:50 -0400
- To: SW-forum <semantic-web@w3.org>
- Message-Id: <2519F1D0-9C45-4A8C-9FCF-400BC2DF361E@w3.org>
Begin forwarded message:
From: Tim Berners-Lee <timbl@w3.org>
Date: 2007-05-19 19:27:56 EDT
To: Chris Bizer <chris@bizer.de>
Cc: tabultor@csai.mit.edu
Subject: Linked data and rdf:type in dbpedia
Hi Chris.
That was a great session in Banff.
I'm looking now at a problem where the Tabulator sucks in huge
amounts of dbpedia. The problem is rather random rdf:type links:
1. My home page says:
<http://dbpedia.org/resource/Tim_Berners-Lee> = card:i.
2. That causes Tab'r to bring in http://dbpedia.org/resource/Tim_Berners-Lee
which in turn says
<http://dbpedia.org/resource/Tim_Berners-Lee> a <http://dbpedia.org/resource/Category:People_from_London>.
3. That causes Tab'r to look up the class Category:People_from_London
$ cwm http://dbpedia.org/resource/Category:People_from_London
This lists a bunch of people who have that category as their subject:
<http://dbpedia.org/resource/Catherine_of_York> :subject <> .
which is fine, but it also says:
<> a </class/yago/person>,
<http://dbpedia.org/resource/Category:English_people_by_county>,
<http://dbpedia.org/resource/Category:London>,
<http://dbpedia.org/resource/Category:People_by_city_or_town_in_England>,
:Concept.
Here I think the use of rdf:type is incorrect. The class
People_from_London is a class of people. It is a subclass of Person.
It has no simple relationship to London. (It is in fact an
owl:Restriction on the property origin with the value London, but I
doubt that you can generalize that across dbpedia).
English people by County *could* be a class of classes.
The Tabulator assumes that every time it follows rdf:type it is going
meta: from classes to classes of classes, etc. It does this because in
every other case so far, there have been only a few levels (like 2).
Currently, it can't use dbpedia as it pulls in memory-busting amounts
of it. It is not even clear that the rdf:type links don't have cycles.
Anyone using OWL with this data will of course find it impossible to
deal with classes of classes at all. I don't know to what extent the
issue is an
Example:
me : Unitarian
http://en.wikipedia.org/wiki/Tim_Berners-Lee
is a member of the class of
http://en.wikipedia.org/wiki/Category:Unitarian_Universalists
this is a member of the metaclass:
http://en.wikipedia.org/wiki/Category:People_by_religion
this is a member of the metametaclass of ways in which people
are categorized
http://en.wikipedia.org/wiki/Category:People
What follows here is the weak link. Reference is a section of
the library.
"This category is for information typically found in the
reference section of a library: reference works." Now the meta meta
class is regarded as a work? o-oh.
http://en.wikipedia.org/wiki/Category:Reference
it continues, following Category (rdf:type in dbpedia):
http://en.wikipedia.org/wiki/Category:Knowledge
http://en.wikipedia.org/wiki/Category:Information
http://en.wikipedia.org/wiki/Category:Physical_quantity
http://en.wikipedia.org/wiki/Category:Measurement
http://en.wikipedia.org/wiki/Category:Scientific_observation
http://en.wikipedia.org/wiki/Category:Data_collection
http://en.wikipedia.org/wiki/Category:Data_management
http://en.wikipedia.org/wiki/Category:Product_development
http://en.wikipedia.org/wiki/Category:Product_management
http://en.wikipedia.org/wiki/Category:Engineering
http://en.wikipedia.org/wiki/Category:Applied_sciences
http://en.wikipedia.org/wiki/Category:Science
http://en.wikipedia.org/wiki/Category:Knowledge
Ooops! It is cyclic.
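
To make the cycle concrete, here is a minimal Python sketch that walks the chain of category links listed above and reports the first category that shows up twice. The list is copied from this message; the function name first_repeat is made up for illustration and is not part of the Tabulator.

```python
# Chain of category links copied from the message above; each entry is
# linked to the next by a Wikipedia "Category" link (rdf:type in dbpedia).
PREFIX = "http://en.wikipedia.org/wiki/Category:"
CHAIN = [
    "Reference", "Knowledge", "Information", "Physical_quantity",
    "Measurement", "Scientific_observation", "Data_collection",
    "Data_management", "Product_development", "Product_management",
    "Engineering", "Applied_sciences", "Science", "Knowledge",
]


def first_repeat(names):
    """Return (name, first_index, second_index) for the first entry that
    appears twice in a walk, or None if no entry repeats."""
    seen = {}
    for i, name in enumerate(names):
        if name in seen:
            return name, seen[name], i
        seen[name] = i
    return None


repeat = first_repeat(CHAIN)
if repeat is not None:
    name, start, end = repeat
    print(f"cycle of {end - start} links through {PREFIX}{name}")
```

A walker that only remembers the node it is currently on would go around this loop forever; remembering every node already visited is enough to notice the repeat.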
The logical relationships are not consistent. I don't know whether
there is a finite number of
categories for which rdfs:Class does not work, which could be put
into a stop list. "Reference" would be one.
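
As a rough sketch of how a stop list could be combined with a depth limit and a visited set, assuming the rdf:type links are already in an in-memory dict rather than fetched from the web: the names follow_types, type_links and LINKS are hypothetical, this is not how the Tabulator is actually written, and the default depth of 2 just echoes the "only a few levels" seen elsewhere.

```python
from collections import deque


def follow_types(start, type_links, stop_list=frozenset(), max_depth=2):
    """Breadth-first walk over rdf:type links from `start`.

    type_links: dict mapping a URI to the list of URIs it is typed as
    (a hypothetical in-memory stand-in for dereferencing each URI).
    Returns the set of URIs reached, excluding `start`.
    """
    visited = {start}
    reached = set()
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not go any further "meta" than max_depth
        for target in type_links.get(node, []):
            if target in visited or target in stop_list:
                continue  # already seen (cycle) or on the stop list
            visited.add(target)
            reached.add(target)
            queue.append((target, depth + 1))
    return reached


# Illustrative data, abbreviated from the Wikipedia chain in this message.
LINKS = {
    "Tim_Berners-Lee": ["Category:Unitarian_Universalists"],
    "Category:Unitarian_Universalists": ["Category:People_by_religion"],
    "Category:People_by_religion": ["Category:People"],
    "Category:People": ["Category:Reference"],
    "Category:Reference": ["Category:Knowledge"],
}

print(follow_types("Tim_Berners-Lee", LINKS,
                   stop_list={"Category:Reference"}, max_depth=10))
```

With "Reference" on the stop list the walk ends at Category:People even with a generous depth limit; whether a finite stop list is enough for all of dbpedia is exactly the open question.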
I wonder whether dbpedia could either find a way of judging which
ones are really rdf:type relationships, or just use something vaguer
for the relationship.
Maybe wikipedia:category would be best, as that is what it is in general.
Tim
Received on Sunday, 20 May 2007 17:07:59 UTC