- From: Heiko Paulheim <heiko@informatik.uni-mannheim.de>
- Date: Mon, 13 Oct 2014 17:51:07 +0200
- To: Valentina Presutti <vpresutti@gmail.com>
- CC: public-lod community <public-lod@w3.org>, dbpedia-discussion <dbpedia-discussion@lists.sourceforge.net>
- Message-ID: <543BF4EB.6070202@informatik.uni-mannheim.de>
Hi Valentina,

I am not sure whether I understand you correctly. There might be cases of metonymy in DBpedia, but as far as I can see, Wikipedia is usually quite good at separating them via disambiguation pages, so I am not sure whether there are many examples.

The problem with the degrees, as far as I can tell, is not one of metonymy (degrees are just degrees; I have never seen them used to refer to a university), but simply a series of shortcomings in DBpedia. What happens inside DBpedia is the following:

* First, we find an infobox which says that someone's almaMater is, say, "Princeton University (B.A.)". Both Princeton and B.A. are linked to the respective Wikipedia pages.
* The extraction framework extracts two statements from that: PersonX almaMater Princeton_University, and PersonX almaMater Bachelor_of_Arts (the second one being an error, which is very hard to avoid in the general case).
* Since that happens a few times, we infer that Bachelor_of_Arts is a University.

So in that case, I think it's purely a DBpedia problem. If you are aware of any actual cases of metonymy, however, I am curious to hear about them.

All the best,
Heiko

On 13.10.2014 16:33, Valentina Presutti wrote:
> Hi Heiko,
>
> thanks for the prompt reply and the explanation.
> However, the interesting thing is that these entities are clearly used
> with more than one sense (at least in the US culture), so the issue
> comes from this fact originally in my opinion.
> I mentioned two cases here, but if you check you can see that all
> these types of entities (Degrees) have the same problem.
>
> My suggestion (if that can help) is to identify such metonym cases and
> have a special approach: having as many different entities as there
> are senses.
>
> However, the Wikipedia page of such entities defines them as
> degrees…not sure if this can be useful to notice for you.
>
> Valentina
>
> On 13 Oct 2014, at 09:03, Heiko Paulheim
> <heiko@informatik.uni-mannheim.de> wrote:
>
>> Hi Valentina,
>>
>> (and CCing the DBpedia discussion list)
>>
>> this is an effect of the heuristic typing we employ in DBpedia [1].
>> It works correctly in many cases, and sometimes it fails - as for
>> these examples (the classic tradeoff between coverage and precision).
>>
>> To briefly explain how the error comes into existence: we look at the
>> distribution of types that occur for the ingoing properties of an
>> untyped instance. For dbpedia:Bachelor_of_Arts, there are, among
>> others, 208 ingoing properties with the predicate
>> dbpedia-owl:almaMater (which is already questionable). For that
>> predicate, 87.6% of the objects are of type dbpedia-owl:University.
>> So we have a strong pattern, with many supporting statements, and we
>> conclude that dbpedia:Bachelor_of_Arts is a university. That
>> mechanism, as I said, works reasonably well, but sometimes fails at
>> single instances, like this one. For dbpedia:Academic_degree, you'll
>> find similar questionable statements involving that instance, which
>> mislead the heuristic typing algorithm.
>>
>> With the 2014 release, we further tried to reduce errors like these
>> by filtering common nouns using WordNet before assigning types to
>> instances, but both "Academic degree" and "Bachelor of Arts" escaped
>> our nets here :-(
>>
>> The public DBpedia endpoint loads both the infobox based types and
>> the heuristic types. If you need a "clean" version, I advise you to
>> set up a local endpoint and load only the infobox based types into it.
>>
>> Best,
>> Heiko
>>
>> [1] http://www.heikopaulheim.com/documents/iswc2013.pdf
>>
>> On 13.10.2014 02:42, Valentina Presutti wrote:
>>> Dear all,
>>>
>>> I noticed that dbpedia:Bachelor_of_Arts
>>> <http://dbpedia.org/page/Bachelor_of_Arts>, as well as other similar
>>> entities (dbpedia:Bachelor_of_Engineering,
>>> dbpedia:Bachelor_of_Science, etc.), is typed as dbpedia-owl:University.
>>> I would expect a type like "Academic Degree" but if you look at
>>> dbpedia:Academic_Degree, its type is again dbpedia-owl:University
>>>
>>> however, its definition is (according to dbpedia):
>>>
>>> "An academic degree is a college or university diploma, often
>>> associated with a title and sometimes associated with an academic
>>> position, which is usually awarded in recognition of the recipient
>>> having either satisfactorily completed a prescribed course of study
>>> or having conducted a scholarly endeavour deemed worthy of his or
>>> her admission to the degree. The most common degrees awarded today
>>> are associate, bachelor's, master's, and doctoral degrees."
>>>
>>> Showing that there are at least two different meanings associated
>>> with the term: college/university and title.
>>> I think that different meanings should be separated so as to allow
>>> applications to refer to the different entities: a university or a
>>> title.
>>>
>>> At least for me this causes errors in automatic relation extraction...
>>>
>>> Wdyt?
>>>
>>> Valentina
>>
>> --
>> Prof. Dr. Heiko Paulheim
>> Data and Web Science Group
>> University of Mannheim
>> Phone: +49 621 181 2646
>> B6, 26, Room C1.08
>> D-68159 Mannheim
>>
>> Mail: heiko@informatik.uni-mannheim.de
>> Web: www.heikopaulheim.com
>
--
Prof. Dr. Heiko Paulheim
Data and Web Science Group
University of Mannheim
Phone: +49 621 181 2646
B6, 26, Room C1.08
D-68159 Mannheim

Mail: heiko@informatik.uni-mannheim.de
Web: www.heikopaulheim.com
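
Editor's sketch: the failure mode described in this thread (a wrong almaMater statement feeding the heuristic typing) can be reproduced on toy data. The code below is a simplified illustration only, not the actual DBpedia implementation from [1]: it averages per-predicate type shares instead of using the weighting in the paper, and the function names, the 0.4 threshold, and the sample triples are all hypothetical.

```python
from collections import Counter, defaultdict


def object_type_distribution(triples, types):
    """For each predicate, compute the share of its typed objects per type."""
    counts = defaultdict(Counter)
    totals = Counter()
    for _, predicate, obj in triples:
        if obj in types:  # only already-typed objects contribute evidence
            totals[predicate] += 1
            for t in types[obj]:
                counts[predicate][t] += 1
    return {
        p: {t: n / totals[p] for t, n in c.items()}
        for p, c in counts.items()
    }


def guess_type(instance, triples, types, threshold=0.4):
    """Guess a type for an untyped instance from its ingoing predicates.

    Simplification (assumption): every ingoing statement votes with the
    type distribution of its predicate, and votes are averaged uniformly.
    """
    dist = object_type_distribution(triples, types)
    ingoing = [p for _, p, o in triples if o == instance]
    if not ingoing:
        return None
    scores = Counter()
    for p in ingoing:
        for t, share in dist.get(p, {}).items():
            scores[t] += share / len(ingoing)
    if not scores:
        return None
    best, score = scores.most_common(1)[0]
    return best if score >= threshold else None


# Hypothetical toy data mirroring the infobox error from the thread:
# "Princeton University (B.A.)" yields two almaMater statements.
triples = [
    ("Alice", "almaMater", "Princeton_University"),
    ("Alice", "almaMater", "Bachelor_of_Arts"),
    ("Bob", "almaMater", "Harvard_University"),
    ("Bob", "almaMater", "Bachelor_of_Arts"),
    ("Carol", "almaMater", "Princeton_University"),
]
types = {
    "Princeton_University": {"University"},
    "Harvard_University": {"University"},
}

print(guess_type("Bachelor_of_Arts", triples, types))
```

On this toy data every typed object of almaMater is a University, so the untyped Bachelor_of_Arts inherits that type, which is the same mechanism that mistyped the real resource. Loading only the infobox based types, as advised in the thread, sidesteps this step entirely.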
Received on Monday, 13 October 2014 15:51:26 UTC