- From: Giovanni Tummarello <giovanni.tummarello@deri.org>
- Date: Fri, 6 Aug 2010 22:02:36 +0200
- To: Paul Houle <ontology2@gmail.com>
- Cc: Jörn Hees <j_hees@cs.uni-kl.de>, public-lod <public-lod@w3.org>
- Message-ID: <AANLkTimyV416TNHUSFZ0FZ_0WgYiSrv6pCXdtuGpAVdf@mail.gmail.com>
Thanks Paul,

this sort of feedback is indeed tremendously useful. I somehow just wish you had had 1/10th of the replies of the "subjects as literals" thread. :-)

Gio

(Obviously we're talking about the business of LOD at large and the true state of it, despite the growing number of lines in the LOD cloud diagram. We're not talking about specific technicalities of dbpedia, which is obviously run as well as the guys economically can.)

On Thu, Aug 5, 2010 at 4:07 PM, Paul Houle <ontology2@gmail.com> wrote:
> If you want to get something done with dbpedia, you should (i) work from
> the data dumps, or (ii) give up and use Freebase instead.
>
> I used to spend weeks figuring out how to clean up the mess in dbpedia until
> the day I wised up and realized I could do in 15 minutes w/ Freebase what
> takes 2 weeks to do w/ dbpedia, because w/ dbpedia you need to do a huge
> amount of data cleaning to get anything that makes sense.
>
> The issue here isn't primarily "RDF vs Freebase" but it's really a matter
> of the business model (or lack thereof) behind dbpedia; frankly, nobody
> gets excited when dbpedia doesn't work, and that's the problem. For
> instance, nobody at dbpedia seems to give a damn that dbpedia contains 3000
> "countries", whereas there's more like 200 actual active countries in the
> world... Sure, it's great to have a category for things like
> "Austria-Hungary" and "The Teutonic Knights", but an awful lot of people
> give up on dbpedia when they see they can't easily get a list of very basic
> things, like a list of countries.
>
> Now, I was able to, more-or-less, define "active country" as a
> restriction type: anything that has an ISO country code in freebase is an
> active country, or is pretty close. The ISO codes aren't in dbpedia
> (because they're not in wikipedia infoboxes) so this can't be done with
> dbpedia: I'd probably need to code some complex rules that try to guess at
> this based on category memberships and what facts are available in the
> infobox.
>
> I complained on both dbpedia and freebase discussion lists, and found
> that: (i) nobody at dbpedia wants to do anything about this, and (ii) the
> people at freebase have investigated this and they are going to do something
> about it.
>
> --------
>
> In my mind, anyway, the semantic web is a set of structured boxes. It's
> not like there's one "T Box" and one "A Box" but there are nested boxes of
> increasing specificity. In the systems I'm building, a Freebase-dbpedia
> merge is used as a sort of "T' Box" that helps to structure and interpret
> information that comes from other sources. With a little thinking about
> data structures, it's efficient to have a local copy of this data and use
> it as a skeleton that gets fleshed out with other stuff. Closed-world
> reasoning about this "taxonomic core" is useful in a number of ways,
> particularly in the detection of key integrity problems, data holes,
> inconsistencies, junk data, etc. I think the "dereference and merge"
> paradigm is useful once you've got the taxocore and you're merging little
> bits of high-quality data, but w/o control of the taxocore you're just
> doomed.
>
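
To make the quoted idea concrete, here is a minimal sketch of closed-world checking against a local "taxonomic core", combined with the "has an ISO code" restriction for active countries. Everything in it is an assumption made for illustration: the CORE dictionary, the record layout, and the function names are hypothetical and do not reflect how Freebase, dbpedia, or Paul's systems actually store data.

```python
# Hypothetical local "taxonomic core": entity id -> known facts (sample data only).
CORE = {
    "austria": {"type": "country", "iso_code": "AT"},
    "france": {"type": "country", "iso_code": "FR"},
    "austria_hungary": {"type": "country", "iso_code": None},
}


def active_countries(core):
    """Restriction type: a country counts as active only if it has an ISO code."""
    return {
        eid
        for eid, facts in core.items()
        if facts.get("type") == "country" and facts.get("iso_code")
    }


def check_records(records, core):
    """Closed-world check: whatever the core does not know is reported, not assumed."""
    active = active_countries(core)
    problems = []
    for rec in records:
        country = rec.get("country")
        if country is None:
            problems.append((rec["id"], "data hole: no country"))
        elif country not in core:
            problems.append((rec["id"], "key integrity: unknown country " + repr(country)))
        elif country not in active:
            problems.append((rec["id"], "inconsistency: country is not active"))
    return problems


# Hypothetical records arriving from some other source.
records = [
    {"id": 1, "country": "france"},
    {"id": 2, "country": "atlantis"},
    {"id": 3},
    {"id": 4, "country": "austria_hungary"},
]

for problem in check_records(records, CORE):
    print(problem)
```

The point of the sketch is only that, with the core held locally, a missing or unknown key becomes a reportable problem instead of an open question.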
Received on Friday, 6 August 2010 20:03:04 UTC