Re: RDF and its discontents

Hi Paul,

Thanks a lot for your very insightful experience report on the Semantic 
Web, RDF and DBpedia.

(more thoughts inline)

On 02.07.2010 17:07, Paul Houle wrote:
> Here are some of my thoughts
>

[skip]

>
> (4) I'm one of the people who got interested in semantic tech because of
> DBPedia,  but yet,  I've also largely given up on DBPedia.  One day I
> realized that I could,  with Freebase,  do things in 20 minutes that
> would take 2 weeks of data cleanup with DBPedia.  DBPedia 3.5/3.5.1
> seems to be a large step backwards,  with major key integrity problems
> that are completely invisible to 'open world' and OWL-paradigm systems.
>   I've wound up writing my own framework for extracting 'facts' from
> wikipedia because DBPedia isn't interested in extracting the things I
> want.  Every time I try to do something with DBpedia,  I make shocking
> discoveries (for instance, "New York City", "Berlin", "Tokyo",
> "Washington , D.C." and "Manchester, N.H." are not of rdf:type "City")
>   The fact that I see so little complaining about this on the mailing
> list seems to indicate that not a lot of people are trying to do real
> work with it.

I keep asking myself why DBpedia (and now also Uberblic) uses its own 
(very large) ontology specification in the background. Of course, they 
sometimes reuse pieces of well-established ontology specifications. 
However, I think this pattern should be applied much more strongly. 
There are some good (well-defined and well-established) domain-specific 
ontology specifications out there, e.g. the Music Ontology (for the 
music domain), which should be used instead of DBpedia's own concept 
and property definitions for that domain.
I know one could now say that we could apply ontology 
mapping/alignment here. However, that would blow up the whole knowledge 
base (with superfluous mappings) and it would slow down the reasoning 
process over it. I also know that everyone is free to say anything 
about anything. Still, I think it is highly redundant to define the 
same concepts and properties over and over again and to explain their 
meaning with the same definitions.
If we want a huge distributed database on the Web, then we should at 
least agree on some important 'best practice' patterns (ontology reuse 
is one of them) to establish good interlinking between individual 
datasets.
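
To make the reuse-versus-mapping point a bit more concrete, here is a 
minimal sketch in plain Python (no RDF library). Triples are modelled 
as tuples; the class ex:MusicalArtist, the two artist resources and all 
function names are hypothetical and only serve as illustration. Only 
mo:MusicArtist is meant to refer to the Music Ontology class.

```python
RDF_TYPE = "rdf:type"
OWL_EQUIV = "owl:equivalentClass"


def with_own_vocabulary(artists):
    """Type each artist with a dataset-specific class (hypothetical
    ex:MusicalArtist) and add one mapping triple to the shared class."""
    triples = {(a, RDF_TYPE, "ex:MusicalArtist") for a in artists}
    triples.add(("ex:MusicalArtist", OWL_EQUIV, "mo:MusicArtist"))
    return triples


def with_reused_vocabulary(artists):
    """Type each artist directly with the Music Ontology class."""
    return {(a, RDF_TYPE, "mo:MusicArtist") for a in artists}


def find_instances(triples, cls):
    """Plain lookup: match only resources typed exactly with cls."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == cls}


def find_instances_with_mapping(triples, cls):
    """Lookup that first collects classes declared equivalent to cls
    (one hop only), i.e. a tiny stand-in for a reasoning step."""
    equivalents = {cls}
    for s, p, o in triples:
        if p == OWL_EQUIV:
            if o == cls:
                equivalents.add(s)
            elif s == cls:
                equivalents.add(o)
    return {s for s, p, o in triples
            if p == RDF_TYPE and o in equivalents}


artists = ["ex:Artist1", "ex:Artist2"]  # hypothetical resources
own = with_own_vocabulary(artists)
reused = with_reused_vocabulary(artists)
```

With the reused class, find_instances(reused, "mo:MusicArtist") finds 
both artists directly. With the dataset-specific class, the same plain 
lookup finds nothing, and a consumer needs the extra mapping triple 
plus the extra resolution step in find_instances_with_mapping. This is 
of course only a toy model of what a real reasoner would do.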


Cheers,


Bob

Received on Wednesday, 7 July 2010 11:30:24 UTC