- From: Paul Houle <ontology2@gmail.com>
- Date: Fri, 24 Jul 2009 11:48:04 -0400
- To: Axel Rauschmayer <axel@rauschma.de>
- Cc: public-lod@w3.org
- Message-ID: <3e12f6f40907240848q5eae4182t96c12b88cfb0f5d6@mail.gmail.com>
On Fri, Jul 24, 2009 at 9:30 AM, Axel Rauschmayer <axel@rauschma.de> wrote:
>
> While it's not necessarily easier to understand for end users, I've always
> found Prolog easy to understand, where OWL is more of a challenge.
>
> So what solutions are out there? I would prefer description logic
> programming to OWL. Does Prolog-like backward-chaining make sense for RDF?
> If so, how would it be combined with SPARQL; or would it replace it? Or
> maybe something frame-based?
>
> Am I making sense? I would appreciate any pointers, hints and insights.

I've got some projects in the pipe that are primarily based on Dbpedia and Freebase, but I'm incorporating data from other sources as well. The core of this is a system called Isidore, which is specialized for handling generic databases.

My viewpoint is that there are certain kinds of reasoning that are best done in a specialized way; for instance, the handling of identities, names and categories ("category" here includes the Dbpedia ontology and Freebase types as well as internally generated categories).

For instance, a common task is looking up an object by name. Last time I looked, there were about 10k Wikipedia articles that had names that differed only by capitalization; most of the time you want name lookups to be case-insensitive, but you still want addressability for the strange cases. Wikipedia also has a treasure trove of information about disambiguation. The projects I do are about specific problem domains, say animals, cars, or video games: I can easily qualify a search for "Jaguar" against a problem domain and get the right Dbpedia resource.

The core of identity, naming and category information is small: it's easy to handle and easy to construct from Dbpedia and Freebase dumps. From the core it's possible to identify a problem domain and import data from Dbpedia, Freebase and other sources to construct a working database.

-------

You might say that this is too specialized, but this is the way the brain works.
It's got specific modules for understanding particular problem domains (faces, people, space, etc.). It's not so bad because the number of modules that you need is finite. Persons and Places represent a large fraction of Dbpedia, so reasoning about people and GIS can get you a lot of mileage. Freebase has a particularly rich collection of data about musical recordings.

I'm not sure if systems like OWL, etc. are really the answer -- we might need something more like Cyc (or our own brain) that has a lot of specialized knowledge about the world embedded in it.

--------

I see reification as an absolute requirement. Underlying this is the fact that generic databases are full of junk. I'm attracted to Prolog-like systems (Datalog?) but conventional logic systems are easily killed by contradictory information. This becomes a scalability limitation unless you've got a system that is naturally robust to junk data.

You've also got to be able to do conventional filtering: you've got to be able to say "Triple A is wrong", "I don't trust triples from source B", "Source C uses predicate D incorrectly", "Don't believe anything that E says about subject F."

To deal with the (existing and emerging) semspam threat, we'll also need the same kind of probabilistic filtering that's used for e-mail and blog comments. (Take a look at the Dbpedia external links table if you don't believe me.)

The biggest challenge I see in generic databases is fiction. Wikipedia has a shocking amount of information about fiction: this is both an opportunity and a danger. For one thing, people love fiction -- a G.P.A.I. certainly needs to be able to appreciate fiction in order to appreciate the human experience. On the other hand, any system that does reasoning about physics needs to tell the difference between

http://en.wikipedia.org/wiki/Minkowski_space

and

http://en.wikipedia.org/wiki/Minovsky_Physics#Minovsky_Physics

Also, really it's all fiction when it comes down to it.
When a robocop shows up at the scene of a fight, it's going to hear contradictory stories about who punched who first. It's got to be able to listen to contradictory stories and keep them apart, and not fall apart like a computer from a bad sci-fi movie.

-----------

Microtheories? Nonmonotonic logic? Perhaps.

You can go ahead and write standards and write papers about systems that ignore the problems above, but you're not going to make systems that work, on an engineering basis, unless you confront them.
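
The four filtering statements quoted earlier ("Triple A is wrong", "I don't trust triples from source B", "Source C uses predicate D incorrectly", "Don't believe anything that E says about subject F") can be sketched as rejection rules over triples that carry their source. This is only a rough illustration and not a description of how Isidore works: the Statement class, the rule helpers, the ex: identifiers and the source labels are all invented for the example, and the probabilistic spam filtering is not covered.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Statement:
    """A triple plus the source that asserted it (a stand-in for reification)."""
    subject: str
    predicate: str
    obj: str
    source: str


# A rule returns True when a statement should be rejected.
Rule = Callable[[Statement], bool]


def wrong_triple(subject: str, predicate: str, obj: str) -> Rule:
    """'Triple A is wrong': reject one exact triple, whoever asserts it."""
    return lambda st: (st.subject, st.predicate, st.obj) == (subject, predicate, obj)


def untrusted_source(source: str) -> Rule:
    """'I don't trust triples from source B'."""
    return lambda st: st.source == source


def bad_predicate_from(source: str, predicate: str) -> Rule:
    """'Source C uses predicate D incorrectly'."""
    return lambda st: st.source == source and st.predicate == predicate


def distrust_about(source: str, subject: str) -> Rule:
    """'Don't believe anything that E says about subject F'."""
    return lambda st: st.source == source and st.subject == subject


def accept(statements: Iterable[Statement], rules: List[Rule]) -> List[Statement]:
    """Keep only the statements that no rule rejects, preserving input order."""
    return [st for st in statements if not any(rule(st) for rule in rules)]


# Hypothetical data: every identifier and source label below is made up.
STATEMENTS = [
    Statement("ex:Jaguar", "ex:type", "ex:Animal", "source_ok"),
    Statement("ex:Jaguar", "ex:type", "ex:Planet", "source_ok"),
    Statement("ex:Paris", "ex:locatedIn", "ex:France", "source_b"),
    Statement("ex:Paris", "ex:population", "ex:France", "source_c"),
    Statement("ex:Paris", "ex:locatedIn", "ex:Europe", "source_c"),
    Statement("ex:Gundam", "ex:type", "ex:Spacecraft", "source_e"),
    Statement("ex:Paris", "ex:type", "ex:City", "source_e"),
]

RULES = [
    wrong_triple("ex:Jaguar", "ex:type", "ex:Planet"),
    untrusted_source("source_b"),
    bad_predicate_from("source_c", "ex:population"),
    distrust_about("source_e", "ex:Gundam"),
]

KEPT = accept(STATEMENTS, RULES)

if __name__ == "__main__":
    for st in KEPT:
        print(st)
```

Because the rejected statements are filtered out rather than deleted, the full set stays available, so contradictory claims from different sources can be kept apart and re-evaluated when the rules change.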
Received on Friday, 24 July 2009 21:44:52 UTC