Re: Alternatives to OWL for linked data? from Paul Houle on 2009-07-24 (public-lod@w3.org from July 2009)

From: Paul Houle <ontology2@gmail.com>
Date: Fri, 24 Jul 2009 11:48:04 -0400
To: Axel Rauschmayer <axel@rauschma.de>
Cc: public-lod@w3.org
Message-ID: <3e12f6f40907240848q5eae4182t96c12b88cfb0f5d6@mail.gmail.com>
On Fri, Jul 24, 2009 at 9:30 AM, Axel Rauschmayer <axel@rauschma.de> wrote:

>
> While it's not necessarily easier to understand for end users, I've always
> found Prolog easy to understand, where OWL is more of a challenge.
>
> So what solutions are out there? I would prefer description logic
> programming to OWL. Does Prolog-like backward-chaining make sense for RDF?
> If so, how would it be combined with SPARQL; or would it replace it? Or
> maybe something frame-based?
>
> Am I making sense? I would appreciate any pointers, hints and insights.



     I've got some projects in the pipe that are primarily based on Dbpedia
and Freebase,  but I'm incorporating data from other sources as well.  The
core of this is a system called Isidore which is a specialized system for
handling generic databases.
     My viewpoint is that there are certain kinds of reasoning that are best
done in a specialized way;  for instance,  the handling of identities,
 names and categories ("category" here includes the Dbpedia ontology and
Freebase types as well as internally generated ones).  For instance, a common task
is looking up an object by name.  Last time I looked,  there were about 10k
Wikipedia articles that had names that differed only by capitalization;
 most of the time you want name-lookups to be case-insensitive,  but you
still want addressability for the strange cases.
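
    As a minimal sketch of that lookup behaviour (not Isidore's actual code;
the class name, method names and resource ids here are made up for
illustration), one could keep an exact-name table next to a case-folded one:

```python
from collections import defaultdict


class NameIndex:
    """Case-insensitive name lookup that keeps exact-case names addressable."""

    def __init__(self):
        self._exact = {}                   # exact name -> resource id
        self._folded = defaultdict(list)   # case-folded name -> exact names

    def add(self, name, resource):
        """Register a resource under its exact name."""
        self._exact[name] = resource
        bucket = self._folded[name.casefold()]
        if name not in bucket:
            bucket.append(name)

    def lookup(self, name):
        """Case-insensitive lookup: return every resource whose name matches."""
        exact_names = self._folded.get(name.casefold(), [])
        return [self._exact[n] for n in exact_names]

    def lookup_exact(self, name):
        """Case-sensitive lookup for names that differ only by capitalization."""
        return self._exact.get(name)
```

The ordinary lookup returns all candidates that collide after case folding,
while lookup_exact still lets a caller address one of the strange cases
directly.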

    Wikipedia also has a treasure trove of information about disambiguation.
 The projects I do are about specific problem domains,  say animals,  cars,
 or video games:  I can easily qualify a search for "Jaguar" against a
problem domain and get the right dbpedia resource.
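
    A rough sketch of qualifying a name search against a problem domain,
assuming each candidate resource carries a set of category labels.  The
resource ids and category names below are invented for the example,  not
taken from a real dump:

```python
def disambiguate(name, domain_categories, candidates):
    """Return the candidates for `name` that belong to the problem domain.

    `candidates` maps a case-folded name to a list of
    (resource id, categories) pairs; `domain_categories` is the set of
    categories that define the problem domain.
    """
    wanted = set(domain_categories)
    matches = []
    for resource, categories in candidates.get(name.casefold(), []):
        if wanted & set(categories):       # shares a category with the domain
            matches.append(resource)
    return matches


# Hypothetical candidate table for the ambiguous name "Jaguar".
CANDIDATES = {
    "jaguar": [
        ("example:Jaguar_animal", {"Animal", "Mammal"}),
        ("example:Jaguar_carmaker", {"Company", "Automobile"}),
    ],
}

animals = disambiguate("Jaguar", {"Animal"}, CANDIDATES)
cars = disambiguate("Jaguar", {"Automobile"}, CANDIDATES)
```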

    The core of identity,  naming and category information is small:  it's
easy to handle and easy to construct from Dbpedia and Freebase dumps.  From
the core it's possible to identify a problem domain and import data from
Dbpedia,  Freebase and other sources to construct a working database.

-------

    You might say that this is too specialized,  but this is the way the
brain works.  It's got specific modules for understanding particular problem
domains (faces,  people,  space,  etc.)  It's not so bad because the number
of modules that you need is finite.  Persons and Places represent a large
fraction of Dbpedia,  so reasoning about people and GIS can get you a lot of
mileage.  Freebase has a particularly rich collection of data about musical
recordings and

    I'm not sure if systems like OWL,  etc are really the answer -- we might
need something more like Cyc (or our own brain) that has a lot of specialized
knowledge about the world embedded in it.

--------

    I see reification as an absolute requirement.  Underlying this is the
fact that generic databases are full of junk.  I'm attracted to Prolog-like
systems (Datalog?) but conventional logic systems are easily killed by
contradictory information.  This becomes a scalability limitation unless
you've got a system that is naturally robust to junk data.  You've also got
to be able to do conventional filtering:  you've got to be able to say
"Triple A is wrong",  "I don't trust triples from source B",  "Source C uses
predicate D incorrectly",  "Don't believe anything that E says about subject
F."  To deal with the (existing and emerging) semspam threat,  we'll also
need the same kind of probabilistic filtering that's used for e-mail and
blog comments.  (Take a look at the Dbpedia external links table if you
don't believe me.)
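
    The four kinds of conventional filter quoted above could be sketched
like this,  assuming every statement is stored together with its source
(which is what reification or a quad store would give you).  The Statement
and TrustPolicy names are hypothetical,  and the probabilistic spam
filtering is not covered here:

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """A triple plus the source that asserted it (illustrative structure)."""
    subject: str
    predicate: str
    object: str
    source: str


class TrustPolicy:
    """Rule-based filter over sourced statements."""

    def __init__(self):
        self.bad_triples = set()            # "Triple A is wrong"
        self.bad_sources = set()            # "I don't trust triples from source B"
        self.bad_source_predicates = set()  # "Source C uses predicate D incorrectly"
        self.bad_source_subjects = set()    # "Don't believe E about subject F"

    def accepts(self, st):
        """Return True unless one of the four rules rejects the statement."""
        if (st.subject, st.predicate, st.object) in self.bad_triples:
            return False
        if st.source in self.bad_sources:
            return False
        if (st.source, st.predicate) in self.bad_source_predicates:
            return False
        if (st.source, st.subject) in self.bad_source_subjects:
            return False
        return True


def filter_statements(statements, policy):
    """Keep only the statements the policy accepts, preserving order."""
    return [st for st in statements if policy.accepts(st)]
```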

    The biggest challenge I see in generic databases is fiction.  Wikipedia
has a shocking amount of information about fiction:  this is both an
opportunity and a danger.  For one thing,  people love fiction -- a G.P.A.I.
certainly needs to be able to appreciate fiction in order to appreciate the
human experience.  On the other hand,  any system that does reasoning about
physics needs to tell the difference between

http://en.wikipedia.org/wiki/Minkowski_space

and

http://en.wikipedia.org/wiki/Minovsky_Physics#Minovsky_Physics

Also,  really it's all fiction when it comes down to it.  When a robocop
shows up at the scene of a fight,  it's going to hear contradictory stories
about who punched who first.  It's got to be able to listen to contradictory
stories and keep them apart,  and not fall apart like a computer from a bad
sci-fi movie.
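
One simple way to keep stories apart,  sketched under the assumption that
every claim is recorded per source rather than merged into one global set of
facts (the StoryStore name and the witness data are invented for the
example):

```python
from collections import defaultdict


class StoryStore:
    """Records what each source says without merging contradictory claims."""

    def __init__(self):
        # (subject, predicate) -> {source: value}
        self._claims = defaultdict(dict)

    def tell(self, source, subject, predicate, value):
        """Record that `source` claims (subject, predicate) has `value`."""
        self._claims[(subject, predicate)][source] = value

    def according_to(self, source, subject, predicate):
        """Return what one source claims, or None if it said nothing."""
        return self._claims.get((subject, predicate), {}).get(source)

    def conflicts(self):
        """Return the claims on which sources disagree, keyed by topic."""
        return {
            key: dict(by_source)
            for key, by_source in self._claims.items()
            if len(set(by_source.values())) > 1
        }


store = StoryStore()
store.tell("witness_1", "fight", "first_punch_by", "Al")
store.tell("witness_2", "fight", "first_punch_by", "Bob")
store.tell("witness_1", "fight", "location", "bar")
store.tell("witness_2", "fight", "location", "bar")
```

Here the disagreement about who punched first is reported by conflicts()
instead of poisoning the whole store,  and each witness's version stays
queryable on its own.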

-----------

Microtheories?  Nonmonotonic logic?

Perhaps.

You can go ahead and write standards and write papers about systems that
ignore the problems above,  but you're not going to make systems that work,
 on an engineering basis,  unless you confront them.
Received on Friday, 24 July 2009 21:44:52 UTC