Inferencing databases?

Hi,

I am working on BlogEd [1], for which I have developed a nice and
simple triple-to-OO mapping library [2]. It makes it very easy to write
a Java interface that can then act as a proxy onto an RDF database. It
is really neat in that it makes the relation between Java and the
Semantic Web dead easy to understand. Take the AtomPerson [3]
interface:

-------------8<--------------------------------------------
public interface AtomPerson {
     String BASE = "https://bloged.dev.java.net/Ontology/";
     String RDF_TYPE = BASE + "Person";

     String RDF_name = BASE + "firstName";
     public void setFirstName(String name);
     public String getFirstName();

     String RDF_family = BASE + "familyName";
     public void setFamilyName(String name);
     public String getFamilyName();

     String RDF_email = BASE + "email";
     public void setEmail(URI email);
     public URI getEmail();

     String RDF_uri = BASE + "uri";
     public void setUri(URI uri);
     public URI getUri();
}
----------------8<------------------------------------------

Here we define an interface which maps to an OWL class
     https://bloged.dev.java.net/Ontology/Person.
This class has 4 Java bean properties, each of which has a
corresponding OWL relation:
   - https://bloged.dev.java.net/Ontology/firstName
   - https://bloged.dev.java.net/Ontology/familyName
   - https://bloged.dev.java.net/Ontology/email
   - https://bloged.dev.java.net/Ontology/uri

Given such an interface, it is very easy to write a little program
that generates a corresponding RDF/XML OWL ontology. As you can see,
it is really easy to understand.
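
A minimal sketch of what such a generator might look like, assuming the
RDF_* naming convention shown above. It is not part of BlogEd:
SketchPerson and OntologySketch are hypothetical names, the output is
N-Triples rather than RDF/XML to keep it short, and every property is
declared as a plain rdf:Property because the constants alone do not say
whether it is an object or a datatype property.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for AtomPerson, reduced to its constants.
interface SketchPerson {
    String BASE = "https://bloged.dev.java.net/Ontology/";
    String RDF_TYPE = BASE + "Person";
    String RDF_name = BASE + "firstName";
    String RDF_family = BASE + "familyName";
    String RDF_email = BASE + "email";
    String RDF_uri = BASE + "uri";
}

final class OntologySketch {
    private static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    private static final String RDFS = "http://www.w3.org/2000/01/rdf-schema#";
    private static final String OWL = "http://www.w3.org/2002/07/owl#";

    /** Reads the RDF_* String constants of an interface and emits N-Triples lines. */
    static List<String> describe(Class<?> iface) {
        String classUri = null;
        List<String> properties = new ArrayList<>();
        try {
            for (Field f : iface.getFields()) {
                if (f.getType() != String.class || !f.getName().startsWith("RDF_")) {
                    continue; // skips BASE and anything that is not an RDF_* constant
                }
                String value = (String) f.get(null);
                if (f.getName().equals("RDF_TYPE")) {
                    classUri = value;
                } else {
                    properties.add(value);
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        if (classUri == null) {
            throw new IllegalArgumentException("no RDF_TYPE constant on " + iface);
        }
        Collections.sort(properties); // getFields() makes no ordering promise

        List<String> lines = new ArrayList<>();
        lines.add(triple(classUri, RDF + "type", OWL + "Class"));
        for (String p : properties) {
            lines.add(triple(p, RDF + "type", RDF + "Property"));
            lines.add(triple(p, RDFS + "domain", classUri));
        }
        return lines;
    }

    private static String triple(String s, String p, String o) {
        return "<" + s + "> <" + p + "> <" + o + "> .";
    }

    public static void main(String[] args) {
        describe(SketchPerson.class).forEach(System.out::println);
    }
}
```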
It is also very easy to use. This is how one could create an
AtomPerson instance:

----------------8<------------------------------------------
AtomPerson me = (AtomPerson) factory.createObject(AtomPerson.class);
me.setFirstName("Henry");
me.setEmail(URI.create("mailto:henry.story@bblfish.net"));
me.setUri(URI.create("http://bblfish.net/"));
----------------8<------------------------------------------

That's it. Each of the above statements adds some new triples to the
database. The first one creates an anonymous object of type
AtomPerson.RDF_TYPE. The other three add relations to that anonymous
object.
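
To illustrate the idea (this is not the actual BlogEd library code),
here is a rough sketch of how a java.lang.reflect.Proxy could turn bean
calls into triples. MiniPerson, TripleProxySketch and the in-memory
list used as a "database" are hypothetical, and the sketch assumes the
predicate is simply BASE plus the bean property name, which may not be
how the real library resolves it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical cut-down version of AtomPerson.
interface MiniPerson {
    String BASE = "https://bloged.dev.java.net/Ontology/";
    String RDF_TYPE = BASE + "Person";
    void setFirstName(String name);
    String getFirstName();
    void setEmail(URI email);
    URI getEmail();
}

final class TripleProxySketch implements InvocationHandler {
    private static final String RDF_TYPE_PREDICATE =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    private static int counter = 0;

    private final List<Object[]> store; // each entry is {subject, predicate, object}
    private final String node;          // blank node label for this object
    private final String base;

    private TripleProxySketch(List<Object[]> store, String node, String base) {
        this.store = store;
        this.node = node;
        this.base = base;
    }

    /** Creates an anonymous node typed with the interface's RDF_TYPE and returns a proxy for it. */
    static <T> T createObject(Class<T> iface, List<Object[]> store) {
        try {
            String base = (String) iface.getField("BASE").get(null);
            String type = (String) iface.getField("RDF_TYPE").get(null);
            String node = "_:b" + (counter++);
            store.add(new Object[] {node, RDF_TYPE_PREDICATE, type});
            Object proxy = Proxy.newProxyInstance(iface.getClassLoader(),
                    new Class<?>[] {iface}, new TripleProxySketch(store, node, base));
            return iface.cast(proxy);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(iface + " lacks BASE or RDF_TYPE", e);
        }
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        String name = method.getName();
        if (method.getDeclaringClass() == Object.class) {
            switch (name) {
                case "equals": return proxy == args[0]; // no inferencing: identity only
                case "hashCode": return node.hashCode();
                default: return node; // toString
            }
        }
        // Assumption: setFirstName/getFirstName map to BASE + "firstName".
        String predicate = base + Character.toLowerCase(name.charAt(3)) + name.substring(4);
        if (name.startsWith("set")) {
            store.add(new Object[] {node, predicate, args[0]});
            return null;
        }
        // Getter: return the most recently added value for this node and predicate.
        for (int i = store.size() - 1; i >= 0; i--) {
            Object[] t = store.get(i);
            if (t[0].equals(node) && t[1].equals(predicate)) {
                return t[2];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Object[]> store = new ArrayList<>();
        MiniPerson me = createObject(MiniPerson.class, store);
        me.setFirstName("Henry");
        me.setEmail(URI.create("mailto:henry.story@bblfish.net"));
        for (Object[] t : store) {
            System.out.println(t[0] + " " + t[1] + " " + t[2]);
        }
    }
}
```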

Currently I am using the Sesame Native sail, which does not support
inferencing. I have gotten very far without inferencing, but its
absence gets in the way a *lot*. Inferencing would (I hope) make it
very easy to push the above library much further.

Consider, for example, that I now create another AtomPerson x with the
same e-mail. Clearly the e-mail address relation should be inverse
functional, i.e. if two things have that relation to the same e-mail
address then those two things are the same.

----------------8<------------------------------------------
AtomPerson x = (AtomPerson) factory.createObject(AtomPerson.class);
assert(!x.equals(me));
x.setEmail(URI.create("mailto:henry.story@bblfish.net"));

//I should now be able to conclude
assert(x.equals(me));
assert("Henry".equals(x.getFirstName()));
----------------8<------------------------------------------

Sesame has some OWL inferencing sails that should get me this far. I
could use those to get an idea of what types of problems this brings
up. One thing that will need to be solved is how this interacts with
Java hashes, since an object's equality relations may change as data
is added to the database, as shown above.
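
To make the hashing problem concrete, here is a small in-memory sketch
of inverse functional merging using a union-find structure. It does not
use Sesame at all; IfpSmushSketch and its method names are made up for
illustration. Any hashCode() derived from find(...) would change at the
moment two nodes are merged, so an object already sitting in a HashSet
or HashMap could no longer be found.

```java
import java.util.HashMap;
import java.util.Map;

final class IfpSmushSketch {
    private final Map<String, String> parent = new HashMap<>();       // node -> parent node
    private final Map<String, String> ownerOfValue = new HashMap<>(); // IFP value -> first node seen with it

    String newNode(String id) {
        parent.put(id, id);
        return id;
    }

    /** Returns the current representative of the node's equivalence class. */
    String find(String node) {
        String p = parent.get(node);
        if (p.equals(node)) {
            return node;
        }
        String root = find(p);
        parent.put(node, root); // path compression
        return root;
    }

    /** Records a value for an inverse functional property and merges nodes sharing it. */
    void setInverseFunctional(String node, String value) {
        String other = ownerOfValue.putIfAbsent(value, node);
        if (other != null) {
            parent.put(find(node), find(other));
        }
    }

    boolean same(String a, String b) {
        return find(a).equals(find(b));
    }

    public static void main(String[] args) {
        IfpSmushSketch db = new IfpSmushSketch();
        String me = db.newNode("me");
        String x = db.newNode("x");
        db.setInverseFunctional(me, "mailto:henry.story@bblfish.net");
        System.out.println(db.same(me, x)); // not yet merged
        db.setInverseFunctional(x, "mailto:henry.story@bblfish.net");
        System.out.println(db.same(me, x)); // merged through the shared e-mail
    }
}
```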

Functional and inverse functional properties are pretty cool, but I
have found while writing BlogEd that I needed one more thing: Combined
Inverse Functional Properties (CIFP) [4]. A bad example of a CIFP would
be to say that first name and family name together form a CIFP, i.e. if
two references to people have the same first name and the same family
name then these references refer to the same person. Clearly this is
not true. My grandfather was also called "Henry Story" and I am not
him. But in order to keep using the same example I'll assume it is
true. [5]

----------------8<------------------------------------------
me.setFamilyName("Story");

AtomPerson y = (AtomPerson) factory.createObject(AtomPerson.class);
assert(!y.equals(me));
y.setFirstName("Henry");
assert(!y.equals(me));
y.setFamilyName("Story");

//Now the database can deduce from the CIFPs that
assert(y.equals(me));
assert(URI.create("mailto:henry.story@bblfish.net").equals(y.getEmail()));
----------------8<------------------------------------------
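
A very reduced sketch of how a CIFP check might work, again purely in
memory and with hypothetical names (CifpSketch, set, get, same). It
handles a single CIFP, treats the first node seen with a complete key
as canonical, and does not cope with key values that change later.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CifpSketch {
    private final List<String> keyProperties;                               // properties forming the CIFP
    private final Map<String, Map<String, String>> values = new HashMap<>(); // node -> property -> value
    private final Map<List<String>, String> ownerOfKey = new HashMap<>();    // complete key -> canonical node
    private final Map<String, String> sameAs = new HashMap<>();              // merged node -> canonical node

    CifpSketch(String... keyProperties) {
        this.keyProperties = Arrays.asList(keyProperties);
    }

    void set(String node, String property, String value) {
        Map<String, String> props = values.computeIfAbsent(node, n -> new HashMap<>());
        props.put(property, value);
        List<String> key = new ArrayList<>();
        for (String p : keyProperties) {
            String v = props.get(p);
            if (v == null) {
                return; // key incomplete: nothing can be deduced yet
            }
            key.add(v);
        }
        String owner = ownerOfKey.putIfAbsent(key, node);
        if (owner != null && !owner.equals(node)) {
            sameAs.put(node, owner); // all key properties match: same individual
        }
    }

    String canonical(String node) {
        return sameAs.getOrDefault(node, node);
    }

    boolean same(String a, String b) {
        return canonical(a).equals(canonical(b));
    }

    /** Looks the property up on every node merged with the given one (first hit wins). */
    String get(String node, String property) {
        String target = canonical(node);
        for (Map.Entry<String, Map<String, String>> e : values.entrySet()) {
            if (canonical(e.getKey()).equals(target) && e.getValue().containsKey(property)) {
                return e.getValue().get(property);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        CifpSketch db = new CifpSketch("firstName", "familyName");
        db.set("me", "firstName", "Henry");
        db.set("me", "email", "mailto:henry.story@bblfish.net");
        db.set("me", "familyName", "Story");
        db.set("y", "firstName", "Henry");
        db.set("y", "familyName", "Story");
        System.out.println(db.same("y", "me"));
        System.out.println(db.get("y", "email"));
    }
}
```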

If this were possible then I think it would be very easy to explain
the SemWeb to any Java programmer, and it would also make
data-dependent applications much easier to write. Currently I always
have to work hard on creating some artificial key to help me identify
equal things.

Of course inferencing could do a lot more than this, but the nice
thing is that the above seems at least to be a very feasible type of
inferencing. It feels nearly like an extension to garbage collection.

So to summarize, I am looking for a database that either has OWL
inferencing including CIFPs, or allows one to specify rules flexibly
(preferably in N3), so that I could get the effects needed.

Or perhaps what I am trying to do is known to be foolish, impossible,
dangerous, or better done in some other way.


Henry Story
http://bblfish.net/blog/
[1] https://bloged.dev.java.net
[2] https://bloged.dev.java.net/source/browse/bloged/src/com/sun/labs/tools/rdf/
[3] https://bloged.dev.java.net/source/browse/bloged/src/com/sun/labs/tools/blog/AtomPerson.java?rev=1.2&view=markup
    It is a slightly out-of-date version of the Person construct in the
    Atom 1.0 spec
    http://atompub.org/
[4] http://www.openrdf.org/forum/mvnforum/viewthread?thread=460
[5] Other examples of CIFPs are the components of a URL:
     - protocol
     - host
     - port
     - path
    (simplifying a little). If two references to URLs have the same
    values for all of the above then they refer to the same URL.

Received on Saturday, 23 July 2005 12:41:14 UTC