Re: Enhancing object-oriented programming with OWL from Timothy Armstrong on 2012-09-06 (semantic-web@w3.org from September 2012)

From: Timothy Armstrong <tim.armstrong@gmx.com>
Date: Thu, 06 Sep 2012 19:48:00 -0400
To: semantic-web@w3.org
Message-ID: <50493630.6080509@gmx.com>

Hi,

Well, we need a new reasoner.  Below I explain a bit of the approach I 
took in the reasoner I wrote.  I believe the approach is new, though I 
could be entirely mistaken.  At no time does all of memory have to be in 
a reasoned state, only a small part of it.  We need a new reasoner 
because all the existing ones run on a set of whole triples, and we need 
one that runs directly on Java data.

I believe it is eminently worthwhile to write a new reasoner.  I'm sure 
there is a lot we can take from existing reasoners.  We would have to 
work on it.  I'm always surprised that more people don't know about the 
Semantic Web.  It is a very powerful set of technologies, and I think 
making the Internet into a giant database is just a great practical 
development in computers.  I have in mind for all programmers to use the 
Semantic Web every day in all their object-oriented code: OWL, SPARQL, 
rules, and Semantic Web Services.  I have in mind for everyone to know 
about the Semantic Web.  Likely people will want a lot more Semantic Web 
software if that happens.

People didn't seem impressed when I said we could post all 
object-oriented data on the Semantic Web.  We can.  The Semantic Web 
would be enormous with all that data on it.  I'd like to explain in 
more detail.  People have commented on the difference between attributes 
and properties, and I've read some of the links.  My view is still that 
an attribute is just a binary predicate, nothing more, nothing less.  It 
just relates two entities.  I have not yet coded the translation of Java 
data into RDF (it is on my to-do list), but it is straightforward to 
explain.  Each Java package 
is an ontology.  The classes in the package are members of the 
ontology.  We make each attribute into a property.  In my software, 
properties are defined in their own files alongside the classes.  We 
could put all the properties in the main ontology if the properties from 
different classes don't conflict, or we could create a separate ontology 
for each class that just contains the class's attributes.  Then for the 
data, we just take each Java Collection or array that is an attribute 
and make each element of it into a triple.  When it is a list or an 
array, we will have to differentiate whether the programmer really means 
a set of triples or a list like rdf:List.  Sometimes when programmers 
use lists, they really mean sets.  The value of each attribute that is 
not a Java Collection or an array is just a single triple.  There are 
rdf:type triples for class membership.  So all object-oriented data are 
triples.  Then we can post all object-oriented data on the Semantic 
Web.  We should just go ahead and do that so that people have access to 
the data.
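
A minimal sketch of the translation just described, using reflection.  
It is only an illustration, not an existing implementation: the Triple 
class, the ObjectToTriples helper, the DemoPerson class, and the 
"ClassName#attribute" naming of properties are all assumptions made for 
the example.  It treats every Collection or object array as a set of 
triples (never as an rdf:List) and ignores inherited attributes.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical triple type; subject and object are kept as plain Java values.
final class Triple {
    final Object subject;
    final String predicate;
    final Object object;

    Triple(Object subject, String predicate, Object object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }

    @Override
    public String toString() {
        return subject + " " + predicate + " " + object;
    }
}

final class ObjectToTriples {
    static final String RDF_TYPE = "rdf:type";

    // Emits one rdf:type triple for the object's class, then one triple per
    // attribute value. Collection and Object[] attributes yield one triple
    // per element. Only fields declared directly in the class are visited.
    static List<Triple> toTriples(Object subject) {
        List<Triple> out = new ArrayList<>();
        Class<?> cls = subject.getClass();
        out.add(new Triple(subject, RDF_TYPE, cls.getName()));
        for (Field field : cls.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            Object value;
            try {
                field.setAccessible(true);
                value = field.get(subject);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            if (value == null) {
                continue; // no value, so no triple
            }
            // Assumed naming scheme: the class name (package included) plus the attribute.
            String predicate = cls.getName() + "#" + field.getName();
            Collection<?> elements;
            if (value instanceof Collection) {
                elements = (Collection<?>) value;
            } else if (value instanceof Object[]) {
                elements = Arrays.asList((Object[]) value);
            } else {
                elements = Collections.singletonList(value);
            }
            for (Object element : elements) {
                out.add(new Triple(subject, predicate, element));
            }
        }
        return out;
    }
}

// Hypothetical example class with one single-valued and one set-valued attribute.
final class DemoPerson {
    final String name;
    final List<DemoPerson> children = new ArrayList<>();

    DemoPerson(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}

final class ObjectToTriplesDemo {
    public static void main(String[] args) {
        DemoPerson alice = new DemoPerson("Alice");
        alice.children.add(new DemoPerson("Bob"));
        for (Triple t : ObjectToTriples.toTriples(alice)) {
            System.out.println(t);
        }
    }
}
```

Whether a List attribute should become a set of triples or an rdf:List 
is exactly the choice mentioned above; the sketch always picks the set.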

My approach to reasoning is a little difficult to explain, but I did my 
best to explain it in my article, which I quote here.  It might make 
better sense in the context of the article, but you should be able to 
follow it.  As I mentioned, for each read operation, it is as if Protege 
reasoned only the part of the data set relevant to the screen the user 
is currently viewing and did not do any more reasoning.  When we call a 
read method on a Java Collection that is an attribute, we do only the 
reasoning that could potentially cause objects to be added to the set of 
objects.

> Property reasoning consists of trying to add objects to a certain 
> subject-predicate pair.  When we call a read method on a Java 
> Collection that is an attribute for a certain subject and predicate, 
> we want all the objects to be in it.  The goal is to reason this 
> object set, i.e. add as many members to it as we can according to the 
> reasoning.  For each type of property reasoning, there is a certain 
> manner in which objects may be added to the object set.  We need to 
> try to add objects by each type of reasoning.  Depending on the 
> semantics of the property, we need to try to add objects by zero or 
> more of rdfs:subPropertyOf (including owl:equivalentProperty), 
> owl:inverseOf, owl:ReflexiveProperty, owl:SymmetricProperty, and 
> owl:TransitiveProperty.  So let us look at each type of reasoning.  
> For a subject /s/ and a predicate /p/, let us call the set of objects 
> /O_s,p/.
>
> First, let us look at rdfs:subPropertyOf reasoning.  Let us have a 
> property hasChild with two subproperties, hasDaughter and hasSon, 
> representing the sets of a person's children, daughters, and sons, 
> respectively.  The domains are all Person, a class representing a 
> person.  The ranges are Person, Female, and Male, respectively, where 
> Female is a female person and Male is a male person.  We are trying to 
> find all of a person /x/'s children.  I.e., we are trying to reason 
> the object set /O_x,hasChild/, with person /x/ as subject and hasChild 
> as predicate.  To find all of /x/'s children, we need to find all of 
> /x/'s daughters and sons.  I.e., we need to reason the object sets 
> /O_x,hasDaughter/ and /O_x,hasSon/.  However, if the only form of 
> reasoning we are using for hasChild is rdfs:subPropertyOf, i.e. if no 
> other form of reasoning can add objects to /O_x,hasChild/, then we 
> reason /O_x,hasDaughter/ and /O_x,hasSon/, and then we are done 
> reasoning.  We reason those two object sets and add them to 
> /O_x,hasChild/ along with any objects that were in it before the 
> reasoning.  We have found all of /x/'s children and have likely 
> reasoned only a small part of the whole data set.  None of the rest of 
> the data matters, so we do not have to reason it.  In general, when we 
> are trying to reason an object set /O_s,p/, we need to reason /O_s,q/ 
> for each subproperty /q/ of /p/.
>
> Let us say that hasChild has further semantics beyond having 
> subproperties.  In particular, let us say it has an inverse property 
> by owl:inverseOf, hasParent, the set of a person's parents.  The 
> domain and range of each is Person.  We have found the object sets we 
> need to reason for adding objects to /O_x,hasChild/ by 
> rdfs:subPropertyOf reasoning.  Now let us find the object sets we need 
> to reason for adding objects to it by owl:inverseOf reasoning.  For 
> each member /y/ of the Person class, we reason the object set 
> /O_y,hasParent/.  If /x/ is in /O_y,hasParent/, i.e. if /y hasParent 
> x/, then /x hasChild y/.  However, those object sets are the only ones 
> we need to reason for the owl:inverseOf reasoning.  We do not need to 
> do any more reasoning.  We are not having our particular hasChild 
> example be reflexive, symmetric, or transitive.  So for our particular 
> hasChild example, we reason the object sets in this paragraph and the 
> previous one.  Then if those forms of reasoning are the only ones our 
> system does, nothing else can possibly add objects to /O_x,hasChild/.
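
The two kinds of reasoning in the quoted passage can be sketched roughly 
as follows.  This is a simplified illustration under stated assumptions, 
not the code from the article: individuals and properties are plain 
strings, LazyPropertyReasoner and its method names are hypothetical, 
"every individual seen so far" stands in for the members of the Person 
class, and the rule that an object set already being reasoned further up 
the call stack contributes nothing new is an assumption added here so 
that hasChild and hasParent do not call each other forever.  Nothing is 
cached, so the sketch shows which object sets get reasoned, not how to 
do it quickly.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified reasoner: individuals and properties are strings.
final class LazyPropertyReasoner {
    // Asserted objects, keyed by the pair [subject, predicate].
    private final Map<List<String>, Set<String>> asserted = new HashMap<>();
    // Direct subproperties of each property.
    private final Map<String, Set<String>> subProperties = new HashMap<>();
    // owl:inverseOf declarations, stored in both directions.
    private final Map<String, String> inverseOf = new HashMap<>();
    // Every individual seen so far (stands in for "all members of Person").
    private final Set<String> individuals = new LinkedHashSet<>();

    void assertTriple(String s, String p, String o) {
        asserted.computeIfAbsent(Arrays.asList(s, p), k -> new LinkedHashSet<>()).add(o);
        individuals.add(s);
        individuals.add(o);
    }

    void declareSubProperty(String sub, String sup) {
        subProperties.computeIfAbsent(sup, k -> new LinkedHashSet<>()).add(sub);
    }

    void declareInverse(String p, String q) {
        inverseOf.put(p, q);
        inverseOf.put(q, p);
    }

    // Read operation: reasons only the object sets that can add to O_s,p.
    Set<String> objects(String s, String p) {
        return reason(s, p, new HashSet<>());
    }

    private Set<String> reason(String s, String p, Set<List<String>> onStack) {
        Set<String> result = new LinkedHashSet<>();
        List<String> goal = Arrays.asList(s, p);
        if (!onStack.add(goal)) {
            return result; // this object set is already being reasoned
        }
        result.addAll(asserted.getOrDefault(goal, Collections.emptySet()));
        // rdfs:subPropertyOf: add O_s,q for every subproperty q of p.
        for (String q : subProperties.getOrDefault(p, Collections.emptySet())) {
            result.addAll(reason(s, q, onStack));
        }
        // owl:inverseOf: add y whenever s is in O_y,inverse.
        String inverse = inverseOf.get(p);
        if (inverse != null) {
            for (String y : individuals) {
                if (reason(y, inverse, onStack).contains(s)) {
                    result.add(y);
                }
            }
        }
        onStack.remove(goal);
        return result;
    }
}

final class LazyReasonerDemo {
    public static void main(String[] args) {
        LazyPropertyReasoner r = new LazyPropertyReasoner();
        r.declareSubProperty("hasDaughter", "hasChild");
        r.declareSubProperty("hasSon", "hasChild");
        r.declareInverse("hasChild", "hasParent");
        r.assertTriple("alice", "hasDaughter", "carol");
        r.assertTriple("alice", "hasSon", "bob");
        r.assertTriple("dave", "hasParent", "alice");
        // Collects carol and bob via the subproperties and dave via the inverse.
        System.out.println(r.objects("alice", "hasChild"));
    }
}
```

A call such as objects("alice", "hasChild") touches only the object sets 
for hasChild, its subproperties, and hasParent; triples for any other 
property are never read.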

(I use AspectJ to get all instantiated members of a class, which 
normally we cannot do in Java.)  The other property reasoning is exactly 
the same (it's on page 8 in the article on my web page, but I'll gladly 
explain here if anyone asks): when we are trying to reason an object 
set, for each type of reasoning there is only a certain number of other 
object sets we need to reason to complete the reasoning.  So when we 
call a read method on a Java Collection that is an attribute, we reason 
only a small part of memory.  Well, someone can tell me if that approach 
isn't new, but it seems very efficient computationally.
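
For readers without AspectJ, getting all instantiated members of a class 
can be approximated in plain Java by having each constructor register 
the new object.  The sketch below is a swapped-in technique, not the 
AspectJ approach itself; InstanceRegistry and TrackedPerson are 
hypothetical names.  It holds objects weakly so that registration does 
not keep them alive, and it assumes the tracked classes do not override 
equals and hashCode (a WeakHashMap compares keys with equals).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry of live instances, grouped by concrete class.
final class InstanceRegistry {
    private static final Map<Class<?>, Set<Object>> INSTANCES = new ConcurrentHashMap<>();

    private static Set<Object> newWeakSet() {
        // Weakly referenced entries disappear once the object is garbage collected.
        return Collections.synchronizedSet(
                Collections.newSetFromMap(new WeakHashMap<Object, Boolean>()));
    }

    // Called from constructors; an aspect would make this call automatically.
    static void register(Object instance) {
        INSTANCES.computeIfAbsent(instance.getClass(), k -> newWeakSet()).add(instance);
    }

    // Returns the live instances of the given class and of its subclasses.
    static <T> List<T> instancesOf(Class<T> type) {
        List<T> out = new ArrayList<>();
        for (Map.Entry<Class<?>, Set<Object>> entry : INSTANCES.entrySet()) {
            if (type.isAssignableFrom(entry.getKey())) {
                Set<Object> set = entry.getValue();
                synchronized (set) { // required when iterating a synchronizedSet
                    for (Object o : set) {
                        out.add(type.cast(o));
                    }
                }
            }
        }
        return out;
    }
}

// Hypothetical tracked class: registers itself as the last step of construction.
class TrackedPerson {
    final String name;

    TrackedPerson(String name) {
        this.name = name;
        InstanceRegistry.register(this);
    }
}

final class InstanceRegistryDemo {
    public static void main(String[] args) {
        TrackedPerson alice = new TrackedPerson("Alice");
        TrackedPerson bob = new TrackedPerson("Bob");
        for (TrackedPerson p : InstanceRegistry.instancesOf(TrackedPerson.class)) {
            System.out.println(p.name);
        }
        // Keep both objects strongly reachable until after the listing.
        System.out.println(alice.name + " and " + bob.name + " are still referenced");
    }
}
```

The cost of the plain-Java version is that every tracked class has to 
call register itself, which is the boilerplate an aspect removes.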

Oh, I really think I've done something new.  I have the basis for a 
whole Semantic Web system that just runs on Java instead of on a set of 
whole triples.

Tim
http://www.semanticoop.org



On 09/01/2012 08:02 PM, adasal wrote:
> Hi,
> I don't think you answered the question of how what you propose would 
> be optimised. How would it deal with reasoning, for instance, bearing 
> in mind that Pellet and many others have been subject to many years of 
> development and probably the subject of many PhDs too!
>
> You might want to look at this project which Henry Story tipped me off 
> to:-
> https://github.com/w3c/banana-rdf
>
> This is from the test suite:-
> ObjectExamples.scala
>
> package org.w3.banana
>
> import scalaz.{ Validation, Failure, Success }
> import java.util.UUID
>
> class ObjectExamples[Rdf <: RDF]()(implicit diesel: Diesel[Rdf]) {
>
>   import diesel._
>   import ops._
>
>   case class Person(name: String, nickname: Option[String] = None)
>
>   object Person {
>
>     val clazz = uri("http://example.com/Person#class")
>     implicit val classUris = classUrisFor[Person](clazz)
>
>     val name = property[String](foaf.name)
>     val nickname = optional[String](foaf("nickname"))
>     val address = property[Address](foaf("address"))
>
>     implicit val container = uri("http://example.com/persons/")
>     implicit val binder = pgb[Person](name, nickname)(Person.apply, Person.unapply)
>
>   }
>
>   sealed trait Address
>
>   object Address {
>
>     val clazz = uri("http://example.com/Address#class")
>     implicit val classUris = classUrisFor[Address](clazz)
>
>     // not sure if this could be made more general, nor if we actually want to do that
>     implicit val binder: PointedGraphBinder[Rdf, Address] = new PointedGraphBinder[Rdf, Address] {
>       def fromPointedGraph(pointed: PointedGraph[Rdf]): Validation[BananaException, Address] =
>         Unknown.binder.fromPointedGraph(pointed) orElse VerifiedAddress.binder.fromPointedGraph(pointed)
>
>       def toPointedGraph(address: Address): PointedGraph[Rdf] = address match {
>         case va: VerifiedAddress => VerifiedAddress.binder.toPointedGraph(va)
>         case Unknown => Unknown.binder.toPointedGraph(Unknown)
>       }
>     }
>
>   }
>
>   case object Unknown extends Address {
>
>     val clazz = uri("http://example.com/Unknown#class")
>     implicit val classUris = classUrisFor[Unknown.type](clazz, Address.clazz)
>
>     // there is a question about constants and the classes they live in
>     implicit val binder: PointedGraphBinder[Rdf, Unknown.type] =
>       constant(this, uri("http://example.com/Unknown#thing")) withClasses classUris
>
>   }
>
>   case class VerifiedAddress(label: String, city: City) extends Address
>
>   object VerifiedAddress {
>
>     val clazz = uri("http://example.com/VerifiedAddress#class")
>     implicit val classUris = classUrisFor[VerifiedAddress](clazz, Address.clazz)
>
>     val label = property[String](foaf("label"))
>     val city = property[City](foaf("city"))
>
>     implicit val ci = classUrisFor[VerifiedAddress](clazz)
>
>     implicit val binder =
>       pgb[VerifiedAddress](label, city)(VerifiedAddress.apply, VerifiedAddress.unapply) withClasses classUris
>
>   }
>
>   case class City(cityName: String, otherNames: Set[String] = Set.empty)
>
>   object City {
>
>     val clazz = uri("http://example.com/City#class")
>     implicit val classUris = classUrisFor[City](clazz)
>
>     val cityName = property[String](foaf("cityName"))
>     val otherNames = set[String](foaf("otherNames"))
>
>     implicit val binder: PointedGraphBinder[Rdf, City] =
>       pgb[City](cityName, otherNames)(City.apply, City.unapply) withClasses classUris
>
>   }
>
> }
>
> The stated aim of the developers is to follow the RDF spec 'carefully' 
> and to work with Jena and Sesame, so perhaps not the same as yours?
>
>
> Adam
>
>
> On 30 August 2012 17:03, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>
>     Hi,
>
>     Sorry for taking so long to respond, and thank you for talking to me.
>
>     As far as I can tell, it is entirely possible to build Semantic
>     Web software on top of object-oriented programming languages, and
>     presumably database technologies, instead of starting Semantic Web
>     software from scratch.  Object-oriented languages already have a
>     lot of the OWL data model implemented.  Well, all they have are
>     classes, properties, and rdfs:subClassOf, but they do those very
>     well.  So we want to add the rest of OWL to them.
>
>     All the property reasoning just works directly for attributes. 
>     Attributes can have subattributes, inverse attributes, be
>     transitive, etc., and it all makes perfect sense.  We just let
>     people use property reasoning for any of their object-oriented
>     attributes. I tried to show how useful it would be in the example
>     program on my web page: http://www.semanticoop.org/example.html.
>     There is a Person Java class with attributes for the person's
>     mother, father, parents, children, ancestors, descendants, an
>     attribute for all the person's relatives, etc.  You can imagine
>     how messy the Java program would be without the property
>     reasoning.  The reasoning really helps.  Well, maybe there are
>     better ways of doing it than with my software, but I think that if
>     we can add OWL to OOP, people will really like it.
>
>     Well, I really don't know what to do with the software. Thank you
>     for talking to me about it.
>
>     Tim Armstrong
>
>
>
>     On 08/20/2012 05:21 PM, adasal wrote:
>>     OK, there is something like multiple inheritance in Scala with
>>     Scala traits and there are a few implementations of mixin
>>     frameworks in Java which is really IoC using DI. Tapestry was my
>>     example, but Spring allows similar to enable separation of concerns.
>>     However I believe that Sesame/AliBaba also allows such hooks.
>>     When it comes to reasoning there are existing Java reasoners. It
>>     seems more than a tall order to build your own!
>>     For instance http://clarkparsia.com/pellet/
>>     The problem that you will have if you build your own is that you
>>     will both have to optimise and verify it.
>>
>>     The mixin pattern is also available in Python, there are some
>>     advantages against inheritance as run time behaviour can be
>>     determined.
>>
>>     Are you able to explain better the difference and implication of
>>     your approach to existing approaches?
>>     It has been said in this thread that code up to the RDF level is
>>     the better approach, that is the RDF is not fully modelled in the
>>     code but translated through e.g. SPARQL I suppose.
>>     How do you understand this?
>>
>>     Best,
>>
>>     Adam
>>
>>     On 20 August 2012 16:39, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>
>>         Hi,
>>
>>         Well, if there would be some way to do multiple inheritance
>>         in Java, that could be very useful.  I was thinking it would
>>         probably be possible to do some of the class reasoning in
>>         Java with its limited support for multiple inheritance with
>>         interfaces, just we would be partially limited in what we can
>>         do with classes in Java.  There shouldn't be a problem with
>>         the property reasoning, SPARQL, or rules in Java though (well
>>         I have methods to compute all the triplestore indexes).
>>         Semantic Web Services written as Java annotations on methods
>>         should work if annotations are extended to support arbitrary
>>         datatypes.
>>
>>         I could have written the code in a language that has multiple
>>         inheritance, but we can do a lot in Java, and Java is just my
>>         best language.  It might be straightforward to copy the code
>>         to any other object-oriented language.  I just looked at
>>         Python decorators this morning, since Python has
>>         multiple-inheritance.  It doesn't look straightforward to use
>>         them just for metadata on code elements like Java
>>         annotations, but maybe it can be done.  If it can, it looks
>>         like they would support arbitrary datatypes.  And then maybe
>>         we could add all of OWL, SPARQL, rules, and Semantic Web
>>         Services to Python without modifying Python...  Well, there
>>         would need to be something like AspectJ for Python as I'm
>>         doing it.
>>
>>         Truthfully, I haven't spent much thought about how best to do
>>         the class reasoning. I focused on the property reasoning and
>>         indexes and was waiting to talk to people about the class
>>         reasoning.  I'll have to look into the technologies you mention.
>>
>>         Tim
>>
>>
>>
>>         On 08/18/2012 04:55 PM, adasal wrote:
>>>
>>>             We would need to modify a compiler to determine to which
>>>             classes an object belongs so we would know what methods
>>>             can be used with it.  There could be methods in defined
>>>             classes.
>>>
>>>
>>>         You must be thinking about a multiple
>>>         class inheritance hierarchy.
>>>
>>>         There is this project
>>>         http://insightfullogic.com/blog/2011/sep/16/multiple-inheritance/
>>>         but I think there must be other implementations.
>>>         Further containers for IoC such as Tapestry have a mature
>>>         mixin implementation for class transformation, or Scala (and
>>>         Java 8 to be) supports traits.
>>>         Wouldn't this cover it instead of messing around with the
>>>         compiler?
>>>
>>>         I would have thought the real problem is how to
>>>         define precedence in the multiple hierarchy. How does OWL
>>>         deal with contradictory definitions in the hierarchy?
>>>
>>>         Adam
>>>
>>>         On 18 August 2012 16:10, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>>
>>>             Hi Adam,
>>>
>>>             What I have in mind is fitting my software together with
>>>             Sesame or Jena and just having the back-end store sets
>>>             of objects instead of whole triples and see how that
>>>             works.  For benchmarks, I think it won't be very
>>>             difficult to get SPARQL running on my software, since I
>>>             have methods to compute all the triplestore indexes
>>>             (permutations of subject-predicate-object) from all of
>>>             main memory, but SPARQL isn't running yet.
>>>
>>>             I just meant that my understanding was that OWL can
>>>             express anything about data OOP can express, and more,
>>>             but I'm sure Alan is right that there is more than
>>>             abstract classes. By "disparity" I meant that even if
>>>             there are differences between OOP and OWL of which I'm
>>>             not aware, I still don't see a problem with adding OWL
>>>             to OOP.
>>>
>>>             We would need to modify a compiler to determine to which
>>>             classes an object belongs so we would know what methods
>>>             can be used with it.  There could be methods in defined
>>>             classes.
>>>
>>>             Tim
>>>
>>>
>>>
>>>             On 08/18/2012 07:35 AM, adasal wrote:
>>>>
>>>>
>>>>             On 17 August 2012 23:08, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>>>
>>>>                 Certainly, object-oriented classes and OWL classes
>>>>                 are different, but my understanding is that the
>>>>                 main difference is just that OWL is strictly better.
>>>>
>>>>
>>>>             What do you mean by 'strictly', 'better' and 'strictly
>>>>             better'?
>>>>
>>>>                  I'm not aware of anything OOP can do that OWL
>>>>                 cannot do, but OWL can do a lot more.
>>>>
>>>>
>>>>             What do you mean by 'do'? Do you mean it is more
>>>>             expressive such that it is possible to define in OWL
>>>>             what cannot be defined in OOP? Isn't that axiomatic in
>>>>             that they are different languages with different semantics?
>>>>             What you are really saying is that you want to extend
>>>>             the syntax of OOP in a form you think is convenient to
>>>>             use such that it will be able to express OWL semantics.
>>>>
>>>>                 Well, abstract classes, but that's all I can think of.
>>>>
>>>>             So is this relevant?
>>>>
>>>>                 Or if there is still going to be a disparity,
>>>>
>>>>             What does this mean?
>>>>
>>>>                 we should still just be able to add all the OWL
>>>>                 class constructs and everything else about OWL and
>>>>                 let people use them in OOP.
>>>>
>>>>             You mean with your annotations - but the issue really
>>>>             is whether this is more convenient than existing
>>>>             approaches.
>>>>
>>>>                 We'd need to get into a compiler to do some of it,
>>>>                 but I think it would be worth it.
>>>>
>>>>             Why would it be necessary to get into the compiler,
>>>>             what are you talking about?
>>>>             Do you mean to pick up annotations - that is not
>>>>             necessary as new annotations can be defined as things
>>>>             stand - or to optimise such as in the way you mention
>>>>             where reasoning is selective. I can't see that this
>>>>             needs access to the compiler so much as an
>>>>             understanding of the logic of whether and when
>>>>             selective reasoning is a proper optimisation.
>>>>
>>>>             You would have to show that your approach is better
>>>>             than the existing approaches to optimisation that sit
>>>>             on top of triple and quad stores.
>>>>             Can you do this?
>>>>
>>>>             Adam
>>>>
>>>
>>>
>>
>>
>
>
Received on Thursday, 6 September 2012 23:48:33 UTC