Re: Enhancing object-oriented programming with OWL

Hi,

Martynas, I'm not sure what you mean by “huge abstraction level.” If you 
look at my example program, http://www.semanticoop.org/example.html, 
you'll see that all the code consists of the OWL semantics for the 
attributes/properties, a single line at the beginning of the program, 
and then regular Java code. If we add OWL to Java, all OWL data are 
simply object-oriented data, so we manipulate them with ordinary 
object-oriented code instead of a special triple-oriented API as in 
Jena, Sesame, RDFLib, etc. (That software is very valuable; I'll 
probably want to combine my software with both Jena and Sesame.)
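
To make the contrast concrete, here is a rough sketch. The annotation 
below is hypothetical and defined inline just so the snippet compiles; 
it is not necessarily what my software actually uses, and the real code 
is in the example program linked above.

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.HashSet;
import java.util.Set;

// Hypothetical annotation, defined here only so the sketch compiles.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface InverseOf {
    String value();
}

class Person {
    // The OWL semantics are declared once on the attribute; after that
    // the attribute is read and written as ordinary Java data.
    @InverseOf("hasParent")
    Set<Person> hasChild = new HashSet<Person>();
    Set<Person> hasParent = new HashSet<Person>();
}

public class Demo {
    public static void main(String[] args) {
        Person alice = new Person();
        Person bob = new Person();
        alice.hasChild.add(bob);  // plain Java, and conceptually the triple "alice hasChild bob"
        System.out.println(alice.hasChild.contains(bob));
    }
}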

My motivation is mainly to let people use the Semantic Web in any of 
their object-oriented code, which, as far as I know, no one has done 
before. Really, I think people will love it. I also think my software 
has some advantages over Semantic Web software that stores its data as 
whole triples, and those advantages are very much worth investigating. 
I ran some benchmarks comparing my software with Jena for creating 
resources / Java objects and making triple statements. If you look at 
them, http://www.semanticoop.org/benchmarks.html, you'll see that my 
software compares very favorably. Please feel free to try them 
yourself! (And email me if you have problems running the code.) I'll 
gladly run any other benchmarks people suggest, within the limit that I 
don't have SPARQL working yet (though it should be close); at the very 
least, they should be useful benchmarks to run.
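
For reference, the Jena side of the resource-creation benchmark is just 
the standard Model API, along these lines (this is only the shape of 
it, not the exact code from the benchmarks page, and the URIs are 
placeholders):

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Property;
import com.hp.hpl.jena.rdf.model.Resource;

public class JenaCreationSketch {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        Property hasChild = model.createProperty("http://example.org/", "hasChild");
        long start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            // create two resources and assert one triple between them
            Resource parent = model.createResource("http://example.org/person/" + i);
            Resource child = model.createResource("http://example.org/child/" + i);
            parent.addProperty(hasChild, child);
        }
        System.out.println("elapsed ms: " + (System.currentTimeMillis() - start));
    }
}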

Well, I'm glad people are talking to me, but I'm surprised I'm not 
hearing much that's positive. Shouldn't we just go ahead and try to 
convince people to post their object-oriented data on the Semantic Web 
so that everyone has access to the data, as in my last post? Once the 
code is written, that is.

Tim


On 09/07/2012 09:15 AM, Martynas Jusevičius wrote:
> Tim,
>
> let me be so bold as to argue that you do not need the object model
> *at all* to build Semantic Web apps. Above RDF/SPARQL API level, that
> is.
>
> So your idea might be interesting as a computer science experiment,
> but to those building practical applications, it might be just another
> huge abstraction level without real added value.
> It might be helpful in order to integrate legacy Java apps or lower
> the SW learning curve for developers coming from that world, but
> that's pretty much it. Are you sure it's worth the effort?
>
> Martynas
>
> On Fri, Sep 7, 2012 at 2:48 AM, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>> Hi,
>>
>> Well, we need a new reasoner.  I explain my approach a bit below for the
>> reasoner I wrote.  I believe the approach is new, but I could be entirely
>> mistaken about that.  At no time does all of memory have to be in a reasoned
>> state, only a small part of it.  We need a new reasoner because all the
>> existing ones run on a set of whole triples, and we need one that runs
>> directly on Java objects.
>>
>> I believe it is eminently worthwhile to write a new reasoner.  I'm sure
>> there is a lot we can take from existing reasoners.  We would have to work
>> on it.  I'm always surprised that more people don't know about the Semantic
>> Web.  It is a very powerful set of technologies, and I think making the
>> Internet into a giant database is just a great practical development in
>> computers.  I have in mind for all programmers to use the Semantic Web every
>> day in all their object-oriented code: OWL, SPARQL, rules, and Semantic Web
>> Services.  I have in mind for everyone to know about the Semantic Web.
>> Likely people will want a lot more Semantic Web software if that happens.
>>
>> People didn't seem impressed when I said we could post all object-oriented
>> data on the Semantic Web.  We can.  The Semantic Web will just be giant with
>> all that data on it.  I'd like to explain in more detail.  People have
>> commented on the difference between attributes and properties, and I've read
>> some of the links.  My view is still that an attribute is just a binary
>> predicate, nothing more, nothing less.  It just relates two entities.
>> Translating Java data into RDF is on my to-do list to code, but it's
>> straightforward to explain.  Each Java package is an ontology.  The classes
>> in the package are members of the ontology.  We make each attribute into a
>> property.  In my software, properties are defined in their own files
>> alongside the classes.  We could put all the properties in the main ontology
>> if the properties from different classes don't conflict, or we could create
>> a separate ontology for each class that just contains the class's
>> attributes.  Then for the data, we just take each Java Collection or array
>> that is an attribute and make each element of it into a triple.  When it is
>> a list or an array, we will have to differentiate whether the programmer
>> really means a set of triples or a list like rdf:List.  Sometimes when
>> programmers use lists, they really mean sets.  The value of each attribute
>> that is not a Java Collection or an array is just a single triple.  There
>> are rdf:type triples for class membership.  So all object-oriented data are
>> triples.  Then we can post all object-oriented data on the Semantic Web.  We
>> should just go ahead and do that so that people have access to the data.
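>>
>> As a rough illustration of that mapping (the ontology URI scheme and the names
>> here are just placeholders, not a finished design):
>>
>> import java.util.List;
>> import java.util.Set;
>>
>> // The package, say com.example.family, would become an ontology, e.g.
>> // http://example.org/com/example/family# (the URI scheme is a placeholder).
>> public class Person {
>>     // Each attribute becomes a property of that ontology.
>>     Set<Person> hasChild;    // each element e yields a triple  <this> family:hasChild <e>
>>     Person mother;           // a non-collection attribute yields a single triple
>>     List<String> nicknames;  // a list: either a set of triples or an rdf:List, depending on intent
>> }
>> // Every instance x of Person also gets a class-membership triple:
>> //   <x> rdf:type family:Person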
>>
>> My approach to reasoning is a little difficult to explain, but I did my best
>> to explain it in my article, which I quote here.  It might make better sense
>> in the context of the article, but you should be able to follow it.  As I
>> mentioned, for each read operation, it is as if Protege reasoned only the
>> part of the data set relevant to the screen the user is currently viewing
>> and did not do any more reasoning.  When we call a read method on a Java
>> Collection that is an attribute, we do only the reasoning that could
>> potentially cause objects to be added to the set of objects.
>>
>> Property reasoning consists of trying to add objects to a certain
>> subject-predicate pair.  When we call a read method on a Java Collection
>> that is an attribute for a certain subject and predicate, we want all the
>> objects to be in it.  The goal is to reason this object set, i.e. add as
>> many members to it as we can according to the reasoning.  For each type of
>> property reasoning, there is a certain manner in which objects may be added
>> to the object set.  We need to try to add objects by each type of reasoning.
>> Depending on the semantics of the property, we need to try to add objects by
>> zero or more of rdfs:subPropertyOf (including owl:equivalentProperty),
>> owl:inverseOf, owl:ReflexiveProperty, owl:SymmetricProperty, and
>> owl:TransitiveProperty.  So let us look at each type of reasoning.  For a
>> subject s and a predicate p, let us call the set of objects O_s,p.
>>
>> First, let us look at rdfs:subPropertyOf reasoning.  Let us have a property
>> hasChild with two subproperties, hasDaughter and hasSon, representing the
>> sets of a person's children, daughters, and sons, respectively.  The domains
>> are all Person, a class representing a person.  The ranges are Person,
>> Female, and Male, respectively, where Female is a female person and Male is
>> a male person.  We are trying to find all of a person x's children.  I.e.,
>> we are trying to reason the object set O_x,hasChild, with person x as
>> subject and hasChild as predicate.  To find all of x's children, we need to
>> find all of x's daughters and sons.  I.e., we need to reason the object sets
>> O_x,hasDaughter and O_x,hasSon.  However, if the only form of reasoning we
>> are using for hasChild is rdfs:subPropertyOf, i.e. if no other form of
>> reasoning can add objects to O_x,hasChild, then we reason O_x,hasDaughter
>> and O_x,hasSon, and then we are done reasoning.  We reason those two object
>> sets and add them to O_x,hasChild along with any objects that were in it
>> before the reasoning.  We have found all of x's children and have likely
>> reasoned only a small part of the whole data set.  None of the rest of the
>> data matters, so we do not have to reason it.  In general, when we are
>> trying to reason an object set O_s,p, we need to reason O_s,q for each
>> subproperty q of p.
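>>
>> In code, the shape of that subproperty step is roughly the following (a sketch
>> only; the interface and method names are hypothetical stand-ins, and the
>> reasoner here is reduced to the one rule being discussed):
>>
>> import java.util.HashSet;
>> import java.util.List;
>> import java.util.Set;
>>
>> // Hypothetical stand-in for however properties are represented internally.
>> interface PropertySketch {
>>     Set<Object> storedObjects(Object subject);  // asserted objects for <subject, this property>
>>     List<PropertySketch> subProperties();       // declared subproperties (incl. equivalent properties)
>> }
>>
>> // Minimal sketch of the rdfs:subPropertyOf step: to reason O_s,p we first
>> // reason O_s,q for each subproperty q of p, then take the union.
>> class SubPropertyReasonerSketch {
>>     Set<Object> reasonObjectSet(Object subject, PropertySketch p) {
>>         Set<Object> objects = new HashSet<Object>(p.storedObjects(subject));
>>         for (PropertySketch q : p.subProperties()) {
>>             objects.addAll(reasonObjectSet(subject, q));  // e.g. hasDaughter, hasSon for hasChild
>>         }
>>         return objects;  // O_s,p, as far as subproperty reasoning alone can fill it
>>     }
>> }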
>>
>> Let us say that hasChild has further semantics beyond having subproperties.
>> In particular, let us say it has an inverse property by owl:inverseOf,
>> hasParent, the set of a person's parents.  The domain and range of each is
>> Person.  We have found the object sets we need to reason for adding objects
>> to O_x,hasChild by rdfs:subPropertyOf reasoning.  Now let us find the object
>> sets we need to reason for adding objects to it by owl:inverseOf reasoning.
>> For each member y of the Person class, we reason the object set
>> O_y,hasParent.  If x is in O_y,hasParent, i.e. if y hasParent x, then x
>> hasChild y.  However, those object sets are the only ones we need to reason
>> for the owl:inverseOf reasoning.  We do not need to do any more reasoning.
>> In our particular hasChild example, the property is not reflexive, symmetric,
>> or transitive, so for this example we reason only the object sets described in
>> this paragraph and the previous one.  Then if those forms of
>> reasoning are the only ones our system does, nothing else can possibly add
>> objects to O_x,hasChild.
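>>
>> The inverse step has the same shape in code (again a sketch with illustrative
>> names; getting at every instantiated Person is where AspectJ comes in, as I
>> mention just below):
>>
>> import java.util.HashSet;
>> import java.util.Set;
>>
>> // Hypothetical stand-in for the Person class in the example.
>> class PersonSketch {
>>     Set<PersonSketch> hasChild = new HashSet<PersonSketch>();
>>     Set<PersonSketch> hasParent = new HashSet<PersonSketch>();
>> }
>>
>> // Minimal sketch of the owl:inverseOf step for hasChild/hasParent:
>> // x hasChild y whenever y hasParent x.  (For brevity this reads the stored
>> // hasParent sets directly instead of reasoning O_y,hasParent first.)
>> class InverseOfReasonerSketch {
>>     Set<PersonSketch> reasonHasChild(PersonSketch x, Set<PersonSketch> allPersons) {
>>         Set<PersonSketch> children = new HashSet<PersonSketch>(x.hasChild);
>>         for (PersonSketch y : allPersons) {   // every instantiated Person
>>             if (y.hasParent.contains(x)) {    // y hasParent x  =>  x hasChild y
>>                 children.add(y);
>>             }
>>         }
>>         return children;                      // O_x,hasChild after inverse reasoning
>>     }
>> }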
>>
>>
>> (I use AspectJ to get all instantiated members of a class, which normally we
>> cannot do in Java.)  The other property reasoning is exactly the same (it's
>> on page 8 in the article on my web page, but I'll gladly explain here if
>> anyone asks): when we are trying to reason an object set, for each type of
>> reasoning there is only a certain number of other object sets we need to
>> reason to complete the reasoning.  So when we call a read method on a Java
>> Collection that is an attribute, we reason only a small part of memory.
>> Well, someone can tell me if that approach isn't new, but it seems very
>> efficient computationally.
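>>
>> For the curious, the AspectJ part is essentially an aspect that records every
>> constructed instance in a registry, roughly like this (Person stands in for any
>> tracked class, and the aspect and method names are only illustrative):
>>
>> import java.util.Collections;
>> import java.util.Set;
>> import java.util.WeakHashMap;
>>
>> // Sketch of how AspectJ can expose "all instantiated members of a class";
>> // Person here is a placeholder for whichever classes are tracked.
>> public aspect InstanceRegistry {
>>     private static final Set<Person> persons =
>>         Collections.newSetFromMap(new WeakHashMap<Person, Boolean>());
>>
>>     // Record each Person as soon as its constructor finishes.
>>     after(Person p) returning: execution(Person.new(..)) && this(p) {
>>         persons.add(p);
>>     }
>>
>>     public static Set<Person> allPersons() {
>>         return persons;
>>     }
>> }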
>>
>> Oh, I really think I've done something new.  I have the basis for a whole
>> Semantic Web system that runs directly on Java objects instead of on a set of
>> whole triples.
>>
>> Tim
>> http://www.semanticoop.org
>>
>>
>>
>>
>> On 09/01/2012 08:02 PM, adasal wrote:
>>
>> Hi,
>> I don't think you answered the question of how what you propose would be
>> optimised. How would it deal with reasoning, for instance, bearing in mind
>> that Pellet and many others have been the subject of many years of development
>> and probably of many PhDs too!
>>
>> You might want to look at this project which Henry Story tipped me off to:-
>> https://github.com/w3c/banana-rdf
>>
>> This is from the test suite:-
>> ObjectExamples.scala
>>
>> package org.w3.banana
>>
>> import scalaz.{ Validation, Failure, Success }
>> import java.util.UUID
>>
>> class ObjectExamples[Rdf <: RDF]()(implicit diesel: Diesel[Rdf]) {
>>
>>    import diesel._
>>    import ops._
>>
>>    case class Person(name: String, nickname: Option[String] = None)
>>
>>    object Person {
>>
>>      val clazz = uri("http://example.com/Person#class")
>>      implicit val classUris = classUrisFor[Person](clazz)
>>
>>      val name = property[String](foaf.name)
>>      val nickname = optional[String](foaf("nickname"))
>>      val address = property[Address](foaf("address"))
>>
>>      implicit val container = uri("http://example.com/persons/")
>>      implicit val binder = pgb[Person](name, nickname)(Person.apply, Person.unapply)
>>
>>    }
>>
>>    sealed trait Address
>>
>>    object Address {
>>
>>      val clazz = uri("http://example.com/Address#class")
>>      implicit val classUris = classUrisFor[Address](clazz)
>>
>>      // not sure if this could be made more general, nor if we actually want to do that
>>      implicit val binder: PointedGraphBinder[Rdf, Address] = new PointedGraphBinder[Rdf, Address] {
>>        def fromPointedGraph(pointed: PointedGraph[Rdf]): Validation[BananaException, Address] =
>>          Unknown.binder.fromPointedGraph(pointed) orElse VerifiedAddress.binder.fromPointedGraph(pointed)
>>
>>        def toPointedGraph(address: Address): PointedGraph[Rdf] = address match {
>>          case va: VerifiedAddress => VerifiedAddress.binder.toPointedGraph(va)
>>          case Unknown => Unknown.binder.toPointedGraph(Unknown)
>>        }
>>      }
>>
>>    }
>>
>>    case object Unknown extends Address {
>>
>>      val clazz = uri("http://example.com/Unknown#class")
>>      implicit val classUris = classUrisFor[Unknown.type](clazz, Address.clazz)
>>
>>      // there is a question about constants and the classes they live in
>>      implicit val binder: PointedGraphBinder[Rdf, Unknown.type] =
>>        constant(this, uri("http://example.com/Unknown#thing")) withClasses classUris
>>
>>    }
>>
>>    case class VerifiedAddress(label: String, city: City) extends Address
>>
>>    object VerifiedAddress {
>>
>>      val clazz = uri("http://example.com/VerifiedAddress#class")
>>      implicit val classUris = classUrisFor[VerifiedAddress](clazz, Address.clazz)
>>
>>      val label = property[String](foaf("label"))
>>      val city = property[City](foaf("city"))
>>
>>      implicit val ci = classUrisFor[VerifiedAddress](clazz)
>>
>>      implicit val binder = pgb[VerifiedAddress](label, city)(VerifiedAddress.apply, VerifiedAddress.unapply) withClasses classUris
>>
>>    }
>>
>>    case class City(cityName: String, otherNames: Set[String] = Set.empty)
>>
>>    object City {
>>
>>      val clazz = uri("http://example.com/City#class")
>>      implicit val classUris = classUrisFor[City](clazz)
>>
>>      val cityName = property[String](foaf("cityName"))
>>      val otherNames = set[String](foaf("otherNames"))
>>
>>      implicit val binder: PointedGraphBinder[Rdf, City] =
>>        pgb[City](cityName, otherNames)(City.apply, City.unapply) withClasses classUris
>>
>>    }
>>
>> }
>>
>> The stated aim of the developers is to follow the RDF spec 'carefully' and
>> to work with Jena and Sesame, so perhaps not the same as yours?
>>
>>
>> Adam
>>
>>
>> On 30 August 2012 17:03, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>> Hi,
>>>
>>> Sorry for taking so long to respond, and thank you for talking to me.
>>>
>>> As far as I can tell, it is entirely possible to build Semantic Web
>>> software on top of object-oriented programming languages, and presumably
>>> database technologies, instead of starting Semantic Web software from
>>> scratch.  Object-oriented languages already have a lot of the OWL data model
>>> implemented.  Well, all they have are classes, properties, and
>>> rdfs:subClassOf, but they do those very well.  So we want to add the rest of
>>> OWL to them.
>>>
>>> All the property reasoning just works directly for attributes.  Attributes
>>> can have subattributes, inverse attributes, be transitive, etc., and it all
>>> makes perfect sense.  We just let people use property reasoning for any of
>>> their object-oriented attributes.  I tried to show how useful it would be in
>>> the example program on my web page: http://www.semanticoop.org/example.html.
>>> There is a Person Java class with attributes for the person's mother,
>>> father, parents, children, ancestors, descendants, an attribute for all the
>>> person's relatives, etc.  You can imagine how messy the Java program would
>>> be without the property reasoning.  The reasoning really helps.  Well, maybe
>>> there are better ways of doing it than with my software, but I think that if
>>> we can add OWL to OOP, people will really like it.
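>>>
>>> Schematically, the Person class in that example looks something like this (a
>>> sketch only; the comments stand in for however the OWL semantics actually get
>>> declared, and the real code is on the web page):
>>>
>>> import java.util.HashSet;
>>> import java.util.Set;
>>>
>>> // Sketch of the Person class from the example program.  The comments mark
>>> // the OWL semantics intended for each attribute/property.
>>> public class Person {
>>>     Person mother;                                      // sub-attribute of hasParent
>>>     Person father;                                      // sub-attribute of hasParent
>>>     Set<Person> hasParent = new HashSet<Person>();
>>>     Set<Person> hasChild = new HashSet<Person>();       // owl:inverseOf hasParent
>>>     Set<Person> hasAncestor = new HashSet<Person>();    // owl:TransitiveProperty
>>>     Set<Person> hasDescendant = new HashSet<Person>();  // owl:inverseOf hasAncestor
>>>     Set<Person> hasRelative = new HashSet<Person>();    // owl:SymmetricProperty, transitive
>>> }
>>> // Reading hasChild, hasAncestor, etc. then triggers only the reasoning needed
>>> // to fill that one attribute's object set.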
>>>
>>> Well, I really don't know what to do with the software.  Thank you for
>>> talking to me about it.
>>>
>>> Tim Armstrong
>>>
>>>
>>>
>>> On 08/20/2012 05:21 PM, adasal wrote:
>>>
>>> OK, there is something like multiple inheritance in Scala with Scala
>>> traits, and there are a few implementations of mixin frameworks in Java,
>>> which is really IoC using DI. Tapestry was my example, but Spring allows
>>> something similar to enable separation of concerns.
>>> However I believe that Sesame/AliBaba also allows such hooks.
>>> When it comes to reasoning there are existing Java reasoners. It seems
>>> more than a tall order to build your own!
>>> For instance http://clarkparsia.com/pellet/
>>> The problem that you will have if you build your own is that you will both
>>> have to optimise and verify it.
>>>
>>> The mixin pattern is also available in Python; it has some advantages over
>>> inheritance, as run-time behaviour can be determined.
>>>
>>> Are you able to explain better the differences and implications of your
>>> approach compared to existing approaches?
>>> It has been said in this thread that coding up to the RDF level is the
>>> better approach, that is, the RDF is not fully modelled in the code but
>>> translated through, e.g., SPARQL, I suppose.
>>> How do you understand this?
>>>
>>> Best,
>>>
>>> Adam
>>>
>>> On 20 August 2012 16:39, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>>> Hi,
>>>>
>>>> Well, if there were some way to do multiple inheritance in Java, that
>>>> could be very useful.  I was thinking it would probably be possible to do
>>>> some of the class reasoning in Java with its limited support for multiple
>>>> inheritance through interfaces, though we would be somewhat limited in what
>>>> we can do with classes in Java.  There shouldn't be a problem with the property
>>>> reasoning, SPARQL, or rules in Java though (well I have methods to compute
>>>> all the triplestore indexes).  Semantic Web Services written as Java
>>>> annotations on methods should work if annotations are extended to support
>>>> arbitrary datatypes.
>>>>
>>>> I could have written the code in a language that has multiple
>>>> inheritance, but we can do a lot in Java, and Java is just my best language.
>>>> It might be straightforward to copy the code to any other object-oriented
>>>> language.  I just looked at Python decorators this morning, since Python has
>>>> multiple inheritance.  It doesn't look straightforward to use them just for
>>>> metadata on code elements like Java annotations, but maybe it can be done.
>>>> If it can, it looks like they would support arbitrary datatypes.  And then
>>>> maybe we could add all of OWL, SPARQL, rules, and Semantic Web Services to
>>>> Python without modifying Python...  Well, there would need to be something
>>>> like AspectJ for Python, given the way I'm doing it.
>>>>
>>>> Truthfully, I haven't spent much thought about how best to do the class
>>>> reasoning.  I focused on the property reasoning and indexes and was waiting
>>>> to talk to people about the class reasoning.  I'll have to look into the
>>>> technologies you mention.
>>>>
>>>> Tim
>>>>
>>>>
>>>>
>>>> On 08/18/2012 04:55 PM, adasal wrote:
>>>>> We would need to modify a compiler to determine to which classes an
>>>>> object belongs so we would know what methods can be used with it.  There
>>>>> could be methods in defined classes.
>>>>
>>>> You must be thinking about a multiple class inheritance hierarchy.
>>>>
>>>> There is this project
>>>> http://insightfullogic.com/blog/2011/sep/16/multiple-inheritance/
>>>> but I think there must be other implementations.
>>>> Further containers for IoC such as Tapestry have a mature mixin
>>>> implementation for class transformation, or Scala (and Java 8 to be)
>>>> supports traits.
>>>> Wouldn't this cover it instead of messing around with the compiler?
>>>>
>>>> I would have thought the real problem is how to define precedence in the
>>>> multiple hierarchy. How does OWL deal with contradictory definitions in the
>>>> hierarchy?
>>>>
>>>> Adam
>>>>
>>>> On 18 August 2012 16:10, Timothy Armstrong <tim.armstrong@gmx.com> wrote:
>>>>> Hi Adam,
>>>>>
>>>>> What I have in mind is fitting my software together with Sesame or Jena
>>>>> and just having the back-end store sets of objects instead of whole triples
>>>>> and seeing how that works.  For benchmarks, I think it won't be very difficult
>>>>> to get SPARQL running on my software, since I have methods to compute all
>>>>> the triplestore indexes (permutations of subject-predicate-object) from all
>>>>> of main memory, but SPARQL isn't running yet.
>>>>>
>>>>> I just meant that my understanding was that OWL can express anything
>>>>> about data that OOP can express, and more, but I'm sure Alan is right that
>>>>> there is more to the difference than abstract classes. By "disparity" I meant
>>>>> that even if there are differences between OOP and OWL of which I'm not
>>>>> aware, I still don't see a problem with adding OWL to OOP.
>>>>>
>>>>> We would need to modify a compiler to determine to which classes an
>>>>> object belongs so we would know what methods can be used with it.  There
>>>>> could be methods in defined classes.
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>> On 08/18/2012 07:35 AM, adasal wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 17 August 2012 23:08, Timothy Armstrong <tim.armstrong@gmx.com>
>>>>> wrote:
>>>>>> Certainly, object-oriented classes and OWL classes are different, but
>>>>>> my understanding is that the main difference is just that OWL is strictly
>>>>>> better.
>>>>>
>>>>> What do you mean by 'strictly', 'better' and 'strictly better'?
>>>>>
>>>>>>   I'm not aware of anything OOP can do that OWL cannot do, but OWL can
>>>>>> do a lot more.
>>>>>
>>>>> What do you mean by 'do'? Do you mean it is more expressive such that it
>>>>> is possible to define in OWL what cannot be defined in OOP? Isn't that
>>>>> axiomatic in that they are different languages with different semantics?
>>>>> What you are really saying is that you want to extend the syntax of OOP
>>>>> in a form you think is convenient to use such that it will be able to
>>>>> express OWL semantics.
>>>>>
>>>>>> Well, abstract classes, but that's all I can think of.
>>>>> So is this relevant?
>>>>>
>>>>>> Or if there is still going to be a disparity,
>>>>> What does this mean?
>>>>>
>>>>>> we should still just be able to add all the OWL class constructs and
>>>>>> everything else about OWL and let people use them in OOP.
>>>>> You mean with your annotations - but the issue really is whether this is
>>>>> more convenient than existing approaches.
>>>>>
>>>>>> We'd need to get into a compiler to do some of it, but I think it would
>>>>>> be worth it.
>>>>> Why would it be necessary to get into the compiler? What are you talking
>>>>> about?
>>>>> Do you mean to pick up annotations - that is not necessary, as new
>>>>> annotations can be defined as things stand - or to optimise, such as in the
>>>>> way you mention where reasoning is selective? I can't see that this needs
>>>>> access to the compiler so much as an understanding of the logic of whether
>>>>> and when selective reasoning is a proper optimisation.
>>>>>
>>>>> You would have to show that your approach is better than the existing
>>>>> approaches to optimisation that sit on top of triple and quad stores.
>>>>> Can you do this?
>>>>>
>>>>> Adam
>>>>>
>>>>>
>>>>
>>>
>>

Received on Tuesday, 11 September 2012 19:08:05 UTC