Re: proposal - Fragments redux (unifying the threads under Issues 75-80) from Alan Wu on 2007-11-29 (public-owl-wg@w3.org from November 2007)

From: Alan Wu <alan.wu@oracle.com>
Date: Thu, 29 Nov 2007 12:58:09 -0500
To: Jim Hendler <hendler@cs.rpi.edu>
CC: Bernardo Cuenca Grau <bcg@cs.man.ac.uk>, OWL Working Group WG <public-owl-wg@w3.org>
Message-ID: <474EFDB1.3070604@oracle.com>
Jim,

I could not agree with you more on this topic.

Jim Hendler wrote:

>
> On Nov 28, 2007, at 8:32 PM, Bernardo Cuenca Grau wrote:
>
>>
>>
>>> * Let's be much more picky with what we name
>>>
>>>     Instead, I believe we should move much more cautiously on
>>> fragments, as these can have a lot of impact on developers and
>>> adoption of OWL.  Thus, I would suggest instead of a document like
>>> this which tries to identify all sorts of fragments (and notice, our
>>> charter does not specify tractability as a design req, so we must
>>> consider lots of other fragments as well) we try to pick a small
>>> number of them that are widely endorsed or used and attempt to
>>> highlight these.  I
>>>
>>
>> The document tries to give an overview of the different (families of)
>> choices. In fact, it is not that exhaustive, since there are many more
>> creatures in the jungle than the ones I mentioned. I had to make a
>> selection :) I agree with Jim in that we may need to be even pickier.
>>
>> I do think, however, that identifying fragments with nice computational
>> properties *is* important. Most of the fragments in the document have
>> been carefully designed and have involved years of intensive research. In
>> fact, most existing ontologies comply with them, so these fragments are
>> something more than a mere academic exercise. Moreover, efficient
>> reasoners exist for them (at least for the ones I selected). I think it
>> would be a bad idea for the OWL community not to take advantage of such a
>> research effort.
>
>
> to me the question is whether the identification of these things in a 
> specification/recommendation is important - I don't disagree they are 
> worth knowing and noting, and I'd love to see you publish this in a 
> higher visibility forum - I just have some trouble believing that I'm 
> going to see a major company's product that brags about supporting 
> DL_Lite_R - however, an IBM or Oracle product that could claim to 
> cover "OWL Lite" seems like something they would like (or a Web 3.0 
> provider who can say "we can import RDFS 3.0 ontologies")

That is definitely true. Right now, Oracle says that Oracle 11gR1
OWLPrime covers a good portion of the OWL Lite semantics and a bit of
the OWL DL vocabulary. Our semantics are defined by the set of
forward-chaining rules we use.
Oracle would love very much to say that Oracle is fully **** compliant!

If we have a *standard* OWL DL (or Lite) subset that is both easy
(really easy, I mean) for customers to understand and simple enough to
allow efficient and scalable implementations by commercial vendors, the
adoption of semantic web technologies in general will be a lot better.
Commercial vendors can then claim full standard compliance, and customers
don't have to worry about interoperability among commercial tools.

I want to stress again that scalability and efficiency are really
critical here, especially when we look at applications at the
enterprise level. For example, many customers approach Oracle with tons
of data (ontologies with hundreds of millions of triples and beyond) and
they request efficient processing of that data. They are willing to
sacrifice or make compromises in semantics for a reasonable response time.

>
>>
>>
>>> * Let's fix OWL Lite which IS broken
>>>
>>> In doing this, I would also suggest that we "fix" the problem that
>>> OWL Lite and OWL DL are much too close together, either by redefining
>>> OWL Lite (which in my experience almost no one uses on purpose -
>>> mostly it is people using DL who don't end up crossing the
>>> expressivity boundary) and in doing so we could consider answering
>>> the user request for a lighter lite -  by defining one (or at most a
>>> couple) of named fragments less than Lite, which have both DL and
>>> Full versions, and which tool vendors, implementors and others can
>>> then use as useful vocabularies.
>>>
>>
>> I agree on fixing OWL Lite. That was one of the goals of this document,
>> namely to present sensible possible fixes.
>>
>>> * but here's a way forward
>>>
>>>   Here's an idea - on the new fragments Wiki page -
>>> http://www.w3.org/2007/OWL/wiki/Fragments - I listed one particular
>>> fragment I like - but I included a description, the requirements by
>>> which it was defined, and the particular language features it would
>>> include.  If people have a particular fragment they would like to see
>>> included, and named, then I suggest you add one, and give these same
>>> pieces of information -- then we can at least see what people are
>>> suggesting in more detail, and discuss as a group how many of them
>>> people would really want to use
>>>
>>
>> I don't think I understand the motivations of that fragment, the
>> ontologies one could express using it,  and its
>> computational properties. Is there any relevant literature about this
>> fragment that I could take a look at? Any reasoning system I could try?
>
>
> well, it's not so much motivated by computational properties, see out 
> in the real world there's people who just implement fast engines and 
> don't worry so much about the details... 
> but, what is the motivator is that several studies of ontologies 
> including the one we did at Maryland a couple years back [1], a recent 
> study by Guus and Frank van Harmelen (I don't have a reference, was 
> told about it by Guus and Frank), and a study by UMBC of Swoogle 
> results (blogged in eBiquity last year) all showed that most of the 
> currently most used ontologies (including FOAF) pretty much fall into 
> this class.  I've also met with several companies involved in OWL 
> startups, and this is the expressivity that most of them say is most 
> useful.

Jim, those studies match what Oracle has seen in the field. RDFS with
transitive properties, for example, can solve a lot of real-world
problems.  Some people request owl:sameAs and InverseFunctionalProperty
so that they can integrate data from different sources.
I believe all of these are covered by RDFS 3.0.
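
To make the data integration point concrete, here is a minimal sketch of how an inverse functional property yields owl:sameAs links between records from different sources. It is an illustration only, not how OWLPrime does it: the function name, the in-memory triple format and the sample data are assumptions. The one fact relied on is that foaf:mbox is declared inverse functional in FOAF.

```python
from collections import defaultdict

def same_as_from_ifp(triples, inverse_functional):
    """If p is inverse functional and (s1, p, o) and (s2, p, o) both hold,
    derive (s1, owl:sameAs, s2). Hypothetical helper for illustration."""
    subjects_by_value = defaultdict(set)
    for s, p, o in triples:
        if p in inverse_functional:
            subjects_by_value[(p, o)].add(s)
    derived = set()
    for subjects in subjects_by_value.values():
        ordered = sorted(subjects)
        # Emit one owl:sameAs triple per unordered pair of subjects.
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                derived.add((a, "owl:sameAs", b))
    return derived

# Made-up records from two sources that share a mailbox.
triples = [
    ("src1:alice", "foaf:mbox", "mailto:alice@example.org"),
    ("src2:a_smith", "foaf:mbox", "mailto:alice@example.org"),
    ("src2:bob", "foaf:mbox", "mailto:bob@example.org"),
]
merged = same_as_from_ifp(triples, {"foaf:mbox"})
```

Grouping subjects by (property, value) keeps this to a single pass over the data; only groups with more than one subject produce links.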

Thanks,

Zhe

>>
>> I am afraid I also don't quite understand some of the
>> requirements, namely 2) ``can have a clean operational semantics defined
>> via rule-based axiomatization'', 3) ``does not assume a specific
>> reasoning model'';
>>
>
> 2 is that a lot of DBs have rules built in, or companies in the data
> space have rule-based engines, and they've said that being able to
> represent the OWL terminology directly in rules is useful (e.g.
> owl:inverse(P1, P2) is defined as X P1 Y => Y P2 X) - the original
> DAML had such an axiomatization (although in KIF) and we heard from a
> number of implementors that they felt it easier to work from that than
> from the model theory
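
A minimal sketch of what such a rule looks like when run by forward chaining over a set of triples. This is an illustration under assumptions, not any vendor's engine: the function name and sample data are made up, the predicate is written as owl:inverseOf (the actual OWL term), and only the one direction stated in the rule above is applied.

```python
def apply_inverse_rule(triples):
    """Forward-chain the rule
        (P1 owl:inverseOf P2), (X P1 Y)  =>  (Y P2 X)
    until no new triples are produced. Hypothetical helper."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        inverse_pairs = [(s, o) for s, p, o in facts if p == "owl:inverseOf"]
        for p1, p2 in inverse_pairs:
            # Iterate over a snapshot because facts grows inside the loop.
            for x, p, y in list(facts):
                if p == p1 and (y, p2, x) not in facts:
                    facts.add((y, p2, x))
                    changed = True
    return facts

# Made-up example data.
facts = apply_inverse_rule([
    ("ex:hasParent", "owl:inverseOf", "ex:hasChild"),
    ("ex:ann", "ex:hasParent", "ex:bob"),
])
```

The outer loop matters once rules feed each other, for example when a derived triple matches a second inverseOf declaration.
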
>
> 3 is badly stated - what I meant is that many of the formal results 
> about DL are done with respect to a particular set of reasoning 
> services (mainly focused on inconsistency checking) but that there are 
> other things that some people need (for example fast computation of 
> transitive closures) that are less obviously related to the 
> tractability results for inconsistency checking -- I'd wager that most 
> of the real world web apps that are using RDF/RDFS are using 
> procedural reasoners (what you neats call "ad hoc code") that are 
> quite fast, many parallelizable, and some only care about particular 
> reasoning  -- so I don't see why we should say that writing a piece of 
> code to process "owl:sameAs" relations against a large RDF-DB is 
> somehow less interesting than proving a polynomial result for Tbox 
> reasoning in some DL subset.  
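
As an illustration of the kind of procedural code meant here, a minimal sketch of computing the transitive closure of one property by graph traversal. The function name and the edge-list input are assumptions for the example; real stores would work against indexed data rather than an in-memory list.

```python
from collections import defaultdict

def transitive_closure(edges):
    """Return every (a, c) pair such that c is reachable from a
    by following one or more edges. Hypothetical helper."""
    successors = defaultdict(set)
    for a, b in edges:
        successors[a].add(b)
    closure = set()
    for start in list(successors):
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            # .get avoids creating empty entries for leaf nodes.
            stack.extend(successors.get(node, ()))
    return closure

# Made-up chain a -> b -> c -> d.
closure = transitive_closure([("a", "b"), ("b", "c"), ("c", "d")])
```

Each start node is traversed independently of the others, which is one reason this kind of computation is easy to split across workers.
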
>
> In short, the people using RDF and RDFS, who many of us would like to 
> get to start using more OWL, have motivations that are answered by 
> other fragments than the ones chosen to be listed in the tractable 
> fragments (some of which depend on particular definitions of 
> tractability, but you and Carsten have a better discussion of that
> going than I need to)
>
>
>> Any clarifications will be welcome!
>>
>
> hope that helped -- FWIW, I wouldn't mind seeing some particular 
> subset of OWL with a really nice mapping to datalog being tagged as 
> "OWL DB", for example, but I would hope that the definition would be 
> made in such a way that someone wouldn't need a PhD in AI to decide if 
> they were using it or not 
>
>> Bernardo
>>
>>
>>>
>>>   -Jim Hendler
>>>
>>>
>>>
>>> On Nov 28, 2007, at 12:57 PM, Bernardo Cuenca Grau wrote:
>>>
>>>>
>>>>
>>>> Ok. I see quite a lot of discussion concerning the tractable
>>>> fragments document, and I will try to reply to all the issues.
>>>>
>>>> The selected version of DL-Lite is DL-Lite_R. As Carsten points
>>>> out, there are other variants of DL-Lite for which reasoning is
>>>> tractable. These variants share a common core, but provide
>>>> different extensions to this core corresponding to different
>>>> choices that one has to make in order to keep tractability. For
>>>> example, DL-Lite_R extends the ``core'' of the language with role
>>>> inclusion axioms. DL-Lite_F extends it with functionality of
>>>> roles and their inverses. If both role functionality and role
>>>> inclusion axioms were to be included, the nice computational
>>>> properties of the DL-Lite family of languages would be compromised.
>>>>
>>>> The selection of DL-Lite_R was motivated by the fact that it is a
>>>> proper extension of the DL subset of RDF-Schema, which provides
>>>> role-inclusion axioms but not functionality, and therefore
>>>> DL-Lite_R is a language that lies in between such DL subset of
>>>> RDF-Schema and OWL Lite.
>>>>
>>>> In any case, I agree that these choices should be discussed and
>>>> that we could do a better job in presenting all (or most of) the
>>>> variants. Also, as Carsten points out, there is also a distinction
>>>> between tractability of reasoning and the fact that query answering
>>>> can be handled using RDBMS, and this should probably be made
>>>> explicit if we are to present other variants of DL-Lite.
>>>>
>>>> I think we should discuss the alternatives within the WG
>>>>
>>>> Bernardo
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> OWL Working Group Issue Tracker wrote:
>>>>
>>>>> ISSUE-80 (DL-Lite): REPORTED: DL-Lite
>>>>>
>>>>> http://www.w3.org/2007/OWL/tracker/issues/
>>>>>
>>>>> Raised by: Bijan Parsia
>>>>> On product:
>>>>> (On behalf of Carsten Lutz.)
>>>>>
>>>>> There are many versions of DL-Lite around, all of them tractable,
>>>>> and many (but not all) of them reducible to query answering in
>>>>> RDBMS. I wonder how the fragment of DL-Lite was selected that is
>>>>> currently in the document and what are the alternatives? Maybe
>>>>> Bernardo can comment on this.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> "If we knew what we were doing, it wouldn't be called research, would
>>> it?." - Albert Einstein
>>>
>>> Prof James Hendler http://www.cs.rpi.edu/~hendler
>>> Tetherless World Constellation Chair
>>> Computer Science Dept
>>> Rensselaer Polytechnic Institute, Troy NY 12180
>>>
>>>
>>>
>>>
>>>
>>
>>
>> ***********************************
>> Dr. Bernardo Cuenca Grau
>> Research Fellow
>> Information Management Group
>> School Of Computer Science
>> University of Manchester, UK
>> http://www.cs.man.ac.uk/~bcg
>> ************************************
>>
>>
Received on Thursday, 29 November 2007 18:01:14 UTC