Re: perspectives on OWL v.next and RDF from Bijan Parsia on 2006-11-18 (public-owl-dev@w3.org from October to December 2006)

From: Bijan Parsia <bparsia@isr.umd.edu>
Date: Sat, 18 Nov 2006 01:40:22 +0000
To: Jim Hendler <hendler@cs.umd.edu>
Cc: Kendall Clark <kendall@monkeyfist.com>, public-owl-dev@w3.org
Message-Id: <5ED4790F-4527-4EB6-BB90-4D2A493ECF5D@isr.umd.edu>

On Nov 15, 2006, at 1:10 AM, Jim Hendler wrote:
[snip]

> point taken, but one would expect the uptake on the public side to  
> be continuing while the other goes on - It is unclear to me why  
> intranet adoption would favor more expressivity, would assume it to
> be about the same
>

While I share the belief that intranet adoption is a perfectly fine  
rationale for adding (or removing) something, I would like to point  
out that I certainly don't think that the extra expressivity is being  
driven by "intranet" adoption...at least, I sure didn't see it that  
way. The person who, afaik, introduced this claim was Danny Ayers.  
From what I recall, he thinks that the kind of life sciences users
such as NCI, Galen, Snomed, etc. that want things like qualified  
cardinality restrictions are "effectively offline users". Kendall's  
point was, even if you *grant* this (what I think to be false)  
premise, it doesn't invalidate their needs or make them less useful  
for driving forward the semantic web.

Of course, I see the life sciences folks as paradigm semantic web  
users, or could be users. Think of the NCI thesaurus! It's a great
thing that it's out in public (thanks to you) and in OWL. It's
wonderful that the NCI people are migrating from a private, closed  
system with an idiosyncratic, well, everything, to a version of  
Protege + open source reasoners. Just getting the stuff *on* the web
in a reasonably interoperable format is very important.

All the features *added* are of this sort --- demanded by users at  
OWLED (and in other fora!). These are not *all* the users, obviously,  
but that's a reason why the increment is *very* small (basically, not  
too much more than DAML+OIL), but potentially high impact. That's  
also why we included more rational fragments.


> - a lot of the DOD projects I've been involved with are using  
> relatively low expressivity w/large ABOX as well
>

Sure. And there are large TBoxes from life sciences (for which the
EL++ fragment was developed...but EL++ includes qualified number
restrictions...so we sort of have to add them to the whole).

Also, consider more expressive datatypes. It's really nice for a  
number of applications (Web service policies come to mind!) to be  
able to do comparisons. Last year, in fact, surprisingly, datatypes  
came ahead of qualified cardinality restrictions.


>  again, I think the clarity of messaging and the development of a  
> simpler subset are both necessary to better OWL adoption - i don't  
> claim sufficiency, but if I have to prioritize (and I do, at least  
> w/respect to my time and that of my employees) then I would prefer  
> to see us fill the simplicity gap before chasing the expressivity  
> end - I think OWL DL/Full is expressive enough to hold most people  
> for a while...
>

Of course, none of us, at all, have done a formal market survey, so  
we're all giving impressions. Impressions are better at determining
positives than negatives. We all hear, "rational subsets!" but
perhaps you didn't hear, "Qualified number restrictions" (and, in
EL++, they unite). I do think that catering to the life science
ontologists is important (even if "offline")...one cool drug  
discovery facilitated by OWL is a great talking point.

One very interesting paper about the trade off is:
	http://owl-workshop.man.ac.uk/acceptedPosition/submission_19.pdf

""" Next, we used DL subsumption reasoning to classify
MED. In comparing the classified hierarchy for such defined concepts
with the original taxonomy in MED, we found 44 additional
subsumptions for laboratory concepts. On manual analysis of newly
inferred subsumptions, 26 were correct subsumptions i.e. the concepts
actually had a subclass relationship, as confirmed by a domain expert.
Interestingly, the false positives revealed systematic modeling errors.

The important result here is not that we identified these modeling
errors due to the increased expressivity of DL.  More important is our
finding that the missed subsumptions could have cost the hospital many
missing results in various decision support and infection control
systems that routinely use MED to screen patients.
"""

So, I think what we all would agree on is that expressivity for
expressivity's sake isn't interesting. Results are what matter.

I would also like to point out that the tractable fragments OWL
1.1 document includes fragments that are user driven. EL++ was
specifically done to address the very large medical ontologies. DL
Lite was motivated by database integration (not integrating reasoners
with databases, but integrating distinct databases). Some variants of
DL Lite cover FOAF. Surprisingly many ontologies turn out to be
Horn-SHIQ (as Boris found out).

Finally, I venture this explanation: one reason people with large  
scale problems haven't used more expressivity is that we've not  
explained well what expressivity they can "safely" use. OWL Lite  
definitely failed on that front. I would like to improve on that. I  
get good reactions from people when I explain these alternatives, FWIW.

Cheers,
Bijan.
Received on Saturday, 18 November 2006 01:40:37 UTC