Re: [LC response] To Jeff Heflin from Jeff Heflin on 2009-08-20 (public-owl-comments@w3.org from August 2009)

From: Jeff Heflin <heflin@cse.lehigh.edu>
Date: Thu, 20 Aug 2009 12:15:55 -0400
To: Ian Horrocks <ian.horrocks@comlab.ox.ac.uk>
CC: public-owl-comments@w3.org
Message-ID: <4A8D76BB.7020908@cse.lehigh.edu>

Ian,

Thanks for the e-mail and sorry to be so slow in responding. My comments
are inline below.

Jeff

Ian Horrocks wrote:
> Dear Jeff,
> 
> Thank you for your comment
>      
> <http://lists.w3.org/Archives/Public/public-owl-comments/2009Jul/0014.html>
> on the OWL 2 Web Ontology Language last call drafts.
> 
> Regarding imports, as you can see from [1] the direct semantics of OWL 2
> ontologies is explicitly applied to the axiom closure (with a link back
> to the definition of axiom closure in Syntax [2]).

Yes, I see that now. Thanks for pointing it out.

> 
> Regarding deprecation, we introduced owl:deprecated mainly for backward 
> compatibility in order to capture the deprecated classes of OWL 1. Thus, 
> the capabilities of OWL 2 regarding deprecation are essentially the same 
> as those of OWL 1.

You say "essentially" the same. How is it different? I pointed out that
the OWL 1.0 documents gave some discussion of the intention of
deprecation. Which OWL 2 document will contain these points? My original
e-mail also pointed out an error in the "Mapping to RDF Graphs" doc.
Does the group agree, and if so, has the fix been incorporated into the
working version?

> 
> Regarding profiles, the current design is the result of long and careful 
> analysis both within and without the working group. It is true that a 
> consistency check is typically required as part of query answering, but 
> this is true even if the language includes only disjointness, which is a 
> basic feature of conceptual modelling languages; moreover, consistency 
> checking is relatively easy in the profiles (see [2]), and only needs to 
> be performed once for a given ontology. The QL profile has been designed 
> so that query answering has the same complexity as for relational 
> databases, so there is no reason why QL systems should not be just as 
> scalable as relational database systems; tests show that existing 
> implementations can easily deal with data in the order of millions of 
> triples.

First, although disjointness is a common modeling primitive, checking it
is not an operation typically performed in databases (yes, you can set up
triggers to enforce a disjointness constraint, but this can be very
expensive). Although OWL QL has good theoretical complexity results,
this does not always lead to systems that perform well. You mention it
scales to millions of triples (BTW, I'd appreciate some references to
this work), but I said that people who are interested in scalability are
now dealing with billions of triples. And if we want to be taken
seriously by the database community, we need to set our sights on
trillions of triples in petabyte-sized knowledge bases! After all, this
is part of the Semantic Web effort; it's not just an XML syntax for DL.

You say that consistency checking is relatively easy and need only be 
performed once per ontology. However, in a pragmatic setting data will 
be added every day, if not every hour or even second. Although the T-box 
is consistent, inconsistencies can arise from this ever-evolving A-box. 
Every time we insert a new rdf:type triple, we need to check to see if 
it violates a disjointness constraint. Given that triple tables are 
widely viewed as not scalable (since joining a 1 billion row table with 
itself n times to answer a query with n conjuncts is a bad idea), this 
might mean checking many database tables to see if the subject appears 
in any of them, and this can add up to a significant performance hit. 
Although each of these checks will be less than polynomial in the size
of the table (assuming you use an index), together they take up valuable
cycles, and loading will get slower and slower as the KB grows.
Alternatively, you could check consistency of the entire KB right before 
each query, but this will involve doing set difference on many large 
tables. Queries will be as slow as molasses.
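
To make the two options concrete, here is a rough sketch in Python. It
is only an illustration under simplifying assumptions, not how any
particular OWL QL system works: each class gets its own in-memory
"table" (a set of subjects), disjointness is given as explicit pairs,
and subclass reasoning is ignored. The names TypeStore, insert_type and
check_all are made up for this example.

```python
from collections import defaultdict


class TypeStore:
    """Toy store with one 'table' (a set of subjects) per class."""

    def __init__(self, disjoint_pairs):
        self.members = defaultdict(set)        # class -> subjects typed with it
        self.disjoint_with = defaultdict(set)  # class -> classes disjoint from it
        for a, b in disjoint_pairs:
            self.disjoint_with[a].add(b)
            self.disjoint_with[b].add(a)

    def insert_type(self, subject, cls):
        """Add (subject rdf:type cls), probing every disjoint class table first.

        Returns the number of tables probed, i.e. the per-insert overhead.
        """
        probes = 0
        for other in self.disjoint_with[cls]:
            probes += 1
            if subject in self.members[other]:
                raise ValueError(
                    f"{subject} cannot be both {cls} and {other}")
        self.members[cls].add(subject)
        return probes

    def check_all(self):
        """Whole-KB alternative: intersect the tables of each disjoint pair."""
        conflicts = []
        for a, others in self.disjoint_with.items():
            for b in others:
                if a < b:  # visit each unordered pair once
                    for s in self.members[a] & self.members[b]:
                        conflicts.append((s, a, b))
        return conflicts


store = TypeStore([("Person", "Organization"), ("Person", "Place")])
print(store.insert_type("alice", "Person"))  # probes Organization and Place
try:
    store.insert_type("alice", "Organization")
except ValueError as err:
    print("rejected:", err)
```

In this sketch the insert-time cost grows with the number of classes
declared disjoint with the one being asserted, while check_all has to
scan whole tables, which is the trade-off described above.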

I think this focus on theoretical properties and omission of pragmatic 
considerations is a symptom of the OWL 2 effort being dominated by KR
academics (not that I have anything against academics, being one 
myself). Clearly, academics should play a role in precisely defining the 
language and in keeping the scope of the effort away from the 
impossible, but there should also be a healthy balance of industry 
personnel and their opinions should be given significant weight. The 
original WebOnt WG had a heavy academic bent, in part because the idea 
was so new. However, since the SemWeb is gaining in usage, I would think 
the trend should be more industry influence on the WG, not less.

> 
> Regarding arithmetic operations, the working group has specified an 
> extension that allows for linear (in)equations with rational 
> coefficients; this was not made part of the basic specification as it 
> would place a heavy burden on implementers.

Thanks, I don't expect this feature in OWL 2. However, it is a
requirement from real users, and I think some sort of arithmetic will
eventually be necessary if OWL is to be the language of the Semantic
Web. I'd appreciate a link to the proposed extension, if you don't mind.

> 
> [1] http://www.w3.org/2007/OWL/wiki/Direct_Semantics#Ontologies
> [2] http://www.w3.org/2007/OWL/wiki/Profiles#Computational_Properties
> 
> Please acknowledge receipt of this email to 
> <mailto:public-owl-comments@w3.org> (replying to this email should 
> suffice). In your acknowledgment please let us know whether or not you 
> are satisfied with the working group's response to your comment.
> 
> Regards,
> Ian Horrocks
> on behalf of the W3C OWL Working Group
Received on Thursday, 20 August 2009 16:16:37 UTC