- From: Jim Hendler <hendler@cs.umd.edu>
- Date: Tue, 3 Aug 2004 18:28:38 -0400
- To: "Rob Shearer" <Rob.Shearer@networkinference.com>, "Jeff Pollock" <Jeff.Pollock@networkinference.com>, "Farrukh Najmi" <Farrukh.Najmi@Sun.COM>
- Cc: "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
I think I answered this one before you sent it -- see my earlier posting
of a use case in which all the necessary information would be in the
triple store, where the queries are semantically well-defined (but little
or no entailment reasoning is required), and where no existing mechanism
I can find suffices for an immediate need by a major OWL user. I'm not
being academic and pedantic here - I'm trying to satisfy a real need by
my organization's backers - which is why I participate in this WG, same
as you. I'm not badmouthing your approach, just saying there is other
necessary work that I would like the DAWG to consider, as it is needed
for what I do.
 -JH

At 14:58 -0700 8/3/04, Rob Shearer wrote:
>Frankly, I'm not sure this is a very productive discussion, but I feel
>obliged to weigh in considering the frequency with which my name has
>been raised here.
>
>I certainly never intended to imply that description logics, or even
>OWL, should be the only "methodology" with which RDF applications should
>be viewed. I absolutely see a great deal of value in the use of "pure"
>RDF, with no additional inferencing layer.
>
>The point I have tried to raise, in my discussions with Jim in
>particular, is that it is very dangerous to take an ad-hoc approach to
>semantics and reasoning. It is perfectly valid to consider looking for
>"domain" or "range" triples in an RDF file (which happens to contain OWL
>assertions). It even makes some sense to ask whether such a triple is
>entailed by an OWL ontology (although all our experience at Network
>Inference suggests that simple entailment is an incredibly inconvenient
>query interface). It is definitely weird, however, to ask the general
>question "what is the domain of this property?" and have no clear
>semantics for what constitutes a correct answer. As I pointed out to Jim
>on a recent teleconference, it is quite straightforward for a property
>to have a well-defined domain or range without any domain or range
>triples appearing in the OWL/RDF file.
>
>My general concern is that any query language we come up with should
>have a formal model from which one can deduce the "correct" response. If
>you consider this kind of mathematical rigour to be "the DL
>methodology", then so be it, but it clearly is not specific to
>description logics. If the results of queries are determined by ad-hoc
>implementations and unspecified semantics, then I don't think we are any
>closer to interoperable exchange of semantic data than we ever were.
>
>> -----Original Message-----
>> From: Jim Hendler [mailto:hendler@cs.umd.edu]
>> Sent: Tuesday, August 03, 2004 2:11 PM
>> To: Jeff Pollock; Farrukh Najmi; Rob Shearer
>> Cc: RDF Data Access Working Group
>> Subject: RE: ebXML Registry UC (Was Re: Agenda: RDF Data Access 27 Jul 2004)
>>
>> Jeff - as tempted as I am to send a long reply (especially to
>> your second sentence below, which is simply fallacious - there
>> are many FOL subsets that can produce guarantees - DL is
>> arguably the maximal such) - let me be clear why I care about
>> this WITH RESPECT TO THE WORK OF THE DAWG.
>>
>> Consider the following example - I'd like to know whether the
>> National Cancer Institute's Cancer Ontology (available in OWL,
>> see [1]) states that the FGFR3 Gene is one that promotes
>> Fibroblast growth. That is, I'm looking to see if the triple
>>   nci:FGFR3_Gene rdfs:subClassOf
>>     nci:Fibroblast_Growth_Factor_Receptor_Family_Gene
>> (where ..._Gene is a class) is in the ontology.
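As a hindsight sketch of the kind of check being described here, the snippet
below uses Python and rdflib as a stand-in for whatever store a DAWG-protocol
service would front. The nci: namespace URI, the local file name, and the
nci:Synonym property mentioned afterwards are assumptions made purely for
illustration; they are not copied from the actual NCI files.

# Sketch only: namespace URI and file name are assumed, not taken from the
# NCI ontology. rdflib does no OWL inference here, so this answers only
# "is the triple directly asserted?", which is the question being posed.
from rdflib import Graph, Namespace
from rdflib.namespace import RDFS

NCI = Namespace("http://www.mindswap.org/2003/CancerOntology/nciOncology.owl#")

g = Graph()
g.parse("nciOncology.owl", format="xml")   # the ~25M RDF/XML document

asserted = (NCI.FGFR3_Gene,
            RDFS.subClassOf,
            NCI.Fibroblast_Growth_Factor_Receptor_Family_Gene) in g
print("subclass triple directly asserted:", asserted)

# The same question phrased in SPARQL -- roughly what a DAWG-language query
# carried over the DAWG protocol to a Kowari-style store would look like:
q = """
PREFIX nci:  <http://www.mindswap.org/2003/CancerOntology/nciOncology.owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
ASK { nci:FGFR3_Gene rdfs:subClassOf
        nci:Fibroblast_Growth_Factor_Receptor_Family_Gene }
"""
print("ASK result:", g.query(q).askAnswer)

The directly asserted synonym list mentioned a little further down would be an
equally simple pattern, e.g. SELECT ?syn WHERE { nci:FGFR3_Gene nci:Synonym ?syn },
again with no entailment involved.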
>> One way I can do this is to do an HTTP GET of
>> nci:FGFR3_Gene and then look at the definition there (and
>> hope the tool used put these triples in a standard class
>> definition). Another thing I can do is to get that document,
>> serialize it into a tool (such as the ones your company
>> creates) and use some sort of deduction to test whether the
>> above is entailed - that seems preferable. However, there
>> is a problem -- the NCI ontology is a document that is about
>> 25M and contains about 300,000 triples or so -- so the
>> download and serialization take a long time.
>>
>> One thing we are exploring in our research is to serialize
>> the ontology into a triple store (Tucana will be happy to
>> hear we're using Kowari) and make it available on our web
>> server. Queries, coming in the eventual DAWG language using
>> the eventual DAWG protocol, could provide the capability to
>> answer many questions about this ontology (for example the
>> direct subclass relation needed for the above query) in
>> extremely fast times. So we are exploring an alternate
>> mechanism that looks like it will be very useful in practice
>> and is of great interest to at least one major OWL supporter
>> (the NCI).
>>
>> Now, I'm not arguing I would never use or prefer a reasoner,
>> or that it wouldn't be possible to build persistent stores
>> that allowed Cerebra or another such product to be used to
>> answer these questions -- but it is my contention that many
>> simple queries (and if the one above is too complex, how
>> about if I simply want to know the directly asserted synonym
>> list - an annotation property, so no inference is needed or
>> allowed - for FGFR3_Gene) could be done using DAWG queries,
>> and this would be of value in certain applications (but not
>> all, and high-end things would unquestionably need a more
>> complex inferencer like yours).
>>
>> So my problem is that I don't want us to preclude a valuable
>> use of RDF query because it is not the way some companies
>> would prefer us to interact with OWL ontologies. I thought
>> that my use case (2.11, which eventually only got in in a
>> very watered-down form due to Rob's objections), Farrukh's
>> suggestion, and the continuing argument over 4.6 vs. 4.6a all
>> relate to this issue, so I was re-raising it in this context
>> to remind people that there are real users and use cases for
>> exploring the use of RDF queries to access RDF graphs
>> representing OWL ontologies.
>>  -JH
>>
>> [1] http://www.mindswap.org/2003/CancerOntology/
>>
>> At 12:39 -0700 8/3/04, Jeff Pollock wrote:
>> >Jim-
>> >
>> >I'm getting tired of reminding RDF people about why DLs are such an
>> >important part of the tech stack. ;-) Without them, there is no
>> >standardized or reliable inference capability that can guarantee the
>> >same answers across different reasoner implementations. UDDI 3.0,
>> >OWL-S and many Bio/Pharma ontologies, among others, have chosen the
>> >DL-based approach for good reasons. No one will deny that a DL view
>> >of things requires a conceptual shift, or that there are indeed
>> >technical limitations with what may be modeled. But in many cases the
>> >advantages outweigh the disadvantages.
>> >
>> >Regarding the RegRep, the SCM team has not yet debated the different
>> >levels of OWL support. I, for one, think that DL is a reasonable
>> >alternative to seriously consider. Depending on the level of RegRep
>> >specification, it may be a needed requirement.
>> >For example, if the RegRep simply exposes an OWL model as the
>> >interface to the repository - leaving it to vendors to implement
>> >their own query support - then restricting the interface to DL would
>> >enable assured consistency in "inference at query" results across
>> >vendor implementations. Otherwise, different proprietary chaining
>> >algorithms could conceivably turn up different results from different
>> >vendors - causing chaos in a distributed DNS-like architecture.
>> >
>> >I know you saw my prezo at the '04 W3C AC Rep mtg, but here's a
>> >reminder of what I was saying regarding why DLs matter:
>> >
>> >* Consistency - query results, across vendor implementations and
>> >instances, should be consistent
>> >* Performance - although performance metrics depend on model
>> >constructs, OWL-DL supports highly optimized inference algorithms
>> >* Predictability - semantics are mathematically decidable within the
>> >model, and reasoning is finite
>> >* Foundational - provides a baseline inside applications for layered
>> >semantic models
>> >* Reliability - if the answer to a query is implied by any of the
>> >model data, it will be found - guaranteed.
>> >
>> >Lest people be fearful of DLs, which could happen if your points are
>> >taken out of context, I simply wanted to say that there are indeed
>> >good reasons why they exist.
>> >
>> >Also, for the benefit of stating what should be obvious - Network
>> >Inference embraces and supports ALL of the Semantic Web stack - RDF,
>> >OWL-Lite, OWL-Full, and OWL-DL. Like you, we think that there are
>> >appropriate times to leverage all aspects of the spec.
>> >
>> >Time for me to get off the soapbox!
>> >
>> >Best Regards,
>> >
>> >-Jeff-
>> >
>> >
>> >-----Original Message-----
>> >From: public-rdf-dawg-request@w3.org
>> >[mailto:public-rdf-dawg-request@w3.org] On Behalf Of Jim Hendler
>> >Sent: Tuesday, August 03, 2004 9:56 AM
>> >To: Farrukh Najmi; Rob Shearer
>> >Cc: RDF Data Access Working Group
>> >Subject: Re: ebXML Registry UC (Was Re: Agenda: RDF Data Access 27 Jul 2004)
>> >
>> >
>> >Farrukh, thanks for your response to Rob - I've gotten tired of
>> >reminding him and others that the DL methodology is only one of the
>> >ways OWL can be used (and in practice, it's not even the most common
>> >- most OWL out there falls in Full, not DL) - it also has the problem
>> >that it is not yet scalable to some of the largest Lite/DL ontologies
>> >out there, and these are precisely the ones I want to access via query
>> >instead of "document" (since the documents can get huge and take long
>> >hours to download, parse and classify). Tools that will admit to the
>> >reality of the world out there and help people process it will be
>> >quite welcome.
>> > -JH
>> >p.s. Note, this is nothing against using DL when appropriate, as in
>> >many of NI's business uses, but just to make it clear that DL is only
>> >one of many ways OWL is being used, and it CANNOT be the defining
>> >restriction for all use cases and applicability ... oops, I'm
>> >starting to get passionate and use uppercase - I'll stop now...
>> >
>> >
>> >At 10:26 -0400 8/3/04, Farrukh Najmi wrote:
>> >>Rob Shearer wrote:
>> >>
>> >>>Greetings, Farrukh!
>> >>>
>> >>>Apologies for not initiating contact myself.
>> >>>
>> >>>Your use case came up at the face-to-face, and I was curious whether
>> >>>there were alternative ways to achieve the results you were trying
>> >>>to get.
>> >>>
>> >>>You suggest a method of "query refinement" to select the elements of
>> >>>an ontology in which you're interested: first do a general query,
>> >>>then add a few more qualifying predicates, then add a few more, each
>> >>>time taking a look at the result set and figuring out what to add to
>> >>>prune out the results in which you're not interested. (Please
>> >>>correct the most offensive bits of this crude summary.)
>> >>>
>> >>>In traditional description logic systems, the process of "concept
>> >>>refinement" is most commonly implemented by traversing a concept
>> >>>taxonomy using not just "subclass"-style edges, but rather
>> >>>"direct-subclass" relationships. For example, a taxonomy of
>> >>>"Worker", "White-Collar Worker", and "Accountant" would include both
>> >>>"White-Collar Worker" and "Accountant" as subclasses of "Worker";
>> >>>however, only "White-Collar Worker" would be a direct subclass.
>> >>>
>> >>>The common use pattern would be a user interested in "Worker", so
>> >>>the user asks for the direct subs of Worker and finds that they are
>> >>>"White Collar", "Blue Collar", "Service", and "Military". He can
>> >>>then drill down on whichever of these he wishes, each time getting a
>> >>>fairly small and easily-consumed result set. This is usually much
>> >>>easier to manage than trying to figure out how to refine hundreds,
>> >>>thousands, or millions of results by hand somehow.
>> >>>
>> >>>Is any approach along these lines applicable to your use case?
>> >>
>> >>I totally agree with sub-class refinement as the most common
>> >>narrowing technique.
>> >>
>> >>The use case envisions the query having zero or more parameters. Any
>> >>one of the parameters MAY be a Concept in a taxonomy (or a class in
>> >>an Ontology).
>> >>
>> >>This is implied but not stated in the use case, as I was trying to
>> >>have a minimalistic description that was easy to follow and conveyed
>> >>the core use case.
>> >>
>> >>If you would like to propose a modified version of the use case text,
>> >>send me a draft and we can try to reach closure on the issue before
>> >>the next DAWG meeting, if possible.
>> >>
>> >>
>> >>--
>> >>Regards,
>> >>Farrukh
>> >
>> >--
>> >Professor James Hendler                  http://www.cs.umd.edu/users/hendler
>> >Director, Semantic Web and Agent Technologies   301-405-2696
>> >Maryland Information and Network Dynamics Lab.  301-405-6707 (Fax)
>> >Univ of Maryland, College Park, MD 20742
>>
>> --
>> Professor James Hendler                  http://www.cs.umd.edu/users/hendler
>> Director, Semantic Web and Agent Technologies   301-405-2696
>> Maryland Information and Network Dynamics Lab.  301-405-6707 (Fax)
>> Univ of Maryland, College Park, MD 20742
>>

--
Professor James Hendler                  http://www.cs.umd.edu/users/hendler
Director, Semantic Web and Agent Technologies   301-405-2696
Maryland Information and Network Dynamics Lab.  301-405-6707 (Fax)
Univ of Maryland, College Park, MD 20742
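To make the "direct subclass" distinction in Rob's Worker example above
concrete, here is a small self-contained Python sketch. The class names and
data are invented for illustration; it treats the subclass relation as already
materialized (the way a triple store or reasoner might hand it back) and
filters out any subclass that has another subclass of the same class sitting
in between.

# Sketch only: data invented to mirror the Worker example in the thread.
# subclass_of maps each class to the set of ALL its strict superclasses
# (i.e. a transitively closed subclass relation).
subclass_of = {
    "White-Collar Worker": {"Worker"},
    "Blue-Collar Worker":  {"Worker"},
    "Accountant":          {"White-Collar Worker", "Worker"},
}

def direct_subclasses(cls):
    """Subclasses of cls with no intermediate subclass in between."""
    subs = {c for c, supers in subclass_of.items() if cls in supers}
    return {
        c for c in subs
        # c is direct unless some other subclass of cls lies between c and cls
        if not any(other != c and other in subclass_of.get(c, set())
                   for other in subs)
    }

print(direct_subclasses("Worker"))
# {'White-Collar Worker', 'Blue-Collar Worker'} -- Accountant is filtered out
# because it is also a subclass of White-Collar Worker.

Drilling down then just means calling direct_subclasses again on whichever
result the user picks, which gives the small, easily-consumed result sets Rob
describes.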
Received on Tuesday, 3 August 2004 18:29:16 UTC