- From: Jos De_Roo <jos.deroo@agfa.com>
- Date: Wed, 4 Aug 2004 01:06:35 +0200
- To: "Jim Hendler" <hendler@cs.umd.edu>
- Cc: "Farrukh Najmi" <Farrukh.Najmi@Sun.COM>, "Jeff Pollock" <Jeff.Pollock@networkinference.com>, "RDF Data Access Working Group" <public-rdf-dawg@w3.org>, public-rdf-dawg-request@w3.org, "Rob Shearer" <Rob.Shearer@networkinference.com>
Jim - I fully agree with that NCI test case and was also
able to get the answer
[[
nci:FGFR3_Gene rdfs:subClassOf
nci:Fibroblast_Growth_Factor_Receptor_Family_Gene.
# Proof found for file://temp/testC.n3 in 15503 steps (41838 steps/sec) using 1 engine
]]
The reasoning was done pretty quickly but there was
an overhead of a few minutes to load and prepare and
some 800 MB of RAM were needed...
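
As a rough illustration of the kind of check involved (this is not the proof engine that produced the output above), the following Python sketch tests whether a subclass triple follows from asserted rdfs:subClassOf edges by walking them transitively. The intermediate class ex:Hypothetical_Intermediate and the two-edge graph are invented for the example and are not taken from the NCI ontology.

```python
from collections import defaultdict, deque

# Toy asserted rdfs:subClassOf edges. The intermediate class is hypothetical;
# the real NCI ontology may state this relation directly or differently.
ASSERTED = [
    ("nci:FGFR3_Gene", "ex:Hypothetical_Intermediate"),
    ("ex:Hypothetical_Intermediate",
     "nci:Fibroblast_Growth_Factor_Receptor_Family_Gene"),
]


def entails_subclass(edges, sub, sup):
    """True if (sub rdfs:subClassOf sup) follows by reflexivity and transitivity."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    seen, queue = {sub}, deque([sub])
    while queue:
        cls = queue.popleft()
        if cls == sup:
            return True
        # Visit each not-yet-seen superclass once (breadth-first).
        for parent in parents[cls] - seen:
            seen.add(parent)
            queue.append(parent)
    return False


print(entails_subclass(
    ASSERTED,
    "nci:FGFR3_Gene",
    "nci:Fibroblast_Growth_Factor_Receptor_Family_Gene",
))
```

A full OWL reasoner does far more than this (class expressions, restrictions, equivalences), which is presumably where the load time and memory go; the sketch only covers the plain subclass chain.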
--
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/
Jim Hendler <hendler@cs.umd.edu>
Sent by: public-rdf-dawg-request@w3.org
04/08/2004 00:28
To: "Rob Shearer" <Rob.Shearer@networkinference.com>, "Jeff Pollock"
<Jeff.Pollock@networkinference.com>, "Farrukh Najmi"
<Farrukh.Najmi@Sun.COM>
cc: "RDF Data Access Working Group" <public-rdf-dawg@w3.org>
Subject: RE: ebXML Registry UC (Was Re: Agenda: RDF Data Access 27 Jul 2004)
I think I answered this one before you sent it -- see my earlier
posting of a use case in which all the necessary information would be
in the triple store, where the queries are semantically well-defined
(but little or no entailment reasoning is required) and where no
existing mechanism I can find suffices for an immediate need by a
major OWL user. I'm not being academic and pedantic here - I'm
trying to satisfy a real need by my organization's backers - which is
why I participate in this WG, same as you. I'm not badmouthing your
approach, just saying there is other necessary work that I would like
the DAWG to consider as it is needed for what I do.
-JH
At 14:58 -0700 8/3/04, Rob Shearer wrote:
>Frankly, I'm not sure this is a very productive discussion, but I feel
>obliged to weigh in considering the frequency with which my name has
>been raised here.
>
>I certainly never intended to imply that description logics, or even
>OWL, should be the only "methodology" with which RDF applications should
>be viewed. I absolutely see a great deal of value in the use of "pure"
>RDF, with no additional inferencing layer.
>
>The point I have tried to raise, in my discussions with Jim in
>particular, is that it is very dangerous to take an ad-hoc approach to
>semantics and reasoning. It is perfectly valid to consider looking for
>"domain" or "range" triples in an RDF file (which happens to contain OWL
>assertions). It even makes some sense to ask whether such a triple is
>entailed by an OWL ontology (although all our experience at Network
>Inference suggests that simple entailment is an incredibly inconvenient
>query interface). It is definitely weird, however, to ask the general
>question "what is the domain of this property" and have no clear
>semantics for what constitutes a correct answer. As I pointed out to Jim
>on a recent teleconference, it is quite straightforward for a property
>to have a well-defined domain or range without any domain or range
>triples appearing in the OWL/RDF file.
>
>My general concern is that any query language we come up with should
>have a formal model from which one can deduce the "correct" response. If
>you consider this kind of mathematical rigour to be "the DL
>methodology", then so be it, but it clearly is not specific to
>description logics. If the results to queries are determined by ad-hoc
>implementations and unspecified semantics, then I don't think we are any
>closer to interoperable exchange of semantic data than we ever were.
>
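
One way the situation described above can arise, where a property has a well-defined domain although no domain triple is stated for it, is through a superproperty: under OWL semantics, if p is a subproperty of q and q has domain C, then every subject of p is also in C. The Python sketch below contrasts a plain triple lookup with a lookup that follows rdfs:subPropertyOf. The ex: names are hypothetical, and this covers only this one source of implicit domains, not OWL entailment in general.

```python
# Hypothetical triples: only the superproperty has an asserted domain.
TRIPLES = {
    ("ex:manages", "rdfs:subPropertyOf", "ex:supervises"),
    ("ex:supervises", "rdfs:domain", "ex:Worker"),
}


def asserted_domains(triples, prop):
    """Domains stated directly on prop (what a plain triple lookup finds)."""
    return {o for s, p, o in triples if s == prop and p == "rdfs:domain"}


def inherited_domains(triples, prop):
    """Domains of prop plus those of every superproperty reachable from it."""
    seen, stack, result = set(), [prop], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        result |= asserted_domains(triples, current)
        stack.extend(
            o for s, p, o in triples
            if s == current and p == "rdfs:subPropertyOf"
        )
    return result


print(asserted_domains(TRIPLES, "ex:manages"))   # the plain lookup
print(inherited_domains(TRIPLES, "ex:manages"))  # following subPropertyOf
```

The two calls give different answers to "what is the domain of ex:manages", which is the ambiguity at issue: a query language needs to say which of these it means.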
>> -----Original Message-----
>> From: Jim Hendler [mailto:hendler@cs.umd.edu]
>> Sent: Tuesday, August 03, 2004 2:11 PM
>> To: Jeff Pollock; Farrukh Najmi; Rob Shearer
>> Cc: RDF Data Access Working Group
>> Subject: RE: ebXML Registry UC (Was Re: Agenda: RDF Data
>> Access 27 Jul 2004)
>>
>> Jeff - as tempted as I am to send a long reply (especially to
>> your second sentence below which is simply fallacious - there
>> are many FOL subsets that can produce guarantees - DL is
>> arguably the maximal such) - let me be clear why I care about
>> this WITH RESPECT TO THE WORK OF THE DAWG.
>>
>> Consider the following example - I'd like to know whether the
>> National Cancer Institute's Cancer Ontology (available in OWL,
>> see [1]) states that the FGFR3 Gene is one that promotes
>> Fibroblast growth. That is, I'm looking to see if the triple
>> nci:FGFR3_Gene rdfs:subClassOf
>> nci:Fibroblast_Growth_Factor_Receptor_Family_Gene
>> (where ..._Gene is a class) is in the ontology.
>> One way I can do this is to do an HTTP GET of
>> nci:FGFR3_Gene and then look at the definition there (and
>> hope the tool used put these triples in a standard class
>> definition). Another thing I can do is to get that document,
>> serialize it into a tool (such as the ones your company
>> creates) and use some sort of deduction to test to see if the
>> above is entailed - that seems preferable. However, there
>> is a problem -- the NCI ontology is a document that is about
>> 25M and contains about 300,000 triples or so -- so the
>> download and serialization takes a long time.
>> One thing we are exploring in our research is to serialize
>> the ontology into a triple store (Tucana will be happy to
>> hear we're using Kowari) and make it available on our web
>> server. Queries, coming in the eventual DAWG language using
>> the eventual DAWG protocol, could provide the capability to
>> answer many questions about this ontology (for example the
>> direct subclass relation needed for the above query) in
>> extremely fast times. So we are exploring an alternate
>> mechanism that looks like it will be very useful in practice
>> and is of great interest to at least one major OWL supporter
>> (the NCI).
>> Now, I'm not arguing I would never use or prefer a reasoner,
>> or that it wouldn't be possible to build persistent stores
>> that allowed Cerebra or other such product to be used to
>> answer these questions -- but it is my contention that many
>> simple queries (and if the one above is too complex, how
>> about if I want to simply know the directly asserted synonym
>> list - an annotation property so no inference needed or
>> allowed - for FGFR3_Gene) could be done using DAWG queries,
>> and this would be of value in certain applications (but not
>> all, and high end things would unquestionably need a more
>> complex inferencer like yours)
>> So my problem is that I don't want us to preclude a valuable
>> use of RDF query because it is not the way some companies
>> would prefer us to interact with OWL ontologies. I thought
>> that my use case (2.11, which eventually only got in in a
>> very watered down form due to Rob's objections), Farrukh's
>> suggestion, and the continuing argument over 4.6 vs. 4.6a all
>> relate to this issue, so I was re-raising it in this context
>> to remind people that there are real users and use cases for
>> exploring the use of RDF queries to access RDF graphs
>> representing OWL ontologies.
>> -JH
>>
>> [1] http://www.mindswap.org/2003/CancerOntology/
>>
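
The lookup-only style of query described above (directly asserted triples, no inference) can be sketched in Python as a small in-memory index. This is neither the eventual DAWG language nor Kowari's API; TinyTripleStore, the ex:synonym property, and the placeholder synonym values are all invented for illustration, and the sketch assumes, as the message suggests, that the subclass triple is asserted directly.

```python
from collections import defaultdict


class TinyTripleStore:
    """In-memory store indexed by (subject, predicate); asserted triples only."""

    def __init__(self):
        self._by_sp = defaultdict(set)

    def add(self, s, p, o):
        self._by_sp[(s, p)].add(o)

    def objects(self, s, p):
        """Return the directly asserted objects for (s, p); no inference."""
        return sorted(self._by_sp.get((s, p), set()))


store = TinyTripleStore()
store.add("nci:FGFR3_Gene", "rdfs:subClassOf",
          "nci:Fibroblast_Growth_Factor_Receptor_Family_Gene")
# Placeholder annotation property and values, invented for the example.
store.add("nci:FGFR3_Gene", "ex:synonym", "placeholder synonym A")
store.add("nci:FGFR3_Gene", "ex:synonym", "placeholder synonym B")

print(store.objects("nci:FGFR3_Gene", "rdfs:subClassOf"))
print(store.objects("nci:FGFR3_Gene", "ex:synonym"))
```

Each query is a single dictionary lookup once the ontology is loaded, which is why this kind of access can be fast even when downloading and classifying the whole document is not.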
>>
>> At 12:39 -0700 8/3/04, Jeff Pollock wrote:
>> >Jim-
>> >
>> >I'm getting tired of reminding RDF people about why DL's are such an
>> >important part of the tech stack. ;-) Without it, there is no
>> >standardized or reliable inference capability that can guarantee same
>> >answers across different reasoner implementations. UDDI 3.0, OWL-S and
>> >many Bio/Pharma ontologies among others have chosen the DL based
>> >approach for good reasons. No one will argue that a DL view of things
>> >requires a conceptual shift, or that there are indeed technical
>> >limitations with what may be modeled. But in many cases the advantages
>> >outweigh the disadvantages.
>> >
>> >Regarding the RegRep, the SCM team has not yet debated the different
>> >levels of OWL support. I, for one, think that DL is a reasonable
>> >alternative to seriously consider. Depending on the level of RegRep
>> >specification, it may be a needed requirement. For example, if the
>> >RegRep simply exposes an OWL model as the interface to the repository -
>> >leaving it to vendors to implement their own query support - then
>> >restricting the interface to DL would enable an assured consistency in
>> >"inference at query" results across vendor implementations. Otherwise,
>> >different proprietary chaining algorithms could conceivably turn up
>> >different results from different vendors - causing chaos in a
>> >distributed DNS-like architecture.
>> >
>> >I know you saw my prezo at the '04 W3C AC Rep mtg, but here's a reminder
>> >of what I was saying regarding why DL's matter:
>> >
>> >* Consistency - query results, across vendor implementations and
>> >instances, should be consistent
>> >* Performance - Although performance metrics depend on model constructs,
>> >OWL-DL supports highly optimized inference algorithms
>> >* Predictable - semantics are mathematically decidable within the model,
>> >reasoning is finite
>> >* Foundational - provides a baseline inside applications for layered
>> >semantic models
>> >* Reliability - if the answer to a query is implied by any of the model
>> >data, it will be found - guaranteed.
>> >
>> >Lest people be fearful of DL's, which could happen if your points are
>> >taken out of context, I simply wanted to say that there are indeed good
>> >reasons why they exist.
>> >
>> >Also, for the benefit of stating what should be obvious - Network
>> >Inference embraces and supports ALL of the semantic web stack - RDF,
>> >OWL-Lite, OWL-Full, and OWL-DL. Like you, we think that there are
>> >appropriate times to leverage all aspects of the spec.
>> >
>> >Time for me to get off the soapbox!
>> >
>> >Best Regards,
>> >
>> >-Jeff-
>> >
>> >
>> >-----Original Message-----
>> >From: public-rdf-dawg-request@w3.org
>> >[mailto:public-rdf-dawg-request@w3.org] On Behalf Of Jim Hendler
>> >Sent: Tuesday, August 03, 2004 9:56 AM
>> >To: Farrukh Najmi; Rob Shearer
>> >Cc: RDF Data Access Working Group
>> >Subject: Re: ebXML Registry UC (Was Re: Agenda: RDF Data Access 27 Jul 2004)
>> >
>> >
>> >Farrukh, thanks for your response to Rob - I've gotten tired of
>> >reminding him and others that the DL methodology is only one of the
>> >ways OWL can be used (and in practice, it's not even the most common
>> >- most OWL out there falls in Full, not DL) - it also has the problem
>> >it is not yet scalable to some of the largest Lite/DL ontologies out
>> >there, and these are precisely the ones I want to access via query
>> >instead of "document" (since the documents can get huge and take
>> >long hours to download, parse and classify). Tools that will admit
>> >to the reality of the world out there and help people process it will
>> >be quite welcome
>> > -JH
>> >p.s. Note, this is nothing against using DL when appropriate, as in
>> >many of NI's business uses, but just to make it clear that DL is only
>> >one of many ways OWL is being used, and it CANNOT be the defining
>> >restriction for all use cases and applicability ... oops, I'm
>> >starting to get passionate and use uppercase - I'll stop now...
>> >
>> >
>> >At 10:26 -0400 8/3/04, Farrukh Najmi wrote:
>> >>Rob Shearer wrote:
>> >>
>> >>>Greetings, Farrukh!
>> >>>
>> >>>Apologies for not initiating contact myself.
>> >>>
>> >>>Your use case came up at the face-to-face, and I was curious whether
>> >>>there were alternative ways to achieve the results you were trying to
>> >>>get.
>> >>>
>> >>>You suggest a method of "query refinement" to select the elements of an
>> >>>ontology in which you're interested: first do a general query, then add
>> >>>a few more qualifying predicates, then add a few more, each time taking
>> >>>a look at the result set and figuring out what to add to prune out the
>> >>>results in which you're not interested. (Please correct the most
>> >>>offensive bits of this crude summary.)
>> >>>
>> >>>In traditional description logics systems, the process of "concept
>> >>>refinement" is most commonly implemented by traversing a concept
>> >>>taxonomy using not just "subclass"-style edges, but rather
>> >>>"direct-subclass" relationships. For example, a taxonomy of "Worker",
>> >>>"White-Collar Worker", and "Accountant" would include both "White-Collar
>> >>>Worker" and "Accountant" as subclasses of "Worker", however only
>> >>>"White-Collar Worker" would be a direct subclass.
>> >>>
>> >>>The common use pattern would be a user interested in "Worker", so the
>> >>>user asks for the direct subs of worker and finds that they are "White
>> >>>Collar", "Blue Collar", "Service", and "Military". He can then drill
>> >>>down on whichever of these he wishes, each time getting a fairly small
>> >>>and easily-consumed result set. This is usually much easier to manage
>> >>>than trying to figure out how to refine hundreds, thousands, or millions
>> >>>of results by hand somehow.
>> >>>
>> >>>Is any approach along these lines applicable to your use case?
>> >>I totally agree with sub-class refinement as the most common narrowing
>> >>technique.
>> >>
>> >>The use case envisions the query to have zero or more parameters. Any one
>> >>of the parameters MAY be a Concept in a taxonomy (or a class in an
>> >>Ontology).
>> >>
>> >>This is implied but not stated in the use case as I was trying to have a
>> >>minimalistic description that was easy to follow and conveyed the core
>> >>use case.
>> >>
>> >>If you would like to propose a modified version to the use case text
>> >>send me a draft and we can try and reach closure on the issue before the
>> >>next DAWG meeting if possible.
>> >>
>> >>
>> >>--
>> >>Regards,
>> >>Farrukh
>> >
>> >--
>> >Professor James Hendler
>> >http://www.cs.umd.edu/users/hendler
>> >Director, Semantic Web and Agent Technologies 301-405-2696
>> >Maryland Information and Network Dynamics Lab. 301-405-6707 (Fax)
>> >Univ of Maryland, College Park, MD 20742
>>
>>
>>
Received on Tuesday, 3 August 2004 19:10:11 UTC