Re: Manual Rewriting and Passing Entailments

On September 11, Jim Hendler writes:
> 
> At 10:48 AM +0300 9/11/03, Jeremy Carroll wrote:
> >Summary:
> >Do systems need a fully automated test harness to pass a test?
> >
> >
> >
> >I was chatting with Dave Reynolds about what is expected to pass an
> >entailment test.
> >
> >The tests are expressed as
> >
> >Graph1 entails Graph2
> >
> >In practice many APIs (including ours) do not directly support such an
> >operation.
> >
> >Hence Dave automatically transforms Graph2 into a query which he can then
> >execute against Graph1, and pass the test.
> >
> >That looks fine to me.
> >
> >For some of the tests, he has a more complex query rewrite that he does
> >manually, and then passes the test. I am discouraging him from reporting such
> >tests as passed. (These reflect the lack of support for the comprehension
> >axioms - the query rewrite essentially compensates for this).
> >
> >===
> >
> >What are other people doing? How much manual and/or automatic rewrite do
> >people do?
> >
> >Jeremy
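
To make the rewrite Jeremy describes concrete: in today's terms it is
a few lines of plumbing. A hypothetical sketch in Python with rdflib
and SPARQL, not Dave's actual code: each blank node in the conclusion
graph becomes a query variable, and the resulting ASK query is run
against the premise graph.

# Hypothetical sketch; all names here are illustrative.
from rdflib import Graph, BNode

def conclusion_holds(premise: Graph, conclusion: Graph) -> bool:
    variables = {}  # distinct blank node -> distinct query variable

    def to_sparql(term):
        if isinstance(term, BNode):
            return variables.setdefault(term, "?b%d" % len(variables))
        return term.n3()  # URIs and literals serialize as themselves

    # Each triple of the conclusion becomes one pattern of an ASK query.
    patterns = " . ".join(
        " ".join(to_sparql(term) for term in triple) for triple in conclusion
    )
    return bool(premise.query("ASK { %s }" % patterns).askAnswer)

Note that this checks simple entailment only; for the RDF(S)/OWL tests
the premise graph must first be closed under the relevant rules, which
is exactly where the comprehension axioms, and hence Dave's manual
rewrites, come in.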
> 
> We actually discussed this in the past and had some plan, although I
> cannot find it in the records at the moment. My recollection is that
> Dan and others pointed out that implementations often cannot pass the
> tests automatically without some sort of intervention, but that this
> should not count against those systems being considered proof of
> implementation. In other words, something like what Dave does above
> is certainly valuable implementation experience that could be
> reported at CR.  I believe we decided that we would have some sort of
> mechanism to say "passes the test in a different way".
>   We could handle this as follows: ask each reasoner to provide a
> description somewhere of how it passes the tests, including a
> description of anything like the above.  For those that don't do
> exactly what the test document describes (i.e. run the test harness
> automatically for every test or whatever) we could report something
> like PASS* (instead of PASS), with a note at the bottom along the
> lines of "* - see <link> for details of how <system name> passes
> these tests"
>   I'm sure Ian will disagree

You guessed right :-)

I think that a bit of hand-crafted plumbing, such as transforming
entailment tests into satisfiability tests in the obvious way (a
premise entails a conclusion exactly when the premise together with
the negated conclusion is unsatisfiable), is perfectly OK at this
stage; it might be appropriate to indicate this kind of manipulation
in a note of the kind you suggest. It is not, IMHO, reasonable to
report a pass for a test that required human intervention in the
operation of the proof itself.
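
That reduction really is only a few lines of plumbing around whatever
satisfiability interface a reasoner exposes. A hypothetical sketch in
Python; every name here is an illustrative stand-in, not any real
reasoner's API:

# Hypothetical plumbing; the reasoner is abstracted away, and only
# the reduction itself is the point.
from typing import Callable, FrozenSet

Axiom = str  # stand-in for a real axiom type

def negated(axiom: Axiom) -> Axiom:
    # Toy syntactic negation; a real harness would negate a parsed axiom.
    return axiom[1:] if axiom.startswith("~") else "~" + axiom

def entails(satisfiable: Callable[[FrozenSet[Axiom]], bool],
            premise: FrozenSet[Axiom], conclusion: Axiom) -> bool:
    # premise |= conclusion  iff  premise + {~conclusion} is unsatisfiable
    return not satisfiable(premise | {negated(conclusion)})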

> , but I again think our tests are there to 
> help implementors do better

As an implementor, I find tests that I can (easily) pass to be of
little value; tests that I can't pass, in particular small tests, are
of great value.

Ian

> and to give people more ideas about 
> different ways to build OWL tools rather than to be an exam that is 
> intended to be hard to pass.  We should absolutely strive to be as 
> clear as possible as to how the different systems perform and what 
> their capabilities are, but we should not be setting up the 
> expectation that the only way to use OWL is to be able to run our 
> test harness exactly as our implementations do.
>   -JH
> 
> -- 
> Professor James Hendler				  hendler@cs.umd.edu
> Director, Semantic Web and Agent Technologies	  301-405-2696
> Maryland Information and Network Dynamics Lab.	  301-405-6707 (Fax)
> Univ of Maryland, College Park, MD 20742	  *** 240-277-3388 (Cell)
> http://www.cs.umd.edu/users/hendler      *** NOTE CHANGED CELL NUMBER ***

Received on Monday, 15 September 2003 12:31:19 UTC