Re: OWL Test Results page, built from RDF

On September 7, Sandro Hawke writes:
> 
> 
> Ian Horrocks writes:
> > I don't believe that it is either desirable or sensible for the
> > results to distinguish good/bad incompleteness. Bad incompleteness is
> > unsoundness and can simply be reported as "fail".
> 
> When I'm working on Surnia (based on otter+axioms), I'm trying to turn
> the Incompletes for Positive Entailment Tests and Inconsistency tests
> into Passes (while being very careful to avoid getting any Fails).  I
> have no expectation of making any progress on the Negative Entailment
> Tests or Consistent tests, however.  Is there no point to
> distinguishing between my expectations here?

I don't think so - this is simply a characteristic of your
implementation (and is typical for FO provers).
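
To illustrate why a proof-search system behaves this way, here is a minimal sketch, not Surnia's actual code. It assumes the prover can only either find a proof or give up; the names `Outcome`, `classify` and the test-kind strings are hypothetical labels chosen for this example.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "Pass"
    FAIL = "Fail"
    INCOMPLETE = "Incomplete"


# Test kinds where finding a proof is the correct answer (illustrative labels).
PROOF_EXPECTED = {"PositiveEntailmentTest", "InconsistencyTest"}
# Test kinds where finding a proof would be an unsound answer.
PROOF_UNEXPECTED = {"NegativeEntailmentTest", "ConsistencyTest"}


def classify(test_kind, proof_found):
    """Map a proof-search result onto Pass / Fail / Incomplete.

    proof_found is True if the prover found a proof, and False if it
    gave up (for example, on a timeout) without finding one.
    """
    if test_kind in PROOF_EXPECTED:
        return Outcome.PASS if proof_found else Outcome.INCOMPLETE
    if test_kind in PROOF_UNEXPECTED:
        # Giving up is all a pure proof search can do here; a proof
        # would mean the system is unsound, which is reported as Fail.
        return Outcome.FAIL if proof_found else Outcome.INCOMPLETE
    raise ValueError("unknown test kind: " + test_kind)
```

Under these assumptions, the negative entailment and consistency tests can never come out as Pass, whichever system of this kind is run.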


> I've split the test results page into different sections for the
> different kinds of tests; maybe I'll just produce no column for any
> system which reports no-data on the tests in some section.  Then by
> producing no-data for the tests which a system has no hope of
> passing, it won't even be considered in the running.  Does that make
> sense?

Would it be possible to fill the column with text such as "results not
reported", and would you consider this reasonable?
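
A minimal sketch of that idea, assuming each system's results for a section are held as a mapping from test id to status; the names `NOT_REPORTED` and `column_cells` are hypothetical and not part of the actual results-page generator.

```python
NOT_REPORTED = "results not reported"


def column_cells(test_ids, results):
    """Build one system's column for a section of the results page.

    results maps test id -> status string; any test the system gave
    no data for gets the placeholder text instead of an empty cell.
    """
    return [results.get(test_id, NOT_REPORTED) for test_id in test_ids]
```

A system reporting nothing for a section would then still get a column, with every cell showing the placeholder.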

> 
> Another issue is whether it's fair to say Surnia passes a test when it
> only does so with manual (test-specific) guidance to finding a proof.
> That guidance only makes it complete sooner, so it's a
> Would-Pass-if-given-enough-computing-resources.  I'd like to call that
> a "Pass (_note_)", (where the note is a link to an explanation); does
> that seem fair?  By CADE/CASC/TPTP standards, that's not a Pass, but
> they might be after something different.   

I don't think we can call that a pass (unless you plan on providing a
clone of yourself with every installation of Surnia :-)). I would
suggest doing it the other way around - mark it as incomplete with a
footnote stating that some manual guidance (which only serves to
improve performance) allows the test to be passed.
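
As a sketch of how such a result could be recorded (the `ReportedResult` type, the `report` function and the note wording are assumptions made for illustration only):

```python
from dataclasses import dataclass
from typing import Optional

GUIDANCE_NOTE = (
    "Passes when given manual, test-specific guidance, "
    "which only serves to improve performance."
)


@dataclass
class ReportedResult:
    status: str
    note: Optional[str] = None


def report(unaided_status, passes_with_guidance):
    """Report the unaided status, adding a footnote if guidance helps.

    The status shown is always what the system achieves on its own;
    a guided pass never upgrades it, it only attaches the footnote.
    """
    if unaided_status == "Incomplete" and passes_with_guidance:
        return ReportedResult("Incomplete", GUIDANCE_NOTE)
    return ReportedResult(unaided_status)
```

The headline status stays comparable across systems, and the footnote carries the extra information about guided runs.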

Regards, 

Ian


> 
>    -- sandro

Received on Thursday, 11 September 2003 06:51:25 UTC