passing negative entailment tests

I was surprised to read in Test:

   Note that while ideally the RIF consumer would be able to
   conclusively demonstrate that the conclusion cannot be drawn from the
   premises, in practice a failure to draw the conclusion after a
   thorough attempt to do so can be considered a successful outcome.

Is this based on a WG decision I'm forgetting?  If so, I apologize.

My sense right now is that this isn't okay.  To determine a negative
entailment is hard work; it's not enough to just try and fail to find
the entailment.  For a RIF system, I expect determining a negative
entailment means (1) using an entailment-search algorithm that is known
to be complete, and (2) giving it sufficient resources to run until it
is done.  It's tempting to skimp on either of these, but I think people
who do it right -- who actually give the answer that (modulo coding
bugs) is known to be correct -- deserve better marks.

Maybe in test-results-reporting we can allow for a 'nearly-passed' or
'weak pass', to give some sort of partial credit.  Really, these folks
just got lucky.

In OWL 1, a system was supposed to report this as 'undecided'.  That's
better than failing (deciding, but deciding incorrectly), and probably
better than not reporting any result, but still not as good as a 'pass'.

I still like that solution.
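
To make the distinction concrete, here is a rough sketch of how a test
harness might score a negative entailment test with three outcomes.  The
function name, its parameters, and the enum are all hypothetical; nothing
like this is defined in Test or in the OWL 1 test documents.  It only
illustrates the pass / undecided / fail split described above.

```python
from enum import Enum


class Outcome(Enum):
    PASS = "pass"            # non-entailment conclusively established
    UNDECIDED = "undecided"  # search gave up without an answer
    FAIL = "fail"            # decided, but decided incorrectly


def score_negative_entailment(found_entailment, search_exhausted, algorithm_complete):
    """Score one negative entailment test (hypothetical harness logic)."""
    if found_entailment:
        # The system claims the conclusion follows from the premises;
        # for a negative entailment test that answer is wrong.
        return Outcome.FAIL
    if search_exhausted and algorithm_complete:
        # A procedure known to be complete ran until it was done and
        # found no derivation, so the non-entailment is established.
        return Outcome.PASS
    # The system tried and failed to find the entailment, but it
    # cannot rule the entailment out.
    return Outcome.UNDECIDED
```

Under this scoring, a system that merely times out, or that uses an
incomplete search, gets 'undecided' rather than 'pass'.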

   -- Sandro

Received on Friday, 25 September 2009 01:57:46 UTC