- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Thu, 28 Jun 2007 10:44:43 +0100
- To: axel@polleres.net
- Cc: Gary Hallmark <gary.hallmark@oracle.com>, Sandro Hawke <sandro@w3.org>, Dave Reynolds <der@hplb.hpl.hp.com>, public-rif-wg@w3.org
On 28 Jun 2007, at 10:07, Axel Polleres wrote:

> Bijan Parsia wrote:
>> On Jun 27, 2007, at 11:56 AM, Axel Polleres wrote:
>> [snip]
>>> a) ignoring X will lead to sound inferences only, but inferences
>>> might be incomplete
>>> b) ignoring Y will preserve completeness, but unsound inferences
>>> might arise
>>> c) ignoring Z will neither preserve soundness nor completeness
>>>
>>> etc.
>>> while the latter two are probably pointless,
>> [snip]
>> Since I don't know exactly what's being ignored, my conversations
>> with various users, esp. those working on biology, certainly
>> suggest that B is to be preferred to A (i.e., they *really* don't
>> want to miss answers, and in their case it's pretty easy to check
>> for spuriousness, and, typically, the likelihood of false
>> positives is low).
>
> That depends on the use case.

Obviously. I was pointing to the general characteristics based on user feedback. This suggests that B isn't *probably* (provably?) pointless.

> If you have a use case for B, fair enough.
> I just couldn't think of one, whereas I can think of various use
> cases for A.

Sure, hence my pointing out that there are several. Search engines in general seem to provide plenty of use cases. But consider:

- Sales leads... completeness might be better than soundness
- Fraud detection... completeness might be better than soundness
- Any sort of threat detection, as long as the false positives aren't that bad, e.g., diagnosis
- Just about anything that you might verify afterwards

Also consider:

- Robot navigation... some answer fast is better than no answer, as you will have lots of correction
- Stock purchasing... some answer fast may be better than a perfect answer too late; there they will quantify the tradeoff better
- Exploring known dirty data (all data is dirty); there are arguments for all three, plus full soundness and completeness, depending on the specifics

Etc. etc. etc.

>> Similarly, if I just need *an* answer (not every answer) but I
>> need it quickly, C could be fine as long as 1) the probability of
>> some correct answer, if there is one, being in the answer set is
>> high
>
> the higher this probability, the closer you get to A ;-)

Not really. Think search engines. You may be willing to take a bit of noise and missed answers so long as *a good enough* answer appears and is discernible by the person (i.e., you don't need the engine to filter out all the spurious answers).

> but since I didn't think about probabilistic extensions here yet...

I'm not talking about probabilistic extensions. I'm talking about how I, as a user, assess the utility of a proof procedure.

> I think then you'd rather need a fallback like:
>
> c') ignoring Z will neither preserve soundness nor completeness,
> but will preserve soundness with probability greater than <threshold>

Well, I can trivially meet a) in many cases by ignoring *EVERYTHING* and making no inferences at all. Clearly that's not so useful either.

> anyway, if this threshold can't be named, I don't see good use cases.

Total fallacy. Just because I can't measure exactly, or even roughly with precision, doesn't mean I can't make reasonable assessments.

Why are these use cases bad? I'll note that you didn't provide even the level of detail about your use cases that I did. Your actual example is entirely nominal. (Of course, that's fine, because it's pretty easy to see your point.) I'm unclear about why you are so dismissive of mine, esp. by appealing to a standard which you hadn't met or established in this conversation yet. (And a standard which probably isn't needed at this stage of the game.)

>> and 2) the answer set isn't too big and 3) I can check for
>> correctness well enough or the consequence of the occasional
>> wrong answer is low.
>
> ... as before: if "well enough" can't be quantified, I feel a bit
> uneasy here.

It's no worse than with A, really. What if the missing answers are critical? What if the *data* are bad, so many of the sound answers are actually bad as well, and thus you need all of them (or maybe some that aren't sound)? Specific analysis plus testing is the usual way. And testing you have to do anyway, because of bugs and bad data.

>> And of course, if my overall probability of error due to
>> (inherent) unsoundness or incompleteness, plus the chance of a bug
>> in my reasoner, is much less than the chance of a bug in an
>> inherently sound and complete reasoner, well, that's a reason to
>> prefer the former.
>>
>> I imagine that life is easier all around if the ignoring is
>> standardized. It's probably a bit easier to explain to users that
>> the system "ignores these bits and reasons with the rest" than to
>> explain how some particular approximation technique works in
>> other terms. Oh, and clearly A is the easiest to explain because
>> of examples like yours. It's also easier to get interoperability,
>> since you can require soundness and completeness for the pruned
>> document.
>
> I think I agree, though I am admittedly not quite sure what is the
> concrete point you want to make here? :-)

The concrete point is that soundness, completeness, and decidability are useful metalogical properties of a system, esp. for specification, analysis of systems, and interoperability. But there are good cases for departing from all of them. The problem is that if you do this in the engine, then specification, analysis, interop, and explaining to users all get harder. If you do it in the document, i.e., generate *documents* which are approximations of the original, then run a sound and complete engine on the approximations, users find that easier to understand overall, I believe.

Cheers, Bijan.
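P.S. Since the "approximate the document, not the engine" point is the crux, here is a rough sketch of the pipeline I have in mind, in Python. The rule format, the construct tags, and the function names are all made up for the illustration (this is not RIF syntax or any real engine's API): prune the document down to the constructs the engine handles, then run an ordinary sound and complete engine over the pruned document, so that what comes out is sound, though possibly incomplete, with respect to the original.

```python
# Toy illustration only: "prune the document, then run an ordinary sound
# and complete engine on the pruned document". The rule format, the
# construct tags, and the function names are invented for this sketch.

def approximate_document(rules, supported):
    """Option a): drop every rule that uses an unsupported construct.

    In this monotonic toy language, dropping rules can only remove
    derivations, never add them, so anything derived from the pruned
    document is sound with respect to the original, though possibly
    incomplete.
    """
    return [r for r in rules if r[2] in supported]


def saturate(rules):
    """A tiny sound-and-complete engine for the toy language: naive
    forward chaining to a fixpoint over propositional Horn rules."""
    facts = set()
    changed = True
    while changed:
        changed = False
        for body, head, _construct in rules:
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts


if __name__ == "__main__":
    # Each rule is (body, head, construct-tag); facts have an empty body.
    document = [
        (frozenset(), "p", "horn"),          # fact: p
        (frozenset({"p"}), "q", "horn"),     # p -> q
        (frozenset({"p"}), "r", "builtin"),  # uses a construct the engine doesn't support
        (frozenset({"r"}), "s", "horn"),     # r -> s
    ]
    pruned = approximate_document(document, supported={"horn"})
    print(sorted(saturate(pruned)))  # ['p', 'q'] -- sound, but misses r and s
```

Running it prints ['p', 'q']: the pruned document never derives r or s, but everything it does derive is a consequence of the original, and the engine itself stays a plain sound-and-complete one, which is exactly what makes the approximation easy to specify and to explain.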
Received on Thursday, 28 June 2007 09:43:54 UTC