- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Thu, 28 Jun 2007 10:44:43 +0100
- To: axel@polleres.net
- Cc: Gary Hallmark <gary.hallmark@oracle.com>, Sandro Hawke <sandro@w3.org>, Dave Reynolds <der@hplb.hpl.hp.com>, public-rif-wg@w3.org
On 28 Jun 2007, at 10:07, Axel Polleres wrote:

> Bijan Parsia wrote:
>> On Jun 27, 2007, at 11:56 AM, Axel Polleres wrote:
>> [snip]
>>> a) ignoring X will lead to sound inferences only, but inferences
>>> might be incomplete
>>> b) ignoring Y will preserve completeness, but unsound inferences
>>> might arise
>>> c) ignoring Z will neither preserve soundness nor completeness
>>>
>>> etc.
>>> while the latter two are probably pointless,
>> [snip]
>> Since I don't know exactly what's being ignored, my conversations
>> with various users, esp. those working on biology, certainly
>> suggest that B is to be preferred to A (i.e., they *really* don't
>> want to miss answers, and in their case it's pretty easy to check
>> for spuriousness, and, typically, the likelihood of false
>> positives is low).
>
> That depends on the use case.

Obviously. I was pointing to the general characteristics based on user feedback. This suggests that B isn't *probably* (provably?) pointless.

> If you have a use case for B, fair enough.
> I just couldn't think of one, whereas I can think of various use
> cases for A.

Sure, hence my pointing out that there are several. Search engines in general seem to provide plenty of use cases. But consider:

- Sales leads... completeness might be better than soundness
- Fraud detection... completeness might be better than soundness
- Any sort of threat detection, as long as the false positives aren't that bad, e.g., diagnosis
- Just about anything that you might verify afterwards

Also consider:

- Robot navigation... some answer fast is better than no answer, as you will have lots of correction
- Stock purchasing... some answer fast may be better than a perfect answer too late; there they will quantify the tradeoff better
- Exploring known dirty data (all data is dirty); there are arguments for all three, plus full soundness and completeness, depending on the specifics

Etc. etc. etc.

>> Similarly, if I just need *an* answer (not every answer) but I
>> need it quickly, C could be fine as long as 1) the probability of
>> some correct answer, if there is one, being in the answer set is
>> high
>
> the higher this probability, the closer you get to A ;-)

Not really. Think search engines. You may be willing to take a bit of noise and missed answers so long as *a good enough* answer appears and is discernible by the person (i.e., you don't need the engine to filter out all the spurious answers).

> but since I didn't think about probabilistic extensions here yet...

I'm not talking about probabilistic extensions. I'm talking about how I, as a user, assess the utility of a proof procedure.

> I think then you'd rather need a fallback like:
>
> c') ignoring Z will neither preserve soundness nor completeness,
> but will preserve soundness with probability greater than <threshold>

Well, I can trivially meet a) in many cases by ignoring *EVERYTHING* and making no inferences at all. Clearly that's not so useful either.

> anyway, if this threshold can't be named, I don't see good use cases.

Total fallacy. Just because I can't measure exactly, or even roughly with precision, doesn't mean I can't make reasonable assessments.

Why are these use cases bad? I'll note that you didn't provide even the level of detail about your use cases that I did. Your actual example is entirely nominal. (Of course, that's fine, because it's pretty easy to see your point.) I'm unclear about why you are so dismissive of mine, esp. by appealing to a standard which you hadn't met or established in this conversation yet. (And a standard which probably isn't needed at this stage of the game.)

>> and 2) the answer set isn't too big and 3) I can check for
>> correctness well enough or the consequence of the occasional
>> wrong answer is low.
>
> ... as before: if "well enough" can't be quantified, I feel a bit
> uneasy here.

It's no worse than with A, really. What if the missing answers are critical? What if the *data* are bad, so many of the sound answers are actually bad as well, and thus you need all of them (or maybe some that aren't sound)? Specific analysis plus testing is the usual way. And testing you have to do anyway, because of bugs and bad data.

>> And of course, if my overall probability of error due to
>> (inherent) unsoundness or incompleteness, plus the chance of a bug
>> in my reasoner, is much less than the chance of a bug in an
>> inherently sound and complete reasoner, well, that's a reason to
>> prefer the former.
>>
>> I imagine that life is easier all around if the ignoring is
>> standardized. It's probably a bit easier to explain to users that
>> the system "ignores these bits and reasons with the rest" than to
>> explain how some particular approximation technique works in
>> other terms. Oh, and clearly A is the easiest to explain because
>> of examples like yours. It's also easier to get interoperability,
>> since you can require soundness and completeness for the pruned
>> document.
>
> I think I agree, though I am admittedly not quite sure what is the
> concrete point you want to make here? :-)

The concrete point is that soundness, completeness, and decidability are useful metalogical properties of a system, esp. for specification, analysis of systems, and interoperability. But there are good cases for departing from all of them. The problem is that if you do this in the engine, then specification, analysis, interop, and explaining to users all get harder. If you do it in the document, i.e., generate *documents* which are approximations of the original, then run a sound and complete engine on the approximations, users find that easier to understand overall, I believe.

Cheers, Bijan.
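P.S. Since the "approximate the document, not the engine" point is the crux, here is a rough sketch of the pipeline I have in mind, in Python. The rule format, the construct tags, and the function names are all made up for the illustration (this is not RIF syntax or any real engine's API): prune the document down to the constructs the engine handles, then run an ordinary sound and complete engine over the pruned document, so that what comes out is sound, though possibly incomplete, with respect to the original.

```python
# Toy illustration only: "prune the document, then run an ordinary sound
# and complete engine on the pruned document". The rule format, the
# construct tags, and the function names are invented for this sketch.

def approximate_document(rules, supported):
    """Option a): drop every rule that uses an unsupported construct.

    In this monotonic toy language, dropping rules can only remove
    derivations, never add them, so anything derived from the pruned
    document is sound with respect to the original, though possibly
    incomplete.
    """
    return [r for r in rules if r[2] in supported]


def saturate(rules):
    """A tiny sound-and-complete engine for the toy language: naive
    forward chaining to a fixpoint over propositional Horn rules."""
    facts = set()
    changed = True
    while changed:
        changed = False
        for body, head, _construct in rules:
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts


if __name__ == "__main__":
    # Each rule is (body, head, construct-tag); facts have an empty body.
    document = [
        (frozenset(), "p", "horn"),          # fact: p
        (frozenset({"p"}), "q", "horn"),     # p -> q
        (frozenset({"p"}), "r", "builtin"),  # uses a construct the engine doesn't support
        (frozenset({"r"}), "s", "horn"),     # r -> s
    ]
    pruned = approximate_document(document, supported={"horn"})
    print(sorted(saturate(pruned)))  # ['p', 'q'] -- sound, but misses r and s
```

Running it prints ['p', 'q']: the pruned document never derives r or s, but everything it does derive is a consequence of the original, and the engine itself stays a plain sound-and-complete one, which is exactly what makes the approximation easy to specify and to explain.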
Received on Thursday, 28 June 2007 09:43:54 UTC