- From: Axel Polleres <axel.polleres@deri.org>
- Date: Thu, 28 Jun 2007 12:14:16 +0100
- To: Bijan Parsia <bparsia@cs.man.ac.uk>
- Cc: Gary Hallmark <gary.hallmark@oracle.com>, Sandro Hawke <sandro@w3.org>, Dave Reynolds <der@hplb.hpl.hp.com>, public-rif-wg@w3.org
Bijan Parsia wrote:
> On 28 Jun 2007, at 10:07, Axel Polleres wrote:
>>
>> Bijan Parsia wrote:
>>
>>> On Jun 27, 2007, at 11:56 AM, Axel Polleres wrote:
>>> [snip]
>>>
>>>> a) ignoring X will lead to sound inferences only, but inferences
>>>>    might be incomplete
>>>> b) ignoring Y will preserve completeness, but unsound inferences
>>>>    might arise
>>>> c) ignoring Z will preserve neither soundness nor completeness
>>>>
>>>> etc.
>>>> while the latter two are probably pointless,
>>>
>>> [snip]
>>> Since I don't know exactly what's being ignored, my conversations
>>> with various users, esp. those working on biology, certainly
>>> suggest that B is to be preferred to A (i.e., they *really* don't
>>> want to miss answers, and in their case it's pretty easy to check
>>> for spuriousness, and, typically, the likelihood of false positives
>>> is low).
>>
>> That depends on the use case.
>
> Obviously. I was pointing to the general characteristics based on user
> feedback. This suggests that B isn't *probably* (provably?) pointless.
>
>> If you have a use case for B, fair enough.
>> I just couldn't think of one, whereas I can think of various use
>> cases for A.
>
> Sure, hence my pointing out that there are several.
>
> Search engines in general seem to provide plenty of use cases. But
> consider:
>   Sales leads... completeness might be better than soundness
>   Fraud detection... completeness might be better than soundness
>   Any sort of threat detection, as long as the false positives
>     aren't that bad, e.g., diagnosis
>   Just about anything that you might verify afterwards.
> Also consider:
>   Robot navigation... some answer fast is better than no answer, as
>     you will have lots of correction
>   Stock purchasing... some answer fast may be better than a perfect
>     answer too late
>     There they will quantify better
>   Exploring known dirty data (all data is dirty)
>     There are arguments for all three plus s&c here, depending on
>     the specifics.
>   Etc. etc. etc.
>>> Similarly, if I just need *an* answer (not every answer) but I need
>>> it quickly, C could be fine as long as 1) the probability of some
>>> correct answer, if there is one, being in the answer set is high
>>
>> the higher this probability, the closer you get to A ;-)
>
> Not really. Think search engines. You may be willing to take a bit of
> noise and missed answers so long as *a good enough* answer appears and
> is discernible by the person (i.e., you don't need the engine to
> filter out all the spurious answers).
>
>> but since I didn't think about probabilistic extensions here yet...
>
> I'm not talking probabilistic extensions. I'm talking about how I, as
> a user, assess the utility of a proof procedure.
>
>> I think then you'd rather need a fallback like:
>>
>> c') ignoring Z will preserve neither soundness nor completeness,
>>     but will preserve soundness with probability greater than
>>     <threshold>
>
> Well, I can trivially meet A in many cases by ignoring *EVERYTHING*
> and making no inferences at all. Clearly that's not so useful either.

*ggg* I was waiting for that one. The same goes for B when asserting
*EVERYTHING*, obviously... :-))) We can agree that there are use cases
for both A and B, and that there are trivial corner cases for both.

>> anyway, if this threshold can't be named, I don't see good use cases.
>
> Total fallacy. Just because I can't measure "exactly" or even roughly
> with precision doesn't mean I can't make reasonable assessments. Why
> are these use cases bad?

I didn't say that the use cases are bad. I am simply worried about what
something like *good enough* means. If it is not defined (even if only
informally in a description, let's say, which should be the minimal
requirement here), it is hard to use. Obviously, even a given
"probability" is hard to assess, but the rationale behind why possibly
unsound answers are good enough should be given, and ideally also why
unsound answers do not have serious impact.
As I said, this could be in a description, or by referring to some
document which explains the limitations or whatever; it doesn't need to
be formally checkable.

> I'll note that you didn't provide even the level of detail about your
> use cases that I did. Your actual example is entirely nominal. (Of
> course, that's fine because it's pretty easy to see your point.) I'm
> unclear about why you are so dismissive of mine,

I am not at all dismissive! Actually, I appreciate that there are use
cases for B (still unsure about C); I just want to clarify things.

> esp. by appealing to a standard which you hadn't met or established
> in this conversation yet.
> (And a standard which probably isn't needed at this stage of the
> game.)

>>> and 2) the answer set isn't too big and 3) I can check for
>>> correctness well enough or the consequence of the occasional wrong
>>> answer is low.
>>
>> ... as before: if "well enough" can't be quantified, I feel a bit
>> uneasy here.

Let me retract "quantified" here in favour of "described", in the sense
of the above.

> It's no worse than with A, really. What if the missing answers are
> critical? What if the *data* are bad, so many of the sound answers are
> actually bad as well, thus you need all of them (or maybe some that
> aren't sound)?

Yes, it might be a good idea to also specify that for A in some way:
which are the inferences that you would lose. (For my admittedly simple
example, this would be that you possibly lose inferences of rules with
negation, and upwards in the dependency graph.)

> Specific analysis plus testing is the usual way. And testing you have
> to do because of bugs and bad data anyway.
>
>>> And of course, if my overall probability of error due to (inherent)
>>> unsoundness or incompleteness plus the chance of a bug in my
>>> reasoner is much less than the chance of a bug in an inherently
>>> sound and complete reasoner, well, that's a reason to prefer the
>>> former.
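Axel's negation example above can be made concrete. The following is a
minimal, hypothetical sketch (not from the thread) of option (a) in a
toy forward-chaining setting: dropping the one rule that uses
negation-as-failure leaves every derived fact sound, but some
consequences of the full program are lost. All predicate names
(`edge`, `path`, `unreachable`) are illustrative only.

```python
# Hypothetical illustration of option (a): ignoring a rule with
# negation-as-failure (NAF) keeps inference sound but incomplete.

facts = {("edge", "a", "b"), ("edge", "b", "c")}

def positive_closure(facts):
    """Forward-chain the positive rules only:
    path(X,Y) <- edge(X,Y); path(X,Z) <- path(X,Y), edge(Y,Z)."""
    derived = set(facts)
    changed = True
    while changed:
        new = set()
        # path(X,Y) <- edge(X,Y)
        for f in derived:
            if f[0] == "edge":
                new.add(("path", f[1], f[2]))
        # path(X,Z) <- path(X,Y), edge(Y,Z)
        for p in derived:
            if p[0] == "path":
                for e in derived:
                    if e[0] == "edge" and e[1] == p[2]:
                        new.add(("path", p[1], e[2]))
        changed = not new <= derived
        derived |= new
    return derived

def full_closure(facts):
    """Additionally apply the stratified NAF rule:
    unreachable(X,Y) <- node(X), node(Y), not path(X,Y)."""
    derived = positive_closure(facts)
    nodes = {n for f in facts for n in f[1:]}
    for x in nodes:
        for y in nodes:
            if ("path", x, y) not in derived:
                derived.add(("unreachable", x, y))
    return derived

approx = positive_closure(facts)  # NAF rule ignored: sound, incomplete
full = full_closure(facts)        # intended semantics
assert approx < full              # every approximate answer is sound,
                                  # but some answers are missing
```

Every fact in `approx` also holds under the full semantics (soundness),
while all the `unreachable` facts are missing (incompleteness); this is
exactly the trade-off a descriptive annotation for case (a) would have
to spell out.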
>>> I imagine that life is easier all around if the ignoring is
>>> standardized. It's probably a bit easier to explain to users that
>>> the system "ignores these bits and reasons with the rest" than to
>>> explain how some particular approximation technique works in other
>>> terms. Oh, and clearly A is the easiest to explain because of
>>> examples like yours. It's also easier to get interoperability since
>>> you can require soundness and completeness for the pruned document.
>>
>> I think I agree, though I am admittedly not quite sure what is the
>> concrete point you want to make here? :-)
>
> The concrete point is that soundness, completeness, and decidability
> are useful metalogical properties of a system, esp. for specification,
> analysis of systems, and interoperability. But there are good cases
> for departing from all of them.

Great!

> The problem is that if you do this in the engine, then specification,
> analysis, and interop, plus explaining to users, gets harder. If you
> do it in the document, i.e., generate *documents* which are
> approximations of the original, then run a s&c engine on the
> approximations, users find that easier to understand overall, I
> believe.

Anyway, trying to sum up, it seems that the original point in the
message still holds. It might be helpful, instead of only saying

  If you ignore X then "something bad happens"

to distinguish

  If you ignore X then you lose soundness
  If you ignore X then you lose completeness

and then to allow additional (possibly descriptive) annotations which
say in which cases you lose soundness or completeness, respectively.

Yes?

Axel

p.s.: BTW, incompleteness for a rule set/dialect can in some cases
imply unsoundness for rules or queries you add on top, especially in
the search scenario if you allow negation as failure in search queries;
see [1], where we tried to nail this down a bit with the notion of
"context monotonicity" ... just to put in some self-citation ;-)

1.
Axel Polleres, Cristina Feier, and Andreas Harth. Rules with
contextually scoped negation. In Proceedings of the 3rd European
Semantic Web Conference (ESWC 2006), volume 4011 of Lecture Notes in
Computer Science, Budva, Montenegro, June 2006. Springer.

--
Dr. Axel Polleres
email: axel@polleres.net  url: http://www.polleres.net/
Received on Thursday, 28 June 2007 11:14:33 UTC