- From: Stephen Pollei <stephen_pollei@comcast.net>
- Date: 17 Jan 2004 11:28:24 -0800
- To: pat hayes <phayes@ihmc.us>
- Cc: Michael Kifer <kifer@cs.sunysb.edu>, public-sws-ig@w3.org
- Message-Id: <1074367706.992.91.camel@fury>
On Fri, 2004-01-16 at 14:56, pat hayes wrote:
> It would be more informative and safer still if
> it could give actual statistics, like "over 95%
> of birds in the class fly".

Yes, that's the thought that occurred to me: use some statistically informed reasoning. The only thing is that people are notoriously bad at generating statistics, so it should be more like 96.0 +- 3.99% of birds fly. If it really has major exceptional cases like penguins, baby birds, sick birds, etc., then the bird watchers should collect data on approximate population sizes to this level of detail/distinction. If you have data on species, health of birds, age of birds, etc., then computing this flight statistic should be more precise. Also, "sick" is vague. How sick? Sick with what? How precise an answer do you really care about anyway? Only a nit-picky pedantic twit would say that "over 95% of birds fly" is not an acceptable answer for most people in real life.

It also seems to me that having some rough idea of population sizes can help inform the reasoning engine in other ways. For example, take a modified version of the employee problem given in an earlier email: there are 4.0e9 to 7.0e9 people in the world. Company X employs fewer than 1200 people. I have a list of 1057 of them. Y is a person not on the list; what is the chance he is an employee of X? Pretty slim? Of course, what if you then add that all of the employees live in a town of size 1200-1450 and Y also lives in that town? Still slim, but it gives an example of how the probability can change rapidly given new information. How would you model that he is trained to do the kind of work the company does, and no one else in town is hiring for that kind of work?

Then, from the earlier bird examples, you can be explicit that it's a healthy, adult eagle that's in the wild, etc. Adding information can up your probability for this instance of bird.
Then someone can postulate that the eagle somehow got trapped in a cave and thus can't fly ;->

> People like us are not
> consciously aware of using rules like "over 95%
> of birds in this class fly", and we certainly
> don't say things like that to one another.
> there is lots of evidence that we unconsciously
> do things like make fine-grained numerical
> estimates of relative likelihoods which we are
> quite unable to sense by introspection; and that
> when we do reason, we are using far, far more
> information than we are consciously aware of, or
> ever actually say explicitly to one another in
> natural language.

Yes, and that's why I think people will need tools to help guide them to fill in and make more explicit the things we normally take for granted. I also think that in some places we need people to put in vague answers for some things as well. People also use tons of context, and are flexible in the face of new information. I think that the first entering of data will be tougher, but as we put more stuff in it can get easier. The machines can use consistency checks and heuristics to help guide the later data entry. Of course that's my hope... maybe the exactness will turn people off from dealing with it. Either way, I think it will be a great learning experience.

I also think it really is *relative* likelihoods and not numerical ones, as people can't translate their gut feelings into statistics reliably. Thus if you really wanted to add this to your models, you need vagueness for lots of answers and a methodology requirement for more precise answers. Hmm, I wonder if having a list from "Sun will rise tomorrow" to "pigs will fly in a frozen hell", and then having people place new probabilities within that scale, might be interesting.
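
As a rough illustration of the employee example earlier in this message, the chance that Y works for X can be bracketed rather than pinned down. The sketch below is only one way to set it up, and it rests on assumptions not stated in the message: Y is treated as equally likely to be any person in the pool who is not on the list, "fewer than 1200" is read as at most 1199, and every employee is taken to be inside the pool. The function name `employee_probability_bounds` and its parameters are hypothetical.

```python
def employee_probability_bounds(pool_min, max_employees, listed):
    """Return (lower, upper) bounds on P(Y is an employee of X).

    Assumptions (not from the original message): Y is drawn uniformly
    from the people in the pool who are not on the list, and every
    employee of X belongs to the pool.
    """
    # Employees missing from the list: from 0 (the list is complete)
    # up to max_employees - listed.
    max_unlisted_employees = max_employees - listed

    # People in the pool who are not on the list. The smallest possible
    # pool gives the largest possible probability.
    min_unlisted_pool = pool_min - listed
    if min_unlisted_pool <= 0:
        raise ValueError("pool must be larger than the list")

    upper = min(1.0, max_unlisted_employees / min_unlisted_pool)
    # If the list happens to be complete, Y cannot be an employee.
    lower = 0.0
    return lower, upper


# Pool is the whole world: 4.0e9 people at the low end.
world = employee_probability_bounds(4.0e9, 1199, 1057)

# Pool is the town: 1200 residents at the low end.
town = employee_probability_bounds(1200, 1199, 1057)

print("world bounds:", world)
print("town bounds: ", town)
```

Under these assumptions the upper bound goes from a few parts in a hundred million for the world to just under 1 for the town, while the lower bound stays at zero in both cases. Where the real answer falls in that range depends on how complete the list of 1057 is, which is exactly the kind of extra information the reasoning engine would need.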
Received on Saturday, 17 January 2004 14:32:34 UTC