- From: David M Bargeron <davemb@microsoft.com>
- Date: Tue, 19 Mar 2002 10:23:49 -0800
- To: "Ka-Ping Yee" <ping@lfw.org>
- Cc: "Jim Ley" <jim@jibbering.com>, <www-annotation@w3.org>
Thanks for your comments, Ka-Ping. I mostly agree with your first assertion -- that we would need to standardize all aspects of our algorithm if it were to be useful as an interoperable standard.

However, I also partly disagree -- our empirical observations from subsequent studies showed that different users choose different anchoring confidence thresholds below which annotations are automatically orphaned by our algorithm. That means that some users care less about precise anchoring than others. So I don't think we could freeze the confidence thresholds -- they depend on user preference, the nature of the material being annotated, the task, and other factors. A natural result of this is that I will get different anchoring results than someone else might. This is acceptable, though, as long as I like my anchoring results more than the other person's ; )

I am not sure I understand your second assertion. The algorithm is deterministic and relies only on human-level content, so it should produce the same results given the same human-level content (and the same confidence threshold settings), regardless of format. What sort of failures do you envision?

If you mean orphaning, then yes, of course there will be orphans. This is not a failure of the algorithm, though; it is in the nature of dynamic document annotation. If I annotate a paragraph in a webpage today, and tomorrow the paragraph simply disappears from the webpage, then what is the anchoring algorithm to do? The final arbiter is always the user, so *any* anchoring algorithm, regardless of how robust it tries to be, must be prepared to ask the user for help when it reaches an impasse such as this.

Dave
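A minimal sketch of the thresholding behavior Dave describes, assuming (purely for illustration) that anchoring boils down to a single match score in [0, 1]. The scoring function, names, and threshold values below are hypothetical, not the actual Microsoft algorithm:

```python
# Hypothetical sketch of user-tunable orphaning, NOT the actual
# anchoring algorithm discussed in this thread. We assume anchoring
# reduces to a confidence score in [0.0, 1.0] for the best candidate.
from difflib import SequenceMatcher

def anchor_or_orphan(anchor_text, page_text, threshold):
    """Return (position, score) if the best match clears the user's
    confidence threshold, or None to orphan the annotation."""
    matcher = SequenceMatcher(None, page_text, anchor_text)
    match = matcher.find_longest_match(0, len(page_text), 0, len(anchor_text))
    # Crude confidence: fraction of the anchor found contiguously.
    score = match.size / len(anchor_text) if anchor_text else 0.0
    if score >= threshold:
        return match.a, score
    return None  # below the user's threshold: orphan, and ask the user

# Two users, same page, different thresholds -> different outcomes,
# which is exactly the user-dependent behavior Dave describes.
page = "The committee approved the revised annotation draft."
anchor = "approved the annotation draft"
print(anchor_or_orphan(anchor, page, threshold=0.9))  # likely orphaned
print(anchor_or_orphan(anchor, page, threshold=0.5))  # likely anchored
```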
-----Original Message-----
From: Ka-Ping Yee [mailto:ping@lfw.org]
Sent: Tuesday, March 19, 2002 5:00 AM
To: David M Bargeron
Cc: Jim Ley; www-annotation@w3.org
Subject: RE: Orphaned annotations

On Mon, 18 Mar 2002, David M Bargeron wrote:
> We have done some work in the area of robust anchoring for annotations
> on web pages. We found that using "human-level" page content as the
> basis for anchoring was more effective than using the internal structure
> of the page.

Bravo. The truth of this last statement should be obvious to everyone. Of course it has to be human-level content; it is humans that these systems are supposed to serve, after all -- or have we forgotten that?

I saw your presentation at CHI 2001, and thought you guys did some pretty cool work with your anchoring algorithm. I was sad that you didn't cite Crit in your paper, but I now see that at least you mentioned it in your tech report. I was impressed by the clever results you got by keeping track of keywords and context. But I have to say I'm not certain that's necessarily a win. With an algorithm like that, there are a few issues:

1. For deployment as an interoperable standard, you have to specify the algorithm and any data it relies on so that all implementations produce exactly the same result. So you would need to standardize all your tuning parameters, scoring weights, thresholds, and so on.

2. It is unreasonable to expect that a user could duplicate all of the steps in your algorithm to predict the outcome. This makes it possible to have failures that no user could prepare for or defend against.

There's still potential for fuzzy matching algorithms as a way of suggesting possibly relevant areas of the target document after an annotation has been orphaned. It would be worth comparing the utility of this to the utility of simply displaying the original anchor with some context.

-- ?!ng
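A minimal sketch of the fuzzy-matching suggestion above: slide a window across the changed document and rank regions by similarity to the orphaned anchor, so the user can pick (or reject) a new home for the annotation. The windowing scheme and names are illustrative assumptions, not a description of any deployed system:

```python
# Sketch of fuzzy re-anchoring suggestions for an orphaned annotation.
# The windowing and scoring here are illustrative assumptions only.
from difflib import SequenceMatcher

def suggest_regions(orphan_anchor, new_text, top_n=3):
    """Rank windows of new_text by similarity to the orphaned anchor;
    the user remains the final arbiter and chooses among suggestions."""
    width = max(len(orphan_anchor), 1)
    step = max(width // 2, 1)  # overlapping windows
    scored = []
    for start in range(0, max(len(new_text) - width, 0) + 1, step):
        window = new_text[start:start + width]
        score = SequenceMatcher(None, orphan_anchor, window).ratio()
        scored.append((score, start, window))
    scored.sort(reverse=True)
    return scored[:top_n]

new_page = ("Revised: the working group endorsed the annotation "
            "draft after minor edits.")
for score, start, text in suggest_regions("approved the annotation draft",
                                          new_page):
    print(f"{score:.2f} @ {start}: {text!r}")
```

Presenting the ranked candidates alongside the original anchor and its saved context would allow a direct comparison of the two options Ping raises.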
Received on Tuesday, 19 March 2002 13:24:22 UTC