
RE: Orphaned annotations

From: David M Bargeron <davemb@microsoft.com>
Date: Mon, 18 Mar 2002 08:50:08 -0800
Message-ID: <0170DDAD0BADFA4CBEC3B55A0748DCCC069DD7F6@red-msg-02.redmond.corp.microsoft.com>
To: "Jim Ley" <jim@jibbering.com>, <www-annotation@w3.org>
We have done some work in the area of robust anchoring for annotations
on web pages. We found that using "human-level" page content as the
basis for anchoring was more effective than using the internal structure
of the page. That is, when a user annotates a sentence in a paragraph,
she is thinking about the text that she is annotating, not the fact that
it is the 3rd sentence in the 5th paragraph, for instance. By anchoring
to the "human-level" content (the actual words that a user sees), we can
better meet the user's expectations when the page is modified.

When the page is edited or updated, our annotations "catch up" with the
content they are anchored to, even if that content has gotten moved
around or changed. And the content-based approach allows users to
annotate *any* text they see on the webpage, not just structural
elements of the page such as whole paragraphs, whole table cells, etc. 
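
As a rough illustration only (this is not the algorithm from the papers
cited below), content-based anchoring can be sketched as storing the
annotated words themselves and searching for them again after the page
changes. The names make_anchor and reattach, the anchor layout, and the
0.6 threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def make_anchor(text, start, end):
    """Record the annotated words themselves, not their structural position."""
    return {"quote": text[start:end]}


def reattach(anchor, new_text, min_score=0.6):
    """Find the span of new_text that best matches the anchored quote.

    Returns (start, end, score), or None when no span reaches min_score.
    The threshold value is an arbitrary assumption for this sketch.
    """
    quote = anchor["quote"]

    # Cheap case: the quoted text survives verbatim, possibly moved.
    exact = new_text.find(quote)
    if exact != -1:
        return exact, exact + len(quote), 1.0

    # Otherwise slide a quote-sized window over the text and keep the
    # window with the highest similarity ratio.
    best = None
    n = len(quote)
    for i in range(0, max(1, len(new_text) - n + 1)):
        window = new_text[i:i + n]
        score = SequenceMatcher(None, quote, window).ratio()
        if best is None or score > best[2]:
            best = (i, i + len(window), score)

    if best is not None and best[2] >= min_score:
        return best
    return None
```

The returned score gives the caller a way to decide whether to show the
annotation in place or to report it as orphaned.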

Another interesting side effect of this approach is that we can annotate
the same content even if it is stored in completely different formats.
For instance, the Microsoft Word editor may be used to create a document
that is then saved both in Microsoft's .doc format and as html. The exact
same content is now in two completely incompatible formats with
incompatible internal structures. However, because we anchor to content,
which is the same across the two formats, the annotations created on the
.doc version can be displayed on the html version and vice versa.
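
A minimal sketch of the cross-format idea, under stated assumptions:
plain text stands in for the .doc version (reading .doc files is outside
the standard library), and the helper names visible_text_from_html,
normalize_plain and locate are invented for this example. Once both
versions are reduced to the words a reader sees, the same quoted text
resolves in either one.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect character data and ignore the tags around it."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Note: this also collects <script>/<style> bodies; a real
        # extractor would need to skip those.
        self.parts.append(data)


def visible_text_from_html(markup):
    """Reduce html markup to whitespace-normalized text."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join("".join(parser.parts).split())


def normalize_plain(text):
    """Apply the same whitespace normalization to plain text."""
    return " ".join(text.split())


def locate(quote, content):
    """Return (start, end) of quote in content, or None if absent."""
    pos = content.find(quote)
    return None if pos == -1 else (pos, pos + len(quote))
```

Because both versions normalize to the same string, an anchor created
against one can be located in the other with the same call to locate.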

I wonder how this could be applied to the Annotea system?  If you are
interested in learning more about the work we have done in this area,
please check out our papers:


CHI 2001:
http://research.microsoft.com/copyright/accept.asp?path=http://www.research.microsoft.com/research/coet/Annotations/chi2001/paper.pdf&pub=ACM

Microsoft Research TR:
ftp://ftp.research.microsoft.com/pub/tr/tr-2001-107.pdf


Dave
--------------------------
David Bargeron
Interactive Visual Media Group
Microsoft Research
One Microsoft Way
Redmond, WA 98052
davemb@microsoft.com


-----Original Message-----
From: Jim Ley [mailto:jim@jibbering.com] 
Sent: Monday, March 18, 2002 6:42 AM
To: www-annotation@w3.org
Subject: Re: Orphaned annotations


"Marja-Riitta Koivunen" <marja@w3.org>
> At 01:37 AM 3/18/2002 +0000, Nick Kew wrote:
>
> >As soon as I had a working prototype, it became abundantly clear that
> >there is a deep and fundamental flaw in Annotea: we construct long
> >and detailed pseudo-xpointers, but these become totally useless as
> >soon as a page is updated.  And annotea has no mechanism for dealing
> >with this, nor indeed even to detect that a page has changed.
>
> First, the amount of problems depends on what kinds of changes are
> made to the page and how well id's are used.

id's simply aren't used though, for example one might expect
http://www.w3.org/2001/Annotea to be authored with a nod towards making
Annotation easy, yet

#xpointer(/html[1]/body[1]/table[1]/tbody[1]/tr[1]/td[2]/h1[1])
or
#xpointer(start-point(string-range(/html[1]/body[1]/table[1]/tbody[1]/tr[1]/td[2]/ul[1]/li[3],"",23,1))/range-to(end-point(string-range(/html[1]/body[1]/table[1]/tbody[1]/tr[1]/td[2]/ul[1]/li[3],"",40,1))))
(which considering it's trying to point to an A element shows a pretty
dodgy creation interface IMO.)

are a couple of the pointers that the page has. id's aren't well used on
the general web (generally only in connection with javascript, and by the
few people who duplicate name/id in their anchors), and the kind of fuzzy
pointers we're getting on even very simple documents such as the one
above illustrates how easily they can be moved within the document.

Again on the Annotea front page we have Jose Kahan saying "Great work
Art!" and pointing to
#xpointer(/html[1]/body[1]/table[1]/tbody[1]/tr[1]/td[2]/p[10])
which today points to "Others are strongly encouraged to start their own
Annotea servers." yet in
http://web.archive.org/web/20010703011339/www.w3.org/2001/Annotea/
(which is not from the right date, but is as close as web.archive.org
has) it points to a paragraph discussing Art Barstow's javascript
bookmarklet approach.
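
A minimal sketch of that drift, using made-up markup rather than the real
Annotea page: a positional path keeps resolving after an edit, but to a
different paragraph. Storing the text the pointer originally selected
(an assumption of this example, not something Annotea is known to do)
at least lets a client detect that the pointer has gone stale. The names
resolve and check_pointer are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical before/after versions of a page; a paragraph is inserted.
OLD = "<html><body><p>Intro</p><p>Bookmarklet approach</p></body></html>"
NEW = ("<html><body><p>Intro</p><p>News item</p>"
       "<p>Bookmarklet approach</p></body></html>")


def resolve(markup, path):
    """Return the text selected by a positional path, or None."""
    root = ET.fromstring(markup)
    node = root.find(path)
    return None if node is None else "".join(node.itertext())


def check_pointer(markup, path, expected_text):
    """True only if the path still selects the text it was created on."""
    return resolve(markup, path) == expected_text


POINTER = "./body/p[2]"           # second paragraph, by position only
EXPECTED = resolve(OLD, POINTER)  # text captured when the note was made
```

Here check_pointer(NEW, POINTER, EXPECTED) is False, because the second
paragraph of the edited page is now the inserted one.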

I think it's clear fuzzy pointers without a mechanism to know how
reliable the fuzzy pointer is can't realistically be used.

Jim.
Received on Monday, 18 March 2002 11:50:20 GMT
