RE: Using XPointer with HTML from Marja-Riitta Koivunen on 2002-04-03 (www-annotation@w3.org from January to June 2002)

From: Marja-Riitta Koivunen <marja@w3.org>
Date: Wed, 03 Apr 2002 17:43:30 -0500
To: Nick Kew <nick@webthing.com>, Laurent Denoue <Denoue@fxpal.com>
Cc: Steven Pemberton <steven.pemberton@cwi.nl>, Jim Ley <jim@jibbering.com>, <www-annotation@w3.org>, <w3c-wai-er-ig@w3.org>, HTML WG <w3c-html-wg@w3.org>
Message-Id: <4.3.2.7.2.20020403143924.038b77f8@localhost>

At 07:56 PM 4/3/2002 +0100, Nick Kew wrote:

>On Wed, 3 Apr 2002, Laurent Denoue wrote:
>
> > Based on his research, David Bargeron (followed by Ka-Ping and myself)
> > pointed out that "using 'human-level' page content as the basis for
> > anchoring was more effective than using the internal structure of the
> > page".

I can imagine that human-level content pointers are easy to come up with
for some pages, e.g. those that follow the format of a common book, a
research document, a play, etc. With these formats it is easy to define
unambiguous pointers that are both intuitive for users and exact for
machines. For instance, we can talk about Chapter 2 in a book, the
References section, or the chapter named "Architecture". However, many Web
pages do not follow such commonly known human-level formats. How should we
then point to them exactly and intuitively?

Even these Web pages have a structure consisting of elements that come from
the standard mark-up languages. This structure is often visible to users
even if they are not familiar with the mark-up languages, so I don't think
it is totally unintuitive to use it. For instance, Chapter 2 could in
XPointer format refer to the second level 2 header, and the chapter named
"Architecture" could refer either to the 3rd level 2 header that happens to
have that name or directly to the level 2 header containing the string
"Architecture". I do agree that there are cases, such as headers in HTML,
where it would be easier to think of the paragraphs under each header as
being under that header in the tree structure as well.
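
As a rough illustration of the two addressing styles described above (by
position and by header text), here is a minimal Python sketch. It uses the
standard library's ElementTree, which supports only a limited XPath subset
and does not implement XPointer itself. The sample page and the function
names header_by_position and header_by_text are invented for this example
and are not part of Annotea.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed page, invented for this example.
PAGE = """<html><body>
<h1>Report</h1>
<h2>Introduction</h2><p>Intro text.</p>
<h2>Background</h2><p>Background text.</p>
<h2>Architecture</h2><p>Architecture text.</p>
</body></html>"""


def header_by_position(root, n):
    """Return the n-th level 2 header (1-based) directly under body.

    Roughly what an expression like xpointer(/html/body/h2[n]) would address.
    """
    return root.find(f"./body/h2[{n}]")


def header_by_text(root, title):
    """Return the first level 2 header whose text equals title, or None."""
    for h2 in root.iter("h2"):
        if (h2.text or "").strip() == title:
            return h2
    return None


root = ET.fromstring(PAGE)  # root is the html element
print(header_by_position(root, 2).text)
print(header_by_text(root, "Architecture").text)
```

The positional lookup breaks if a header is inserted before the target, while
the text lookup breaks if the header is renamed, which is why combining the
two may be more robust than either alone.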

Annotea currently uses XPointers to point to exact locations in the element
tree structure. Using XPointer does not in any way prevent adding more
human-level content to the annotation schemas to help detect document
changes. This could be done in at least two ways:

1) Currently we are not using all the capabilities of XPointer, to keep it
simple. There are most probably things that can be changed in our practices
to make the XPointers themselves more robust and capture more human-level
information, e.g. adding some of the text from the document to the pointer.

2) It is quite possible to add more information about the selected location,
e.g. the selected text, in addition to the XPointer.

It is also possible to add more information, such as a checksum, so that we
know whether the documents have changed. It would be great if the Web
server could give us that information.
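
To make option 2 and the checksum idea concrete, here is a minimal Python
sketch of an annotation anchor that stores the selected text and a document
checksum next to the XPointer. The Anchor record, its field names, and the
choice of SHA-256 are assumptions made for illustration; this is not the
Annotea schema.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Anchor:
    xpointer: str        # structural pointer into the element tree
    selected_text: str   # human-level content kept alongside the pointer
    doc_checksum: str    # digest of the document at annotation time


def checksum(document: str) -> str:
    """Hex digest of the document text (SHA-256 is an arbitrary choice)."""
    return hashlib.sha256(document.encode("utf-8")).hexdigest()


def make_anchor(document: str, xpointer: str, selected_text: str) -> Anchor:
    return Anchor(xpointer, selected_text, checksum(document))


def check_anchor(anchor: Anchor, document: str) -> str:
    """Classify the current document relative to the stored anchor."""
    if checksum(document) == anchor.doc_checksum:
        return "unchanged"
    if anchor.selected_text in document:
        # Document changed, but the annotated text can still be found.
        return "changed-text-present"
    return "changed-text-missing"


doc = "<html><body><h2>Architecture</h2><p>Hello world</p></body></html>"
anchor = make_anchor(doc, "xpointer(/html/body/p[1])", "Hello world")
print(check_anchor(anchor, doc))
```

A client could trust the XPointer when the result is "unchanged" and fall
back to searching for the stored text otherwise.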

While everything is possible, our initial goal with Annotea pointers was to
keep them simple and as standard as possible, and to avoid adding data and
processing that would have only minor effects on the system. Right now we
don't have many resources for this question. However, we are always very
interested in hearing about experiments that aim to make the XPointers in
Annotea or similar applications more robust.

Marja

Received on Wednesday, 3 April 2002 17:47:05 UTC