Re: use case clarification - cross format annotations from Nick Stenning on 2014-12-05 (public-annotation@w3.org from December 2014)

From: Nick Stenning <nick@whiteink.com>
Date: Fri, 05 Dec 2014 10:27:04 +0100
To: public-annotation@w3.org
Message-Id: <1417771624.2991929.199162177.64DA2D66@webmail.messagingengine.com>

On Thu, Dec 4, 2014, at 19:39, Bill Kasdorf wrote:
> Isn't it a problem, though, that the DOI _identifies_ the document but it
> doesn't necessarily _locate_, or link _to_ the document?

I'm not sure it is a problem, because as Paolo has already mentioned in
a new thread on this subject, these problems can be addressed on the
server, behind an API.

### Situation 1: retrieve all annotations for current page

I am on a web page, say "http://jimwatson.com/papers/dna.html". I want
to retrieve all annotations for that web page. I can, with a dumb
client, make a call to a search API, providing only the page URL. The
server can then, in principle:

1a) look up the URL in an internal cache mapping URLs to identifiers,
OR, in the event of a cache miss
1b) fetch the URL, and scan it for metadata such as the
previously-mentioned "dc.identifier" meta tags
2) as a result of 1a) or 1b), resolve the URL to a set of URIs:

        {canonical identifiers for the document} ∪ {URLs for the current
        representation}

3) return all search results for that broader set
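
As a rough sketch of steps 1a) to 3) above, and not a description of any
existing server: the names `expand_target` and `search`, the dict-based
cache, the injected `fetch` callable, and the list-of-dicts annotation
store are all assumptions made for illustration.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Set


class _IdentifierScanner(HTMLParser):
    """Collects the content of <meta name="dc.identifier"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.identifiers: Set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "dc.identifier" and attr.get("content"):
            self.identifiers.add(attr["content"])


def expand_target(url: str,
                  cache: Dict[str, Set[str]],
                  fetch: Callable[[str], str]) -> Set[str]:
    """Resolve a page URL to {canonical identifiers} | {the URL itself}."""
    identifiers = cache.get(url)           # 1a) cache lookup
    if identifiers is None:                # 1b) cache miss: fetch and scan
        scanner = _IdentifierScanner()
        scanner.feed(fetch(url))
        identifiers = scanner.identifiers
        cache[url] = identifiers
    return identifiers | {url}             # 2) the broader target set


def search(url: str,
           store: List[dict],
           cache: Dict[str, Set[str]],
           fetch: Callable[[str], str]) -> List[dict]:
    """3) return every stored annotation whose target is in the set."""
    targets = expand_target(url, cache, fetch)
    return [a for a in store if a["target"] in targets]
```

The client only ever passes the page URL; everything else happens on the
server. `fetch` is injected so that the HTTP layer (and the cache) can be
swapped out without touching the lookup logic.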

This is what Paolo has referred to as "Target extension" in 

    http://lists.w3.org/Archives/Public/public-annotation/2014Dec/0021.html

### Situation 2: retrieve all pages for current annotation

I am starting with an annotation (perhaps previously retrieved from
storage) and I want to find all pages which it annotates. The question
that sits at the core of this discussion is:

    "Should all the information I need be contained within the
    annotation itself, or can I rely on the use of a supporting API
    to help me?"

My feeling is that answering that the annotation should be
self-contained results in a horrendously complicated wire format that
almost no client implementations will know how to support.

By contrast, answering that you can rely on a (perhaps domain-specific)
storage API, which knows how to resolve DOIs and other canonicalised
identifiers into representation URLs and vice versa, allows for:

- relatively simple clients
- lower network overhead
- vastly increased flexibility in mapping URIs -> URLs and vice versa

To expand on the last point: if we put this mapping in the data model,
we are limited to the concepts that can reasonably be expressed in a
data format we are expecting people to parse.

If we allow this mapping to be encoded in a program that runs behind an
API, we have the full power of any programming language and any
necessary domain assumptions to help us.
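
To make that concrete, here is a minimal sketch of what such a program
might look like for Situation 2. It assumes the annotation carries a
single target URI, and that the server keeps one resolver function per
URI scheme. `pages_for_annotation`, `make_table_resolver`, and the
lookup table are hypothetical names; a real deployment would presumably
call out to a DOI resolution service or a publisher database instead.

```python
from typing import Callable, Dict, List

# A resolver maps one canonical identifier to representation URLs.
Resolver = Callable[[str], List[str]]


def make_table_resolver(table: Dict[str, List[str]]) -> Resolver:
    """Build a resolver backed by a static lookup table (illustrative only)."""
    def resolve(uri: str) -> List[str]:
        return list(table.get(uri, []))
    return resolve


def pages_for_annotation(annotation: dict,
                         resolvers: Dict[str, Resolver]) -> List[str]:
    """Return the page URLs that the annotation's target resolves to."""
    target = annotation["target"]
    scheme = target.split(":", 1)[0].lower()
    resolver = resolvers.get(scheme)
    if resolver is None:
        # No resolver for this scheme: treat the target as the page itself.
        return [target]
    return resolver(target)
```

Because each resolver is an ordinary function, it can encode whatever
domain assumptions are needed without any of that leaking into the wire
format.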

-N

Received on Friday, 5 December 2014 09:27:28 UTC