Commentary on the "protocol" draft

Hi all,

Rob asked for some feedback on the early-stage protocol draft:
http://w3c.github.io/web-annotation/protocol/wd/.

Here are some of my thoughts. Minor quibbles first and then a slightly
broader criticism -- hopefully a constructive one.

---

There are a couple of relatively minor questions I have, such as:

- When you say "the REST best practice guidelines" -- are you
referring to a specific document?

- You say "interactions are designed to take place over HTTP, but the
design should not prevent implementations that carry the transactions
over other protocols". Perhaps I've misunderstood what you mean, but
this seems to be a design constraint that is at odds with REST. How can
a protocol which specifies verbs and status codes be implementable
outside an HTTP context? My inclination would be to remove this altogether
and be clear that we are designing an HTTP protocol for annotation.

- As specified, a container representation MUST contain a list of
contained annotations. This seems increasingly impractical as a
container grows in size, and doesn't seem to admit much subtlety around
authorisation and access controls. Is this part of LDP? If so perhaps
the container model doesn't fit here -- would a conforming
implementation on top of the Hypothes.is data store have to include tens
of thousands of annotation URIs in the top-level resource body?
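
To put a rough number on the container-size concern, here is a minimal
sketch that builds a container body listing every contained annotation
and measures its serialised size. The "contains" key, the example.org
URI pattern and the count of 50,000 are illustrative assumptions, not
taken from the draft or from any real data store.

```python
import json


def container_body(base_uri, count):
    """Build a hypothetical container representation listing every annotation URI."""
    return {
        "@id": base_uri,
        # "contains" is an assumed key name, loosely modelled on LDP containment.
        "contains": [f"{base_uri}anno{i}" for i in range(count)],
    }


# 50,000 stands in for "tens of thousands" of annotations (assumed figure).
body = container_body("http://example.org/annotations/", 50_000)
encoded = json.dumps(body).encode("utf-8")

print(f"{len(body['contains'])} URIs, {len(encoded)} bytes")
```

Every client that fetches the container would pay this cost, and it
grows linearly with the number of annotations.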

---

I also have one overall concern, which is that the design you've
proposed seems like a compromise between two rather different groups of
users, which by virtue of being a compromise doesn't really satisfy
either of them.

My postulate is that there are two broad classes of people who need
annotation protocols:

(1) Bulk data stores and archival services
(2) User-facing clients

By (1), I mean public annotation services such as Hypothesis, libraries
and archival repositories such as archive.org, and well-resourced
organisations such as W3C, which have a desire to protect the data that
is of interest to them. These actors are relatively sophisticated, and
will ingest and republish annotations, typically in bulk, from and to
other similar services around the web.

By (2) I mean client applications, be they running on the desktop or the
web. These will probably take a predominantly document-based view on
annotations, wanting to retrieve annotations referring to a document or
set of documents, and making updates to small numbers of annotations at
a time. It's probably not assuming too much to say that there will be
lots of browser client-side applications, written in JavaScript, that
fit into this category.

Which of these two groups is the current protocol intending to serve?
The feeling I get is that you're trying to serve both of them at once,
and I'm not sure this is wise.

Bulk clients will like:

- interoperability
- discoverability through link headers
- related resources

but they will be stymied by:

- requirement to list all annotations in a container
- no bulk retrieval or submission
- no easy way to retrieve updates since $TIMESTAMP (q.v. SLEEP [1])

[1]: http://dataprotocols.org/sleep/
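
The "updates since $TIMESTAMP" need can be sketched as a simple filter
over modification times. This only illustrates the question a bulk
client wants to ask; the "updated" field, the record layout and the
updates_since helper are hypothetical, and nothing here reflects how
SLEEP or the draft protocol actually works.

```python
from datetime import datetime, timezone


def updates_since(annotations, since):
    """Return annotations modified strictly after `since`, oldest first."""
    changed = [a for a in annotations if a["updated"] > since]
    return sorted(changed, key=lambda a: a["updated"])


# Hypothetical store contents; ids and dates are made up for illustration.
store = [
    {"id": "anno1", "updated": datetime(2015, 1, 10, tzinfo=timezone.utc)},
    {"id": "anno2", "updated": datetime(2015, 2, 3, tzinfo=timezone.utc)},
    {"id": "anno3", "updated": datetime(2015, 1, 25, tzinfo=timezone.utc)},
]

last_sync = datetime(2015, 1, 20, tzinfo=timezone.utc)
changed_ids = [a["id"] for a in updates_since(store, last_sync)]
print(changed_ids)
```

A harvesting service would remember the time of its last sync and ask
only for what changed afterwards, rather than re-reading everything.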


User-facing clients will like:

- REST API (and attendant use of HTTP)
- JSON-based representation

but they won't like:

- no normative specification of how to search for annotations relevant
to the current page
- LD Patch
- distinction between PUT and PATCH (I understand why this is here, and
it may be common in academic software, but it's highly unusual
elsewhere)
- no guidance on error handling
- content negotiation (conneg)
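
For readers less familiar with the PUT/PATCH distinction mentioned
above, here is a minimal sketch of the difference in semantics. It uses
a shallow dictionary merge as a stand-in for a partial update; that is
an assumption for illustration only and is not LD Patch, which has its
own syntax. The field names and values are made up.

```python
def put(resource, new_representation):
    """PUT semantics: the supplied representation replaces the resource wholesale."""
    return dict(new_representation)


def patch(resource, changes):
    """PATCH semantics, simplified: only the supplied fields change."""
    updated = dict(resource)
    updated.update(changes)
    return updated


# Hypothetical annotation; the keys are illustrative, not from the draft.
annotation = {
    "body": "Nice point",
    "target": "http://example.org/page",
    "tags": ["review"],
}

replaced = put(annotation, {"body": "Revised comment"})
patched = patch(annotation, {"body": "Revised comment"})

print(sorted(replaced))  # only the fields the client sent survive
print(sorted(patched))   # fields the client did not mention are preserved
```

A client that sends a partial document with PUT would silently drop the
target and tags, which is the kind of surprise that catches out
developers used to APIs where the two verbs are treated alike.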


Overall, I think it might be worth considering whether splitting these
two use cases would allow us to focus on:

- interoperability and bulk data flows for group (1)
- simple data formats, low implementation overhead and document-oriented
workflows for group (2)

Apologies for the lengthy email.



-N

Received on Wednesday, 4 February 2015 14:45:37 UTC