- From: Gerben <gerben@treora.com>
- Date: Sat, 22 Oct 2022 00:34:49 +0200
- To: public-openannotation@w3.org
- Message-ID: <8a48aaa7-0dec-8e57-d8c2-9a251fec4e12@treora.com>
Hi all,

For your interest: I published a proposal for a Web Annotation Discovery mechanism <https://code.treora.com/gerben/web-annotation-discovery>, along with a first implementation as a browser extension <https://code.treora.com/gerben/web-annotation-discovery-webextension> (and a compatible server <https://code.treora.com/gerben/web-annotation-discovery-server>). The goal is that people can discover annotation sources and subscribe to annotation ‘feeds’ while browsing the web, then view the annotations on other pages they visit.

Here is a 3-minute introduction video <https://archive.treora.com/2022/web-annotation-discovery/introduction-screencast.webm> with a little demo. (To try it yourself: here are the Firefox addon <https://addons.mozilla.org/en-US/firefox/addon/web-annotation-discovery/> and an example collection <https://cothink.org/gerben/random_notes/>.)

Below is a section of the proposal <https://code.treora.com/gerben/web-annotation-discovery>:

Approach

To show annotations on a visited page, the web browser needs to somehow obtain these annotations. Various previous annotation projects depend on a single global service to index the annotations by their target, which browsers would query for annotations targeting a particular page. To avoid such centralisation and cater for the diversity of use cases, the browser could instead query any annotation services of the user’s choice.

However, querying services for annotations on visited pages has an enormous impact on reader privacy: to find annotations on pages you read, you have to tell the service which pages you read. Subscribing to multiple sources would reveal this information to even more parties.

In many usage scenarios, the set of annotations a person is actually interested in is limited and comes from a known source. Centralised services (e.g.
Hypothes.is <https://hypothes.is/>) can help discover annotations from any other user, but are often used for annotating in well-defined groups: in classrooms, among colleagues, etc. In such cases, there is no need for a central global index, and moreover the total set of annotations of interest could easily fit on the user’s device. This would solve the reader privacy issue, since no querying is needed: the browser can simply check whether it has any relevant annotations for a visited page (and can thereby be much quicker too).

Even for somewhat larger-scale annotation consumption, the total size may well remain manageable. For example, if an investigative journalist subscribes to a thousand colleagues each writing ten annotations per day of 1 KB each, this produces roughly 4 GB in a year: significant, but perhaps worth it for their work (where privacy may be more important than disk storage). This size could still be reduced by an order of magnitude if, of each annotation, only its own URL and the URL it targets are stored (with a trade-off in latency and privacy; see further below <https://code.treora.com/gerben/web-annotation-discovery#user-content-compacted-storage>).

The current proposal omits any querying mechanism and adopts this approach of a local ‘annotation library’. The mechanisms defined below serve to populate this library: how to discover annotation sources and import their current annotations, and how to subscribe to a source/‘feed’ to obtain its future annotations. To this end, the proposal selects and combines existing parts of the Web Annotation specifications.

Two discovery mechanisms are defined:

1. Annotations encountered directly, either served as a file or embedded in a web page.
2. Annotation ‘feeds’: collections of annotations discovered via links in web pages.
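
As a rough illustration of the local ‘annotation library’ idea and the storage estimate above, here is a minimal Python sketch. It is not taken from the proposal or its browser extension; the names `AnnotationLibrary`, `target_urls` and `yearly_storage_bytes` are hypothetical, and the handling of targets assumes the common shapes of the Web Annotation Data Model (a plain IRI string, an object with an `id`, or an object with a `source`).

```python
from collections import defaultdict
from urllib.parse import urldefrag


def target_urls(annotation: dict) -> list[str]:
    """Return the page URLs an annotation targets, with fragments removed."""
    targets = annotation.get("target", [])
    if not isinstance(targets, list):
        targets = [targets]
    urls = []
    for target in targets:
        # Assumption: a target is an IRI string, or an object whose page URL
        # is in "source" (when selectors are used) or otherwise in "id".
        if isinstance(target, dict):
            target = target.get("source") or target.get("id")
        if isinstance(target, dict):  # "source" may itself be an object
            target = target.get("id")
        if isinstance(target, str):
            urls.append(urldefrag(target).url)
    return urls


class AnnotationLibrary:
    """Hypothetical local store, indexed by target URL."""

    def __init__(self) -> None:
        self._by_target: dict[str, list[dict]] = defaultdict(list)

    def add(self, annotation: dict) -> None:
        for url in target_urls(annotation):
            self._by_target[url].append(annotation)

    def for_page(self, page_url: str) -> list[dict]:
        # Purely local lookup: no service learns which page is being read.
        return list(self._by_target.get(urldefrag(page_url).url, []))


def yearly_storage_bytes(sources: int = 1000, per_day: int = 10,
                         size_bytes: int = 1000, days: int = 365) -> int:
    """Storage estimate from the example: sources x rate x size x days."""
    return sources * per_day * size_bytes * days
```

With the default figures, `yearly_storage_bytes()` gives 3,650,000,000 bytes, i.e. the ‘roughly 4 GB’ mentioned above. A real implementation would also need URL normalisation beyond dropping the fragment, and persistent storage, both of which this sketch leaves out.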
The essence of the ‘feeds’ is simple: exactly like RSS Autodiscovery, a website can add a <link> pointing to an Annotation Collection/Container <https://www.w3.org/TR/annotation-protocol/#annotation-containers>, with the appropriate rel and type attributes, e.g.:

    <link rel="alternate" type='application/ld+json;profile="http://www.w3.org/ns/anno.jsonld"' href="https://annotations.fredsfrets.example/all/" title="Fred’s frets" />

Annotations ‘encountered directly’ are even simpler: these are annotations that the browser visits/opens directly (detected via their Content-Type), or that are embedded in a page (as described in the Embedding Web Annotations in HTML <https://www.w3.org/TR/annotation-html/#embed-json-ld> note).

Initially, my intention was in fact to (also) specify a querying mechanism, so the browser could ask subscribed sources for annotations that specifically target the visited page. But that approach would reveal each page one visits to every annotation source/service one is ‘subscribed’ to, which is a huge privacy problem, as described above and mentioned before <https://lists.w3.org/Archives/Public/public-openannotation/2021May/0004.html#:~:text=Note%20however%20that,pages%20it%20visits.> on this mailing list. For many use cases, local storage seems sufficient and preferable.

I am curious to hear whether anyone has thoughts on this proposal or might like to try it out in practice.

Gerben
Received on Friday, 21 October 2022 22:34:10 UTC