Re: Bringing annotation to the browser from Gerben on 2021-05-24 (public-openannotation@w3.org from May 2021)

From: Gerben <mail@treora.com>
Date: Mon, 24 May 2021 19:39:48 +0200
To: public-openannotation@w3.org
Cc: David Bokan <bokan@chromium.org>
Message-ID: <49ef07dd-ba5e-fe02-54f4-a993f407f627@treora.com>
Hi everyone,

I already sent David feedback privately, which I will share below. I am
curious whether anyone feels similarly or has other remarks to share. As
you can read below, I suggested that David discuss the plan in this
Community Group, which has pursued highly similar ideas for several
years. But the lack of response so far, to what sounds to me like a
vague but exciting proposal, makes me wonder if anybody is still reading
this list, and if anyone is still interested in the topic…

Firstly, some possibly relevant context for those who missed it: David
and his team already got a significant annotation-related change into
the Chromium browser over the last year: Text Fragments
<https://wicg.github.io/scroll-to-text-fragment/>, defining a special
type of fragment identifier to point to (≈ scroll to and highlight) a
quoted phrase inside the document. E.g. pointing at the words
“illustrative examples” on the example.org page looks like this (try
opening it in a recent Chromium-based browser
<https://caniuse.com/url-scroll-to-text-fragment>):

https://example.org/#:~:text=use%20in-,illustrative%20examples,-in%20documents
<https://example.org/#:~:text=use%20in-,illustrative%20examples,-in%20documents>
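
As a rough illustration of how such a link is put together, here is a
small Python sketch. The function names are my own invention, and it
only covers the prefix/quote/suffix form used above (no range end, and
it assumes the page URL has no fragment of its own yet).

```python
from urllib.parse import quote


def _encode_term(text):
    # Percent-encode everything, and additionally escape "-", which
    # quote() leaves alone but which is a delimiter in a text directive.
    return quote(text, safe="").replace("-", "%2D")


def text_fragment_url(page_url, exact, prefix=None, suffix=None):
    """Build a Text Fragment link of the form #:~:text=prefix-,exact,-suffix."""
    parts = []
    if prefix:
        parts.append(_encode_term(prefix) + "-")
    parts.append(_encode_term(exact))
    if suffix:
        parts.append("-" + _encode_term(suffix))
    return page_url + "#:~:text=" + ",".join(parts)


print(text_fragment_url(
    "https://example.org/",
    "illustrative examples",
    prefix="use in",
    suffix="in documents",
))
```

Run as is, this should reproduce the link shown above.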

Note for the Web Annotation-savvy: these Text Fragments are more or less
a compacted version of serialising a TextQuoteSelector (or a
RangeSelector containing two TextQuoteSelectors) into a URL, as was
envisioned in the Selectors and States as Fragment Identifiers
<https://w3c.github.io/web-annotation/selector-note/#frags> idea (that
never made it into the Web Annotation recommendations). For comparison,
here is a pointer at the same words using that latter serialisation:

https://example.org/#selector(type=TextQuoteSelector,exact=illustrative%20examples,prefix=use%20in%20,suffix=%20in%20documents)
<https://example.org/#selector(type=TextQuoteSelector,exact=illustrative%20examples,prefix=use%20in%20,suffix=%20in%20documents)>
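
The same kind of sketch for this serialisation, again with a made-up
function name. It assumes a flat selector (no nesting such as
refinedBy or a RangeSelector) and simply percent-encodes every value,
which may be stricter than the note requires.

```python
from urllib.parse import quote


def selector_fragment(selector):
    """Serialise a flat selector dict as a selector(...) fragment."""
    pairs = []
    for key, value in selector.items():
        # quote(..., safe="") also escapes "(", ")", "," and "=", so a
        # value cannot be mistaken for the fragment's own delimiters.
        pairs.append(f"{key}={quote(str(value), safe='')}")
    return "selector(" + ",".join(pairs) + ")"


text_quote = {
    "type": "TextQuoteSelector",
    "exact": "illustrative examples",
    "prefix": "use in ",
    "suffix": " in documents",
}
print("https://example.org/#" + selector_fragment(text_quote))
```

Note that the TextQuoteSelector keeps the whitespace around the prefix
and suffix, whereas the Text Fragment above leaves it out.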

Having the ability to point at an arbitrary selection in web pages seems
a small but important step towards annotating web pages.

Back to the point, here is my response to David’s request for feedback
on the mentioned explainer <https://github.com/bokand/web-annotations>:

    Making annotations part of the web has been a vision of mine for
    many years (like for many others). Time and again the conclusion has
    been that this should be part of user agents, so interest from
    Chromium opens lots of possibilities.

    Overall, I think your outline of your proposal is well-informed and
    reasonable, and seems to head roughly in the direction that I myself
    have been wanting to move things too; see e.g. my current project
    <https://nlnet.nl/project/WebAnnotation/> funded by the EU’s Next
    Generation Internet project, or my blog post
    <https://web.hypothes.is/blog/supporting-open-annotation/> from when
    I was an intern at Hypothesis (2014). I would be glad to exchange
    notes and collaborate in some way, join a call, or whatever works.

    Below are some thoughts from reading the explainer.

    *Avoiding the “web graffiti” problem*

    I am very pleased to see that it is a non-goal to make /“a
    centralized “comments section” for the web”/. Many previous
    annotation projects would create a single central annotation store,
    and show all annotations to every reader. Then website publishers
    would often be furious about losing control over the presentation of
    their content. Moreover, lots of this content would be of low
    interest to most readers. Moderation does not scale well, a single
    party as moderator is a bad idea anyhow, etcetera. (I wonder why you
    do not list Google Sidewiki
    <https://googleblog.blogspot.com/2009/09/help-and-learn-from-others-as-you.html>
    among the previous efforts. Perhaps it is rather not spoken about
    among Googlers? ;))

    Given this history, I recommend being extra clear about the
    approach, because you’ll have many people asking “what about
    spam/harassment/…?”. When talking about my project visions I often
    have to make clear that, unlike in many other/previous projects,
    people will only see annotations when they asked for them, from
    sources they selected.

    Consequently, I think the first item+subitems in your list of
    challenges are problematic:

        “Decentralized moderation: Whose responsibility is this? …”

    We could just as well ask: who moderates the WWW? That seems the
    wrong question. Boldly put: /If the annotation system requires a
    moderation system, it is a bad system./ The challenge should be how
    not to need moderation; otherwise it is doomed to fail — probably
    even before launching, it will be killed by a storm of criticism and
    bad publicity (harassment, disinfo and social media’s moderation
    failures are hot topics these days).

    I would therefore strongly consider limiting the design and
    discussion to use cases that do not pull annotations from just
    anywhere & anyone, but only get annotations the reader asked for:

      * Personal private notes — stored in your browser, much like bookmarks
      * Sharing a document+annotations — e.g. I add notes to a page and
        share the annotated page with you. (technically, perhaps I point
        you to a collection of Annotations targeting the same document;
        your browser renders its target document while showing the
        annotations beside it)
      * Group collaboration — a group annotates a specific document
        (Google Docs-style but for any web page)
      * Pulling notes from sources one explicitly follows (I call this
        the “Twitter/RSS feed model” of annotation — imagine seeing
        annotations from, or boosted by, people & organisations you
        follow — and only those people)

    The last option of these sounds similar to your concept of
    “Annotation Sources”. Note however that this may create a huge
    problem for reader privacy, as a query for annotations is made to
    every annotation source upon every page you visit (while having
    annotations enabled). The only solution I can think of is that the
    browser (or a delegated service) retrieves /all/ annotations from
    sources it follows and stores them locally, to not reveal which
    pages it visits.
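
To make that last idea a bit more concrete, here is a minimal Python
sketch of such a local store. The class and method names are
hypothetical, the feed URL is a placeholder, and the actual fetching
of a source's annotations is left out. The point is only that looking
up annotations for the visited page happens locally, so no source
learns which page is being read.

```python
def target_source(annotation):
    """Return the URL of the annotated document, without any fragment."""
    target = annotation["target"]
    # In the Web Annotation model a target is either a plain IRI string
    # or an object whose "source" holds the document's IRI.
    url = target if isinstance(target, str) else target["source"]
    return url.split("#", 1)[0]


class LocalAnnotationStore:
    """Keeps full copies of followed feeds; lookups never hit the network."""

    def __init__(self):
        self._by_feed = {}  # feed URL -> list of annotations

    def replace_feed(self, feed_url, annotations):
        # Called after downloading *all* annotations of a followed source.
        self._by_feed[feed_url] = list(annotations)

    def annotations_for(self, page_url):
        page = page_url.split("#", 1)[0]
        return [
            annotation
            for annotations in self._by_feed.values()
            for annotation in annotations
            if target_source(annotation) == page
        ]


store = LocalAnnotationStore()
store.replace_feed("https://alice.example/annotations", [
    {
        "type": "Annotation",
        "bodyValue": "Nice phrase.",
        "target": {
            "source": "https://example.org/",
            "selector": {
                "type": "TextQuoteSelector",
                "exact": "illustrative examples",
            },
        },
    },
    {
        "type": "Annotation",
        "bodyValue": "About another page.",
        "target": "https://example.com/other",
    },
])
print(len(store.annotations_for("https://example.org/")))
```

The obvious cost is that the browser downloads and stores many
annotations for pages it will never visit.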

    *Publisher’s control*

        “Authors should have some control over which annotations can be
        shown (at least by default) when their page is being viewed.
        …
        This mechanism should give control to pages over third-party
        annotations on their content, but it shouldn’t limit what a user
        can do and see on their own device.”

    I suppose some concerns of authors may come up, and compromises
    could be made (e.g. a page could /hint/ to disable annotation), but
    as you write in some points in the explainer this is ultimately a
    front-end feature that users should control. If the graffiti
    concerns are avoided as just described, then one could argue that
    /every/ annotation is something the user does and sees on their own
    device.

    Therefore I would avoid the web page’s interference, and avoid some
    open questions like /“Can the user prevent the page from filtering
    annotations?”/. A web page cannot tell the browser to avoid
    bookmarking it either.

    Except for the page’s ability to block, I do like the schematic
    drawing with the “aggregated annotations” though. I often frame
    annotation as a /display of back-links/: My browser /knows/ about a
    small section of the web, and when I visit a page, it displays (in a
    sidebar) other pages that link to that page. This by itself may be a
    concept worth working out; but in combination with “precise links”
    (text fragment links), we get web annotation.
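
The back-link framing can be sketched in a few lines of Python. This
is only an illustration under my own assumptions: the function name is
made up, and the "small section of the web" is represented as a plain
dict from page URL to the links found on that page.

```python
from collections import defaultdict
from urllib.parse import urldefrag


def build_backlink_index(known_pages):
    """Map each link target to the pages (and fragments) that point at it.

    known_pages: dict mapping a page URL to the list of hrefs on that page.
    """
    index = defaultdict(list)
    for page_url, hrefs in known_pages.items():
        for href in hrefs:
            # Split off the fragment: the part before it identifies the
            # document, the fragment is the "precise" part of the link.
            target, fragment = urldefrag(href)
            index[target].append((page_url, fragment))
    return dict(index)


known_pages = {
    "https://blog.example/post": [
        "https://example.org/#:~:text=illustrative%20examples",
        "https://other.example/",
    ],
}
backlinks = build_backlink_index(known_pages)
for linking_page, fragment in backlinks.get("https://example.org/", []):
    print(linking_page, fragment)
```

When visiting https://example.org/, the sidebar would list the blog
post, and the fragment tells which words it is talking about.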

    *Making annotations available to the web page*

    Best to be avoided, I think.

    It seems impossible to let pages control annotation presentation
    without opening a can of worms, especially privacy issues. A page
    should not get access to the annotations’ content, and which words
    the annotation targets is already private content — so it would be
    impossible to let a page position the annotation without leaking
    information to the page.

    If the annotations are public, then the web page can already fetch
    and display them without the browser’s aid; like a website can
    already load a Hypothesis sidebar. I think the main point of web
    annotation is to view the web beyond & between the web pages
    themselves, without requiring the cooperation of those pages. So
    unless I miss something, I think this whole idea of involving the
    page in helping to annotate itself may best be forgotten about
    completely.

    One way in which pages can however help the browser in showing
    annotations on the page: using good semantic html, specifying
    canonical and alternate URLs, using rel=bookmark links in
    <article>s, and so forth. This helps the browser to know which
    document(s) it is looking at — a point I do not see being mentioned
    in your explainer (many documents may appear under numerous URLs;
    e.g. sometimes query parameters matter, sometimes they do not).
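
As an illustration of how a canonical URL hint could be used, here is
a Python sketch with hypothetical names. It assumes that a declared
canonical link is preferred, and that otherwise the page URL is used
with only the fragment dropped (query parameters are kept, since
without a hint one cannot tell whether they matter).

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class _CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        if (tag == "link" and "canonical" in rels
                and attributes.get("href") and self.canonical is None):
            self.canonical = attributes["href"]


def document_key(page_url, html):
    """Pick the URL under which annotations for this page are looked up."""
    finder = _CanonicalFinder()
    finder.feed(html)
    if finder.canonical:
        # The href may be relative, so resolve it against the page URL.
        return urldefrag(urljoin(page_url, finder.canonical)).url
    return urldefrag(page_url).url


html = '<html><head><link rel="canonical" href="/article/42"></head></html>'
print(document_key("https://example.org/article/42?utm_source=feed#top", html))
```

With this, the same article reached through a tracking link and
through its plain URL would map to the same annotations.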

    (also, another challenge not mentioned in the explainer: the web
    lacks versioning or history, hence annotations can become ‘orphaned’)
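
A deliberately naive Python sketch of what ‘orphaned’ means in
practice. The function name is my own, and real implementations
typically use fuzzy matching rather than an exact string search; this
is only meant to show the failure mode.

```python
def anchor_text_quote(document_text, exact, prefix="", suffix=""):
    """Return (start, end) offsets of the quote, or None if it is orphaned."""
    needle = prefix + exact + suffix
    position = document_text.find(needle)
    if position == -1:
        # The quoted passage (in this context) is no longer in the page.
        return None
    start = position + len(prefix)
    return (start, start + len(exact))


selector = {
    "exact": "illustrative examples",
    "prefix": "use in ",
    "suffix": " in documents",
}
old_text = "This domain is for use in illustrative examples in documents."
new_text = "This domain is reserved for documentation purposes."

print(anchor_text_quote(old_text, **selector) is not None)  # still anchors
print(anchor_text_quote(new_text, **selector) is None)      # orphaned
```

Once the page text has been edited, the annotation has nothing left
to attach to, unless an older version of the page is still available
somewhere.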

    *Browser extensions*

    The case for web pages is very different from that of browser
    extensions. I like that you take extensions into consideration in
    the explainer. Perhaps it is even okay if initially, for example,
    the browser itself can only display annotations, and browser
    extensions and CMSes will invent ways to create annotations.

    A good example of this approach may be the work
    <https://github.com/mozilla/libdweb> Mozilla did to enable
    distributed web protocols in Firefox: rather than implementing e.g.
    the IPFS or Hypercore protocol, they developed new WebExtension APIs
    that would let browser extensions implement such protocols (e.g.
    have extensions register protocol handlers and create UTP connections).

    For web annotation, perhaps even some small changes could unblock
    other people (like myself) from implementing annotation capabilities
    through extensions; e.g. it might already help if Chromium
    implemented the Sidebar API
    <https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/user_interface/Sidebars>,
    and would provide a way to draw lines and highlights in an overlay
    on top of a page without modifying its DOM.

    Given that there may be many ways in which annotations could work,
    and experimentation may be needed, such an approach would nicely
    allow various experiments to slowly develop the concepts.

    Nevertheless, to make annotation more part of the web, one should be
    able to share an annotation with other people without them having to
    install browser extensions; so at least basic browser support for
    displaying a pair of document+annotations would be fantastic.

    *Web platform*

    I really like that you are thinking about how to make annotation
    part of “the web platform”, and that you are looking at web
    standards to implement. It would indeed be sad if this ends up being
    just a feature specific to Chromium (I hope Text Fragments gets
    wider adoption too). It needs a whole ecosystem around it (annotation
    publication software, annotation sources, …), that would ideally
    include existing annotation projects. It would be great if the
    millions of annotations made in Hypothesis and elsewhere could be
    displayed in the browser too. To get the relevant parties to discuss
    this, perhaps we can liven up the discussion in the W3C Open
    Annotation Community Group
    <https://www.w3.org/community/openannotation/> again?

    As an overarching thought, I think the main questions this explainer
    provokes are not technical, but rather about the goal and the
    process towards it. Will the concept be developed by engineers, or
    with rounds of user research and feedback? Will this be a
    collaborative effort to change the web at large, or Google using its
    dominant browser to push a feature through and wait for others to
    follow? Will it end up being a single product experience, or a
    framework that enables many browsers, browser extensions,
    publishers, CMSes and other parties to try out a variety of
    annotation(-ish) systems and use cases? I guess you might not have
    the answers ready either, but I am curious how you plan to go about it.

As said, I would be curious to hear the thoughts from others in this
group. Is the interest in web annotation from Chromium developers (even
if coming ‘late to the party’) an opportunity for web annotation? Or a
hopeless/misguided mission? Or…?

Kind regards,

— Gerben


On 12/05/2021 17.42, David Bokan wrote:
> Hi everyone,
>
> Myself and a few colleagues from the Chrome team have been considering
> ways to bring some annotation use cases to the browser by default.
> This requires a lot of thought about what kind of APIs and controls
> are given to page authors and users as well as the broader
> implications on the web ecosystem (e.g. security, privacy, UI, etc.)
>
> We want to do this in a way that's open and integrates with existing
> specs and work on annotations. We've put up a public explainer
> <https://github.com/bokand/web-annotations>; there's no concrete
> proposal yet, it's all very early stages. There are a few rough ideas
> though, and it explains how we're thinking about the challenges. We'll
> continue to develop the ideas there if you'd like to participate or
> just follow along.
>
> Given this group's interest in annotations, I'd like to invite
> thoughts and feedback, particularly if anyone has any experiences with
> similar efforts in the past. 
>
> Thanks!
> David Bokan
Received on Tuesday, 25 May 2021 08:00:41 UTC