Re: Bringing annotation to the browser from David Bokan on 2021-05-27 (public-openannotation@w3.org from May 2021)

From: David Bokan <bokan@chromium.org>
Date: Thu, 27 May 2021 00:50:47 -0400
To: Gerben <mail@treora.com>
Cc: public-openannotation@w3.org
Message-ID: <CANMmsAvnq7H65-YKGNzCR2-Yc7cNDPR1BpvGWUWJ9CxtdsC3wQ@mail.gmail.com>

Hi Gerben,

On Mon, May 24, 2021 at 1:39 PM Gerben <mail@treora.com> wrote:

> (I wonder why you do not list Google Sidewiki
> <https://googleblog.blogspot.com/2009/09/help-and-learn-from-others-as-you.html>
> among the previous efforts. Perhaps it is rather not spoken about among
> Googlers? ;))
>
Heh - this was discussed extensively and one of the frequent questions we
got internally was "how is this different from Sidewiki?". I think a big
difference this time is that we're looking at this as a platform feature,
rather than a "Google-thing". It was brought up by a colleague that we
forgot it in the Prior Art section (along with dokieli
<https://dokie.li/#annotate>) - I just haven't gotten around to updating
the explainer yet.

> Given this history, I recommend being extra clear about the approach,
> because you’ll have many people asking “what about spam/harassment/…?”.
> When talking about my project visions I often have to make clear that,
> unlike in many other/previous projects, people will only see annotations
> when they asked for them, from sources they selected.
>
I agree; as I've thought about this more, having this be gated in some way
behind user intent is key to mitigating most of these issues. There is some
tension here though: annotations need to be discoverable (you should be
able to post a link with "here are my corrections" without confusing the
recipient). There's also some tension between what's a decision left to the
user-agent as a product vs the platform. We don't want to bake in too many
assumptions into the platform side of things so that different user agents
can make different choices. That said, we should be cognizant of the effect
precedent has even if certain aspects are technically left as user-agent
choices.

> Consequently, I think the first item+subitems in your list of challenges
> are problematic:
>
> “Decentralized moderation: Whose responsibility is this? …”
>
> We could just as well ask: who moderates the WWW? That seems the wrong
> question. Boldly put: *If the annotation system requires a moderation
> system, it is a bad system.* The challenge should be how not to need
> moderation; otherwise it is doomed to fail — probably even before
> launching, it will be killed by a storm of criticism and bad publicity
> (harassment, disinfo and social media’s moderation failures are hot topics
> these days).
>
+1. IMHO, the browser is infrastructure and the wrong place to be applying
moderation in this way; it should be left to the services that host the
content to moderate it. The one caveat here is for narrowly-defined
security/safety related issues, akin to how browsers today soft-block
behavior on SafeBrowsing lists. Given that annotations are much less
flexible than regular content I don't expect this to be an issue but the
baddies are creative.

There is a broader point here though, and "moderation" is probably the
wrong word, but we must consider all the various ways this could be used
harmfully and make sure that we design the system and user agents
to disincentivize and minimize toxic behavior.

> I would therefore strongly consider limiting the design and discussion to
> use cases that do not pull annotations from just anywhere & anyone,
> but only get annotations the reader asked for:
>
>    - Personal private notes — stored in your browser, much like bookmarks
>    - Sharing a document+annotations — e.g. I add notes to a page and
>    share the annotated page with you. (technically, perhaps I point you to a
>    collection of Annotations targeting the same document; your browser renders
>    its target document while showing the annotations beside it)
>    - Group collaboration — a group annotates a specific document
>    (Google Docs-style but for any web page)
>    - Pulling notes from sources one explicitly follows (I call this the
>    “Twitter/RSS feed model” to annotation — imagine seeing annotations from,
>    or boosted by, people & organisations you follow — and only those people)
>
Incidentally, the order of this list is how I see the progression of use
cases in terms of risk and complexity. I mentioned in the explainer that
one of the principles we're adhering to is "Incremental, careful progress"
- it's likely we'd want to add capabilities gradually, in this order, to
tackle some of the low-hanging fruit first and get experience and feedback
and see how users react and interact with it.

> The last option of these sounds similar to your concept of “Annotation
> Sources”. Note however that this may create a huge problem for reader
> privacy, as a query for annotations is made to every annotation source upon
> every page you visit (while having annotations enabled). The only solution
> I can think of is that the browser (or a delegated service) retrieves
> *all* annotations from sources it follows and stores them locally, to not
> reveal which pages it visits.
>
+1. I think there's a lot of value to this use case but it's also the most
challenging for the reasons you mention.
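
A minimal Python sketch of the "retrieve everything, look up locally" idea
quoted above, to make the privacy property concrete. The class name, the
annotation shape (a dict with `target` and `body` keys), and the `fetch_feed`
callback are all hypothetical; none of them come from the explainer.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List

Annotation = Dict[str, str]  # hypothetical shape: {"target": url, "body": text}


class LocalAnnotationStore:
    """Keeps a full local copy of every followed source's annotations."""

    def __init__(self, fetch_feed: Callable[[str], Iterable[Annotation]]):
        # fetch_feed is an assumed callback that downloads a source's whole feed.
        self._fetch_feed = fetch_feed
        self._by_target: Dict[str, List[Annotation]] = defaultdict(list)

    def sync(self, sources: Iterable[str]) -> None:
        """Download complete feeds; the requests do not depend on browsing."""
        self._by_target.clear()
        for source in sources:
            for annotation in self._fetch_feed(source):
                self._by_target[annotation["target"]].append(annotation)

    def lookup(self, page_url: str) -> List[Annotation]:
        """Answer from local data only, so no source learns the visited URL."""
        return list(self._by_target.get(page_url, []))
```

The cost is visible in `sync`: every followed source is downloaded in full,
which is part of why this use case is the hardest one on the list.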

> *Publisher’s control*
>
> “Authors should have some control over which annotations can be shown (at
> least by default) when their page is being viewed.
> …
> This mechanism should give control to pages over third-party annotations
> on their content, but it shouldn’t limit what a user can do and see on
> their own device.”
>
> I suppose some concerns of authors may come up, and compromises could be
> made (e.g. a page could *hint* to disable annotation), but as you write
> in some points in the explainer this is ultimately a front-end feature that
> users should control. If the graffiti concerns are avoided as just
> described, then one could argue that *every* annotation is something the
> user does and sees on their own device.
>
This is where I think it's important that we design the defaults correctly
- if a user consciously indicates that they want to see annotations then
the author shouldn't be able to override that, but I expect that in many
cases users won't be able to make that decision. My initial expectation is
the author controls would be along the lines of hinting relating to initial
appearance and some styling controls to let pages make sure things like
selector highlights match the page theme. However, I expect this will be a
contentious area.

> Therefore I would avoid the web page’s interference, and avoid some open
> questions like *“Can the user prevent the page from filtering
> annotations?”*. A web page cannot tell the browser to avoid bookmarking
> it either.
>
> Except for the page’s ability to block, I do like the schematic drawing
> with the “aggregated annotations” though. I often frame annotation as a *display
> of back-links*: My browser *knows* about a small section of the web, and
> when I visit a page, it displays (in a sidebar) other pages that link to
> that page. This by itself may be a concept worth working out; but in
> combination with “precise links” (text fragment links), we get web
> annotation.
>
> *Making annotations available to the web page*
>
> Best to be avoided, I think.
>
> It seems impossible to let pages control annotation presentation without
> opening a can of worms, especially privacy issues. A page should not get
> access to the annotations’ content, and which words the annotation targets
> is already private content — so it would be impossible to let a page
> position the annotation without leaking information to the page.
>
> If the annotations are public, then the web page can already fetch and
> display them without the browser’s aid; like a website can already load a
> hypothesis sidebar. I think the main point of web annotation is to view the
> web beyond&between the web pages themselves, without requiring the
> cooperation of those pages. So unless I miss something, I think this whole
> idea of involving the page in helping to annotate itself may best be
> forgotten about completely.
>
I think I've mostly come around to the same conclusion. I think there are
cases where pages may want to ingest annotation data but I'm not sure the
user agent needs to be involved in those cases.

> One way in which pages can however help the browser in showing annotations
> on the page: using good semantic html, specifying canonical and alternate
> URLs, using rel=bookmark links in <article>s, and so forth. This helps
> the browser to know which document(s) it is looking at — a point I do not
> see being mentioned in your explainer (many documents may appear under
> numerous URLs; e.g. sometimes query parameters matter, sometimes they do
> not).
>
+1, thanks for pointing this out.
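
A rough Python sketch of the document-identity problem raised above, where
one document appears under several URLs. The function name and the list of
ignorable query parameters are assumptions for illustration; a real user
agent would need a better signal, such as the page's own canonical URL.

```python
from typing import Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed set of parameters that do not change which document is shown.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "fbclid"}


def document_key(url: str, canonical: Optional[str] = None) -> str:
    """Return a key under which annotations for this document are stored."""
    # Prefer the page's own declaration (e.g. rel=canonical) when present.
    parts = urlsplit(canonical or url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in IGNORED_PARAMS]
    # Lowercase scheme and host, sort remaining parameters, drop the fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", urlencode(sorted(kept)), ""))
```

For example, `document_key("https://Example.org/post?utm_source=x&id=7#top")`
and `document_key("https://example.org/post?id=7")` yield the same key, while
a different `id` does not. Dropping the fragment and sorting parameters are
simplifications that will be wrong for some sites.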

> (also, another challenge not mentioned in the explainer: the web lacks
> versioning or history, hence annotations can become ‘orphaned’)
>
+1, I've been thinking about this. I've noticed this as a major issue with
existing annotation tools - aggregator pages and feeds suffer from this the
most where the page changes rapidly and is customized to the user. Maybe
pages need a way to mark themselves as "frequently changing" (I think this
could be useful in other ways)? Or have more granular ways of referencing
particular items (as you mention above). Agree this needs more thought.
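
A small Python sketch of how an annotation can be re-anchored after a page
changes, and how an orphan is detected. It borrows the exact/prefix/suffix
idea from the Web Annotation TextQuoteSelector; the function name and the
fallback policy are assumptions, not something from the explainer.

```python
from typing import Optional


def find_anchor(text: str, exact: str, prefix: str = "",
                suffix: str = "") -> Optional[int]:
    """Return the offset of `exact` in `text`, or None if it is orphaned."""
    start = text.find(exact)
    fallback = None
    while start != -1:
        end = start + len(exact)
        if text[:start].endswith(prefix) and text[end:].startswith(suffix):
            return start  # quote and surrounding context both match
        if fallback is None:
            fallback = start  # remember the first quote-only match
        start = text.find(exact, start + 1)
    # Either a quote-only match (context changed) or None (quote is gone).
    return fallback
```

On a feed or aggregator page the quoted text is often gone entirely by the
next visit, so this returns `None` and the annotation is orphaned no matter
how much context was stored.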

> *Web platform*
>
> I really like that you are thinking how to make annotation part of “the web
> platform”, and that you are looking at web standards to implement. It would
> indeed be sad if this ends up being just a feature specific to Chromium (I
> hope Text Fragments gets wider adoption too). It needs a whole ecosystem
> around it (annotation publication software, annotation sources, …), that
> would ideally include existing annotation projects. It would be great if
> the millions of annotations made in Hypothesis and elsewhere could be
> displayed in the browser too. To get the relevant parties to discuss this,
> perhaps we can liven up the discussion in the W3C Open Annotation
> Community Group <https://www.w3.org/community/openannotation/> again?
>
IMHO, if this ends up Chrome- (or even Chromium-) only then we will not
have succeeded (similarly for text fragments) - after all, you can't know
what user agent someone on the other end of a link will be using. But,
changing an ecosystem as big as the web takes time. Broader annotation
would be a really big change to the web and feels like a really fundamental
piece of what makes the web...webby...so I think building things out to be
open and extensible is the right approach. I'll certainly continue to
solicit feedback and do the work out in the open (here, in my GitHub,
eventually in WICG hopefully) and welcome contribution and collaboration.
I'd like to follow a similar model to the text fragment work where (IMHO)
we did a decent job of being transparent and collaborative; certainly I can
think of some mistakes we made along that way but I'd like to think we can
apply lessons learned.

> As an overarching thought, I think the main questions this explainer
> provokes are not technical, but rather about the goal and the process
> towards it. Will the concept be developed by engineers, or with rounds of
> user research and feedback?  Will this be a collaborative effort to change
> the web at large, or Google using its dominant browser to push a feature
> through and wait for others to follow? Will it end up being a single
> product experience, or a framework that enables many browsers, browser
> extensions, publishers, CMSes and other parties to try out a variety of
> annotation(-ish) systems and use cases? I guess you might not have the
> answers ready either, but I am curious how you plan to go about it.
>
The explainer is intentionally vague on technical details because...we
don't have any yet :). This is still very early in the process for what we
want to do (and I can't promise we'll still be around in 4 months).
Internally, we're still figuring out what we want and can build, hopefully
we'll have something more technical to share soon. That said, personally,
I'd like to make sure anything we build is on an open infrastructure so
that it works across a variety of user agents and can be used and extended
in new ways.

> As said, I would be curious to hear the thoughts from others in this
> group. Is the interest in web annotation from Chromium developers (even if
> coming ‘late to the party’) an opportunity for web annotation? Or a
> hopeless/misguided mission? Or…?
>
> Kind regards,
>
> — Gerben
>
Thanks!
David

>
> On 12/05/2021 17.42, David Bokan wrote:
>
> Hi everyone,
>
> Myself and a few colleagues from the Chrome team have been considering
> ways to bring some annotation use cases to the browser by default. This
> requires a lot of thought about what kind of APIs and controls are given
> to page authors and users as well as the broader implications on the web
> ecosystem (e.g. security, privacy, UI, etc.)
>
> We want to do this in a way that's open and integrates with existing specs
> and work on annotations. We've put up a public explainer
> <https://github.com/bokand/web-annotations>; there's no concrete proposal
> yet, it's all very early stages. There are a few rough ideas though, and it
> explains how we're thinking about the challenges. We'll continue to develop
> the ideas there if you'd like to participate or just follow along.
>
> Given this group's interest in annotations, I'd like to invite thoughts
> and feedback, particularly if anyone has any experiences with similar
> efforts in the past.
>
> Thanks!
> David Bokan
>
>
Received on Thursday, 27 May 2021 04:51:16 UTC