Re: Privacy review request: Web Annotation model and protocol

Hi all,

Ivan Herman reached out to PING to share a trio of documents relating to
the Web Annotation model:

*) The Web Annotation Protocol[1]

*) The Web Annotation Vocabulary[2]

*) The Web Annotation Data Model[3]

Together, these documents propose a way to set up “annotation servers” that
manage and store annotations about websites.

To start off, I wanted to list some high-level takeaways. I have also
included a run-through of the PING privacy questionnaire[4] I developed.

1.) Annotations, like all other internet traffic, should probably be sent
via HTTPS. The IETF has termed pervasive monitoring an “attack”[5],
recommending all traffic be sent over HTTPS to avoid said attack.
Similarly, the United States CIO has stated that “All browsing activity
should be considered private and sensitive. An HTTPS-Only standard will
eliminate inconsistent, subjective determinations across agencies regarding
which content or browsing activity is sensitive in nature”.[6]

2.) It wasn’t clear to me from reading this spec: are annotation servers
always controlled by the operators of a given site, or can one annotation
server annotate any website? Regardless, should there be an opt-out
mechanism, similar to a robots.txt
<https://en.wikipedia.org/wiki/Robots_exclusion_standard> on a standard web
page? I especially worry about the issue of harassment, which has been
raised with other annotation services like Genius[7].

3.) Finally, I feel it’s important that there be mechanisms to edit and
delete annotations. Annotation servers should not be “write only”. In other
contexts such as on Facebook[8], users often regret the data they upload -
I expect that the annotation servers will have similar incidents.
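
To make the opt-out idea in point 2 concrete, here is a minimal sketch in
Python. It assumes a site publishes robots.txt-style rules and that
annotation services announce themselves with a user-agent token; the token
"web-annotator", the example.org URLs, and the may_annotate helper are all
hypothetical, and nothing like this is defined in the drafts under review.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent token an annotation service might announce.
ANNOTATOR_AGENT = "web-annotator"


def may_annotate(policy_text: str, page_url: str,
                 agent: str = ANNOTATOR_AGENT) -> bool:
    """Return True if the robots.txt-style policy lets `agent` act on `page_url`."""
    parser = RobotFileParser()
    # parse() takes the policy as a list of lines; no network access is needed.
    parser.parse(policy_text.splitlines())
    return parser.can_fetch(agent, page_url)


# Example policy a site owner might publish to opt /blog/ out of annotation.
EXAMPLE_POLICY = """\
User-agent: web-annotator
Disallow: /blog/
"""

if __name__ == "__main__":
    for url in ("https://example.org/blog/post-1", "https://example.org/about"):
        print(url, may_annotate(EXAMPLE_POLICY, url))
```

An annotation server could run a check like this before accepting an
annotation whose target is on a third-party site. Whether such a check
should be required is a question for the Working Group.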


[1] https://www.w3.org/TR/2016/WD-annotation-protocol-20160331/

[2] https://www.w3.org/TR/2016/WD-annotation-vocab-20160331/

[3] https://www.w3.org/TR/2016/WD-annotation-model-20160331/

[4] https://gregnorc.github.io/ping-privacy-questions/

[5] http://www.w3.org/2001/tag/doc/web-https

[6] https://https.cio.gov/

[7] http://www.dailydot.com/technology/genius-annotations-online-harrassment/

[8] “I regretted the minute I pressed share”: A Qualitative Study of
Regrets on Facebook
http://cups.cs.cmu.edu/soups/2011/proceedings/a10_Wang.pdf



In addition to these high-level takeaways, I have walked through the PING
Privacy Questionnaire below and included my responses. I encourage other
standards developers to consider using the questionnaire for self-review,
and I welcome feedback on how it can better help spec authors perform
privacy audits:


   1. Does this specification have a "Privacy Considerations" section?

      - Not currently.

   2. Does this specification collect personally derived data?

      - No. Users could put personal data in a tag if they chose, but that
        is not something the spec specifically asks for or encourages.

   3. Does this specification generate personally derived data, and if so
      how will that data be handled?

      - No, this standard does not directly generate identifiable
        information such as audio or video.

   4. Does this standard allow an origin direct access to a user’s
      location, and if so is that information minimized?

      - No, the Annotation Protocol does not collect location data.

   5. How should this specification work in the context of a user agent’s
      "incognito" mode?

      - The same as without, assuming the server is accessed via the
        browser.

   6. Is it possible to spoof/fake the data being generated for privacy
      purposes?

      - I assume users could use a proxy, VPN, or Tor to access the
        annotation server.

   7. Does the standard utilize data that is personally-derived, i.e.
      derived from the interaction of a single person, or their device or
      address?

      - No.

   8. Does the data record contain elements that would enable
      re-correlation when combined with other datasets through the property
      of intersection (commonly known as "fingerprinting")?

      - However, I would like to point out that PING has previously
        discussed sensor-specific questions that can get at cross-device or
        cross-UA signaling (the Vibration API). Can I get a volunteer to
        submit a pull request adding language that captures this threat
        model to the existing questionnaire?

   9. Is the user likely to know if information is being collected?

      - Yes, users must expressly navigate to and utilize the annotation
        server.

   10. Can the user easily, preferably through an element of the GUI,
       revoke consent granted to a particular feature?

       - Again, it is not clear whether users will have the ability to
         delete/edit annotations. Hopefully there will be a discussion on
         this feature - users often regret posts on social media[8], and
         it’s important they be able to delete their posts.

   11. Once consent has been given, is there a mechanism whereby it can be
       automatically revoked after a reasonable, or user configurable,
       period?

       - I’m not 100% clear, but I would hope that users can delete their
         annotations if they choose to do so.

   12. Does this standard utilize strong end to end encryption?

       - I see no mention of using HTTPS in this standard. I’d like to see
         language added that Annotation servers must use TLS.
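
As a rough illustration of what a "must use TLS" requirement could mean for
a client, the sketch below refuses any annotation endpoint that is not an
https:// URL. The helper names and the example.org endpoints are
hypothetical; the drafts under review do not define such a check, and a real
deployment would also depend on certificate validation in the HTTP library.

```python
from urllib.parse import urlsplit


def is_https_endpoint(url: str) -> bool:
    """Return True only for absolute URLs whose scheme is https."""
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, so "HTTPS://..." is accepted too.
    return parts.scheme == "https" and bool(parts.netloc)


def require_https(url: str) -> str:
    """Return `url` unchanged, or raise ValueError if it is not HTTPS."""
    if not is_https_endpoint(url):
        raise ValueError(f"annotation endpoint must use HTTPS: {url!r}")
    return url


if __name__ == "__main__":
    # Hypothetical annotation container URLs, used only for illustration.
    for candidate in ("https://example.org/annotations/",
                      "http://example.org/annotations/"):
        print(candidate, is_https_endpoint(candidate))
```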


/********************************************/
Greg Norcie (norcie@cdt.org)
Staff Technologist
Center for Democracy & Technology
District of Columbia office
(p) 202-637-9800
PGP: http://norcie.com/pgp.txt



*CDT's Annual Dinner (Tech Prom) is April 6, 2016. Don't miss out! Learn
more at https://cdt.org/annual-dinner*
/*******************************************/

On Sat, Apr 2, 2016 at 3:18 AM, Ivan Herman <ivan@w3.org> wrote:

> Christine,
>
> thank you.
>
> Personally I can make it, but that is obviously not enough:-) We will
> discuss this on the group to find out who can make it on the call, and we
> will get back to you soon.
>
> Thanks again
>
> Ivan
>
> > On 1 Apr 2016, at 23:41, Christine Runnegar <runnegar@isoc.org> wrote:
> >
> > Thank you very much Greg.
> >
> > Ivan, PING will have its next teleconference on 28 April 2016 at UTC 16.
> >
> > We would be happy to invite you and your colleagues in the Web
> Annotation WG to join us. This would give you an opportunity to introduce
> the specifications and discuss any privacy considerations that emerge as a
> result of Greg’s and others’ review.
> >
> > Would this be useful?
> >
> > Kind regards,
> > Christine (PING co-chair)
> >
> >> On 1 Apr 2016, at 5:34 PM, Greg Norcie <gnorcie@cdt.org> wrote:
> >>
> >> Hi Ivan,
> >>
> >> I'll take a look using the PING Privacy Questionnaire[1] and send you
> some feedback.
> >>
> >> http://gregnorc.github.io/ping-privacy-questions/
> >>
> >> (I'll also be editing the questionnaire based on this test run and
> emailing out a summary of changes in a separate email to PING)
> >> <signature.asc>
> >
>
>
> ----
> Ivan Herman, W3C
> Digital Publishing Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704
>
>
>
>
>

Received on Friday, 8 April 2016 19:47:31 UTC