Geolocation API Level 2 Last Call comment on privacy

Hello Geolocation Working Group,

The following Last Call comment suggests several changes to the Geolocation API Level 2 spec to improve support of privacy for users of the Geolocation API. I would have written a shorter comment, but I ran out of time; please let me know if there's anything I can clarify.

Thanks,
Nick

# Origin pairs

Experience with implementation of V1 has shown some ambiguity over how permissions should be presented to the end user. Specifically, Chrome developers noted that if iframes from the same document origin embedded on different sites shared the same persisted permission, users might be unpleasantly surprised at the presence of their real-time locations in unfamiliar contexts [1].

In order to provide a model that matches end user expectations and is consistent across browsers, the V2 specification should reflect this concept of origin pairs and provide guidance to implementers in using these pairs when persisting permissions. The System Information API Editor's Draft, for example, makes this requirement explicit:

> A user agent must separately acquire permission through the user interface when the callback is invoked in the context of a document object that is presented in a nested browsing context, if the origin of the nested browsing context is different from the top-level browsing context's origin. In this case, the permission must be scoped to the pair consisting of the top-level browsing context's origin, and the origin within which the callback is executed.

Similar language added to 4.1 would improve consistency and document a best practice.
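
To illustrate what scoping a persisted permission to an origin pair could look like, here is a minimal sketch in TypeScript. The class and method names are hypothetical and are not taken from any browser implementation or from the spec; the only idea carried over from the quoted text is that the lookup key combines the top-level origin with the origin in which the callback executes.

```typescript
// Hypothetical illustration only: names are invented for this sketch.
type Decision = "granted" | "denied";

class OriginPairPermissionStore {
  private readonly decisions = new Map<string, Decision>();

  // Key on BOTH origins. Serializing the tuple as JSON avoids collisions
  // that a plain string concatenation with a delimiter could produce.
  private static key(topLevelOrigin: string, requestingOrigin: string): string {
    return JSON.stringify([topLevelOrigin, requestingOrigin]);
  }

  // Record the user's choice for one (top-level, requesting) pair.
  persist(topLevelOrigin: string, requestingOrigin: string, decision: Decision): void {
    this.decisions.set(
      OriginPairPermissionStore.key(topLevelOrigin, requestingOrigin),
      decision,
    );
  }

  // Returns undefined when no decision is stored for this exact pair,
  // in which case the user agent would need to ask the user.
  lookup(topLevelOrigin: string, requestingOrigin: string): Decision | undefined {
    return this.decisions.get(
      OriginPairPermissionStore.key(topLevelOrigin, requestingOrigin),
    );
  }
}

// The user grants a maps iframe access while it is embedded on a news site.
const store = new OriginPairPermissionStore();
store.persist("https://news.example", "https://maps.example", "granted");
```

With this keying, the same maps iframe embedded on a different top-level site finds no stored decision, so the user would be asked again rather than having the earlier grant silently reused.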

# Minimization

One key principle that supports privacy is data minimization: exposing as little information as possible, collecting no more information than is necessary, and retaining information for no longer than necessary. Minimization is defined and described in the Hansen Internet Draft on privacy terminology [2], for example, but a similar concept is well known in computer science and computer security as "least privilege".

As applied to geolocation information, minimization means allowing requests for lower precision location data, like just the city rather than the street address, or fewer decimal places of precision of latitude and longitude. In some cases (a mobile widget providing current weather, for example), a service can provide exactly the same functionality with much less precise (and therefore less privacy-invasive) information. (Minimization would also suggest options to declare limited retention of collected location information, which was debated but not adopted by the Group during V1.)

Minimization of precision in the Geolocation API was discussed during V1 [3], but largely set aside with the reasoning that fuzzing of lat/lon coordinates would be vulnerable to attack and out-of-scope. The hierarchical fields of the civic address format, however, provide an easy opportunity for sites to request less detailed information from a user.

Allowing site developers to request less precise data was discussed at the September f2f meeting with an action opened [4] to collect more feedback on the issue. I recommend that the Working Group follow up on that action (or document any results already gathered) and raise an issue for the spec to enable requests for only certain fields in the civic address. I believe such functionality would improve privacy of online location data, improve usability of the API for the end user and allow developers to meet best practices.

In my academic work, we specifically recommend to developers of mobile applications that they support data minimization by requesting less precise location data when full precision is not needed. We are documenting this as part of a privacy design patterns project [5] and note that services such as Fire Eagle, Latitude and Geode (a precursor to the Geolocation API) have provided this type of functionality in the past. As we noted in 2010 [6], a spec that allows developers to request less precise or less granular location information would enable this best practice. It would also meet the privacy requirements laid out by the Device APIs Working Group: "APIs must make it easy to request as little information as required for the intended usage" [7] and the draft guidelines from the TAG: "APIs must offer granularity when requesting user information" [8].

Requesting less precise location information could also improve usability -- browsers could offer UI configurations that do not prompt (or prompt much less frequently) for permission for a site to access city- or state-level location information. Decreasing the frequency of these prompts would improve speed and usability and bolster Web security by reducing users' habituation to prompts.

Having a requestAddress option to hint to the user agent that a civic address would be useful is a good starting point for minimization. That option could be extended to specify what level of civic address information is necessary (perhaps as an enumeration of the fields), and the spec should clarify that when requestAddress is false, Position.address must be null. Similarly, the spec should note that when requireCoords is false, Position.coords must be null. Keeping each field of the Address interface optional (as it is in the spec now) could enable a user agent to provide a user interface for users to select which fields should be populated in the response; we might explicitly note this possibility in the specification. Letting sites request less precise information, however, will more strongly encourage the reliable use of minimization.
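
As a rough sketch of how such an extension might behave, the TypeScript below filters a civic address down to an enumerated set of requested fields and returns null when requestAddress is false. The `addressFields` option, the `buildAddress` function and the particular field names are assumptions made for illustration; only requestAddress and the rule that Position.address must be null come from the discussion above.

```typescript
// Field names are an assumption for this sketch; the spec's Address
// interface is the authority on the real list.
interface Address {
  country?: string;
  region?: string;
  city?: string;
  postalCode?: string;
  street?: string;
  streetNumber?: string;
}

type AddressField = keyof Address;

interface AddressRequest {
  requestAddress: boolean;
  // Hypothetical extension: the enumeration of fields the site needs.
  addressFields?: AddressField[];
}

// Build the address a user agent would hand to the site.
function buildAddress(full: Address, options: AddressRequest): Address | null {
  // No address requested: Position.address must be null.
  if (!options.requestAddress) {
    return null;
  }
  // Address requested without an enumeration: return a copy of everything.
  if (options.addressFields === undefined) {
    return { ...full };
  }
  // Otherwise copy only the enumerated fields that are actually known.
  const result: Address = {};
  for (const field of options.addressFields) {
    const value = full[field];
    if (value !== undefined) {
      result[field] = value;
    }
  }
  return result;
}

// A weather widget that needs only the city and country.
const known: Address = {
  country: "US",
  region: "CA",
  city: "Berkeley",
  street: "Example Street",
  streetNumber: "1",
};
const minimized = buildAddress(known, {
  requestAddress: true,
  addressFields: ["country", "city"],
});
```

Here `minimized` carries the country and city and nothing finer-grained, which is the behavior a site-declared enumeration would make routine rather than dependent on each user's choices in a user agent dialog.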

# Issues raised in academic work

I'm aware of two academic reports that have been published regarding privacy and security issues with the Geolocation API: one from Berkeley (which I co-authored) in February 2010 [6][9] and another from the DistriNet Research Group out of Leuven in August 2011 [10]. Neither received a response on the public-geolocation list, and V2 would be an appropriate time to address these comments.

Our Berkeley report recommended changes to allow minimization (already discussed above), usage notification from sites or user-attached rulesets (widely debated in V1 and apparently not taken up again in V2), transparency requirements (providing ambient notice, allowing inspection of outgoing data or logging) and documentation of aggregation risks in location providers. Usage notification (which I recommended the Group re-consider for V2 during the V1 Last Call [11]) would require substantive protocol changes: we've proposed a sample API [12] that would require notification by sites of how location data would be used. The issues of transparency and aggregation might best be handled in non-normative guidelines for implementers; I can help compose text in that area if the Working Group would find it useful.

Of the two issues in the Leuven report, it appears that the issue of terminating processes for unloaded documents was addressed [13]. I have not seen a response to Philippe on the question of distinguishing permissions for getCurrentPosition and watchPosition. It seems that the specification does not prevent user agents from doing this already (leaving user interface out of scope), but it might be worth adding non-normative guidance on this point to 4.3.

# On gathering feedback

As we've discussed in person, the V2 API specification process has seen less involvement from the privacy community than V1 did. If there are ways to encourage that discussion and review in future, I believe it would help the group. One possibility would be to coordinate with the Privacy Interest Group [14], for which we plan to send a Call for Participation shortly.

# Other notes

The Introduction does not mention the (optional) addition of civic address information to the user agent response, even though that addition seems to be the most substantial change to the API.

No examples show use of the requestAddress option.
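
As a starting point, an example along the following lines might serve. This is only a sketch in TypeScript: the `Draft*` interfaces are declared locally as my reading of the draft's shapes (they are not part of the standard DOM typings), and `describeCity` is a hypothetical helper that takes the geolocation object as a parameter so that it does not depend on any particular browser.

```typescript
// Locally declared shapes, assumed for this sketch from the options
// discussed in this comment (requestAddress, requireCoords,
// Position.address, Position.coords).
interface DraftAddress {
  country?: string | null;
  region?: string | null;
  city?: string | null;
}

interface DraftPosition {
  coords: { latitude: number; longitude: number } | null;
  address: DraftAddress | null;
}

interface DraftPositionOptions {
  requireCoords?: boolean;
  requestAddress?: boolean;
}

interface DraftGeolocation {
  getCurrentPosition(
    success: (position: DraftPosition) => void,
    error?: (err: unknown) => void,
    options?: DraftPositionOptions,
  ): void;
}

// Ask only for a civic address, not coordinates, and report the city.
function describeCity(
  geo: DraftGeolocation,
  report: (message: string) => void,
): void {
  geo.getCurrentPosition(
    (position) => {
      // address may be null, and any individual field may be missing.
      const city = position.address?.city;
      report(city ? `Weather for ${city}` : "No civic address available");
    },
    () => report("Location request failed or was denied"),
    { requireCoords: false, requestAddress: true },
  );
}

// In a user agent implementing the draft, this might be called as:
//   describeCity(navigator.geolocation as unknown as DraftGeolocation, console.log);
```

The example deliberately handles a null address and a missing city, since every field of the Address interface is optional.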

The group's home page still has no mention of the LC draft or the request for public comments.

# References

[1] http://www.w3.org/2010/api-privacy-ws/papers/privacy-ws-24.pdf Ian Fette and Jochen Eisinger, Practical Privacy Concerns in a Real World Browser, 12 July 2010

[2] http://tools.ietf.org/id/draft-hansen-privacy-terminology-00.html, Terminology for Talking about Privacy by Data Minimization: Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management

[3] See http://www.w3.org/2008/geolocation/track/issues/18 and the associated mailing list threads, for example.

[4] http://www.w3.org/2008/geolocation/track/actions/91, now four months overdue.

[5] http://privacypatterns.org The full site will be available to the public soon, but this in-progress page shows the relevant content: https://github.com/m0hit/privacypatterns/wiki/Location-granularity

[6] http://escholarship.org/uc/item/0rp834wf, Doty, N., Mulligan, D., Wilde, E., "Privacy Issues of the W3C Geolocation API", UC Berkeley School of Information, 24 February 2010

[7] http://www.w3.org/TR/2010/NOTE-dap-privacy-reqs-20100629/#privacy-minimization Device API Privacy Requirements, W3C Working Group Note, 29 June 2010

[8] http://www.w3.org/2001/tag/doc/APIMinimization.html#Guidelines Data Minimization in Web APIs, Draft TAG Finding, 12 September 2011

[9] http://lists.w3.org/Archives/Public/public-geolocation/2010Mar/0000.html

[10] http://lists.w3.org/Archives/Public/public-geolocation/2011Aug/0006.html

[11] http://lists.w3.org/Archives/Public/public-geolocation/2009Aug/0004.html

[12] http://www.w3.org/2010/policy-ws/papers/03-Doty-Wilde-Berkeley.pdf Nick Doty and Erik Wilde, "Simple Policy Negotiation for Location Disclosure", 5 October 2010

[13] http://www.w3.org/2008/geolocation/track/actions/90

[14] http://www.w3.org/2011/07/privacy-ig-charter.html

Received on Monday, 16 January 2012 07:02:21 UTC