Re: wording for the privacy section from Alissa Cooper on 2008-11-12 (public-geolocation@w3.org from November 2008)

From: Alissa Cooper <acooper@cdt.org>
Date: Wed, 12 Nov 2008 16:36:58 -0500
To: Ian Hickson <ian@hixie.ch>
Cc: John Morris <jmorris@cdt.org>, public-geolocation <public-geolocation@w3.org>
Message-Id: <93D65211-23FB-4AC2-B290-0AFFFA5CA4F6@cdt.org>
Ian,

I was holding off on a response in anticipation of Richard's post  
about his Firefox add-on. The UI that I sketched out was a straw man  
based on a few hours' review of the Geopriv architecture. I think  
Richard has significantly evolved the concept by implementing a  
potential UI (although it, too, was created in haste). At this point I  
think it makes more sense to use his implementation as the discussion  
point, since I think it may have answered or be able to answer several  
of your and others' questions. I have briefly responded to some of  
your other queries below.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:

> We have ample experience showing that when browsers only expose defaults,
> Web pages start depending on them being the defaults and never actually
> verify that they might not be. For example, default text colors, backgrounds,
> font sizes, etc. In this particular case that would be disastrous, because
> not only would it defeat the point of exposing this information, it would
> lead to users not being able to change their _privacy_ settings even if
> they wanted to (at the risk of breaking sites), it would lead to sites
> being able to abuse user privacy by claiming to be following their
> preferences (which are really defaults), and it would lead to a general
> disillusionment among Web authors regarding privacy concerns.

The point of having privacy-protective defaults is that even when
sites use them in place of more granular preferences, they still
provide more information about the user's preferences than having no
rules at all. I don't think the existence of default values rules out
offering more granular settings. Firefox allows sites to set cookies
by default, but also allows users to block cookies from specific
domains. How does the existence of a default imply that users can't
set their own privacy settings?
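
As a rough sketch only, a browser-wide default and per-site overrides could coexist much like the cookie settings do. Nothing below comes from the draft API apart from the retransmissionAllowed field: createPreferenceStore, setForDomain, rulesFor, and retentionSeconds are hypothetical names, and the 24-hour default is an arbitrary assumption.

```javascript
// Browser-wide defaults (values are assumptions chosen for illustration).
const DEFAULT_RULES = Object.freeze({
  retransmissionAllowed: false,   // privacy-protective default
  retentionSeconds: 24 * 60 * 60, // hypothetical field: keep for 24 hours
});

// Hypothetical preference store: defaults plus per-domain user overrides.
function createPreferenceStore(defaults = DEFAULT_RULES) {
  const overrides = new Map(); // domain -> rules the user set for that site

  return {
    // Record the user's explicit choice for one site.
    setForDomain(domain, rules) {
      overrides.set(domain.toLowerCase(), { ...rules });
    },
    // Rules to attach to a position handed to `domain`:
    // the defaults, with any per-site choices layered on top.
    rulesFor(domain) {
      return { ...defaults, ...(overrides.get(domain.toLowerCase()) || {}) };
    },
  };
}

const store = createPreferenceStore();
store.setForDomain("maps.example", { retransmissionAllowed: true });
console.log(store.rulesFor("maps.example").retransmissionAllowed);  // true
console.log(store.rulesFor("other.example").retransmissionAllowed); // false
```

A site the user never configured still receives the defaults, while a configured site receives the user's own choice, so the default does not prevent a more granular setting.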

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:

>> The next case is where there's a per-site UI. Despite some disagreement
>> in an earlier thread about whether UI would be exposed on a per-site
>> basis, let's assume for a moment that it is, so that every time a site
>> requests location information via this API, the user gets prompted to
>> consent. In this case, one possible UI would prompt the user with two
>> questions:
>>
>> 	Ask this site not to share my location with others [ yes | no ]
>> 	Ask this site to retain my location for: [ 1 hour | 24 hours |
>>      indefinitely ]
>
> This seems to omit the most important question:
>
>        Share my location with this site [ yes | no ]


In my text quoted above, I said that this example assumes "that every  
time a site requests location information via this API, the user gets  
prompted to consent." I just didn't include it in the little UI mock-up.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:

> I'd have no problem with Apple
> sharing my location with Google, but I wouldn't want it sharing my
> location with the site run by a political candidate.
>

This is the point of having a ruleset reference, so you can express  
this sort of granular preference. As an aside, if we rely on privacy  
policies as the sole means to protect user privacy interests, this  
point is entirely moot. Most sites' explanations of who they disclose  
data to don't even come close to the level of detail you suggest above;
instead, they talk about disclosure to "affiliates," "third parties,"
or some other generic entities. So if a user like you who is fine with
disclosure to Google but not to a politician has to rely only on a
privacy policy that talks about third-party disclosure in vague terms
in order to decide whether to disclose his location to the site, how  
will he do it?

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:

>> rulesetReference is a URI. What exists at this URI is an extended set of
>> rules relating to the location being conveyed. These rules can be
>> written using the XML framework described at [1] and [2]. The rules
>> language is highly flexible and extensible, and provides for all kinds
>> of rules limiting or granting transmission, limiting or granting
>> retention for specified periods of time, specifying location fuzziness,
>> and other constraints.
>> [1] http://www.ietf.org/rfc/rfc4745.txt
>> [2] ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-ietf-geopriv-policy-17.txt
>
> How is this URL supposed to be picked? I mean, how would the user's
> preferences be uploaded somewhere that a script could access? How is the
> script supposed to access this URL? How will the data at this URL be
> protected from other users?


One example would be if my network provider allows me to set privacy
rules around my location information that it makes available on the
network. Those rules could be uploaded to a URI designated by the
network provider, and that URI could be passed through the usageRules
object in the Position interface. I am not an expert on securing this
kind of transaction, but RFC 4119 [1] seems to suggest a few ideas that
apply broadly to securing XML documents.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:
>
>> Here's a very simple example of what might exist at a rulesetReference
>> URI:
>>
>> <?xml version="1.0" encoding="UTF-8"?>
>> <ruleset xmlns="urn:ietf:params:xml:ns:common-policy">
>>   <rule id="f3g44r1">
>>     <conditions>
>>       <identity>
>>         <many>
>>           <except domain="adserver1.com"/>
>>           <except domain="adserver2.com"/>
>>         </many>
>>       </identity>
>>       <validity>
>>         <from>2009-08-15T10:20:00.000-05:00</from>
>>         <until>2009-09-15T10:20:00.000-05:00</until>
>>       </validity>
>>     </conditions>
>>     <actions/>
>>     <transformations/>
>>   </rule>
>> </ruleset>
>>
>> This ruleset says that the Position object with which it is associated
>> can be conveyed to and stored by anyone (i.e., any domain) between
>> August 15 and September 15 except for adserver1.com and adserver2.com.
>
> With all due respect, authors aren't going to make heads or tails of this
> data. In fact, having Web authors _attempt_ to use data in a format this
> complicated will likely turn them off dealing with privacy issues for
> years, leaving them with the impression that privacy is hard and
> complicated and not worth the effort.
>

On the one hand, you claim that the majority of developers out there  
already do a great job of protecting privacy and they care a lot about  
it. On the other hand, you claim that none of them will be willing to  
think about following a simple set of user privacy preferences, and  
that the mere thought of it will cause them to renounce their concern  
for privacy altogether. Which is it?

Plus, if developers are looking for a middle ground, they can pretty  
easily use the defaults.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:

>> Let's take the pizza place example. The pizza place site has an
>> agreement with an ad network to show location-based ads on the pizza
>> place site. The pizza place also stores the locations of visitors to the
>> site in its database for demographic analysis. Some pseudocode of how
>> the usageRules might interact with these two endeavors:
>>
>> navigator.geolocation.getCurrentPosition(someFunction(position, usageRules))
>> if (usageRules.retransmissionAllowed == true)
>> 	send (position, IP address) tuple to adserver
>> insert (position, IP address, usageRules.retentionExpires) tuple in location database
>
> Those two are understandable, but it was more the other two attributes I
> would like to see script for.


The script would dereference the rulesetReference, parse the privacy  
rules there, and apply them just the same way the rules above are  
applied. What the script code would actually look like largely depends  
on what the rules say and the scope of the rules that the script is  
willing to enforce.
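
As a minimal sketch of what such a script might do with the example ruleset quoted above, assume the XML has already been fetched and parsed into a plain object. The object shape and the helper names (identityMatches, validityMatches, mayShareWith) are hypothetical. Only the `<many>`/`<except>` identity condition and the `<validity>` window from the example are handled, and treating a missing condition as "no restriction" is an assumption of this sketch.

```javascript
// Hypothetical in-memory form of the example ruleset, i.e. what a script
// might hold after fetching and parsing the XML at rulesetReference.
const exampleRuleset = [
  {
    id: "f3g44r1",
    conditions: {
      identity: { many: { except: ["adserver1.com", "adserver2.com"] } },
      validity: [
        {
          from: "2009-08-15T10:20:00.000-05:00",
          until: "2009-09-15T10:20:00.000-05:00",
        },
      ],
    },
  },
];

// True when `domain` is covered by <many> and not listed in an <except>.
function identityMatches(identity, domain) {
  if (!identity) return true;       // assumption: no identity condition, no restriction
  if (!identity.many) return false; // this sketch only understands <many>
  const excluded = identity.many.except || [];
  return !excluded.includes(domain.toLowerCase());
}

// True when `now` falls inside at least one <from>/<until> window.
function validityMatches(validity, now) {
  if (!validity) return true;       // assumption: no validity condition, no time limit
  return validity.some(
    (w) => new Date(w.from) <= now && now <= new Date(w.until)
  );
}

// The position may be conveyed to `domain` if any rule's conditions all match.
function mayShareWith(rules, domain, now = new Date()) {
  return rules.some(
    (rule) =>
      identityMatches(rule.conditions.identity, domain) &&
      validityMatches(rule.conditions.validity, now)
  );
}
```

With a date inside the window, such as new Date("2009-08-20T12:00:00Z"), mayShareWith(exampleRuleset, "pizza.example", date) is true and mayShareWith(exampleRuleset, "adserver1.com", date) is false. Outside the window no rule matches, so nothing is shared.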

The notewell is human-readable, so likely would be stored where it may  
be accessed by a person.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:
>> If the user allows retransmission of his location, the pizza place sends
>> it to the ad server.
>
> With all due respect, I think this underestimates the power of greed. If
> browsers default to "false" for this value, which seems advisable if we
> were to have a default, then sites will just ignore the setting and send
> the user's location out anyway, possibly with a site-level opt-out.
>

See my next email to Angel for a discussion on why privacy rules are  
helpful even when they are ignored.

As a side note, I think it would take a pretty gutsy site to  
acknowledge its purposeful disregard of the user's preference by  
offering an opt-out. But if it did, the presence of the opt-out would  
be a perfect basis for users to make the case that the site is not  
respecting users' explicit preferences.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:
> Who are these non-malicious developers who would ignore privacy normally
> but would _not_ ignore it if we included this feature? I would be very
> surprised if there were any significant number of such people.

Let's pick two easy examples (although I think there are many).  
Facebook is a company that has put a lot of work and thought into  
designing its services in a privacy-protective way. And yet their  
first implementation of the Beacon incurred such tremendous backlash  
that they eventually made significant changes to its design so that it  
was more in line with users' preferences and expectations. What if the  
purchase data used in the Beacon system had originally been  
accompanied by a rule that expressed "don't post this to my profile  
without asking" or something similar? If this had been the case, I  
would venture to guess that Facebook would not have trudged ahead with  
revealing purchase information without first obtaining consent.  
Instead, they guessed or assumed people's preferences and found out,  
quite publicly, how wrong they were. The intent was not malicious.

The AOL search logs release in 2006 is another example. Had those  
search logs been accompanied by rules that said something along the  
lines of, "don't post my search logs publicly," or even "ask me before  
releasing my logs to researchers," maybe AOL would have thought twice  
about doing those things. AOL developers spend a lot of time thinking  
about privacy, and I don't think the search logs release was motivated  
by malicious intent. But if AOL had received a clear statement of user  
preferences ahead of time, I find it hard to believe that they would  
have gone ahead with the release anyway.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:
>> Frankly, I would be astounded if even the small number of sites that
>> already obtain location information using the existing version of this
>> API (1) delete location data after some amount of time less than many
>> years, and (2) commit to not sharing location information with others.
>
> As far as I can tell, both of these already happen for most reputable
> sites for the majority of user data, including geolocation data. Do you
> have counterexamples? Maybe it would help to know what sites you are
> worried about.
>

I would be very interested in hearing about all the sites that delete  
their data promptly and promise to not share it with others. Do you  
have examples?

As for examples in the other direction, we can look to sites' privacy  
policies to determine their enforceable practices (there's no way to  
check undisclosed practices). I did a brief survey of the privacy  
policies of some of the sites implementing the current geolocation  
API, with an eye towards their data retention and disclosure policies:

ITN has no privacy policy that I could find.

Pownce's policy is silent on data storage limits. In fact, Pownce
reserves the right to retain your data even after you've requested its
deletion or submitted updates to it. Pownce also reserves the right to
disclose user data without consent to apply their own ToS or "other
agreements" and to protect the rights of "customers or others." Both
of these provisions could be interpreted to give Pownce the right to
disclose location data without consent for a rather broad array of
purposes, many of which probably contradict user preferences.

Lastminute.com's policy is silent on data storage limits. Lastminute.com
also reserves the right to disclose user data without consent to
"certain permitted third parties," and while it provides examples of
such third parties, the list is not in any way limited to those
examples, meaning that whoever Lastminute.com deems to be a
"permitted" third party may obtain data without user consent.

Rummble's policy is silent on data storage limits.

Azarask.in has no privacy policy that I could find.

> On Nov 5, 2008, at 1:01 AM, Ian Hickson wrote:
> The problem is with a lack of policy/law that can enforce these
> requirements on malicious or unethical sites, not with reputable sites
> having poor privacy policies.

See my next email to Angel regarding how poorly privacy policies --
including those of reputable sites -- are working for average Internet
users.

Alissa

[1] http://www.ietf.org/rfc/rfc4119.txt
Received on Wednesday, 12 November 2008 21:38:01 UTC