Re: wording for the privacy section from Ian Hickson on 2008-11-05 (public-geolocation@w3.org from November 2008)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 5 Nov 2008 06:01:04 +0000 (UTC)
To: Alissa Cooper <acooper@cdt.org>, John Morris <jmorris@cdt.org>, Kartikaya Gupta <lists.geolocation@stakface.com>
Cc: public-geolocation <public-geolocation@w3.org>
Message-ID: <Pine.LNX.4.62.0811042130590.1041@hixie.dreamhostps.com>

On Fri, 31 Oct 2008, Alissa Cooper wrote:
> 
> Inserting a Geopriv rule set does not automatically necessitate any 
> additional UI, because the usage rules have defaults. This is a crucial 
> point. An existing implementation could return a default UsageRules 
> object to the client (retransmissionAllowed = false; retentionExpires = 
> lastPosition + 24 hours; rulesetReference = null; noteWell = null) 
> without any additional UI.

We have ample experience showing that when browsers only expose defaults, 
Web pages start depending on them being the defaults and never actually 
check whether they have been changed. For example, default text colors, 
backgrounds, font sizes, etc. In this particular case that would be 
disastrous. Not only would it defeat the point of exposing this 
information, it would leave users unable to change their _privacy_ 
settings even if they wanted to (except at the risk of breaking sites), it 
would let sites abuse user privacy while claiming to follow user 
preferences (which are really defaults), and it would lead to a general 
disillusionment among Web authors regarding privacy concerns.


> The next case is where there's a per-site UI. Despite some disagreement 
> in an earlier thread about whether UI would be exposed on a per-site 
> basis, let's assume for a moment that it is, so that every time a site 
> requests location information via this API, the user gets prompted to 
> consent. In this case, one possible UI would prompt the user with two 
> questions:
> 
> 	Ask this site not to share my location with others [ yes | no ]
> 	Ask this site to retain my location for: [ 1 hour | 24 hours |
>       indefinitely ]

This seems to omit the most important question:

        Share my location with this site [ yes | no ]

Why would the _browser_ ask the user whether the site may share the 
location with other sites, or how long it may retain the data, given that 
the browser has no control over either? Wouldn't it be better to let the 
site ask this?


> retransmissionAllowed is either true or false (corresponding to the 
> default, the user's choice in the UI, or the existing value if the UA 
> receives a Geopriv object from the host).

This doesn't really make sense as a single boolean. For example, I'd have 
no problem with Apple sharing my location with Google, but I wouldn't want 
it sharing my location with a site run by a political candidate.


> rulesetReference is a URI. What exists at this URI is an extended set of 
> rules relating to the location being conveyed. These rules can be 
> written using the XML framework described at [1] and [2]. The rules 
> language is highly flexible and extensible, and provides for all kinds 
> of rules limiting or granting transmission, limiting or granting 
> retention for specified periods of time, specifying location fuzziness, 
> and other constraints.
> [1] http://www.ietf.org/rfc/rfc4745.txt
> [2] ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-ietf-geopriv-policy-17.txt

How is this URL supposed to be picked? I mean, how would the user's 
preferences be uploaded somewhere that a script could access? How is the 
script supposed to access this URL? How will the data at this URL be 
protected from other users?


> Here's a very simple example of what might exist at a rulesetReference 
> URI:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <ruleset xmlns="urn:ietf:params:xml:ns:common-policy">
>   <rule id="f3g44r1">
>     <conditions>
>       <identity>
>         <many>
>           <except domain="adserver1.com"/>
>           <except domain="adserver2.com"/>
>         </many>
>       </identity>
>       <validity>
>         <from>2009-08-15T10:20:00.000-05:00</from>
>         <until>2009-09-15T10:20:00.000-05:00</until>
>       </validity>
>     </conditions>
>     <actions/>
>     <transformations/>
>   </rule>
> </ruleset>
> 
> This ruleset says that the Position object with which it is associated 
> can be conveyed to and stored by anyone (i.e., any domain) between 
> August 15 and September 15 except for adserver1.com and adserver2.com.

With all due respect, authors aren't going to make heads or tails of this 
data. In fact, having Web authors _attempt_ to use data in a format this 
complicated will likely turn them off dealing with privacy issues for 
years, leaving them with the impression that privacy is hard and 
complicated and not worth the effort.



> > Could you show how a script would use these values?
> 
> Let's take the pizza place example. The pizza place site has an 
> agreement with an ad network to show location-based ads on the pizza 
> place site. The pizza place also stores the locations of visitors to the 
> site in its database for demographic analysis. Some pseudocode of how 
> the usageRules might interact with these two endeavors:
> 
> navigator.geolocation.getCurrentPosition(function (position, usageRules) {
> 	if (usageRules.retransmissionAllowed == true)
> 		send (position, IP address) tuple to adserver
> 	insert (position, IP address, usageRules.retentionExpires) tuple
> 		in location database
> })

Those two are understandable, but it is the other two attributes 
(rulesetReference and noteWell) that I would like to see script for.
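
For reference, here is a minimal sketch of how the two understandable 
attributes might be handled in real script. It is only an illustration 
under stated assumptions: the usageRules object is the one proposed in 
this thread and is not part of any shipped API, and the function name 
planLocationHandling, the action names, and the sample values are all 
hypothetical. rulesetReference and noteWell are left untouched because 
nothing in the thread says how a script would consume them.

```javascript
// Hypothetical sketch: `usageRules` is the object proposed in this thread,
// not part of any shipped Geolocation API.
function planLocationHandling(position, usageRules, clientAddress) {
  const record = {
    latitude: position.coords.latitude,
    longitude: position.coords.longitude,
    clientAddress,
  };
  const actions = [];
  // Forward to the ad server only when the rules explicitly allow it.
  if (usageRules.retransmissionAllowed === true) {
    actions.push({ type: "sendToAdServer", record });
  }
  // Always plan the database insert, carrying the expiry with the record
  // so that a separate cleanup job could honor it.
  actions.push({
    type: "storeInDatabase",
    record,
    deleteAfter: usageRules.retentionExpires,
  });
  return actions;
}

// Example using the defaults suggested in the quoted message.
const position = {
  coords: { latitude: 37.42, longitude: -122.08 },
  timestamp: Date.UTC(2008, 10, 5), // months are zero-based: 10 is November
};
const defaults = {
  retransmissionAllowed: false,
  retentionExpires: new Date(position.timestamp + 24 * 60 * 60 * 1000),
  rulesetReference: null,
  noteWell: null,
};
const actions = planLocationHandling(position, defaults, "192.0.2.1");
console.log(actions.map((a) => a.type));
```

With the default rules only the database insert is planned. Note that 
nothing in the sketch forces a site to call such a function at all.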


> If the user allows retransmission of his location, the pizza place sends 
> it to the ad server.

With all due respect, I think this underestimates the power of greed. If 
browsers default to "false" for this value, which seems advisable if we 
were to have a default, then sites will just ignore the setting and send 
the user's location out anyway, possibly with a site-level opt-out.


On Fri, 31 Oct 2008, Alissa Cooper wrote:
> 
> The client would not be constrained to obey the rules any more than a 
> food finder is currently constrained to use the velocity that the 
> Position object conveys to it (which I'm guessing most food finders 
> don't use) or any more than a friend finder is currently constrained to 
> use the altitude accuracy that the Position object conveys to it (which 
> I'm guessing goes unused as well). Every client is not going to use 
> every piece of data it receives via an API call. But that doesn't mean 
> sending that data is not worthwhile for clients that will make use of 
> the data.

Agreed; the reason it is not worthwhile in this case is that including 
this data would actually harm the privacy cause, as noted above.


On Tue, 4 Nov 2008, Alissa Cooper wrote:
> 
> I think the disconnect here is in your conception of "malicious" sites. 
> Truly malicious sites are going to do whatever they want no matter how 
> you write the API. It doesn't matter if they receive privacy rules, or 
> if they make claims in a privacy policy. They will abuse the location 
> information they receive, and they will find ways to obtain information 
> they shouldn't have.

Ok.


> Rather, privacy rules would force the broad swath of non-malicious web 
> developers out there to confront and grapple with privacy. As John said 
> in an earlier email [1],
> 
> "The browser makers will of course not be able to force downstream 
> developers to in fact play nice on privacy, but if the user's 
> 'expectation of privacy' is made clear to the downstream developer, then 
> the developer's local law may well force them to honor those 
> expectations."
> 
> The point is to make developers who would otherwise ignore privacy 
> (without malicious intent) to think about it.

Who are these non-malicious developers who would ignore privacy normally 
but would _not_ ignore it if we included this feature? I would be very 
surprised if there were any significant number of such people. Without 
clear evidence of large numbers of such people, I'd be very reluctant to 
support such an API. Personally, I think our experience with designing Web 
languages and APIs is that we will not have any such effect. Look at 
DOCTYPEs, which have had no effect on authors thinking of validation; or 
at providing authors with a non-string API for setTimeout() -- authors 
still use the string API; or at making alt="" on <img> required, which 
should have made authors aware of accessibility, but did little to improve 
the situation.



> > So the only sites that will observe these rules are the well-behaved 
> > ones, which probably don't need these rules anyway. Or is there 
> > something that I am still missing?
> 
> Frankly, I would be astounded if even the small number of sites that 
> already obtain location information using the existing version of this 
> API (1) delete location data after some amount of time less than many 
> years, and (2) commit to not sharing location information with others. 

As far as I can tell, both of these already happen for most reputable 
sites for the majority of user data, including geolocation data. Do you 
have counterexamples? Maybe it would help to know what sites you are 
worried about.


On Fri, 31 Oct 2008, John Morris wrote:
> 
> If you truly believe that the Internet privacy model "has worked very 
> well for all private information up to this point," then we may just 
> have to agree to disagree on that.  I think there is fairly broad 
> consensus in the U.S. at least (and I am pretty sure in the E.U.) that 
> privacy on the Internet has been a failure.  Indeed, your employer, 
> FWIW, has taken a leading role in the industry in the U.S. in calling 
> for new privacy laws to help address a range of serious privacy 
> problems.  According to Google management, the consumer privacy 
> situation is "uneven at best."  See, for example, 
> http://googleblog.blogspot.com/2006/06/calling-for-federal-consumer-privacy.html 
> (joining a broad effort led by CDT to enact new privacy laws).

The problem is with a lack of policy/law that can enforce these 
requirements on malicious or unethical sites, not with reputable sites 
having poor privacy policies. It's the laws that are uneven, not the 
technologies. I certainly agree that we could do with better laws, 
education, and so on. That's entirely unrelated to API design.


On Tue, 4 Nov 2008, Kartikaya Gupta wrote:
> 
> Given the choice between confusing users and misleading users, it seems 
> that CDT is advocating the "misleading users" approach and everybody 
> else is advocating the "confusing users" approach. Both seem pretty bad 
> to me, but I can't think of any other solution that makes sense either.

I really don't think those are different choices. I think that exposing 
these features would both confuse _and_ mislead users, whereas exposing 
only a per-site boolean "allow access / disallow access", with the user 
having to make a determination about the trustworthiness of the site, is 
the option most likely to be both understood by users and honest with 
them (of the options presented so far, anyway). I think users would 
understand this mechanism, since it is the mechanism that they've had to 
deal with so far on the Web for all other privacy-sensitive data.
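
To make the per-site boolean concrete, here is a rough sketch of the only 
state a browser would need to keep under that model. It assumes nothing 
about any real browser: the class name SiteLocationPermissions, its 
methods, and the example origins are all hypothetical.

```javascript
// Hypothetical per-site store: one boolean per origin and nothing else.
class SiteLocationPermissions {
  constructor() {
    this.decisions = new Map(); // origin -> true (allow) | false (deny)
  }

  // Record the user's answer to "Share my location with this site?"
  record(origin, allow) {
    this.decisions.set(origin, Boolean(allow));
  }

  // Returns "allow" or "deny" for a known site, or "ask" when the user
  // has not yet been prompted for this origin.
  check(origin) {
    if (!this.decisions.has(origin)) return "ask";
    return this.decisions.get(origin) ? "allow" : "deny";
  }
}

const permissions = new SiteLocationPermissions();
permissions.record("https://pizza.example", true);
permissions.record("https://candidate.example", false);
console.log(permissions.check("https://pizza.example"));
console.log(permissions.check("https://unknown.example"));
```

The judgment about whether a site is trustworthy stays with the user; the 
browser only remembers the answer.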

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 5 November 2008 06:02:16 UTC