
Re: Geolocation: Security and Privacy

From: Aaron Straup Cope <straup@gmail.com>
Date: Thu, 12 Jun 2008 08:35:14 -0700
Message-ID: <48514232.5060600@gmail.com>
To: public-geolocation@w3.org

Having been one of those people who presented on the subject of reverse 
geocoding at Where [1], I will point out that there are two Big Things 
(tm) to keep in mind:

1) There is always going to be an interpretation of bias in the 
hierarchy of relationships you choose. This is okay, really.

The simplest example is to contrast the way that Flickr [2] and 
FireEagle handle "localities" since the two sites share an almost exact 
hierarchy of places. Flickr treats anything with neighbourhoods as a 
locality so in our model Duncan Mills, CA (pop. 84) and Mexico City 
(pop. 19M) are assumed to be the same "type" of place.

FireEagle does not. If you authorize an application to know your 
whereabouts at a "city" level there is an expectation that your actual 
location will be suitably "fuzzed" [3] and in a town of 84 people 
there's not a lot of room to get fuzzy in.
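
To put a rough number on "room to get fuzzy", here is a minimal Python 
sketch that treats the diagonal of a place's bounding box as an upper 
bound on how far a city-level fuzzed point can sit from the real one. 
The function names and both bounding boxes are made up for illustration; 
they are not real boundaries and not anything FireEagle or Flickr 
actually does.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def max_fuzz_km(bbox):
    """Approximate upper bound on fuzzing distance inside a bounding box.

    bbox is (south, west, north, east); the diagonal is used as a rough
    stand-in for the largest distance between two points in the box.
    """
    south, west, north, east = bbox
    return haversine_km(south, west, north, east)


# Hypothetical boxes, chosen only to contrast a tiny town with a megacity.
tiny_town = (38.45, -123.06, 38.46, -123.05)
megacity = (19.05, -99.36, 19.59, -98.94)

print(round(max_fuzz_km(tiny_town), 1), "km of room in the tiny town")
print(round(max_fuzz_km(megacity), 1), "km of room in the megacity")
```

Both places are "cities" in the hierarchy, but under these assumed 
boxes one of them leaves about a kilometre of slack and the other 
leaves tens of kilometres.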

2) Never mind so-called disputed places (Kashmir, the West Bank, Cyprus, 
etc.) all neighbourhoods are "disputed" around the edges. (This is often 
true of localities, as well.)

For example, the rough consensus in San Francisco is that Dolores Street 
is the dividing line between the Mission and Noe Valley. That said there 
are those people who may live on the one side of the line and very much 
believe themselves to be living on the "other". Our experience has been 
that there are few better ways to pick a fight than to tell someone what 
neighbourhood they are in (and being wrong).

There is also the problem where the data simply doesn't exist yet or it 
is just old and dusty, sometimes wrong, and often plain weird : 
"Manhattan Valley", anyone?

This is further compounded by the lack of ideas/tools/infrastructure for 
reflecting changes (both social and political) but, ultimately, 
those are both somewhat tangential. I mention them only to highlight 
some of the issues.

So, I agree that there isn't much point in doing anything *but* leaving 
the actual process of reverse-geocoding up to individual sites and 
providers.

To the extent that anything should be codified in an API it might be as 
simple as naming conventions for performing something that looks like 
reverse geocoding, some minimum amount of data to be returned and maybe 
a simple way to denote place types and relationships (perhaps (read: 
probably) at the expense of ISO-levels of thoroughness and complexity 
that the GIS world enjoys).

In the hand waving department :

* geo.resolveLatLon(lat, lon, ?accuracy, ?provider_uri)

     returns a unique identifier

* geo.fetchLocation(uid, ?provider_uri)

     returns lat, lon and [yer namespaced stuff here]

* geo.fetchLocationHierarchy(uid, ?provider_uri)

     returns a nested set of unique identifiers

Or maybe just the first two.
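
To make the hand waving slightly more concrete, here is a minimal 
Python sketch of the three calls against a toy in-memory provider. 
Everything in it is an assumption for illustration: the Place fields, 
the "ex:" identifiers, the approximate coordinates, treating the 
optional accuracy argument as a place-type filter, and flattening the 
"nested set" of identifiers into a list ordered from most to least 
specific. The provider object stands in for ?provider_uri.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Place:
    uid: str
    lat: float
    lon: float
    place_type: str
    parent: Optional[str] = None
    extras: dict = field(default_factory=dict)  # namespaced provider data


class InMemoryProvider:
    """Toy stand-in for whatever a ?provider_uri would point at."""

    def __init__(self, places):
        self._places = {p.uid: p for p in places}

    def resolve_lat_lon(self, lat, lon, accuracy=None):
        """Return the uid of the nearest place, optionally of one type."""
        candidates = [p for p in self._places.values()
                      if accuracy is None or p.place_type == accuracy]
        if not candidates:
            raise LookupError("no place matches the requested accuracy")
        # Squared distance in degrees is enough for a toy nearest match.
        nearest = min(candidates,
                      key=lambda p: (p.lat - lat) ** 2 + (p.lon - lon) ** 2)
        return nearest.uid

    def fetch_location(self, uid):
        """Return lat, lon and the provider's namespaced extras."""
        p = self._places[uid]
        return {"lat": p.lat, "lon": p.lon,
                "place_type": p.place_type, **p.extras}

    def fetch_location_hierarchy(self, uid):
        """Return uids from the place itself up to its top-level parent."""
        chain = []
        current = uid
        while current is not None:
            chain.append(current)
            current = self._places[current].parent
        return chain


# Made-up identifiers and rough coordinates, for illustration only.
provider = InMemoryProvider([
    Place("ex:sf", 37.77, -122.42, "locality"),
    Place("ex:mission", 37.76, -122.42, "neighbourhood", parent="ex:sf",
          extras={"example:label": "The Mission"}),
    Place("ex:noe", 37.75, -122.43, "neighbourhood", parent="ex:sf",
          extras={"example:label": "Noe Valley"}),
])

uid = provider.resolve_lat_lon(37.759, -122.419, accuracy="neighbourhood")
print(uid)
print(provider.fetch_location(uid))
print(provider.fetch_location_hierarchy(uid))
```

The only thing a caller has to hold on to is the identifier; which 
side of the street counts as which neighbourhood stays the provider's 
problem.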

As is usually the case with these things, reliable and permanent 
identifiers (read: don't require users to stab themselves in the face 
with the vagaries of plain-old geocoding) are the key. [4] As long as 
those "nubby bits" are present we can teach the robots to handle the rest.

Cheers,

-- 

[1] http://www.slideshare.net/straup/aware-of-only-one-voice
[2] I work at Flickr.
[3] Assuming that you share an expectation that cities are "big".
[4] Examples include Geonames [5] and Yahoo's GeoPlanet [6]
[5] http://geonames.org/
[6] http://developer.yahoo.com/geo/

Alec Berntson wrote:
> Reverse Geocoding does have its challenges, but I don't think we should 
> remove the notion of a civic address for a few reasons.
>  
> 1.) The address that the API supplies could be user entered. If I am a 
> desktop owner, I could conceivably have a provider that lets me enter my 
> home address. Since the machine doesn't move, it would always be 
> accurate. The value of faster searches, relevant ads, etc as a result of 
> location-based context all still apply to desktops.
>  
> 2.) reverse geocoding is a focus of much research (At the Where 2.0 
> conference many people presented on the topic), and is only going to get 
> better. We should at least keep the framework in place to support it as 
> the technology improves.
>  
> That being said, I agree that we shouldn't place any strong requirements 
> on the availability of a reverse geocoding service in the API. We 
> certainly should not count on it for 'data fuzzing.'
>  
> -----Original Message-----
> From: Ryan Sarver [mailto:rsarver@skyhookwireless.com]
> Sent: Wednesday, June 11, 2008 10:34 AM
> To: Alec Berntson
> Cc: Chris Butler; public-geolocation@w3.org
> Subject: Re: Geolocation: Security and Privacy
>  
> Something we have to be very cognizant of is that reverse-geocoding,
> especially to street level is really only feasible in the US and we
> don't have any reliable, distributed providers.
>  
> Just as a thought -- why not leave reverse-geocoding up to the
> implementing site or developer? I know there are a lot of benefits to
> passing that information from the browser, but it feels like there are
> too many exceptions to make it part of the spec.
>  
> Thoughts?
>  
> On Jun 10, 2008, at 2:09 PM, Alec Berntson wrote:
>  
>  >
>  > I've taken a look at fireeagle's offerings, and I agree that their
>  > "Location Hierarchy" is very nice. However, they are able to
>  > accomplish all of that because they do not just offer an API, but
>  > rather a whole platform and back-end.
>  >
>  > This leads us to the question of, how much should this location api
>  > just be a data pass-through, and how much server-side service
>  > support should be required? We are already treading that line with
>  > reverse-geocoding. Mandating web-service support as part of an API
>  > definition seems a little out of scope to me...
>  >
>  > What do other people think?
>  >   -Alec
>  >
>  > -----Original Message-----
>  > From: public-geolocation-request@w3.org 
> [mailto:public-geolocation-request@w3.org
>  > ] On Behalf Of Alec Berntson
>  > Sent: Tuesday, June 10, 2008 10:25 AM
>  > To: Chris Butler; public-geolocation@w3.org
>  > Subject: RE: Geolocation: Security and Privacy
>  >
>  >
>  > I am still waiting for fireeagle "to be ready" to give me an invite :(
>  >
>  > -----Original Message-----
>  > From: Chris Butler [mailto:cbutler@dash.net]
>  > Sent: Monday, June 09, 2008 7:08 PM
>  > To: Alec Berntson; public-geolocation@w3.org
>  > Subject: RE: Geolocation: Security and Privacy
>  >
>  > Hi Alec.
>  >
>  > The approach to provide a randomized dance around the current location
>  > would provide a way to potentially brute force a more approximate real
>  > location by accessing the data multiple times and doing the average...
>  >
>  > One example of a site that does this in a way that I think is pretty
>  > nice is the way that FireEagle does it.
>  >
>  > Have you checked that out?
>  >
>  > Giving back a reverse geocoded string would work as well so that the
>  > service doesn't need to provide that service...
>  >
>  > Thanks.
>  >
>  > Chris
>  >
>  > -----Original Message-----
>  > From: Alec Berntson [mailto:alecb@windows.microsoft.com]
>  > Sent: Monday, June 09, 2008 11:16 AM
>  > To: Chris Butler; public-geolocation@w3.org
>  > Subject: RE: Geolocation: Security and Privacy
>  >
>  > To accomplish "Data fuzzing," I think the easiest solution is to
>  > randomize the lat/long values to some number of decimal places based
>  > on
>  > what the user is willing to give. I agree that bounding boxes and
>  > center
>  > points of cities makes a lot of sense, but that seems like an
>  > implementation nightmare - all the central points/bounding boxes would
>  > need to be stored in a database somewhere and accessed.
>  >
>  > One alternative is to only perform limited reverse geocoding (i.e.
>  > only
>  > give the city, state, country) for sites that the user does not trust
>  > and withhold the lat/long values. Then if the user gives consent (via
>  > UI?), the actual coordinate could be returned.
>  >
>  > -----Original Message-----
>  > From: Chris Butler [mailto:cbutler@dash.net]
>  > Sent: Saturday, June 07, 2008 6:47 PM
>  > To: Alec Berntson; public-geolocation@w3.org
>  > Subject: RE: Geolocation: Security and Privacy
>  >
>  > Hi Alec.
>  >
>  > I think that you make a good point about the 'fuzzing' of user
>  > location.
>  > I wonder what the best way to do this though is.
>  >
>  > In the case of just giving city level information, here are some
>  > options:
>  >
>  > * Lat/lon of a geocoded center of the city
>  > * Geocode-able city name
>  > * Bounding box of the city
>  >
>  > The last option sounds like the best since it is non specific and
>  > doesn't give any single point as the location...
>  >
>  > Thoughts?
>  >
>  > Thanks.
>  >
>  > Chris Butler | Content Platform Evangelist, Dash Navigation | Office:
>  > 408-543-2939 | Mobile: 415-577-9130 | Fax: 408-400-0939
>  >
>  > -----Original Message-----
>  > From: public-geolocation-request@w3.org
>  > [mailto:public-geolocation-request@w3.org] On Behalf Of Alec Berntson
>  > Sent: Friday, June 06, 2008 11:32 AM
>  > To: public-geolocation@w3.org
>  > Subject: Geolocation: Security and Privacy
>  >
>  >
>  > One of the most important aspects of the geolocation API spec (IMO)
>  > will
>  > be the privacy and security requirements. The user's current
>  > location is
>  > probably one of the most sensitive pieces of personal
>  > information available. The references in the draft spec point to a few
>  > solid approaches that I would like to highlight (and build on):
>  >
>  > Opt-out by default
>  >    By default, no page can access the user's location
>  >
>  > UI to alert the user
>  >    There needs to be an alert when a page requests the user's location
>  >    There needs to be some form of status UI indicating when location
>  > data is being accessed
>  >
>  > Least privilege
>  >    The user should be given the option to allow access to a page (or
>  > domain) for
>  >       Just this once
>  >       Just this session
>  >       Always
>  >    Data 'fuzzing'
>  >       User can control how much resolution to give to a page
>  >       Add noise to the data if more accurate information is available
>  > than is requested
>  >
>  > Logging
>  >    Keep a log of what information was given out to whom
>  >
>  > Hope that kicks off some discussion!
>  >    -Alec
Received on Thursday, 12 June 2008 16:50:29 GMT