Re: Use case: tiger map/census data: have it your way

Eric, I believe the 3rd kind is my preference
e.g. given (on the web somewhere)
[:latitude 42.3; :longitude -71.1; :cityName "Cambridge"; :state "MA";
:country "USA"].
[:latitude 41.9; :longitude -87.6; :cityName "Chicago"; :state "IL";
:country "USA"].
[:latitude 42.1; :longitude -71.3; :cityName _:cn; :state "MA"; :country
"USA"].

and asking (via http)
_:c :latitude _:lat; :longitude _:long; :cityName _:city.
((((_:lat 42.3).math:difference 2).math:exponentiation ((_:long
-71.1).math:difference 2).math:exponentiation).math:sum
0.5).math:exponentiation math:lessThan 0.5.

returns (a graph which I got via euler)
_:2_1 :latitude 42.3.
_:2_1 :longitude -71.1.
_:2_1 :cityName "Cambridge".
0 math:lessThan 0.5.
_:4_1 :latitude 42.1.
_:4_1 :longitude -71.3.
_:4_1 :cityName _:cn_1.
0.282842712474618 math:lessThan 0.5.
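
The two distances in that result can be checked by hand. A minimal Python sketch, assuming the same flat distance in degrees that the query uses; the function name and the tuple layout are made up for illustration, and the blank-node city name is shown as None:

```python
import math

def degree_distance(lat, long, ref_lat=42.3, ref_long=-71.1):
    """Flat Euclidean distance in degrees from the reference point (Cambridge above)."""
    return math.sqrt((lat - ref_lat) ** 2 + (long - ref_long) ** 2)

# The three descriptions given above; None stands in for the blank node _:cn.
cities = [
    (42.3, -71.1, "Cambridge"),
    (41.9, -87.6, "Chicago"),
    (42.1, -71.3, None),
]

# Same constraint as the query: distance math:lessThan 0.5.
matches = [c for c in cities if degree_distance(c[0], c[1]) < 0.5]
```

Chicago is roughly 16.5 degrees away, so only the first and third descriptions survive, which agrees with the graph euler returned.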


--
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/


"Eric Prud'hommeaux" <eric@w3.org>
Sent by: public-rdf-dawg-request@w3.org
23/03/2004 04:21

To:       Dan Connolly <connolly@w3.org>
cc:       RDF Data Access Working Group <public-rdf-dawg@w3.org>
Subject:  Re: Use case: tiger map/census data: have it your way

On Wed, Mar 17, 2004 at 12:06:18PM -0600, Dan Connolly wrote:
>
> The U.S. Census Bureau provides some really nifty data
>   http://www.census.gov/geo/www/tiger/tiger2003/tgr2003.html
> it's public domain.
>
> I want to do a query like
>            tell me the lat, lon, name, and type
>            of everything within 50 miles of Cambridge, MA
>
> Right now, I have to download all the files, unzip them,
> read a bunch of docs, write some software, blah blah blah.
>
> I'd like to just look at it as a big RDF graph and issue
> a query.
>
> Hmm... it's not clear they (the census folks) have motivation
> to offer a query service. But clearly a third party could.

I think this is a hard problem.

I know we are supposed to be writing fairy tales and not drilling down
into the nitty gritty, but I can't get my head around the size of this
problem without envisioning the mechanics.

write some software, blah blah blah approach:
  Given lat/long of every city center in Massachusetts (a finite number
  of locations) expressed in, or translatable to, RDF, query for the
  lat/long of each city and use sqrt(a^2+b^2) to calculate the distance
  to each. Take the ones where that is < 50 miles.
QL requirements: simple conjunction --
    ?city gis:latitude ?lat
    ?city gis:longitude ?long
  collect (?city ?lat ?long)
and do the rest with custom software.
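
A minimal sketch of what that custom software might look like in Python. It assumes the query results arrive as (city, lat, long) tuples and borrows the rough conversion of 0.01 degrees per mile used in the next approach; the function name and the sample rows are illustrative only.

```python
import math

MILES_PER_DEGREE = 100.0  # assumption: 0.01 degrees per mile, a rough flat-earth figure

def within_miles(rows, center, radius_miles):
    """Keep (city, lat, long) rows whose flat-plane distance from center is below radius_miles."""
    center_lat, center_long = center
    kept = []
    for city, lat, long in rows:
        a = (lat - center_lat) * MILES_PER_DEGREE
        b = (long - center_long) * MILES_PER_DEGREE
        if math.sqrt(a * a + b * b) < radius_miles:
            kept.append((city, lat, long))
    return kept

# Rows as the simple conjunctive query might return them (sample values only).
rows = [("Cambridge", 42.3, -71.1), ("Chicago", 41.9, -87.6)]
nearby = within_miles(rows, center=(42.3, -71.1), radius_miles=50)
```

The query engine does no arithmetic here; every row crosses the wire and the program walks all of them.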

value-constrained query approach:
  same as above, only limit the scope to those cities within a 50 mile
  *square* of Cambridge. (Assuming 42.3, -71.1 for Cambridge, MA and
  one mile corresponds to .01 degrees in both latitude and longitude):
QL requirements: conjunction+numeric comparison
    ?city gis:latitude ?lat
    ?city gis:longitude ?long
    ?lat <= 42.8
    ?lat >= 41.8
    ?long <= -70.6
    ?long >= -71.6
  collect (?city ?lat ?long)
You still have to write a program to do the same math, but you get to
greatly reduce the query result set that the program must walk through.
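
To see why the program still has to do the math, here is a small Python sketch comparing the box test with the circle test. It assumes the same center (42.3, -71.1) and 0.5 degree half-width as above; the helper names and the corner point are hypothetical.

```python
import math

CENTER_LAT, CENTER_LONG = 42.3, -71.1  # Cambridge, MA, as assumed above
HALF_SIDE = 0.5                         # 50 miles at 0.01 degrees per mile

def in_box(lat, long):
    """The four numeric comparisons the query engine would evaluate."""
    return (CENTER_LAT - HALF_SIDE <= lat <= CENTER_LAT + HALF_SIDE
            and CENTER_LONG - HALF_SIDE <= long <= CENTER_LONG + HALF_SIDE)

def in_circle(lat, long):
    """The exact distance test that is left to the client program."""
    return math.hypot(lat - CENTER_LAT, long - CENTER_LONG) < HALF_SIDE

# A hypothetical point near a corner of the box: inside the square, outside the circle.
corner = (42.7, -70.7)
passes_box = in_box(*corner)
passes_circle = in_circle(*corner)
```

Points like that corner come back from the value-constrained query and are then discarded by the client, so the box is a prefilter and not a replacement for the distance check.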

crazy mad arithmetic approach:
Put all the math into the query:
QL requirements: the conjunction+numeric comparison+math library
    ?city gis:latitude ?lat
    ?city gis:longitude ?long
    sqrt((?lat-42.3)^2 + (?long+71.1)^2) < 0.5
  collect (?city ?lat ?long)

I wonder which you would like to put forth as a use case, the fairy
tale where someone still has to write the program et al, or the fairy
tale where the QL has math libraries. I guess both are use cases, and
the use case evaluation is the time to decide which approach the QL
should cater to.

> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> see you at the WWW2004 in NY 17-22 May?

be seeing you
--
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Tuesday, 23 March 2004 17:15:58 UTC