RE: Use case: tiger map/census data: have it your way

When I first experimented with an RDF QL I allowed arbitrary plug-in
functions for this sort of situation (this is pre-datatyping).  I could
imagine specialised functions to answer domain specific questions like
geographical metrics or one protein could be folded into another.

Then it seemed a bad idea because it meant a query could go to a source that
had no idea what to do with it.  A better approach was to have the source
have the information through predicates like "?x :isWithin50Miles ?y" but of
course you need parameters to the properties.

Now I think it's a "necessary evil" to have some domain-specific value
processing - computation of a domain specific data constraint server-side
can reduce the amount of data transfer so much it has to be possible.
Whether that's through functions or computed predicates that express these
relations, or both, I don't know.

I'm neutral as to whether DAWG-QL v0.1 has this feature or not.

	Andy

-------- Original Message --------
> From: Eric Prud'hommeaux <>
> Date: 23 March 2004 03:23
> 
> On Wed, Mar 17, 2004 at 12:06:18PM -0600, Dan Connolly wrote:
> > 
> > The U.S. Census Bureau provides some really nify data
> >   http://www.census.gov/geo/www/tiger/tiger2003/tgr2003.html it's
> > public domain. 
> > 
> > I want to do a query like
> > 	tell me the lat, lon, name, and type
> > 	of everything within 50 miles of Cambridge, MA
> > 
> > Right now, I have to download all the files, unzip them,
> > read a bunch of docs, write some software, blah blah blah.
> > 
> > I'd like to just look at it as a big RDF graph and issue
> > a query.
> > 
> > Hmm... it's not clear they (the census folks) have motivation
> > to offer a query service. But clearly a third party could.
> 
> I think this is a hard problem.
> 
> I know we are supposed be writing fairy tails and not drilling down
> into the nitty gritty, but I can't get my head around the size of this
> problem without envisioning the mechanics.
> 
> write some software, blah blah blah approach:
>   Given lat/long of every city center in Massachusetts (a finite number
>   of locations) expressed in, or translatable to, RDF, query for each
>   lat/long for each city and use sqrt(a^2+b^2) to calculate the distance
>   for each. Take the ones where that is < 50 miles.
> QL requirements: simple conjunction --
>     ?city gis:latitude ?lat
>     ?city gis:longitude ?long
>   collect (?city ?lat ?long)
> and do the rest with custom software.
> 
> value-constrained query approach:
>   same as above, only limit the scope to those cities within a 50 mile
>   *square* of Cambridge. (Assuming 42.3, -71.1 for Cambridge, MA and
>   one mile corresponds to .01 degrees in both latitude and longitude):
> QL requirements: conjunction+numeric comparison
>     ?city gis:latitude ?lat
>     ?city gis:longitude ?long
>     ?lat <= 42.8
>     ?lat >= 41.8
>     ?long <= 70.6
>     ?long >= 71.6
>   collect (?city ?lat ?long)
> You still have to write a program to do the same math, but you get to
> greatly reduce the query result set that the program must walk through.
> 
> crazy mad arithmatic approach:
> Put all the math into the query:
> QL requirements: the conjunction+numeric comparison+math library
>     ?city gis:latitude ?lat
>     ?city gis:longitude ?long
>     sqrt((?lat-42.3)^2 + (?long-71.1)^2) < 0.5
>   collect (?city ?lat ?long)
> 
> I wonder which you would like to put forth as a use case, the fairy
> tale where someone still has to write the program et al, or the fairy
> tale where the QL has math libraries. I guess both are use cases, and
> the use case evaluation is the time to decide which approach the QL
> should cater to.
> 
> > Dan Connolly, W3C http://www.w3.org/People/Connolly/
> > see you at the WWW2004 in NY 17-22 May?
> 
> be seeing you

Received on Tuesday, 23 March 2004 05:51:19 UTC