Re: SPARQL and Web 2

On 10 Oct 2005, at 10:37, Giovanni Tummarello wrote:
> Hi Henry,
> I don't see how SPARQL exactly saves them bandwidth. I haven't been
> much into SPARQL lately, but last I checked it was supporting the
> named graph paradigm, which means you'd ask Amazon for their entire
> graph and then execute the query locally.

Clearly that would be crazy. Imagine you had to do that when you
queried Google: download their whole index and then query it locally.
There is just too much data. At AltaVista 5 years ago, it would take
weeks to back up the whole index locally, let alone move all the data
over the wire. It would take even longer now with the growth of the
web, even taking into account bandwidth improvements ;-)

> If you meant that Amazon would instead execute a SPARQL query for
> you and return just the hits, then
>
> a) if just some SPARQL queries are allowed, isn't this what a web
> service would do today anyway with just RDF results?

The beauty of SPARQL is that it gives us a very clear and well-defined
interface for queries that map beautifully onto an ontology. Web
services currently have to keep reinventing a query interface for each
service they provide. SPARQL reduces that down to needing to describe
the ontology (the types of objects and the relations between them). As
these services grow, even that design work will be simplified as
standard, well-defined ontologies emerge.
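
To make that concrete, here is a minimal sketch of what a client request
might look like. The endpoint URL, the "book:" vocabulary and the author
name are all hypothetical, made up for illustration; the only thing
borrowed from the SPARQL protocol is the idea of sending the query text
in a "query" parameter. The server would run the query and return only
the matching rows, not its whole graph.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and vocabulary, used only for illustration.
ENDPOINT = "http://bookstore.example/sparql"

QUERY = """
PREFIX book: <http://bookstore.example/ontology#>
SELECT ?title ?price
WHERE {
  ?b book:title ?title ;
     book:price ?price ;
     book:author ?a .
  ?a book:name "Jane Doe" .
}
LIMIT 10
"""


def build_request_url(endpoint: str, query: str) -> str:
    """Return a GET URL carrying the SPARQL query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)

# Round trip: this is the query text the server would recover from the URL.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The client only needs to know the ontology to write QUERY; nothing else
about the request is specific to the bookstore.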

>
> b) otherwise, would they really allow you to execute arbitrary
> graph queries? Complexity usually explodes exponentially
> (possibly) with the number of unbound variables, and is very
> sensitive to the data structure as well

As I pointed out in a previous post, you get this with any query
interface. You can make hugely complex text queries to Google. Those
services have found ways around the problem. We can learn from them.

> c) even if they really did execute our arbitrary query, would they
> provide the answer we are looking for? If you want to know all Amazon
> books written by a Stanford professor, there is no way out. There
> must be a computational space where the list of professors and the
> list of books are known at the same time. What to do then, cause
> Amazon to request the list of professors from Stanford and execute
> the query there?

Good point. You just would not start by making a huge ontology
available. A bookstore would just need an ontology describing a book:
that it has an author, a price, a publication date, an ISBN, links to
comments, etc. That comments have authors that are foaf:Persons. That
these may have nicknames. That a book may have ratings. That authors
have books they have written, etc. You would of course not put out
information such as that authors belong to particular institutions, or
that they know so and so. That would be for a different database to
take care of.
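
As a rough sketch of how small such a bookstore ontology could be, the
data above can be written as a handful of triples and queried with a toy
pattern matcher. Everything here is an assumption made for illustration:
the "book:" property names, the identifiers and the values are invented,
and the matcher is a simplified stand-in for a real SPARQL engine. Only
foaf:Person and foaf:nick are real FOAF terms.

```python
# Toy data set. The "book:" names and all identifiers are hypothetical.
TRIPLES = {
    ("book1", "rdf:type", "book:Book"),
    ("book1", "book:title", "An Example Title"),
    ("book1", "book:isbn", "0-00-000000-0"),
    ("book1", "book:price", "25.00"),
    ("book1", "book:author", "author1"),
    ("book1", "book:comment", "comment1"),
    ("comment1", "book:commentAuthor", "reader1"),
    ("reader1", "rdf:type", "foaf:Person"),
    ("reader1", "foaf:nick", "bookworm"),
}


def match(pattern, triples):
    """Yield bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable repeated within the pattern must bind consistently.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def query(patterns, triples):
    """Join several triple patterns, roughly as a basic graph pattern would."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Substitute variables bound by earlier patterns before matching.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(bound, triples):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions


# Nicknames of the people who commented on book1.
nicks = query(
    [
        ("book1", "book:comment", "?c"),
        ("?c", "book:commentAuthor", "?p"),
        ("?p", "foaf:nick", "?nick"),
    ],
    TRIPLES,
)
```

Note that nothing in the data says which institution an author belongs
to, so a question about Stanford professors simply finds no match here;
that information would live in someone else's database.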

We can't solve all the problems at once. But we can solve a few at a
time. By standardizing on a query interface we have taken a big step in
educating people about ontologies, giving people a reason to develop
good ones, and making the life of developers much easier, as they now
need to understand only one interface. At the same time, businesses
that open up their databases this way will start being able to assess
what the most valuable part of their data is (in economics, value is
determined in part by demand), which optimizations are needed to help
grow the service, and what business models best fit their target
audience.

Then, later, as these services grow and ontologies become standardised
through the network effect, it will start to become possible to use the
same query to query multiple databases.
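
A minimal sketch of that last step, assuming two hypothetical bookstores
that have converged on one shared book ontology. The endpoint URLs, the
ontology namespace and the ISBN are placeholders, not real services or
data.

```python
from urllib.parse import urlencode

# Hypothetical endpoints, assumed to share one book ontology.
ENDPOINTS = [
    "http://bookstore-a.example/sparql",
    "http://bookstore-b.example/sparql",
]

# One query text, written once against the shared (hypothetical) ontology.
QUERY = """PREFIX book: <http://ontology.example/book#>
SELECT ?title ?price
WHERE { ?b book:isbn "0-00-000000-0" ; book:title ?title ; book:price ?price . }"""


def fan_out(endpoints, query):
    """Build one GET URL per endpoint, all carrying the identical query text."""
    params = urlencode({"query": query})
    return [f"{endpoint}?{params}" for endpoint in endpoints]


urls = fan_out(ENDPOINTS, QUERY)
```

The only thing that varies between the requests is the host; comparing
prices across stores would then be a matter of merging the result rows.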

But let us not put the proverbial cart before the horse. First let us
try opening up a few databases. Then, with the momentum this generates,
the rest will follow.

Henry Story

> Giovanni
>

Received on Monday, 10 October 2005 11:50:36 UTC