Re: Generating rdfs:seeAlso links for Tabulator with SPARQL CONSTRUCT from Chris Bizer on 2006-10-28 (semantic-web@w3.org from October 2006)

From: Chris Bizer <chris@bizer.de>
Date: Sat, 28 Oct 2006 18:33:35 +0200
To: "Giovanni Tummarello" <g.tummarello@gmail.com>, "Chris Bizer" <bizer@zedat.fu-berlin.de>
Cc: <semantic-web@w3.org>, <bastian@quilitz.de>, "Tim Berners-Lee" <timbl@w3.org>, "Dan Connolly" <connolly@w3.org>, "Richard Cyganiak" <richard@cyganiak.de>, "Seaborne, Andy" <andy.seaborne@hp.com>, "Ivan Herman" <ivan@w3.org>, Tobias Gauß <tobias.gauss@web.de>
Message-ID: <002801c6faae$d1129fb0$83ec2da0@named4gc1asnuj>

Hi Giovanni,

> Responding to your request for comments, I feel that the technique is
> interesting and will certainly fit some use cases.
> However, I don't see it as much less hardcoded than a link to what appears
> to be a static RDF file but could be a dump of the whole db or a generic
> construct query which dumps all of a sparql endpoint (which, I fear,
> will be what people would be tempted to do).

I'm not that pessimistic on this point. I think that tools like Tabulator
will teach people that it is not a good idea to put megabyte-sized RDF files
on their servers, and that they should instead serve RDF in smaller,
easy-to-consume chunks.

> The cool factor is probably in the URI rewriting.
> But then again, maybe that is not a good idea? I wonder if  putting
> instead in your foaf file a triple that says yourpage#cris sameAs
> bigtimedb/id12345 would be more useful.. e.g.  if the tabulator found
> other graphs where you are mentioned as the OTHER URI then such triple
> would do the mapping , the query wouldn't.

You are right, but this should be done in addition to setting the link, not
as an alternative.
If we want the Semantic Web to succeed, we need easily navigable links
NOW, instead of waiting for some hyper-intelligent discovery and mapping
algorithms that might be invented sometime in the future.

>
> What we would really like to see on the SW is to visit somebody's
> homepage and somehow see the publications on all the major sites,
> regardless of which ones the author decided (or remembered) to point to.
> But this is impossible in these "direct HTTP link" scenarios, which
> however are inspiringly simple and demonstrable.

Yes, this would be nice, but I think we can also get pretty far with the
direct-link scenarios, as the most-used data on the Web comes from a
relatively small number of sources.

For instance, we plan to write RDF wrappers for the Google, eBay and Amazon
APIs (see http://www.programmableweb.com/). We will assign a URI like
www4.wiwiss.fu-berlin.de/eBayWrapper/auction3465475 to each auction on eBay,
each review on Amazon and so on. When a client dereferences this URI, our
server will rewrite the request into a request against the eBay API (or
whichever API applies), transform the answer to RDF and return it to the
client. So far so good, but what would make this interesting would be RDF
links between the different wrappers. For instance, the Amazon wrapper could
query the eBay API with an ISBN and get the IDs of all auctions for the
book. It could then use the IDs to set links to the wrapper that represents
eBay, for instance by including triples like
www4.wiwiss.fu-berlin.de/AmazonWrapper/bookISBN3465475 ex:soldAt
www4.wiwiss.fu-berlin.de/eBayWrapper/auction3465475 in the description of
the book. This would allow tools like the Semantic Web Client Library
(http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/index.html)
to answer the SPARQL query "Give me general information, reviews and auctions
about the book with the ISBN XXXX" with data from Amazon and eBay.
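
The linking step between the two wrappers can be sketched in a few lines of
Python. This is only an illustration under assumptions: the namespace behind
the ex: prefix, the http:// scheme on the wrapper URIs, and the function names
are hypothetical, and the lookup against the eBay API is left out (the auction
IDs are simply passed in).

```python
# Hypothetical sketch: emit ex:soldAt links from a book in the Amazon wrapper
# to auctions in the eBay wrapper, one N-Triples line per auction.

WRAPPER_ROOT = "http://www4.wiwiss.fu-berlin.de"
# Assumed namespace for the ex: prefix; the message does not define it.
EX_SOLD_AT = "http://example.org/ns#soldAt"


def book_uri(isbn: str) -> str:
    """URI the Amazon wrapper would assign to a book."""
    return f"{WRAPPER_ROOT}/AmazonWrapper/bookISBN{isbn}"


def auction_uri(auction_id: str) -> str:
    """URI the eBay wrapper would assign to an auction."""
    return f"{WRAPPER_ROOT}/eBayWrapper/auction{auction_id}"


def sold_at_triples(isbn: str, auction_ids: list[str]) -> list[str]:
    """Return one N-Triples line per auction that offers the book."""
    subject = book_uri(isbn)
    return [
        f"<{subject}> <{EX_SOLD_AT}> <{auction_uri(a)}> ."
        for a in auction_ids
    ]


if __name__ == "__main__":
    # In a real wrapper the IDs would come from an eBay API lookup by ISBN.
    for line in sold_at_triples("3465475", ["3465475"]):
        print(line)
```

A client that dereferences the book URI and finds these triples can follow
the objects to the eBay wrapper without knowing about it in advance.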



No magic, and something that can be realized with a couple of PHP scripts,
but maybe interesting enough to get the Web 2.0 mashup hackers interested in
SPARQL and Semantic Web technologies, which I think would be very good
because they want solutions that work in the short term and not sometime in
the future.


> Let's throw into the picture some sort of lightweight inverse URI/URL
> index to find the unexpected stuff?
>
> Something to support the vision of the Semantic Web as a collection of
> URIs, each of which can be freely annotated, with a good chance that
> these annotations are automatically discovered, rather than
> a collection of endpoints each of which must be known in advance and
> hardwired to the point where you think it might be useful.
> This would give such a "semantic web" the ability to fulfill expectations
> similar to what the user has of the web today, that is, in the powerful
> search engine era (rather than a "web of the old days" made of direct
> links with bookmarks being really important).

Yes, it is right to mention search engines here. They maintain an index of
all documents that contain a certain word, and I think the same will happen
with URIs for the Semantic Web. But the crawlers of the search engines
require links between HTML documents to build the index. And the crawler
that would build your URI/URL index will also need hard-wired links to find
the interesting stuff on the Semantic Web.

> In RDFGrowth as implemented in DBin (new version coming out in days to
> support the ISWC :-) ) the use of a DHT such inverse scenario, as well
> as collecting knowledge locally which is something we like a lot. But
> then again is a completely different system than HTTP and the usual web
> and instead supports a new scenario (Semantic Web Communities or
> Newsgroups , that is people that want to collaboratively create and
> browse a common RDF graph, yet replicated at each peer) rather than a
> "lightweight" extension of the current web.

I would not draw such a distinct line between your stuff and the Semantic
Web. If you started to view the nodes in your system that store information
and pass it on to other nodes as RDF caches, similar to classic HTML web
page caches, you would get pretty close to the general Semantic Web / Web of
Data scenario, with the difference that for some reason you don't want to
use HTTP but a P2P protocol to send data around.

> Looking forward to discuss this live at ISWC. A BOF maybe?

Yes, that will be fun, and let's not forget the beer.

Chris

> Giovanni
>
>
>>
>> Hi all,
>>
>> in a recent talk about Tabulator and linked data on the Web
>> (http://www.w3.org/2006/Talks/1019-tab-tbl/), Tim says that "The
>> biggest challenge is links to other systems."
>> http://www.w3.org/2006/Talks/1019-tab-tbl/#(12), meaning that the Web
>> of Data can only be browsed or queried
>> (http://sites.wiwiss.fu-berlin.de/suhl/bizer/ng4j/semwebclient/), if
>> there are enough links connecting the RDF documents on different servers.
>>
>> So how do we get these links?
>>
>> One approach is to set them manually, by including rdfs:seeAlso triples
>> and by using dereferenceable URIs from different servers in RDF
>> documents. For instance, my FOAF profile
>> (http://sites.wiwiss.fu-berlin.de/suhl/bizer/foaf.rdf) contains lots
>> of seeAlso links to other profiles and the triple:
>> <http://www.bizer.de#chris> foaf:based_near
>> <http://ws.geonames.org/rdf?geonameId=2950159>
>> which is a link to data about Berlin, that people can follow by
>> dereferencing the object of the triple.
>>
>> An alternative to setting links manually is to generate them with a
>> SPARQL CONSTRUCT query.
>>
>> We are currently setting up a D2R Server
>> (http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/) which will
>> publish the DBLP bibliographic database (800000 articles, 200000
>> authors) as linked data on the Web. The server will go live some time
>> next week and will allow you to query the DBLP database with SPARQL
>> and to dereference all generated URIs.
>>
>> So let's assume, I want to set links from my FOAF profile to my papers
>> in the DBLP database
>> (http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/b/Bizer:Christian.html).
>> I could do the following:
>>
>> 1. Ask the server for its URI identifying me.
>> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>> SELECT ?me
>> WHERE {?me foaf:name "Chris Bizer"}
>> This will return a URI like
>> http://[DBLPServerRoot]/persons/person15437. Our server will for
>> example return http://www4.wiwiss.fu-berlin.de/DBLP/persons/person15437
>>
>> 2. Then I could add an rdfs:seeAlso link to my FOAF profile that
>> contains the following SPARQL CONSTRUCT query as target:
>>
>> CONSTRUCT   { ?paper dc:author <http://www.bizer.de#chris> }
>> WHERE       { ?paper dc:author
>> <http://www4.wiwiss.fu-berlin.de/DBLP/persons/person15437> }
>>
>> encoded into a seeAlso link the query would look like
>>
>> <http://www.bizer.de#chris> rdfs:seeAlso
>> <http://www4.wiwiss.fu-berlin.de/DBLP/sparql?query=CONSTRUCT+%3Chttp%3A%2A%....>
>>
>>
>> When an RDF browser like Tabulator dereferences the object of the
>> triple, it gets an RDF document like:
>>
>> <http://www4.wiwiss.fu-berlin.de/DBLP/papers/paper234137> dc:author
>> <http://www.bizer.de#chris> .
>> <http://www4.wiwiss.fu-berlin.de/DBLP/papers/paper436178> dc:author
>> <http://www.bizer.de#chris> .
>> <http://www4.wiwiss.fu-berlin.de/DBLP/papers/paper554632> dc:author
>> <http://www.bizer.de#chris> .
>> <http://www4.wiwiss.fu-berlin.de/DBLP/papers/paper444188> dc:author
>> <http://www.bizer.de#chris> .
>>
>> and the user can browse to my papers in the database by dereferencing
>> the subjects of the triples.
>>
>> Thus, combining rdfs:seeAlso with SPARQL CONSTRUCT allows you to set
>> dynamic links in the Web of Data, that reflect changes in the target
>> data source (If a new paper I have authored is added to the DBLP
>> database, there will also be a new link in my FOAF profile).
>>
>> Any comments on this idea?
>>
>> We will demo this kind of link at our poster presentation about D2R
>> Server at ISWC in two weeks. So if you think this is interesting,
>> just drop by.
>>
>> Cheers
>>
>> Chris
>>
>>
>
>
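
The encoding of the CONSTRUCT query into a seeAlso target, as described in
the quoted message, can be sketched in Python. This is an illustration
under assumptions: the `query` parameter name follows the SPARQL Protocol,
the namespace bound to dc: is a guess (the message uses the prefix without
declaring it), and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://www4.wiwiss.fu-berlin.de/DBLP/sparql"

# The dc: binding below is an assumption, not taken from the message.
QUERY_TEMPLATE = (
    "PREFIX dc: <http://purl.org/dc/elements/1.1/> "
    "CONSTRUCT {{ ?paper dc:author <{local}> }} "
    "WHERE {{ ?paper dc:author <{remote}> }}"
)


def see_also_target(local_uri: str, remote_uri: str) -> str:
    """Build the URL that goes into the object of the rdfs:seeAlso triple.

    local_uri  -- the URI I use for myself in my FOAF profile
    remote_uri -- the URI the DBLP server uses for me
    """
    query = QUERY_TEMPLATE.format(local=local_uri, remote=remote_uri)
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return f"{ENDPOINT}?{urlencode({'query': query})}"


if __name__ == "__main__":
    url = see_also_target(
        "http://www.bizer.de#chris",
        "http://www4.wiwiss.fu-berlin.de/DBLP/persons/person15437",
    )
    print(url)
    # Decoding the query string again recovers the original CONSTRUCT query.
    print(parse_qs(urlsplit(url).query)["query"][0])
```

Because the '#' in the local URI is percent-encoded, the whole query stays
in the query string and is not cut off as a fragment when the link is
dereferenced.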
Received on Saturday, 28 October 2006 16:33:57 UTC