Re: The Power of Virtuoso Sponger Technology

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 17 Oct 2009 14:08:32 -0400
Message-ID: <4ADA0820.2070008@openlinksw.com>
To: Georgi Kobilarov <georgi.kobilarov@gmx.de>
CC: Juan Sequeda <juanfederico@gmail.com>, hepp@ebusiness-unibw.org, public-lod@w3.org
Georgi Kobilarov wrote:
>> The Web of Linked
>> Data shouldn't be about mass crawling (search engine style) etc...  
>>     
>
> It has to be. How would you answer a query like "all offers for a book
> written by a German author" without crawling the relevant data sets?
>   
To qualify my response:
It shouldn't be about mass crawling (search engine style) that results 
in Google or Yahoo! style indexes.

It should be about smart walking and indexing that uses HTTP to devise 
smart cache invalidation schemes and Linked Data oriented URIs, for 
smart pathways.
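
As a rough illustration of the kind of HTTP-based cache invalidation meant
here, the sketch below revalidates a remembered graph with a conditional
GET instead of re-crawling it. This is only a minimal sketch under stated
assumptions, not how the Sponger is implemented: CachedGraph,
conditional_headers and refresh are hypothetical names, and the actual
network call is left out so that only the invalidation logic is shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]


@dataclass
class CachedGraph:
    """Hypothetical cache entry for the data behind one Linked Data URI."""
    uri: str
    etag: Optional[str]            # ETag validator from the last response
    last_modified: Optional[str]   # Last-Modified validator, if any
    triples: List[Triple]


def conditional_headers(entry: CachedGraph) -> Dict[str, str]:
    """Build the request headers for an HTTP conditional GET."""
    headers: Dict[str, str] = {}
    if entry.etag:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified:
        headers["If-Modified-Since"] = entry.last_modified
    return headers


def refresh(entry: CachedGraph, status: int,
            response_headers: Dict[str, str],
            fresh_triples: List[Triple]) -> CachedGraph:
    """Decide whether the remembered pathway is obsolete.

    304 Not Modified: the cached graph is still valid, keep it.
    200 OK: the source changed, replace the graph and its validators.
    """
    if status == 304:
        return entry
    if status == 200:
        return CachedGraph(
            uri=entry.uri,
            etag=response_headers.get("ETag"),
            last_modified=response_headers.get("Last-Modified"),
            triples=fresh_triples,
        )
    raise ValueError(f"unexpected status {status} for {entry.uri}")
```

The point of the design is that the server, not a crawler schedule, says
when a cached pathway has gone stale.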

The question "does Sindice index Sponger URIs?" is not the answer, just as 
the Sponger indexing Sindice isn't the answer. Both services can use 
their data pathways to make newer and better pathways depending on the 
query at hand. Basically, "No Mass Dumb Crawling & Indexing" is what I 
am trying to relay via my comments :-)

If we stick with the traditional search approach, how do we deal with 
the "change sensitivity" factor re: "all offers for a book written by a 
German author"?
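
For comparison, the on-demand alternative described in Martin's quoted
message below (naming the existing HTML page in the FROM clause of a
SPARQL query) could be sketched roughly as follows. The endpoint URL and
the query shape come from that message; build_query and build_request_url
are hypothetical helper names, and only the standard SPARQL Protocol
"query" parameter is used. Whether the endpoint actually fetches the page
on demand depends on its server-side sponging option, which is not
modeled here.

```python
from urllib.parse import urlencode

# Endpoint named in the quoted message.
ENDPOINT = "http://uriburner.com/sparql"


def build_query(page_uri: str, limit: int = 50) -> str:
    """Return a SPARQL query that names the Web page as its source graph."""
    # Reject characters that would break out of the <...> IRI reference.
    if any(ch in page_uri for ch in '<>" \n\t'):
        raise ValueError("page_uri contains characters not allowed in an IRI")
    return f"SELECT * FROM <{page_uri}> WHERE {{ ?s ?p ?o }} LIMIT {limit}"


def build_request_url(page_uri: str, limit: int = 50) -> str:
    """URL-encode the query for a SPARQL Protocol GET request."""
    return ENDPOINT + "?" + urlencode({"query": build_query(page_uri, limit)})
```

Because the data is pulled from the source page at request time, the
answer reflects the page as it is now rather than as it was when a
crawler last visited.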

Kingsley
> Georgi
>
>   
>> -----Original Message-----
>> From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
>> Behalf Of Kingsley Idehen
>> Sent: Saturday, October 17, 2009 4:58 PM
>> To: Juan Sequeda
>> Cc: hepp@ebusiness-unibw.org; public-lod@w3.org
>> Subject: Re: The Power of Virtuoso Sponger Technology
>>
>> Juan Sequeda wrote:
>>     
>>> Does Sindice crawl this (or any other semantic web search engines)?
>>>       
>> Juan,
>>
>> Sponger is not about Sindice crawling our proxy URIs. The Web of Linked
>> Data shouldn't be about mass crawling (search engine style) etc... It's
>> really supposed to be about smarter data network traversals triggered
>> by data access requests. Basically, make the pathway "on the fly",
>> remember it for future reference, and know when it's obsolete.
>>
>> If you look at it the other way round, our Sponger has Meta Cartridges
>> that will look up Sindice (via their APIs) for specific data about
>> various entities. It won't seek a complete dump of Sindice etc. The
>> same applies to a plethora of Web 2.0 style services.
>>
>>
>> We can do smart database queries on the Web by simply meshing
>> fundamental database principles with the inherent sophistication of
>> HTTP :-)
>>
>> Kingsley
>>
>>
>>     
>>> Juan Sequeda, Ph.D Student
>>> Dept. of Computer Sciences
>>> The University of Texas at Austin
>>> www.juansequeda.com <http://www.juansequeda.com>
>>> www.semanticwebaustin.org <http://www.semanticwebaustin.org>
>>>
>>>
>>> On Sat, Oct 17, 2009 at 4:24 AM, Martin Hepp (UniBW)
>>> <hepp@ebusiness-unibw.org <mailto:hepp@ebusiness-unibw.org>> wrote:
>>>
>>>     Dear all:
>>>
>>>     I just found out that the Virtuoso Sponger technology is even more
>>>     powerful than I thought.
>>>
>>>     Briefly: "Spongers" create rich GoodRelations (and other RDF)
>>>     meta-data
>>>     for existing Web pages on-the-fly. Other than traditional
>>>     screen-scraping approaches, Spongers reuse public APIs and other
>>>     techniques, so the data is of unprecedented degree of structure.
>>>
>>>     Now, this can be directly used in arbitrary queries... by simply using
>>>     the URI of the *existing* HTML Web page in the FROM clause of a SPARQL
>>>     query.
>>>
>>>     Example:
>>>
>>>
>>>     http://www.amazon.com/Semantic-Web-Real-World-Applications-Industry/dp/0387485309
>>>     is a Web page in plain HTML offering a book. Amazon does not yet
>>>     produce GoodRelations meta-data on their pages.
>>>
>>>     If you go to
>>>
>>>        http://uriburner.com/sparql
>>>
>>>     and paste the URI in the "Default Graph URI" field and select
>>>     "Retrieve remote RDF for all missing source graphs", then a query like
>>>
>>>       "SELECT * WHERE {?s ?p ?o} LIMIT 50"
>>>
>>>     returns a fully-fledged GoodRelations description for that page - as if
>>>     Amazon was already supporting GoodRelations for each of its > 4 million
>>>     items!
>>>
>>>     There are spongers for BestBuy, eBay, Zillow, and many other types of
>>>     resources.
>>>
>>>     Wow!
>>>
>>>     Congrats to Kingsley and his team!
>>>
>>>     Best wishes
>>>
>>>     Martin Hepp
>>>
>>>     --
>>>     --------------------------------------------------------------
>>>     martin hepp
>>>     e-business & web science research group
>>>     universitaet der bundeswehr muenchen
>>>
>>>     e-mail:  hepp@ebusiness-unibw.org <mailto:hepp@ebusiness-unibw.org>
>>>     phone:   +49-(0)89-6004-4217
>>>     fax:     +49-(0)89-6004-4620
>>>     www:     http://www.unibw.de/ebusiness/ (group)
>>>             http://www.heppnetz.de/ (personal)
>>>     skype:   mfhepp
>>>     twitter: mfhepp
>>>
>>>     Check out GoodRelations for E-Commerce on the Web of Linked Data!
>>>     =================================================================
>>>
>>>     Webcast:
>>>     http://www.heppnetz.de/projects/goodrelations/webcast/
>>>
>>>     Recipe for Yahoo SearchMonkey:
>>>     http://www.ebusiness-unibw.org/wiki/GoodRelations_and_Yahoo_SearchMonkey
>>>     Talk at the Semantic Technology Conference 2009:
>>>     "Semantic Web-based E-Commerce: The GoodRelations Ontology"
>>>     http://www.slideshare.net/mhepp/semantic-webbased-ecommerce-the-goodrelations-ontology-1535287
>>>     Overview article on Semantic Universe:
>>>     http://www.semanticuniverse.com/articles-semantic-web-based-e-commerce-webmasters-get-ready.html
>>>     Project page:
>>>     http://purl.org/goodrelations/
>>>
>>>     Resources for developers:
>>>     http://www.ebusiness-unibw.org/wiki/GoodRelations
>>>
>>>     Tutorial materials:
>>>     CEC'09 2009 Tutorial: The Web of Data for E-Commerce: A Hands-on
>>>     Introduction to the GoodRelations Ontology, RDFa, and Yahoo!
>>>     SearchMonkey
>>>     http://www.ebusiness-unibw.org/wiki/Web_of_Data_for_E-Commerce_Tutorial_IEEE_CEC%2709
>>>
>>>
>>>       
>> --
>>
>>
>> Regards,
>>
>> Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
>> President & CEO
>> OpenLink Software     Web: http://www.openlinksw.com
>>
>>
>>
>>     
>
>
>
>   


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Saturday, 17 October 2009 18:09:26 UTC
