
Re: The Power of Virtuoso Sponger Technology

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Sat, 17 Oct 2009 14:11:38 -0400
Message-ID: <4ADA08DA.3020906@openlinksw.com>
To: Juan Sequeda <juanfederico@gmail.com>
CC: Georgi Kobilarov <georgi.kobilarov@gmx.de>, hepp@ebusiness-unibw.org, public-lod@w3.org
Juan Sequeda wrote:
> I agree with Georgi. I would like to know what others think about this.
What do you actually mean by Sindice indexing Sponger proxy URIs? Are 
you talking about it indexing them in the same manner as it does, say, 
PingTheSemanticWeb? If so, then you are still thinking in terms of 
Google / Yahoo! style behavior.

The better way is to work like a DBMS: have a base of data and 
progressively build it up while remaining sensitive to change. HTTP, 
Linked Data Objects, SPARQL, and OWL collectively make it possible for 
the Web of Linked Data to work like a very smart Federated DBMS.
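
As a rough illustration of "build it up progressively while remaining sensitive to change", here is a minimal Python sketch. Everything in it is an assumption made for illustration (the class name OnDemandStore, the fetch callable, the fixed max_age rule); it is not how the Sponger is implemented. Data is fetched only when requested, remembered, and re-fetched once the remembered copy is considered obsolete.

```python
import time
from typing import Callable, Dict, Tuple


class OnDemandStore:
    """Hypothetical store that fills itself one data access request at a time."""

    def __init__(self, fetch: Callable[[str], str], max_age: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch        # assumed: retrieves data for a URI
        self._max_age = max_age    # assumed: seconds before an entry is obsolete
        self._clock = clock        # injectable so the behavior can be tested
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, uri: str) -> str:
        now = self._clock()
        cached = self._entries.get(uri)
        # Reuse the remembered result while it is still fresh.
        if cached is not None and now - cached[0] < self._max_age:
            return cached[1]
        # Otherwise fetch on the fly and remember it for future reference.
        data = self._fetch(uri)
        self._entries[uri] = (now, data)
        return data
```

A fixed age limit is the simplest possible staleness rule; over HTTP, cache validators such as ETag or Last-Modified headers would be a more natural way to detect change.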

Kingsley
>
> On Sat, Oct 17, 2009 at 11:39 AM, Georgi Kobilarov 
> <georgi.kobilarov@gmx.de> wrote:
>
>     > The Web of Linked
>     > Data shouldn't be about mass crawling (search engine style) etc...
>
>     It has to be. How would you answer a query like "all offers for a book
>     written by a German author" without crawling the relevant data sets?
>
>     Georgi
>
>     > -----Original Message-----
>     > From: public-lod-request@w3.org [mailto:public-lod-request@w3.org] On
>     > Behalf Of Kingsley Idehen
>     > Sent: Saturday, October 17, 2009 4:58 PM
>     > To: Juan Sequeda
>     > Cc: hepp@ebusiness-unibw.org; public-lod@w3.org
>     > Subject: Re: The Power of Virtuoso Sponger Technology
>     >
>     > Juan Sequeda wrote:
>     > > Does Sindice crawl this (or any other semantic web search
>     > > engines)?
>     > Juan,
>     >
>     > Sponger is not about Sindice crawling our proxy URIs. The Web of
>     > Linked Data shouldn't be about mass crawling (search engine style)
>     > etc... It's really supposed to be about smarter data network
>     > traversals triggered by data access requests. Basically, make the
>     > pathway "on the fly", remember it for future reference, and know
>     > when it's obsolete.
>     >
>     > If you look at it the other way round, our Sponger has Meta
>     > Cartridges that will look up Sindice (via their APIs) for specific
>     > data about various entities. It won't seek a complete dump of
>     > Sindice etc. The same applies to a plethora of Web 2.0 style
>     > services.
>     >
>     >
>     > We can do smart database queries on the Web by simply meshing
>     > fundamental database principles with the inherent sophistication of
>     > HTTP :-)
>     >
>     > Kingsley
>     >
>     >
>     > >
>     > > Juan Sequeda, Ph.D Student
>     > > Dept. of Computer Sciences
>     > > The University of Texas at Austin
>     > > www.juansequeda.com
>     > > www.semanticwebaustin.org
>     > >
>     > >
>     > > On Sat, Oct 17, 2009 at 4:24 AM, Martin Hepp (UniBW)
>     > > <hepp@ebusiness-unibw.org> wrote:
>     > >
>     > >     Dear all:
>     > >
>     > >     I just found out that the Virtuoso Sponger technology is even
>     > >     more powerful than I thought.
>     > >
>     > >     Briefly: "Spongers" create rich GoodRelations (and other RDF)
>     > >     meta-data for existing Web pages on-the-fly. Unlike traditional
>     > >     screen-scraping approaches, Spongers reuse public APIs and
>     > >     other techniques, so the data has an unprecedented degree of
>     > >     structure.
>     > >
>     > >     Now, this can be directly used in arbitrary queries... by
>     > >     simply using the URI of the *existing* HTML Web page in the
>     > >     FROM clause of a SPARQL query.
>     > >
>     > >     Example:
>     > >
>     > >
>     > >     http://www.amazon.com/Semantic-Web-Real-World-Applications-Industry/dp/0387485309
>     > >
>     > >     is a Web page in plain HTML offering a book. Amazon does not yet
>     > >     produce GoodRelations meta-data on their pages.
>     > >
>     > >     If you go to
>     > >
>     > >        http://uriburner.com/sparql
>     > >
>     > >     and paste the URI in the "Default Graph URI" field and select
>     > >     "Retrieve remote RDF for all missing source graphs", then a
>     > >     query like
>     > >
>     > >       "SELECT * WHERE {?s ?p ?o} LIMIT 50"
>     > >
>     > >     returns a fully-fledged GoodRelations description for that
>     > >     page, as if Amazon were already supporting GoodRelations for
>     > >     each of its more than 4 million items!
>     > >
>     > >     There are spongers for BestBuy, eBay, Zillow, and many other
>     > >     types of resources.
>     > >
>     > >     Wow!
>     > >
>     > >     Congrats to Kingsley and his team!
>     > >
>     > >     Best wishes
>     > >
>     > >     Martin Hepp
>     > >
>     > >     --
>     > >     --------------------------------------------------------------
>     > >     martin hepp
>     > >     e-business & web science research group
>     > >     universitaet der bundeswehr muenchen
>     > >
>     > >     e-mail:  hepp@ebusiness-unibw.org
>     > >     phone:   +49-(0)89-6004-4217
>     > >     fax:     +49-(0)89-6004-4620
>     > >     www:     http://www.unibw.de/ebusiness/ (group)
>     > >             http://www.heppnetz.de/ (personal)
>     > >     skype:   mfhepp
>     > >     twitter: mfhepp
>     > >
>     > >     Check out GoodRelations for E-Commerce on the Web of Linked Data!
>     > >     =================================================================
>     > >
>     > >     Webcast:
>     > >     http://www.heppnetz.de/projects/goodrelations/webcast/
>     > >
>     > >     Recipe for Yahoo SearchMonkey:
>     > >     http://www.ebusiness-unibw.org/wiki/GoodRelations_and_Yahoo_SearchMonkey
>     > >
>     > >     Talk at the Semantic Technology Conference 2009:
>     > >     "Semantic Web-based E-Commerce: The GoodRelations Ontology"
>     > >     http://www.slideshare.net/mhepp/semantic-webbased-ecommerce-the-goodrelations-ontology-1535287
>     > >
>     > >     Overview article on Semantic Universe:
>     > >     http://www.semanticuniverse.com/articles-semantic-web-based-e-commerce-webmasters-get-ready.html
>     > >
>     > >     Project page:
>     > >     http://purl.org/goodrelations/
>     > >
>     > >     Resources for developers:
>     > >     http://www.ebusiness-unibw.org/wiki/GoodRelations
>     > >
>     > >     Tutorial materials:
>     > >     CEC'09 2009 Tutorial: The Web of Data for E-Commerce: A Hands-on
>     > >     Introduction to the GoodRelations Ontology, RDFa, and Yahoo!
>     > >     SearchMonkey
>     > >     http://www.ebusiness-unibw.org/wiki/Web_of_Data_for_E-Commerce_Tutorial_IEEE_CEC%2709
>     > >
>     > >
>     > >
>     > >
>     >
>     >
>     > --
>     >
>     >
>     > Regards,
>     >
>     > Kingsley Idehen             Weblog: http://www.openlinksw.com/blog/~kidehen
>     > President & CEO
>     > OpenLink Software     Web: http://www.openlinksw.com
>     >
>     >
>     >
>
>
>
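
For readers who want to try the lookup Martin describes in the quoted message without the browser form, the following Python sketch builds the equivalent request URL. It assumes the endpoint accepts the standard SPARQL Protocol parameters `query` and `default-graph-uri`. The "Retrieve remote RDF for all missing source graphs" option on the form is not reproduced here, so the endpoint may return nothing for a page it has not already sponged. The function name build_request_url is made up for this example.

```python
from urllib.parse import urlencode

# Values taken from the quoted message.
ENDPOINT = "http://uriburner.com/sparql"
PAGE = ("http://www.amazon.com/"
        "Semantic-Web-Real-World-Applications-Industry/dp/0387485309")
QUERY = "SELECT * WHERE {?s ?p ?o} LIMIT 50"


def build_request_url(endpoint: str, page_uri: str, query: str) -> str:
    """Return a GET URL that runs `query` with `page_uri` as the default graph."""
    # urlencode percent-escapes the page URI and the query text.
    params = urlencode({"default-graph-uri": page_uri, "query": query})
    return endpoint + "?" + params


url = build_request_url(ENDPOINT, PAGE, QUERY)
print(url)
```

Naming the page in `default-graph-uri` plays the same role as putting its URI in the FROM clause of the query: both select the graph the triple pattern is matched against.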


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com
Received on Saturday, 17 October 2009 18:12:22 UTC
