RE: Valid representations, canonical representations, and what the SW needs from the Web... from Patrick.Stickler@nokia.com on 2003-02-04 (www-tag@w3.org from February 2003)

From: <Patrick.Stickler@nokia.com>
Date: Tue, 4 Feb 2003 13:08:25 +0200
To: <jbone@deepfile.com>, <paul@prescod.net>
Cc: <sandro@w3.org>, <www-tag@w3.org>
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B01B90ABD@trebe006.europe.nokia.com>

> It *could be* that services e.g. Google could perform a useful
> function in mining and making (minimal) semantic information about
> the existing (non-semantic) Web available in a machine-usable
> fashion.  I.e., to some extent your comment about the existing Web
> being the subject of the semantic Web belies your comments about
> the value of the existing Web.

Per my recent post recommending extensions to HTTP that provide for
both Web and SW interpretations of resource URIs, Google and the like
could do SW mining using MGET rather than GET, creating an enormous
mother-of-all knowledge base of RDF statements. SW agents could query
that knowledge base more efficiently than they could harvest all the
individual bits of knowledge directly from the many servers on the
planet, and it would thus provide for (frighteningly) information-rich
inference about resources.
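
As a rough sketch of that harvesting idea, the Python below sends the
proposed MGET method to each resource URI and merges the returned
statements into one set. Everything here is an assumption made for
illustration: MGET is not a standard HTTP method, the function names
(mget, parse_statements, harvest) are hypothetical, and the
one-triple-per-line text format stands in for real RDF, which would
need a proper parser.

```python
import http.client
from urllib.parse import urlsplit


def mget(uri, timeout=10.0):
    """Send the proposed (non-standard) MGET method; return the body as text.

    Assumption: the server answers MGET with a description of the resource
    rather than a representation of it. Plain HTTP only, for brevity.
    """
    parts = urlsplit(uri)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80,
                                      timeout=timeout)
    try:
        conn.request("MGET", parts.path or "/")
        response = conn.getresponse()
        body = response.read().decode("utf-8")
        # Treat anything other than 200 as "no description available".
        return body if response.status == 200 else ""
    finally:
        conn.close()


def parse_statements(text):
    """Parse a simplified 'subject predicate object' line format (not real RDF)."""
    statements = set()
    for line in text.splitlines():
        fields = line.strip().split(None, 2)
        if len(fields) == 3:  # skip blank or malformed lines
            statements.add(tuple(fields))
    return statements


def harvest(uris, fetch=mget):
    """Merge the descriptions of many resources into one set of statements."""
    knowledge_base = set()
    for uri in uris:
        knowledge_base |= parse_statements(fetch(uri))
    return knowledge_base
```

The fetch parameter is injectable so the aggregation step can be tried
without any server that understands MGET, e.g. by passing the get
method of a dict that maps URIs to description text.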

> > Plus, note that URIs are not expensive. They are cheap. Making new
> > ones is easy. We need to make new ones to have a home for the RDF
> > data anyhow. So what.
> 
> I wonder about this.  Google knows about ~ 3B "pages."  I wonder what
> the amortized mean economic cost- and / or value-per-page for all of
> those is, how much money (time, etc.) was sunk in creating them and
> the software and hardware infrastructures that manage / host them...
> At least one of those quantities (cost) is probably higher than we
> might at first imagine.

Right. The creation of knowledge about resources is very expensive, and
if the infrastructure for managing that knowledge is duplicative of what
already exists for the Web, it won't happen.

Using the same server to store, manage, and serve both representations
and knowledge for a given resource, via the same URI that denotes that
resource, is highly efficient as well as fully backwards compatible.
Content owners can decide which resources will be described and in how
much detail, and the same infrastructure (albeit slightly extended)
provides an incremental crawl-walk-run transition towards ever richer
sources of knowledge.
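
To make the "same server, same URI" point concrete, here is a minimal
sketch using Python's standard http.server module. It assumes MGET
behaves as proposed (returning knowledge about the resource instead of
a representation); the handler name, the /report path, and the sample
content are all hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical content: one representation and one description for one path.
REPRESENTATIONS = {"/report": "<html><body>Quarterly report</body></html>"}
DESCRIPTIONS = {
    "/report": '<http://example.org/report> '
               '<http://purl.org/dc/elements/1.1/title> "Quarterly report" .'
}


class DualHandler(BaseHTTPRequestHandler):
    """Answers GET with a representation and MGET with a description."""

    def _send(self, store, content_type):
        body = store.get(self.path)
        if body is None:
            # Not every resource has to be described; owners add as they go.
            self.send_error(404)
            return
        data = body.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._send(REPRESENTATIONS, "text/html; charset=utf-8")

    def do_MGET(self):
        # http.server dispatches on "do_" + method name, so this method is
        # all it takes to accept the extra (non-standard) verb.
        self._send(DESCRIPTIONS, "text/plain; charset=utf-8")

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def request(port, method, path):
    """Issue one request to the local demo server; return (status, body)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request(method, path)
        response = conn.getresponse()
        return response.status, response.read().decode("utf-8")
    finally:
        conn.close()


def demo():
    """Start the server on a free local port and query one URI both ways."""
    server = HTTPServer(("127.0.0.1", 0), DualHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return request(port, "GET", "/report"), request(port, "MGET", "/report")
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns two (status, body) pairs for the same path. A
client that only knows GET keeps working unchanged, which is the
backwards-compatibility argument; a resource with no entry in
DESCRIPTIONS simply answers MGET with 404 until its owner chooses to
describe it.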

Patrick

--
Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com

Received on Tuesday, 4 February 2003 06:08:30 UTC