RE: [Fwd: RE: "information resource"] from Patrick.Stickler@nokia.com on 2004-10-18 (www-tag@w3.org from October 2004)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 18 Oct 2004 12:00:41 +0300
To: <sandro@w3.org>, <fielding@gbiv.com>
Cc: <hhalpin@ibiblio.org>, <www-tag@w3.org>, <distobj@acm.org>
Message-ID: <1E4A0AC134884349A21955574A90A7A56471E3@trebe051.ntc.nokia.com>
> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of ext Sandro Hawke
> Sent: 17 October, 2004 21:56
> To: Roy T. Fielding
> Cc: Harry Halpin; www-tag@w3.org; Mark Baker
> Subject: Re: [Fwd: RE: "information resource"] 
> 
> 
> 
> 
> > > The problem seems to all come from when the idea of resource is
> > > de-anchored from the Web so that a "resource" can mean anything. I
> > > have no problem per se with that, but just am pointing out that some
> > > of the reasons people are looking into ideas such as "information
> > > resource" are because problems of authority and representation
> > > are a lot trickier off the Web than on the Web, or when the two
> > > intermix such as in the Semantic Web, where we have Web-statements
> > > about things off the Web.
> > 
> > No, the problem is that they are exactly the same issues and folks
> > just assume they are different because they don't understand the
> > actual issues faced by current Web implementations of resources and
> > how those issues impact what Web clients can assume about resources.
> > Make Web statements about things on the Web and you have the same
> > problems (and the same solutions) as those things off the Web.
> 
> It seems to me the problem is this: with RDF it becomes useful to
> assign URIs to things like dogs and movies (even ones which are not
> available for download).  When people do that, we easily get
> unintended URI collisions, as discussed in the current draft:
> 
>       Suppose, for example, that one organization makes use of a URI
>       to refer to the movie "The Sting", and another organization uses
>       the same URI to refer to a discussion forum about "The Sting."
>       This collision creates confusion about what the URI identifies,
>       undermining the value of the URI. If one wanted to talk about
>       the creation date of the resource identified by the URI, for
>       instance, it would not be clear whether this meant "when the
>       movie was created" or "when the discussion forum about the movie was
>       created." 
>           - http://www.w3.org/TR/2004/WD-webarch-20040816/#URI-collision
> 
> Some people seem to find the notion of "information resources" helps
> them avoid this kind of modeling error, or detect when other people do
> it.  In fact, an OWL reasoner will often be able to report an error
> when someone accidentally uses the URI of a movie when they meant the
> URI of a discussion forum -- as long as there is a sufficiently
> detailed ontology involved.  In "the running_time of X is ...", if the
> declared domain of running_time is movies, and someone uses the
> discussion forum URI instead, software can detect that.
> 
> A fairly simple ontology which might help a lot is to divide the world
> into things which are and are not information resources.   The World
> Wide Web Consortium is not an information resource, but what
> http://www.w3.org/ identifies is,

Err... who says it is?

Why can't <http://www.w3.org/> identify the actual World Wide Web
Consortium, with whatever one might GET via that URI being a
representation of the consortium?

Granted, if one wishes to have distinct URIs to talk about both the
consortium and the home page of the consortium, then if the W3C does
not explicitly say what such URIs identify, confusion will result. But
that is a separate issue (e.g. don't use URIs in RDF statements if you
aren't sure what they actually identify, or you pay the price).

>  ....  so it becomes practical to
> detect and report the error of someone using that URI to (directly)
> identify that organization.  Some of us think that an HTTP "200 OK"
> response on a GET or HEAD for some URI means it identifies an
> information resource,

And many of us think that such a conclusion is unfounded, invalid,
and dangerous.

And at least one of us thinks that one should simply be able to 
*ask* a web authority what a given URI identifies, so that folks
aren't stumbling around in the dark making wild guesses.

>  so this process becomes even easier:  if someone
> writes that they work for http://www.w3.org/, the type-error can be
> detected by only knowing about "works for" -- nothing needs to have been
> said about "http://www.w3.org/".

Such an approach grossly limits the potential of the web as a means
for providing information *about* resources of any kind, even information
resources.

A much more precise solution to this problem is, as mentioned
above, to allow folks to ask about URIs and be told explicitly
what kind of resource they identify, among other things.
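
As a rough illustration of the "ask and be told" approach, here is a
minimal Python sketch. Everything in it is an assumption made for the
example: the class names, the DESCRIPTIONS table standing in for
whatever an authority would actually publish, the example.org URI, and
the choice to declare <http://www.w3.org/> as identifying the
consortium itself. It is not a description of any deployed protocol.

```python
# Hypothetical class names; not taken from any real vocabulary.
ORGANIZATION = "Organization"
WEB_PAGE = "WebPage"

# Stand-in for what the authority behind each URI explicitly declares.
# Both entries are illustrative assumptions, not statements by the W3C.
DESCRIPTIONS = {
    "http://www.w3.org/": ORGANIZATION,
    "http://example.org/w3c-home-page": WEB_PAGE,
}


def describe(uri):
    """Return the declared class for a URI, or None if nothing is declared."""
    return DESCRIPTIONS.get(uri)


def check_works_for(employer_uri):
    """Check an employer URI against the range of a 'works for' property.

    The decision uses only the explicit declaration, never an HTTP status.
    """
    declared = describe(employer_uri)
    if declared is None:
        return "unknown"
    return "ok" if declared == ORGANIZATION else "type error"
```

With these assumed declarations, a "works for" statement pointing at
<http://www.w3.org/> passes the check, one pointing at the hypothetical
home page URI is reported as a type error, and an undescribed URI is
simply reported as unknown rather than guessed at.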

And being able to use http: URIs to identify *any* resource whatsoever
allows for the publication of information about those resources via
the globally deployed, proven web infrastructure.

> (I also tend to think 301, 302, and 307 imply that something is an
> information resource, but too much good data (like foaf and dublin
> core) breaks with that assumption, 

Which should be reasonable evidence that making such
assumptions is wrong.

> which is why I'm trying to get them
> to switch to using 303 See Other, and for now my code does not make
> such an assumption.)

I don't see why any particular HTTP response should be expected to
provide information about the class of a resource -- except for
the "trivial" class of web resources. 
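
To make the disagreement concrete, here is a small Python sketch that
puts the two positions side by side. The function names and return
strings are invented for the example. The first function follows the
heuristic as described in the quoted text (200 taken as evidence of an
information resource, no conclusion drawn from 301, 302, 303 or 307);
the second is the agnostic reading argued for here.

```python
def guess_from_status(status_code):
    """Disputed heuristic: read the class of a resource off the HTTP status."""
    if status_code == 200:
        return "information resource"
    # Redirects (301, 302, 303, 307) and everything else: no conclusion.
    return "no conclusion"


def agnostic_view(status_code):
    """Agnostic reading: the status code says nothing about the class.

    Whatever the response, the URI denotes some resource; a successful
    response only shows that a representation of it is accessible.
    """
    return "resource"
```

Under the first function, any term whose URI answers a GET with 200
gets classified as an information resource, which is exactly the
conclusion objected to above.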

The web architecture IMO should be completely agnostic to the meaning
which we associate with URIs, dealing solely with the business of 
providing access to representations via particular URIs. At the
web layer, a URI denotes a resource. Period. And one might access
representations of that resource (whatever it may be) via that URI,
using the web machinery. Period. The web architecture should not
have to concern itself with any details about any resource denoted
by a URI, other than what representations of that resource are
accessible.

What those URIs identify, and the characteristics of those resources,
is a matter for the semantic web layer, not the web layer.

The web and semantic web layers are seamlessly integrated because
they share a common set of URIs which are presumed to identify the
same resources for both layers, so that one can provide access
to representations of a resource at the web layer and make
statements about and reason about the same resource at the 
semantic web layer.
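
A minimal Python sketch of that layering, under stated assumptions: the
two in-memory tables, the "ex:Organization" class, and the HTML body
are all made up for illustration. The only point is that both layers
are keyed on the same URI, and that the web layer stores nothing about
what kind of thing the URI identifies.

```python
# Web layer: URI -> available representations (media type -> body).
# It holds no information about the class of the resource.
REPRESENTATIONS = {
    "http://www.w3.org/": {
        "text/html": "<html><title>W3C</title></html>",  # made-up body
    },
}

# Semantic web layer: statements as (subject, predicate, object).
# The class assertion is an illustrative assumption.
STATEMENTS = [
    ("http://www.w3.org/", "rdf:type", "ex:Organization"),
]


def get_representation(uri, media_type):
    """Web layer: return a representation of the resource, or None."""
    return REPRESENTATIONS.get(uri, {}).get(media_type)


def statements_about(uri):
    """Semantic web layer: return (predicate, object) pairs for a subject."""
    return [(p, o) for (s, p, o) in STATEMENTS if s == uri]
```

Here get_representation answers "what can I retrieve via this URI?",
while statements_about answers "what is said about the thing this URI
identifies?", and neither needs the other's data to do its job.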

But let's stop trying to make the web architecture do what the
semantic web machinery is better suited for.

> So from a process perspective, having Information Resource defined
> turns httpRange-14 into pretty much a Yes or No question, 

I guess that all depends on what the definition of "information resource"
is. You seem to be defining it the same as I am defining "web resource"
(as it was defined in the 2nd call draft).

And that was the source of my original objections, as it immediately
invalidates all of my work and what is already deployed by Nokia and
many others (including all Dublin Core terms, etc.).

> although I'm
> still arguing that you have to do the dereference.  This is mostly for
> practicality, since I don't see DC, FOAF, or RSS changing to use hash
> URIs.  

That's for sure.

>  I sympathize with people who want to learn something about a
> URI's resource (eg that it's an information resource) just by looking
> at the text of the URI, but I think it's already too late to do that
> with http.

Agreed. Given the already deployed, successful applications which are
not compatible with the much narrower side of the httpRange-14 debate,
it's too late, even if it were largely agreed that doing so is a good 
thing, which is not the case.

Patrick
Received on Monday, 18 October 2004 09:09:03 UTC