RE: on documents and terms [was: RE: [WNET] new proposal WN URIs and related issues] from Booth, David (HP Software - Boston) on 2006-04-29 (public-swbp-wg@w3.org from April 2006)

From: Booth, David (HP Software - Boston) <dbooth@hp.com>
Date: Fri, 28 Apr 2006 23:44:08 -0400
To: "Dan Connolly" <connolly@w3.org>, <public-swbp-wg@w3.org>
Message-ID: <EBBD956B8A9002479B0C9CE9FE14A6C20B9320@tayexc19.americas.cpqcorp.net>
> From:  Dan Connolly
> . . .
> Pat Hayes wrote:
> > My current
> > understanding is that an information resource is some thing 
> > that can 
> > be transmitted over a network by a transfer protocol. On this 
> > understanding, one could argue that a word was an information 
> > resource.
> 
> On Thu, 20 Apr 2006 17:40:20 -0400 Booth, David wrote: 
> > It sounds like you are mainly disagreeing with the TAG's guidance.
> 
> For what it's worth, I think Pat's position is consistent 
> with the TAG's position (i.e. the W3C's position, since 
> webarch is now a W3C Recommendation).

I'm surprised and baffled, since I thought Pat argued that it is okay
for a URI to be used both as a name for a person and a name for a
document that describes that person.  But I guess you're referring to
this one point about a word being an information resource.

> . . . The definition of "Information Resource" that W3C 
> endorses[10] is:
> . . .
> http://www.w3.org/TR/2004/REC-webarch-20041215/#def-information-resource
>
> I don't think that means that words are not information resources.

I think it may depend on what you mean by "words".  

If http://example.org/doc.html identifies a single resource, and the
associated document is updated to correct typos, then clearly
http://example.org/doc.html is identifying more than just the words that
are *currently* served from that URI: it is identifying a document
*abstraction*, rather than a particular document instance or a
particular set of words.  I don't see how "all of [the] essential
characteristics"[10] of that document *abstraction* can be "conveyed in
a message"[10].

Similarly, if http://weather.example.com/oaxaca identifies a single
resource that is "a periodically updated report on the weather in
Oaxaca"[10], then I don't see how "all of [the] essential
characteristics"[10] of that periodically updated report can be
"conveyed in a message"[10].

Because "information resources" can return different "representations"
at different times (even if some happen to return the same
representation every time), it seems to me that "information resources"
are by their very nature abstract.  

Clearly the notion of an "information resource" is modeled after the
real-life notion of the contents of a (logical) disk region, on a Web
server, that is associated with a URI "racine".  (The "racine" is all of
the URI except the fragment identifier.[11])  The server is configured
to return those contents, whatever they are, when the URI racine is
dereferenced.  And those contents may change over time!  Thus, the URI
racine is not identifying any *particular* contents, it is identifying
the logical *location* where those contents are stored, and the server
provides whatever contents happen to be stored there at the moment they
are requested.  
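
To illustrate the point about location versus contents, here is a
minimal sketch in Python. The ToyServer class and its put/get methods
are hypothetical stand-ins for a real Web server, not any actual server
API; only urllib.parse.urldefrag comes from the standard library. The
sketch strips the fragment identifier to get the racine, then shows the
same racine yielding different contents before and after a typo fix.

```python
from urllib.parse import urldefrag


def racine(uri: str) -> str:
    """Return the URI with any fragment identifier removed."""
    return urldefrag(uri).url


class ToyServer:
    """Hypothetical in-memory stand-in for a Web server (an assumption)."""

    def __init__(self):
        # Maps a racine (a logical location) to whatever is stored there now.
        self._store = {}

    def put(self, uri: str, contents: str) -> None:
        # Overwrites the current contents; the location itself is unchanged.
        self._store[racine(uri)] = contents

    def get(self, uri: str) -> str:
        # Returns whatever happens to be stored at the moment of the request.
        return self._store[racine(uri)]


server = ToyServer()
doc = "http://example.org/doc.html"

server.put(doc, "Teh weather is fine.")
first = server.get(doc + "#intro")

# The document is updated to correct a typo; the URI does not change.
server.put(doc, "The weather is fine.")
second = server.get(doc + "#intro")
```

Here first and second differ even though both requests used the same
URI, which is the sense in which the racine names a location rather
than any particular contents.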

In fact, it is not even possible on the Web to create a URI that is
permanently bound to a single document instance that can never change:
it is *always* possible to change the server configuration or domain IP
mapping to cause a different document instance to be served.  In other
words, an http URI on the real Web identifies a logical *location* whose
content *always* has the potential of changing.  Similarly (I argue), an
"information resource" is *necessarily* abstract.  Thus, if something is
not abstract, then it cannot be an "information resource".

So returning to your comment about whether a word could be an
"information resource", it depends on what you mean by "word".  If an
alternate spelling of "color" is "colour", then we are referring to an
abstract notion of a word, whose spelling may vary.  However, if you are
referring to a particular sequence of characters that can be transmitted
over the network, that is a *concrete* notion of "word", and thus cannot
be an "information resource".
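
The two senses of "word" can be sketched as follows, again in Python.
The AbstractWord class, its fields, and its matches method are purely
illustrative assumptions of mine; they are not taken from WordNet or
from any TAG document. The abstract word carries a set of spellings,
while each concrete word is just one character sequence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AbstractWord:
    """Hypothetical model of the abstract notion: one word, many spellings."""

    lemma: str
    spellings: frozenset

    def matches(self, concrete: str) -> bool:
        # True if the concrete character sequence is one of the spellings.
        return concrete in self.spellings


color = AbstractWord("color", frozenset({"color", "colour"}))

# Two concrete character sequences, as they might be sent over the network.
american = "color"
british = "colour"

same_abstract_word = color.matches(american) and color.matches(british)
same_concrete_word = american == british
```

Under these assumptions the two strings are different concrete words
but are both spellings of the one abstract word.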

> 
> I tried to cover this in a recent submission to IRW2006...
> 
> [[
> Note that the TAG has not taken a position on whether
>  w:InformationResource intersects with rdf:Property. ]]
>  -- "An analysis of httpRange-14" section  
> http://www.w3.org/2006/04/irw65/urisym#hr14

Great paper!

[8] TAG httpRange-14 decision:
http://lists.w3.org/Archives/Public/www-tag/2005Jun/0039.html

[9] Tim Bray's proposed definition of "information resource":
http://lists.w3.org/Archives/Public/www-tag/2003Jul/0377.html

[10] WebArch definition of "information resource":
http://www.w3.org/TR/2004/REC-webarch-20041215/#def-information-resource

[11] Definition of "racine":
http://www.w3.org/2000/10/swap/log#racine

David Booth
Received on Saturday, 29 April 2006 03:53:56 UTC