Proposed AWWW erratum on "information resources" [was Re: Fwd: Splitting vs. Interpreting]

On Thu, 2009-06-18 at 17:34 +0100, Sean B. Palmer wrote: 
> On Thu, Jun 18, 2009 at 7:45 AM, David Booth wrote:
> > The flaw that I think should be fixed is the definition of "information
> > resource" (IR) in the AWWW:
> >
> > "all of their essential characteristics can be conveyed in a message".
> What would you propose for an erratum?

Okay, since you asked . . .  ;)  I'd suggest the following changes.

1. The first three paragraphs of section 2.2 currently read:
By design a URI identifies one resource. We do not limit the scope of
what might be a resource. The term "resource" is used in a general sense
for whatever might be identified by a URI. It is conventional on the
hypertext Web to describe Web pages, images, product catalogs, etc. as
“resources”. The distinguishing characteristic of these resources is
that all of their essential characteristics can be conveyed in a
message. We identify this set as “information resources.”

This document is an example of an information resource. It consists of
words and punctuation symbols and graphics and other artifacts that can
be encoded, with varying degrees of fidelity, into a sequence of bits.
There is nothing about the essential information content of this
document that cannot in principle be transferred in a message. In the
case of this document, the message payload is the representation of this
document.

However, our use of the term resource is intentionally more broad. Other
things, such as cars and dogs (and, if you've printed this document on
physical sheets of paper, the artifact that you are holding in your
hand), are resources too. They are not information resources, however,
because their essence is not information. Although it is possible to
describe a great many things about a car or a dog in a sequence of bits,
the sum of those things will invariably be an approximation of the
essential character of the resource.

I suggest changing the above paragraphs to:
By design a URI identifies one resource.  The term "resource" is used in
a general sense for whatever might be identified by a URI.  We do not
limit the scope of what might be a resource.  A resource could be
anything that one may wish to identify --  physical, conceptual, real or

An "information resource" is any resource that plays a role in the
hypertext Web by producing "representations"[link to definition in sec
3.2] in response to Web requests.  Web pages, images, product catalogs
and other things that are made available on the Web are all information
resources.  Some information resources, such as static web pages, may
change very little or not at all over time.  Others, such as one that
displays the current weather report for Oaxaca, may vary frequently.
Similarly, some information resources, such as an interactive travel
booking site, may vary their representations depending on the requests
they receive.  Others, such as simple Web pages, may not.  Conceptually one
can think of an information resource as a function from time and request
to representation. 
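
(An illustration only, not part of the suggested wording.)  The "function
from time and request to representation" view could be sketched roughly
as follows.  The names Request, Representation, static_page and
clock_page are hypothetical, invented for this sketch; nothing here is
drawn from the AWWW or from any real library.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable


@dataclass(frozen=True)
class Request:
    # Hypothetical, minimal stand-in for a Web request.
    accept: str = "text/html"


@dataclass(frozen=True)
class Representation:
    # Hypothetical, minimal stand-in for a response payload.
    media_type: str
    body: str


# An information resource modeled as a function (time, request) -> representation.
InformationResource = Callable[[datetime, Request], Representation]


def static_page(when: datetime, request: Request) -> Representation:
    # Ignores both arguments: the same representation at all times.
    return Representation("text/html", "<p>Hello</p>")


def clock_page(when: datetime, request: Request) -> Representation:
    # Varies with time and with the request.
    if request.accept == "text/plain":
        return Representation("text/plain", when.isoformat())
    return Representation("text/html", f"<p>{when.isoformat()}</p>")
```

A static page is then simply a constant function, while a weather report
or a booking site depends on one or both arguments.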

Ambiguity of Resource Identity

Although a URI is intended to identify one resource, and ambiguity about
the identity of that resource should be avoided to the extent possible,
ultimately ambiguity is in the eye -- or the application -- of the
beholder.  Because anything can be a resource, what one party considers
a single resource (perhaps having multiple aspects) another party making
finer distinctions might consider multiple resources that should have
distinct URIs.  

For example, the content of a book may be placed on the web and
identified by a particular URI.  Many parties will have no need to
distinguish between the web page that provides the content of the book
and the content of the book as an artistic work that is subject to
copyright law.  Depending on one's perspective (or application) this may
be viewed as a case in which the URI unambiguously identifies a resource
that has multiple aspects or as a case of ambiguity, in which the
artistic work and the web page are each deserving of their own distinct
URIs.

Resources whose essential characteristics can be conveyed in a message
are good candidates for being considered information resources.  Other
things, such as cars and people, are less good candidates, because some
applications are likely to find them ambiguous.  For example, if the same URI is used
to directly identify both a person and a Web page -- an information
resource -- an application that records the creation dates of people and
Web pages may find this resource ambiguous, because it cannot
distinguish between the creation date of the person and the creation
date of the web page.  This ambiguity would be avoided by giving the
person and the web page separate URIs.  On the other hand, the use of
two separate URIs may impose a cost on other applications that have no
need to distinguish between the person and the Web page, because it
requires these applications to recognize two URIs that those
applications consider equivalent.
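
(Again an illustration only, not part of the suggested wording.)  The
creation-date example might look like this in a toy application that
keys its records by URI.  The function record_creation_dates, the
example.org URIs and the dates are all made up for the sketch.

```python
from datetime import date


def record_creation_dates(facts):
    """Store one creation date per URI.

    Returns the store and the set of URIs for which two different
    dates were asserted, i.e. the URIs this application finds ambiguous.
    """
    store = {}
    conflicts = set()
    for uri, created in facts:
        if uri in store and store[uri] != created:
            conflicts.add(uri)
        store.setdefault(uri, created)  # keep the first date seen
    return store, conflicts


# One URI used for both the person and the Web page (hypothetical data).
shared = [
    ("http://example.org/alice", date(1970, 1, 1)),   # the person
    ("http://example.org/alice", date(2009, 6, 18)),  # the Web page
]

# Separate URIs for the person and the Web page (hypothetical data).
separate = [
    ("http://example.org/alice#me", date(1970, 1, 1)),
    ("http://example.org/alice", date(2009, 6, 18)),
]
```

With the shared URI the application cannot tell which date belongs to
which thing; with separate URIs it can, at the price that applications
that do not care must now handle two URIs.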

2. The current definition of "representation" reads:
A representation is data that encodes information about resource state.
Representations do not necessarily describe the resource, or portray a
likeness of the resource, or represent the resource in other senses of
the word "represent".

I suggest changing this to:
A representation is a response, from an information resource, that
encodes information reflecting that information resource's state.
Representations do not necessarily describe the information resource, or
portray a likeness of the resource, or represent the resource in other
senses of the word "represent".  Only an information resource can have
representations in the sense used herein.

3. In addition to the above changes, there are many instances of the
word "resource" that should be changed to "information resource",
because the context only applies to information resources -- not
resources in general.  

David Booth, Ph.D.
Cleveland Clinic (contractor)

Opinions expressed herein are those of the author and do not necessarily
reflect those of Cleveland Clinic.

Received on Monday, 13 July 2009 04:06:21 UTC