Re: JAR's exploration of TimBL's notion of information resource from Jonathan Rees on 2009-05-26 (public-awwsw@w3.org from May 2009)

From: Jonathan Rees <jar@creativecommons.org>
Date: Tue, 26 May 2009 12:37:49 -0400
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Cc: AWWSW TF <public-awwsw@w3.org>
Message-ID: <760bcb2a0905260937o3e789f71hc925121f4acbdc26@mail.gmail.com>

Short form:

1. We start by only looking at the *formal* system - i.e. no
definitions, just suggestive, mysterious, and unquestionable words
and phrases.

Assume:
A class GR of generic resources G
A class Rep of wa-representations R
A class Z of parameter vectors P, selecting state (time), language, etc.

A ternary relation "R is a wa-representation of G at P"

(A wa-representation does not necessarily have a G of which it is a
wa-representation, right?)

Define:
wa-representations(G, P) = set of all R such that R is a
wa-representation of G at P

Define:
trace(G) = lambda(P).wa-representations(G, P)

(N.b. there can be multiple GRs with identical traces.)
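
To make the definitions above concrete, here is a minimal Python sketch.
Everything in it is an assumption made for illustration: the toy relation
IS_REP_OF, the resource names G1/G2/G3, the parameter values "t0"/"t1", and
the helper same_trace are hypothetical and not part of the formal system.
It only shows that trace(G) is a function of P, and that two distinct GRs
can share a trace.

```python
# Hypothetical toy data: the ternary relation
# "R is a wa-representation of G at P", stored as (G, P, R) triples.
IS_REP_OF = {
    ("G1", "t0", "hello"),
    ("G1", "t1", "hello v2"),
    ("G2", "t0", "hello"),
    ("G2", "t1", "hello v2"),
    ("G3", "t0", "bonjour"),
}


def wa_representations(g, p):
    """Set of all R such that R is a wa-representation of g at p."""
    return frozenset(r for (g2, p2, r) in IS_REP_OF if g2 == g and p2 == p)


def trace(g):
    """trace(G) = lambda(P).wa-representations(G, P)"""
    return lambda p: wa_representations(g, p)


def same_trace(g1, g2, params):
    """Compare two traces pointwise over a finite set of parameter vectors."""
    return all(trace(g1)(p) == trace(g2)(p) for p in params)


PARAMS = ["t0", "t1"]
g1_and_g2_agree = same_trace("G1", "G2", PARAMS)  # distinct GRs, identical traces
g1_and_g3_agree = same_trace("G1", "G3", PARAMS)
```

Comparing traces over a finite list of parameters is itself a simplification;
in the formal system Z need not be finite.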

(could define time-invariant resource, fixed resource, etc. but don't
need them here)

If Z = time * request, then every trace(G) is a Boothian ftrr.
If Z = time, then every trace is a "resource" in the formal sense of
Fielding&Taylor.
  (In F&T the other parameters are carried by the representation itself.)

The unfortunate thing is that no consequences flow from these definitions.
So there are no useful theorems, and little help with interpretation (ontology).

2. So we take a step away from formalization, not in the direction of
ontology but in the direction of empiricism. The web is an apparatus
for doing GET-experiments, whose results we can record as formal
statements:
  we write  GOT(U,P,R)  to mean that we did an HTTP GET with
  request-URI U and parameters P, and received a 200 response
  containing an entity R.

(By hypothesis every HTTP entity is a wa-representation.)

(Suppressing provenance - to whom the statement should be attributed -
at Tim's request.)

So this gives a chance to tenuously relate the "generic resource" idea
to reality:

I define "GOT(U,P,R) is consistent with U being a name for G" to mean
simply that R is a wa-representation of G at P. That is, if a server
wants to behave consistently with the notion of U being a name for G,
then it had better deliver wa-representations of G when a U-request
comes along.

Thus, a single GET-experiment GOT(U,P,R) doesn't tell us that
U names any particular G, but it does rule out U being a name for any
G that does *not* have R in wa-representations(G,P).
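
The ruling-out step can be sketched in Python as follows. This is only an
illustration under stated assumptions: the Got record, the toy relation
IS_REP_OF, the candidate names G1/G2/G3, and the example.org URI are all
hypothetical.

```python
from typing import NamedTuple


class Got(NamedTuple):
    """One recorded GET-experiment GOT(U, P, R)."""
    uri: str
    params: str
    entity: str


# Hypothetical ternary relation, as (G, P, R) triples.
IS_REP_OF = {
    ("G1", "t0", "hello"),
    ("G2", "t0", "hello"),
    ("G3", "t0", "bonjour"),
}


def consistent(got, g):
    """GOT(U,P,R) is consistent with U being a name for g
    iff R is a wa-representation of g at P."""
    return (g, got.params, got.entity) in IS_REP_OF


def surviving_candidates(got, candidates):
    """Drop every candidate G that the experiment rules out."""
    return {g for g in candidates if consistent(got, g)}


experiment = Got("http://example.org/doc", "t0", "hello")
survivors = surviving_candidates(experiment, {"G1", "G2", "G3"})
```

Note that two candidates survive here: the experiment eliminates G3 but
does not single out which G, if any, the URI names.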

If GOT(U,P,R) is false, that could just mean the URI U is not meant to
be a name for G - I never said it was. It could be a name for something
else, or not a name for anything.

(This is not deep. I keep hoping I'm stating the obvious...)

--- End of short part. ---

One might say that G is "on the web" at U if all GET-experiments
GOT(U,P,R) are consistent with U being a name for G. As the future
can't be predicted, and any future GET-experiment might falsify this
statement, any belief in such a statement is a gamble (just as is
belief in any inductive hypothesis, cf. Popper), as one can never say
for sure that it's true; but it may nevertheless be very useful to
assert it and believe it.
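
A small Python sketch of the "gamble": the statement can only be checked
against the experiments recorded so far, and one later experiment can
falsify it. The names here (Got, IS_REP_OF, unfalsified, the URI, the
parameter and entity values) are assumptions for illustration only.

```python
from typing import NamedTuple


class Got(NamedTuple):
    """One recorded GET-experiment GOT(U, P, R)."""
    uri: str
    params: str
    entity: str


# Hypothetical relation: G has exactly one wa-representation at t0 and at t1,
# and none that we know of at t2.
IS_REP_OF = {("G", "t0", "v0"), ("G", "t1", "v1")}

U = "http://example.org/g"


def unfalsified(g, uri, log):
    """True if no recorded experiment on `uri` is inconsistent with
    `uri` being a name for `g`. This is evidence, never proof."""
    return all(
        (g, e.params, e.entity) in IS_REP_OF for e in log if e.uri == uri
    )


log = [Got(U, "t0", "v0"), Got(U, "t1", "v1")]
ok_before = unfalsified("G", U, log)

# A later experiment returns an entity that is not a wa-representation of G.
log.append(Got(U, "t2", "unrelated page"))
ok_after = unfalsified("G", U, log)
```

An empty log leaves the hypothesis trivially unfalsified, which is one way
to see why surviving experiments cannot establish it.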

This is not the same as saying that anyone thinks U is a name for, or
"identifies", or "denotes" G; it may be just coincidence that we get
its representations at this URI. In particular, if G1 and G2 have the
same trace, and G1 is "on the web" at U, then G2 is also "on the web"
at U. If you want to tell me that you use U as a name for G, you'll
have to communicate that fact using some means other than
GET-experiments. (Thus my interest in LRDD [1], W3C site policy, the
data: URI scheme, and so on.)

A generic resource G can potentially be "on the web" at any URI U. The
server responsible for U just has to never respond with any entity
that is not a wa-representation of G.

This framework is meant to apply to things like G = Moby Dick. But the
more pedestrian situation for G to be "on the web" is if G has some
physical representative or container (such as a file on disk) that a
server can probe when a request arrives. That is, suppose G is defined
to be the contents of a particular file on some disk (i.e., its single
wa-representation at any time t is the contents of the file at t), and
suppose we arrange for some server running, say, Apache, to deliver
the contents of the file whenever a request comes in for some URI U
(owned by someone who controls the server and so on). Then every
GET-experiment GOT(U,P,R) will be consistent with U being a name for
G, by construction; so G is "on the web" at U.
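
The file-backed case can be simulated in a few lines of Python. This is a
sketch under loud assumptions: the class FileBackedResource, the function
serve, the logical clock, and the URI are all hypothetical stand-ins, with
an in-memory history playing the role of the file on disk and serve playing
the role of the Apache server.

```python
class FileBackedResource:
    """Hypothetical G: its single wa-representation at time t is the
    contents of the file at t. Time is a logical step counter."""

    def __init__(self, initial):
        self._history = [initial]  # index = logical time step

    def write(self, data):
        """Overwrite the file; this advances the logical clock."""
        self._history.append(data)

    def now(self):
        return len(self._history) - 1

    def wa_representations(self, t):
        return {self._history[t]}


def serve(resource):
    """Stand-in for the server: probe the file when a request arrives
    and return (time of request, entity delivered)."""
    t = resource.now()
    (entity,) = resource.wa_representations(t)
    return t, entity


U = "http://example.org/file"
g = FileBackedResource(b"draft 1")
log = []  # recorded experiments as (U, P, R) with P = time

t, r = serve(g)
log.append((U, t, r))
g.write(b"draft 2")
t, r = serve(g)
log.append((U, t, r))

# Every experiment is consistent with U being a name for G, by construction.
all_consistent = all(r in g.wa_representations(t) for (_, t, r) in log)
```

The consistency check cannot fail here because the server and the
definition of G read the same state; that is what "by construction" amounts
to in this toy model.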

In any case, it's the information that the file holds, and not the
(physical) file itself, that is the generic resource. And the server
does have discretion in choosing which of the resource's many
representations (when there are many for a given P) it delivers; it
just has to avoid returning wa-representations that are not
wa-representations of the resource, respecting any requested parameter
constraints.

The baffling part for me has been how Moby Dick and a "network
resource" (in the sense of RFC 2616) can both be members of any single
coherent class. Let me repeat my disclaimer: This is not my theory;
it's my attempt at understanding someone else's. I'm not quite there
yet.

Jonathan

[1] http://www.hueniverse.com/hueniverse/2009/03/the-discovery-protocol-stack.html
Received on Tuesday, 26 May 2009 16:38:29 UTC