JAR's exploration of TimBL's notion of information resource from Jonathan Rees on 2009-05-13 (public-awwsw@w3.org from May 2009)

From: Jonathan Rees <jar@creativecommons.org>
Date: Wed, 13 May 2009 10:34:33 -0400
To: AWWSW TF <public-awwsw@w3.org>
Message-ID: <760bcb2a0905130734j7099560cxa098ff8e610c643c@mail.gmail.com>
JAR's exploration of TimBL's notion of information resource.

[The below does not constitute an endorsement of any particular theory
of generic resources or information resources - especially not the one
described herein.]

Terminology / notation:
  generic-resource = GR = 'information resource' in Tim's sense [1] as
    JAR currently understands it
  wa-representation = 'representation' in the REST or web architecture
    (AWWW) sense (NOT in the Xiaoshu or plain-English sense that
    permits, say, a rock, or a citizen, to be represented)

  G = some generic resource
  Z = a multidimensional parameter space
    (e.g. time * language * content-type * user-agent)
  P = a point in Z

So far all we know formally is that there is a 3-way relation

  G has wa-representation R at point P

That is, for each P = tuple of parameter values, there is a set
(possibly empty, or quite large) of wa-representations with the
property that they are wa-representations of G for the parameters P.

We can derive other relations from this one, e.g.

  G has wa-representation R at time t

meaning G has wa-representation R for some P with P.time = t.

I define the *trace* of G to be the function mapping each point P in Z
to the set of wa-representations (possibly empty) that G has at P.
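
A minimal sketch of the relation and the trace, in Python.  The choice
of axes (time, language, content-type), the name HAS_REP, and all
sample values are assumptions made for illustration only; none of them
come from Tim's or Roy's texts.

```python
from collections import namedtuple

# A point P in the parameter space Z. The three axes are an assumption.
Point = namedtuple("Point", ["time", "language", "content_type"])

# The 3-way relation "G has wa-representation R at point P",
# stored as a set of (G, R, P) triples. All values are made up.
HAS_REP = {
    ("G1", "moby-en.txt", Point(2009, "en", "text/plain")),
    ("G1", "moby-en.html", Point(2009, "en", "text/html")),
    ("G1", "moby-en-v2.html", Point(2009, "en", "text/html")),
}

def trace(g, relation=HAS_REP):
    """Return the trace of g: a function from a point P to the
    (possibly empty) set of wa-representations g has at P."""
    def t(p):
        return frozenset(r for (g2, r, p2) in relation if g2 == g and p2 == p)
    return t

def reps_at_time(g, time, relation=HAS_REP):
    """Derived relation: g has wa-representation R at time t,
    i.e. at some point P with P.time == t."""
    return frozenset(r for (g2, r, p) in relation if g2 == g and p.time == time)
```

Here trace("G1") is itself a function, so trace("G1")(P) is the set of
wa-representations at P, and it is empty for any point not mentioned in
the relation.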

If the parameter space Z is one-dimensional consisting only of time,
we get the formal model of Roy's thesis: a REST resource G is formally
[modeled as] a function from time to the set of its wa-representations
at that time.

If the parameter space Z has two dimensions (time, HTTP GET
request), AND every set in the image of the trace has at most one
element, then the resulting class of traces coincides with David's
FTRR definition.  (So I would say the FTRR is the *trace* of some GR,
not that it *models* the GR, because it might or might not depending
on what one wants to use the model for.)  (I assume David means for an
FTRR to be partial - you don't *have* to have a wa-representation for
every request and time.)
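
The "at most one element" condition can be sketched as a check on a
trace stored as a table.  This is an illustration under assumptions:
the dict layout, the function name is_single_valued, and the sample
requests are hypothetical, and this is not David's own formulation.

```python
def is_single_valued(trace_table):
    """True if every set in the image of the trace has at most one element."""
    return all(len(reps) <= 1 for reps in trace_table.values())

# Hypothetical traces over the two-dimensional space (time, HTTP GET request).
# A point that is absent, or mapped to the empty set, has no
# wa-representation, so the trace is allowed to be partial.
ftrr_like = {
    (1, "GET /moby Accept: text/html"): {"moby.html"},
    (1, "GET /moby Accept: text/plain"): {"moby.txt"},
    (2, "GET /moby Accept: text/html"): set(),
}
not_ftrr_like = {
    (1, "GET /moby Accept: text/html"): {"moby.html", "moby-v2.html"},
}
```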

In Tim's theory we know that Z has at least three axes (time,
language, content-type), maybe more (user-agent, authorization,
Russell 2000 index).  We know that a wa-representation can belong to
the trace of more than one GR, and that a GR can have, at one point P,
more than one wa-representation (as would e.g. Moby Dick).

As determined on the call, there is nothing that formally rules out a
"bottom" GR that has no wa-representations (trace is everywhere
empty), or a "top" GR that has *all* wa-representations (i.e. GR has
wa-representation R at P for all R and P, or trace is everywhere
universal).  The latter may be useless, but not nonsensical.  In fact
it may be the case that given an *arbitrary* trace, there is (or could
be?) a GR with that trace.  This is not essential in what follows, but
it would be nice to know, if it is not true (ontologically), why it
isn't - what kinds of traces *do* not have corresponding GRs?

Suppose that two parties, Alice and Bob, get together.  Between the
two of them they somehow agree to talk about a particular
generic-resource (such as Moby Dick generically, or perhaps the
Penguin 2001 edition of Moby Dick, or a GR having as its sole
wa-representation one whose content-type and content are those from
[2] with content checksum 137aace70c30eb076407cf28bd78b884), which
they between themselves call G1.  Suppose that they agree on what G1
is to the extent that they can each separately distinguish
wa-representations that are wa-representations of G1 from those that
aren't, for any parameters P, and do so with perfect agreement - that
is, they both have full knowledge of the trace of G1.

We determined on the call that the trace isn't adequate to determine
G1 - there may be some other generic-resource G2 with exactly the same
trace as G1 that is still somehow different from G1.  So if Alice and
Bob are to know that they're really talking about the same GR
(assuming it's possible for them to know that), they will also have to
exchange additional information.  We don't yet know what additional
characteristics would be sufficient (essential) for determining
sameness, and since these characteristics must be message-conveyable
(according to AWWW), it will be very interesting to learn what they
are....
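
A small sketch of the point that equal traces need not mean the same
GR.  The class GenericResource and the helper same_trace are
hypothetical, and the sketch deliberately says nothing about what the
additional distinguishing characteristics might be.

```python
class GenericResource:
    """A GR with a label and a trace table.

    Equality is left as Python's default (object identity), so two GRs
    with identical traces are still distinct objects."""
    def __init__(self, label, trace_table):
        self.label = label
        self.trace_table = trace_table

def same_trace(a, b):
    """True if a and b have exactly the same trace."""
    return a.trace_table == b.trace_table

# Made-up trace shared by two distinct GRs.
shared = {("2009", "en", "text/plain"): frozenset({"rep-1"})}
g1 = GenericResource("G1", dict(shared))
g2 = GenericResource("G2", dict(shared))
```

Alice and Bob comparing traces amounts to calling same_trace(g1, g2),
which succeeds here even though g1 and g2 are not the same object.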

OK, now suppose that S is an HTTP server, and G is a generic-resource,
and U is a URI.  Define "S is consistent with G at U" as follows:

  if, whenever S receives an HTTP GET request with request-URI U and
    responds with a 200 response,
  the RFC2616-entity in the 200 response is a wa-representation of G,
  then S is consistent with G at U.
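
A sketch of this definition applied to a finite log of observations.
Two assumptions are made here that the definition itself does not
state: that each response is judged at the point P determined by its
request and time, and that G is given as a predicate has_rep(entity,
point).  Since the definition quantifies over every request ("whenever"),
a finite log can refute consistency but never establish it, hence the
name is_consistent_so_far.  All names and values are hypothetical.

```python
def is_consistent_so_far(observations, uri, has_rep):
    """Check observed GET/response events against a GR.

    observations: iterable of (request_uri, point, status, entity).
    has_rep(entity, point): True if entity is a wa-representation of G at point.
    Only 200 responses for the given uri are constrained."""
    for req_uri, point, status, entity in observations:
        if req_uri == uri and status == 200 and not has_rep(entity, point):
            return False
    return True

# Made-up trace for G1 and a made-up server log.
G1_TRACE = {("t1", "en"): {"moby.txt"}, ("t2", "en"): {"moby.txt", "moby.html"}}

def g1_has_rep(entity, point):
    return entity in G1_TRACE.get(point, set())

log = [
    ("http://example.org/UA", ("t1", "en"), 200, "moby.txt"),
    ("http://example.org/UA", ("t2", "en"), 404, "not-found-page"),
    ("http://example.org/other", ("t1", "en"), 200, "unrelated"),
]
```

Non-200 responses and responses for other URIs place no constraint on
the server, which is why the 404 entry and the entry for another URI do
not affect the result.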

Suppose that Alice's server A is consistent with G1 at URI UA, and
Bob's server B is consistent with the same generic resource G1 at URI
UB.  Then A and B are each obligated only to respond with
wa-representations of G1; except as constrained by the HTTP protocol,
they are *not* required to deliver any *particular* wa-representation
of G1, or to respond in the same way to the same request.  That is, it
is the servers A and B that choose, individually (subject to the
rules of CN, i.e. content negotiation, of course), which among the
many wa-representations of G1 they would like to return.

It is entirely possible that certain generic-resources have so few
wa-representations that the choice is entirely determined by CN
parameters, in which case A and B *will* deliver the same response for
the same request.  But this would be a special case.

As an interesting mathematical construction, one could define, for a
given URI U, the trace T_U corresponding to the GET-request/200-response
events *actually* occurring over the network for GETs of U from the
server responsible for U.  Obviously T_U does not determine a single GR,
as many GRs might have T_U as their trace or as a "subtrace" of their
trace (i.e. T_U(P) is a subset of T2(P) for all P, where T2 is the trace
of the GR in question).  Any series of GETs is only a sample of the
trace of the particular GR that is served at U.  But a client might
still ask: What are some interesting generic resources that *might* be
the GR that is served at U, according to the evidence seen so far?  It
could then form plausible hypotheses about the GR on which it might
gamble, if it were the gambling type.
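
A sketch of how a client might build T_U from observed events and
filter candidate GRs by the subtrace condition.  The event layout, the
function names, and the candidate table are hypothetical; real evidence
would of course be messier.

```python
from collections import defaultdict

def observed_trace(events, uri):
    """Build T_U from (request_uri, point, status, entity) events:
    for each point, the set of entities seen in 200 responses for uri."""
    t = defaultdict(set)
    for req_uri, point, status, entity in events:
        if req_uri == uri and status == 200:
            t[point].add(entity)
    return dict(t)

def is_subtrace(t_u, t2):
    """True if T_U(P) is a subset of T2(P) for all P."""
    return all(reps <= t2.get(p, set()) for p, reps in t_u.items())

def plausible_candidates(t_u, candidates):
    """Names of candidate GRs whose trace has T_U as a subtrace."""
    return [name for name, t2 in candidates.items() if is_subtrace(t_u, t2)]
```

A candidate that survives the filter is merely not refuted by the
sample; several candidates will typically survive.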

Jumping ahead, let me take a crack at the web / semantic web
unification, which I'm sure will be wrong: Let I be an
(RDFS-)interpretation in the RDF semantics sense [3].  Then I is a
"web compatible interpretation" if for every URI U, the server S
responsible for U is consistent with I(U) at U.

(You might want to require that I(U) actually *be* the resource known
to S as U, as opposed to another one that's merely consistent with
observation, but it's almost never possible for an observer to
determine what that GR is.  The nice thing about requiring only
consistency is that it is equivalent to merely adding triples asserting
that the wa-representations observed from S are wa-representations of
whatever U is interpreted to be, which seems almost tractable.)
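
A sketch of what "adding triples" for observed wa-representations might
look like.  The predicate URI below is invented for this example and is
not part of any published vocabulary; naming each wa-representation by
a blank node derived from a SHA-256 digest of its content-type and
content is likewise just one possible choice.

```python
import hashlib

# Hypothetical predicate; not a real, published property.
HAS_WA_REP = "http://example.org/awwsw#hasWaRepresentation"

def observation_triples(observations):
    """Turn (uri, content_type, body_bytes) observations into a set of
    (subject, predicate, object) triples written as N-Triples-style terms."""
    triples = set()
    for uri, content_type, body in observations:
        digest = hashlib.sha256(content_type.encode("utf-8") + b"\n" + body).hexdigest()
        rep = "_:rep" + digest  # blank node label for this wa-representation
        triples.add(("<" + uri + ">", "<" + HAS_WA_REP + ">", rep))
    return triples
```

Because the result is a set, observing the same wa-representation twice
at the same URI adds nothing new.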


[1] http://www.w3.org/DesignIssues/Generic.html
[2] http://www.gutenberg.org/dirs/etext91/moby.zip
[3] http://www.w3.org/TR/rdf-mt/
Received on Wednesday, 13 May 2009 14:35:08 UTC