sketch of an exposition from Jonathan Rees on 2010-05-17 (public-awwsw@w3.org from May 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Mon, 17 May 2010 17:40:59 -0400
To: AWWSW TF <public-awwsw@w3.org>
Message-ID: <AANLkTikRBfQ7me9SOub5721t4WbjMGxY_uOJKMJIBBh7@mail.gmail.com>
Apologies up front:
  - sorry it's rough and unformatted.  I'm trying out expository ideas
and terminology & wanted to get this out to you all for critique
  - topic not covered: metadata subjects (DC, FRBR, etc.); redirections
  - tell me which statements you disagree with! we thrive on
statements that are interesting enough that one can argue over them.
  - idle question: does every IR have a REST-representation?

-Jonathan

-------------------

Axiomatic method = don't take anything for granted - if some result
can't be proved from axioms already stated, do not assume that it is
true.

Assume a universe of discourse, which I'll call Thing.

In formal treatments one needs a way to refer to (or name or
designate) things.  For this purpose we may use URIs, although other
notations may be useful too.

Reference is not objective; when a URI refers to a Thing it's
because someone has chosen to have it do so.

Reference does not imply any special knowledge of a Thing.  I can
talk about a thing without knowing exactly which thing I'm talking
about - for example, I might be communicating partial knowledge
(properties) that I received from someone else.  Reference is not
"identification".

We'll suppose that (in any given conversation or context) a URI refers
to at most one Thing.  An agent may take a URI to refer to no Thing at
all, or refer to a Thing by multiple URIs, or not take any URI to
refer to some Thing.

If a URI U refers to some thing T then <U> is another name for T.

Some Things will be what we call 'REST-representations'.

   For now think of them as being similar to HTTP 'entities' - they
   consist of content and a few headers such as media type.
   But we'll figure out the details later.

   We don't assume that these REST-representations are 'on the wire'
   or associated with particular events or messages.
   We reserve the right to refer to them using URIs, but generally
   this will be unnecessary.

Posit a relationship, which I'll call 'W', between some Things and
some 'REST-representations', written e.g. W(T,R).

   The intent is for W to capture what gets written variously
     R is "an entity corresponding to" T (RFC 2616 10.2.1)
     T "corresponds to" R (RFC 2616 10.3.1)
     R is a representation of the state of T (Fielding and Taylor)
     R "encodes information about state" of T (AWWW glossary)
     R "is a representation of" T (AWWW 2.4)

   We permit the same REST-representation to be W-related to multiple
   Things, i.e. W(T,R) and W(T',R) is consistent with T != T'.

   We permit one Thing to be W-related to more than one
   REST-representation, i.e. W(T,R) and W(T,R') is consistent with
   R != R'.
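
   A minimal sketch of the two permissions above, assuming W is
   modelled as a plain set of (Thing, REST-representation) pairs.
   The Rep type, the labels "T" and "T_prime", and the helper names
   are hypothetical, introduced only for illustration.

```python
from typing import NamedTuple

class Rep(NamedTuple):
    """Hypothetical stand-in for a REST-representation: media type plus content."""
    media_type: str
    content: bytes

# W as a set of (thing, representation) pairs. Things are opaque labels here.
W = set()

r1 = Rep("text/html", b"<p>hello</p>")
r2 = Rep("text/plain", b"hello")

W.add(("T", r1))
W.add(("T_prime", r1))  # same representation, different Thing: allowed
W.add(("T", r2))        # same Thing, different representation: allowed

def things_for(rep):
    """All Things that are W-related to rep."""
    return {t for (t, r) in W if r == rep}

def reps_for(thing):
    """All REST-representations that thing is W-related to."""
    return {r for (t, r) in W if t == thing}
```

   Nothing in the set-of-pairs model forces W to be a function in
   either direction, which is the point of the two permissions.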

   If you don't accept web architecture as expressed in RFC 2616 in
   its rudiments, you should stop reading here.

Let us stipulate that a GET/200 HTTP exchange expresses a
W-relationship between a Thing and a REST-representation.  That is:
  1. If a URI U refers to a Thing <U>, and
  2. an HTTP request GET U results in a 200 response carrying
     REST-representation R, then
  3. we will interpret the exchange as communicating W(<U>, R).

   WHETHER WE CHOOSE TO BELIEVE W(<U>, R) IS ANOTHER STORY.
   (Consider a buggy or malicious proxy.  HTTPbis starts to address
   believability by trying to specify a notion of 'authority'.)
   ISSUES OF TRUST AND AUTHORITY WILL BE TREATED SEPARATELY (if we get
   around to it).

   We might fudge this by speaking of "credible" HTTP exchanges without
   saying exactly what that means (as indeed one cannot say).

The implication goes in only one direction: a credible GET U/200 R
exchange implies W(<U>, R), but the absence of such an exchange does
not imply that W(<U>, R) is not the case.

In fact there may be other ways to communicate or infer W(<U>, R) -
by consulting a cache, for example.
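
The stipulation and its one-directional reading can be sketched as
follows.  This is only an illustration: the Exchange record, the
example.org URIs, and the assumption that every logged exchange is
"credible" are mine, and a Thing <U> is stood in for by the URI string U.

```python
from typing import NamedTuple, Optional

class Exchange(NamedTuple):
    """Hypothetical record of one HTTP exchange (assumed credible)."""
    method: str
    uri: str
    status: int
    media_type: Optional[str] = None
    content: Optional[bytes] = None

def w_statements(exchanges):
    """Collect the W(<U>, R) statements communicated by GET/200 exchanges."""
    facts = set()
    for ex in exchanges:
        if ex.method == "GET" and ex.status == 200:
            facts.add((ex.uri, (ex.media_type, ex.content)))
    return facts

def holds(facts, uri, rep):
    """True if W(<uri>, rep) was communicated, else None (unknown).

    Never returns False: the absence of an exchange proves nothing.
    """
    return True if (uri, rep) in facts else None

log = [
    Exchange("GET", "http://example.org/a", 200, "text/plain", b"hello"),
    Exchange("GET", "http://example.org/b", 404),
]
facts = w_statements(log)
```

Here holds() answers True or "unknown", never False, which mirrors
the fact that the implication only runs from exchange to W-statement.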

A consequence (or precondition) of this stipulation is that for each
URI U for which there is a GET/200 exchange, there exists a Thing <U>
that U refers to.  Roughly speaking, all web URIs refer to
*something*.

   This is the way in which the web is "grandfathered" into the
   semantic web.

   Although it's not falsifiable, this seems to be the idea that IH
   denies (there are no resources).

This is a powerful constraint.  Since servers are "authoritative",
they can produce whatever 200 responses they like for a URI that they
control, and not violate protocol.  That is, for an *arbitrary* set of
REST-representations concoctable by a server, we've committed to
allowing the existence of a Thing that has those REST-representations.

Note on what is NOT provable at this point

   We haven't created a way to falsify any W-statement.  That is,
   there is no way to infer that W(T,R) does not hold.  Therefore this
   theory is satisfiable by having a single Thing T, that all URIs
   refer to, having the property that W(T,R) for all
   REST-representations R.
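
   The degenerate model can be written down directly.  A sketch,
   assuming observed W-statements are (URI, representation) pairs;
   the function names and sample data are hypothetical.

```python
def trivial_model(uris):
    """Build the one-Thing model: every URI refers to T, and W(T, R) always holds."""
    refers_to = {u: "T" for u in uris}

    def W(thing, rep):
        return True  # W(T, R) for all REST-representations R

    return refers_to, W

def satisfies(refers_to, W, observed):
    """Check that every observed statement W(<U>, R) holds in the model."""
    return all(u in refers_to and W(refers_to[u], rep) for (u, rep) in observed)

observed = {
    ("http://example.org/a", "R1"),
    ("http://example.org/a", "R2"),
    ("http://example.org/b", "R3"),
}
refers_to, W = trivial_model({u for (u, _rep) in observed})
```

   Whatever set of observations is supplied, satisfies() comes out
   true, because nothing stated so far can contradict a W-statement.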

Note on time

   Although W is time-sensitive, we'll ignore time as it is not
   helpful to account for it right now.  Later we'll redo the
   treatment to take time into account.

   So W is OK as a binary relation for now.  Later it might be
   W(X,R,t).

Note on RDF

   RDF is just a vector for FOL, and FOL is a lot easier to read and
   think about, so better to start with FOL and then render it in RDF
   (and perhaps other notations) later on.

No number of GET/200 exchanges can tell you what a resource is.
There are several reasons for this.
  1. The absence of a GET/200 giving W(T,R) does not mean that W(T,R)
     isn't true.
  2. Two Things T,T' could have W(T,R) but not W(T',R) for some
     REST-representation R not hitherto obtained by a GET/200 exchange.
  3. T and T' could agree on the truth or falsehood of *every*
     W-statement and *still* be different.

Information distinguishing such Things, if it were available, would
have to come through a different channel (e.g. RDF).
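
Reason 3 can be illustrated with a small sketch.  The Thing class,
the labels, and the placeholder representations "R1".."R3" are
hypothetical; Things are compared by object identity.

```python
class Thing:
    """Opaque Thing; Python's default equality is object identity."""
    def __init__(self, label):
        self.label = label

def w_profile(W, thing, reps):
    """Truth value of W(thing, r) for each r in reps."""
    return {r: (thing, r) in W for r in reps}

t, t_prime = Thing("T"), Thing("T'")
reps = ["R1", "R2", "R3"]  # placeholder REST-representations
W = {(t, "R1"), (t_prime, "R1"), (t, "R2"), (t_prime, "R2")}

same_profile = w_profile(W, t, reps) == w_profile(W, t_prime, reps)
distinct = t != t_prime
```

Both flags are true at once: the two Things agree on every
W-statement over reps and are still not the same Thing.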

httpRange-14
------------

Let IR be a proper subclass of Thing containing the domain of W,
i.e. suppose W(T,R) implies that T is in IR.
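
As a sketch of that constraint, assuming finite sets for Thing, IR
and W (the labels "page", "essay" and "dog" are placeholders only):

```python
def domain(W):
    """The Things that appear on the left of some W(T, R)."""
    return {t for (t, _r) in W}

def ir_is_admissible(things, IR, W):
    """IR must contain the domain of W and be a proper subclass of Thing."""
    return domain(W) <= IR and IR < things

things = {"page", "essay", "dog"}
W = {("page", "R1"), ("page", "R2")}
```

For example, {"page", "essay"} is admissible here, while {"essay"}
(misses the domain of W) and the whole of things (not proper) are not.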

Properties of IR:
   Grandfathering: "web resources" (those for which we get 200s) are in IR
     - this is a consequence of the above stipulation.
   TimBL: "generic resources" are in IR (genont)
   TimBL: literary works are in IR  (Pat Hayes disagrees)

   TimBL: dogs and people are disjoint with IR
     (by extension: anything physical)
   TimBL: strings and numbers are disjoint with IR
     (by extension: anything mathematical)
   TimBL: REST-representation is disjoint with IR
     (JAR doesn't see the point)
   Pat: RDF graphs are not in IR

   TimBL: members of IR are not determined by their W-relations
     i.e. one might have W(T,R) iff W(T',R) for all REST-representations
     R, yet T != T'   [time sheet example]

We have three theories of IR in the works now: Dan's speaks-for
theory, Alan's what-is-on-the-web theory, and JAR's property-transfer
theory.
Received on Monday, 17 May 2010 23:37:57 UTC