My two cents on httpRange-14 from wangxiao on 2005-06-17 (www-tag@w3.org from June 2005)

From: wangxiao <wangxiao@musc.edu>
Date: Fri, 17 Jun 2005 14:47:51 -0400
To: <www-tag@w3.org>
Message-Id: <200506171847.j5HIlqnr027380@flopsy.musc.edu>
The discussion thread on httpRange-14 is both fascinating and frustrating, so
I can't help putting in my two cents.

The discussion reminds me of the P != NP problem.  And IMHO, httpRange is
a much bigger issue than its name suggests.  So, allow me to rephrase the
question along that line.  (Since I just started reading and thinking about
this problem, correct me if I am wrong.)

What we all agree on:

1.  The universe of discourse is the set of all "resources"; let it be R.
According to the definition of URI, R is the set of entities that can be
identified with a URI.
2.  There is a set of "information resources", let it be IR, that can somehow
be represented in bits.

The question is: what is the relationship between R and IR?
a) IR is a subset of R
b) IR is disjoint from R

At first, asking this question seems just silly because (a) appears to be
obvious.  Since an HTTP URI is a URI, the resource it identifies must also be
in R and cannot be disjoint from it.

But to give HTTP URIs a special treatment (i.e., internal signalling or reply
codes) would in essence argue for answer (b).  The cause of the "ambiguity"
actually lies in the definition of "information resource".

I think if we take "information resources" to be the set of resources that
can be manifested in bits, but not the bits themselves, we would have a
consistent view.  In other words, all URIs identify "abstract" entities.  The
entities (i.e., the electronic bits) that the Web works with are disjoint
from R.  Let all the entities in the Web be W; an internet protocol (such as
HTTP) is a function that maps R -> W.  URI is a unary predicate on R, i.e.,
URI(R) uniquely identifies an element "r" in R.  What we lack now is an ID
scheme for those "w" in W.

Semantic equivalence in R can be specified by RDF (that is what RDF is good
at, right?) and semantic equivalence in W should be bit-by-bit comparison.

If I convert http://www.w3.org/TR/webarch into a PDF document, is it
different from the one I retrieve with my browser? To obtain a
representation, we must apply a protocol to get from the "r" in R to a "w"
in W.

Now, let "u" be the URI of resource "r", and "w" the representation state
it is manifested in.  Let the protocol be p; then let's assign "w" an ID
(let's call it URI+):

p > u.  (I understand that > is not a reserved char in URIs, but I use it
for clarity in email.)

For example, the representation of "http://www.w3.org/TR/webarch" will have
a name of
"http>http://www.w3.org/TR/webarch"

But the pdf document of the same URI will be:
"pdf>http://www.w3.org/TR/webarch"

I can further zip the pdf document, which will have a URI+ of
"zip>pdf>http://www.w3.org/TR/webarch"

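The naming above can be sketched in Python.  This is only an illustration
under assumptions: the helper name uri_plus and the idea of treating each
step ("http", "pdf", "zip") as a plain string label are hypothetical and
not part of any URI specification.

```python
def uri_plus(uri: str, *steps: str) -> str:
    """Build a URI+ name by prefixing each step, in the order applied."""
    name = uri
    for step in steps:
        # The most recently applied step ends up leftmost.
        name = f"{step}>{name}"
    return name


u = "http://www.w3.org/TR/webarch"
print(uri_plus(u, "http"))        # http>http://www.w3.org/TR/webarch
print(uri_plus(u, "pdf"))         # pdf>http://www.w3.org/TR/webarch
print(uri_plus(u, "pdf", "zip"))  # zip>pdf>http://www.w3.org/TR/webarch
```
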
Obviously, in the world of R,

"http>http://www.w3.org/TR/webarch" 
	= "pdf>http://www.w3.org/TR/webarch" 
		= "zip>http://www.w3.org/TR/webarch"
			="...>http://www.w3.org/TR/webarch"
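
One way to read this chain of equalities, sketched in Python: since > cannot
appear inside a URI, everything after the last > in a URI+ is taken to be
the underlying URI, and two URI+ names are equal in R when that URI is the
same.  The helper names resource_of and same_in_r are hypothetical, and
comparing URIs as plain strings is a simplifying assumption.

```python
def resource_of(uri_plus_name: str) -> str:
    """Strip every 'step>' prefix and return the underlying URI."""
    # rsplit on the last '>' leaves a plain URI untouched.
    return uri_plus_name.rsplit(">", 1)[-1]


def same_in_r(a: str, b: str) -> bool:
    """Two URI+ names are equal in R if they name the same URI."""
    return resource_of(a) == resource_of(b)


a = "http>http://www.w3.org/TR/webarch"
b = "zip>pdf>http://www.w3.org/TR/webarch"
print(same_in_r(a, b))  # True: same resource in R
print(a == b)           # False: different names for different bits in W
```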

So, we reason with URI, but we "electronically work" with URI+ in the WWW.
So, instead of giving URI a special treatment, we need to find a solution to
name "electronic bitstreams".  This way, it preserves the Web as it is and
resolves the ambiguity at the same time?

Xiaoshu Wang
Received on Friday, 17 June 2005 18:49:21 UTC