
RE: referendum on httpRange-14 (was RE: "information resource")

From: Jon Hanna <jon@hackcraft.net>
Date: Tue, 26 Oct 2004 17:04:42 +0100
To: <www-tag@w3.org>
Message-Id: <20041026160445.35E3715C71777@postie.hosting365.ie>

> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> On Behalf Of Tim Berners-Lee
> Sent: Wed 20 October 2004 02:19
[snip]
> > Also, using a particular URI to identify the *picture* of a dog
> > does *not* preclude someone using some *other* URI to identify the
> > *actual* dog and to publish various representations of that dog via
> > the URI of the actual dog itself; and someone bookmarking the
> > URI of the *actual* dog should derive just as much benefit
> > as someone bookmarking the URI of the *picture* of the dog,
> > even if the representations published via either URI differ
> > (as one would expect, since they identify different things).
> 
> No, they would *not* gain as much benefit.
> They would, under this different design, not have any expectation of
> the same information being conveyed to (b) as was conveyed to (a).
> What would happen when (b) dereferences the bookmark? Who knows
> what he will get?  Something which is *about* the dog. Could be
> anything.  That way the web doesn't work.

When performing the same operation (i.e. dereferencing with the same, or
similar, headers that affect negotiation) on either URI, the same information
should be conveyed, barring updates or denial of access. While it is possible
for an author to fulfil the task of "provide a representation of the dog"
with completely different data on each access, doing so would be no more
perverse than it is with current uses of the web (and pages that do so
are by no means unheard of on the "old" web; if anything, I imagine they
would be rarer amongst those consciously seeking to fulfil the
httpRange-is-anything model than amongst those seeking to provide an
"experience" without any strong model of the architecture - which is what
the majority of webmasters are doing).
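To make the expectation concrete, here is a minimal sketch of a server whose
dispatch depends only on the URI and the negotiation headers. Everything in it
(the URIs, the data, the `dereference` function) is illustrative, not any real
resource or API:

```python
# Sketch: a hypothetical server whose response depends only on the URI
# and the headers that affect negotiation. All URIs and representation
# data below are made up for illustration.

REPRESENTATIONS = {
    # (URI, Accept header) -> representation
    ("http://example.org/dog-picture", "image/jpeg"): b"<jpeg bytes>",
    ("http://example.org/dog", "text/html"): b"<html>About the dog</html>",
    ("http://example.org/dog", "application/rdf+xml"): b"<rdf:RDF>...</rdf:RDF>",
}

def dereference(uri: str, accept: str) -> bytes:
    """Return the representation negotiated for this URI and Accept header."""
    return REPRESENTATIONS[(uri, accept)]

# The same operation with the same negotiation headers conveys the same
# information, whichever URI is dereferenced and however often:
assert dereference("http://example.org/dog", "text/html") == \
       dereference("http://example.org/dog", "text/html")
```

The point of the sketch is only that determinism comes from the dispatch being
a function of (URI, headers), regardless of whether the URI's referent is a
document or a dog.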

> The current web relies on people getting the same information from
> reuse of the same URI.
> The system relies on the URI being associated with information of
> consistent content, not of consistent subject.

The current web (ignoring that it already contains uses of
httpRange-is-anything) has scope for people to be confused about items that
the author thinks of as mere artefacts of implementation. "He puts this
really nicely" may not apply to translations; "Nice use of colour" may not
apply to text-only versions, or even to versions which are
backwards-compatible renderings of some technique not available on all
browsers.

The sender and recipient of a URI are looking at shadows on the wall of
Plato's cave either way, and allowing them to communicate as consistently as
possible is a responsibility of authors either way.

> > I think it is a major, significant, and beneficial breakthrough
> > in the evolution of the web that the architecture *was* generalized
> > to the more general class of resources -- so that users can
> > name, talk about, and provide access to representations of, any
> > thing whatsoever.
> 
> 1. The URI itself was never constrained -- only HTTP URIs.
> 2. A great way is to write RDF files so you refer to a concept as 
> described in a document, a la foo#bar

I find it hard to see how an HTTP URI can refer only to documents (however
broad or narrow a definition we give to "document") and yet an identifier
for fragments of said documents can refer to concepts.
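One concrete asymmetry in the foo#bar approach is that the fragment never
reaches the server at all: the client strips it before dereferencing, so
whatever "#bar" identifies is interpreted purely against the retrieved
document. A small sketch using Python's standard library (the URI is
illustrative):

```python
from urllib.parse import urldefrag, urlsplit

# The client separates the fragment from the URI before any HTTP request
# is made; only the defragmented URI goes on the wire.
uri = "http://example.org/foo#bar"
document_uri, fragment = urldefrag(uri)

assert document_uri == "http://example.org/foo"
assert fragment == "bar"

# urlsplit exposes the same decomposition as named parts:
parts = urlsplit(uri)
assert parts.fragment == "bar"
```

So the server only ever sees a request for the document; interpreting the
fragment as a concept is entirely the client's (or the RDF consumer's)
affair.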

> Here we are trying to get the semantic web, which really 
> cares about the
> difference between a dog and a picture of a dog, to operate over
> and also to model the HTTP web, which doesn't care about
> dogs at all.

Current web users care about the difference between a dog and a picture of a
dog, and the current web serves them well. I think the same will hold for
automated consumers of semantic web representations.

> One can certainly design different protocols, in which the URIs
> (without hashes) denote arbitrary objects, and one fetches some sort
> of information about them.
> I know you have been designing such systems -- you described them in
> the RDF face-to-face meeting in Boston.  These are a different system:
> similar to HTTP, but you added more methods, and you don't have URIs
> for the documents.  But it is a different design to the current web.
> You claim utility for it.  Maybe it would be useful.  But please
> don't call it HTTP.

Thinking about how this different protocol would work, as a
gedankenexperiment:

We would want Anything's Representation Transfer Protocol (ARTP) to be
decentralised in how it lets us obtain at least one piece of information
about the resource, viz. that provided by the URI's owner. As such we would
want semweb servers to be located from domain name information, much as web
servers are today.

I think we would also want to borrow from HTTP at least its content
negotiation, caching, and small set of clearly defined verbs (though we
could perhaps want slightly different verbs).

In short, we would most likely either mirror HTTP entirely with a few tweaks
(if only because there would inevitably be politics between those who wanted
to mirror HTTP entirely and those who didn't, and the former wouldn't win
every battle), or else it would be a rewriting of HTTP attempting to gain
from hindsight (e.g. some have argued that a problem with HTTP is that it
doesn't clearly indicate which headers are hop-by-hop and which are
end-to-end, and with the advantage of a year zero these people would want
such a mechanism).
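To illustrate that complaint: HTTP/1.1 (RFC 2616, section 13.5.1) enumerates
a fixed set of hop-by-hop headers and lets Connection name more, but a newly
invented header carries no in-band marker of which kind it is - an
intermediary must simply know the list. A sketch of what a proxy has to do
(the incoming headers, including "X-Custom-Hop", are made up):

```python
# Hop-by-hop headers that HTTP/1.1 enumerates (RFC 2616 section 13.5.1);
# these must not be forwarded by a proxy to the next hop.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}

def forwardable(headers: dict) -> dict:
    """Return the subset of headers a proxy may forward end-to-end."""
    # Connection may name additional hop-by-hop headers for this link.
    extra = {h.strip().lower()
             for h in headers.get("Connection", "").split(",") if h.strip()}
    drop = HOP_BY_HOP | extra
    return {k: v for k, v in headers.items() if k.lower() not in drop}

incoming = {
    "Host": "example.org",
    "Accept": "text/html",
    "Connection": "close, X-Custom-Hop",  # names an extra hop-by-hop header
    "X-Custom-Hop": "per-link only",      # hypothetical extension header
    "Transfer-Encoding": "chunked",
}
print(forwardable(incoming))  # only Host and Accept survive
```

A year-zero protocol could instead mark the scope in the header itself, so
that intermediaries need no out-of-band list.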

[Technically, just using HTTP with a different default port would probably
be the best solution, but doing something terribly novel would be more
likely to get enough hype to bootstrap.]
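The different-default-port option needs no new machinery at all: existing URI
syntax and existing parsers already handle it. A sketch using Python's
standard library (the hostname and port number are arbitrary illustrations,
not a registered service):

```python
from urllib.parse import urlsplit

# An "ARTP" service run as plain HTTP on a non-default port is just an
# ordinary HTTP URI with an explicit port; 8123 is an arbitrary choice.
uri = "http://artp.example.org:8123/dog"
parts = urlsplit(uri)

assert parts.scheme == "http"
assert parts.hostname == "artp.example.org"
assert parts.port == 8123
assert parts.path == "/dog"
```

Which is exactly why the boring option would work, and exactly why it would
generate no hype.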

So now we have HTTP and ARTP running side by side. HTTP deals with those
things that are documents or information resources with one or more
representations, ARTP deals with anything else, and the use of URIs with
different schemes allows the two to "talk about" each other.

Now what happens? I think there are really only three possibilities.

Either ARTP will never take off (I imagine the success rate for new
protocols is even worse than that for new businesses, so this is by far the
most likely outcome), particularly if rival protocols ("Everything's
Description Transfer Protocol", "WSTransfer++", and, eh, "Purple" or
something beginning with the letters "Gn-") block each other from the
sunlight too much for any of them to thrive. Or else it does succeed, and
there are lots of ARTP servers and clients alongside lots of HTTP servers
and clients.

Web browsers will start handling ARTP to greater or lesser degrees.
Furthermore, ARTP will begin to do some of the work of HTTP, since
applications that require both can just use ARTP where they might use HTTP
(ARTP would have no problem considering "document" as falling within its
range of "anything"), and staying within the one protocol is likely to be
easier in practice even if there is no theoretical reason to do so.

In short, the two protocols would be too similar to tolerate each other for long.
Either:
1. ARTP would never take off.
Or
2. ARTP would gradually encroach onto HTTP's turf, and HTTP would go the way
of gopher.
Or
3. (Most likely) HTTP would start to be used to do what ARTP does no matter
how wrong the official theory says this is, and ARTP would be short lived,
but HTTP would be used with a de facto range of "anything".

If the need for a protocol for transferring representations of arbitrary
objects and concepts is going to be fulfilled, it will be fulfilled by the
web's primary transfer protocol.

Regards,
Jon Hanna
<http://www.selkieweb.com/>
Received on Tuesday, 26 October 2004 16:04:50 UTC
