Re: httpRange-14 , what's the problem from Roy T. Fielding on 2002-08-01 (www-tag@w3.org from August 2002)

From: Roy T. Fielding <fielding@apache.org>
Date: Wed, 31 Jul 2002 23:48:05 -0700
To: "Tim Berners-Lee" <timbl@w3.org>
Cc: "www-tag" <www-tag@w3.org>
Message-Id: <A2224494-A51A-11D6-8895-000393753936@apache.org>
On Monday, July 29, 2002, at 09:59  AM, Tim Berners-Lee wrote:
> http://www.w3.org/DesignIssues/HTTP-URI.html

> Tim Berners-Lee
> Date: 2002-07-27, last change: $Date: 2002/07/31 20:15:46 $
> [...]

>  This question has been addressed only vaguely in the specifications.
>  However, the lack of a very concise logical definition of such things had
>  not been a problem, until the formal systems started to use them. There
>  were no formal systems addressing this sort of issue (as far as I know,
>  except for Dan Connolly's Larch work [@@]), until the Semantic Web
>  introduced languages such as RDF which have well-defined logical
>  properties and are used to describe (among other things) web operations.

There has been quite a lot of work outside the W3C regarding the Web
architecture, both in formalisms and mere descriptions.  Google is a
good way to find them, though most are intended more as a way of showing
off the properties of some variation on a formalism than they are of
actually modeling the Web.

>  The efforts of the Technical Architecture Group to create an architecture
>  document with common terms highlighted this problem. (It demonstrates the
>  ambiguity of natural language that no significant problem had been noticed
>  over the past decade, even though the original author of HTTP, and later
>  co-author of HTTP 1.1 who also did his PhD thesis on an analysis of the
>  web, and both of whom have worked with Web protocols ever since, had had
>  conflicting ideas of what the various terms actually mean.)

Tim, when you invented HTTP it only allowed one method (GET), did not
have header fields, and interpreted every response as HTML.  HTTP has
changed considerably over time.

I don't think we have conflicting ideas about the terms.  I think that
the changes introduced in 1995 for the sake of HTTP caching and content
negotiation are absent from your model of how the Web works because we
needed to change the model in order for it to work at all. Furthermore,
the community made a very conscious decision to stop referring to
resources as documents because they simply do not always fit the mental
model of the English word "document".

>  This document explains why the author finds it difficult to work in the
>  alternative proposed philosophies. If it misrepresents those others'
>  arguments, then it fails, for which I apologize in advance and will
>  endeavor to correct.
>
> 1. Web Concepts as proposed
>
>  The WWW is a space of information objects. The URI was originally called a
>  UDI, and originally all URIs identified information objects. Now, URI
>  schemes exist which identify more or less anything (eg uuids) or
>  emailboxes (mailto:) but if we look purely at HTTP URIs, they define a web
>  of information objects. Information objects -- perhaps in Cyc terms
>  ConceptualWorks -- are normally things which
>
>  * Carry some sort of message, and

What does that mean?  Information objects are things that carry
information?  All objects carry information by virtue of having state.

>  * Can be represented, to a greater or lesser authenticity, in bits

Any object's state can be represented in bits.  The state doesn't have
to be stored in bits -- it can merely be observed at the time that a
method is applied.

>  I want to make it clear that such things are generic (See Generic
>  Resources) -- while they are documents, they generally are abstractions
>  which may have many different bit representations, as a function of, for
>  example:
>
>  * Time -- the contents can vary with revision --
>  * Content-type in which the bits are encoded
>  * Natural language in which a human-readable document is written
>  * Machine language in which a machine-processable document is written
>  * and a few more
>
>  but the philosophy is that an HTTP URI may identify something with a
>  vagueness as to the dimensions above, but it still must be used to refer
>  to a unique conceptual object whose various representations have a very
>  large amount in common. Formally, it is the publisher which defines
>  what an HTTP URI identifies, and so one should look to the publisher for a
>  commitment as to the exact nature of the identity along these axes.

Yes, no argument there.
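
As a rough illustration of one resource with several bit representations,
here is a minimal Python sketch.  The VARIANTS table and the select()
function are hypothetical names invented for this example, and the matching
is deliberately simplified (exact matches only, no q-values or wildcards),
so it is not how any particular server implements content negotiation.

```python
# Hypothetical variants of one resource, keyed by (media type, language).
VARIANTS = {
    ("text/html", "en"): b"<p>Hello</p>",
    ("text/html", "fr"): b"<p>Bonjour</p>",
    ("text/plain", "en"): b"Hello",
}


def select(accept, accept_language):
    """Return the first variant matching the preference lists, in order.

    Simplification (assumption): exact matches only, no q-values.
    """
    for media_type in accept:
        for language in accept_language:
            body = VARIANTS.get((media_type, language))
            if body is not None:
                return media_type, language, body
    return None  # no acceptable representation of this resource


# The same resource yields different bits depending on the request.
html_fr = select(["text/html"], ["fr", "en"])
plain_en = select(["text/plain"], ["fr", "en"])
```

The identity of the resource does not depend on which variant is selected;
only the representation does.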

>  I'm going to refer to this as a document, because it needs a term and that
>  is the best I have to date, but the reader should be sure to realize that
>  this does not mean a conventional office document, it can be for example
>
>  * A poem
>  * An order for ball bearings
>  * A painting
>  * A Movie
>  * A review of a movie
>  * A sound clip
>  * A record of the temperature of the furnace
>  * An array of a million integers, all zero
>
>  and so on, as limited only by our imagination.

None of which are similar to the examples I gave, wherein an http URI
is being used to identify the I/O control-system for a physical robot
or a gateway to SMS-enabled phone devices.  Nor does it lend appropriate
significance to the properties of a Web-enabled refrigerator or a
car radio, both of which I have personally interacted with via HTTP.
It is therefore far more reasonable to refer to the thing identified
by an http URI as a resource that is accessible via HTTP.

>  The Web works because, given an HTTP URI, one can in a large number of
>  cases, get a representation of the document. For a human readable
>  document, the person is presented with the information by virtue of some
>  gadget which is given the bits of a representation. In the case of a
>  hypertext document, a reference to another document is encoded such that,
>  upon user request, the referenced document can in turn be automatically
>  presented. In the case of a machine-readable document, identifiers of
>  concepts, being HTTP URIs, will often allow definitive reference
>  information about those concepts to be pulled in to guide further actions.
>
>  The web, then, is made of documents as the internet is made of cables and
>  routers. The documents can be about anything, so when we move to talk
>  about the contents of documents we break away from talking about
>  information space and the whole universe of human -- and machine --
>  discourse is open to us. Web pages can compare renaissance choral works
>  with jazz pop hits, and discuss whether pigs have wings.
>  Machine-processable documents can encode information about shoes, and
>  ships, and sealing-wax. Until recently, the Internet protocol standards
>  out of which the Web is built had little to say about such things. They
>  were concerned only with the human-readable side, so it was people,
>  reading natural language (not internet specs) who formed and communicated
>  the concepts at this level. Nowadays, however, semantic web languages
>  allow information to be expressed not only about URIs, TCP ports and
>  documents, but also about arbitrary concepts - the shoes, and ships and
>  sealing wax, and whether pigs have wings. Simple semantic web applications
>  allow one to order shoes and travel on ships, and determine that, given
>  the data, pigs do not have wings.
>
>  For these purposes it is of course quite essential to distinguish between
>  something described by a document and the document itself. Now that we --
>  for the first time -- have not only internet protocols which can talk
>  about documents but also those which talk about real world things, we must
>  either distinguish or be hopelessly fuzzy.

No argument there either, except that I don't think that this is
anything new.  The solution is to use a formalism that understands
the difference between an identifier and *use* of that identifier.

In order for HTTP caching to work, there needs to be a distinction
between the attributes of a resource (whatever is identified by *any*
scheme accessed via HTTP) and the attributes of one particular
representation of that resource obtained via GET.  Assertions in
the form of HTTP metadata (header fields) are made about each,
independently, and without ambiguity because they are defined by
a shared standard, albeit without syntactic clarity and independent
extensibility due to the limitation of mixing them all together
in a MIME-like header.  Apparently, RDF is capable of the same
distinctions, so there is no technical issue here.
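
To make that separation concrete, here is a minimal Python sketch that
partitions a flat header block by what each field is an assertion about.
The function name split_headers and the two field sets are assumptions
made for this illustration; the grouping is a simplification, not a
normative classification taken from the HTTP specification.

```python
# Illustrative grouping only (assumption): not a normative classification.
RESOURCE_FIELDS = {"allow", "accept-ranges"}
REPRESENTATION_FIELDS = {
    "content-type", "content-language", "content-encoding",
    "content-length", "etag", "last-modified",
}


def split_headers(headers):
    """Partition a flat header mapping by the subject of each assertion."""
    groups = {"resource": {}, "representation": {}, "message": {}}
    for name, value in headers.items():
        key = name.lower()  # header field names are case-insensitive
        if key in RESOURCE_FIELDS:
            groups["resource"][name] = value
        elif key in REPRESENTATION_FIELDS:
            groups["representation"][name] = value
        else:
            # Simplification: everything else (Date, Via, ...) is treated
            # here as describing the message in transit.
            groups["message"][name] = value
    return groups


example = {
    "Allow": "GET, HEAD",
    "Content-Type": "text/html",
    "ETag": '"abc123"',
    "Date": "Thu, 01 Aug 2002 06:48:05 GMT",
}
groups = split_headers(example)
```

All of the fields arrive mixed together in one MIME-like header, which is
why the grouping has to come from the shared definition of each field
rather than from the syntax.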

>  And is this bad, is it an inhibition to have to work our way through
>  documents before we can talk about whatever we desire? I would argue not,
>  because it is very important not to lose track of the reasons for our
>  taking and processing any piece of information. The process of publishing
>  and reading is a real social process between social entities, not
>  mechanical agents. To be socially responsible, to be able to handle trust,
>  and so on, we must be aware of these operations. The difference between a
>  car and what some web page says about it is crucial - not only when you
>  are buying a car.

Correct, but in those circumstances we would not be using the "http"
URI to define the identity of the car.  Instead, we use the http URI
to provide access via HTTP to a representation of a car that is ALSO
identified by a VIN (without any URI form being necessary).  A legal
document that would later be drawn up, or even a transaction via
contract passed through a cashier (also identified via an http URI),
would use the VIN for physical identification of the car, not
because it is an inherently better string than one beginning "http:",
but because the VIN has been permanently affixed to the dashboard and
engine block for precisely this purpose.

If car producers wished such a thing, they could all agree to stamp
each car with a unique http URI instead and achieve the same
purpose: unique identification within the class of objects under use.
The fact that we could also use that URI to access information about
that specific car via the Web, which is a different resource from the
car itself, doesn't change the fact that it uniquely identifies the
car within the realm of cars.

However, it is a bit of a waste of time to talk about this in terms
of cars when the real objective, or at least that of the vocal
minority, is to distinguish between the abstract concept of a
namespace and a document describing that namespace.  It is the same
problem, but is easier to think about.  When used within an xmlns
attribute of an XML document, an http URI identifies the namespace.
When used within an xlink:href attribute of an XML document or a
browser's URI entry field or similar construct, an http URI
identifies the namespace by providing a consistent view of that
namespace in the form of representations.

The URI still identifies an unambiguous resource precisely because
we do not say that the result of a GET is the thing that is
identified, and we do not say that the thing identified is a
document just because access is allowed via HTTP by way of documents.
People who use the Web don't care about that difference, but the
technology of distributed caching absolutely depends on it.
In other words, just because a URI only identifies one resource
does not mean that every use of that URI is equivalent, just as
using an http URI as a cache key is not equivalent to using it
as the target of an anchor href.

I say an http resource is a conceptual object that has state and
identity and behavior, just as you define it in your own design notes
prior to getting involved in this debate, but I do not generally refer
to it as an object because all of the OOP developers get hot and
bothered when I do so -- it is a term that is inextricably linked
with a common implementation, just like document is a term that is
inextricably linked to words/images on renderable media.  HTTP is
designed to hide all details of the implementation, so saying http
URI identify resources is the most accurate statement.

>  Some have opined that the abstraction of the document is nonsense, and all
>  that exists, when a web page describes a car, is the car and various
>  representations of it, the HTML, PNG and GIF bit streams. This is however
>  very weak in my opinion. The various representations have much more in
>  common than simply the car. And the relationship to the car can be many
>  and varied: home page, picture, catalog entry, invoice, remote control
>  panel, weblog, and so on. The document itself is an important part of
>  society - to dismiss its existence is to prevent us being aware of human
>  and aspects of information without which we are impoverished. By contrast,
>  the difference between different representations of the document (GIF or
>  PNG image for example) is very small, and the relationship between
>  versions of a document which changes through time a very strong one.

That argument is weird.  No one has opined that the abstraction of
a document is nonsense -- it is merely insufficient to describe all http
resources.  Furthermore, if the same URI is used to identify a resource
whose representations are a home page, picture, catalog entry, invoice,
remote control panel, weblog, and so on, then that URI obviously does
not identify the car.  It might be said to identify a bunch of random
things related to a car of that type, but certainly not the car.

URI in general identify a resource -- one concept, one identity, one
sameness that might be observable via its representations.  The vast
majority of URI do identify documents.  HOWEVER, the architecture is
not defined by what is true of the vast majority -- it is defined by
what is true of ALL resources that fit the given criteria.  And if the
criterion is "all http URI", your definition of "document" simply does
not fit.  That does not in any way prevent people from using unambiguous
identifiers in http or the semantic web, nor does it somehow reduce
the value of a document as an abstraction.

> 2. Trying out the Alternatives
>
>  The folks who disagree with the model do so for a number of different
>  arguments. This article, therefore will have to take them one by one but
>  the ones which come to mind are as follows:

Why didn't you simply refer to the arguments that others made, rather
than your interpretation of what they meant to say?  My messages have
been a lot more carefully worded than this document.

>     1. Every web page (or many of them) are in fact themselves
>  representations of some abstract thing, and the URI really identifies
>  that thing, not a document at all.

Not *necessarily* a document.

>     2. There are many levels of identification (representation as a set of
>  bits, document, car which the web page is about) and the URI
>  publisher, as owner of the URI, has the right to define it to mean
>  whatever he or she likes;

They can define it to mean anything, but it only has meaning if it is
used according to that definition.  Likewise, it may take on meaning
that wasn't intended by the publisher if it can be consistently used
as such, and may take on a temporary meaning if the temporal period is
sufficient to be usable.  That's because the meaning of a URI is
insignificant when compared to the reason why the reference is
being made (the meaning of the resource in context of its use), and
not all references are made by the publisher of the URI.

It is possible, though not at all desirable, that the meaning of a
resource will at some point differ from the intended meaning that
a person had in mind when they used the URI as a reference.  That
is a well-known problem that will affect the Semantic Web just as
much as it does the current Web.  It is a social problem of any
system that allows identifiers to exist independent of the entity
being identified.  Yes, it has drawbacks, but try identifying a concept
that has no current realization within a system that depends on
the realization for identity.

>     3. Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate using common sense
>  and logic
>     4. Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate using the fact that
>  different properties will refer to different levels.
>     5. Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate using extra
>  information which will be provided in other ways along with the URI
>     6. Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate them by context: A
>  catalog card will talk about a document. A car catalog will talk about
>  a car.

None of the above.  I have consistently stated that the URI identifies
the same resource as far as the architecture is concerned, even if the
people using that URI are only partially aware of its real sameness
over time, and even if its meaning changes over time.  The only thing
that differs by context is the RESULT of using that URI.  That is the
separation of concerns between methods and identifiers which has been
central to the architecture since HTTP/1.0 was introduced.

>     7. They may have been used to identify documents up till now, but for RDF
>  and the Semantic Web, we should change that and start to use them as
>  the Dublin Core and RDF Core groups have for abstract concepts.

IMO, there is only one Web.

> 2.1 Identify abstract things not documents
>
>  Let's take the alternatives in order. These alternatives all make sense.
>  Each one, however, has problems I can't see any way around when we
>  consider them as a basis as
>
>  The first was,
>
>  Every web page (or many of them) are in fact themselves representations
>  of some abstract thing, and the URI really identifies that thing, not a
>  document at all.
>
>  Well, that wasn't the model I had when URIs were invented and HTTP was
>  written. However, let's see how it flies. If we stick with the principle
>  that a URI (or URIref) must unambiguously identify the same thing in any
>  context, then we come to the conclusion that URIs can not identify the web
>  page. If a web page is about a car, then the URI can't be used to refer to
>  the web page.

It doesn't identify both.  It identifies the car.  The web page is
what you GET.  The same URI can then be used, in another context, to
indirectly identify a representation that was formerly the result of
a GET (which is what caches do when they look up a response), but the
cache isn't even remotely confused between the two because we have
defined them as different things.
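
A minimal Python sketch of that separation, under stated assumptions:
CarResource, Cache, and the example.com URI are hypothetical names invented
for illustration, and the cache never expires or revalidates entries, which
a real cache would.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    media_type: str
    body: bytes


class CarResource:
    """Hypothetical resource: the car itself, whose state changes over time."""

    def __init__(self, mileage):
        self.mileage = mileage

    def get(self):
        # GET yields a representation of the current state, never the car.
        body = f"<p>Mileage: {self.mileage}</p>".encode()
        return Representation("text/html", body)


class Cache:
    """Maps a URI to a representation that was formerly the result of a GET."""

    def __init__(self):
        self._store = {}

    def fetch(self, uri, origin):
        if uri not in self._store:
            self._store[uri] = origin[uri].get()
        return self._store[uri]


URI = "http://example.com/cars/42"   # hypothetical URI
car = CarResource(mileage=1000)
origin = {URI: car}                  # here the URI identifies the car
cache = Cache()

first = cache.fetch(URI, origin)     # as a cache key, the URI selects a response
car.mileage = 2000                   # the resource changes state
stale = cache.fetch(URI, origin)     # still the earlier representation
fresh = origin[URI].get()            # a new GET reflects the new state
```

The same string is used both as the identifier of the car and as the key
for a stored response, and nothing in the sketch confuses the two.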

>  2.1.1 Same URI can identify a web page and a car
>
>  What, a web page can't be a car? At this point a pedantic line of reasoning
>  suggests that we should allow web pages and cars to conceptually overlap,
>  so that something can be both. This is counterintuitive, as a web page is
>  in common sense, not a concrete object whereas a car is. But sure, we
>  could construct a mathematics in which we use the terms rather specially
>  and something can be at the same time a web page and a car.
>
>  Frankly, this doesn't serve the social purpose of the semantic web, to be
>  able to deal with common sense concepts and objects. A web page about a
>  car and a car are in most people's minds quite distinct (as I argue
>  further below). A philosophy in which they are identical does not allow me
>  to distinguish between them. It not only conflicts with reality as I see it,
>  but also leaves us no way to make statements individually about the two
>  things.

A web page is something that you GET from a resource, not the resource
itself.

The only aspect of this that limits the Semantic Web is that it
cannot pretend the result of a GET and the resource identified by the
URI that was used to perform the GET are necessarily the same thing,
which is a perfectly reasonable thing to require considering that they
aren't even the same thing for time-varying documents, let alone cars.
A URI is an identifier of a resource, not the resource itself.
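
For a time-varying document this can be sketched in a few lines of Python.
The revision list, latest_draft(), and get() are hypothetical names and
data invented for the example; the point is only that the resource is
modeled as a mapping from time to state, while a GET returns the bits for
one instant.

```python
from datetime import date

# Hypothetical revision history of one time-varying document.
REVISIONS = [
    (date(2002, 7, 27), "first draft"),
    (date(2002, 7, 31), "revised draft"),
]


def latest_draft(on):
    """The resource: a mapping from a point in time to the state at that time."""
    current = None
    for published, text in REVISIONS:
        if published <= on:
            current = text
    return current


def get(resource, on):
    """GET returns a representation of the state at the time of the request."""
    state = resource(on)
    return None if state is None else state.encode("utf-8")


before = get(latest_draft, date(2002, 7, 28))
after = get(latest_draft, date(2002, 8, 1))
```

The two results differ, yet both came from the same resource via the same
identifier, so neither result can be the resource.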

What is necessary for the Semantic Web is that it be able to distinguish
between resources and representations, and further that it can deal with
the very common situation where the representation has no known URI
by which it can be directly referred, because Web sites deliberately
hide the URI of those resources that they do not wish to be directly
accessible.  [BTW, Content-Location is not a sufficient fix for this
problem simply because the resource provider has no desire to use it.]

>  2.1.2 The URI identifies the car, not the web page
>
>  So let's fall back on the idea that the URI identifies the subject of the
>  web page, but not the web page itself. This makes sense. We can build the
>  semantic web on top of that easily.
>
>  The problem with this is that there are a large number of systems which
>  already do use URIs to identify the document. This is the whole metadata
>  world. Think of a few:
>
>  * The Dublin Core

uses URI to identify abstract concepts (metadata relationships),
indirectly obtain sections of a resource that describes, and identify
other resources that are the target of that relationship.

>  * RSS

uses URI to identify namespaces, indirectly obtain a document
that defines a namespace, and other resources that supply
representations in a given format.

>  * The HTTP headers

refer to the resource, the representation, or the message, depending
on the definition of the header field.

>  * The Adobe XML system

no idea

>  * Access control systems

always refer to the resource.

I don't see any problem.

>  (I'm sticking with the machine-processable languages as examples because
>  human-processable ones like HTML have a level of ambiguity traditional in
>  human natural language but quite out of place in the WWW infrastructure --
>  or the Semantic Web. You can argue that people say "I work for w3.org" or
>  "http://www.amazon.com/shrdlu?asin=314159265359" is a great book, just as
>  they happily say "Moby Dick weighs over three thousand tonnes", "Moby Dick
>  was finished over a century ago" and "I left Moby Dick on the beach"
>  without expecting to be misunderstood. So we won't use human language as a
>  guide when defining unambiguously the question of what a URI identifies.)

So you intend to define meaning without reference to humans?  I thought
that the purpose of the Semantic Web was to help humans understand
and operate within the realm of interrelated resources.  What good does
it do if the human is first required to translate their "real world"
reference to one that applies to the less-messy-than-the-real-world
Semantic Web?   I think that is an interface error.

>  Roy Fielding argues that the URI which I associate with his web page
>  actually identifies him.

No, I do not.  I never have.  I even explicitly corrected you on
this very point while we were in your office talking this over.
My home page URI identifies my home page, where I go for a hypertext
representation of the topics that I am working on so that I can
easily jump from there to other resources of interest.  It is a
public resource so that others can do the same.  Some people use
that URI as an indirect way of identifying me, but only in the
sense that the resource contains more information about me.
However, someone could build a system that accepts the URI of a
home page as an indirect identifier of a person and performs some
action based on that relationship that affects me as a person, such
as calling my phone number, just as any identifier can be indirectly
used in ways that are not expected by the identifying authority.

Mark Baker has argued in the past that an http URI does identify
him, but I think that is reasonable since he owns the naming authority.
If you have complete control of the namespace, the identifiers
within it can identify anything provided that you only use them
consistently to identify that thing.  I doubt that is the case for
his home page URI, but it could be for some other URI.

>  He argues that conventionally people use the
>  identifier to identify the person. However, consider another Roy Fielding
>  page put together by friends who found a photograph of him with no clothes
>  on. A lot of content filtering systems would collect that URI and put it
>  into their list. Even though the photo had many representations which
>  different devices could download using content negotiation and/or CC/PP
>  (color or black and white and various different resolutions) the URI itself
>  would be listed as containing nudity. The public are very aware of
>  different works on the web, even though they have the same topic.

Yikes, what an unpleasant mental picture.  Does that identifier provide
representations of my state, or the state of a nude picture taken at
some particular point in time?  The public is capable of distinguishing
the two over time, and thus those two resources do not have the same
topic/meaning/identity, even though there does exist a relation between
the two.  Fortunately, I don't have "friends" like that.

>  2.2.3 Indirect identification
>
>  You can argue that a web page indirectly identifies something, of course,
>  and I am quite happy with that. If you identify an organization as that
>  which has home page http://www.w3.org, then you are not saying that
>  http://www.w3.org/ itself is that organization. This scenario is very very
>  common, just as we identify people and things by their "unambiguous
>  properties": books by ISBN, people by email address, and so forth. So long
>  as we don't think that the person is an email address, we are fine. Some
>  people have thought that in saying "An HTTP URI can't identify an
>  organization" I was ruling out this indirect identification, but not so: I
>  am very much in favor of it. The whole SQL world, after all, only
>  identified things indirectly by a key property. This causes no
>  contradiction. Perhaps I should say "An HTTP URI can't directly identify
>  an organization". But by "identify" I mean "directly identify", and
>  "identity" is a fairly direct word and concept, so I will stick with it.

Identity is not a fairly direct word and concept -- that is why I posted
a very long description of what it means to www-tag, including its
definition according to webster.com.  If this is the source of our
disagreement, then I give up.  All identifiers are by their very nature
an indirect means to establishing identity.

An http URI, when it is dereferenced, activates a mechanism whereby
the string of characters in the URI is used to select a bag of bits
that is supposed to represent the state of the abstract thing
identified as a resource via the http naming authority.  Does that
imply that the resource is an HTTP mechanism?  No.  The URI is not
the resource, the mechanism is not the resource, and the bag of bits
is not the resource.  Therefore, an http resource is always
restricted to indirect identification.  It is simply impossible to
"directly" identify a resource for which the representation is
allowed to change over time.

Consumers don't give a rat's ass about the mechanism beyond a desire
that it consistently provide representations that match the semantics
they intended by referencing it in the first place.  The semantics
are the resource.  Usually the semantics correspond to a "living
document about a particular subject", but not always.

>  Conclusion so far: the idea that a URI identifies the thing the document
>  is about doesn't work because we can only use a URI to identify one thing
>  and we have and already do use it to identify documents on the web.

No, we use it to obtain documents via the Web by identifying a
resource and asking for a representation of its current state.
We indirectly identify information within the representations,
the web page, by referring to it as the state obtained by doing
a GET on a resource's URI.  All http-based Web pages are only
indirectly identified by http URI, since a Web page is a
representation obtained at an instant in time and not the
resource itself.

> 2.2 Author definition
>
>  So how can we break free of that line of reasoning? We can try throwing
>  away the rule that a URI identifies only one thing.

I don't.

> 2.3 Logic disambiguates
>
>  Otherwise, we have to try another way of letting the URI mean sometimes one
>  thing and sometimes another. Here is another.

Nope, not here either.

> 2.4 Different Properties
>
>  Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate using the fact that
>  different properties will refer to different levels.

Machines do have to know the realm of identification.  If an identifier
is used for identifying multiple things in the same realm, then it
is clearly ambiguous.  However, if the machine knows that it is using
the URI in a context that is clearly direct, such as xmlns attributes,
then there is no ambiguity just because it is used differently in
other realms.  It is still better though to remain unambiguous, which
is the case for the examples I described.  An xmlns attribute doesn't
access the resource -- it only uses the name of the resource as an
identifier.  Whether or not the same identifier can be used in a GET
is irrelevant to the mechanism of xmlns.
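
A small Python sketch using the standard library's xml.etree.ElementTree
shows the name-only use.  The namespace URI and element names are
hypothetical; the point is that the parser treats the URI as an opaque
string and never performs a GET on it.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns/cars"   # hypothetical namespace URI

doc = f'<c:car xmlns:c="{NS}"><c:vin>EXAMPLE-VIN</c:vin></c:car>'

# Parsing compares the namespace URI as a plain string; ElementTree does
# not dereference it, so this works whether or not anything is served there.
root = ET.fromstring(doc)

# Element names are qualified as "{namespace-uri}local-name".
vin = root.find(f"{{{NS}}}vin")
```

Whether a GET on that URI would return a description of the namespace, or
anything at all, has no effect on the result.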

> 2.5 Extra info with URI
>
>  Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate using extra
>  information which will be provided in other ways along with the URI

No, that twists the argument. The argument is that people use identifiers
in an ambiguous way because there is no such thing as universal agreement
about the semantics of a resource if the publisher does not make those
semantics explicit.  The true nature of a resource cannot be observed at
any instant in time because its definition depends on how much it varies
over time, which is outside the perceptive capacity of humans and
machines.  The publisher can improve understanding of the semantics
of a resource by adding external assertions, but the identity of the
resource itself does not change by those assertions.

> 2.6 Different meaning in different context
>
>  Actually the URI has to, like in English, identify these different
>  things ambiguously. Machines have to disambiguate them by context: A
>  catalog card will talk about a document. A car catalog will talk about a
>  car.

The URI doesn't identify different things in different contexts.  It is,
however, used for different purposes in different contexts.  An xmlns
uses the URI of a resource on the Web (or not) to directly identify a
namespace whose state may (or may not) be indirectly described by a
representation found by performing a GET on the resource identified by
that URI.  There is no ambiguity here.

> 2.7 Change it for the Semantic Web
>
>  They may have been used to identify documents up till now, but for RDF
>  and the Semantic Web, we should change that and start to use them as the
>  Dublin Core and RDF Core groups have for abstract concepts.

There is no need.  In any case, the world doesn't need another AI system
for describing semantic networks in isolation -- they are only useful
when they are allowed to be enmeshed in the real world.

> 2.8 Abandon any identification of abstract things

I can't imagine anything more abstract than an identifier that identifies
"Roy's favorite quote from TimBL", which is bound to change over time.
I refuse to let anyone stick an HTTP server in my head just because that
is the only way to directly identify that resource.  They will have to
make do with a URI that I control, wherein I manage an appropriate
mapping and occasionally drop a bag of bits that represents the
last recorded value of the resource.

> 3. Conclusion
>
>  I didn't have this thought out a few years ago. It has only been in
>  actually building a relatively formal system on top of the web
>  infrastructure that I have had to clarify these concepts in my own mind. I am
>  forced to conclude that modeling the HTTP part of the web as a web of
>  abstract documents is the only way to go which is practical and, by the
>  philosophical underpinnings of the WWW, tenable.

I still disagree, particularly since you still haven't described how
you can hold that position for resources that are clearly not documents,
namely the resources that are service gateways to other systems.  POST
does not always mean "append to this document".  Oh, wait ...

>  Q: Some HTTP URIs can be POSTed to. Can you still say they identify
>  documents?
>
>  A: Well, some HTTP URIs can't be accessed at all, and some access is not
>  allowed, and yes, some URIs are not only documents but also can be posted
>  to. So the object is more complex than simply a document. But that it has
>  this extra functionality doesn't make it any less a HTTP document
>  formally. Something can have extra features and still remain in the same
>  class of things.

That makes no sense to me at all.  There is no stretch of the imagination
that would allow an HTTP POST to a URI that consistently identifies an
HTTP-to-GSM SMS message gateway to be formally equivalent to a document.
REST defines the message to be a representation of a document and the
service to be a resource that consumes representations, resulting in
a state change in the service that is reflected in the response message.
One could claim that the state of all SMS messages flying through the
GSM network is identified by this URI, and that therefore we are only
appending to that state, but that clearly is not the intention of the
publisher of the URI and is not consistent with the result of a GET
on that same URI, and is certainly not useful for reasoning about the
interaction.  It is an invalid model of the system.
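
The gateway case can be sketched in Python as follows.  SmsGateway, its
methods, the status codes chosen, and the phone number are all hypothetical
and exist only to illustrate a resource that consumes representations; no
real gateway API is being described.

```python
class SmsGateway:
    """Hypothetical HTTP-to-SMS gateway resource; all names are illustrative."""

    def __init__(self):
        self.accepted = 0   # service state, not a growing document

    def get(self):
        # A GET describes the service; it does not list the messages sent.
        return {"status": 200, "body": "SMS gateway: POST a message to send it"}

    def post(self, message):
        # The service consumes the representation it is given ...
        if not message.get("to") or not message.get("text"):
            return {"status": 400, "body": "missing 'to' or 'text'"}
        # ... which changes the state of the service ...
        self.accepted += 1
        # ... and the response message reflects that state change.
        return {"status": 202, "body": f"accepted message #{self.accepted}"}


gateway = SmsGateway()
before = gateway.get()
reply = gateway.post({"to": "+15555550100", "text": "hello"})
after = gateway.get()
```

The result of GET is the same before and after the POST, which is what one
would not expect if POST merely appended to a document identified by the
URI.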

Cheers,

Roy T. Fielding, Chief Scientist, Day Software
                  (roy.fielding@day.com) <http://www.day.com/>

                  Chairman, The Apache Software Foundation
                  (fielding@apache.org)  <http://www.apache.org/>
Received on Thursday, 1 August 2002 02:48:12 UTC