What the client can gather from the 200 code. Was: Re: Rathole4: Time variance from Alan Ruttenberg on 2011-06-25 (www-tag@w3.org from June 2011)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Sat, 25 Jun 2011 01:36:39 -0400
To: Tim Berners-Lee <timbl@w3.org>
Cc: Xiaoshu Wang <xiao@renci.org>, David Booth <david@dbooth.org>, Jonathan Rees <jar@creativecommons.org>, Jeni Tennison <jeni@jenitennison.com>, "www-tag@w3.org List" <www-tag@w3.org>
Message-ID: <BANLkTikVupb5UuoWoocGE3M5qda9ttdzCQ@mail.gmail.com>
On Fri, Jun 24, 2011 at 9:58 PM, Tim Berners-Lee <timbl@w3.org> wrote:

>
> On 2011-06-24, at 16:45, Alan Ruttenberg wrote:
>
> > On Wed, Jun 22, 2011 at 10:44 PM, Tim Berners-Lee <timbl@w3.org> wrote:
> >>
> >> If the client can gather nothing from the 200 code at all, then there is
> >> not much point in doing the operation.  In the web architecture, it is the
> >> 200 that allows the client to in future point others to the web page using
> >> the same URI and expect them to get the same document (not any document
> >> about the same thing, or any document by the same author, or any document
> >> of the same length).
> >> (You can nit-pick about the definitions, but it is not very constructive.)
> >
> > Hi Tim,
> >
> > I think this is a nice idea in theory and might have been true at some
> > point, but it doesn't seem to be true anymore. And even the example in
> > the AWWW seems to contradict this. If a server responds with a 200 for
> > the weather report for Oaxaca, on subsequent days it is not returning
> > the "same" document in any sense that "same" is used in English. In
> > such cases it is returning a document about the same thing (the
> > weather in Oaxaca). In the case of http://news.google.com/ what they
> > get each time isn't even about the same thing - it is about the same
> > *sort* of thing (events that have happened recently).
>
>
> Absolutely.  Web pages change with time. A web page which is
> the current front page on The Times  is "same" in this sense
> as the current front page of The Times.


Well, this isn't what "same" means. But perhaps that's splitting hairs.


> It changes with time.
> See the ontology and discussion in
> http://www.w3.org/DesignIssues/GenericResources
> which also deals with variation within a generic resource, by language,
> encoding, etc.
>

FYI, it apparently needs to be http://www.w3.org/DesignIssues/Generic.html;
the above gives a 404 (somewhere along the line the generics seem to have
been lost, judging from some looking at the Wayback Machine).
FWIW, "generic resource" is a better term to use than document, which has
much historical meaning.

>
> > Anyways, perhaps how I interpret what you say will help you debug why
> > I am not understanding the sense of 200 you are trying to convey, and
> > I'd appreciate any insight about it. I'd really like, at a minimum,
> > to understand what the idea you have is.
>
> The discussion which you forked this off was about whether
> you could use the URL of a web page to refer to (a) the web page
> or (b) the subject of the web page, or (c) none of the above.
> That was driving some people crazy, those people who want
> to use it to refer to the web page.
>

I know. But you said something specific about 200 and that's up the stack to
the current conversation.


> > My best interpretation is that what you are saying is true for a
> > subset of URIs substantially smaller than the 200 responders (without
> > even having too strict a definition of document).
>
> It depends if you want to split hairs and not allow
> documents to be things which change. If you do, you depart from the web.
>

Nope. What I meant (as you see below) is that even if you allow changes of
the document sort, there is much of the web that responds with 200 that
isn't like that.


> > I don't think one has to nail down exactly what the definition of
> > document is for the purposes of webarch, but it would be good to get
> > into the ballpark. Right now I have two points of reference for what a
> > 200 response means, neither of which seem to be close.
> >
> > 1) Something like "the server responded with a representation that
> > answers the GET according to its intended design" (usually seems to be
> > true, and a statement about the server, not the resource)
>
> > 2) Something like "you are getting (mostly) a complete encoding of a
> > document-like thing - something authored with a purpose, and which is
> > relatively stable in (interpreted) content but which may change over
> > time due to the normal process of author revision". (true for an
> > important class of things on the web, but only a subset of 200
> > responses)
>
> 3) Something like "you are getting (mostly) a complete encoding of a
> document-like thing - something authored or generated with a purpose, and
> which is relatively stable in (interpreted) content but which may change over
> time due to the normal process of author revision, or the change
> of state of things described document, or the release of new things of
> which the document represents the latest".  (true for an important class of
> things on the web, most 200 responses)
>

Good. This is more specific than before. You've added:

may change over time due to
a) the change of state of things described document [I think the word
"document" here is superfluous]
b) release of new things of which the document represents the latest

I understand these to be addressing the following cases:
(a) is like the weather report example
(b) is like a news feed


> This discussion, as to how you define the front page of The Times
> to be a document you can go into much depth on --
> but it is not very constructive.
>

What's constructive is knowing what you know when you get a 200. As you say,
"If the client can gather nothing from the 200 code at all, then there is
not much point in doing the operation".

My feeling:
a) Don't talk about "documents" in this context. It's too confusing. Your
generic resources document (*that's* a document) and its discussion of axes
of variation is much clearer.
b) We need to do a better job of explaining "what the client can gather from
the 200 code", because that is what will let us assess what sorts of things
we can say about a resource for which a server responds 200.

That said, I still don't think that with your additions we've covered enough
of the 200 responders. Here are a couple more for your consideration:

http://www.amazon.com/gp/cart/view.html/ref=gno_cart - what comes back is
different for every person. That there can be variation by person isn't
mentioned in the generics discussion nor in any of the above.

Any URI which has a form that generates a POST for any purpose other than to
change the document in one of the ways described in our list (3) (*),
because none of the things in (3) are the sort of thing that processes data.
For example, http://www.ncbi.nlm.nih.gov/pubmed returns 200 to a GET and
provides a document identifying the Pubmed database and tools for it. But it
also has a search box that POSTs queries to the same URI, and the responses
to those are answers to the queries.

In both these cases, the question is the one you identify: what can the
client gather from the 200 code?
I don't know how to answer that question other than to say: "nothing".
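
To make the two cases concrete, here is a minimal sketch in Python. The
class names, the in-memory data, and the Response type are all hypothetical;
it only models the behaviour described above and says nothing about how
either site is actually implemented. In both cases the status is 200, yet
the body varies by requester or by what was POSTed.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    body: str


@dataclass
class CartResource:
    """Hypothetical resource whose GET body depends on who is asking."""
    carts: dict = field(default_factory=dict)

    def get(self, user: str) -> Response:
        items = self.carts.get(user, [])
        # Same URI, same 200, different body per person.
        return Response(200, f"cart: {', '.join(items) or 'empty'}")


@dataclass
class SearchResource:
    """Hypothetical resource: GET yields a page, POST yields query answers."""
    records: tuple = ("web architecture", "generic resources", "weather in Oaxaca")

    def get(self) -> Response:
        return Response(200, "search page with a query box")

    def post(self, query: str) -> Response:
        # The resource acts as an agent here: it processes the posted data.
        hits = [r for r in self.records if query.lower() in r.lower()]
        return Response(200, "; ".join(hits))
```

A client that sees only the status code cannot tell these apart from a
stable, authored page: every call above answers 200.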

> The original message was to convince Xiaoshu Wang that it was important to
> be able to use the URI for the document, not its subject.
>

I don't think that's productive, FWIW. The longer that conversation
proceeds, the less sense he makes. I quit earlier while the going was
good.

-Alan

(*) Even the case I allow is a bit dubious with current documentation.

"The POST method is used to request that the origin server accept the entity
enclosed in the request as data to be processed by the resource identified
by the Request-URI in the Request-Line." -
http://www.w3.org/Protocols/HTTP/1.1/rfc2616bis/draft-lafon-rfc2616bis-03.txt
section 9.4

The point is that if you look at what has been defined by (3) it doesn't
sound like the resource is an agent. But it takes an agent to process a
query and give a result. Or to update a document.
Received on Saturday, 25 June 2011 05:37:28 UTC