RE: caching HTTP 303 responses from Williams, Stuart (HP Labs, Bristol) on 2007-07-12 (semantic-web@w3.org from July 2007)

From: Williams, Stuart (HP Labs, Bristol) <skw@hp.com>
Date: Thu, 12 Jul 2007 12:08:39 +0100
To: "Jeremy Carroll" <jjc@hpl.hp.com>, "Jacek Kopecky" <jacek.kopecky@deri.org>
Cc: "Giovanni Tummarello" <g.tummarello@gmail.com>, <semantic-web@w3.org>
Message-ID: <C4B3FB61F7970A4391A5C10BAA1C3F0DBB34F9@sdcexc04.emea.cpqcorp.net>
Forgot to mention.... re using non-3xx response codes.

There is sometimes a need to have distinct URIs for a thing
and its description.

hash'ed URIs for a thing inherently access a different URI when handed
to an HTTP engine (the fragID gets stripped before access).
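
As a small illustration of that stripping, here is a sketch using
Python's standard urllib.parse.urldefrag. The example.org URI is made
up for the purpose and does not come from any real vocabulary.

```python
from urllib.parse import urldefrag

# Hypothetical hash URI naming a thing (an assumption for illustration).
thing_uri = "http://example.org/vocab#Person"

# An HTTP client requests only the part before '#'; the fragment
# identifier stays on the client side and never reaches the server.
retrieval_uri, fragment = urldefrag(thing_uri)

print(retrieval_uri)  # http://example.org/vocab
print(fragment)       # Person
```

So the URI that names the thing and the URI that is actually retrieved
differ by construction.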

3xx redirects can provide a different URI (for a description) in the
Location header.
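
A rough sketch of how a client might pick up that second URI follows.
The function name description_uri, the example.org URIs and the
dict-of-headers shape are all assumptions for illustration; no network
access is involved.

```python
from urllib.parse import urljoin


def description_uri(request_uri, status, headers):
    """Return the URI named by a 3xx response's Location header, else None."""
    location = headers.get("Location")
    if 300 <= status < 400 and location:
        # Location may be a relative reference; resolve it against
        # the URI that was originally requested.
        return urljoin(request_uri, location)
    return None


# Hypothetical URI for a thing, and a hypothetical 303 response to it.
thing = "http://example.org/id/alice"
doc = description_uri(thing, 303, {"Location": "/doc/alice"})

print(thing, "->", doc)
```

With a 2xx or 4xx status the same function returns None: the response
itself offers no second URI to attach to the description.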

If you were to use a 2xx or a 4xx response code... how would you go
about having distinct URIs for the originally referenced thing and its
description - at least in those cases where the distinction is important
to you?

Stuart
--
Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks
RG12 1HN
Registered No: 690597 England

> -----Original Message-----
> From: Williams, Stuart (HP Labs, Bristol) 
> Sent: 12 July 2007 12:00
> To: 'Jeremy Carroll'; Jacek Kopecky
> Cc: Giovanni Tummarello; semantic-web@w3.org
> Subject: RE: caching HTTP 303 responses
> 
> 
> Ok... so I'll offer a thought...
> 
> At least in some of the problem cases, the retrieval URIs are 
> the URIs of properties and classes in OWL ontologies and RDFS 
> vocabulary descriptions. Presumably the motivation to perform 
> such retrievals is a lack of knowledge about the thing 
> referred to by the URI. A successful retrieval, whether it 
> arises from a protocol redirect or a client side redirection 
> through the stripping of a fragID, renders the requesting 
> agent *informed* about the referent. The question is surely 
> the persistence of that information rather than the 
> persistence of redirection. 
> 
> So... does your agent already know the answer to a question 
> that it's about to ask?
> 
> 1) Do I need to ask this question or do I know enough about 
> this thing already - chances are the answer is already 
> squirrelled away in the agent's knowledge base.
> 
> 2) Some answers tell you about more things than you asked 
> about, e.g. Dublin Core - because a bunch of URIs for dc 
> properties all redirect to the same description of all of 
> them (modulo a spurious fragId last time I looked, that gets 
> stripped anyway - and for which there is no referent in the 
> resulting description). So you may already be informed about 
> things that you haven't asked about.
> 
> The imperative for the agent to ask a question seems to be 
> lack of knowledge of the answer. If it already has the 
> answer... you can avoid asking the question.
> 
> My 2 cents.
> 
> Stuart Williams
> --
> Hewlett-Packard Limited registered Office: Cain Road, 
> Bracknell, Berks RG12 1HN Registered No: 690597 England
> 
> > -----Original Message-----
> > From: semantic-web-request@w3.org
> > [mailto:semantic-web-request@w3.org] On Behalf Of Jeremy Carroll
> > Sent: 10 July 2007 13:45
> > To: Jacek Kopecky
> > Cc: Giovanni Tummarello; semantic-web@w3.org
> > Subject: Re: caching HTTP 303 responses
> > 
> > 
> > 
> > Is there some motivation for the MUST NOT cache constraint?
> > 
> > A thought is that there are quite complex HTTP cache control 
> > mechanisms which may not work correctly. But I suppose 302s are 
> > cached, and can be updated, and the behaviour is acceptable.... so 
> > that the same mechanisms should work with
> > 303 (except for the prohibition).
> > 
> > ....
> > 
> > thinking out loud, without reading the specs,
> > 
> > Jeremy
> > 
> > 
> > 
> > Jacek Kopecky wrote:
> > > Hi Giovanni,
> > > 
> > > > barring the change away from 303 for non-information resources, or a
> > > > change to the cacheability of 303, one could indeed make a patch for
> > > > squid.
> > > 
> > > > The way I'd go about it, not to break too much, would be to add a 
> > > > request ID header which would differ for different user requests, 
> > > > and the squid would cache everything within the same request ID, and 
> > > > it would follow the specs for different requests.
> > > 
> > > > The request ID would be treated as enabler for these "atomically 
> > > > cacheable" things (everything), atomically as in "in the same user 
> > > > request processing". And this could mean statefulness in squid 
> > > > (prolly a very bad thing) if there was a requirement to interleave 
> > > > the processing of multiple user requests.
> > > 
> > > > But thinking about this, fixing 303 cacheability or maybe adding a 
> > > > cacheable 308 Description Elsewhere sounds easier now. 8-)
> > > 
> > > Jacek
> > > 
> > > On Tue, 2007-07-10 at 01:20 +0100, Giovanni Tummarello wrote:
> > >> Hi Jacek,
> > >>
> > >> unfortunately the "application cache" is not always possible. .
> > > >> The key to cluster scalability is splitting jobs across the cluster 
> > > >> nodes so each file is more or less processed per so.
> > > >> Web architecture then says that if you want to go fast.. you can cache.. 
> > > >> so one puts a large proxy where all the nodes in theory can feed. 
> > > >> This is what we thought we'd do.. just to find out that each 
> > > >> process was running a few dozen times slower than what it could (to 
> > > >> say nothing on the remote hits which is the real problem) due to 
> > > >> squid rightfully refusing to cache 303.
> > >> We could write a "semantic web patch" for squid to explicitly 
> > >> violate a MUST NOT.. but.. :-) .
> > >> Giovanni
> > >>
> > > 
> > > 
> > > 
> > 
> > --
> > Hewlett-Packard Limited
> > registered Office: Cain Road, Bracknell, Berks RG12 1HN 
> > Registered No: 690597 England
>
Received on Thursday, 12 July 2007 11:10:21 UTC