Re: A question - use 301 instead of 406? from Nathan on 2010-03-24 (public-lod@w3.org from March 2010)

From: Nathan <nathan@webr3.org>
Date: Wed, 24 Mar 2010 15:11:07 +0000
CC: Richard Cyganiak <richard@cyganiak.de>, Hugh Glaser <hg@ecs.soton.ac.uk>, Linking Open Data <public-lod@w3.org>
Message-ID: <4BAA2B8B.8020307@webr3.org>
Forgot to mention: if you have the following URLs:

http://example.com/resource/London
http://example.com/descriptions/London.html
http://example.com/descriptions/London.rdf

then you can simply enable MultiViews in Apache and 303
/resource/London through to /descriptions/London; Apache handles the
rest (it picks London.html or London.rdf based on the Accept header)
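
For anyone not on Apache, here is a rough Python sketch of the same
behaviour, just to make the flow concrete. The names (VARIANTS,
parse_accept, quality, respond) are made up for illustration, and the
fallback to HTML when nothing matches is a choice made in this sketch,
not a claim about what Apache does:

```python
# Hypothetical sketch of "303 to the description, then negotiate".
# URL layout follows the example above; all names are illustrative.
BASE = "http://example.com"

# Media type -> file extension of the stored variant (assumed layout).
VARIANTS = {
    "text/html": ".html",
    "application/rdf+xml": ".rdf",
}


def parse_accept(header):
    """Return {media_range: q} for an Accept header; q defaults to 1.0."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs[parts[0]] = q
    return prefs


def quality(prefs, media_type):
    """Look up q for media_type, falling back to type/* and then */*."""
    main_type = media_type.split("/")[0]
    for key in (media_type, main_type + "/*", "*/*"):
        if key in prefs:
            return prefs[key]
    return 0.0


def respond(path, accept="*/*"):
    """Return (status, headers) for a GET on path."""
    if path.startswith("/resource/"):
        # The thing itself: always redirect to its description.
        name = path[len("/resource/"):]
        return 303, {"Location": BASE + "/descriptions/" + name}
    if path.startswith("/descriptions/"):
        name = path[len("/descriptions/"):]
        # Format-specific URLs ignore Accept and return their own format.
        for media_type, ext in VARIANTS.items():
            if name.endswith(ext):
                return 200, {"Content-Type": media_type}
        # Generic description URL: negotiate on Accept.
        prefs = parse_accept(accept)
        best = max(VARIANTS, key=lambda m: quality(prefs, m))
        if quality(prefs, best) == 0.0:
            best = "text/html"  # send something rather than a 406
        return 200, {
            "Content-Type": best,
            "Content-Location": BASE + path + VARIANTS[best],
            "Vary": "Accept",
        }
    return 404, {}
```

So respond("/resource/London") gives the 303, and a follow-up
respond("/descriptions/London", accept) picks the variant, while
/descriptions/London.rdf returns RDF/XML whatever the Accept header says.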

regards!

Nathan wrote:
> Hi All,
> 
> After much thought recently I've taken the following approach (please
> ignore the fact that I'm using .html etc. in the examples; it's only
> for clarity in this email).
> 
> Suppose I have a real world object:
> http://example.com/resource/London
> 
> and then an HTML and an RDF description:
> http://example.com/page/London.html
> http://example.com/data/London.rdf
> 
> then I add one more resource to the equation: a resource which
> identifies the description, and which acts as the point for content
> negotiation: http://example.com/descriptions/London
> 
> Thus:
> 
> REQUEST->>>
> GET /resource/London HTTP/1.1
> Host: example.com
> Accept: text/html;q=0.5, application/rdf+xml
> 
> <<<-RESPONSE
> HTTP/1.1 303 See Other
> Location: http://example.com/descriptions/London
> 
> REQUEST->>>
> GET /descriptions/London HTTP/1.1
> Host: example.com
> Accept: text/html;q=0.5, application/rdf+xml
> 
> <<<-RESPONSE
> HTTP/1.1 200 OK
> Content-Location: http://example.com/page/London.html
> Content-Type: text/html
> (and Vary: etc)
> 
> This way /descriptions/London stays in the address bar and the:
> GET /descriptions/London HTTP/1.1
> Accept: application/rdf+xml
> request that may be used in the future still works, because we've
> separated the cross-cutting concerns.
> 
> note: it could also be a 300 Multiple Choices + Location header for that
> final response.
> 
> This also helps with the wrong uri in the db scenario, because even in a
> worst case scenario where /descriptions/London is used rather than
> /resource/London then any RDF processor can simply read that
> /descriptions/London is a resource which describes /resource/London
> rather than being /resource/London itself; a simple bit of reasoning
> over isPrimaryTopicOf or similar will fix this.
> 
> Comments / Corrections?
> 
> Regards!
> 
> Richard Cyganiak wrote:
>> Hugh,
>>
>> On 23 Mar 2010, at 22:50, Hugh Glaser wrote:
>>> Assuming that we are in the usual situation of http://foo/bar doing a
>>> 303 to http://foo/bar.rdf when it gets an
>>> Accept: application/rdf+xml http://foo/bar
>>> what should a server do when it gets a request for
>>> Accept: application/rdf+xml http://foo/bar.html ?
>>>
>>> OK, the answer is 406.
>> No. The answer is 200, with the HTML representation. Content negotiation
>> should happen on the “generic” URI, e.g., <http://foo/bar>, but not on
>> the representation format specific URIs.
>>
>> The reason for having the representation format specific URIs
>> </bar.html> and </bar.rdf> in the first place is to allow users to
>> override their user agent's Accept header.
>>
>> For example, normal web browsers accept text/html but not
>> application/rdf+xml. There is no way an average user can change the
>> browser's behaviour in this regard. Thus, if I direct my browser to
>> </bar> I would always get HTML. If, for whatever reason, I want to see
>> the RDF/XML, there's no way I can do it. But if the </bar.rdf> URI
>> is configured to always return RDF/XML, no matter what the Accept
>> header says, then the HTML can include a link to </bar.rdf> and say, “go
>> here if you really want RDF/XML.” Problem solved.
>>
>> Sending 406 (or 301) on the representation format specific URIs like
>> </bar.html> and </bar.rdf> negates the entire purpose of having those
>> URIs in the first place.
>>
>> A key bit of text from RFC 2616:
>>
>>>> Note: HTTP/1.1 servers are allowed to return responses which are not
>>>> acceptable according to the accept headers sent in the request. In
>>>> some cases, this may even be preferable to sending a 406 response.
>> Amen. 406 is actually counterproductive IMO. It just forces user agents
>> to include something like "*/*;q=0.01" in the Accept header to work
>> around those overeager content negotiation implementations that are just
>> looking for an excuse not to send a representation to the client.
>>
>> <snip>
>>> That's OK if all that happens is I use the wrong URI straight away.
>>> But what happens if I then enter it into a form that requires an LD
>>> URI, and it then perhaps goes into a DB, and becomes a small part of
>>> a later process? Simply put, the process will fail maybe years later,
>>> and the possibility and knowledge to fix it will be long gone.
>>>
>>> Maybe the form validation is substandard, but I can see this as a
>>> situation that will recur a lot, because the root cause is that the
>>> address bar URI changes from the NIR URI. And most HTML pages do not
>>> have links to the NIR of the page you are on - I am even told that it
>>> is bad practice to make the main label of the page a link to itself -
>>> Wikipedia certainly doesn't, although it is available as the "article"
>>> tab, which is not the normal thing for a page. So in a world where
>>> Wikipedia itself became LD, it would not be clear to someone who
>>> wanted the NIR URI where to find it.
>> This is a serious problem. It is a UI problem and should be solved on
>> the UI level, not on the transfer protocol level. We have lots of
>> protocol people here and few UI people, so everyone tries to fix
>> everything in the protocols.
>>
>> A similar problem plagued RSS in its early years. The solution was
>> the feed autodiscovery convention for the HTML header, and the universal
>> feed icon. Linked data needs something similar.
>>
>> Best,
>> Richard
>>
>>
>>> So that is some of the context and motivation.
>>> If we were to decide to be more forgiving, what might be done?
>>> How about using 301?
>>> <<Ducks>>
>>> To save you looking it up, I have appended the RFC2616 section to this
>>> email.
>>> That is
>>> Accept: application/rdf+xml http://foo/bar.html
>>> Should 301 to http://foo/bar
>>> It seems to me that it is basically doing what is required - it gives the
>>> client the expected access, while telling it (if it wants to hear) that
>>> it should correct the mistake.
>>> One worry (as Danius Michaelides pointed out to me) is that the caching
>>> may need careful consideration - should the response indicate that it
>>> is not cacheable, or is that not necessary?
>>>
>>> So that's about it.
>>> I am unhappy that users doing the obvious thing might get frustrated
>>> trying to find the URIs for their Things, so really want a solution
>>> that is not just 406.
>>> Are there other ways of being nice to users, without putting a serious
>>> burden on the client software?
>>>
>>> I look forward to the usual helpful and thoughtful responses!
>>>
>>> By the way, I see no need to 301 to http://foo/bar if you get an
>>> Accept: text/html http://foo/bar.rdf as the steps that might lead to
>>> this would require someone looking at an RDF document to decide to use
>>> it as a NIR, which is much less likely. And the likelihood is that
>>> there is an eyeball there to see the problem.
>>> But maybe it should?
>>>
>>> Best
>>> Hugh
>>>
>>>
>>> 10.3.2 301 Moved Permanently
>>>
>>>   The requested resource has been assigned a new permanent URI and any
>>>   future references to this resource SHOULD use one of the returned
>>>   URIs.  Clients with link editing capabilities ought to automatically
>>>   re-link references to the Request-URI to one or more of the new
>>>   references returned by the server, where possible. This response is
>>>   cacheable unless indicated otherwise.
>>>
>>>   The new permanent URI SHOULD be given by the Location field in the
>>>   response. Unless the request method was HEAD, the entity of the
>>>   response SHOULD contain a short hypertext note with a hyperlink to
>>>   the new URI(s).
>>>
>>>   If the 301 status code is received in response to a request other
>>>   than GET or HEAD, the user agent MUST NOT automatically redirect the
>>>   request unless it can be confirmed by the user, since this might
>>>   change the conditions under which the request was issued.
>>>
>>>      Note: When automatically redirecting a POST request after
>>>      receiving a 301 status code, some existing HTTP/1.0 user agents
>>>      will erroneously change it into a GET request.
>>>
>>>
>>
>>
>>
> 
>
Received on Wednesday, 24 March 2010 15:11:49 UTC