Re: A question - use 301 instead of 406? from Nathan on 2010-03-24 (public-lod@w3.org from March 2010)

From: Nathan <nathan@webr3.org>
Date: Wed, 24 Mar 2010 15:52:06 +0000
To: Robert Sanderson <azaroth42@gmail.com>
CC: public-lod@w3.org
Message-ID: <4BAA3526.1080006@webr3.org>
Robert Sanderson wrote:
> To abuse an overused quote: "And now you have two problems."
> 
> Firstly, you have an additional kitten (URI) to pay for with the
> descriptions resource in addition to the other URIs.
> 
> Secondly, the semantics of your descriptions resource are unclear.  Is it an
> information resource or not?  Is it a conceptual set of all of the formats
> of the descriptions of the original resource? If so, shouldn't it have its
> own description?  If it's not that, what is it? If it is, how do you
> negotiate for which format you want the description of the set to be in,
> rather than the item from the set?

I disagree (though I get your point, and I disagree in the nicest way
possible): neither the HTML document nor the RDF document is the
description. The description is a different thing entirely, which is
contained in either the HTML document or the RDF document.

/resource/London
   rdfs:label "London"@en ;
   isPrimaryTopicOf /descriptions/London .

/descriptions/London
   primaryTopic /resource/London ;
   isPrimaryTopicOf /descriptions/London.html,
                    /descriptions/London.rdf .

/descriptions/London.html a Document .
/descriptions/London.rdf a Document .


Thus you already always have the /descriptions/London resource.
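
As a rough illustration of the "simple bit of reasoning" this allows,
here is a minimal Python sketch. It assumes the statements above are held
as plain tuples, using the /descriptions/ spelling from the rest of the
thread; the names TRIPLES, primary_topic and thing_described are made up
for this example and are not part of any library.

```python
# Hypothetical in-memory copy of the statements in the listing above.
TRIPLES = {
    ("/resource/London", "isPrimaryTopicOf", "/descriptions/London"),
    ("/descriptions/London", "primaryTopic", "/resource/London"),
    ("/descriptions/London", "isPrimaryTopicOf", "/descriptions/London.html"),
    ("/descriptions/London", "isPrimaryTopicOf", "/descriptions/London.rdf"),
}


def primary_topic(uri, triples=TRIPLES):
    """Return what `uri` is primarily about, or None if nothing is stated."""
    for s, p, o in triples:
        if p == "primaryTopic" and s == uri:
            return o
        if p == "isPrimaryTopicOf" and o == uri:
            return s
    return None


def thing_described(uri, triples=TRIPLES):
    """Follow primary-topic links until reaching a URI that is not about
    anything else (assumed here to be the real-world resource)."""
    seen = set()
    while uri not in seen:  # guard against accidental cycles in the data
        seen.add(uri)
        topic = primary_topic(uri, triples)
        if topic is None:
            return uri
        uri = topic
    return uri


if __name__ == "__main__":
    # A document URI stored by mistake still leads back to the resource.
    print(thing_described("/descriptions/London.html"))
```

So if /descriptions/London or one of the document URIs ends up in a
database by mistake, a consumer holding these statements can still walk
back to /resource/London.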

NB: something is niggling me about justifying the use of
/descriptions/London as a negotiation point in addition to it being the
identifier of the description, i.e. which status code to use and whether
to use Content-Location or just Location. I am certain, though, that
just as a sioc:Post isPrimaryTopicOf its blog post HTML page, the
description in this example isPrimaryTopicOf the RDF and HTML documents.
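
To make the negotiation point concrete, here is a minimal Python sketch
of one possible behaviour for /descriptions/London: answer 200 with a
Content-Location naming the chosen document. This is only one of the
options under discussion, not a settled answer. VARIANTS, parse_accept,
quality and negotiate are hypothetical names, the Accept parsing is
deliberately simplified, and the fallback to HTML instead of a 406 is an
assumption.

```python
# Hypothetical variants available for the description resource.
VARIANTS = {
    "text/html": "/descriptions/London.html",
    "application/rdf+xml": "/descriptions/London.rdf",
}


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs (simplified)."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # q defaults to 1 when absent
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    return prefs


def quality(media_type, prefs):
    """q for media_type; the most specific matching range wins."""
    main_type = media_type.split("/")[0]
    for pattern in (media_type, main_type + "/*", "*/*"):
        for accepted, q in prefs:
            if accepted == pattern:
                return q
    return 0.0


def negotiate(accept, default="text/html"):
    """Pick a variant and return (status, headers) for the description."""
    prefs = parse_accept(accept)
    best = max(VARIANTS, key=lambda mt: quality(mt, prefs))
    if quality(best, prefs) <= 0:
        best = default  # assumption: serve something rather than 406
    return 200, {
        "Content-Type": best,
        "Content-Location": VARIANTS[best],
        "Vary": "Accept",
    }


if __name__ == "__main__":
    print(negotiate("text/html;q=0.5, application/rdf+xml"))
```

Swapping the 200 + Content-Location for a redirect with Location would
only change the last return statement, which is exactly the part I am
unsure about.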

Hope this didn't sound too assertive; I am looking for a bit of
discussion and debate about it.

Best,

Nathan
> "Do not multiply entities unnecessarily" also comes to mind.
> 
> Rob Sanderson
> 
> On Wed, Mar 24, 2010 at 8:11 AM, Nathan <nathan@webr3.org> wrote:
> 
>> forgot to mention.. if you have the following urls:
>>
>> http://example.com/resource/London
>> http://example.com/descriptions/London.html
>> http://example.com/descriptions/London.rdf
>>
>> then you can simply enable multiviews for apache and 303
>> /resource/London through to /descriptions/London, and apache handles the
>> rest
>>
>> regards!
>>
>> Nathan wrote:
>>> Hi All,
>>>
>>> After much thought recently I've taken the following approach (please do
>>> ignore the fact I'm using .html etc in examples, it's only for clarity
>>> in this email).
>>>
>>> Suppose I have a real world object:
>>> http://example.com/resource/London
>>>
>>> and then an html and rdf description
>>> http://example.com/page/London.html
>>> http://example.com/data/London.rdf
>>>
>>> then I am adding in one more resource to the equation; a resource which
>>> identifies the description; which then acts as the point for content
>>> negotiation. http://example.com/descriptions/London
>>>
>>> Thus:
>>>
>>> REQUEST->>>
>>> GET /resource/London HTTP/1.1
>>> Host: example.com
>>> Accept: text/html;q=0.5, application/rdf+xml
>>>
>>> <<<-RESPONSE
>>> HTTP/1.1 303 See Other
>>> Location: http://example.com/descriptions/London
>>>
>>> REQUEST->>>
>>> GET /descriptions/London HTTP/1.1
>>> Host: example.com
>>> Accept: text/html;q=0.5, application/rdf+xml
>>>
>>> <<<-RESPONSE
>>> HTTP/1.1 200 OK
>>> Content-Location: http://example.com/page/London.html
>>> Content-Type: text/html
>>> (and Vary: etc)
>>>
>>> This way /descriptions/London stays in the address bar and the:
>>> GET /descriptions/London HTTP/1.1
>>> Accept: application/rdf+xml
>>> request that may be used in the future stays good because we've
>>> separated the cross cutting concerns.
>>>
>>> note: it could also be a 300 Multiple Choices + Location header for that
>>> final response.
>>>
>>> This also helps with the wrong uri in the db scenario, because even in a
>>> worst case scenario where /descriptions/London is used rather than
>>> /resource/London then any RDF processor can simply read that
>>> /descriptions/London is a resource which describes /resource/London
>>> rather than being /resource/London itself; a simple bit of reasoning
>>> over isPrimaryTopicOf or similar will fix this.
>>>
>>> Comments / Corrections?
>>>
>>> Regards!
>>>
>>> Richard Cyganiak wrote:
>>>> Hugh,
>>>>
>>>> On 23 Mar 2010, at 22:50, Hugh Glaser wrote:
>>>>> Assuming that we are in the usual situation of http://foo/bar doing a
>>>>> 303 to http://foo/bar.rdf when it gets an
>>>>> Accept: application/rdf+xml http://foo/bar
>>>>> what should a server do when it gets a request for
>>>>> Accept: application/rdf+xml http://foo/bar.html ?
>>>>>
>>>>> OK, the answer is 406.
>>>> No. The answer is 200, with the HTML representation. Content negotiation
>>>> should happen on the “generic” URI, e.g., <http://foo/bar>, but not on
>>>> the representation format specific URIs.
>>>>
>>>> The reason for having the representation format specific URIs
>>>> </bar.html> and </bar.rdf> in the first place is to allow users to
>>>> override their user agent's Accept header.
>>>>
>>>> For example, normal web browsers accept text/html but not
>>>> application/rdf+xml. There is no way an average user can change the
>>>> browser's behaviour in this regard. Thus, if I direct my browser to
>>>> </bar> I would always get HTML. If, for whatever reason, I want to see
>>>> the RDF/XML, there's no way I can do it. But if the </bar.rdf> URI
>>>> is configured to always return RDF/XML, no matter what the Accept
>>>> header says, then the HTML can include a link to </bar.rdf> and say, “go
>>>> here if you really want RDF/XML.” Problem solved.
>>>>
>>>> Sending 406 (or 301) on the representation format specific URIs like
>>>> </bar.html> and </bar.rdf> negates the entire purpose of having those
>>>> URIs in the first place.
>>>>
>>>> A key bit of text from RFC 2616:
>>>>
>>>>>> Note: HTTP/1.1 servers are allowed to return responses which are not
>>>>>> acceptable according to the accept headers sent in the request. In
>>>>>> some cases, this may even be preferable to sending a 406 response.
>>>> Amen. 406 is actually counterproductive IMO. It just forces user agents
>>>> to include something like "*/*;q=0.01" in the Accept header to work
>>>> around those overeager content negotiation implementations that are just
>>>> looking for an excuse not to send a representation to the client.
>>>>
>>>> <snip>
>>>>> That's OK if all that happens is I use the wrong URI straight away.
>>>>> But what happens if I then enter it into a form that requires a LD
>>>>> URI, and then perhaps goes into a DB, and becomes a small part of a
>>>>> later process?
>>>>> Simply put, the process will fail maybe years later, and the
>>>>> possibility and knowledge to fix it will be long gone.
>>>>>
>>>>> Maybe the form validation is substandard, but I can see this as a
>>>>> situation that will recur a lot, because the root cause is that the
>>>>> address bar URI changes from the NIR URI. And most html pages do not
>>>>> have links to the NIR of the page you are on - I am even told that it
>>>>> is bad practice to make the main label of the page a link to itself -
>>>>> wikipedia certainly doesn't, although it is available as the "article"
>>>>> tab, which is not the normal thing of a page. So in a world where
>>>>> wikipedia itself became LD, it would not be clear to someone who
>>>>> wanted the NIR URI where to find it.
>>>> This is a serious problem. It is a UI problem and should be solved on
>>>> the UI level, not on the transfer protocol level. We have lots of
>>>> protocol people here and few UI people, so everyone tries to fix
>>>> everything in the protocols.
>>>>
>>>> A similar problem plagued RSS in its early years. The solution was
>>>> the feed autodiscovery convention for the HTML header, and the universal
>>>> feed icon. Linked data needs something similar.
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>>> So that is some of the context and motivation.
>>>>> If we were to decide to be more forgiving, what might be done?
>>>>> How about using 301?
>>>>> <<Ducks>>
>>>>> To save you looking it up, I have appended the RFC2616 section to this
>>>>> email.
>>>>> That is
>>>>> Accept: application/rdf+xml http://foo/bar.html
>>>>> Should 301 to http://foo/bar
>>>>> It seems to me that it is basically doing what is required - it gives
>>>>> the client the expected access, while telling it (if it wants to hear)
>>>>> that it should correct the mistake.
>>>>> One worry (as Danius Michaelides pointed out to me) is that the
>>>>> caching may need careful consideration - should the response indicate
>>>>> that it is not cacheable, or is that not necessary?
>>>>>
>>>>> So that's about it.
>>>>> I am unhappy that users doing the obvious thing might get frustrated
>>>>> trying to find the URIs for their Things, so really want a solution
>>>>> that is not just 406.
>>>>> Are there other ways of being nice to users, without putting a serious
>>>>> burden on the client software?
>>>>>
>>>>> I look forward to the usual helpful and thoughtful responses!
>>>>>
>>>>> By the way, I see no need to 301 to http://foo/bar if you get an
>>>>> Accept: text/html http://foo/bar.rdf as the steps that might lead to
>>>>> this would require someone looking at an rdf document to decide to use
>>>>> it as a NIR, which is much less likely. And the likelihood is that
>>>>> there is an eyeball there to see the problem.
>>>>> But maybe it should?
>>>>>
>>>>> Best
>>>>> Hugh
>>>>>
>>>>>
>>>>> 10.3.2 301 Moved Permanently
>>>>>
>>>>>   The requested resource has been assigned a new permanent URI and any
>>>>>   future references to this resource SHOULD use one of the returned
>>>>>   URIs.  Clients with link editing capabilities ought to automatically
>>>>>   re-link references to the Request-URI to one or more of the new
>>>>>   references returned by the server, where possible. This response is
>>>>>   cacheable unless indicated otherwise.
>>>>>
>>>>>   The new permanent URI SHOULD be given by the Location field in the
>>>>>   response. Unless the request method was HEAD, the entity of the
>>>>>   response SHOULD contain a short hypertext note with a hyperlink to
>>>>>   the new URI(s).
>>>>>
>>>>>   If the 301 status code is received in response to a request other
>>>>>   than GET or HEAD, the user agent MUST NOT automatically redirect the
>>>>>   request unless it can be confirmed by the user, since this might
>>>>>   change the conditions under which the request was issued.
>>>>>
>>>>>      Note: When automatically redirecting a POST request after
>>>>>      receiving a 301 status code, some existing HTTP/1.0 user agents
>>>>>      will erroneously change it into a GET request.
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>
Received on Wednesday, 24 March 2010 15:52:49 UTC